1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245
|
.\" Man page generated from reStructuredText.
.
.
.nr rst2man-indent-level 0
.
.de1 rstReportMargin
\\$1 \\n[an-margin]
level \\n[rst2man-indent-level]
level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
-
\\n[rst2man-indent0]
\\n[rst2man-indent1]
\\n[rst2man-indent2]
..
.de1 INDENT
.\" .rstReportMargin pre:
. RS \\$1
. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin]
. nr rst2man-indent-level +1
.\" .rstReportMargin post:
..
.de UNINDENT
. RE
.\" indent \\n[an-margin]
.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
.nr rst2man-indent-level -1
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.TH "IN2CSV" "1" "Aug 16, 2024" "2.2.0" "csvkit"
.SH NAME
in2csv \- in2csv Documentation
.SH DESCRIPTION
.sp
Converts various tabular data formats into CSV.
.sp
Converting fixed width requires that you provide a schema file with the \(dq\-s\(dq option. The schema file should have the following format:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
column,start,length
name,0,30
birthday,30,10
age,40,3
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
The header line is required though the columns may be in any order:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
usage: in2csv [\-h] [\-d DELIMITER] [\-t] [\-q QUOTECHAR] [\-u {0,1,2,3}] [\-b]
[\-p ESCAPECHAR] [\-z FIELD_SIZE_LIMIT] [\-e ENCODING] [\-L LOCALE]
[\-S] [\-\-blanks] [\-\-null\-value NULL_VALUES [NULL_VALUES ...]]
[\-\-date\-format DATE_FORMAT] [\-\-datetime\-format DATETIME_FORMAT]
[\-H] [\-K SKIP_LINES] [\-v] [\-l] [\-\-zero] [\-V]
[\-f {csv,dbf,fixed,geojson,json,ndjson,xls,xlsx}] [\-s SCHEMA]
[\-k KEY] [\-n] [\-\-sheet SHEET] [\-\-write\-sheets WRITE_SHEETS]
[\-\-use\-sheet\-names] [\-\-reset\-dimensions]
[\-\-encoding\-xls ENCODING_XLS] [\-y SNIFF_LIMIT] [\-I]
[FILE]
Convert common, but less awesome, tabular data formats to CSV.
positional arguments:
FILE The CSV file to operate on. If omitted, will accept
input as piped data via STDIN.
optional arguments:
\-h, \-\-help show this help message and exit
\-f {csv,dbf,fixed,geojson,json,ndjson,xls,xlsx}, \-\-format {csv,dbf,fixed,geojson,json,ndjson,xls,xlsx}
The format of the input file. If not specified will be
inferred from the file type.
\-s SCHEMA, \-\-schema SCHEMA
Specify a CSV\-formatted schema file for converting
fixed\-width files. See web documentation.
\-k KEY, \-\-key KEY Specify a top\-level key to look within for a list of
objects to be converted when processing JSON.
\-n, \-\-names Display sheet names from the input Excel file.
\-\-sheet SHEET The name of the Excel sheet to operate on.
\-\-write\-sheets WRITE_SHEETS
The names of the Excel sheets to write to files, or
\(dq\-\(dq to write all sheets.
\-\-use\-sheet\-names Use the sheet names as file names when \-\-write\-sheets
is set.
\-\-reset\-dimensions Ignore the sheet dimensions provided by the XLSX file.
\-\-encoding\-xls ENCODING_XLS
Specify the encoding of the input XLS file.
\-y SNIFF_LIMIT, \-\-snifflimit SNIFF_LIMIT
Limit CSV dialect sniffing to the specified number of
bytes. Specify \(dq0\(dq to disable sniffing entirely, or
\(dq\-1\(dq to sniff the entire file.
\-I, \-\-no\-inference Disable type inference (and \-\-locale, \-\-date\-format,
\-\-datetime\-format, \-\-no\-leading\-zeroes) when parsing
CSV input.
Some command\-line flags only pertain to specific input formats.
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
See also: \fI\%Arguments common to all tools\fP\&.
.sp
\fBNOTE:\fP
.INDENT 0.0
.INDENT 3.5
The \(dqndjson\(dq format refers to \(dqnewline delimited JSON\(dq, as used by many streaming APIs.
.UNINDENT
.UNINDENT
.sp
\fBNOTE:\fP
.INDENT 0.0
.INDENT 3.5
If an XLS looks identical to an XLSX when viewed in Excel, they may not be identical as CSV. For example, XLSX has an integer type, but XLS doesn\(aqt. Numbers that look like integers from an XLS will have decimals in CSV, but those from an XLSX won\(aqt.
.UNINDENT
.UNINDENT
.sp
\fBNOTE:\fP
.INDENT 0.0
.INDENT 3.5
To convert from HTML, consider \fI\%messytables\fP\&.
.UNINDENT
.UNINDENT
.SH EXAMPLES
.sp
Convert the 2000 census geo headers file from fixed\-width to CSV and from latin\-1 encoding to utf8:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
in2csv \-e iso\-8859\-1 \-f fixed \-s examples/realdata/census_2000/census2000_geo_schema.csv examples/realdata/census_2000/usgeo_excerpt.upl
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
\fBNOTE:\fP
.INDENT 0.0
.INDENT 3.5
A library of fixed\-width schemas is maintained in the \fBffs\fP project:
.sp
\fI\%https://github.com/wireservice/ffs\fP
.UNINDENT
.UNINDENT
.sp
Convert an Excel .xls file:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
in2csv examples/test.xls
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
Standardize the formatting of a CSV file (quoting, line endings, etc.):
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
in2csv examples/realdata/FY09_EDU_Recipients_by_State.csv
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
Fetch csvkit\(aqs open issues from the GitHub API, convert the JSON response into a CSV and write it to a file:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
curl https://api.github.com/repos/wireservice/csvkit/issues?state=open | in2csv \-f json \-v
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
Convert a DBase DBF file to an equivalent CSV:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
in2csv examples/testdbf.dbf
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
This tool names unnamed headers. To avoid that behavior, run:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
in2csv \-\-no\-header\-row examples/test.xlsx | tail \-n +2
.ft P
.fi
.UNINDENT
.UNINDENT
.SH TROUBLESHOOTING
.sp
If an error like the following occurs when providing an input file in CSV or Excel format:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
ValueError: Row 0 has 11 values, but Table only has 1 columns.
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
Then the input file might have initial rows before the header and data rows. You can skip such rows with \fB\-\-skip\-lines\fP (\fB\-K\fP):
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
in2csv \-\-skip\-lines 3 examples/test_skip_lines.csv
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
If an XLSX file yields too few rows or too few columns, then the application that created the file might have \fI\%incorrectly set the worksheet\(aqs dimensions\fP\&. Try again with the \fB\-\-reset\-dimensions\fP option.
.SH AUTHOR
Christopher Groskopf and contributors
.SH COPYRIGHT
2016, Christopher Groskopf and James McKinney
.\" Generated by docutils manpage writer.
.
|