=======
csvjoin
=======

Description
===========

Merges two or more CSV tables using a method analogous to a SQL JOIN operation. By default it performs an inner join; full outer, left outer, and right outer joins are available via the ``--outer``, ``--left``, and ``--right`` flags. Key columns are specified with the ``-c`` flag: either a single column that exists in all tables, or a comma-separated list with one column per table, given in the same order as the files. If ``-c`` is not provided, the tables are merged "sequentially", that is, row by row in their existing order with no key matching or filtering:

.. code-block:: none

   usage: csvjoin [-h] [-d DELIMITER] [-t] [-q QUOTECHAR] [-u {0,1,2,3}] [-b]
                  [-p ESCAPECHAR] [-z FIELD_SIZE_LIMIT] [-e ENCODING] [-L LOCALE]
                  [-S] [--blanks] [--null-value NULL_VALUES [NULL_VALUES ...]]
                  [--date-format DATE_FORMAT] [--datetime-format DATETIME_FORMAT]
                  [-H] [-K SKIP_LINES] [-v] [-l] [--zero] [-V] [-c COLUMNS]
                  [--outer] [--left] [--right] [-y SNIFF_LIMIT] [-I]
                  [FILE [FILE ...]]

   Execute a SQL-like join to merge CSV files on a specified column or columns.

   positional arguments:
     FILE                  The CSV files to operate on. If only one is specified,
                           it will be copied to STDOUT.

   optional arguments:
     -h, --help            show this help message and exit
     -c COLUMNS, --columns COLUMNS
                           The column name(s) on which to join. Should be either
                           one name (or index) or a comma-separated list with one
                           name (or index) per file, in the same order in which
                           the files were specified. If not specified, the two
                           files will be joined sequentially without matching.
     --outer               Perform a full outer join, rather than the default
                           inner join.
     --left                Perform a left outer join, rather than the default
                           inner join. If more than two files are provided this
                           will be executed as a sequence of left outer joins,
                           starting at the left.
     --right               Perform a right outer join, rather than the default
                           inner join. If more than two files are provided this
                           will be executed as a sequence of right outer joins,
                           starting at the right.
     -y SNIFF_LIMIT, --snifflimit SNIFF_LIMIT
                           Limit CSV dialect sniffing to the specified number of
                           bytes. Specify "0" to disable sniffing entirely, or
                           "-1" to sniff the entire file.
     -I, --no-inference    Disable type inference (and --locale, --date-format,
                           --datetime-format, --no-leading-zeroes) when parsing
                           the input.

   Note that the join operation requires reading all files into memory. Don't try
   this on very large files.

See also: :doc:`../common_arguments`.

Examples
========

Join two files on their first column (an inner join by default):

.. code-block:: bash

   csvjoin -c 1 examples/join_a.csv examples/join_b.csv
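
To make the difference between the default inner join and ``--left`` concrete, the following is a minimal Python sketch of a keyed join on one column per table. It is not csvkit's implementation: ``read_table``, ``keyed_join``, and the sample data ``A`` and ``B`` are hypothetical names invented for illustration, and dropping the right-hand key column from the output is a simplifying assumption of the sketch.

.. code-block:: python

   import csv
   import io


   def read_table(text):
       """Parse CSV text into (header, list of row dicts)."""
       reader = csv.DictReader(io.StringIO(text))
       rows = list(reader)
       return list(reader.fieldnames), rows


   def keyed_join(left_text, right_text, left_key, right_key, how="inner"):
       """Join two CSV texts on one key column per table (illustrative only).

       how="inner" keeps only rows whose key appears in both tables;
       how="left" also keeps unmatched left rows, padded with "".
       """
       left_header, left_rows = read_table(left_text)
       right_header, right_rows = read_table(right_text)
       # Assumption of this sketch: the right-hand key column is not repeated.
       extra = [name for name in right_header if name != right_key]

       # Index the right table by key; one key may match several rows.
       index = {}
       for row in right_rows:
           index.setdefault(row[right_key], []).append(row)

       out = [left_header + extra]
       for row in left_rows:
           matches = index.get(row[left_key], [])
           if not matches and how == "left":
               matches = [dict.fromkeys(extra, "")]
           for match in matches:
               out.append([row[h] for h in left_header] + [match[e] for e in extra])
       return out


   # Hypothetical sample data, not the contents of examples/join_a.csv.
   A = "id,name\n1,ann\n2,bob\n3,cy\n"
   B = "id,city\n1,Oslo\n3,Lima\n"

   inner = keyed_join(A, B, "id", "id")
   left = keyed_join(A, B, "id", "id", how="left")

In this sketch ``inner`` contains only the rows with ``id`` 1 and 3, while ``left`` additionally keeps the row for ``id`` 2 with an empty ``city`` cell.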

Add two empty columns to the right of a CSV:

.. code-block:: bash

   echo "," | csvjoin examples/dummy.csv -

Add a single column to the right of a CSV:

.. code-block:: bash

   echo "new-column" | csvjoin examples/dummy.csv -
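
The sequential merge used in the last two examples (no ``-c`` flag) can be pictured as pasting tables side by side in row order. The Python sketch below illustrates that idea only; it is not how csvkit implements it. ``sequential_join`` and the ``dummy`` data are hypothetical, and padding shorter tables with empty cells is an assumption of the sketch.

.. code-block:: python

   import csv
   import io
   from itertools import zip_longest


   def sequential_join(*texts):
       """Paste CSV tables side by side in row order, with no key matching."""
       tables = [list(csv.reader(io.StringIO(text))) for text in texts]
       # Width of each table, taken from its header row.
       widths = [len(table[0]) if table else 0 for table in tables]
       out = []
       for rows in zip_longest(*tables):
           merged = []
           for row, width in zip(rows, widths):
               # A table that has run out of rows contributes empty cells.
               merged.extend(row if row is not None else [""] * width)
           out.append(merged)
       return out


   # Hypothetical stand-in for a small input file.
   dummy = "a,b,c\n1,2,3\n"
   joined = sequential_join(dummy, "new-column\n")

Here the second input consists of a header only, so ``joined`` gains a ``new-column`` header and an empty cell on every data row.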