CSV module for Ruby
Version 1.2.1
NAKAMURA, Hiroshi
- Introduction
Ruby library to parse or generate data in CSV format.
- Install
$ ruby install.rb
It will install lib/csv.rb to your site_ruby directory such as
/usr/local/lib/ruby/site_ruby/1.6/.
- Uninstall
Delete csv.rb from your site_ruby directory.
- What is 'CSV'?
CSV: Comma-Separated Values.
http://www.wotsit.org/ defines CSV as follows.
<CSV_file> ::= { <CSV_line> }
<CSV_line> ::= <value> { "," <value> } <spaces_and_tabs> <CRLF>
<value> ::= <spaces_and_tabs>
(
{ <any_text_except_quotas_and_commas_and_smth_else> }
| <single_or_double_quote>
<any_text_save_CRLF_with_corresponding_doubled_quotas>
<the_same_quote>
)
[...]
... and there is some problem with this format:
different database systems have different definitions of the
term <any_text_except_quotas_and_commas_and_smth_else> :)
So, in my module, I defined the CSV format as follows. Fortunately, this is
compatible with Microsoft Excel (at least Excel 97 and later) and with
other applications such as spreadsheets and databases.
Record separator: CR + LF
Field separator: ,(comma) by default; configurable
Quote a field as "..." if it contains CR, LF, or , (comma).
Convert " to "" inside a quoted field.
A "" field means a null string (e.g. some-data,"",some-data).
A field with no data means NULL (e.g. some-data,,some-data).
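The rules above can be illustrated with a small, self-contained sketch.
This is not the code in lib/csv.rb; the names encode_field, generate_line
and parse_line are hypothetical and exist only for this example. It also
assumes a single-character field separator, and that a field containing "
is quoted as well, so that doubled quotes can be recognized on reading.

```ruby
# Sketch of the format rules above; NOT the implementation in lib/csv.rb.
# All method names are hypothetical.

# Encode one field. nil means NULL (no data); "" means a null string.
def encode_field(value, col_sep = ",")
  return "" if value.nil?       # NULL: nothing between the separators
  return '""' if value.empty?   # null string: an explicit pair of quotes
  # Assumption: a field containing a quote is quoted too.
  if value.include?("\r") || value.include?("\n") ||
     value.include?(col_sep) || value.include?('"')
    '"' + value.gsub('"', '""') + '"'
  else
    value
  end
end

# Join encoded fields and terminate the record with CR + LF.
def generate_line(fields, col_sep = ",")
  fields.map { |f| encode_field(f, col_sep) }.join(col_sep) + "\r\n"
end

# Parse one record. Returns nil for a field with no data and "" for a
# field written as "". Assumes col_sep is a single character.
def parse_line(line, col_sep = ",")
  line = line.sub(/\r?\n\z/, "")  # drop the trailing record separator
  fields = []
  field = nil                     # stays nil until data or a quote is seen
  in_quotes = false
  i = 0
  while i < line.length
    ch = line[i]
    if in_quotes
      if ch == '"'
        if line[i + 1] == '"'     # doubled quote inside a quoted field
          field << '"'
          i += 1
        else
          in_quotes = false       # closing quote
        end
      else
        field << ch
      end
    elsif ch == '"'
      in_quotes = true
      field ||= "".dup
    elsif ch == col_sep
      fields << field
      field = nil
    else
      field = (field || "".dup) << ch
    end
    i += 1
  end
  fields << field
end
```

With this sketch, parse_line('some-data,"",some-data') yields an empty
string as the middle field, while parse_line('some-data,,some-data')
yields nil there.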
- Usage
Add
require 'csv'
to your code. For more details, see http://rrr.jin.gr.jp/doc/csv/ or
comments in lib/csv.rb .
- Copying
This module is copyrighted free software by NAKAMURA, Hiroshi.
You can redistribute it and/or modify it under the same terms as Ruby.
- Author
Name: NAKAMURA, Hiroshi
E-mail: nahi@keynauts.com
- History
January 11, 2003 - version 1.2.1
Let CSV::Reader include Enumerable. This one-line addition is the only change.
November 9, 2002 - version 1.2.0
parseBody: change parsing state machine;
allow single "\n" as a record separator. only "\r\n" sequence is allowed.
raise IllegalFormatException when "\r" or "\n" is not quoted with "...".
Thanks to Brad Hilton.
October 26, 2002 - version 1.1.2
Quote field data with " when the field contains \r or \n. In the past,
it only recognized the \r\n sequence as a record separator, so it quoted
field data only when the field contained the \r\n sequence. From an
interoperability point of view, I changed this behaviour. Thanks to Brad
Hilton, who pointed out this discrepancy.
Read UTF-8 encoded files that have a BOM attached. Only a UTF-8 BOM
sequence at the beginning of the input stream is ignored. Is there any need
to check the BOM sequences of other Unicode CESs?
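As an illustration of the BOM handling described in this entry, here is a
minimal sketch. It is not the code in lib/csv.rb; the helper name
strip_utf8_bom and the constant UTF8_BOM are hypothetical. It drops the
three-byte UTF-8 BOM only when it appears at the very start of the input.

```ruby
# Hypothetical helper; NOT taken from lib/csv.rb.
UTF8_BOM = "\xEF\xBB\xBF".b   # the UTF-8 BOM as raw bytes

# Return a copy of data without a leading UTF-8 BOM.
# A BOM sequence anywhere else in the data is left untouched.
def strip_utf8_bom(data)
  bytes = data.b   # binary copy, so the comparison is byte-wise
  if bytes.start_with?(UTF8_BOM)
    bytes = bytes.byteslice(3, bytes.bytesize - 3)
  end
  bytes.force_encoding(data.encoding)
end
```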
April 13, 2002 - version 1.1.1
Added a parameter called colSep to specify the column separator. Supports
TSV, S(semicolon)SV and so on, in addition to CSV.
February 28, 2002 - version 1.1.0
Added high-level, user-friendly interfaces which use iterators. See the
comments in lib/csv.rb for how the new interfaces work. The old native
interfaces are still available.
October 3, 2001 - version 1.0.2
Fixed a bug in ColData.colsMatch. It returned true when the data strings
were the same even if either data was null.
The number of columns returned from createLine and parseLine was incorrect
in boundary conditions.
parseLine always returned empty data as the last column.
Thanks to Shigitani-san.
September 12, 2001 - version 1.0.1
"foo,[EOF]" was incorrectly parsed as [ "foo", "" ]. It should be parsed as
[ "foo", nil ]. Fixed. Thanks to keiichi matsunaga-san.
September 3, 2000 - version 1.0.0
Official release.
July 31, 1999 - version 0.1
Initial version.