Comma-separated Values

The comma-separated values (CSV) file format is a tabular data format in which fields are separated by the comma character and quoted with the double quote character. A double quote character inside a field's value is escaped with a pair of double quote characters. The CSV file format does not require a specific character encoding, byte order or line terminator format. Most software does not require fields to be quoted unless they contain a comma character.

Formal Specifications

While no formal specification for CSV exists, there are several informal documents describing the format (1, 2, 3 and 4).

MIME Type

Several informal MIME types are used for CSV, including "application/csv", "text/csv" and "text/x-csv". The IETF draft seeking formal registration of the "text/csv" type has since been published as RFC 4180.

Example

"Chicane", "Love on the Run", "Knight Rider", "This field contains a comma, but it doesn't matter as the field is quoted"
"Samuel Barber", "Adagio for Strings", "Classical", "This field contains a double quote character, "", but it doesn't matter as it is escaped"

Application support

The CSV file format is a very simple data file format that is supported by almost all spreadsheet software, such as Excel (note that some localized versions use a semicolon instead of a comma) and Gnumeric. Any programming language that has input/output and string processing functionality can read and write CSV files. CSV is as ubiquitous for tabular data as ASCII files are for text data.

Programming language tools

C/C++

Michael Allen's CSV module is small, complete, and robust.

Perl

With DBI

CSV files can be accessed via SQL statements through DBI using a driver such as DBD::CSV or DBD::AnyData.

With Regular Expressions

CSV files can be manipulated using Perl's built-in text processing capabilities. For simple data in which no quoted field contains a comma, the following code converts comma-delimited data into colon-delimited data.
perl -ne 'print join q(:),(split /,/,$_)' < input.csv > output.csv

Java

Direct interface

CSVReader/Writer provides a simple Java interface to CSV file I/O and is free. The Java CSV Library is an open-source (LGPL) library currently in beta. Stephen Ostermiller has released a library (http://ostermiller.org/utils/ExcelCSV.html) under the GPL to read and write CSV for Excel. This CSV class is small, complete, and has been widely used in production environments. Ricebridge Java CSV Component is a commercial CSV interface for high-speed, high-volume data handling.

JDBC interface

CsvJdbc is a read-only JDBC driver released under the LGPL. StelsCSV is a commercial JDBC driver for CSV file databases. It supports much of SQL-92. FOSITEX by i-net software also includes a CSV JDBC driver (http://www.inetsoftware.de/English/produkte/FOSITEX/default.htm). On Microsoft Windows one can access a CSV file through SQL using ODBC. See Using CSV Files as Databases and Interacting with Them Using Java.

Python

Python has included a csv module in its standard library since version 2.3.

Utilities

The csvprint utility reformats CSV input based on a format string. This can be useful for reordering fields or for generating source code or tables, as illustrated in the following example:

$ csvprint data.csv "\t{ %0, %1, %2, \"%3\" },\n"
{ 0xC0000008, 0x00060001, NT_STATUS_INVALID_HANDLE, "The handle is invalid." },
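
The quoting and escaping rules described at the top of this page can be sketched with the csv module from Python's standard library. This is only an illustration, assuming Python 3; the helper names round_trip and naive_split and the sample record are hypothetical and do not come from any of the tools listed above. It writes one record with every field quoted, reads it back, and compares the result with a plain split on commas, which is the approach taken by the split-based one-liner in the Perl section.

```python
import csv
import io


def round_trip(row):
    """Write one record as CSV text, then parse it back (hypothetical helper)."""
    buffer = io.StringIO()
    # QUOTE_ALL quotes every field; an embedded double quote is doubled.
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(row)
    text = buffer.getvalue()
    # The default reader dialect understands quoted fields and doubled quotes.
    parsed = next(csv.reader(io.StringIO(text)))
    return text, parsed


def naive_split(line):
    """Split on every comma, ignoring quoting (hypothetical helper)."""
    return line.rstrip("\n").split(",")


row = ["Samuel Barber", "Adagio for Strings", 'a comma, and a quote "']
text, parsed = round_trip(row)

print(text, end="")            # the record as it would appear in a CSV file
print(parsed == row)           # True: quoting and escaping survive the round trip
print(len(naive_split(text)))  # 4: the quoted comma is wrongly treated as a separator
```

The plain split returns four pieces for a three-field record, which is why splitting on commas is only safe for data that is known to contain no quoted commas.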