What is CSV?
CSV (Comma-Separated Values) is a plain text file format that stores tabular data using commas to separate values and newlines to separate records. It is one of the most common formats for data exchange between applications, especially spreadsheets and databases.
Quick Facts
| Full Name | Comma-Separated Values |
|---|---|
| Created | Early 1970s (predates personal computers) |
| Specification | RFC 4180 (informational) |
How It Works
CSV files represent data in a simple, human-readable form: each line is a record, and each record consists of fields separated by commas. The format looks straightforward, but implementations differ in how they handle special characters, quoting, and alternative delimiters. Fields that contain commas, newlines, or double quotes are typically enclosed in double quotes. CSV is widely supported by spreadsheet applications such as Excel and Google Sheets, by databases, and by most programming languages, which makes it well suited to data import and export.
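The quoting rules described above can be illustrated with Python's standard `csv` module. This is a minimal sketch, not a reference parser; the sample text and its column names are made up for illustration.

```python
import csv
import io

# Hypothetical sample: the second record has a field containing a comma,
# a field containing a doubled quote, and a field with an embedded newline.
raw = (
    'name,quote,address\n'
    '"Smith, John","He said ""hi""","12 Main St\nSpringfield"\n'
)

# newline="" leaves line endings untouched so the csv module can
# handle newlines that appear inside quoted fields.
rows = list(csv.reader(io.StringIO(raw, newline="")))

for row in rows:
    print(row)
```

Although the raw text spans three physical lines, the reader yields two records, because the newline inside the quoted address belongs to the field.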
Key Characteristics
- Plain text format readable by humans and machines
- Uses commas as the default field delimiter
- Each line represents one data record
- First row often contains column headers
- Fields with special characters are enclosed in double quotes
- No single binding standard: RFC 4180 documents a common format, but implementations vary
Common Use Cases
- Exporting data from spreadsheets and databases
- Data exchange between different applications
- Importing bulk data into systems
- Simple data storage for small datasets
- Log file formats and data analysis
Example
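As a small illustration, the sketch below embeds a made-up CSV sample (a header row plus three records) and reads it with Python's standard `csv.DictReader`. The column names and values are assumptions chosen only for demonstration.

```python
import csv
import io

# Made-up sample data: a header row followed by three records.
sample = """id,name,city
1,Alice,Berlin
2,Bob,"Paris, France"
3,Carol,Tokyo
"""

# DictReader uses the first row as keys for every following record.
records = list(csv.DictReader(io.StringIO(sample)))

for rec in records:
    print(rec["id"], rec["name"], rec["city"])
```

Note that every value comes back as a string, including the `id` column; any type conversion is left to the caller.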
Frequently Asked Questions
How do I handle commas within CSV field values?
When a field value contains a comma, you should enclose the entire field in double quotes. For example: "Smith, John" would be a valid field. Most CSV parsers automatically handle this, but when creating CSV manually, always quote fields containing commas, newlines, or double quotes.
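When generating CSV from code, a CSV library can usually apply the quoting for you. The following is a minimal sketch using Python's standard `csv.writer` with its default settings; the sample values are made up.

```python
import csv
import io

buffer = io.StringIO()

# The default dialect quotes a field only when it needs it (QUOTE_MINIMAL).
writer = csv.writer(buffer)
writer.writerow(["name", "note"])
writer.writerow(["Smith, John", 'said "ok"'])   # comma and quotes force quoting
writer.writerow(["Plain", "no quoting needed"])  # written as-is

output = buffer.getvalue()
print(output)
```

The field with a comma is wrapped in double quotes, and the embedded quotes in the second field are doubled, while the plain row is left unquoted.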
What is the difference between CSV and TSV formats?
CSV (Comma-Separated Values) uses commas as delimiters between fields, while TSV (Tab-Separated Values) uses tab characters. TSV can be advantageous when your data frequently contains commas, as it reduces the need for quoting. Both formats are widely supported by spreadsheet applications and databases.
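To show how the choice of delimiter affects quoting, here is a small sketch that converts CSV text to TSV with Python's standard `csv` module. The sample data is invented, and `lineterminator="\n"` is chosen here only to keep the output easy to read.

```python
import csv
import io

# Made-up CSV input whose fields contain commas, so they must be quoted.
csv_text = 'name,city\n"Smith, John","Paris, France"\n'

tsv_buffer = io.StringIO()
tsv_writer = csv.writer(tsv_buffer, delimiter="\t", lineterminator="\n")

# Re-emit every parsed row using a tab as the delimiter.
for row in csv.reader(io.StringIO(csv_text)):
    tsv_writer.writerow(row)

tsv_text = tsv_buffer.getvalue()
print(tsv_text)
```

In the TSV output the same fields no longer need quotes, because a comma is ordinary data when the delimiter is a tab.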
How should I handle special characters and encoding in CSV files?
CSV files should use UTF-8 encoding for international character support. When opening a CSV file in Excel, you may need to specify the encoding explicitly. To include a double quote inside a quoted field, escape it by doubling it ("" represents one literal double-quote character). Always specify the encoding when reading or writing CSV programmatically.
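A minimal sketch of specifying the encoding explicitly in Python follows. The file name and sample rows are hypothetical, and the use of `utf-8-sig` (UTF-8 with a byte order mark) is an assumption made for spreadsheet compatibility; plain `utf-8` is the usual choice when the file is consumed only by other programs.

```python
import csv
import os
import tempfile

# Made-up rows with non-ASCII characters and an embedded double quote.
rows = [["name", "city"], ["Zoë", "München"], ["José", 'the "old" town']]

# Hypothetical file name inside a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "people.csv")

# newline="" lets the csv module control line endings itself.
# "utf-8-sig" writes a byte order mark, which can help some Excel
# versions detect UTF-8; plain "utf-8" omits it.
with open(path, "w", encoding="utf-8-sig", newline="") as f:
    csv.writer(f).writerows(rows)

# Read the file back with the same explicit encoding.
with open(path, "r", encoding="utf-8-sig", newline="") as f:
    read_back = list(csv.reader(f))

print(read_back)
```

Reading with `utf-8-sig` strips the byte order mark again, so the first header name comes back unchanged.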
Why does my CSV file look different in Excel versus a text editor?
Excel interprets CSV data and lays it out in cells, while a text editor shows the raw comma-separated text. Excel may also auto-format values (for example, reinterpreting date-like text or stripping leading zeros from numbers), which can cause data loss. To preserve data integrity, use the 'Import Data' feature in Excel instead of double-clicking the file.
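The leading-zero problem is easy to reproduce outside a spreadsheet. The sketch below, using made-up postal codes, shows that Python's `csv` module keeps every value as text, and that the zeros disappear only once the values are converted to numbers, which is comparable to what automatic formatting can do.

```python
import csv
import io

# Made-up data: postal codes with leading zeros.
data = "zip,order_date\n01234,2024-03-04\n00501,2024-12-01\n"

rows = list(csv.DictReader(io.StringIO(data)))

# The csv module returns strings, so the leading zeros are preserved.
zips = [row["zip"] for row in rows]

# A naive numeric conversion drops them.
as_numbers = [int(z) for z in zips]

print(zips)
print(as_numbers)
```

Keeping identifier-like columns as text until a conversion is actually needed avoids this kind of silent change.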
Should CSV files include a header row?
While not technically required, including a header row is strongly recommended as a best practice. Headers make the data self-documenting and easier to understand, and most data processing tools expect them. The header row should contain unique, descriptive column names without special characters.
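One way to enforce these recommendations is a small pre-flight check on the header row. The helper below, `check_header`, is a hypothetical function written for this illustration (it is not part of any library); it flags a missing header, duplicate column names, and empty column names.

```python
import csv
import io
from collections import Counter


def check_header(csv_text):
    """Return a list of problems found in the header row (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty list when the input has no rows

    problems = []
    if not header:
        problems.append("missing header row")

    # Column names should be unique.
    for name, count in Counter(header).items():
        if count > 1:
            problems.append(f"duplicate column name: {name!r}")

    # Column names should not be blank.
    for name in header:
        if not name.strip():
            problems.append("empty column name")

    return problems


print(check_header("id,name,name\n1,a,b\n"))
print(check_header("id,name\n1,a\n"))
```

The check cannot tell whether the first row is really a header or just data; it only validates the row under the assumption that a header is present.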