
Converting CSV to JSON: The Edge Cases That Break Naive Implementations
Converting CSV to JSON seems like a trivial task. Split by newlines, split by commas, use the first row as keys. Ten lines of code. Ship it.

Then you encounter a value with a comma inside quotes, a field with an embedded newline, a header with spaces, or a file with inconsistent quoting. Your ten-line solution breaks, and you learn why proper CSV parsing is a solved but non-trivial problem.

The CSV specification

RFC 4180 defines the CSV format. The rules that trip people up:

Fields containing commas must be quoted: "New York, NY" is one field, not two.
Fields containing double quotes must be escaped: The value He said "hello" is encoded as "He said ""hello""". Double quotes inside a quoted field are escaped by doubling them.
Fields containing newlines must be quoted: A single field can span multiple lines if it is quoted. This breaks every implementation that splits on newlines first.
The header row is optional: Some CSV files have headers, some do not. There is no reliable way to
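
As a rough illustration of the quoting rules above, here is a minimal sketch that hands parsing to Python's standard csv module instead of splitting strings by hand. The function name csv_to_json and the sample data are made up for this example, and the sketch assumes the input has a header row, which real files may not.

```python
import csv
import io
import json


def csv_to_json(text: str) -> str:
    """Convert CSV text to a JSON array of objects (hypothetical helper).

    Assumes the first record is a header row; that is an assumption of
    this sketch, not something the CSV format guarantees.
    """
    # newline="" leaves line endings untouched so the csv module can
    # decide which newlines end a record and which sit inside quotes.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return json.dumps(list(reader), indent=2)


# Sample input (invented for illustration) covering three rules:
# a quoted comma, doubled double quotes, and a newline inside a field.
sample = (
    "name,city,quote\n"
    'Ann,"New York, NY","He said ""hello"""\n'
    'Bob,Boston,"line one\nline two"\n'
)

# The naive approach splits the quoted city into two pieces.
naive_fields = 'Ann,"New York, NY"'.split(",")

rows = json.loads(csv_to_json(sample))
print(json.dumps(rows, indent=2))
```

The naive split yields three pieces for a line that holds two fields, while the csv-based version keeps "New York, NY" together, unescapes the doubled quotes, and treats the two-line quote as a single value.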




