Quotation marks are messing up my CSV

posted Feb 1, 2012, 11:20 AM by Ezra Kenigsberg   [ updated Feb 7, 2012, 2:15 PM ]

#DataTipOfTheDay 2012-02-01

Q: "I exported a CSV from System X and quotation marks in text fields are messing everything up! What do I do?"

A: I would first see if the data can be exported differently. Ideally, it should be exported so that two double-quotes appear every time a double-quote appears within a field's value.

For example, if my table looks like this:

Year    Quotation
1776    Jefferson wrote "life, liberty and the pursuit of happiness."

My CSV export ideally looks like this:

"Year","Quotation"
"1776","Jefferson wrote ""life, liberty and the pursuit of happiness."""

That’s the standard format Microsoft apps (and Salesforce, and Data Loader) use with CSVs.

If the data can’t be exported like that, then the next-best answer would be one of a couple of hacky workarounds, depending on what’s possible:
  • (1) Export the data in a format other than a CSV, then (2) import it into another app (like MSAccess), and finally (3) export the table from MSAccess as a CSV.
  • Replace every double-quote with a curly double-quote before exporting.
  • Replace every double-quote with a single-quote before exporting.
  • Determine how many rows are messed up, and if it’s a small number (say, under a hundred), remove them from the main file, fix them by hand and import them separately.
Comments