leading BOM and quote character in CSV export

jplorenti
edited February 2010 in TableTools
Hi,

I'm seeing some issues with the CSV export related to the leading BOM and the use of single quotes rather than double quotes. When I open the CSV in gnumeric, I get both the BOM in the first field and the trailing single quote of each field in each cell of data. Looking around on the internet, it seems that double quotes are more commonly used for enclosing CSV fields (http://en.wikipedia.org/wiki/Comma-separated_values) and that including the BOM in UTF-8 data is not recommended (http://en.wikipedia.org/wiki/Byte_order_mark#cite_note-2).
I'm not sure if the decisions to include the BOM and use single quotes were due to problems experienced by others, but perhaps you'd consider making these aspects of the export configurable?

Thanks,

John Paul
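
To illustrate the two symptoms described above, here is a minimal JavaScript sketch of what a CSV consumer might see. The helper names stripBom and splitCsvLine are hypothetical (they are not part of TableTools or gnumeric), and the parser is deliberately naive: it assumes, in the spirit of RFC 4180, that only the double quote acts as a quoting character.

```javascript
// Once the file's bytes are decoded as UTF-8, the BOM shows up as U+FEFF.
const BOM = '\uFEFF';

// Hypothetical helper: drop a leading BOM if one is present.
function stripBom(text) {
  return text.startsWith(BOM) ? text.slice(1) : text;
}

// Hypothetical helper: split one CSV line into fields.
// Assumption: only the double quote is a quoting character, and a
// doubled double quote inside a quoted field is an escaped quote.
function splitCsvLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"') {
        if (line[i + 1] === '"') {
          current += '"'; // escaped quote inside a quoted field
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ',') {
      fields.push(current);
      current = '';
    } else {
      current += ch; // single quotes land here and stay in the data
    }
  }
  fields.push(current);
  return fields;
}
```

With this sketch, a line such as `'a','b'` preceded by a BOM keeps both the BOM (in the first field) and the single quotes as literal data unless stripBom is applied first, whereas `"a","b"` is unwrapped as expected.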

Replies

  • allan (Site admin)
    Hi John,

    The CSV boundary character is already configurable using: TableToolsInit.sCsvBoundary (set this just before you initialise your DataTable).

    Regarding the UTF-8 BOM: yes, perhaps I was being a bit too "correct" when I included it, although since including it I've not had a single report about UTF-8 script characters going haywire, which I frequently got when it wasn't there. I agree that it would be good to make this a configurable option (added to the to-do list). Unfortunately my trial version of Flash has now expired, so I can't make this change just now... :-(. I'd had thoughts about trying to compile the SWF needed using Flex's mxmlc (which is free), but I've not really got around to hacking around with that yet. I wish Adobe had an open-source friendly discount...

    Regards,
    Allan
  • djnoha
    Greetings,

    Just adding my two cents here: if and when the Flash gets tweaked, making the UTF-8 BOM optional would be a good idea. But I'd recommend even more strongly that the default field boundary should be changed to double-quote (though it is trivial to reconfigure, I know). I've seen thousands of CSV files in my day - including a 5-year stint in the e-commerce/DM industry, where CSV files abound - and the TableTools output is the first time I've ever seen a single quote used instead of double-quote. Excel and gnumeric don't consider a single-quote as a quoting character in CSV, but they both do the right thing with double-quotes. There is a (late to the game) RFC for CSV that I think reflects the real-world usage well: http://tools.ietf.org/html/rfc4180

    All that aside... DataTables is really awesome, and TableTools is a quite useful extension as-is.

    Thanks!

    David Noha
    Sonoma Technology, Inc.
  • allan (Site admin)
    Hi David,

    Thanks very much for the feedback. The post you've commented on is quite an old one and out-of-date now - TableTools 2 uses the Flex compiler to generate the SWFs, so it's no problem to tweak them now. However, a change to the SWF isn't needed for the BOM in TableTools 2 - it can be controlled with the bBomInc ( http://datatables.net/extras/tabletools/button_options#bBomInc ) parameter.

    As for the CSV boundary - I've just changed the default and committed it in. The TableTools nightly carries this fix now ( http://datatables.net/download/ ) and it will be included in the next release.

    Regards,
    Allan
  • djnoha
    Thanks for the response Allan! I will check out that bBomInc parameter. It hasn't been a problem yet, but we are working on a cross-platform project and I suspect it may pop up as a compatibility issue in the future.
    (And sorry for adding a comment to an old post - I generally try not to introduce new topics in web forums if there's a non-ancient post on the same subject... but for DataTables, maybe Feb 2010 should be considered ancient. :)
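
The two settings discussed in this thread, the field boundary and the optional BOM, can be sketched in plain JavaScript as follows. This is not the TableTools implementation; toCsv, quoteField, and the option names boundary and includeBom are hypothetical stand-ins that only mirror the roles of sCsvBoundary and bBomInc. The sketch assumes the double quote as the default boundary and doubles any boundary character found inside a value, which is the escaping convention RFC 4180 describes for double quotes.

```javascript
// Hypothetical helper: wrap one value in the boundary character,
// doubling any occurrence of the boundary inside the value.
function quoteField(value, boundary) {
  const text = String(value);
  return boundary + text.split(boundary).join(boundary + boundary) + boundary;
}

// Hypothetical helper: serialise an array of rows to CSV text.
// options.boundary   - quoting character (assumed default: double quote)
// options.includeBom - when true, prefix the output with U+FEFF
function toCsv(rows, options) {
  const boundary =
    options && options.boundary !== undefined ? options.boundary : '"';
  const includeBom = Boolean(options && options.includeBom);
  const body = rows
    .map((row) => row.map((v) => quoteField(v, boundary)).join(','))
    .join('\r\n'); // RFC 4180 uses CRLF between records
  return (includeBom ? '\uFEFF' : '') + body;
}
```

For example, `toCsv([['a', 'b']], { includeBom: true })` would emit the BOM followed by `"a","b"`, while passing `{ boundary: "'" }` reproduces the single-quoted output that prompted this thread.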