1.9 CSV Parsing Issues

 
 
Kenny Lam
      11-04-2010
I'm currently porting a script to 1.9 and I'm having problems getting
CSV parsing to work. This script worked fine in 1.8.7 and used the
FasterCSV library for parsing. After playing around in IRB, I have
determined that the current parser seems incapable of handling newlines
as row separators (a rather basic and important feature).

I tested with a simple file whose contents are:
field1,field2
field3,field4

This file was created using a basic text editor and does not contain any
unorthodox newline characters. Attempting to parse this file results in
the following error:

C:/Ruby192/lib/ruby/1.9.1/csv.rb:1885:in `block (2 levels) in shift':
Unquoted fields do not allow \r or \n (line 1). (CSV::MalformedCSVError)
from C:/Ruby192/lib/ruby/1.9.1/csv.rb:1856:in `each'
from C:/Ruby192/lib/ruby/1.9.1/csv.rb:1856:in `block in shift'
from C:/Ruby192/lib/ruby/1.9.1/csv.rb:1818:in `loop'
from C:/Ruby192/lib/ruby/1.9.1/csv.rb:1818:in `shift'
from C:/Ruby192/lib/ruby/1.9.1/csv.rb:1760:in `each'

The return value of the opened csv file shows row_sep to be "\r\n" which
seems correct. I have tried manually setting the value of row_sep when
calling CSV::open but I get the same issue.

Once again, I do not have this problem with FasterCSV under 1.8.7 (which
as I understand, is the same code used in 1.9's csv library). I'm using
Ruby 1.9.2p0 on Windows XP. I would greatly appreciate any help.

--
Posted via http://www.ruby-forum.com/.

 
James Edward Gray II
      11-04-2010
On Nov 4, 2010, at 1:40 PM, Kenny Lam wrote:

> I'm currently porting a script to 1.9 and I'm having problems getting
> CSV parsing to work.


> I tested with a simple file whose contents are:
> field1,field2
> field3,field4


CSV should definitely handle that data. Indeed it does for me:

$ ruby -v -r csv -e 'p CSV.parse("field1,field2\r\nfield3,field4\r\n")'
ruby 1.9.2dev (2010-04-28 trunk 27536) [x86_64-darwin10.3.0]
[["field1", "field2"], ["field3", "field4"]]

> This file was created using a basic text editor and does not contain any
> unorthodox newline characters.


Can we see exactly what the file does contain, with code like:

$ ruby -e 'p File.read("path/to/file.csv")'

?

James Edward Gray II


 
Kenny Lam
      11-04-2010
File.read shows "field1,field2\nfield3,field4\n"
I have played around with some of the other methods and have
determined that this problem only seems to occur when using CSV::open
and then looping through with CSV::each. CSV::foreach and CSV::parse
seem fine. Unfortunately, I need to use CSV::open because I need a
reference to the opened file object in order to do some file cursor
manipulation.

Another thing I have noted is that running CSV.open('file','r') shows
the result:
<#CSV io_type:File io_path:"/log/test.log" encoding:CP850 lineno:0
col_sep:"," row_sep:"\r\n" quote_char:"\"">

While CSV.open('test.log','r',:row_sep => '\r\n') shows result:
<#CSV io_type:File io_path:"/log/test.log" encoding:CP850 lineno:0
col_sep:"," row_sep:"\\r\\n" quote_char:"\"">

The double backslashes make me question if the escape character is being
processed correctly. I am relatively new to Ruby, am I using the
language incorrectly or is this a bug?

James Edward Gray II
      11-04-2010
On Nov 4, 2010, at 2:26 PM, Kenny Lam wrote:

> File.read shows "field1,field2\nfield3,field4\n"


Great. That's what we expected to see. You are right about the content.

> I have played around with some of the other methods and have
> determined that this problem only seems to occur when using CSV::open
> and then looping through with CSV::each. CSV::foreach and CSV::parse
> seem fine.


Ah, and let me guess, you always pass a read mode of 'r' to open(),
right? CSV is clever and it shuts off Ruby's line ending translation on
Windows using 'rb' if you don't specify a mode. By specifying a mode, you
leave that translation on, which allows Ruby to switch \r\n to \n as it
did with the read above.
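
To make the difference concrete, here is a minimal sketch that is not taken from the original script. The sample file and the names path, rows, and rows_rb are hypothetical. The line ending translation described above only happens on Windows, so on other platforms both calls simply behave the same.

```ruby
require 'csv'
require 'tmpdir'

# Hypothetical sample file with Windows (CRLF) line endings, written in
# binary mode so the bytes land on disk exactly as given.
path = File.join(Dir.mktmpdir, 'sample.csv')
File.open(path, 'wb') { |f| f.write("field1,field2\r\nfield3,field4\r\n") }

# No mode argument: CSV chooses the read mode itself (binary, per the
# explanation above), so the parser sees the line endings as stored.
rows = []
CSV.open(path) do |csv|
  csv.each { |row| rows << row }
end
p rows

# If a mode has to be given, asking for binary explicitly also keeps
# Ruby from translating line endings.
rows_rb = CSV.open(path, 'rb') { |csv| csv.read }
p rows_rb
```

The call to avoid in this situation is CSV.open(path, 'r'), since on Windows the plain 'r' mode is the one that turns the translation back on.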

> Unfortunately, I need to use CSV::open because I need a
> reference to the opened file object in order to do some file cursor
> manipulation.


No worries, open() is going to work for you.

> Another thing I have noted is that running CSV.open('file','r') shows
> the result:
> <#CSV io_type:File io_path:"/log/test.log" encoding:CP850 lineno:0
> col_sep:"," row_sep:"\r\n" quote_char:"\"">
>
> While CSV.open('test.log','r',:row_sep => '\r\n') shows result:
> <#CSV io_type:File io_path:"/log/test.log" encoding:CP850 lineno:0
> col_sep:"," row_sep:"\\r\\n" quote_char:"\"">
>
> The double backslashes make me question if the escape character is being
> processed correctly. I am relatively new to Ruby, am I using the
> language incorrectly or is this a bug?


You have a misunderstanding of Ruby Strings. Double quotes allow for
escapes like \r or \n, but single quotes do not. You've set the
:row_sep to literally backslash, r, backslash, and n.
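
A quick sketch of the quoting difference, using only core String methods plus CSV.parse. The variable names are made up for illustration.

```ruby
require 'csv'

double = "\r\n"  # escapes are processed: carriage return + line feed
single = '\r\n'  # no escape processing: backslash, r, backslash, n

p double.length       # 2 characters
p single.length       # 4 characters
p single == "\\r\\n"  # true, which is why inspect shows doubled backslashes

# A double-quoted separator gives CSV a real newline to split on.
parsed = CSV.parse("field1,field2\nfield3,field4\n", :row_sep => "\n")
p parsed
```

So if :row_sep ever does need to be set by hand, it should be written with double quotes.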

I imagine all you need to do is switch your open() call to:

CSV.open('path/to/file')

The library should take it from there.

Hope that helps.

James Edward Gray II

 
Kenny Lam
      11-04-2010
Excellent, that works perfectly. Thanks a lot for your help.

James Edward Gray II
      11-04-2010
On Nov 4, 2010, at 2:52 PM, Kenny Lam wrote:

> Excellent, that works perfectly. Thanks a lot for your help.


My pleasure.

James Edward Gray II

 