Velocity Reviews


Alex van der Spek 08-31-2011 07:37 PM

Text file with mixed end-of-line terminations
 
I have a text file that uses both '\r' and '\r\n' end-of-line terminations.

The '\r' terminates the first 25 lines or so, the remainder is terminated
with '\r\n'

Reading this file like this:

++++++++
for line in open(filename, 'r'):
    pass  # Do whatever needs doing with line...
++++++++

The first line read is actually a string consisting of the first 25 lines.
The readline() method does the same thing.

Is there a way to make it read one line at a time, regardless of the line
termination?

By the way, the newlines attribute reports None after reading a few lines. I
tried on Linux and Windows. I use the standard binaries as distributed.

Thanks in advance,
Alex van der Spek



Chris Rebert 08-31-2011 07:58 PM

Re: Text file with mixed end-of-line terminations
 
On Wed, Aug 31, 2011 at 12:37 PM, Alex van der Spek <zdoor@xs4all.nl> wrote:
> I have a text file that uses both '\r' and '\r\n' end-of-line terminations.
>
> The '\r' terminates the first 25 lines or so, the remainder is terminated
> with '\r\n'

<snip>
> Is there a way to make it read one line at a time, regardless of the line
> termination?


Universal Newline Support
http://www.python.org/dev/peps/pep-0278/

http://docs.python.org/library/functions.html#open
(Modes involving "U")
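
As a rough sketch of what universal newlines look like in practice (not taken from the PEP): the helper name read_lines_universal and the sample data below are made up for illustration. It uses io.open, which is available in Python 2.6+ and is the same function as the built-in open in Python 3; its newline=None default translates '\r', '\n' and '\r\n' to '\n' on read. With the Python 2 built-in open, the equivalent is a mode containing "U", such as 'rU'.

```python
import io
import os
import tempfile


def read_lines_universal(path):
    # newline=None turns on universal newlines: CR, LF and CRLF are all
    # translated to LF before each line is returned.
    with io.open(path, 'r', newline=None) as f:
        return [line.rstrip('\n') for line in f]


# Write a small sample file with mixed terminators (illustrative data only).
fd, sample_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b"abc\rdef\rghi\r\njkl\r\n")

lines = read_lines_universal(sample_path)
os.remove(sample_path)
print(lines)
```

Iterating over the file object this way yields one line at a time, so it should also work for files too large to read in one go.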

Cheers,
Chris

woooee 09-01-2011 07:17 PM

Re: Text file with mixed end-of-line terminations
 
You can use f.read() to read the entire file's contents into a string,
provided the file isn't huge. Then split on "\r" and strip the leftover
"\n" from any record that has one.
A simple test:
input_data = "abc\rdef\rghi\r\njkl\r\nmno\r\n"
first_split = input_data.split("\r")
for rec in first_split:
    rec = rec.replace("\n", "")
    print(rec)
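
A possibly simpler variant of the same read-everything idea, sketched here as an alternative rather than as the approach above: str.splitlines() already treats '\r', '\n' and '\r\n' each as a single line boundary. The function name split_mixed_lines is hypothetical. One caveat: on Python 3 strings, splitlines() also breaks on a few other characters (form feed, for instance), so it assumes the data contains none of those.

```python
def split_mixed_lines(text):
    # splitlines() drops the terminators and counts '\r\n' as one boundary,
    # so no second pass to strip '\n' is needed.
    return text.splitlines()


input_data = "abc\rdef\rghi\r\njkl\r\nmno\r\n"
records = split_mixed_lines(input_data)
print(records)  # ['abc', 'def', 'ghi', 'jkl', 'mno']
```

Unlike splitting on "\r" by hand, this does not leave an empty record at the end when the text finishes with a terminator.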

