Velocity Reviews - Computer Hardware Reviews

Fixed length lists from .split()?

 
 
Bob Greschke
      01-26-2007
I'm reading a file that has lines like

bcsn; 1000000; 1223
bcsn; 1000001; 1456
bcsn; 1000003
bcsn; 1000010; 4567

The problem is the line with only the one semi-colon.
Is there a fancy way to get Parts=Line.split(";") to make Parts always
have three items in it, or do I just have to check the length of Parts
and loop to add the required missing items (this one would just take
Parts+=[""], but there are other types of lines in the file that have
about 10 "fields" that also have this problem)?

Thanks!

Bob

 
Duncan Booth
      01-26-2007
Bob Greschke wrote:

> Is there a fancy way to get Parts=Line.split(";") to make Parts always
> have three items in it, or do I just have to check the length of Parts
> and loop to add the required missing items (this one would just take
> Parts+=[""], but there are other types of lines in the file that have
> about 10 "fields" that also have this problem)?


>>> def nsplit(s, sep, n):
...     return (s.split(sep) + [""]*n)[:n]
...
>>> nsplit("bcsn; 1000001; 1456", ";", 3)
['bcsn', ' 1000001', ' 1456']
>>> nsplit("bcsn; 1000001", ";", 3)
['bcsn', ' 1000001', '']
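Applied to the sample lines from the original question, the pad-and-slice helper might be used as follows. This is only a minimal sketch: the `sample_lines` list and the `.strip()` call on each field are assumptions added for illustration and are not part of the post above.

```python
def nsplit(s, sep, n):
    # Pad with n empty strings, then keep exactly the first n fields.
    return (s.split(sep) + [""] * n)[:n]

# Hypothetical stand-in for the lines read from the file.
sample_lines = [
    "bcsn; 1000000; 1223",
    "bcsn; 1000003",
]

# Strip the whitespace that follows each semicolon (an added convenience).
rows = [[field.strip() for field in nsplit(line, ";", 3)]
        for line in sample_lines]
print(rows)
```

Because of the final slice, a line with more than `n` fields is silently truncated, which may or may not be what a given file format calls for.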

 
Bob Greschke
      01-26-2007
On 2007-01-26 11:13:56 -0700, Duncan Booth said:

> Bob Greschke wrote:
>
>> Is there a fancy way to get Parts=Line.split(";") to make Parts always
>> have three items in it, or do I just have to check the length of Parts
>> and loop to add the required missing items (this one would just take
>> Parts+=[""], but there are other types of lines in the file that have
>> about 10 "fields" that also have this problem)?
>
> >>> def nsplit(s, sep, n):
> ...     return (s.split(sep) + [""]*n)[:n]
> ...
> >>> nsplit("bcsn; 1000001; 1456", ";", 3)
> ['bcsn', ' 1000001', ' 1456']
> >>> nsplit("bcsn; 1000001", ";", 3)
> ['bcsn', ' 1000001', '']


That's fancy enough. I didn't know you could do [""]*n. I never
thought about it before.

Thanks!

Bob

 
Dennis Lee Bieber
      01-27-2007
On Fri, 26 Jan 2007 11:26:46 -0700, Bob Greschke
declaimed the following in comp.lang.python:

>
> That's fancy enough. I didn't know you could do [""]*n. I never
> thought about it before.
>

My first thought was getting it from the other side...

>>> def nsplit(st, sp, n):
...     return (st + (sp*n)).split(sp)[:n]
...
>>> nsplit("this;is;a;sample", ";", 10)
['this', 'is', 'a', 'sample', '', '', '', '', '', '']

To the string to be split, append enough separators to ensure the
desired number of fields, perform the split, and return the desired
number of resultant parts.

Of course, if the string has more than "n" fields, it will only return
the leftmost "n" parts.

>>> nsplit("this;is;a;sample", ";", 4)
['this', 'is', 'a', 'sample']
>>> nsplit("this;is;a;sample", ";", 3)
['this', 'is', 'a']
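For a single-character separator, padding the string and padding the list should give the same result. The sketch below checks that on a few inputs; the names `nsplit_pad_list` and `nsplit_pad_string` are made up here to tell the two versions apart.

```python
def nsplit_pad_list(s, sep, n):
    # Split first, then pad the resulting list and slice to n items.
    return (s.split(sep) + [""] * n)[:n]

def nsplit_pad_string(s, sep, n):
    # Append n separators first, then split and slice to n items.
    return (s + sep * n).split(sep)[:n]

# Compare both versions on short, exact, and empty inputs.
for line in ["bcsn; 1000000; 1223", "bcsn; 1000003", ""]:
    for n in (1, 3, 10):
        assert nsplit_pad_list(line, ";", n) == nsplit_pad_string(line, ";", n)

# With a multi-character separator the two can disagree, because the
# appended separators may merge with the tail of the string.
print(nsplit_pad_list("xa", "aa", 2))
print(nsplit_pad_string("xa", "aa", 2))
```

The last two calls differ: the string ends in part of the separator, so the string-padding version splits it in a different place. For plain `";"`-delimited lines this does not come up.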

--
Wulfraed Dennis Lee Bieber KD6MOG

HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: web-)
HTTP://www.bestiaria.com/
 
bearophileHUGS@lycos.com
      01-27-2007
Duncan Booth:
> def nsplit(s, sep, n):
>     return (s.split(sep) + [""]*n)[:n]


Another version, longer:

from itertools import repeat

def nsplit(text, sep, n):
    """
    >>> nsplit("bcsn; 1000001; 1456", ";", 3)
    ['bcsn', ' 1000001', ' 1456']
    >>> nsplit("bcsn; 1000001", ";", 3)
    ['bcsn', ' 1000001', '']
    >>> nsplit("bcsn", ";", 3)
    ['bcsn', '', '']
    >>> nsplit("", ".", 4)
    ['', '', '', '']
    >>> nsplit("ab.ac.ad.ae", ".", 2)
    ['ab', 'ac', 'ad', 'ae']
    """
    result = text.split(sep)
    nparts = len(result)
    result.extend(repeat("", n-nparts))
    return result

if __name__ == "__main__":
    import doctest
    doctest.testmod()

Bye,
bearophile

 
Steven Bethard
      01-30-2007
On Jan 26, 11:07 am, Bob Greschke <b...@passcal.nmt.edu> wrote:
> I'm reading a file that has lines like
>
> bcsn; 1000000; 1223
> bcsn; 1000001; 1456
> bcsn; 1000003
> bcsn; 1000010; 4567
>
> The problem is the line with only the one semi-colon.
> Is there a fancy way to get Parts=Line.split(";") to make Parts always
> have three items in it


In Python 2.5 you can use the .partition() method which always returns
a three item tuple:

>>> text = '''\
...  bcsn; 1000000; 1223
...  bcsn; 1000001; 1456
...  bcsn; 1000003
...  bcsn; 1000010; 4567
... '''
>>> for line in text.splitlines():
...     bcsn, _, rest = line.partition(';')
...     num1, _, num2 = rest.partition(';')
...     print (bcsn, num1, num2)
...
(' bcsn', ' 1000000', ' 1223')
(' bcsn', ' 1000001', ' 1456')
(' bcsn', ' 1000003', '')
(' bcsn', ' 1000010', ' 4567')
>>> help(str.partition)
Help on method_descriptor:

partition(...)
    S.partition(sep) -> (head, sep, tail)

    Searches for the separator sep in S, and returns the part before it,
    the separator itself, and the part after it. If the separator is not
    found, returns S and two empty strings.
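The same idea can be stretched past three fields by calling `.partition()` in a loop. The helper below is a hypothetical sketch (the name `npartition` is not from the thread or the standard library), written for Python 3:

```python
def npartition(text, sep, n):
    # Peel off one field per iteration; once the separator runs out,
    # partition() keeps returning empty strings, which pads the result.
    fields = []
    rest = text
    for _ in range(n):
        head, _sep, rest = rest.partition(sep)
        fields.append(head)
    return fields

print(npartition("bcsn; 1000000; 1223", ";", 3))
print(npartition("bcsn; 1000003", ";", 3))
```

Like the slicing versions, this drops anything past the n-th field.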


STeVe

 
Bob Greschke
      02-01-2007
This idiom is what I ended up using (a lot, it turns out!):

Parts = Line.split(";")
Parts += (x-len(Parts))*[""]

where x is the number of parts the line should have. If the line already
has more parts than x (i.e. [""] gets multiplied by a negative number),
the product is an empty list, so nothing gets added, which is just fine
in this program's case.
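The negative-multiplier behaviour can be checked directly. In this small sketch the function name `pad_fields` is made up for illustration; the body is the idiom above.

```python
def pad_fields(parts, x):
    # [""] * k is an empty list whenever k <= 0, so a list that is
    # already long enough comes back unchanged (and untruncated).
    return parts + (x - len(parts)) * [""]

print([""] * -2)
print(pad_fields("bcsn; 1000003".split(";"), 3))
print(pad_fields("a;b;c;d".split(";"), 3))
```

Unlike the slicing versions earlier in the thread, this never shortens a line that has extra fields.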

Bob

 
George Sakkis
      02-02-2007
On Feb 1, 2:40 pm, Bob Greschke <b...@passcal.nmt.edu> wrote:

> This idiom is what I ended up using (a lot it turns out!):
>
> Parts = Line.split(";")
> Parts += (x-len(Parts))*[""]
>
> where x knows how long the line should be. If the line already has
> more parts than x (i.e. [""] gets multiplied by a negative number)
> nothing seems to happen which is just fine in this program's case.
>
> Bob


Here's a more generic padding one-liner:

from itertools import chain, repeat

def ipad(seq, minlen, fill=None):
    return chain(seq, repeat(fill, minlen-len(seq)))

>>> list(ipad('one;two;three;four'.split(";"), 7, ''))
['one', 'two', 'three', 'four', '', '', '']

>>> tuple(ipad(xrange(1,5), 7))
(1, 2, 3, 4, None, None, None)
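In Python 3, `xrange` becomes `range`. Since `ipad` calls `len(seq)`, it needs a sized sequence rather than an arbitrary iterator. The sketch below keeps `ipad` as posted and adds a hypothetical `ifixed` helper (not from the thread) that pads with an endless fill and then cuts at `n` items, so it also truncates and accepts any iterable:

```python
from itertools import chain, islice, repeat

def ipad(seq, minlen, fill=None):
    # Pad up to minlen; a negative repeat count yields nothing.
    return chain(seq, repeat(fill, minlen - len(seq)))

def ifixed(iterable, n, fill=None):
    # Hypothetical variant: endless padding, then keep exactly n items.
    return islice(chain(iterable, repeat(fill)), n)

print(list(ipad(range(1, 5), 7)))
print(list(ifixed(iter("one;two;three;four".split(";")), 2, "")))
print(list(ifixed(iter(["one"]), 3, "")))
```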


George

 