Splitting a string into substrings of equal size

 
 
candide
Guest
Posts: n/a
 
      08-15-2009
Suppose you need to split a string into substrings of a given size (except
possibly one shorter substring). I assume the slicing starts at the end of
the string, so any shorter substring comes first.
A typical example is formatting a decimal string with a thousands
separator.


What is the Pythonic way to do this?


For my part, I arrived at this rather complicated code:


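# ----------------------

def comaSep(z, k=3, sep=','):
    # Reverse the string so that the groups can be cut from the right.
    z = z[::-1]
    # Cut k-sized slices, un-reverse each one, then restore the group order.
    x = [z[k*i:k*(i+1)][::-1] for i in range(1 + (len(z)-1)//k)][::-1]
    return sep.join(x)

# Test
for z in ["75096042068045", "509", "12024", "7", "2009"]:
    print z + " -->", comaSep(z)

# ----------------------

which outputs: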

75096042068045 --> 75,096,042,068,045
509 --> 509
12024 --> 12,024
7 --> 7
2009 --> 2,009


Thanks
 
Gabriel Genellina
Guest
Posts: n/a
 
      08-15-2009
On Fri, 14 Aug 2009 21:22:57 -0300, candide <(E-Mail Removed)> wrote:

> Suppose you need to split a string into substrings of a given size
> (except possibly the last substring). I make the hypothesis the first
> slice is at the end of the string.
> A typical example is provided by formatting a decimal string with
> thousands separator.
>
>
> What is the pythonic way to do this ?


py> import locale
py> locale.setlocale(locale.LC_ALL, '')
'Spanish_Argentina.1252'
py> locale.format("%d", 75096042068045, True)
'75.096.042.068.045'
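
For plain integers (as opposed to arbitrary strings), later Python versions also offer a route that does not depend on the locale: the `,` option of the format mini-language, added by PEP 378 in Python 2.7 and 3.1. A minimal sketch, assuming Python 3; the helper name `group_int` is made up for illustration and is not part of any library:

```python
def group_int(n, sep=","):
    """Format an integer with a thousands separator (hypothetical helper)."""
    # format(n, ",") always groups by three digits using a comma;
    # swap the comma afterwards if another separator is wanted.
    return format(n, ",").replace(",", sep)


print(group_int(75096042068045))       # 75,096,042,068,045
print(group_int(75096042068045, "."))  # 75.096.042.068.045
```

This only covers numbers grouped by three; for other group sizes or non-numeric strings, a slicing approach is still needed.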



> For my part, i reach to this rather complicated code:


Mine isn't very simple either:

py> def genparts(z):
...     n = len(z)
...     i = n % 3
...     if i: yield z[:i]
...     for i in xrange(i, n, 3):
...         yield z[i:i+3]
...
py> ','.join(genparts("75096042068045"))
'75,096,042,068,045'

--
Gabriel Genellina

 
Jan Kaliszewski
Guest
Posts: n/a
 
      08-15-2009
15-08-2009 candide <(E-Mail Removed)> wrote:

> Suppose you need to split a string into substrings of a given size
> (except possibly the last substring). I make the hypothesis the first
> slice is at the end of the string.
> A typical example is provided by formatting a decimal string with
> thousands separator.


I'd use iterators, especially for longer strings...


import itertools

def separate(text, grouplen=3, sep=','):
    "separate('12345678') -> '123,456,78'"
    repeated_iterator = [iter(text)] * grouplen
    groups = itertools.izip_longest(fillvalue='', *repeated_iterator)
    strings = (''.join(group) for group in groups)  # gen. expr.
    return sep.join(strings)

def back_separate(text, grouplen=3, sep=','):
    "back_separate('12345678') -> '12,345,678'"
    repeated_iterator = [reversed(text)] * grouplen
    groups = itertools.izip_longest(fillvalue='', *repeated_iterator)
    strings = [''.join(reversed(group)) for group in groups]  # list compr.
    return sep.join(reversed(strings))

print separate('12345678')
print back_separate('12345678')

# alternate implementation
# (without "materializing" 'strings' as a list in back_separate):
def separate(text, grouplen=3, sep=','):
    "separate('12345678') -> '123,456,78'"
    textlen = len(text)
    end = textlen - (textlen % grouplen)
    repeated_iterator = [iter(itertools.islice(text, 0, end))] * grouplen
    strings = itertools.imap(lambda *chars: ''.join(chars),
                             *repeated_iterator)
    # Append the remainder only if there is one, to avoid a trailing sep.
    tail = (text[end:],) if end < textlen else ()
    return sep.join(itertools.chain(strings, tail))

def back_separate(text, grouplen=3, sep=','):
    "back_separate('12345678') -> '12,345,678'"
    beg = len(text) % grouplen
    repeated_iterator = [iter(itertools.islice(text, beg, None))] * grouplen
    strings = itertools.imap(lambda *chars: ''.join(chars),
                             *repeated_iterator)
    # Prepend the short leading group only if there is one, to avoid a
    # leading sep.
    head = (text[:beg],) if beg else ()
    return sep.join(itertools.chain(head, strings))

print separate('12345678')
print back_separate('12345678')


http://docs.python.org/library/itertools.html#recipes
was the inspiration for me (especially grouper).
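
For readers on Python 3, where `izip_longest` is named `itertools.zip_longest`, the grouper idea could be sketched roughly as follows. This is an illustrative port under that assumption, not the code from the post above:

```python
from itertools import zip_longest


def grouper(iterable, n, fillvalue=""):
    """Collect items into fixed-length tuples, padding the last one."""
    args = [iter(iterable)] * n  # n references to one shared iterator
    return zip_longest(*args, fillvalue=fillvalue)


def back_separate(text, grouplen=3, sep=","):
    """back_separate('12345678') -> '12,345,678'"""
    # Walk the string backwards so the short group ends up at the front,
    # then undo the reversal inside each group and across the groups.
    groups = ["".join(reversed(g)) for g in grouper(reversed(text), grouplen)]
    return sep.join(reversed(groups))


print(back_separate("12345678"))  # 12,345,678
```

The trick is that all `n` entries of `args` are the same iterator, so each tuple built by `zip_longest` consumes `n` consecutive characters.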

Cheers,
*j
--
Jan Kaliszewski (zuo) <(E-Mail Removed)>
 
 
Jan Kaliszewski
Guest
Posts: n/a
 
      08-15-2009
15-08-2009 Jan Kaliszewski <(E-Mail Removed)> wrote:

> 15-08-2009 candide <(E-Mail Removed)> wrote:
>
>> Suppose you need to split a string into substrings of a given size
>> (except possibly the last substring). I make the hypothesis the first
>> slice is at the end of the string.
>> A typical example is provided by formatting a decimal string with
>> thousands separator.
>
> I'd use iterators, especially for longer strings...
>
> import itertools

[snip]

Err... It's too late for coding... Now I see an obvious and simpler variant:

def separate(text, grouplen=3, sep=','):
    "separate('12345678') -> '123,456,78'"
    textlen = len(text)
    end = textlen - (textlen % grouplen)
    strings = (text[i:i+grouplen] for i in xrange(0, end, grouplen))
    # Append the remainder only if there is one, to avoid a trailing sep.
    tail = (text[end:],) if end < textlen else ()
    return sep.join(itertools.chain(strings, tail))

def back_separate(text, grouplen=3, sep=','):
    "back_separate('12345678') -> '12,345,678'"
    textlen = len(text)
    beg = textlen % grouplen
    strings = (text[i:i+grouplen] for i in xrange(beg, textlen, grouplen))
    # Prepend the short leading group only if there is one, to avoid a
    # leading sep.
    head = (text[:beg],) if beg else ()
    return sep.join(itertools.chain(head, strings))

print separate('12345678')
print back_separate('12345678')

--
Jan Kaliszewski (zuo) <(E-Mail Removed)>
 
 
Rascal
Guest
Posts: n/a
 
      08-15-2009
I'm only posting this because I'm bored, but here it is:

def add_commas(str):
    str_list = list(str)
    str_len = len(str)
    # Insert from the right so earlier insert positions stay valid.
    for i in range(3, str_len, 3):
        str_list.insert(str_len - i, ',')
    return ''.join(str_list)

 
 
candide
Guest
Posts: n/a
 
      08-15-2009
Thanks to all for your responses. I particularly appreciate Rascal's solution.
 
 
Jan Kaliszewski
Guest
Posts: n/a
 
      08-15-2009
On 15-08-2009 at 08:08:14 Rascal <(E-Mail Removed)> wrote:

> I'm bored for posting this, but here it is:
>
> def add_commas(str):
>     str_list = list(str)
>     str_len = len(str)
>     for i in range(3, str_len, 3):
>         str_list.insert(str_len - i, ',')
>     return ''.join(str_list)


For short strings (surely the most common case) it's OK: simple and clear.
But for huge ones it's better not to materialize an additional list for the
string; pure-iterator solutions (like Gabriel's or mine) would be better
then.
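
One way to check this on your own machine is a small timing comparison. Besides the extra list, `list.insert` has to shift every element after the insertion point, so many inserts into a long list add up. The following is only a sketch, assuming Python 3; the function names and the test input are made up for illustration, and the timings will vary between machines:

```python
import timeit
from itertools import chain


def add_commas_insert(s):
    """List-insert approach: copies s into a list of characters first."""
    chars = list(s)
    n = len(s)
    for i in range(3, n, 3):
        chars.insert(n - i, ",")  # shifts everything after position n - i
    return "".join(chars)


def add_commas_slices(s):
    """Slice-based approach: yields 3-character pieces lazily."""
    head = len(s) % 3
    first = [s[:head]] if head else []
    pieces = (s[i:i + 3] for i in range(head, len(s), 3))
    return ",".join(chain(first, pieces))


digits = "1234567890" * 2000  # hypothetical input, 20,000 characters
assert add_commas_insert(digits) == add_commas_slices(digits)

for func in (add_commas_insert, add_commas_slices):
    seconds = timeit.timeit(lambda: func(digits), number=5)
    print(func.__name__, round(seconds, 4))
```

For strings of a few dozen digits the difference is unlikely to matter.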

Cheers,
*j

--
Jan Kaliszewski (zuo) <(E-Mail Removed)>
 
 
Emile van Sebille
Guest
Posts: n/a
 
      08-15-2009
On 8/14/2009 5:22 PM candide said...
> Suppose you need to split a string into substrings of a given size (except
> possibly the last substring). I make the hypothesis the first slice is at the
> end of the string.
> A typical example is provided by formatting a decimal string with thousands
> separator.
>
>
> What is the pythonic way to do this ?


I like list comps...

>>> jj = '1234567890123456789'
>>> ",".join([jj[ii:ii+3] for ii in range(0,len(jj),3)])
'123,456,789,012,345,678,9'
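
Note that this groups from the left, so the short group lands at the end, whereas the original question wants it at the front. A sketch in the same list-comprehension style that groups from the right, assuming Python 3 (the name `group_right` is illustrative only):

```python
def group_right(s, k=3, sep=","):
    """Join k-sized slices of s, letting the first slice be the short one."""
    # Length of the leading group: the remainder, or k if k divides len(s).
    head = len(s) % k or k
    return sep.join([s[:head]] + [s[i:i + k] for i in range(head, len(s), k)])


print(group_right("1234567890123456789"))  # 1,234,567,890,123,456,789
```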


Emile

 
 
Gregor Lingl
Guest
Posts: n/a
 
      08-15-2009

> What is the pythonic way to do this ?
>
>
> For my part, i reach to this rather complicated code:
>
>
> # ----------------------
>
> def comaSep(z,k=3, sep=','):
>     z=z[::-1]
>     x=[z[k*i:k*(i+1)][::-1] for i in range(1+(len(z)-1)/k)][::-1]
>     return sep.join(x)
>
> # Test
> for z in ["75096042068045", "509", "12024", "7", "2009"]:
>     print z+" --> ", comaSep(z)
>


In case you are interested, here is a recursive solution:

>>> def comaSep(z, k=3, sep=","):
...     # Peel off the last k characters and recurse on the rest.
...     return comaSep(z[:-k], k, sep) + sep + z[-k:] if len(z) > k else z
...
>>> comaSep("7")
'7'
>>> comaSep("2007")
'2,007'
>>> comaSep("12024")
'12,024'
>>> comaSep("509")
'509'
>>> comaSep("75096042068045")
'75,096,042,068,045'
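
Each recursive call handles one group, so a very long string can exceed Python's recursion limit (1000 frames by default in CPython). A loop-based sketch of the same peel-from-the-right idea, assuming Python 3; `comma_sep_iterative` is a made-up name:

```python
def comma_sep_iterative(z, k=3, sep=","):
    """Peel k-character groups off the right end of z without recursion."""
    parts = []
    end = len(z)
    while end > k:
        parts.append(z[end - k:end])  # rightmost group not yet consumed
        end -= k
    parts.append(z[:end])  # whatever is left becomes the leading group
    parts.reverse()        # groups were collected right to left
    return sep.join(parts)


for z in ["7", "2007", "12024", "509", "75096042068045"]:
    print(z, "-->", comma_sep_iterative(z))
```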


Gregor
 
 
 
 
 