Velocity Reviews - Computer Hardware Reviews

change only the nth occurrence of a pattern in a string

 
 
TP
 
      12-31-2008
Hi everybody,

I would like to change only the nth occurrence of a pattern in a string. The
problem with the "replace" method of strings and with "re.sub" is that we can
only set how many occurrences to change, counting from the first one.

>>> v="coucou"
>>> v.replace("o","i",2)
'ciuciu'
>>> import re
>>> re.sub( "o", "i", v,2)
'ciuciu'
>>> re.sub( "o", "i", v,1)
'ciucou'

What is the best way to change only the nth occurrence (occurrence number n)?

Why this default behavior? For the user, it would be easier to put re.sub or
replace in a loop to change the first n occurrences.

Thanks

Julien
--
python -c "print ''.join([chr(154 - ord(c)) for c in '*9(9&(18%.\
9&1+,\'Z4(55l4('])"

"When a distinguished but elderly scientist states that something is
possible, he is almost certainly right. When he states that something is
impossible, he is very probably wrong." (first law of AC Clarke)
 
Roy Smith
 
      12-31-2008
TP wrote:

> Hi everybody,
>
> I would like to change only the nth occurrence of a pattern in a string.


It's a little ugly, but the following looks like it works. The gist is to
split the string on your pattern, then re-join the pieces using the
original delimiter everywhere except at the n'th splice, where the
replacement goes instead. split() is a wonderful tool. I'm a hard-core regex
geek, but I find that most things I might have written a big hairy regex for
are more easily solved by doing split() and then attacking the pieces.

There may be some fencepost errors here. I got the basics working, and
left the details as an exercise for the reader.

This version assumes the pattern is a literal string. If it's really a
regex, you'll need to put the pattern in parens when you call split(); this
will return the exact text matched each time as elements of the list. Then
your post-processing gets a little more complicated, but nothing that's
too bad.

This makes a couple of passes over the data, but every operation is O(n)
in the length of the string, so the whole thing is O(n).


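#!/usr/bin/python

import re

v = "coucoucoucou"

pattern = "o"   # treated as a literal string, hence re.escape() below
n = 2           # replace the n'th occurrence, counting from 1
parts = re.split(re.escape(pattern), v)
print(parts)

# The first n pieces sit before the n'th occurrence, the rest after it.
first = parts[:n]
last = parts[n:]
print(first)
print(last)

# Re-join each half with the original delimiter...
j1 = pattern.join(first)
j2 = pattern.join(last)
print(j1)
print(j2)
# ...then splice the two halves together with the replacement.
print("i".join([j1, j2]))
print(v)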
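
For the regex case described above, a minimal sketch might look like the following. It assumes Python 3, a pattern with no capturing groups of its own and no empty matches, and occurrences counted from 1. The name replace_nth_regex is made up for illustration.

```python
import re

def replace_nth_regex(text, pattern, repl, n):
    """Replace the n-th (1-based) regex match using split() with a capturing group."""
    # Wrapping the pattern in parens makes split() return the matched text too,
    # so the list alternates: text, match, text, match, ..., text.
    parts = re.split("(" + pattern + ")", text)
    idx = 2 * n - 1  # position of the n-th match in the alternating list
    if n < 1 or idx >= len(parts):
        return text  # fewer than n matches: leave the string alone
    parts[idx] = repl
    return "".join(parts)

print(replace_nth_regex("coucoucoucou", "o", "i", 2))
print(replace_nth_regex("coucou", "[ou]", "X", 3))
```

Because every matched piece is kept in the list, the join needs no delimiter, which also sidesteps the fencepost question of what to do when there are fewer than n matches.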
 
Steven D'Aprano
 
      12-31-2008
On Wed, 31 Dec 2008 15:40:32 +0100, TP wrote:

> Hi everybody,
>
> I would like to change only the nth occurrence of a pattern in a string.
> The problem with "replace" method of strings, and "re.sub" is that we
> can only define the number of occurrences to change from the first one.
>
> >>> v="coucou"
> >>> v.replace("o","i",2)
> 'ciuciu'
> >>> import re
> >>> re.sub( "o", "i", v,2)
> 'ciuciu'
> >>> re.sub( "o", "i", v,1)
> 'ciucou'
>
> What is the best way to change only the nth occurrence (occurrence number
> n)?


Step 1: Find the nth occurrence.
Step 2: Change it.


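def findnth(source, target, n):
    """Return the index of the nth occurrence (counting from 1), or -1."""
    num = 0
    start = -1
    while num < n:
        start = source.find(target, start+1)
        if start == -1: return -1
        num += 1
    return start

def replacenth(source, old, new, n):
    """Replace only the nth occurrence of old; return source unchanged if absent."""
    p = findnth(source, old, n)
    if p == -1: return source
    return source[:p] + new + source[p+len(old):]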


And in use:

>>> replacenth("abcabcabcabcabc", "abc", "WXYZ", 3)
'abcabcWXYZabcabc'
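
The same find-then-change idea carries over when the pattern is a regular expression rather than a literal. The following is only a sketch under assumptions: Python 3, occurrences counted from 1, and a made-up name replacenth_re. It uses re.finditer to locate the nth match and then slices around it.

```python
import re
from itertools import islice

def replacenth_re(source, pattern, new, n):
    """Replace only the n-th (1-based) non-overlapping regex match."""
    if n < 1:
        return source
    # Step 1: find the nth match (None if there are fewer than n).
    match = next(islice(re.finditer(pattern, source), n - 1, None), None)
    if match is None:
        return source
    # Step 2: change it by slicing around the matched span.
    return source[:match.start()] + new + source[match.end():]

print(replacenth_re("abcabcabcabcabc", "abc", "WXYZ", 3))
```

Unlike find() with start+1, re.finditer only yields non-overlapping matches, so the two versions can disagree on inputs such as "aaaa" with the pattern "aa".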


> Why this default behavior? For the user, it would be easier to put
> re.sub or replace in a loop to change the first n occurrences.


Easier than just calling a function? I don't think so.

I've never needed to replace only the nth occurrence of a string, and I
guess the Python Development team never did either. Or they thought that
the above two functions were so trivial that anyone could write them.



--
Steven
 
Tim Chase
 
      12-31-2008
> I would like to change only the nth occurrence of a pattern in
> a string. The problem with "replace" method of strings, and
> "re.sub" is that we can only define the number of occurrences
> to change from the first one.
>
> >>> v="coucou"
> >>> v.replace("o","i",2)
> 'ciuciu'
> >>> import re
> >>> re.sub( "o", "i", v,2)
> 'ciuciu'
> >>> re.sub( "o", "i", v,1)
> 'ciucou'
>
> What is the best way to change only the nth occurrence
> (occurrence number n)?


Well, there are multiple ways of doing this, including munging
the regexp to skip over the first instances of a match.
Something like the following, which skips the first two matches
and changes only the third:

re.sub("((?:[^o]*o){2}[^o]*)o", r"\1i", s, count=1)

However, for a more generic solution, you could use something like

import re
class Nth(object):
    """Callable for re.sub: replace only matches n_min..n_max, counted from 1."""
    def __init__(self, n_min, n_max, replacement):
        #assert n_min <= n_max, \
        # "Hey, look, I don't know what I'm doing!"
        if n_min > n_max:
            # don't be a dope: put the bounds in order
            n_min, n_max = n_max, n_min
        self.n_min = n_min
        self.n_max = n_max
        self.replacement = replacement
        self.calls = 0
    def __call__(self, matchobj):
        # re.sub calls this once per match, in order
        self.calls += 1
        if self.n_min <= self.calls <= self.n_max:
            return self.replacement
        return matchobj.group(0)

s = 'coucoucoucou'
print("Initial:")
print(s)
print("Just positions 3-4:")
print(re.sub('o', Nth(3,4,'i'), s))
for params in [
        (1, 1, 'i'), # just the 1st
        (1, 2, 'i'), # 1-2
        (2, 2, 'i'), # just the 2nd
        (2, 3, 'i'), # 2-3
        (2, 4, 'i'), # 2-4
        (4, 4, 'i'), # just the 4th
        ]:
    print("Nth(%i, %i, %s)" % params)
    print(re.sub('o', Nth(*params), s))
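
The skip-ahead regex idea from the start of this post can also be built for any n and any literal string. This is a rough sketch only, assuming Python 3, a non-empty literal, and occurrences counted from 1. The helper name replace_nth_skip is hypothetical.

```python
import re

def replace_nth_skip(text, literal, repl, n):
    """Replace the n-th (1-based) occurrence of a literal with one skip-ahead regex."""
    if n < 1:
        return text
    lit = re.escape(literal)
    # Group 1 lazily consumes the first n-1 occurrences plus the text that
    # precedes the n-th one; \A pins the match to the start of the string.
    pattern = r"\A((?:.*?%s){%d}.*?)%s" % (lit, n - 1, lit)
    # A function replacement avoids backslash processing of repl.
    return re.sub(pattern, lambda m: m.group(1) + repl, text,
                  count=1, flags=re.DOTALL)

print(replace_nth_skip("coucoucoucou", "o", "i", 3))
```

If there are fewer than n occurrences the pattern simply fails to match and the string comes back unchanged.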

> Why this default behavior?


Can't answer that one, but with so many easy solutions, it's not
been a big concern of mine.

-tkc





 
Antoon Pardon
 
      01-12-2009
On 2008-12-31, TP wrote:
> Hi everybody,
>
> I would like to change only the nth occurrence of a pattern in a string. The
> problem with "replace" method of strings, and "re.sub" is that we can only
> define the number of occurrences to change from the first one.
>
> >>> v="coucou"
> >>> v.replace("o","i",2)
> 'ciuciu'
> >>> import re
> >>> re.sub( "o", "i", v,2)
> 'ciuciu'
> >>> re.sub( "o", "i", v,1)
> 'ciucou'
>
> What is the best way to change only the nth occurrence (occurrence number n)?
>
> Why this default behavior? For the user, it would be easier to put re.sub or
> replace in a loop to change the first n occurrences.


I would do it as follows:

1) Change the pattern n times to something that doesn't occur in your string
2) Change it back n-1 times
3) Change the remaining one to what you want.

>>> v="coucou"
>>> v.replace('o', 'O', 2).replace('O', 'o', 1).replace('O', 'i')
'couciu'
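
The three steps above can be wrapped in a function that picks the placeholder automatically. This is a sketch under assumptions: Python 3, a non-empty search string, and occurrences counted from 1. The name replace_nth_via_placeholder is invented for the example.

```python
def replace_nth_via_placeholder(s, old, new, n):
    """Replace the n-th (1-based) occurrence of old using a temporary marker."""
    if n < 1 or s.count(old) < n:
        return s  # nothing to do: fewer than n occurrences
    # Pick a single character that appears nowhere in s. Since old occurs
    # in s, the marker cannot appear in old either.
    marker = next(chr(c) for c in range(0x110000) if chr(c) not in s)
    return (s.replace(old, marker, n)       # 1) mark the first n occurrences
             .replace(marker, old, n - 1)   # 2) restore the first n-1
             .replace(marker, new))         # 3) change the one that is left

print(replace_nth_via_placeholder("coucou", "o", "i", 2))
```

Choosing a character that is absent from the string is what makes step 3 safe; a fixed marker such as 'O' would corrupt any input that already contains it.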

--
Antoon Pardon
 
MRAB
 
      01-14-2009
Antoon Pardon wrote:
> On 2008-12-31, TP wrote:
>> Hi everybody,
>>
>> I would like to change only the nth occurrence of a pattern in a string. The
>> problem with "replace" method of strings, and "re.sub" is that we can only
>> define the number of occurrences to change from the first one.
>>
>> >>> v="coucou"
>> >>> v.replace("o","i",2)
>> 'ciuciu'
>> >>> import re
>> >>> re.sub( "o", "i", v,2)
>> 'ciuciu'
>> >>> re.sub( "o", "i", v,1)
>> 'ciucou'
>>
>> What is the best way to change only the nth occurrence (occurrence number n)?
>>
>> Why this default behavior? For the user, it would be easier to put re.sub or
>> replace in a loop to change the first n occurrences.
>
> I would do it as follows:
>
> 1) Change the pattern n times to something that doesn't occur in your string
> 2) Change it back n-1 times
> 3) Change the remaining one to what you want.
>
> >>> v="coucou"
> >>> v.replace('o', 'O', 2).replace('O', 'o', 1).replace('O', 'i')
> 'couciu'

Sorry for the last posting, but it did occur to me that str.replace()
could grow another parameter 'start', so it would become:

s.replace(old, new[, [start,] end]) -> string

(In Python 2.x the method doesn't accept keyword arguments, so that
isn't a problem.)

If the possible replacements are numbered from 0, then 'start' is the
first one actually to perform and 'end' the one after the last to perform.

The 2-argument form would be s.replace(old, new) with 'start' defaulting
to 0 and 'end' to None => replacing all occurrences, same as now.

The 3-argument form would be s.replace(old, new, end) with 'start'
defaulting to 0 => equivalent to replacing the first 'end' occurrences,
same as now.

The 4-argument form would be s.replace(old, new, start, end) =>
replacing from the 'start'th to before the 'end'th occurrence,
additional behaviour as requested.
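
The proposed semantics can be tried out today with a plain function. What follows is only an illustration, not an existing str method: it assumes Python 3 and a non-empty search string, the name replace_range is hypothetical, and it takes start and end as keyword-style defaults rather than reproducing the positional 3-argument form exactly.

```python
def replace_range(s, old, new, start=0, end=None):
    """Replace occurrences numbered start..end-1 (0-based) of old with new."""
    parts = s.split(old)          # len(parts) - 1 occurrences in total
    total = len(parts) - 1
    if end is None or end > total:
        end = total               # default: replace through the last occurrence
    out = [parts[0]]
    for i, piece in enumerate(parts[1:]):
        # Occurrence i sits just before this piece; swap it only inside the range.
        out.append(new if start <= i < end else old)
        out.append(piece)
    return "".join(out)

print(replace_range("coucou", "o", "i"))         # all occurrences, same as now
print(replace_range("coucou", "o", "i", 0, 1))   # first occurrence only
print(replace_range("coucou", "o", "i", 1, 2))   # second occurrence only
```

With occurrences numbered from 0, changing only the nth one (counting from 1) becomes replace_range(s, old, new, n - 1, n).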
 