Velocity Reviews - Computer Hardware Reviews

python regex: misbehaviour with "\r" (0x0D) as Newline character in Unicode Mode

 
 
Arian Sanusi
      01-27-2008
Hi,

According to Unicode, "\n", "\r" and "\r\n" (0x000A, 0x000D and
0x000D+0x000A) should all be treated as newline characters.
At least, this is how I understand it:
(http://en.wikipedia.org/wiki/Newline#Unicode)

Apparently the re module does not care and, on Unix, treats only \n
as a newline character:

>>> import re
>>> a = re.compile(u"^a", re.U|re.M)
>>> a.search(u"bc\ra")
>>> a.search(u"bc\na")
<_sre.SRE_Match object at 0xb5908fa8>
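
One possible workaround for the ^ case, as a rough sketch only: spell out the line-start condition with a lookbehind instead of relying on re.M. This is written in Python 3 syntax rather than the 2.4/2.5 used above, and the name LINE_START is made up for illustration. Note that it also treats the position between "\r" and "\n" in a "\r\n" pair as a line start, which may or may not be what you want.

```python
import re

# Hypothetical helper pattern (not part of the re module): matches at the
# very start of the string, or directly after a "\r" or "\n" character.
LINE_START = r"(?:\A|(?<=[\r\n]))"

# No re.M needed, because the line-start test is written out explicitly.
pattern = re.compile(LINE_START + "a")

print(pattern.search("bc\ra") is not None)  # True: "a" follows "\r"
print(pattern.search("bc\na") is not None)  # True: "a" follows "\n"
print(pattern.search("bca") is not None)    # False: "a" is mid-line
```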

Same thing for $:

>>> b = re.compile(u"c$", re.U|re.M)
>>> b.search(u"bc\r\n")
>>> b.search(u"abc")
<_sre.SRE_Match object at 0xb5908f70>
>>> b.search(u"bc\nde")
<_sre.SRE_Match object at 0xb5908fa8>
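
Another way around it, again only a sketch in Python 3 syntax: convert every "\r\n" and lone "\r" to "\n" before matching, so that ^ and $ with re.M see the line breaks they expect. The helper name normalize_newlines is my own invention, not something from the standard library. Keep in mind that match offsets then refer to the normalized text, which is shorter wherever a "\r\n" pair was collapsed.

```python
import re

def normalize_newlines(text):
    """Return text with "\r\n" and lone "\r" replaced by "\n".

    Hypothetical helper for illustration. "\r\n" is replaced first so
    that it does not end up as two separate newlines.
    """
    return text.replace("\r\n", "\n").replace("\r", "\n")

b = re.compile("c$", re.M)

# Without normalization, "$" does not match before "\r".
print(b.search("bc\r\n"))  # None

# After normalization, "$" matches just before the "\n".
match = b.search(normalize_newlines("bc\r\n"))
print(match.span())  # (1, 2)
```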

Is this a known bug in the re module? I couldn't find any issues in the
bug tracker.
Or is this just a user error, and can you guys help me?

arian

P.S.: This appears in both Python 2.4 and 2.5.
 