Correct use of Unicode in RegExp
On Thu, 22 Apr 2004 22:36:44 +0100, mike blamires scribbled furiously:
> I am having great difficulty using Unicode characters in a Regular
> Expression, I am trying to match extended Unicode characters.
>
> I am wishing to split a large Dumpfile (containing only JPEGS) I have used
> a hex editor to manually extract a file just to show it can be done, so I
> know the input is intact.
>
> Each JPEG starts with the Unicode characters \u00FF \u00D8 \u00FF \u00E1
> and there are plenty of these to be found within the file.
>
> open(DUMPFILE, "/pathtodumpfile");
> my $line;
> while(<DUMPFILE>) {
>     $line = $line.$_;
> }
> @files = split(/\x{00FF}\x{00D8}\x{00FF}\x{00E1}/, $line);
>
> (As you may see from the above style I am relatively inexperienced to the
> perl side of programming ;)
>
> I have tried inserting the Unicode characters in various ways \xFF, \x{FF}
> etc. It just doesn't seem to find the pattern. I am at a bit of a loss as
> to whether it is my regexp that is wrong, my use of Unicode characters
> or use of Extended Unicode characters.
>
> many thanks for your help.
>
> cheers
> Mike

Apologies, incorrect newsgroup first time round. Please see above.

cheers
Mike
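
For what it's worth, the values FF D8 FF E1 at the start of each JPEG are bytes in the file rather than Unicode characters, and split() throws away whatever the pattern matches, so each piece would come back without its JPEG header. Reading the file without binary mode may also alter the data on some platforms, though whether that is what happened here is only a guess. Below is a minimal sketch, assuming the dump fits in memory and every image begins with exactly that four-byte marker. The sub names split_jpegs and read_dump and the image_NNNN.jpg output names are made up for illustration.

```perl
use strict;
use warnings;

# Split a string of raw bytes into chunks that each begin with the
# JPEG marker FF D8 FF E1. The marker is treated as bytes, not characters.
sub split_jpegs {
    my ($data) = @_;
    my $marker = "\xFF\xD8\xFF\xE1";
    # A zero-width lookahead splits *before* each marker, so the marker
    # stays at the front of its chunk instead of being discarded.
    my @chunks = split /(?=\xFF\xD8\xFF\xE1)/, $data;
    # Discard any leading data that does not start with the marker.
    return grep { index($_, $marker) == 0 } @chunks;
}

# Read a whole file as raw bytes (no newline or encoding translation).
sub read_dump {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    local $/;                      # slurp mode: read everything at once
    my $data = <$fh>;
    close($fh);
    return $data;
}

# Only runs when a dump file is given on the command line.
if (@ARGV) {
    my $n = 0;
    for my $jpeg (split_jpegs(read_dump($ARGV[0]))) {
        my $name = sprintf("image_%04d.jpg", ++$n);   # hypothetical naming
        open(my $out, '>:raw', $name) or die "Cannot write $name: $!";
        print {$out} $jpeg;
        close($out);
    }
}
```

Saved as, say, splitdump.pl, it would be run as `perl splitdump.pl /pathtodumpfile`. One caveat: the same four bytes can also occur inside an image (an embedded thumbnail, for instance), in which case this naive split would cut that image in two.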