s modifier doesn't seem to work

 
 
fmassion@web.de
      08-10-2013
Hi everybody,

I am currently testing a string search over line breaks.

My file is UTF-8 encoded.

This is my test text (with linebreaks at the end):
----------
Das ist ein Beispiel mit 3 Sätzen
Das ist ein 1122-22-11 Format
Hier ist keine Zahl.
Hier ist kein Punkt
nur Text Hier ist nur Text ist aber nur Text
----------

This is a code extract:

foreach $satz (@satz) {
    chomp $satz;
    if ($satz =~ m/\d(?s)(.*)keine/g) {
        $satz =~ s/$&/xxxx/g;
    }
    print "$satz\n";
}

I would expect the following result for the first three lines:
'Das ist ein Beispiel mit xxxxx Zahl.'

With this search string, however, I get no match. I have entered the same expression in UltraEdit (Regex-Perl-Search) and it works correctly.

What is wrong here?
 
Peter J. Holzer
      08-10-2013
On 2013-08-10 09:16, (E-Mail Removed) <(E-Mail Removed)> wrote:
> I am currently testing a string search over line breaks.

[...]
> This is my test text (with linebreaks at the end):
> ----------
> Das ist ein Beispiel mit 3 Sätzen
> Das ist ein 1122-22-11 Format
> Hier ist keine Zahl.
> Hier ist kein Punkt
> nur Text Hier ist nur Text ist aber nur Text
> ----------

[...]
> if ($satz =~ m/\d(?s)(.*)keine/g) {

[...]
> With this search string, I get however no match.

[...]
> What is wrong here?


Read the section "Modifiers" in perldoc perlre.

hp

--
_ | Peter J. Holzer | Fluch der elektronischen Textverarbeitung:
|_|_) | Sysadmin WSR | Man feilt solange an seinen Text um, bis
| | | (E-Mail Removed) | die Satzbestandteile des Satzes nicht mehr
__/ | http://www.hjp.at/ | zusammenpaßt. -- Ralph Babel
 
George Mpouras
      08-10-2013
> I would expect the following result for the first three lines:
> 'Das ist ein Beispiel mit xxxxx Zahl.'
>
> With this search string, I get however no match. I have entered the same expression in UltraEdit (Regex-Perl-Search) and it works correctly.
>
> What is wrong here?

while (<DATA>) {
    s/(\d|-|keine)+/xxxx/g;
    print $_;
}

__DATA__
Das ist ein Beispiel mit 3 Sätzen
Das ist ein 1122-22-11 Format
Hier ist keine Zahl.
Hier ist kein Punkt
nur Text Hier ist nur Text ist aber nur Text
 
Peter J. Holzer
      08-10-2013
On 2013-08-10 11:17, Ben Morrow <(E-Mail Removed)> wrote:
> Quoth "Peter J. Holzer" <(E-Mail Removed)>:
>> On 2013-08-10 09:16, (E-Mail Removed) <(E-Mail Removed)> wrote:
>> > if ($satz =~ m/\d(?s)(.*)keine/g) {
>> [...]
>> > With this search string, I get however no match.
>> [...]
>> > What is wrong here?
>>
>> Read the section "Modifiers" in perldoc perlre.
>
> Read the section '(?adlupimsx-imsx)' in perldoc perlre.

I've cancelled that article. Either I wasn't fast enough or your
news server doesn't honor cancels (without cancel-lock).

hp


 
fmassion@web.de
      08-10-2013
I think Ben has the right hint. Indeed, I read the file into an array (@satz) and then loop over it with
'foreach $satz (@satz)'.
George's code doesn't work, though. It returns the following result for the first 3 lines:

Das ist ein Beispiel mit xxxx Sätzen
Das ist ein xxxx Format
Hier ist xxxx Zahl.

The solution is still pending but thanks for the help.

 
fmassion@web.de
      08-10-2013
This works as expected, but I don't quite understand what happens:

undef $/;
while (<DATA>) {
    chomp;
    print "$_<<\n";
    s/\d(.*)Zahl/xxxx/sg;
    print "\n$_\n"
}
It searches over the first 3 lines and outputs as expected:
'Das ist ein Beispiel mit xxxx'
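
What 'undef $/' changes can be seen in isolation. The following is only a minimal sketch: it assumes an in-memory file handle in place of DATA, and the names $data, @lines and $whole are made up for the illustration.

```perl
use strict;
use warnings;

# Hypothetical two-line sample standing in for the real file.
my $data = "Beispiel mit 3 Saetzen\nHier ist keine Zahl.\n";

# Default: $/ is "\n", so <$fh> in list context returns one element per line.
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my @lines = <$fh>;
close $fh;

# Slurp mode: with $/ undefined, a single read returns the whole file.
# 'local' restores the previous value of $/ when the do block ends.
open $fh, '<', \$data or die "cannot open in-memory file: $!";
my $whole = do { local $/; <$fh> };
close $fh;

printf "%d lines, slurped length %d\n", scalar @lines, length $whole;
```

Using 'local $/' inside a block, rather than a global 'undef $/', keeps later line-by-line reads in the same script unaffected.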


 
George Mpouras
      08-10-2013
>> What is wrong here?

Please explain the requirements again in more detail. I cannot
understand what you expect.
 
Charles DeRykus
      08-10-2013
On 8/10/2013 8:39 AM, (E-Mail Removed) wrote:
> This works as expected, but I don't quite understand what happens
>
> undef $/;
> while (<DATA>) {
> chomp;
> print "$_<<\n";
> s/\d(.*)Zahl/xxxx/sg;
> print "\n$_\n"
> }
> It searches over the first 3 lines and outputs as expected:
> 'Das ist ein Beispiel mit xxxx'

See: perldoc perlvar --> $/

See: perldoc perlretut --> why '.' matches everything but "\n"
or
See: perldoc perlre -> Modifiers --> s Treat string as single line
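
The behaviour those sections describe can be checked with a short test. This is a sketch under assumptions: the sample string is invented (with ASCII "Saetzen" so it does not depend on the file encoding), and the variable names are only for illustration.

```perl
use strict;
use warnings;

# Two lines held in ONE string; a newline sits between "3" and "keine".
my $text = "mit 3 Saetzen\nHier ist keine Zahl.\n";

# Without /s, '.' does not match "\n", so the match cannot cross the line break.
my $plain = $text =~ /\d.*keine/ ? 1 : 0;

# With /s, '.' matches "\n" as well.
my $dotall = $text =~ /\d.*keine/s ? 1 : 0;

# Inline (?s) turns on the same behaviour for the rest of the pattern.
my $inline = $text =~ /\d(?s).*keine/ ? 1 : 0;

print "plain=$plain dotall=$dotall inline=$inline\n";
```

Note that both /s and (?s) only help when the newline is actually inside the string being matched.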

--
Charles DeRykus
 
fmassion@web.de
      08-11-2013
On Saturday, 10 August 2013 21:57:07 UTC+2, Ben Morrow wrote:
> [Please quote properly: that is, put your reply underneath the bit of
> text you are replying to. It's also not helpful to keep replying to
> yourself; instead you should reply to the article you are, um, replying
> to. You appear to be using Google Groups, which has recently started
> inserting extra blank lines whenever it quotes something; if you can't
> find any way of turning this off you need to remove them by hand before
> posting.]
>
> Quoth (E-Mail Removed):
> > On Saturday, 10 August 2013 11:16:58 UTC+2, (E-Mail Removed) wrote:
> > > I am currently testing a string search over line breaks.
> [...]
> > > This is a code extract:
> > >
> > > foreach $satz (@satz) {
> > > chomp $satz;
> > > if ($satz =~ m/\d(?s)(.*)keine/g) {
> > > $satz =~ s/$&/xxxx/g;
> > > }
> > > print "$satz\n";
> > > }
> > >
> > > I would expect the following result for the first three lines:
> > > 'Das ist ein Beispiel mit xxxxx Zahl.'
> > >
> > > With this search string, I get however no match. I have entered the
> >
> > This works as expected, but I don't quite understand what happens
> >
> > undef $/;
>
> This is documented in perldoc perlvar, under $/. Setting $/ to undef
> causes <> to read the whole file in one go. This means you now have your
> whole file in one string, so the s/// works over multiple lines.
>
> > while (<DATA>) {
>
> Since you are reading the whole file, there will only ever be one entry
> to loop over, so you don't really need a loop.
>
> > chomp;
>
> With $/=undef chomp doesn't do anything.
>
> > print "$_<<\n";
> > s/\d(.*)Zahl/xxxx/sg;
> > print "\n$_\n"
> > }
> > It searches over the first 3 lines and outputs as expected:
> > 'Das ist ein Beispiel mit xxxx'
>
> Since you're only doing one substitution it would be better to use an
> ordinary named variable and no loop:
>
> my $text = <DATA>;
> print "$text<<\n";
>
> $text =~ s/\d(.*)Zahl/xxxx/sg;
> print "\n$text\n";
>
> Ben

[Sorry for not replying properly. I hope this is OK now]

I understand what 'undef $/' does, but it seems to be a workaround. Basically my goal is:

1) Read a text into an array
2) Iterate through the elements of the array: 'foreach $satz (@satz)'
3) Test various search-and-replace regexes (as a matter of fact, I am working through the Regex Cookbook by Jan Goyvaerts & Steven Levithan).

In this context, one of several tests concerns the s modifier. I just wonder why it isn't possible to search for an expression that spans more than one line if I add this modifier. It works in UltraEdit. It works in a few other tools as well, but I can't make it work in my Perl script. If I use the 'undef $/' workaround, other search expressions (e.g. with $ to mark the end of the string) won't work.

In one of the tools I use (Expresso), I see that the EOL is coded as [CR][LF]. Is this a reason for the problem with the s modifier?
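
Both points can be tried out on a slurped string. The sketch below is only an illustration under assumptions: the sample text with CRLF line endings is invented, and it assumes the data reached Perl without any newline translation. It shows that CRLF can be normalised first, and that the /m modifier lets $ match at each line end inside one long string. The missing match in the original script is already explained by processing one line at a time, so CRLF is probably not the main cause there.

```perl
use strict;
use warnings;

# Hypothetical sample with Windows-style CRLF line endings, held in one string.
my $text = "Hier ist keine Zahl.\r\nHier ist kein Punkt\r\nnur Text\r\n";

# Normalise CRLF to LF so that "\n" is the only line terminator.
(my $unix = $text) =~ s/\r\n/\n/g;

# Without /m, '$' matches only at the very end of the string
# (or just before a final newline), so only the last line can match.
my @end_only = $unix =~ /(\w+)$/g;

# With /m, '^' and '$' also match at every embedded line boundary.
my @per_line = $unix =~ /(\w+)$/mg;

print "without /m: @end_only\n";
print "with /m:    @per_line\n";
```

The /s and /m modifiers are independent of each other: /s changes what '.' matches, /m changes where '^' and '$' match, and both can be combined on the same pattern.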
 
Peter J. Holzer
      08-11-2013
On 2013-08-11 09:49, (E-Mail Removed) <(E-Mail Removed)> wrote:
[...]
> [Sorry for not replying properly. I hope this is OK now]


Not really. You are still quoting everything (whether it is relevant or
not) and you haven't removed the empty lines inserted by Google. So we
have to scroll/read through 130 lines of quotes which may or may not be
relevant. I dare say that not every one of us has the patience.

Do yourself and us a favour, get a real newsreader and use one of the
free news servers (e.g. albasani).


> I understand what 'undef $/' does but it seems to be a workaround.
> Basically my goal is:
>
> 1) Read a text in an array


What are the elements of the array? Lines?


> 2) Iterate through the variables of the array: 'foreach $satz (@satz)'


So in each iteration of the loop you are looking at one line in
isolation.


> 3) Test various search and replace Regex (as a matter of fact I am
> working through the Regex Cookbook of Jan Goyvaerts & Steven
> Levithan). In this context, one of several tests concerns the s
> modifier. I just wonder why it isn't possible to search for an
> expression that spans more than one line if I add this
> modifier.


That's what the /s modifier does. But there actually have to be several
lines in the variable you are looking at for this to work. If the other
lines are in different variables, how can perl know that you would want
to match those other variables, too, especially if you tell it
explicitly to look only at this variable?

> It works in UltraEdit. It works in a few other tools as well


That's because UltraEdit and those other tools treat the whole text as a
unit. But your script (not Perl - *your* script) splits it into many
small units and looks at each of them in isolation. None of these small
units matches.
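
The difference can be made concrete with the sample lines from the thread. This is a minimal sketch, not the original script: the array is filled by hand instead of being read from a file, "Saetzen" is written in ASCII to avoid encoding issues, and $line_hits, $text and $replaced are names made up for the illustration.

```perl
use strict;
use warnings;

# First four sample lines from the thread, one array element per line.
my @satz = (
    "Das ist ein Beispiel mit 3 Saetzen\n",
    "Das ist ein 1122-22-11 Format\n",
    "Hier ist keine Zahl.\n",
    "Hier ist kein Punkt\n",
);

# Line by line: each element is matched in isolation, so a pattern that
# needs a digit AND a later "keine" never sees both in the same string.
my $line_hits = grep { /\d(?s).*keine/ } @satz;

# Whole text: join the lines into one string and substitute once.
my $text = join '', @satz;
(my $replaced = $text) =~ s/\d(?s).*keine/xxxx/;

print "line-by-line matches: $line_hits\n";
print $replaced;
```

Joining the array keeps the line-by-line data available for other tests, while the multi-line patterns run against the joined string.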

hp

