Velocity Reviews - Computer Hardware Reviews

how to find such strings?

 
 
mozilla.bugzilla@gmail.com
 
      07-09-2005
hi, greeting,

I am a newer for Perl, here is my question.

This is the text I got from the server,

<form name="ecomm_frm" method="post"
action="process.aspx?c=us&amp;l=en&amp" id="ecomm_frm">
<input type="hidden" name="TARGET" value="Button" />
<input type="hidden" name="ARGUMENT" value="" />
<input type="hidden" name="STATE" value="wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs"
/>


How can I extract the value ("wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs") for STATE
from this text ? The length of string for "value" is not a constant.
can you guys help me to figure this out? Thanks


bugzilla.

 
 
 
 
 
Gunnar Hjalmarsson
 
      07-10-2005
wrote:
> I am a newer for Perl,


It serves no good purpose to make that statement every time you post.

> This is the text I got from the server,
>
> <form name="ecomm_frm" method="post"
> action="process.aspx?c=us&amp;l=en&amp" id="ecomm_frm">
> <input type="hidden" name="TARGET" value="Button" />
> <input type="hidden" name="ARGUMENT" value="" />
> <input type="hidden" name="STATE" value="wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs"
> />
>
> How can I extract the value ("wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs") for STATE
> from this text ? The length of string for "value" is not a constant.


There are at least three approaches:

1) Use the substr() and index() functions.

perldoc -f substr
perldoc -f index

The length of the value string doesn't need to be constant for that:

my $ident = 'name="STATE" value="';
my $pos1 = index($text, $ident) + length $ident;
my $pos2 = index $text, '"', $pos1;
print substr($text, $pos1, $pos2-$pos1), "\n";
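A slightly more defensive sketch of the same substr()/index() idea, wrapped in a sub. The name extract_value and the sample $text are illustrative assumptions, not something from the original post. index() returns -1 when the substring is absent, so both lookups are checked before substr() is called:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text between $ident and the next
# double quote, or undef (in scalar context) if either marker is missing.
sub extract_value {
    my ($text, $ident) = @_;

    my $start = index $text, $ident;
    return if $start < 0;                # marker not found
    $start += length $ident;             # skip past the marker itself

    my $end = index $text, '"', $start;  # closing quote of the value
    return if $end < 0;

    return substr $text, $start, $end - $start;
}

my $text  = '<input type="hidden" name="STATE" value="wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs" />';
my $value = extract_value($text, 'name="STATE" value="');
print defined $value ? "$value\n" : "not found\n";
```

Like the snippet above, this still depends on the exact spacing and attribute order inside $ident.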

2) Capture it with a regex in the m// operator.

perldoc perlop (where the m// operator is described)

perldoc perlrequick
perldoc perlretut
perldoc perlre

Chris gave you an example of that.

3) Use a module for parsing HTML

http://search.cpan.org/search?query=HTML+parse

Even if the third approach gives you the most robust code, there is
always a risk that your solution fails if the structure of the document
changes.

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
 
 
 
 
 
A. Sinan Unur
 
      07-10-2005
wrote in news:1120951290.667637.235660@z14g2000cwz.googlegroups.com:

> I am a newer for Perl,


I guess the correct English would be "I am new to Perl". Please note
that I am a non-native speaker as well. I think correcting persistent
errors in language usage is very important in the learning process.

That said, no one here is interested in whether you are just picking up
Perl, or have written many books on the topic. We are interested in
seeing well thought-out questions, and enjoy answering such questions.
As the posting guidelines also suggest, mentioning experience level in
posts, and nonsensical subject lines do bias some of us (myself
included) toward not answering such posts.

Not to mention that your chosen ID resembles a certain person whose name
I shall not speak

> This is the text I got from the server,


That looks like HTML to me.

> <form name="ecomm_frm" method="post"
> action="process.aspx?c=us&amp;l=en&amp" id="ecomm_frm">
> <input type="hidden" name="TARGET" value="Button" />
> <input type="hidden" name="ARGUMENT" value="" />
> <input type="hidden" name="STATE"
> value="wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs"
> />
>
> How can I extract the value ("wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs") for
> STATE from this text ?


I would suggest using an HTML parser. There are quite a few such modules
on CPAN.

Note that your chances of getting a useful response increase
exponentially if you post a reasonable amount of code showing your
attempt to first tackle the problem yourself.

Sinan

--
A. Sinan Unur <>
(reverse each component and remove .invalid for email address)

comp.lang.perl.misc guidelines on the WWW:
http://mail.augustmail.com/~tadmc/cl...uidelines.html
 
 
A. Sinan Unur
 
      07-10-2005
Chris Lowth <> wrote in
news:LNZze.26101$:

> wrote:

....
>> <form name="ecomm_frm" method="post"
>> action="process.aspx?c=us&amp;l=en&amp" id="ecomm_frm">
>> <input type="hidden" name="TARGET" value="Button" />
>> <input type="hidden" name="ARGUMENT" value="" />
>> <input type="hidden" name="STATE"
>> value="wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs" />
>>
>>
>> How can I extract the value ("wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs") for
>> STATE from this text ? The length of string for "value" is not a
>> constant. can you guys help me to figure this out? Thanks

....
> If all your text is in $text, then this should do it..
>
> if ( $text =~ m!<input type="hidden" name="STATE" value="(.*?)"/>!s )
> {
> print "$1\n";
> }


You should use an HTML parser to parse HTML. The regex does not match
the posted text:

#!/usr/bin/perl

use strict;
use warnings;

my $form = do { local $/; <DATA> };

if ( $form =~ m!<input type="hidden" name="STATE" value="(.*?)"/>!s ) {
    print "$1\n";
}

__END__
<form name="ecomm_frm" method="post"
action="process.aspx?c=us&amp;l=en&amp" id="ecomm_frm">
<input type="hidden" name="TARGET" value="Button" />
<input type="hidden" name="ARGUMENT" value="" />
<input type="hidden" name="STATE" value="wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs"
/>

D:\Home> ttt

D:\Home>

One can, instead, use a proper HTML parser to parse HTML:

#!/usr/bin/perl

use strict;
use warnings;

use HTML::TokeParser::Simple;

my $form = do { local $/; <DATA> };

my $p = HTML::TokeParser::Simple->new(\$form);

while ( my $t = $p->get_token ) {
    if ( $t->is_start_tag('input')
         and 'STATE' eq $t->get_attr('name') ) {
        print $t->get_attr('value') . "\n";
    }
}

__END__
<form name="ecomm_frm" method="post"
action="process.aspx?c=us&amp;l=en&amp" id="ecomm_frm">
<input type="hidden" name="TARGET" value="Button" />
<input type="hidden" name="ARGUMENT" value="" />
<input type="hidden" name="STATE" value="wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs"
/>

D:\Home> ttt
wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs

> --
> http://www.lowth.com/rope - Scriptable IP packet match logic for
> linux/iptables.


Incidentally, your signature delimiter is incorrect. It should be two
dashes followed by a space on a line by itself.

--
A. Sinan Unur <>
(reverse each component and remove .invalid for email address)

comp.lang.perl.misc guidelines on the WWW:
http://mail.augustmail.com/~tadmc/cl...uidelines.html
 
 
Brian Wakem
 
      07-10-2005
Chris Lowth wrote:

> wrote:
>> hi, greeting,
>>
>> I am a newer for Perl, here is my question.
>>
>> This is the text I got from the server,
>>
>> <form name="ecomm_frm" method="post"
>> action="process.aspx?c=us&amp;l=en&amp" id="ecomm_frm">
>> <input type="hidden" name="TARGET" value="Button" />
>> <input type="hidden" name="ARGUMENT" value="" />
>> <input type="hidden" name="STATE" value="wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs"
>> />
>>
>>
>> How can I extract the value ("wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs") for STATE
>> from this text ? The length of string for "value" is not a constant.
>> can you guys help me to figure this out? Thanks
>>
>>
>> bugzilla.

>
> If all your text is in $text, then this should do it..
>
> if ( $text =~ m!<input type="hidden" name="STATE" value="(.*?)"/>!s ) {
> print "$1\n";
> }



That regex won't match as I believe there will be a space before the /

I would use:-

if ( $text =~ m!<input type="hidden" name="STATE" value="([^"]+)"!s ) {
    print "$1\n";
}

as there may or may not be a space and the / is not guaranteed to be there
either. Of course an HTML parsing module would avoid all of those issues.
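A minimal sketch of a somewhat more tolerant pattern, assuming the attributes still appear in the order type, name, value and are double-quoted. The sub name state_value is hypothetical. It lets any whitespace, including a line break, separate the attributes, and uses [^"]* rather than [^"]+ so that an empty value also matches:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the STATE value, or undef if nothing matches.
sub state_value {
    my ($html) = @_;

    # \s+ allows spaces or newlines between attributes; [^"]* stops at the
    # closing quote, so whatever follows the value (space, "/", ">") is ignored.
    if ( $html =~ m{<input\s+type="hidden"\s+name="STATE"\s+value="([^"]*)"} ) {
        return $1;
    }
    return;    # no match
}

my $html  = qq{<input type="hidden" name="STATE" value="wxMDcyMzEwNzIyO3Q8O2w8aTwwPjs"\n/>};
my $value = state_value($html);
print defined $value ? "$value\n" : "no match\n";
```

This still breaks if the attribute order or quoting style changes, which is the case a parsing module handles.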


--
Brian Wakem

 