Extract until unquote or EOL

 
 
Mats
      07-18-2005
Hi!

I've been messing with this problem for a while now. I want to parse a file for
declarations (e.g. NAME = "myname").

I want to extract the phrase/text between the two quotes. BUT if the
last quote isn't available (typo/user error) then it should extract
until end of line. If no quotes are there at all, it should extract the
whole line (except NAME=). If there are several double quotes, it should
extract between the first two (that I seem to have achieved).

My current test script looks as below:

---
#!/usr/bin/perl -w

use strict;

$_ = 'NAME = "between quotes" not this" nor this';

print $1."\n" if m/\s*NAME\s*=\s*"*(.*?)"|$/s;
---

This prints out as I want:
between quotes

But if I delete the last two double quotes and just keep the first, it
prints nothing. If I delete the first double quote also, I get an
"uninitialized value" warning.

Does anybody know a smooth solution to this?

Mats
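A note on why the test script behaves this way: `|` has the lowest precedence in a regex, so the pattern reads as "the whole NAME pattern ending in a quote" OR "end of string", not "quote or end of string". With a single quote, `"*` backtracks to match nothing and the lazy group captures an empty string; with no quotes, only the bare `$` alternative matches and `$1` stays undefined. The sketch below groups the alternation instead. The helper name `extract_grouped` is hypothetical and not from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): group the alternation so that
# "closing quote OR end of line" applies only to the end of the value.
sub extract_grouped {
    my ($line) = @_;
    my ($value) = $line =~ /NAME\s*=\s*"?(.*?)(?:"|$)/;
    return $value;    # undef when the line holds no NAME declaration
}

for my $line (
    'NAME = "between quotes" not this" nor this',
    'NAME = "between quotes not this nor this',
    'NAME = between quotes not this nor this',
) {
    my $value = extract_grouped($line);
    print defined $value ? "$value\n" : "(no match)\n";
}
```

For these three sample lines the first prints the text between the first two quotes, and the other two print everything after the optional opening quote up to the end of the line.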
 
John W. Krahn
      07-18-2005
Mats wrote:
>
> I've messed with this problem a while now. I want to parse a file for
> declarations (ex: NAME = "myname").
>
> I want to extract the phrase/text between the two quotes. BUT if the
> last quote isn't available (typo/user error) then it should extract
> until end of line. If no quotes are there at all, it should extract the
> whole line (except NAME=). If there are several double quotes, it should
> extract between the first two (that I seem to have achieved).
>
> My current testscript looks as below:
>
> ---
> #!/usr/bin/perl -w
>
> use strict;
>
> $_ = 'NAME = "between quotes" not this" nor this';
>
> print $1."\n" if m/\s*NAME\s*=\s*"*(.*?)"|$/s;
> ---
>
> This prints out as I want:
> between quotes
>
> But if I delete the last two double quotes and just keep the first, it
> prints nothing. If I delete the first double quote also, I get an
> "uninitialized value" warning.
>
> Does anybody know a smooth solution to this?


print "$1\n" if /\s*NAME\s*=\s*"?([^"]+)/;



John
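A small sketch of how the one-liner above covers all three cases: `"?` consumes an opening quote if one is present, and `[^"]+` then takes everything up to the next quote or the end of the string, whichever comes first. The wrapper name `extract_value` and the sample lines are assumptions for illustration. Note that `[^"]` also matches a newline, so the sketch chomps the line first.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical wrapper around the pattern posted above.
sub extract_value {
    my ($line) = @_;
    chomp $line;    # [^"] would otherwise capture the trailing newline
    my ($value) = $line =~ /\s*NAME\s*=\s*"?([^"]+)/;
    return $value;  # undef if the pattern does not match
}

my @lines = (
    'NAME = "between quotes" not this" nor this',
    'NAME = "opening quote only',
    "NAME = no quotes at all\n",
);

print extract_value($_), "\n" for @lines;
```

Each sample line yields the value without quotes: `between quotes`, `opening quote only`, and `no quotes at all`.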
 
Gunnar Hjalmarsson
      07-18-2005
John W. Krahn wrote:
>
> print "$1\n" if /\s*NAME\s*=\s*"?([^"]+)/;


Nice. The leading \s* can be dropped, though, since the pattern is not anchored.

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
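To illustrate the point about the leading `\s*`: an unanchored pattern may start matching anywhere, so optional whitespace in front of `NAME` changes nothing. It only does real work together with `^`. The sketch below compares three variants on two sample lines; the `capture` helper, the variable names, and the sample lines are assumptions made for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $with     = qr/\s*NAME\s*=\s*"?([^"]+)/;     # as posted
my $without  = qr/NAME\s*=\s*"?([^"]+)/;        # leading \s* dropped
my $anchored = qr/^\s*NAME\s*=\s*"?([^"]+)/;    # \s* with ^ does real work

# Hypothetical helper: return the captured value or a marker string.
sub capture {
    my ($re, $line) = @_;
    my ($value) = $line =~ $re;
    return defined $value ? $value : '(no match)';
}

for my $line ('   NAME = "indented"', 'FILENAME = "other key"') {
    printf "%-24s | %-10s | %-10s | %s\n", $line,
        capture($with, $line), capture($without, $line),
        capture($anchored, $line);
}
```

On these lines the first two variants always agree. They also both match inside `FILENAME`, which only the anchored variant rejects, so anchoring may be worth considering if the file contains keys that end in `NAME`.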
 
A. Sinan Unur
      07-18-2005
Mats wrote:

> I want to extract the phrase/text between the two quotes. BUT if the
> last quote isn't available (typo/user error) then it should extract
> until end of line. If no quotes are there at all, it should extract
> the whole line (except NAME=). If there are several double quotes, it
> should extract between the first two (that I seem to have achieved).


....

> print $1."\n" if m/\s*NAME\s*=\s*"*(.*?)"|$/s;


Usually, I find it easier to deal with a literal translation of the
requirements into the relevant index and substr calls:

#!/usr/bin/perl

use strict;
use warnings;

while(<DATA>) {
    if( /^\s*NAME\s*=\s*(.*)/ ) {
        my $v;
        if( (my $i = 1 + index $1, q{"}) ) {
            if( -1 < (my $j = index substr($1, $i), q{"}) ) {
                $v = substr $1, $i, $j;
            } else {
                $v = substr $1, $i;
            }
        } else {
            $v = $1;
        }
        print "$v\n";
    }
}

__DATA__
NAME = "between quotes" not this nor this
NAME = no quotation marks so grab all of this
NAME = "solitary quotation mark at the beginning of line, so grab all

--
A. Sinan Unur

comp.lang.perl.misc guidelines on the WWW:
http://mail.augustmail.com/~tadmc/cl...uidelines.html
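As a sketch, the same index/substr logic can be wrapped in a function so each requirement can be checked on its own. The name `value_of` and the use of the three-argument form of `index` are choices made for this example, not part of the post above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical function name; mirrors the index/substr steps of the post.
sub value_of {
    my ($line) = @_;
    chomp $line;
    return unless $line =~ /^\s*NAME\s*=\s*(.*)/;
    my $rest = $1;

    my $open = index $rest, q{"};
    return $rest if $open < 0;                  # no quotes: whole remainder

    my $close = index $rest, q{"}, $open + 1;   # search after the first quote
    return $close < 0
        ? substr($rest, $open + 1)                        # no closing quote
        : substr($rest, $open + 1, $close - $open - 1);   # between first two
}

print value_of($_), "\n" for (
    'NAME = "between quotes" not this nor this',
    'NAME = no quotation marks so grab all of this',
    'NAME = "solitary quotation mark at the beginning of line, so grab all',
);
```

One difference from the regex answer shows up when the first quote is not at the start of the value: for `NAME = abc "def"` this function returns `def`, while `/NAME\s*=\s*"?([^"]+)/` captures `abc ` (everything before the first quote). Which of the two is wanted depends on how the input files actually look.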
 
Mats
      07-18-2005
John W. Krahn wrote:
> Mats wrote:
>
>>I've messed with this problem a while now. I want to parse a file for
>>declarations (ex: NAME = "myname").
>>
>>I want to extract the phrase/text between the two quotes. BUT if the
>>last quote isn't available (typo/user error) then it should extract
>>until end of line. If no quotes are there at all, it should extract the
>>whole line (except NAME=). If there are several double quotes, it should
>>extract between the first two (that I seem to have achieved).
>>
>>My current testscript looks as below:
>>
>>---
>>#!/usr/bin/perl -w
>>
>>use strict;
>>
>>$_ = 'NAME = "between quotes" not this" nor this';
>>
>>print $1."\n" if m/\s*NAME\s*=\s*"*(.*?)"|$/s;
>>---
>>
>>This prints out as I want:
>>between quotes
>>
>>But if I delete the last two double quotes and just keep the first, it
>>prints nothing. If I delete the first double quote also, I get an
>>"uninitialized value" warning.
>>
>>Does anybody know a smooth solution to this?

>
>
> print "$1\n" if /\s*NAME\s*=\s*"?([^"]+)/;
>
>
>
> John


Well! That's a lot less complicated and smarter than anything I thought of,
and as a bonus it works! I really should be thinking in a more KISS-like way.

Thanks!

Mats
 