How to substitute everything but something?

 
 
Eric.Medlin@gmail.com
 
      07-19-2006
I have $rawData[$i] =~ s/>.*<//; That will replace everything inside >
< including > and < with nothing. But, I want to replace everything but
what is inside > and <. How can I negate what I have?

 
 
 
 
 
Paul Lalli
 
      07-19-2006
(E-Mail Removed) wrote:
> I have $rawData[$i] =~ s/>.*<//; That will replace everything inside >
> < including > and < with nothing. But, I want to replace everything but
> what is inside > and <. How can I negate what I have?


TIMTOWTDI

$rawData[$i] =~ s/.*?(>.*<).*/$1/;

$rawData[$i] =~ /(>.*<)/ and $rawData[$i] = $1;

... and probably others.

Paul Lalli
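
A minimal sketch of the two forms above applied to one sample string, to show that both keep the > and < delimiters. The sample string and the sub names keep_by_substitution and keep_by_match are made up for illustration; they are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical sample; the thread never shows the real contents of @rawData.
my $sample = 'junk before >payload< junk after';

# Form 1: one substitution that replaces the matched text with the capture.
sub keep_by_substitution {
    my ($s) = @_;
    $s =~ s/.*?(>.*<).*/$1/;
    return $s;
}

# Form 2: match first, then assign the capture only if the match succeeded.
sub keep_by_match {
    my ($s) = @_;
    $s =~ /(>.*<)/ and $s = $1;
    return $s;
}

print keep_by_substitution($sample), "\n";   # >payload<
print keep_by_match($sample), "\n";          # >payload<
```

Both forms leave a string without a >...< span untouched. Because . does not match a newline unless /s is given, the substitution form leaves the other lines of a multi-line string in place, whereas the match-and-assign form discards them.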

 
 
 
 
 
Ted Zlatanov
 
      07-19-2006
On 19 Jul 2006, (E-Mail Removed) wrote:

> I have $rawData[$i] =~ s/>.*<//; That will replace everything inside >
> < including > and < with nothing. But, I want to replace everything but
> what is inside > and <. How can I negate what I have?


If you are trying to extract text from SGML/HTML/XML/etc. there are
easier ways. The way you are attempting will not work in many common
cases. See 'perldoc -q html' to get started.

In any case, it may help to think of the problem as "extraction" of
what's between '>' and '<', rather than "elimination" of everything
except what's between those two delimiters. I hope I understood your
request correctly.

You could do something like what's below. Again, consider using a
parser specific to your data instead of grabbing text like this.

Ted

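#!/usr/bin/perl

use warnings;
use strict;
use Data::Dumper;

# Slurp the whole DATA section into one string.
my $text = join '', <DATA>;
# In list context, m//g returns every capture: each non-greedy run of
# text between a '>' and the next '<' on the same line.
my @data = ($text =~ m/>(.*?)</g);
print Dumper \@data;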
__DATA__
<html><head></head><body>HTML text here</body></html>
>just text here<

plain text here
<><><>text here<><
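
On the sample data above, adjacent tags such as >< also match, so the list returned by the global match includes empty strings. A minimal sketch of filtering those out, assuming empty captures are not wanted (the thread does not say either way); the sub name text_between is hypothetical.

```perl
use strict;
use warnings;

# Return every non-empty run of text found between a '>' and the next '<'.
# Dropping empty captures is an assumption, not something the thread specifies.
sub text_between {
    my ($text) = @_;
    my @all = $text =~ m/>(.*?)</g;      # list context: all captures
    return grep { length } @all;         # keep only the non-empty ones
}

my @found = text_between('<html><body>HTML text here</body></html>');
print "$_\n" for @found;                 # HTML text here
```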
 
 
John W. Krahn
 
      07-19-2006
(E-Mail Removed) wrote:
> I have $rawData[$i] =~ s/>.*<//; That will replace everything inside >
> < including > and < with nothing. But, I want to replace everything but
> what is inside > and <. How can I negate what I have?


s/.*>//, s/<.*// for $rawData[ $i ];


John
--
use Perl;
program
fulfillment
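
The one-liner above relies on for aliasing $_ to the array element, so both substitutions modify $rawData[$i] in place. A minimal sketch spelling that out on a made-up string; the sub name strip_outside is hypothetical.

```perl
use strict;
use warnings;

# Strip everything through the last '>' and then everything from the
# next '<' onward, leaving only the text between them.
sub strip_outside {
    my ($s) = @_;
    for ($s) {          # aliases $_ to $s, so s/// edits $s directly
        s/.*>//;        # greedy: removes through the LAST '>'
        s/<.*//;        # removes from the first remaining '<' to the end
    }
    return $s;
}

print strip_outside('before >inside< after'), "\n";   # inside
```

Unlike the match-and-assign forms, this version still changes the string when only one of the two delimiters is present.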
 
 
Ted Zlatanov
 
      07-19-2006
On 19 Jul 2006, (E-Mail Removed) wrote:

(E-Mail Removed) wrote:
>> I have $rawData[$i] =~ s/>.*<//; That will replace everything inside >
>> < including > and < with nothing. But, I want to replace everything but
>> what is inside > and <. How can I negate what I have?

>
> $rawData[$i] =~ /(>.*<)/ and $rawData[$i] = $1;


He asked for what's inside > <, so the above should be

$rawData[$i] =~ />(.*)</ and $rawData[$i] = $1;

Also, while the OP didn't specifically say it, he probably wants the
non-greedy match

$rawData[$i] =~ />(.*?)</ and $rawData[$i] = $1;

so the extracted data doesn't have < and > pairs inside it.

Ted
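
A minimal sketch of the difference between the greedy and non-greedy captures above, on a made-up line containing two >...< spans. The sub names capture_greedy and capture_lazy are hypothetical.

```perl
use strict;
use warnings;

# Greedy: .* runs as far as it can, so the capture ends at the LAST '<'.
sub capture_greedy {
    my ($s) = @_;
    my ($got) = $s =~ />(.*)</;
    return $got;
}

# Non-greedy: .*? stops at the FIRST '<' after the opening '>'.
sub capture_lazy {
    my ($s) = @_;
    my ($got) = $s =~ />(.*?)</;
    return $got;
}

my $line = 'a >one< b >two< c';                     # made-up input
print 'greedy: ', capture_greedy($line), "\n";      # greedy: one< b >two
print 'lazy:   ', capture_lazy($line), "\n";        # lazy:   one
```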
 
 
Paul Lalli
 
      07-19-2006
Ted Zlatanov wrote:
> On 19 Jul 2006, (E-Mail Removed) wrote:
>
> (E-Mail Removed) wrote:
> >> I have $rawData[$i] =~ s/>.*<//; That will replace everything inside >
> >> < including > and < with nothing. But, I want to replace everything but
> >> what is inside > and <. How can I negate what I have?

> >
> > $rawData[$i] =~ /(>.*<)/ and $rawData[$i] = $1;

>
> He asked for what's inside > <, so the above should be
>
> $rawData[$i] =~ />(.*)</ and $rawData[$i] = $1;


He also said he wants to "negate what I have". The two requirements
are contradictory, as what he has *does* replace > and <, so the
negation of that should *not* replace > and <.

I chose to abide by his final requirement. You chose to abide by his
first. Only the OP knows which one he meant.

> Also, while the OP didn't specifically say it, he probably wants the
> non-greedy match
>
> $rawData[$i] =~ />(.*?)</ and $rawData[$i] = $1;
>
> so the extracted data doesn't have < and > pairs inside it.


Now you're just being a mind reader.

Paul Lalli

 
 
Ted Zlatanov
 
      07-19-2006
On 19 Jul 2006, (E-Mail Removed) wrote:

(E-Mail Removed) wrote:
>> I have $rawData[$i] =~ s/>.*<//; That will replace everything inside >
>> < including > and < with nothing. But, I want to replace everything but
>> what is inside > and <. How can I negate what I have?

>
> s/.*>//, s/<.*// for $rawData[ $i ];


I think the OP's code will match the biggest >xyz< pair, while your
code will extract the last >xyz< pair. My followup will extract all
the >xyz< data. I don't think the problem as specified can be solved
exactly right, so maybe the OP should help us a little.

Ted
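
A minimal sketch putting the three readings side by side on one made-up string with two >...< spans. The sub name three_readings and the input are hypothetical; the patterns are the ones discussed in the thread.

```perl
use strict;
use warnings;

# Return the greedy span, the last span's contents, and all spans' contents.
sub three_readings {
    my ($in) = @_;

    # Greedy span as in the OP's pattern, kept rather than deleted.
    my ($biggest) = $in =~ /(>.*<)/;

    # Two-step stripping: through the last '>', then from the next '<' on.
    (my $last = $in) =~ s/.*>//;
    $last =~ s/<.*//;

    # Global non-greedy match: the contents of every span.
    my @all = $in =~ />(.*?)</g;

    return ($biggest, $last, \@all);
}

my ($biggest, $last, $all) = three_readings('x >first< y >second< z');
print "biggest: $biggest\n";     # biggest: >first< y >second<
print "last:    $last\n";        # last:    second
print "all:     @$all\n";        # all:     first second
```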
 
 
Ted Zlatanov
 
      07-19-2006
On 19 Jul 2006, (E-Mail Removed) wrote:

> Ted Zlatanov wrote:
>> On 19 Jul 2006, (E-Mail Removed) wrote:
>>
>>> (E-Mail Removed) wrote:
>>>> I have $rawData[$i] =~ s/>.*<//; That will replace everything inside >
>>>> < including > and < with nothing. But, I want to replace everything but
>>>> what is inside > and <. How can I negate what I have?
>>>
>>> $rawData[$i] =~ /(>.*<)/ and $rawData[$i] = $1;

>>
>> He asked for what's inside > <, so the above should be
>>
>> $rawData[$i] =~ />(.*)</ and $rawData[$i] = $1;

>
> He also said he wants to "negate what I have". The two requirements
> are contradictory, as what he has *does* replace > and <, so the
> negation of that should *not* replace > and <.
>
> I chose to abide by his final requirement. You chose to abide by his
> first. Only the OP knows which one he meant.


Yeah, see my followup to John Krahn, we don't really know what the
requirements are. I didn't read the last requirement the way you did,
obviously.

>> Also, while the OP didn't specifically say it, he probably wants the
>> non-greedy match
>>
>> $rawData[$i] =~ />(.*?)</ and $rawData[$i] = $1;
>>
>> so the extracted data doesn't have < and > pairs inside it.

>
> Now you're just being a mind reader.


Er, you can certainly interpret it that way. I read

"everything but what is inside > and <"

as "the first < should terminate 'what is inside'". Confusing
requirements breed confusion, I guess. Sorry for that, as I
perpetuated the confusion.

Ted
 