Velocity Reviews


Jake Gottlieb 06-10-2004 04:19 PM

Extracting Text
 
I am trying to extract lines with:

GO:0009986

out of:


ENSG00000113494.3 AAA60174.1 GO:0009123 5618 216638_s_at
ENSG00000113494.3 AAD32032.1 GO:0009345 5618 216638_s_at
ENSG00000113494.3 AAK32703.1 GO:0009764 5618 216638_s_at
ENSG00000113494.3 AAH59392.1 GO:0009986 5618 216638_s_at

ENSG00000113494.3 AAA60174.1 GO:0009986 206346_at
ENSG00000113494.3 AAD32032.1 GO:0009867 206346_at
ENSG00000113494.3 AAK32703.1 GO:0004567 206346_at
ENSG00000113494.3 AAH59392.1 GO:0000678 206346_at

ENSG00000113494.3 AAA60174.1 GO:0009986 211917_s_at
ENSG00000113494.3 AAD32032.1 GO:0009986 211917_s_at
ENSG00000113494.3 AAK32703.1 GO:0005764 211917_s_at
ENSG00000113494.3 AAH59392.1 GO:0009986 211917_s_at

ENSG00000113494.3 AAA60174.1 GO:0009986 210476_s_at
ENSG00000113494.3 AAD32032.1 GO:0003765 210476_s_at
ENSG00000113494.3 AAK32703.1 GO:0009986 210476_s_at
ENSG00000113494.3 AAH59392.1 GO:0005876 210476_s_at

I have been trying to write a program for it, but can't seem to do it.
If someone could help, I would be very appreciative (I am sure it's
really easy, but Perl is new to me).

Thanks

Paul Lalli 06-10-2004 04:23 PM

Re: Extracting Text
 
On Thu, 10 Jun 2004, Jake Gottlieb wrote:

> I am trying to extract lines with:
>
> GO:0009986
>
> out of:
>
>
> ENSG00000113494.3 AAA60174.1 GO:0009123 5618 216638_s_at
> ENSG00000113494.3 AAD32032.1 GO:0009345 5618 216638_s_at
> ENSG00000113494.3 AAK32703.1 GO:0009764 5618 216638_s_at
> ENSG00000113494.3 AAH59392.1 GO:0009986 5618 216638_s_at
>
> ENSG00000113494.3 AAA60174.1 GO:0009986 206346_at
> ENSG00000113494.3 AAD32032.1 GO:0009867 206346_at
> ENSG00000113494.3 AAK32703.1 GO:0004567 206346_at
> ENSG00000113494.3 AAH59392.1 GO:0000678 206346_at
>
> ENSG00000113494.3 AAA60174.1 GO:0009986 211917_s_at
> ENSG00000113494.3 AAD32032.1 GO:0009986 211917_s_at
> ENSG00000113494.3 AAK32703.1 GO:0005764 211917_s_at
> ENSG00000113494.3 AAH59392.1 GO:0009986 211917_s_at
>
> ENSG00000113494.3 AAA60174.1 GO:0009986 210476_s_at
> ENSG00000113494.3 AAD32032.1 GO:0003765 210476_s_at
> ENSG00000113494.3 AAK32703.1 GO:0009986 210476_s_at
> ENSG00000113494.3 AAH59392.1 GO:0005876 210476_s_at
>
> I have been trying to write a program for it, but can't seem to do it.
> If someone could help, I would be very appreciative (I am sure it's
> really easy, but Perl is new to me).


Show us what you've written so far, so we can help you to see why it
"doesn't work". You've shown us the input and we can deduce the desired
output. Now show us your code, and what output it gave, so we may see how
it doesn't meet your specifications.

Paul Lalli

Gunnar Hjalmarsson 06-10-2004 04:25 PM

Re: Extracting Text
 
Jake Gottlieb wrote:
> I am trying to extract lines with:
>
> GO:0009986


<snip>

> I have been trying to write a program for it, but can't seem to do
> it. If someone could help, I would be very appreciative (I am sure
> it's really easy, but Perl is new to me).


http://learn.perl.org/

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl


Jake Gottlieb 06-11-2004 07:35 AM

Re: Extracting Text
 
Gunnar Hjalmarsson <noreply@gunnar.cc> wrote in message news:<2irgdrFqh08hU1@uni-berlin.de>...
> Jake Gottlieb wrote:
> > I am trying to extract lines with:
> >
> > GO:0009986

>
> <snip>
>
> > I have been trying to write a program for it, but can't seem to do
> > it. If someone could help, I would be very appreciative (I am sure
> > it's really easy, but Perl is new to me).

>
> http://learn.perl.org/


Here is my code. I am sure it's wrong, and would be grateful if someone
could correct and complete it. I would like to extract lines from the
original code, and put them into another text file. I have been trying
for a while:

while (<file.txt>) {
$line = $_;
$yes = (index $line, 'GO:000');
if ($yes > -1) {
print "YES : $line";
}
if ($line =~ /ENSG\d+.\d\s+\S+\s+GO:\d{7}\s+\d+\s+/){
print "La GO! $line \n";
}
}

Peter Hickman 06-11-2004 08:35 AM

Re: Extracting Text
 
Jake Gottlieb wrote:
> Here is my code. I am sure it's wrong, and would be grateful if someone
> could correct and complete it. I would like to extract lines from the
> original code, and put them into another text file. I have been trying
> for a while:
>
> while (<file.txt>) {
> $line = $_;
> $yes = (index $line, 'GO:000');
> if ($yes > -1) {
> print "YES : $line";
> }
> if ($line =~ /ENSG\d+.\d\s+\S+\s+GO:\d{7}\s+\d+\s+/){
> print "La GO! $line \n";
> }
> }


If all you want is to display lines that contain the string GO:0009986,
then this will do the trick.

[peter@wasabi xxx]$ cat prog
#!/usr/bin/perl -w

use strict;
use warnings;

while ( my $line = <> ) {
next unless $line =~ m/\s+GO:0009986\s+/;

print $line;
}
[peter@wasabi xxx]$

Basically it reads from the files named on the command line (or from
standard input if none are given), skips any line that does not match the
regex, and prints the remaining lines to standard output.

[peter@wasabi xxx]$ perl prog file.txt
ENSG00000113494.3 AAH59392.1 GO:0009986 5618 216638_s_at
ENSG00000113494.3 AAA60174.1 GO:0009986 206346_at
ENSG00000113494.3 AAA60174.1 GO:0009986 211917_s_at
ENSG00000113494.3 AAD32032.1 GO:0009986 211917_s_at
ENSG00000113494.3 AAH59392.1 GO:0009986 211917_s_at
ENSG00000113494.3 AAA60174.1 GO:0009986 210476_s_at
ENSG00000113494.3 AAK32703.1 GO:0009986 210476_s_at
[peter@wasabi xxx]$

I'm not too sure what all the $yes stuff in your code was for, and
<file.txt> is not how you open or handle a file, but you have the right
idea with the regex, although it seems over-specified for the problem.
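
To illustrate the "over-specified" remark, here is a minimal sketch (the sub names are made up for this example and are not from the thread). The original pattern demands a purely numeric column after the GO ID, so it appears to miss the sample lines that have no 5618 column:

```perl
use strict;
use warnings;

# The pattern from the original attempt, unchanged.
sub matches_original {
    my ($line) = @_;
    return $line =~ /ENSG\d+.\d\s+\S+\s+GO:\d{7}\s+\d+\s+/ ? 1 : 0;
}

# A looser pattern that only looks for the wanted ID.
sub matches_simple {
    my ($line) = @_;
    return $line =~ /\sGO:0009986\s/ ? 1 : 0;
}

# Two lines from the posted data: one with the extra numeric column,
# one without it.
my @sample = (
    "ENSG00000113494.3 AAH59392.1 GO:0009986 5618 216638_s_at\n",
    "ENSG00000113494.3 AAA60174.1 GO:0009986 206346_at\n",
);

for my $line (@sample) {
    printf "original=%d simple=%d  %s",
        matches_original($line), matches_simple($line), $line;
}
```

In the second line the digits after the ID are followed by "_at" rather than whitespace, which is why the stricter pattern gives up there.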

Anno Siegel 06-11-2004 08:56 AM

Re: Extracting Text
 
Peter Hickman <peter@semantico.com> wrote in comp.lang.perl.misc:
> Jake Gottlieb wrote:


[...]

> If all you want is to display lines that contain the string GO:0009986
> then this will do the trick.
>
> [peter@wasabi xxx]$ cat prog
> #!/usr/bin/perl -w
>
> use strict;
> use warnings;
>
> while ( my $line = <> ) {
> next unless $line =~ m/\s+GO:0009986\s+/;
                           ^            ^
The "+"es make no difference here.

> print $line;
> }


That can be simplified to

/\sGO:0009986\s/ and print while <>;
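
A quick way to convince yourself of that: an unanchored \s+ succeeds exactly when a single \s does, so both patterns should accept and reject the same lines. This is only a sketch; the sub name and the third test line (with runs of spaces) are invented for the comparison:

```perl
use strict;
use warnings;

# True when both patterns give the same verdict for the line.
sub patterns_agree {
    my ($line) = @_;
    my $with_plus    = $line =~ /\s+GO:0009986\s+/ ? 1 : 0;
    my $without_plus = $line =~ /\sGO:0009986\s/   ? 1 : 0;
    return $with_plus == $without_plus;
}

my @lines = (
    "ENSG00000113494.3 AAH59392.1 GO:0009986 5618 216638_s_at\n",
    "ENSG00000113494.3 AAD32032.1 GO:0009867 206346_at\n",
    "AAA60174.1    GO:0009986    206346_at\n",   # hypothetical: runs of spaces
);

for my $line (@lines) {
    my $verdict = patterns_agree($line) ? "agree:  " : "differ: ";
    print $verdict, $line;
}
```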

Anno

Gunnar Hjalmarsson 06-11-2004 09:24 AM

Re: Extracting Text
 
Jake Gottlieb wrote:
> Here is my code. I am sure it's wrong,


Please be more specific about the problem. You'd better study the
posting guidelines for this group:

http://mail.augustmail.com/~tadmc/cl...uidelines.html

> and would be grateful if someone could correct and complete it. I
> would like to extract lines from the original code, and put them
> into another text file.


Below please find a couple of comments. If you want to write something
to another file, you should open that file for writing...

> while (<file.txt>) {


That does not open the file for reading. This does:

open my $fh, '< file.txt' or die $!;
while (<$fh>) {

See

perldoc -f open

> $line = $_;
> $yes = (index $line, 'GO:000');


You should have

use strict;
use warnings;

in the beginning of the program, and declare the variables you introduce:

my $line = $_;
my $yes = (index $line, 'GO:000');
----^^
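
Putting these comments together, a minimal sketch of the whole task might look like the following. The sub name extract_lines and the output name matches.txt are assumptions made for illustration; only file.txt and the ID come from the thread.

```perl
use strict;
use warnings;

# Copy every line of $in_path that contains $id to $out_path.
# Returns the number of lines written.
sub extract_lines {
    my ($in_path, $out_path, $id) = @_;

    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";

    my $count = 0;
    while (my $line = <$in>) {
        next unless index($line, $id) >= 0;   # plain substring test
        print {$out} $line;
        $count++;
    }

    close $in;
    close $out or die "Cannot close $out_path: $!";
    return $count;
}

# The file names are placeholders; this only runs if file.txt exists.
if (-e 'file.txt') {
    my $n = extract_lines('file.txt', 'matches.txt', 'GO:0009986');
    print "Wrote $n line(s) to matches.txt\n";
}
```

Opening the output with '>' truncates any existing file of that name, so pick a name you do not mind overwriting.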

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl


Tore Aursand 06-11-2004 10:24 AM

Re: Extracting Text
 
On Fri, 11 Jun 2004 00:35:59 -0700, Jake Gottlieb wrote:
> while (<file.txt>) {


That doesn't read from "file.txt". This one does (untested):

open( FH, '<', 'file.txt' ) or die "$!\n";
while ( <FH> ) {
# ...
}

> $line = $_;
> $yes = (index $line, 'GO:000');
> if ($yes > -1) {
> print "YES : $line";
> }
> if ($line =~ /ENSG\d+.\d\s+\S+\s+GO:\d{7}\s+\d+\s+/){
> print "La GO! $line \n";
> }
> }


If you are sure that you can match on 'GO:000', you're on the right track
using 'index'. But you don't need any regular expressions (untested):

open( FH, '<', 'file.txt' ) or die "$!\n";
while ( <FH> ) {
next unless ( index($_, 'GO:000') >= 0 );
print;
}
close( FH );

Also: Be sure to 'use strict' and 'use warnings' in your script(s).
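
One caveat about the substring: in the data as posted, every line appears to contain 'GO:000', so that test would print everything. A small sketch (the sub name lines_containing is made up for this example) comparing the short prefix with the full ID on three of the posted lines:

```perl
use strict;
use warnings;

# Return the lines that contain $needle as a plain substring.
sub lines_containing {
    my ($needle, @lines) = @_;
    return grep { index($_, $needle) >= 0 } @lines;
}

my @sample = (
    "ENSG00000113494.3 AAA60174.1 GO:0009123 5618 216638_s_at\n",
    "ENSG00000113494.3 AAH59392.1 GO:0009986 5618 216638_s_at\n",
    "ENSG00000113494.3 AAD32032.1 GO:0009867 206346_at\n",
);

my @loose = lines_containing('GO:000',     @sample);
my @exact = lines_containing('GO:0009986', @sample);

printf "GO:000 matches %d line(s), GO:0009986 matches %d\n",
    scalar @loose, scalar @exact;
```

With these three lines the short prefix selects all of them and the full ID selects only the second.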


--
Tore Aursand <tore@aursand.no>
"Poor management can increase software costs more rapidly than any
other factor." (Barry Boehm)

John Bokma 06-11-2004 10:48 AM

Re: Extracting Text
 
Tore Aursand wrote:

> next unless ( index($_, 'GO:000') >= 0 );


index($_, 'GO:000') > -1 or next;

--
John MexIT: http://johnbokma.com/mexit/
personal page: http://johnbokma.com/
Experienced Perl programmer available: http://castleamber.com/
Happy Customers: http://castleamber.com/testimonials.html

Anno Siegel 06-11-2004 11:08 AM

Re: Extracting Text
 
John Bokma <postmaster@castleamber.com> wrote in comp.lang.perl.misc:
> Tore Aursand wrote:
>
> > next unless ( index($_, 'GO:000') >= 0 );

>
> index($_, 'GO:000') > -1 or next;


1 + index $_, 'GO:000' or next;
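
Why this works: index returns -1 when the substring is absent, so adding 1 gives 0 (false), while any hit, including one at position 0, gives a true value. A sketch comparing the three tests from this sub-thread (the found_* sub names are invented for the comparison):

```perl
use strict;
use warnings;

# The three equivalent tests, wrapped so they can be compared.
sub found_ge  { my ($s, $sub) = @_; return index($s, $sub) >= 0 ? 1 : 0 }
sub found_gt  { my ($s, $sub) = @_; return index($s, $sub) > -1 ? 1 : 0 }
sub found_add { my ($s, $sub) = @_; return 1 + index($s, $sub)  ? 1 : 0 }

my @lines = (
    "ENSG00000113494.3 AAH59392.1 GO:0009986 5618 216638_s_at\n",
    "GO:000 at the very start\n",    # hypothetical: match at position 0
    "no identifier on this line\n",  # hypothetical: no match
);

for my $line (@lines) {
    printf "%d %d %d  %s",
        found_ge($line,  'GO:000'),
        found_gt($line,  'GO:000'),
        found_add($line, 'GO:000'),
        $line;
}
```

The middle line is the reason a bare "index(...) or next" would be wrong: a match at position 0 returns 0, which is false.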

Anno

