how to extract columns like awk $1 $5

 
 
Anand S Bisen
 
      01-07-2005
Hi

Is there a simple way to extract words separated by a space in Python,
the way I do it in awk '{print $4 $5}'? I am sure there should be some,
but I don't know it.

Thanks
n00b


 
beliavsky@aol.com
 
      01-07-2005
It takes a few more lines in Python, but you can do something like

for text in open("file.txt","r"):
    words = text.split()
    print words[4],words[5]

(assuming that awk starts counting from zero -- I forget).
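
For reference, awk numbers fields from 1 (and $0 is the whole line), so awk's $4 and $5 correspond to Python indices 3 and 4. A minimal sketch in Python 3 syntax, assuming whitespace-separated input; the helper name fields_4_and_5 and the sample lines are made up for illustration and do not come from the thread:

```python
def fields_4_and_5(line):
    """Return awk's $4 and $5 for one line (Python indices 3 and 4)."""
    words = line.split()  # no argument: split on runs of whitespace
    return words[3], words[4]


# Hypothetical sample input standing in for the lines of a file.
sample = ["a b c d e f", "1 2 3 4 5 6"]
for line in sample:
    print(*fields_4_and_5(line))
```

Note that awk's '{print $4 $5}' concatenates the two fields with nothing between them, while '{print $4, $5}' separates them with a space; the sketch above follows the second form.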

 
Jeremy Sanders
 
      01-07-2005
On Fri, 07 Jan 2005 12:15:48 -0500, Anand S Bisen wrote:

> Is there a simple way to extract words separated by a space in Python,
> the way I do it in awk '{print $4 $5}'? I am sure there should be some,
> but I don't know it.


mystr = '1 2 3 4 5 6'
parts = mystr.split()
print parts[3:5]

Jeremy

 
Roy Smith
 
      01-07-2005
Anand S Bisen wrote:
>Hi
>
>Is there a simple way to extract words separated by a space in Python,
>the way I do it in awk '{print $4 $5}'? I am sure there should be some,
>but I don't know it.


Something along the lines of:

words = input.split()
print words[4], words[5]
 
Paul Rubin
 
      01-07-2005
Roy Smith writes:
> Something along the lines of:
>
> words = input.split()
> print words[4], words[5]


That throws an exception if there are fewer than 6 fields, which might
or might not be what you want.
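
If the exception is not wanted, one option is to check the length before indexing; another is to use a slice, since slices never raise IndexError. A rough sketch in Python 3 syntax; the helper name pick and its default argument are hypothetical, not something proposed in the thread:

```python
def pick(words, index, default=""):
    """Return words[index], or default when the line has too few fields."""
    return words[index] if 0 <= index < len(words) else default


words = "only three fields".split()
print(repr(pick(words, 4)))  # index 4 does not exist, so the default is used
print(words[3:5])            # an out-of-range slice gives an empty list
```

Whether a short line should be skipped, padded, or treated as an error depends on the data, so the default here is only one possible choice.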
 
Dan Valentine
 
      01-08-2005
On Fri, 07 Jan 2005 12:15:48 -0500, Anand S Bisen wrote:

> Is there a simple way to extract words separated by a space in Python,
> the way I do it in awk '{print $4 $5}'? I am sure there should be some,
> but I don't know it.


i guess it depends on how faithfully you want to reproduce awk's behavior
and options.

as several people have mentioned, strings have the split() method for
simple tokenization, but blindly indexing into the resulting sequence
can give you an out-of-range exception. out of range indexes are no
problem for awk; it would just return an empty string without complaint.

note that the index bases are slightly different: python sequences
start with index 0, while awk's fields begin with $1. there IS a $0,
but it means the entire unsplit line.

the split() method accepts a separator argument, which can be used to
replicate awk's -F option / FS variable.

so, if you want to closely approximate awk's behavior without fear of
exceptions, you could try a small function like this:


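def awk_it(instring,index,delimiter=" "):
    # index 0 returns the whole line, like awk's $0; fields start at 1
    try:
        return [instring,instring.split(delimiter)[index-1]][max(0,min(1,index))]
    except IndexError:
        # field number past the last field: return "" as awk does
        return ""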


>>> print awk_it("a b c d e",0)
a b c d e
>>> print awk_it("a b c d e",1)
a
>>> print awk_it("a b c d e",5)
e
>>> print awk_it("a b c d e",6)



- dan
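
One detail worth keeping in mind with the delimiter argument: split(" ") produces empty strings between consecutive spaces, whereas split() with no argument splits on runs of whitespace, which is closer to awk's default field splitting. A variant sketch in Python 3 syntax under that assumption; the name awk_field and the sep parameter are illustrative and not part of the function above:

```python
def awk_field(line, index, sep=None):
    """Return awk-style field `index` of `line`; index 0 is the whole line."""
    if index == 0:
        return line
    fields = line.split(sep)  # sep=None splits on runs of whitespace
    if 1 <= index <= len(fields):
        return fields[index - 1]
    return ""  # field number out of range: empty string


print(awk_field("a  b   c", 2))         # repeated spaces are collapsed
print(awk_field("a:b:c", 3, sep=":"))   # explicit separator, like awk -F:
print(repr(awk_field("a b c", 7)))      # past the last field
```

Passing sep explicitly plays the role of awk's -F option; leaving it as None keeps the whitespace-run behaviour.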
 
Roy Smith
 
      01-08-2005
Dan Valentine wrote:

> On Fri, 07 Jan 2005 12:15:48 -0500, Anand S Bisen wrote:
>
> > Is there a simple way to extract words separated by a space in Python,
> > the way I do it in awk '{print $4 $5}'? I am sure there should be some,
> > but I don't know it.
>
> i guess it depends on how faithfully you want to reproduce awk's behavior
> and options.
>
> as several people have mentioned, strings have the split() method for
> simple tokenization, but blindly indexing into the resulting sequence
> can give you an out-of-range exception. out of range indexes are no
> problem for awk; it would just return an empty string without complaint.


It's pretty easy to create a list type which has awk-ish behavior:

class awkList (list):
    def __getitem__ (self, key):
        try:
            return list.__getitem__ (self, key)
        except IndexError:
            return ""

l = awkList ("foo bar baz".split())
print "l[0] = ", repr (l[0])
print "l[5] = ", repr (l[5])

-----------

Roy-Smiths-Computerlay$ ./awk.py
l[0] = 'foo'
l[5] = ''

Hmmm. There's something going on here I don't understand. The ref
manual (3.3.5 Emulating container types) says for __getitem__(), "Note:
for loops expect that an IndexError will be raised for illegal indexes
to allow proper detection of the end of the sequence." I expected my
little demo class to therefore break for loops, but they seem to work
fine:

>>> import awk
>>> l = awk.awkList ("foo bar baz".split())
>>> l
['foo', 'bar', 'baz']
>>> for i in l:
...     print i
...
foo
bar
baz
>>> l[5]
''

Given that I've caught the IndexError, I'm not sure how that's working.
 
Carl Banks
 
      01-08-2005
Roy Smith wrote:
> Hmmm. There's something going on here I don't understand. The ref
> manual (3.3.5 Emulating container types) says for __getitem__(), "Note:
> for loops expect that an IndexError will be raised for illegal indexes
> to allow proper detection of the end of the sequence." I expected my
> little demo class to therefore break for loops, but they seem to work
> fine:
>
> >>> import awk
> >>> l = awk.awkList ("foo bar baz".split())
> >>> l
> ['foo', 'bar', 'baz']
> >>> for i in l:
> ...     print i
> ...
> foo
> bar
> baz
> >>> l[5]
> ''
>
> Given that I've caught the IndexError, I'm not sure how that's working.


The title of that particular section is "Emulating container types",
which is not what you're doing, so it doesn't apply here. For built-in
types, iterators are at work. The list iterator probably doesn't even
call getitem, but accesses the items directly from the C structure.
--
CARL BANKS
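
The difference can be observed with a small experiment. The sketch below is in Python 3 syntax and assumes CPython; the call counter and the OnlyGetitem class are additions for illustration and are not from the thread. The list subclass is iterated without __getitem__ being called, while a class that defines only __getitem__ relies on IndexError to end the loop, so swallowing it makes the iteration endless (islice is used to cut it off):

```python
from itertools import islice


class AwkList(list):
    calls = 0  # counts how often our __getitem__ override runs

    def __getitem__(self, key):
        AwkList.calls += 1
        try:
            return list.__getitem__(self, key)
        except IndexError:
            return ""


class OnlyGetitem:
    """No __iter__: a for loop falls back to __getitem__(0), (1), ..."""

    def __init__(self, items):
        self.items = list(items)

    def __getitem__(self, key):
        try:
            return self.items[key]
        except IndexError:
            return ""  # never signals the end of the sequence


l = AwkList("foo bar baz".split())
seen = [item for item in l]   # uses the built-in list iterator
print(seen, AwkList.calls)    # the override was not needed for iteration

endless = OnlyGetitem(["foo", "bar"])
print(list(islice(endless, 4)))  # keeps yielding "" past the real items
```

So the note in the manual describes the second case, where __getitem__ alone drives the loop, rather than a subclass of a built-in list.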

 