Velocity Reviews - Computer Hardware Reviews

Python parser

 
 
Clarendon
 
      03-02-2009
Can somebody recommend a good parser that can be used in Python
programs? I need a parser with a large grammar that can cover a large
amount of random text.

Thank you very much.
 
Lie Ryan
 
      03-02-2009
Clarendon wrote:
> Can somebody recommend a good parser that can be used in Python
> programs?


Do you want a parser that can parse Python source code, or a parser that
works in Python? If the latter, pyparsing is a popular choice. PLY is
another. There are many choices:
http://nedbatchelder.com/text/python-parsers.html

For simple parsing, the re module might be enough.
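
As a rough illustration of what "simple parsing" with the re module can look like, here is a minimal sketch that splits text into sentences and tokens. The pattern and the function names (`TOKEN_RE`, `tokenize`, `split_sentences`) are made up for this example and are not from any library; real web text will need more care (abbreviations, URLs, Unicode).

```python
import re

# Hypothetical token pattern for illustration: a word (optionally with an
# apostrophe part such as "don't"), a number, or one punctuation character.
TOKEN_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d+(?:\.\d+)?|[^\sA-Za-z\d]")


def tokenize(text):
    """Return the word, number, and punctuation tokens found in text."""
    return TOKEN_RE.findall(text)


def split_sentences(text):
    """Naively split after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


if __name__ == "__main__":
    for sentence in split_sentences("I don't know. It costs 3.50!"):
        print(tokenize(sentence))
```

This only produces a flat token list. It says nothing about grammatical structure, which is where pyparsing, PLY, and the other tools come in.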

> I need a parser with large grammar that can cover a large
> amount of random texts.


Random text? Uh... what's the purpose of parsing random text?
 
Clarendon
 
      03-02-2009
Thank you, Lie and Andrew for your help.

I have studied NLTK quite closely, but its parsers seem to be for
demonstration only. Its grammar set is very limited, and even a parser
that is supposed to be "large" does not have enough grammar to cover
common words like "I".

I need to parse a large amount of text collected from the web (around
a couple hundred sentences at a time) very quickly, so I need a parser
whose grammar is broad enough to cover all of it. This is what I mean
by 'random'.

An advanced programmer has advised me that Python is rather slow at
processing large amounts of data, and that this is why few parsers are
written in Python. He recommends that I use Jython so that I can call
parsers written in Java. What are your views about this?

Thank you very much.



 
 
Robert Kern
 
      03-02-2009
On 2009-03-02 16:14, Clarendon wrote:
> Thank you, Lie and Andrew for your help.
>
> I have studied NLTK quite closely but its parsers seem to be only for
> demo. It has a very limited grammar set, and even a parser that is
> supposed to be "large" does not have enough grammar to cover common
> words like "I".
>
> I need to parse a large amount of texts collected from the web (around
> a couple hundred sentences at a time) very quickly, so I need a parser
> with a broad scope of grammar, enough to cover all these texts. This
> is what I mean by 'random'.
>
> An advanced programmer has advised me that Python is rather slow in
> processing large data, and so there are not many parsers written in
> Python. He recommends that I use Jython to use parsers written in
> Java. What are your views about this?


Let me clarify your request: you are asking for a parser of the English
language, yes? Not just parsers in general? Not many English-language parsers
are written in *any* language.

AFAIK, there is no English-language parser written in Python beyond those
available in NLTK. There are probably none (in any language) which will robustly
parse all of the grammatically correct English texts you will encounter by
scraping the web, much less all of the incorrect English you will encounter.

Python can be rather slow for certain kinds of processing of large volumes (and
really quite speedy for others). In this case, it's neither here nor there; the
algorithms are reasonably slow in any language.
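
To make the cost and the coverage problem concrete, here is a minimal sketch of a CYK recognizer over a toy grammar in Chomsky normal form. It is not how NLTK, link-grammar, or the Stanford Parser work internally; the grammar, the lexicon, and the name `cyk_recognize` are invented for illustration. The three nested loops over span, start, and split point give the usual cubic growth in sentence length, whatever language the loops are written in.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form. All rules and words are illustrative.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "sees": {"V"},
}


def cyk_recognize(words, start="S"):
    """Return True if the toy grammar derives the given list of words."""
    n = len(words)
    if n == 0:
        return False
    # table[i, j] holds the nonterminals that derive words[i:j].
    table = defaultdict(set)
    for i, word in enumerate(words):
        # Words missing from the lexicon get an empty set of categories.
        table[i, i + 1] = set(LEXICON.get(word.lower(), ()))
    for span in range(2, n + 1):          # length of the substring
        for i in range(n - span + 1):     # start of the substring
            j = i + span
            for k in range(i + 1, j):     # split point
                for left in table[i, k]:
                    for right in table[k, j]:
                        table[i, j] |= BINARY_RULES.get((left, right), set())
    return start in table[0, n]


if __name__ == "__main__":
    print(cyk_recognize("the dog sees the cat".split()))
    print(cyk_recognize("I see the cat".split()))
```

The second sentence is rejected only because "I" and "see" are missing from the toy lexicon, which is the same kind of coverage gap described in the quoted message. A faster implementation language does not fix that.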

You may try your luck with link-grammar, which is implemented in C:

http://www.abisource.com/projects/link-grammar/

Or The Stanford Parser, implemented in Java:

http://nlp.stanford.edu/software/lex-parser.shtml

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

 
 
Gabriel Genellina
 
      03-04-2009
On Tue, 03 Mar 2009 22:39:19 -0200, Alan G Isaac <(E-Mail Removed)>
wrote:

> This reminds me: the SimpleParse developers ran into
> some troubles porting to Python 2.6. It would be
> great if someone could give them a hand.


Do you mean the simpleparser project on SourceForge, whose latest alpha
was released in 2003? Or something else?

--
Gabriel Genellina

 
 
Kay Schluehr
 
      03-04-2009
On 2 Mar., 23:14, Clarendon <(E-Mail Removed)> wrote:
> Thank you, Lie and Andrew for your help.
>
> I have studied NLTK quite closely but its parsers seem to be only for
> demo. It has a very limited grammar set, and even a parser that is
> supposed to be "large" does not have enough grammar to cover common
> words like "I".
>
> I need to parse a large amount of texts collected from the web (around
> a couple hundred sentences at a time) very quickly, so I need a parser
> with a broad scope of grammar, enough to cover all these texts. This
> is what I mean by 'random'.
>
> An advanced programmer has advised me that Python is rather slow in
> processing large data, and so there are not many parsers written in
> Python. He recommends that I use Jython to use parsers written in
> Java. What are your views about this?
>
> Thank you very much.


You'll most likely need a GLR parser.

There is

http://www.lava.net/~newsham/pyggy/

which I tried once and found to be broken.

Then there is the Spark toolkit

http://pages.cpsc.ucalgary.ca/~aycock/spark/

I checked it out years ago and found it was very slow.

Then there is Bison, which can be used with a %glr-parser declaration
and the PyBison bindings

http://www.freenet.org.nz/python/pybison/

Bison might be solid and fast. I can't say anything about the quality
of the bindings though.
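
On why a GLR parser comes up for natural language at all: English grammars are ambiguous, and a GLR parser is designed to follow all the alternatives instead of committing to one. The sketch below is not a GLR implementation and does not use pyggy, Spark, or PyBison. It is a small chart-based counter, with an invented grammar and the hypothetical name `count_parses`, that shows how a single sentence can have more than one parse tree.

```python
from collections import defaultdict

# Illustrative ambiguous grammar in Chomsky normal form:
# (parent, left child, right child)
RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # prepositional phrase attaches to the verb phrase
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # prepositional phrase attaches to the noun phrase
    ("PP", "P", "NP"),
]
LEXICON = {
    "i": "NP",
    "saw": "V",
    "the": "Det",
    "man": "N",
    "telescope": "N",
    "with": "P",
}


def count_parses(words, start="S"):
    """Count the distinct parse trees the toy grammar gives for words."""
    n = len(words)
    # counts[i, j, symbol] is the number of trees for words[i:j] rooted
    # at symbol. Missing entries default to zero.
    counts = defaultdict(int)
    for i, word in enumerate(words):
        tag = LEXICON.get(word.lower())
        if tag is not None:
            counts[i, i + 1, tag] = 1
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for parent, left, right in RULES:
                    counts[i, j, parent] += (
                        counts[i, k, left] * counts[k, j, right]
                    )
    return counts[0, n, start]


if __name__ == "__main__":
    print(count_parses("I saw the man with the telescope".split()))
```

Under this toy grammar, "with the telescope" can attach either to "saw" or to "the man", so the sentence has two trees. A parser for web text has to cope with this kind of ambiguity on almost every sentence.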
 