summarize text
hello list,
does anyone know of a library which permits to summarise text? i've been looking at nltk but haven't found anything yet. any help would be very welcome.

thank you all in advance,
robin
Re: summarize text
> does anyone know of a library which permits to summarise text?
> i've been looking at nltk but haven't found anything yet. any
> help would be very welcome.

Well, summarizing text is one of those things that generally takes a brain-cell or two to do. Automating the process would require doing it either smartly (some sort of neural-net/NLP/Markov-chain technology, which is a non-trivial task--something one might consider braving in the 3rd or 4th-year of a university computer-science program), or doing it fairly dumbly.

As an example of a "dumb" solution, you can use regexps to trim off the first few words and the last few words and call that a "summary":

    >>> import re
    >>> r = re.compile(r'^(.{8}.*?\b)\s.*\s(\b.{8}.*?)', re.DOTALL)
    >>> s = """This is the first line
    ... and it has a second line
    ... and a third line
    ... and the last line is the fourth line."""
    >>> result = r.sub(r"\1...\2",s.strip())
    >>> result
    'This is the...fourth line.'

You can adjust the "{8}" portions for more or less leader/trailing context characters. The regexp might need a bit of tweaking for somewhat short strings, but if they're fairly short, one might not need to summarize them ;)

-tkc
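For anyone wanting to reuse the regexp trick, here is a minimal sketch that wraps it in a function. The name `trim_summary` and the `lead`, `tail` and `sep` parameters are made up for illustration and are not from the post above; the pattern itself is the same one, with the two "{8}" counts made adjustable. It is only the "dumb" truncation approach, not real summarization.

```python
import re


def trim_summary(text, lead=8, tail=8, sep="..."):
    """Keep the start and the end of `text`, dropping the middle.

    `lead` and `tail` stand in for the two "{8}" counts in the
    original pattern (hypothetical parameter names).
    """
    text = text.strip()
    pattern = re.compile(
        r'^(.{%d}.*?\b)\s.*\s(\b.{%d}.*?)' % (lead, tail),
        re.DOTALL,
    )
    # A function replacement avoids any escaping issues in `sep`.
    # If the pattern does not match (e.g. the text is too short),
    # sub() returns the text unchanged.
    return pattern.sub(lambda m: m.group(1) + sep + m.group(2), text, count=1)


s = """This is the first line
and it has a second line
and a third line
and the last line is the fourth line."""

print(trim_summary(s))      # This is the...fourth line.
print(trim_summary("hi"))   # hi
```

Strings too short for the pattern to match come back as they were, which seems like a reasonable fallback for the short-string case mentioned above.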
Re: summarize text
robin wrote:
> hello list,
>
> does anyone know of a library which permits to summarise text? i've
> been looking at nltk but haven't found anything yet. any help would be

unclear what you're asking, maybe look at:

http://www.cs.waikato.ac.nz/~ml/weka/index.html
http://www.kdnuggets.com/software/suites.html
http://www.ailab.si/orange
http://mallet.cs.umass.edu/index.php/Main_Page
http://minorthird.sourceforge.net/
http://www.dia.uniroma3.it/db/roadRunner/
http://www.lemurproject.org/
Re: summarize text
thanks for all your replies. lemur looks pretty interesting!

robin

gene tani wrote:
> robin wrote:
> > hello list,
> >
> > does anyone know of a library which permits to summarise text? i've
> > been looking at nltk but haven't found anything yet. any help would be
>
> unclear what you're asking, maybe look at:
> http://www.cs.waikato.ac.nz/~ml/weka/index.html
> http://www.kdnuggets.com/software/suites.html
> http://www.ailab.si/orange
> http://mallet.cs.umass.edu/index.php/Main_Page
> http://minorthird.sourceforge.net/
> http://www.dia.uniroma3.it/db/roadRunner/
> http://www.lemurproject.org/
Re: summarize text
.... sorry, I thought you said "summarize Proust".
:)