Velocity Reviews - Computer Hardware Reviews


Text mining in Python

Hello everyone,

I need to do the following:

(0. transform words in a document into word roots)

1. analyze a set of documents to see which words are highly frequent

2. detect clusters of those highly frequent words

3. map the clusters to some "special" keywords

4. rank the documents on clusters and "top n" most frequent words

5. provide search that would rank documents according to whether search
words were "special" cluster keywords or frequent words
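To make the steps concrete, here is a rough, standard-library-only sketch of steps 0, 1, 4 and 5. It is only an illustration of the idea, not a recommendation of an engine: `crude_stem` is a naive stand-in for a real stemmer (such as the Porter stemmer in NLTK), all function names and weights are made up for this example, and the clustering in steps 2 and 3 is left out, so the "special" keywords are simply passed in by hand.

```python
import re
from collections import Counter


def crude_stem(word):
    # Step 0 (illustrative only): naive suffix stripping, not a real stemmer.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    # Lowercase, keep alphabetic runs, and reduce each word to its crude root.
    return [crude_stem(w) for w in re.findall(r"[a-z]+", text.lower())]


def top_terms(docs, n):
    # Step 1: the n most frequent word roots across all documents.
    counts = Counter()
    for text in docs.values():
        counts.update(tokenize(text))
    return [term for term, _ in counts.most_common(n)]


def rank_documents(docs, terms):
    # Step 4: order documents by the share of their tokens that are in `terms`.
    terms = set(terms)
    scores = {}
    for name, text in docs.items():
        tokens = tokenize(text)
        hits = sum(1 for t in tokens if t in terms)
        scores[name] = hits / len(tokens) if tokens else 0.0
    return sorted(scores, key=scores.get, reverse=True)


def search(query, docs, special, frequent, w_special=3.0, w_frequent=1.0):
    # Step 5: a query term counts more if it is a "special" cluster keyword,
    # less if it is merely frequent, and not at all otherwise.
    # `special` and `frequent` are sets of already-stemmed terms; the
    # weights are arbitrary assumptions.
    query_terms = tokenize(query)
    results = []
    for name, text in docs.items():
        counts = Counter(tokenize(text))
        score = 0.0
        for term in query_terms:
            if term in special:
                weight = w_special
            elif term in frequent:
                weight = w_frequent
            else:
                weight = 0.0
            score += weight * counts[term]
        if score > 0:
            results.append((name, score))
    return sorted(results, key=lambda pair: pair[1], reverse=True)
```

Here `docs` is assumed to be a plain dict mapping a document name to its text. A dedicated engine would replace the per-query re-tokenizing with a prebuilt index.
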

Is there a good open source engine out there that would be suitable
for the task at hand? Does anybody have experience with one?

Now, I do know about NLTK and the Python bindings to UIMA. The thing is, I do
not know whether they are good for the above task. If somebody has
experience with those or with other tools and can say whether they are good
for this, please post.

