"surf" <> writes:
> Some web sites like Yahoo have tons of people in their personals,
> but the ability to search the profiles of people is so horribly
> limited. Has anyone written some kind of Perl interface or some way
> to do more advanced searches of personals ?
I haven't for Yahoo, but I once wrote my own system for some other
personals sites that had much worse search and display capabilities.
The mining program broke down like this:
1. Get a list of new profiles since the last run. (This part varies a
   lot from site to site.)
2. For each profile, check it against my requirements (age, weight,
   whatever).
3. If it matches, save the profile text to a file (generally with the
   extraneous ads and stuff stripped). Parse out the person's vital
   statistics and save them to a database. Fetch any pictures
   belonging to the profile, save them under a naming scheme that ties
   them to the profile, and add links to them to the profile file.
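
The filter-and-save steps above could be sketched roughly like this. It is written in Python for illustration and is not the original program; the Profile fields, the age-only requirement, the id_index.ext image naming, and the table layout are all assumptions. Fetching pages and pictures is left out, since that part differs on every site.

```python
import html
import os
import sqlite3
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class Profile:
    # Hypothetical record for one already-fetched profile.
    profile_id: str
    age: int
    text: str
    image_urls: list = field(default_factory=list)


def matches(profile, min_age, max_age):
    # Stand-in for "my requirements"; only age is checked here.
    return min_age <= profile.age <= max_age


def image_filename(profile_id, index, url):
    # Assumed naming scheme: <profile id>_<n><ext>, so images sort next
    # to their profile. Falls back to .jpg when the URL has no extension.
    ext = os.path.splitext(urlparse(url).path)[1] or ".jpg"
    return f"{profile_id}_{index}{ext}"


def build_page(profile):
    # One small HTML file per profile: the cleaned text plus image links.
    names = [image_filename(profile.profile_id, i, url)
             for i, url in enumerate(profile.image_urls, 1)]
    links = "\n".join(f'<a href="{n}">{n}</a>' for n in names)
    body = html.escape(profile.text)
    return f"<html><body>\n<pre>{body}</pre>\n{links}\n</body></html>\n"


def save_stats(conn, profile):
    # Vital statistics go into a local table (assumed layout).
    conn.execute("CREATE TABLE IF NOT EXISTS profiles "
                 "(id TEXT PRIMARY KEY, age INTEGER)")
    conn.execute("INSERT OR REPLACE INTO profiles VALUES (?, ?)",
                 (profile.profile_id, profile.age))


def process(profiles, conn, min_age, max_age):
    # Returns {filename: html} for the profiles that passed the filter.
    pages = {}
    for profile in profiles:
        if not matches(profile, min_age, max_age):
            continue
        save_stats(conn, profile)
        pages[f"{profile.profile_id}.html"] = build_page(profile)
    conn.commit()
    return pages
```

The caller would write each returned page to disk and download each picture to the name image_filename() gives it.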
This resulted in a nice database on my local system holding only the
important information, an html file for each profile, and a set of
images. Then I had a second program I'd run to look through them and
delete the ones I wasn't interested in.
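
The delete side of that second program might look something like the sketch below. It is Python for illustration, not the original code. It assumes each profile is stored as ID.html with its images named ID_1.jpg, ID_2.png and so on, that IDs contain no underscores, and that the database has a "profiles" table keyed by "id"; all of those names are made up.

```python
import glob
import os
import sqlite3


def delete_profile(conn, directory, profile_id):
    """Remove one profile's page, its images, and its database row.

    Returns the names of the files that were actually deleted.
    """
    page = os.path.join(directory, f"{profile_id}.html")
    # glob.escape keeps odd characters in the path or id from being
    # treated as wildcards; only the trailing * is a real pattern.
    pattern = os.path.join(glob.escape(directory),
                           glob.escape(profile_id) + "_*")
    removed = []
    for path in [page] + sorted(glob.glob(pattern)):
        if os.path.exists(path):
            os.remove(path)
            removed.append(os.path.basename(path))
    conn.execute("DELETE FROM profiles WHERE id = ?", (profile_id,))
    conn.commit()
    return removed
```

Because the images share the profile's ID as a prefix, nothing else has to keep track of which pictures belong to which profile.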
None of it was that complicated. The hardest part was parsing through
the HTML mess each site used on its profile pages -- figuring out what
tags and text I could count on staying the same in every profile so I
could parse out the right info.
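
One way to lean on text that stays the same in every profile is to ignore the tags entirely and key on fixed labels. The sketch below (Python, standard library only) assumes a made-up page layout where a label such as "Age:" is always followed by its value in the next piece of text; real sites will differ, and script or style content would need filtering out first.

```python
from html.parser import HTMLParser


class TextChunks(HTMLParser):
    """Collect the non-empty runs of text between tags, in page order."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.chunks.append(text)


def extract_fields(page, labels):
    # For each known label, take the text chunk right after it as the
    # value. Markup around the label or value does not matter.
    parser = TextChunks()
    parser.feed(page)
    parser.close()
    chunks = parser.chunks
    found = {}
    for i, chunk in enumerate(chunks[:-1]):
        if chunk in labels:
            found[chunk.rstrip(":")] = chunks[i + 1]
    return found


sample = (
    '<table><tr><td><b>Age:</b></td><td>31</td></tr>'
    '<tr><td><b>Location:</b></td>'
    '<td><font color="red">Iowa</font></td></tr></table>'
)
print(extract_fields(sample, {"Age:", "Location:"}))
# prints {'Age': '31', 'Location': 'Iowa'}
```

This survives a site changing its fonts or table nesting, but breaks as soon as a label is renamed, so it still has to be checked against a few profiles per site.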
--
Aaron --
http://360.yahoo.com/aaron_baugher