Complicated email parse or text extraction and database insertion

 
 
code_worthy@bellsouth.net
 
      08-15-2005
I am trying to strip some data out of numerous emails and place it in
my database. I know that this seems as if it has been done before.
But, this is a little different. First, the numerous emails all have a
set of data that needs to be extracted and inserted into the database.
Some of the data in the email is id, name, address, city, state, zip,
company, etc. The catch is that the data is formatted and presented
differently in each email. Take into consideration the following email
examples:
- excerpt from email #1
ID:.............. 12345
Name:............ JOHN DOE
Address:......... PO BOX 9999
City:............ Somecity
State:........... CA
Zip Code:........ 90210
===============================================================

Company Information:

1.:-
Company Name:....... Perl N PHP Scripts Welcome

- excerpt from email #2
Full Name -- Doe, John
Address -- PO BOX 9999
City -- Somecity St -- California
Zip -- 90210
Company Name -- Perl N PHP Scripts Welcome
ID -- 12345

- excerpt from email #3
Name.....Address.....City.....State.....Zip.....Identification
Number.....Company
John Doe.....PO Box
9999.....Somecity.....CA.....90210.....12345.....Perl N PHP Scripts
Welcome

- excerpt from email #4

Name.........Address.........City.........State.....Zip.......Identification
Number.....Company
JOHN DOE.....PO BOX
9999.....SOMECITY.....CA........90210.....12345.....................Perl
N PHP Scripts Welcome

Can anyone help me with either scripts that have already been
developed or suggestions on how to go about stripping out the needed
information from emails without knowing their format or order of the
data? THANKS IN ADVANCE.

 
 
 
 
 
Gunnar Hjalmarsson
 
      08-15-2005
(E-Mail Removed) wrote:
> Can anyone help me with either scripts that have already been
> developed or suggestions on how to go about stripping out the needed
> information from emails without knowing their format or order of the
> data?


No.

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
 
 
 
 
 
Matt Garrish
 
      08-16-2005

<(E-Mail Removed)> wrote:
> I am trying to strip some data out of numerous emails and place it in
> my database. I know that this seems as if it has been done before.
> But, this is a little different. First, the numerous emails all have a
> set of data that needs to be extracted and inserted into the database.
> Some of the data in the email is id, name, address, city, state, zip,
> company, etc. The catch is that the data is formatted and presented
> differently in each email.
>


You're asking to find patterns where there are none (or you haven't looked
hard enough yet to distinguish them). The two options that spring to mind
are: 1) write a script that can process the most common formats and use
it to batch process as many emails as you can; and/or 2) clean up the data
manually first (e.g., convert it to XML).
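
As a rough illustration of option 1, here is a minimal sketch that handles
only the two "label then value" layouts from the original post (excerpts #1
and #2) and sets everything else aside for manual cleanup. It is written in
Python for brevity; the regular expression should carry over to Perl more or
less unchanged. The names LABEL_ALIASES, REQUIRED, parse_labeled and triage
are hypothetical, and the alias table is an assumption based solely on the
labels visible in those two excerpts.

```python
import re

# Assumed mapping from labels seen in excerpts #1 and #2 to canonical names.
LABEL_ALIASES = {
    "id": "id",
    "name": "name",
    "full name": "name",
    "address": "address",
    "city": "city",
    "state": "state",
    "st": "state",
    "zip": "zip",
    "zip code": "zip",
    "company name": "company",
}

# Fields a record must contain before it is considered parsed.
REQUIRED = {"id", "name", "address", "city", "state", "zip", "company"}

# Longest labels first so "zip code" wins over "zip", "full name" over "name".
_labels = sorted(LABEL_ALIASES, key=len, reverse=True)

# A known label followed by ":" plus optional dot leaders (excerpt #1)
# or by "--" (excerpt #2).
FIELD_RE = re.compile(
    r"\b(?P<label>" + "|".join(re.escape(l) for l in _labels) + r")"
    r"\s*(?::\.*|--)\s*",
    re.IGNORECASE,
)


def parse_labeled(text):
    """Return a dict of the canonical fields found in a labeled email body."""
    record = {}
    for line in text.splitlines():
        matches = list(FIELD_RE.finditer(line))
        for i, m in enumerate(matches):
            # A value runs until the next label on the same line, if any.
            end = matches[i + 1].start() if i + 1 < len(matches) else len(line)
            value = line[m.end():end].strip()
            if value:
                key = LABEL_ALIASES[m.group("label").lower()]
                record.setdefault(key, value)
    return record


def triage(emails):
    """Split email bodies into parsed records and bodies needing manual review."""
    parsed, needs_review = [], []
    for body in emails:
        record = parse_labeled(body)
        if REQUIRED.issubset(record):
            parsed.append(record)
        else:
            needs_review.append(body)
    return parsed, needs_review
```

Values come back exactly as they appear in the email, so "JOHN DOE" versus
"Doe, John" and "CA" versus "California" would still need normalizing before
insertion. The dot-separated tabular layouts (excerpts #3 and #4) contain no
label followed by ":" or "--", so they fall into the manual-review list, which
is where option 2 comes in.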

Matt


 