Re: Regex Generator From Multiple Files

 
 
MRAB
01-06-2009
James Pruitt wrote:
> I am looking for a way given a number of files, say 3, that represent
> technical support tickets in the same format to generate regular
> expressions for the different fields automatically.
>
> An example of one line from each file:
> Date: 12/30/2008 Room: 457 Building: Main
> Date: 12/31/2008 Room: A21 Building: Annex
> Date: 1/4/2009 Room: L69 Building: Library
>
> The program would then, possibly using the python diff library, generate
> the regular expression needed to parse out different fields. In this
> case it might return a tuple like
> ("^Date:[\w]+(.*)[\w]+Room","Room:[\w]+(.*)[\w]+Building","Building:[\w]+(.*)[\w]+$")
> that would match each of the fields based on the common data and sort of
> assume that what doesn't change between them is data we are looking for.
>
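
For reference, the diff-based idea quoted above could be sketched roughly as follows. This assumes tokens are separated by whitespace and that the field labels are exactly the tokens shared by every sample line; the names common_tokens and build_pattern are made up for illustration, and only difflib.SequenceMatcher and re come from the standard library.

```python
import difflib
import re

def common_tokens(lines):
    """Return the whitespace-separated tokens that every line shares, in order."""
    common = lines[0].split()
    for line in lines[1:]:
        tokens = line.split()
        matcher = difflib.SequenceMatcher(None, common, tokens, autojunk=False)
        # Keep only the tokens that fall inside a matching block.
        common = [tok
                  for block in matcher.get_matching_blocks()
                  for tok in common[block.a:block.a + block.size]]
    return common

def build_pattern(labels):
    """Build one regex that captures the text following each shared label."""
    parts = [re.escape(label) + r"\s*(.*?)" for label in labels]
    return re.compile(r"^" + r"\s*".join(parts) + r"\s*$")

samples = [
    "Date: 12/30/2008 Room: 457 Building: Main",
    "Date: 12/31/2008 Room: A21 Building: Annex",
    "Date: 1/4/2009 Room: L69 Building: Library",
]
labels = common_tokens(samples)
pattern = build_pattern(labels)
values = pattern.match(samples[2]).groups()
```

One caveat with this approach: if a data value happens to be identical in every sample (say, all tickets are for the same building), it would be mistaken for a label.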

Why not just assume that each field consists of a word terminated by a
colon, then some text, then the next field or the end of the line?
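
A minimal sketch of that assumption in Python, in case it helps. The names FIELD and parse_fields are hypothetical, and the pattern assumes that values never contain a word immediately followed by a colon (a time such as 10:30 would confuse it).

```python
import re

# A label is a word followed by a colon; its value runs lazily up to
# the next "word:" label or the end of the line.
FIELD = re.compile(r"(\w+):\s*(.*?)\s*(?=\w+:|$)")

def parse_fields(line):
    """Return a dict mapping each field label to its text."""
    return dict(FIELD.findall(line))

ticket = parse_fields("Date: 12/30/2008 Room: 457 Building: Main")
```

With this, no per-format regex needs to be generated; the same pattern should work for any line that follows the "Label: value" layout.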
 
Jeremy.Chen
01-06-2009
On Jan 6, 8:48 am, MRAB wrote:
> James Pruitt wrote:
> > I am looking for a way given a number of files, say 3, that represent
> > technical support tickets in the same format to generate regular
> > expressions for the different fields automatically.
> >
> > An example of one line from each file:
> > Date: 12/30/2008 Room: 457 Building: Main
> > Date: 12/31/2008 Room: A21 Building: Annex
> > Date: 1/4/2009 Room: L69 Building: Library
> >
> > The program would then, possibly using the python diff library, generate
> > the regular expression needed to parse out different fields. In this
> > case it might return a tuple like
> > ("^Date:[\w]+(.*)[\w]+Room","Room:[\w]+(.*)[\w]+Building","Building:[\w]+(.*)[\w]+$")
> > that would match each of the fields based on the common data and sort of
> > assume that what doesn't change between them is data we are looking for.
> >
>
> Why not just assume that each field consists of a word terminated by a
> colon, then some text, then the next field or the end of the line?

Do you mean the sub method?
-------------
re.sub(r'(?i)(example)', self.captureRegxp, content)
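
For context, re.sub accepts a function as its second argument and calls it once per match, using the return value as the replacement. Here is a small self-contained sketch; the function shout is a hypothetical stand-in for self.captureRegxp, whose actual behaviour is not shown in this thread.

```python
import re

def shout(match):
    """Replacement callback: receives the match object, returns new text."""
    return match.group(1).upper()

content = "An example line, another Example."
# (?i) makes the pattern case-insensitive, as in the call above.
result = re.sub(r"(?i)(example)", shout, content)
```

Since sub is geared toward replacing text, re.findall or re.finditer may be a more natural fit if the goal is only to pull field values out of each line.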
 