Velocity Reviews - Computer Hardware Reviews


Building several parsing modules

 
 
Robert Neville
 
      03-18-2007
Basically, I want to create a table in HTML, XML, or XSLT holding any
number of regular expressions, plus a script (Perl or Python) that
reads each table row (regex and replacement) and performs the
replacement on any file name, folder name, or text file (e.g. css,
php, html). For example, I often rename my mp3 files and the folder
holding them, and then replace the renamed values in a
playlist/m3u/xml file.

The table should hold clean regular expressions with minimal escaping.
The regular expressions would span multiple lines and use complex
constructs (e.g. symbolic grouping, back-references, negative
lookahead). The table would serve as a preset file for any replacement
task. It would also contain a short description column for each
regular expression. The table could contain 1 to 1,000 regular
expressions, and the input file could have 1,000 to 10,000 lines. sed
would become messy here.
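
As a rough illustration of the table-driven idea described above, here is a minimal Python sketch. The XML layout (the `rules`, `rule`, `pattern`, and `replacement` elements and the `description` attribute) and the function names are assumptions made up for this example, not an established format. CDATA sections are used so the regexes need no XML escaping.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule table. All element and attribute names are assumptions.
RULES_XML = r"""<rules>
  <rule description="Collapse runs of spaces and tabs">
    <pattern><![CDATA[[ \t]+]]></pattern>
    <replacement><![CDATA[ ]]></replacement>
  </rule>
  <rule description="Move the track number behind the title">
    <pattern><![CDATA[^(\d{2}) - (.+)\.mp3$]]></pattern>
    <replacement><![CDATA[\g<2> (track \g<1>).mp3]]></replacement>
  </rule>
</rules>"""


def load_rules(xml_text):
    """Return a list of (compiled regex, replacement, description) tuples."""
    root = ET.fromstring(xml_text)
    rules = []
    for rule in root.iter("rule"):
        pattern = rule.findtext("pattern", default="")
        replacement = rule.findtext("replacement", default="")
        # re.MULTILINE lets ^ and $ match at every line of a multi-line input.
        compiled = re.compile(pattern, re.MULTILINE)
        rules.append((compiled, replacement, rule.get("description", "")))
    return rules


def apply_rules(rules, text):
    """Apply every rule to text, in table order."""
    for regex, replacement, _description in rules:
        text = regex.sub(replacement, text)
    return text


rules = load_rules(RULES_XML)
print(apply_rules(rules, "01  -  Intro.mp3"))  # prints: Intro (track 01).mp3
```

The rules run in table order, so a later rule sees the output of the earlier ones; whether that ordering is wanted depends on the task.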

I am just starting out with the logic and pseudo-code, and with the
language itself, so I am hoping for examples where the relevant
libraries have been applied. Links and guides would help too. I need
suggestions and examples on reading input by line, managing large data
sets, iterating through an xml/html structure, and various parsing
techniques.
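
For the "reading input by line" and "large data sets" parts, one possible starting point in Python is to stream the file instead of loading it whole. The sketch below is only an assumption about how that could look: `rewrite_lines` and the sample rule are hypothetical names, and `io.StringIO` stands in for real files opened with `open()`. Note that a line-by-line loop cannot apply regexes that span several lines; those would need the whole text in memory.

```python
import io
import re


def rewrite_lines(src, dst, rules):
    """Copy src to dst one line at a time, applying (regex, replacement) pairs.

    Returns the number of lines that were changed.
    """
    changed = 0
    for line in src:  # file objects yield one line per iteration
        new_line = line
        for regex, replacement in rules:
            new_line = regex.sub(replacement, new_line)
        if new_line != line:
            changed += 1
        dst.write(new_line)
    return changed


# Hypothetical rule: a renamed album folder inside an m3u playlist.
RULES = [(re.compile(r"\bOld Album\b"), "New Album")]

playlist = io.StringIO("Old Album/01.mp3\nOther/02.mp3\n")
out = io.StringIO()
count = rewrite_lines(playlist, out, RULES)
print(count)  # prints: 1
```

With real files, `src` and `dst` would be two handles from `open()`, so memory use stays flat regardless of how many lines the input has.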

I built a solution in VBScript and VBA, but it had several
limitations: it ran on only one platform and did not have full Perl
regular expression support. In addition, it is tied to an Access
database. The solution would parse the data and add headers to it,
then parse the data with the headers and insert it into a table. It
had over fifteen modules for repetitive parsing tasks to build an
importable data set. VBScript regexes are not as powerful as those in
Perl or even sed.

This request is large, yet someone with command of the language could
give guidance on the basic framework to kickstart my efforts.
Basically, I need someone to say: start here, then proceed to this
function, then look into these libraries, and so on.
 
 
Diez B. Roggisch
 
      03-19-2007
Robert Neville wrote:

> Basically, I want to create a table in html, xml, or xslt; with any
> number of regular expressions; a script (Perl or Python) which reads
> each table row (regex and replacement); and performs the replacement
> on any file name, folder, or text file (e.g. css, php, html). For
> example, I often rename my mp3 (files); the folder holding the mp3
> files; and replace these renamed values in a playlist/m3u/xml file.


<snip/>

Don't do it. Just write Python for the task at hand - if it involves regular
expressions, use the re module if you must, but lots of stuff can be done
with simpler, less confusing means like str.split and the like.

The result should be a small few-liner. You are way better off with that,
especially when you have to take constraints into account like moon phase
or the like - you then have the full power of Python at hand, instead
of inventing some wicked table-based "language" that you code exceptions
into.
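
To give an idea of what such a few-liner might look like for the mp3 case, here is a sketch using only plain string methods. The naming scheme ("NN - Title.mp3" becoming "Title (NN).mp3") and the function names are assumptions picked for illustration; the actual renaming with os.rename is left out.

```python
import os


def new_name(filename):
    """Turn '01 - Intro.mp3' into 'Intro (01).mp3' (hypothetical scheme)."""
    stem, ext = os.path.splitext(filename)
    track, sep, rest = stem.partition(" - ")
    if not sep or not track.isdigit():
        return filename  # leave names that do not fit the scheme alone
    return f"{rest} ({track}){ext}"


def update_playlist(lines):
    """Apply the same renaming to the file-name part of each playlist line."""
    updated = []
    for line in lines:
        folder, slash, name = line.rstrip("\n").rpartition("/")
        updated.append(folder + slash + new_name(name))
    return updated


print(new_name("01 - Intro.mp3"))  # prints: Intro (01).mp3
print(update_playlist(["Album/02 - Outro.mp3\n"]))  # prints: ['Album/Outro (02).mp3']
```

Special cases then become ordinary `if` statements instead of extra columns in a rule table.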

Diez
 