Searching for text

 
 
robinsiebler
      08-29-2006
I have a batch of files that I am trying to search for specific text in
a specific format. Each file contains several items I want to search
for.

Here is a snippet from the file:
....
/FontName /ACaslonPro-Semibold def
/FontInfo 7 dict dup begin
/Notice (Copyright 2000 Adobe Systems Incorporated. All Rights
Reserved.Adobe Caslon is either a registered trademark or a trademark
of Adobe Systems Incorporated in the United States and/or other
countries.) def
/Weight (Semibold) def
/ItalicAngle 0 def
/FSType 8 def
....

I want to search the file until I find '/FontName /ACaslonPro-Semibold'
and then jump forward 7 lines where I expect to find '/FSType 8'. I
then want to continue searching from *that* point forward for the next
FontName/FSType pair. Unfortunately, I haven't been able to figure out
how to do this in Python, although I could do it fairly easily in a
batch file. Would someone care to enlighten me?

 
Tim Chase
      08-29-2006
> I want to search the file until I find '/FontName /ACaslonPro-Semibold'
> and then jump forward 7 lines where I expect to find '/FSType 8'. I
> then want to continue searching from *that* point forward for the next
> FontName/FSType pair. Unfortunately, I haven't been able to figure out
> how to do this in Python, although I could do it fairly easily in a
> batch file. Would someone care to enlighten me?


found_fontname = False
font_search = '/FontName /ACaslonPro-Semibold'
type_search = '/FSType 8'
for line in open('foo.txt'):
    if font_search in line:
        found_fontname = True
    if found_fontname and type_search in line:
        print('doing something with %s' % line)
        # reset to look for font_search
        found_fontname = False


or, you could

sed -n '/\/FontName \/ACaslonPro-Semibold/,/\/FSType 8/{/\/FSType 8/p}'

You omit what you want to do with the results when you find
them...or what should happen when they both appear on the same
line (though you hint that they're a couple lines apart, you
don't define this as a "this is always the case" sort of scenario)
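
One way to make the same-line question explicit is to wrap the flag-based loop above in a function that returns the line indexes of each pair it finds. This is only a sketch: the name find_pairs is hypothetical, and it assumes an FSType on the same line as the FontName should not count as a match.

```python
def find_pairs(lines,
               font_search='/FontName /ACaslonPro-Semibold',
               type_search='/FSType 8'):
    """Return (font_line, type_line) index pairs found by a flag-based scan."""
    pairs = []
    font_at = None  # index of the most recent unmatched FontName line
    for i, line in enumerate(lines):
        if font_at is None and font_search in line:
            font_at = i
            continue  # assumption: an FSType on this same line is ignored
        if font_at is not None and type_search in line:
            pairs.append((font_at, i))
            font_at = None  # reset to look for the next FontName
    return pairs


sample = [
    '/FontName /ACaslonPro-Semibold def',
    '/Weight (Semibold) def',
    '/FSType 8 def',
]
print(find_pairs(sample))
```

Returning the indexes instead of printing leaves the decision about what to do with each pair to the caller.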

-tkc


 
robinsiebler
      08-29-2006

> You omit what you want to do with the results when you find
> them...or what should happen when they both appear on the same
> line (though you hint that they're a couple lines apart, you
> don't define this as a "this is always the case" sort of scenario)


I don't do anything, per se. I just need to verify that I find the
FontName/FSType pair. And they *always* have to be in the same
location in relation to each other, i.e. they should never appear on
the same line or any closer/farther from each other.
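
A requirement like this could be checked by reading the lines into a list and indexing a fixed distance ahead of each FontName. The sketch below is an illustration only: the function name and the offset parameter are hypothetical, the right offset for the real files is left to the caller, and it assumes the file is small enough to hold in memory.

```python
def pairs_at_fixed_offset(lines, offset,
                          font_search='/FontName /ACaslonPro-Semibold',
                          type_search='/FSType 8'):
    """True if every FontName has an FSType exactly `offset` lines later."""
    found_any = False
    for i, line in enumerate(lines):
        if font_search not in line:
            continue
        found_any = True
        target = i + offset
        if target >= len(lines):
            return False  # file ends before the expected FSType line
        if any(type_search in l for l in lines[i:target]):
            return False  # FSType is on the same line or too close
        if type_search not in lines[target]:
            return False  # FSType is missing or farther away
    return found_any  # False when no FontName was seen at all


sample = [
    '/FontName /ACaslonPro-Semibold def',
    '/Weight (Semibold) def',
    '/FSType 8 def',
]
print(pairs_at_fixed_offset(sample, 2))
```

For a real file, something like pairs_at_fixed_offset(open(path).read().splitlines(), offset) would supply the list of lines.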

 
robinsiebler
      08-29-2006
The other thing I failed to mention is that I need to ensure that I
find the fsType *before* I find the next FontName.

 
Tim Chase
      08-29-2006
> The other thing I failed to mention is that I need to ensure that I
> find the fsType *before* I find the next FontName.


found_fontname = False
font_search = '/FontName /ACaslonPro-Semibold'
type_search = '/FSType 8'
for line in open('foo.txt'):
    if font_search in line:
        if found_fontname:
            print("Uh, oh!")
        else:
            found_fontname = True
    if found_fontname and type_search in line:
        print('doing something with %s' % line)
        # reset to look for font_search
        found_fontname = False

and look for it to report "Uh, oh!" where it has found another
"/FontName /ACaslonPro-Semibold".

You can reduce your font_search to just '/FontName' if that's all
you care about, or if you just want any '/FontName' inside an
'/ACaslonPro-Semibold' block, you can tweak it to be something like

for line in open('foo.txt'):
    if found_fontname and '/FontName' in line:
        print("Uh, oh!")
    if font_search in line:
        found_fontname = True

-tkc



 
Simon Forman
      08-29-2006
robinsiebler wrote:
> The other thing I failed to mention is that I need to ensure that I
> find the fsType *before* I find the next FontName.


Given these requirements, I'd formulate the script something like this:


f = open(filename)

NUM_LINES_BETWEEN = 7

Fo = '/FontName /ACaslonPro-Semibold'
FS = '/FSType 8'


def checkfile(f):
    # Get a (index, line) generator on the file.
    G = enumerate(f)

    for i, line in G:

        # make sure we don't find a FSType
        if FS in line:
            print('Found FSType without FontName %i' % i)
            return False

        # Look for FontName.
        if Fo in line:
            print('Found FontName at line %i' % i)

            try:

                # Check the next 7 lines for NO FSType
                # and NO FontName
                n = NUM_LINES_BETWEEN
                while n:
                    i, line = next(G)

                    if FS in line:
                        print('Found FSType prematurely at %i' % i)
                        return False

                    if Fo in line:
                        print("Found '%s' before '%s' at %i" %
                              (Fo, FS, i))
                        return False
                    n -= 1

                # Make sure there's a FSType.
                i, line = next(G)

                if FS in line:
                    print('Found FSType at %i' % i)

                elif Fo in line:
                    print("Found '%s' instead of '%s' at %i" %
                          (Fo, FS, i))
                    return False

                else:
                    print('FSType not found at %i' % i)
                    return False

            except StopIteration:
                print('File ended before FSType found.')
                return False

    return True


if checkfile(f):
    # File passes...
    pass


Be sure to close your file object when you're done with it. And you
might want fewer or different print statements.
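
On closing the file: one option is a with block, which closes the file object automatically when the block ends. The sketch below is only an illustration. The helper name file_has_pair is hypothetical, its check is a deliberately simple "FontName followed later by FSType" test rather than the stricter line-count check above, and the temporary file exists only to keep the example self-contained.

```python
import os
import tempfile


def file_has_pair(path,
                  font_search='/FontName /ACaslonPro-Semibold',
                  type_search='/FSType 8'):
    """Return True if a FontName line is later followed by an FSType line."""
    found_fontname = False
    # The with block closes the file even if we return early or an error occurs.
    with open(path) as f:
        for line in f:
            if font_search in line:
                found_fontname = True
            elif found_fontname and type_search in line:
                return True
    return False


# Demo on a throwaway file; real use would loop over the batch of font files.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('/FontName /ACaslonPro-Semibold def\n/FSType 8 def\n')
try:
    print(file_has_pair(tmp.name))
finally:
    os.remove(tmp.name)
```

For a batch, the same function could be called once per path, for example from a loop over glob.glob() results.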

HTH

Peace,
~Simon

 