Simple regex question.

 
 
Peter Bailey
      06-26-2009
Hello.
I need to parse through thousands of TIFF files and do some re-naming.
These files have underscores in them followed by a sequential number. I
need to grab just the "root" of the filename, without the underscore or
the numbers.

Dir.chdir("L:/infocontiffs/ehs-g7917741")
files = Dir.glob("*.tiff")
file = files[0]
puts file
file = file.gsub(/^(.*)_[0-9]+\.tiff/, "#{$1}")
puts file

What I get with this is:

ehs-g7917741_01.tiff

Why doesn't it give me my root filename?

Thanks,
Peter
--
Posted via http://www.ruby-forum.com/.

 
Tim Hunter
      06-26-2009
Peter Bailey wrote:
> Hello.
> I need to parse through thousands of TIFF files and do some re-naming.
> These files have underscores in them followed by a sequential number. I
> need to grab just the "root" of the filename, without the underscore or
> the numbers.
> Dir.chdir("L:/infocontiffs/ehs-g7917741")
> files = Dir.glob("*.tiff")
> file = files[0]
> puts file
> file = file.gsub(/^(.*)_[0-9]+\.tiff/, "#{$1}")
> puts file
> What I get with this is:
> ehs-g7917741_01.tiff
> Why doesn't it give me my root filename?
> Thanks,
> Peter


Is this what you want?

while fname = DATA.gets
  m = fname.match(/(.*?)_\d+\.tiff/)
  if m
    puts "Match: '#{m[1]}'"
  else
    puts "No match: #{fname}"
  end
end

__END__
ehs-g7917741_01.tiff
asadsasd_12345.tiff
ljhkjhkh_1_2_3.tiff
xxxx__1.tiff
xxxx_.tiff
xxxx.tiff
xxxx
_.tiff
_01.tiff
 
Brian Candler
      06-26-2009
Peter Bailey wrote:
> Hello.
> I need to parse through thousands of TIFF files and do some re-naming.
> These files have underscores in them followed by a sequential number. I
> need to grab just the "root" of the filename, without the underscore or
> the numbers.
> Dir.chdir("L:/infocontiffs/ehs-g7917741")
> files = Dir.glob("*.tiff")
> file = files[0]
> puts file
> file = file.gsub(/^(.*)_[0-9]+\.tiff/, "#{$1}")


The argument "#{$1}" is expanded once, before gsub even executes. You
probably want the block form:

file = file.sub(/^(.*)_\d+\.tiff/) { $1 }
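
To make the timing concrete, here is a small sketch (not from the original post; the variable names are only illustrative, and it assumes no earlier match has left anything in $1). The double-quoted "#{$1}" is interpolated before gsub is called, so the replacement is an empty string. A single-quoted '\1' backreference, or the block form, is resolved for each match instead.

```ruby
name = "ehs-g7917741_01.tiff"
pattern = /^(.*)_[0-9]+\.tiff/

# A failed match resets $~ to nil, so $1 is nil from here on.
"reset" =~ /x/

# Interpolated before gsub runs: "#{$1}" is already "" at this point.
eager = name.gsub(pattern, "#{$1}")

# Backreference in a single-quoted string: filled in by sub for each match.
backref = name.sub(pattern, '\1')

# Block form: $1 is read only after the match has happened.
block = name.sub(pattern) { $1 }

p eager    # => ""
p backref  # => "ehs-g7917741"
p block    # => "ehs-g7917741"
```

This would also explain the original output: the first puts prints the untouched filename, and the second prints an empty line.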
 
Peter Bailey
      06-26-2009
Tim Hunter wrote:
> Peter Bailey wrote:
>> Hello.
>> I need to parse through thousands of TIFF files and do some re-naming.
>> These files have underscores in them followed by a sequential number. I
>> need to grab just the "root" of the filename, without the underscore or
>> the numbers.
>> Dir.chdir("L:/infocontiffs/ehs-g7917741")
>> files = Dir.glob("*.tiff")
>> file = files[0]
>> puts file
>> file = file.gsub(/^(.*)_[0-9]+\.tiff/, "#{$1}")
>> puts file
>> What I get with this is:
>> ehs-g7917741_01.tiff
>> Why doesn't it give me my root filename?
>> Thanks,
>> Peter

>
> Is this what you want?
>
> while fname = DATA.gets
> m = fname.match /(.*?)_\d+\.tiff/
> if m
> puts "Match: '#{m[1]}'"
> else
> puts "No match: #{fname}"
> end
> end
>
> __END__
> ehs-g7917741_01.tiff
> asadsasd_12345.tiff
> ljhkjhkh_1_2_3.tiff
> xxxx__1.tiff
> xxxx_.tiff
> xxxx.tiff
> xxxx
> _.tiff
> _01.tiff


Well, you gave me a good idea: using match. Here's what I did, and it
worked. Thank you very much, Tim.

Dir.chdir("L:/infocontiffs/ehs-g7917741")
files = Dir.glob("*.tiff")
file = files[0]
puts file
file = file.match(/^(.*)_[0-9]+\.tiff/)
puts $1

gives me:

ehs-g7917741_01.tiff
ehs-g7917741

Program exited with code 0
 
David A. Black
      06-26-2009
Hi --

On Fri, 26 Jun 2009, Peter Bailey wrote:

> Hello.
> I need to parse through thousands of TIFF files and do some re-naming.
> These files have underscores in them followed by a sequential number. I
> need to grab just the "root" of the filename, without the underscore or
> the numbers.
> Dir.chdir("L:/infocontiffs/ehs-g7917741")
> files = Dir.glob("*.tiff")
> file = files[0]
> puts file
> file = file.gsub(/^(.*)_[0-9]+\.tiff/, "#{$1}")
> puts file
> What I get with this is:
> ehs-g7917741_01.tiff
> Why doesn't it give me my root filename?


Here's another good use of the string[//] technique:

>> file = "ehs-g7917741_01.tiff"

=> "ehs-g7917741_01.tiff"
>> file[/[^_]+/] # match non-underscore characters

=> "ehs-g7917741"


David
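
One caveat, sketched here under assumptions: [^_]+ stops at the first underscore, so it relies on the root itself containing no underscores. The second filename below is a made-up example, not one of the poster's files. The two-argument form of String#[] with a capture group handles that case and returns nil for names that do not fit.

```ruby
simple = "ehs-g7917741_01.tiff"
tricky = "my_scan_07.tiff"   # hypothetical name with an underscore in the root

# Stops at the first underscore.
p simple[/[^_]+/]   # => "ehs-g7917741"
p tricky[/[^_]+/]   # => "my"

# Greedy capture up to the last "_<digits>.tiff"; the second argument
# selects capture group 1. Returns nil when the name does not match.
suffix = /\A(.*)_\d+\.tiff\z/
p simple[suffix, 1]       # => "ehs-g7917741"
p tricky[suffix, 1]       # => "my_scan"
p "xxxx.tiff"[suffix, 1]  # => nil
```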

--
David A. Black / Ruby Power and Light, LLC
Ruby/Rails consulting & training: http://www.rubypal.com
Now available: The Well-Grounded Rubyist (http://manning.com/black2)
"Ruby 1.9: What You Need To Know" Envycasts with David A. Black
http://www.envycasts.com

 
Peter Bailey
      06-26-2009

Beautiful. Thanks.
 
Robert Klemme
      06-26-2009
2009/6/26 David A. Black:
> On Fri, 26 Jun 2009, Peter Bailey wrote:

> Here's another good use of the string[//] technique:
>
>>> file = "ehs-g7917741_01.tiff"
>
> => "ehs-g7917741_01.tiff"
>>> file[/[^_]+/]      # match non-underscore characters
>
> => "ehs-g7917741"


Combining all the good suggestions, this is probably what I'd do:

files = Dir.glob("L:/infocontiffs/ehs-g7917741/*.tiff")
files.each do |f|
  base = File.basename f
  root = base[/^([^_]+)_\d+\.tiff$/, 1]

  if root
    # rename or whatever
  else
    $stderr.puts "Dunno what to do with #{f}"
  end
end

The reason I left in the matching of underscores and digits is to make
sure that the complete name matches the required pattern, so that other
files that might accidentally have been placed in that directory are
detected.
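
A sketch of that check, run against an in-memory list instead of the real directory so nothing gets renamed. The sample names are taken from Tim Hunter's test data earlier in the thread; partition_names is a hypothetical helper, not part of any library.

```ruby
# Split names into those that fit "<root>_<digits>.tiff" and those that don't.
def partition_names(names)
  pattern = /^([^_]+)_\d+\.tiff$/
  roots = {}
  strays = []
  names.each do |name|
    root = name[pattern, 1]   # nil when the whole name does not match
    if root
      roots[name] = root
    else
      strays << name
    end
  end
  [roots, strays]
end

names = ["ehs-g7917741_01.tiff", "asadsasd_12345.tiff",
         "ljhkjhkh_1_2_3.tiff", "xxxx_.tiff", "xxxx.tiff", "_01.tiff"]

roots, strays = partition_names(names)
p roots.values  # => ["ehs-g7917741", "asadsasd"]
p strays        # => ["ljhkjhkh_1_2_3.tiff", "xxxx_.tiff", "xxxx.tiff", "_01.tiff"]
```

Because the test is on the captured root rather than on the basename, every name that fails the pattern ends up in the stray list.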

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

 