Regex for special chars..

 
 
NurAzije
      04-18-2006
Hi,
I need a regular expression which will take all chars from a string
which can be used for file naming on Linux, something which will
filter out of the string any char which is not allowed in a regular
file name.
I think the allowed ones are a-zA-Z0-9. I need a regex that will filter
out everything that is not in this set.
Thank you

regards,
Nur

 
Jürgen Exner
      04-18-2006
NurAzije wrote:
> I need a regular expression which will take all chars from a string
> which can be used for file naming on Linux, something which will
> filter out of the string any char which is not allowed in a
> regular file name.
> I think the allowed ones are a-zA-Z0-9


There are definitely many more. As far as I remember, any character can
be used, even line feed and CR. The exception is the forward slash,
which is reserved as the directory separator.
But why don't you ask in a NG that actually deals with Linux? BTW: _WHICH_
Linux file system? AFAIR there are about half a dozen.

> I need a regex that will filter
> out everything that is not in this set.


REs don't filter, they match.


jue


 
Anno Siegel
      04-18-2006
NurAzije <(E-Mail Removed)> wrote in comp.lang.perl.misc:
> Hi,
> I need a regular expression which will take all chars from a string
> which can be used for file naming on Linux, something which will
> filter out of the string any char which is not allowed in a regular
> file name.
> I think the allowed ones are a-zA-Z0-9. I need a regex that will filter
> out everything that is not in this set.


You can use a lot more characters in file names.

To check whether a string contains characters from a given set, use tr///:

if ( $str !~ tr/a-zA-Z0-9//c ) { # string is okay
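
As a fuller sketch of that check: with the /c modifier and an empty replacement list, tr/// only counts the characters outside the search list and leaves the string untouched. The helper name is_plain_name is made up for illustration, and a-zA-Z0-9 is simply the set proposed in the question, not a Linux rule.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $str contains only a-z, A-Z and 0-9.
sub is_plain_name {
    my ($str) = @_;
    # /c complements the search list, so tr counts every character
    # that is NOT in a-zA-Z0-9; the string itself is left unchanged.
    my $bad = ( $str =~ tr/a-zA-Z0-9//c );
    return $bad == 0;
}

for my $name ( 'report2006', 'my file.txt' ) {
    printf "%-12s %s\n", $name,
        is_plain_name($name) ? 'okay' : 'has other chars';
}
```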

Anno
--
If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers.
 
Brian McCauley
      04-18-2006

Jürgen Exner wrote:
> NurAzije wrote:
> > I need a regular expression which will take all chars from a string
> > which can be used for file naming on Linux, something which will
> > filter out of the string any char which is not allowed in a
> > regular file name.
> > I think the allowed ones are a-zA-Z0-9
>
> There are definitely many more. As far as I remember, any character can
> be used, even line feed and CR. The exception is the forward slash,
> which is reserved as the directory separator.


You also can't use NUL (i.e. character 0) because the POSIX API uses
NUL-terminated strings.

> But why don't you ask in a NG that actually deals with Linux?


Hmmm... do you think he'd get a very positive reception?

 
Tad McClellan
      04-18-2006
NurAzije <(E-Mail Removed)> wrote:

> I need a regular expression which will take all chars from a string
> which can be used for file naming on Linux,



warn "'$fname' has illegal chars\n" if $fname =~ m#/|\000#; # untested


There are only 2 ASCII characters that are not allowed in
filenames on the *nix filesystems that I've seen.
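
A minimal sketch of the same test as a reusable sub, assuming the name is a single path component rather than a full path. The sub name is_legal_component is hypothetical, and rejecting the empty string is an added assumption that the post above does not mention.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $fname passes the two-character rule,
# i.e. it contains neither "/" nor NUL.
sub is_legal_component {
    my ($fname) = @_;
    return 0 if $fname eq '';    # assumption: an empty name is unusable
    return $fname =~ m{[/\0]} ? 0 : 1;
}

for my $fname ( 'notes *?.txt', 'dir/file' ) {
    warn "'$fname' has illegal chars\n" unless is_legal_component($fname);
}
```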


--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
Ilya Zakharevich
      04-19-2006
[A complimentary Cc of this posting was sent to
Tad McClellan
<(E-Mail Removed)>], who wrote in article <(E-Mail Removed)>:

> warn "'$fname' has illegal chars\n" if $fname =~ m#/|\000#; # untested


> There are only 2 ASCII characters that are not allowed in
> filenames on the *nix filesystems that I've seen.


The convenience of ASCII is that there are so many standards to
choose from... So if you consider the OS X filesystem to be *nix, things
quickly go down the drain (UTF-8 encoding is *enforced* at the file
system level).
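
If names have to survive on a file system that insists on UTF-8, one possible pre-check is to see whether the bytes decode cleanly. This is only a sketch using the core Encode module; the helper name is_valid_utf8 is hypothetical, and which file systems actually reject malformed names is not something this thread establishes.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: true if the byte string is well-formed UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # Work on a copy: decode() may modify its argument when a CHECK
    # value such as FB_CROAK is given.
    my $copy = $bytes;
    return eval { decode( 'UTF-8', $copy, FB_CROAK ); 1 } ? 1 : 0;
}

# "\xC3\xA9" is the UTF-8 encoding of e-acute; a lone "\xE9" is Latin-1.
for my $name ( "caf\xC3\xA9", "caf\xE9" ) {
    print is_valid_utf8($name) ? "valid UTF-8\n" : "not valid UTF-8\n";
}
```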

Hope this helps,
Ilya
 
NurAzije
      04-19-2006
I have a script which takes a string from the DB, compares it against
this regex, replaces every char which is not from the allowed ascii
with "_", and then names a file with the new string. For example:
"asjiuel,dpdsš3898d*?jn" becomes "asjiuel_dpds_3898d__jn"
I need the right regex that will match everything not allowed.
Thank you.

 
John W. Krahn
      04-19-2006
NurAzije wrote:
> I have a script which takes a string from the DB, compares it against
> this regex, replaces every char which is not from the allowed ascii
> with "_", and then names a file with the new string. For example:
> "asjiuel,dpdsš3898d*?jn" becomes "asjiuel_dpds_3898d__jn"
> I need the right regex that will match everything not allowed.


What do you mean by "allowed ascii"? ',', '*' and '?' are ASCII. And why do
you think that you need to use a regular expression?

my $string = q[asjiuel,dpdsš3898d*?jn];

$string =~ tr/a-zA-Z0-9/_/c;

print "$string\n";
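
One thing worth checking with this approach is whether the string from the DB has been decoded into characters. The sketch below, which assumes UTF-8 data and uses a made-up helper name sanitize, shows that tr/// with /c replaces everything outside a-zA-Z0-9, and that an undecoded "š" turns into two underscores because it is two bytes.

```perl
use strict;
use warnings;

# Hypothetical helper: replace every character outside a-zA-Z0-9 with "_".
sub sanitize {
    my ($string) = @_;
    # /c complements the search list, so everything NOT in a-zA-Z0-9 is
    # translated; the one-character replacement list maps it all to "_".
    ( my $clean = $string ) =~ tr/a-zA-Z0-9/_/c;
    return $clean;
}

# \x{161} is the character "š"; writing it as an escape keeps this
# example independent of the source file's encoding.
my $decoded = "asjiuel,dpds\x{161}3898d*?jn";
print sanitize($decoded), "\n";    # asjiuel_dpds_3898d__jn

# The same text as raw UTF-8 bytes: "š" is two bytes, so it becomes "__".
my $bytes = "asjiuel,dpds\xC5\xA13898d*?jn";
print sanitize($bytes), "\n";      # asjiuel_dpds__3898d__jn
```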



John
--
use Perl;
program
fulfillment
 
NurAzije
      04-19-2006
Hi,
this $string =~ tr/a-zA-Z0-9/_/c; will do the opposite of what I need. I
need something that replaces everything not in a-zA-Z0-9 with _.
By "allowed" I meant the characters I can use to name a file.

 
NurAzije
      04-19-2006
Thank you guys, I have found it:
[^a-zA-Z0-9]
Thank you anyway.
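
For reference, a sketch of using such a negated character class in a substitution. The sample string is invented. Note that inside square brackets "|" is an ordinary character rather than alternation, so the ranges are written back to back; the second substitution shows what happens when pipes are left in the class.

```perl
use strict;
use warnings;

my $string = 'report: Q1|Q2 (final).txt';

# Negated class: any single character that is not a letter or digit.
# The /g flag replaces every occurrence, not just the first.
( my $clean = $string ) =~ s/[^a-zA-Z0-9]/_/g;
print "$clean\n";    # report__Q1_Q2__final__txt

# With "|" inside the brackets the pipe counts as an allowed character
# and survives the substitution.
( my $piped = $string ) =~ s/[^a-z|A-Z|0-9]/_/g;
print "$piped\n";    # report__Q1|Q2__final__txt
```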

 