Velocity Reviews

adamorn@gmail.com 06-17-2008 06:54 PM

Making a string, file-safe (file-encode??)
 
I was wondering if there was a quick way to ensure that a filename is
safe.

What I mean is that if I am creating a file from a string variable, I
want to ensure that the file will actually be able to be created. So
if it contains a "?", then clearly I would want to eliminate it.

I know that there is something like URL encode that encodes strings
for use in urls, but is there another function that works similarly
for strings for files that I want to create?

Thanks!

Stefan Ram 06-17-2008 07:03 PM

Re: Making a string, file-safe (file-encode??)
 
adamorn@gmail.com writes:
>I know that there is something like URL encode that encodes strings
>for use in urls, but is there another function that works similarly
>for strings for files that I want to create?


The GPL library ram.jar contains a class to convert an
arbitrary Unicode string to a string of only uppercase Latin
letters and digits. This is intended to convert any text to a
text acceptable across most file systems as a filename.

http://www.purl.org/stefan_ram/pub/filode
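
The idea of mapping an arbitrary string onto a small, widely accepted alphabet can be sketched as follows. This is not the class from ram.jar, whose algorithm is not described in this thread; it swaps in plain hexadecimal encoding of the UTF-8 bytes, which yields only the digits 0-9 and the uppercase letters A-F. The class name `HexFileName` and its method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not from ram.jar): hex-encodes a string for use as a file name. */
final class HexFileName {
    private static final char[] DIGITS = "0123456789ABCDEF".toCharArray();

    /** Encodes each UTF-8 byte of the text as two characters from 0-9 and A-F. */
    static String encode(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        StringBuilder out = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            out.append(DIGITS[(b >> 4) & 0x0F]).append(DIGITS[b & 0x0F]);
        }
        return out.toString();
    }

    /** Reverses encode(); expects an even number of hex digits. */
    static String decode(String name) {
        if (name.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits: " + name);
        }
        byte[] bytes = new byte[name.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            // Two hex digits make one byte.
            bytes[i] = (byte) Integer.parseInt(name.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

With this sketch, `HexFileName.encode("a?b")` gives `613F62`. The encoding is reversible, but it at least doubles the length of the name, so long inputs may run into file-name length limits, and an empty input still produces an empty (unusable) name.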


Daniel Pitts 06-17-2008 07:13 PM

Re: Making a string, file-safe (file-encode??)
 
adamorn@gmail.com wrote:
> I was wondering if there was a quick way to ensure that a filename is
> safe.
>
> What I mean is that if I am creating a file from a string variable, I
> want to ensure that the file will actually be able to be created. So
> if it contains a "?", then clearly I would want to eliminate it.
>
> I know that there is something like URL encode that encodes strings
> for use in urls, but is there another function that works similarly
> for strings for files that I want to create?
>
> Thanks!

? is not invalid on all systems; Linux handles it perfectly. The
characters that are invalid are system-specific, and some systems don't
have limitations at all.

The only portable way to handle this is to catch exceptions and report
them to the user.
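
A minimal sketch of that "try it and report the failure" approach, assuming Java 7 or later (`java.nio.file`). The class name `FileCreator` and the method `tryCreate` are hypothetical, chosen only for this illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;

/** Hypothetical helper: attempts the creation and reports why it failed. */
final class FileCreator {
    /** Returns null on success, or a message describing why creation failed. */
    static String tryCreate(Path directory, String name) {
        try {
            // resolve() rejects names the platform cannot represent as a path.
            Path target = directory.resolve(name);
            // createFile() fails if the file exists, the directory is missing or read-only, etc.
            Files.createFile(target);
            return null;
        } catch (InvalidPathException e) {
            return "Invalid file name '" + name + "': " + e.getReason();
        } catch (IOException e) {
            return "Cannot create '" + name + "': " + e;
        }
    }
}
```

The returned message can be shown to the user or written to a log. Which names are rejected depends entirely on the platform and file system the code runs on, which is the point of this approach.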
--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>

adamorn@gmail.com 06-17-2008 07:35 PM

Re: Making a string, file-safe (file-encode??)
 
On Jun 17, 3:13 pm, Daniel Pitts
<newsgroup.spamfil...@virtualinfinity.net> wrote:
> adam...@gmail.com wrote:
> > I was wondering if there was a quick way to ensure that a filename is
> > safe.
> >
> > What I mean is that if I am creating a file from a string variable, I
> > want to ensure that the file will actually be able to be created. So
> > if it contains a "?", then clearly I would want to eliminate it.
> >
> > I know that there is something like URL encode that encodes strings
> > for use in urls, but is there another function that works similarly
> > for strings for files that I want to create?
> >
> > Thanks!
>
> ? is not invalid on all systems; Linux handles it perfectly. The
> characters that are invalid are system-specific, and some systems don't
> have limitations at all.
>
> The only portable way to handle this is to catch exceptions and report
> them to the user.
> --
> Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>



Ah, but I'm actually pulling the filename from a variable that the user
does not set...

Roedy Green 06-17-2008 08:41 PM

Re: Making a string, file-safe (file-encode??)
 
On Tue, 17 Jun 2008 11:54:43 -0700 (PDT), adamorn@gmail.com wrote,
quoted or indirectly quoted someone who said :

>I was wondering if there was a quick way to ensure that a filename is
>safe.


See http://mindprod.com/jgloss/filenames.html for some thoughts on the
problem.
--

Roedy Green Canadian Mind Products
The Java Glossary
http://mindprod.com

Daniel Pitts 06-17-2008 08:57 PM

Re: Making a string, file-safe (file-encode??)
 
adamorn@gmail.com wrote:
> On Jun 17, 3:13 pm, Daniel Pitts
> <newsgroup.spamfil...@virtualinfinity.net> wrote:
>> adam...@gmail.com wrote:
>>> I was wondering if there was a quick way to ensure that a filename is
>>> safe.
>>>
>>> What I mean is that if I am creating a file from a string variable, I
>>> want to ensure that the file will actually be able to be created. So
>>> if it contains a "?", then clearly I would want to eliminate it.
>>>
>>> I know that there is something like URL encode that encodes strings
>>> for use in urls, but is there another function that works similarly
>>> for strings for files that I want to create?
>>>
>>> Thanks!
>>
>> ? is not invalid on all systems; Linux handles it perfectly. The
>> characters that are invalid are system-specific, and some systems don't
>> have limitations at all.
>>
>> The only portable way to handle this is to catch exceptions and report
>> them to the user.
>> --
>> Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>
>
> ah, but I'm actually pulling the filename from a variable that the user
> does not set...

Then make sure the variable is being set by something that doesn't add
invalid characters. Details might help us better help you.


--
Daniel Pitts' Tech Blog: <http://virtualinfinity.net/wordpress/>

Tom Anderson 06-17-2008 11:45 PM

Re: Making a string, file-safe (file-encode??)
 
On Tue, 17 Jun 2008, Eric Sosman wrote:

> adamorn@gmail.com wrote:
>> I was wondering if there was a quick way to ensure that a filename is
>> safe.
>>
>> What I mean is that if I am creating a file from a string variable, I
>> want to ensure that the file will actually be able to be created. So
>> if it contains a "?", then clearly I would want to eliminate it.

>
> The "alphabets" for file names vary from system to system,
> and there are systems on which '?' is perfectly legal. So your
> "clearly" isn't really all that clear ...


Oh come on, this is ridiculous. The only safe and sane thing to do is to
target the common set of valid filenames - so exclude ?, /, \, , *, ",
etc. Surely this is blindingly obvious? This is not a complicated
question, it's quite clear what the OP wants to know, and you're not
helping anyone by making a mountain out of a molehill.

The answer to the question, though, is no - there's no library method that
checks if a filename is safe, or escapes one to make it safe, at least
none that i know of. However, it wouldn't be too hard to write a regular
expression to validate filenames, or a sequence of replace calls to
replace dangerous characters with safe versions.
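
A rough sketch of the "replace dangerous characters" idea. The character list is an assumption: it uses the characters reserved in Windows file names (< > : " / \ | ? *) plus control characters, not a list given in this thread. The class name `FileNameSanitizer` is made up for the example.

```java
/** Hypothetical helper: swaps characters that commonly break file creation for '_'. */
final class FileNameSanitizer {
    // Assumed blacklist: characters reserved in Windows file names, including both path separators.
    private static final String UNSAFE = "<>:\"/\\|?*";

    /** Returns the name with every blacklisted or control character replaced by an underscore. */
    static String sanitize(String name) {
        StringBuilder out = new StringBuilder(name.length());
        for (int i = 0; i < name.length(); i++) {
            char c = name.charAt(i);
            boolean unsafe = c < 0x20 || UNSAFE.indexOf(c) >= 0;
            out.append(unsafe ? '_' : c);
        }
        return out.toString();
    }
}
```

For example, `FileNameSanitizer.sanitize("aaa*bbb?ccc.txt")` returns `aaa_bbb_ccc.txt`. The mapping is lossy: `a?b` and `a*b` both become `a_b`, so two different inputs can collide on the same file name.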

Roedy's advice is pretty good:

http://mindprod.com/jgloss/filenames.html

I'd be tempted to go wild and insist that filenames contain only letters,
digits, underscores, dashes and full stops, and don't have a punctuation
symbol as the first character. If a user came up with a good reason to use
some other character, i'd happily consider adding it, but until then, keep
it simple, keep it safe.
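
The rule described above could be written as a regular expression along these lines. It is only a sketch: "letters" is assumed to mean ASCII letters, and the class name `FileNameValidator` is hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: accepts only conservative file names. */
final class FileNameValidator {
    // First character: an ASCII letter or digit.
    // Remaining characters: ASCII letters, digits, full stop, underscore, dash.
    private static final Pattern SAFE = Pattern.compile("[A-Za-z0-9][A-Za-z0-9._-]*");

    /** Returns true if the whole name matches the conservative pattern. */
    static boolean isSafe(String name) {
        return SAFE.matcher(name).matches();
    }
}
```

Under this pattern `report_2008-06.txt` is accepted, while `.hidden`, `a?b` and the empty string are rejected. It does not check length limits or names that are reserved on particular systems.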

> In general, though, you can't guarantee that a file will be creatable
> just by examining its name. On one widespread system, "D:\\README.TXT"
> is a perfectly valid file name but you are unlikely to succeed in
> creating a new file on a CD-ROM ... Or you may lack permission to create
> files in some folders, or the file system may be full, or ...


True. And completely unconnected to what the OP asked.

tom

--
Judge Dredd. Found dead. Face down in Snoopy's bed.

RedGrittyBrick 06-18-2008 09:24 AM

Re: Making a string, file-safe (file-encode??)
 
Tom Anderson wrote:
> On Tue, 17 Jun 2008, Eric Sosman wrote:
>
>> adamorn@gmail.com wrote:
>>> I was wondering if there was a quick way to ensure that a filename is
>>> safe.
>>>
>>> What I mean is that if I am creating a file from a string variable, I
>>> want to ensure that the file will actually be able to be created. So
>>> if it contains a "?", then clearly I would want to eliminate it.

>>
>> The "alphabets" for file names vary from system to system,
>> and there are systems on which '?' is perfectly legal. So your
>> "clearly" isn't really all that clear ...

>
> Oh come on, this is ridiculous. The only safe and sane thing to do is to
> target the common set of valid filenames - so exclude ?, /, \, , *, ",
> etc. Surely this is blindingly obvious? This is not a complicated
> question, it's quite clear what the OP wants to know, and you're not
> helping anyone by making a mountain out of a molehill.


$ perl file.pl
'aaa*bbb?ccc.txt' written.
'aaa*bbb?ccc.txt' contains ...
Hello File


$ ls -l aaa*
-rw-rw-r-- 1 rgb rgb 11 Jun 18 10:24 aaa*bbb?ccc.txt


$ cat file.pl
#!/usr/bin/perl
#
use strict;
use warnings;

my $filename = 'aaa*bbb?ccc.txt';
open my $fh, '>', $filename
    or die "can't write '$filename' because $!\n";
print $fh "Hello File\n";
close $fh;
print "'$filename' written.\n";

open my $fh2, '<', $filename
    or die "can't read '$filename' because $!\n";
print "'$filename' contains ...\n";
while (<$fh2>) {
    print;
}
close $fh2;


I was too lazy to write it in Java. Sorry :-)
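
For reference, a rough Java counterpart of the Perl script might look like this. It is a sketch assuming Java 8 or later, and the class name `WriteThenRead` and method `roundTrip` are invented for the illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical Java counterpart of file.pl: writes a file, then reads it back. */
final class WriteThenRead {
    /** Writes "Hello File" to directory/filename, prints progress, and returns what was read back. */
    static String roundTrip(Path directory, String filename) {
        try {
            Path file = directory.resolve(filename);
            Files.write(file, "Hello File\n".getBytes(StandardCharsets.UTF_8));
            System.out.println("'" + filename + "' written.");

            String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            System.out.println("'" + filename + "' contains ...");
            System.out.print(content);
            return content;
        } catch (IOException e) {
            // Mirrors the Perl script's "die": give up with the underlying reason.
            throw new UncheckedIOException("can't write or read '" + filename + "'", e);
        }
    }
}
```

Calling `WriteThenRead.roundTrip(java.nio.file.Paths.get("."), "aaa*bbb?ccc.txt")` on Linux should behave like the Perl run shown above. On Windows the same call would be expected to fail, since `*` and `?` are reserved in file names there.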

--
RGB

RedGrittyBrick 06-18-2008 09:27 AM

Re: Making a string, file-safe (file-encode??)
 
Lew wrote:
> Lew wrote:
>>> And why are we forbidding lower-case letters?

>
> Eric Sosman wrote:
>> Because of file systems that don't support them.
>> ISO 9660 (aka HSFS), for example.
>>
>> http://en.wikipedia.org/wiki/ISO_9660

>
> But doesn't everyone use the Joliet extensions?
>


It is only a few days since I received an ISO 9660 CD without Joliet
extensions. So no.

The originator admitted he'd made a mistake though.

--
RGB

Hendrik Maryns 06-18-2008 10:41 AM

Re: Making a string, file-safe (file-encode??)
 
Lew wrote:
| Tom Anderson wrote:
|> Oh come on, this is ridiculous. The only safe and sane thing to do is
|> to target the common set of valid filenames - so exclude ?, /, \, , *,
|> ", etc. Surely this is blindingly obvious? This is not a complicated
|> question, it's quite clear what the OP wants to know, and you're not
|> helping anyone by making a mountain out of a molehill.
|
| Many people's situation differs, and they are fine with using those
| characters in file names, even from Java, so no, the common subset is
| not the only "safe and sane thing to do".

I’ve been using names like ‘(∃y)(y∈--).mona’ and ‘E1 x (E1 y (& (& (>+ x
y) (cat x NF)) (cat y PX))).gta’ without problems, on Linux. Haven’t
been able to test my program on Windows until now, though, since I
haven’t managed to compile the JNI on it. But since these are files the
user doesn’t need to care about, it would be no problem to use ‘safe’
names once it turns out not to work. So I guess I’m interested in this
routine as well.

Cheers, H.
--
Hendrik Maryns
http://tcl.sfs.uni-tuebingen.de/~hendrik/
==================
http://aouw.org
Ask smart questions, get good answers:
http://www.catb.org/~esr/faqs/smart-questions.html