Velocity Reviews - Computer Hardware Reviews

stat() help

 
 
rudy.martono@gmail.com
Guest
Posts: n/a
 
      08-04-2006
Hi,

I am writing a JNI function that receives a jstring filename and returns
the creation date using the stat function.

The problem arises when I have to handle a Unicode filename.
For example:
Εχ.txt ==> "\u0395\u03c7.txt"

Please correct me if I am wrong.

function header:
JNIEXPORT jlong JNICALL Java_getAccessedDate
(JNIEnv * env, jclass obj, jstring filename)

Using GetStringUTFRegion, I am able to translate the jstring
filename into UTF-8 format.

(*env)->GetStringUTFRegion(env, filename, 0, len, rtn);
where
filename is the parameter jstring
rtn is (char *)
and len is (*env)->GetStringLength(env, filename)

When I print it out, it looks like it gives the right value.
But when I double-check the value by using fopen to see whether the file
exists, it returns NULL.
Therefore, I assume stat will return 0, but it returns 724466048.

I am still not familiar with Unicode or UTF-8.
Does UTF-8 need 2 bytes per character?
If that is true, then I should use wchar_t instead of char, _wfopen (to
detect whether the file exists), and _wstat (to get the file's info).

Thank you,

Rudy

 
rudy.martono@gmail.com
Guest
Posts: n/a
 
      08-04-2006
I set a flag for when stat returns -1.
So it looks like the problem is coming from the translation.





 
Roland de Ruiter
Guest
Posts: n/a
 
      08-04-2006
On 4-8-2006 23:26, (E-Mail Removed) wrote:
> I am still not familiar with Unicode or UTF-8.
> Does UTF-8 need 2 bytes per character?
> If that is true, then I should use wchar_t instead of char, _wfopen (to
> detect whether the file exists), and _wstat (to get the file's info).

UTF-8 is a variable-length character encoding requiring 1, 2, 3 or 4
bytes per character. See <http://en.wikipedia.org/wiki/UTF-8>.

JNI however uses a so-called modified form of UTF-8, which, among other
differences, only uses 1, 2 or 3 bytes per character. See
<http://java.sun.com/j2se/1.5.0/docs/guide/jni/spec/types.html#wp16542>
<http://java.sun.com/j2se/1.5.0/docs/api/java/io/DataInput.html#modified-utf-8>
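
To make the modified form concrete, here is a minimal C sketch of the encoding rules as I understand them from those documents. The helper name to_modified_utf8 is made up for illustration, and uint16_t stands in for jchar so the code compiles without jni.h. It is a sketch under those assumptions, not what the JVM actually runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode UTF-16 code units (what a jchar holds) as
 * modified UTF-8. Each unit is encoded on its own, so a surrogate pair
 * becomes two 3-byte sequences, and U+0000 becomes the two bytes c0 80.
 * The caller must provide room for 3 * n bytes plus a terminator.
 * Returns the number of bytes written, not counting the final NUL. */
static size_t to_modified_utf8(const uint16_t *src, size_t n, unsigned char *dst)
{
    size_t out = 0;
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        uint16_t c = src[i];
        if (c >= 0x0001 && c <= 0x007f) {
            /* Plain ASCII: one byte. */
            dst[out++] = (unsigned char)c;
        } else if (c <= 0x07ff) {
            /* Two bytes; this branch also covers U+0000. */
            dst[out++] = (unsigned char)(0xc0 | (c >> 6));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3f));
        } else {
            /* Three bytes for everything else, surrogates included. */
            dst[out++] = (unsigned char)(0xe0 | (c >> 12));
            dst[out++] = (unsigned char)(0x80 | ((c >> 6) & 0x3f));
            dst[out++] = (unsigned char)(0x80 | (c & 0x3f));
        }
    }
    dst[out] = '\0';
    return out;
}
```

The practical consequence is that the byte buffer has to be larger than the character count reported by GetStringLength; GetStringUTFLength reports the byte count of the modified UTF-8 form.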

The UTF-8 bytes of the string Εχ.txt are (hexadecimal notation):
ce 95 | cf 87 | 2e | 74 | 78 | 74
Ε \u0395: 2 bytes: ce 95
χ \u03c7: 2 bytes: cf 87
. \u002e: 1 byte: 2e
t \u0074: 1 byte: 74
x \u0078: 1 byte: 78
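
As a quick way to see that the character count and the byte count differ, here is a small C sketch. The helper name utf8_char_count is hypothetical, and it assumes the input is well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the characters in a well-formed UTF-8 string
 * by skipping continuation bytes, which always have the form 10xxxxxx. */
static size_t utf8_char_count(const char *s)
{
    size_t count = 0;
    assert(s != NULL);
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xc0) != 0x80) {
            count++;   /* lead byte or ASCII byte starts a new character */
        }
    }
    return count;
}
```

With the bytes listed above written as the C literal "\xce\x95\xcf\x87.txt", strlen sees 8 bytes while utf8_char_count sees 6 characters.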

stat/_wstat and fopen/_wfopen probably expect fixed-size characters in
the filename parameter. Which encoding do they expect?
--
Regards,

Roland
 
 
Bill Medland
Guest
Posts: n/a
 
      08-04-2006
(E-Mail Removed) wrote:

> I am still not familiar with Unicode or UTF-8.
> Does UTF-8 need 2 bytes per character?
> If that is true, then I should use wchar_t instead of char, _wfopen (to
> detect whether the file exists), and _wstat (to get the file's info).


Since you mention _wfopen and _wstat, you are presumably talking about a
Microsoft Windows platform. As far as I know, Windows does not normally use
UTF-8 for filenames. Your best bet on Windows would probably be to use
the wide-character functions together with GetStringChars.

(Subtle complication: if you are not on Windows, watch out for jchar
possibly not matching wchar_t, which might well be 4 bytes wide.)
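
A rough sketch of a conversion that copes with either width, in plain C. The helper name utf16_to_wchar is made up, and uint16_t stands in for jchar (a 16-bit unsigned type) so the code compiles without jni.h; treat it as an illustration, not tested JNI code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy n UTF-16 code units (uint16_t stands in for
 * jchar here) into a NUL-terminated wchar_t buffer with room for n + 1
 * elements. Returns the number of wchar_t elements written, not counting
 * the terminator. */
static size_t utf16_to_wchar(const uint16_t *src, size_t n, wchar_t *dst)
{
    size_t out = 0;
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t c = src[i];
        /* Where wchar_t is wider than 16 bits, fold a valid surrogate
         * pair into the single code point it represents. */
        if (sizeof(wchar_t) > 2 && c >= 0xd800 && c <= 0xdbff && i + 1 < n
            && src[i + 1] >= 0xdc00 && src[i + 1] <= 0xdfff) {
            c = 0x10000 + ((c - 0xd800) << 10) + (src[i + 1] - 0xdc00u);
            i++;
        }
        dst[out++] = (wchar_t)c;
    }
    dst[out] = L'\0';
    return out;
}
```

Where wchar_t is 16 bits wide, as on Windows, this reduces to an element-by-element copy plus a terminator.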

--
Bill Medland
 
 
rudy.martono@gmail.com
Guest
Posts: n/a
 
      08-07-2006
Well,

I am not sure about the encoding. The filename can be anything.
Basically I want to be able to retrieve the creation date from it.

Is it correct to convert the jstring filename into wide characters
every time, and use _wstat to get the creation date?

I have changed the code so that it uses GetStringChars( env,
filename, NULL )
to get the Unicode value instead of GetStringUTFChars.

const jchar* file = (*env)->GetStringChars( env, filename, NULL );

and use the WideCharToMultiByte function:

/* pass the jchar buffer, not the jstring handle; the input length is a
   count of wide characters, not bytes */
WideCharToMultiByte( CP_ACP, 0, (LPCWSTR)file,
                     (*env)->GetStringLength(env, filename),
                     new_filename,
                     (*env)->GetStringLength(env, filename)*2+1,
                     NULL, NULL );

I tested it with sampletest_ù.txt, and it works.

When I test it again with Εχ.txt, I get Εχ.txt

Thank you,

Rudy



 
 
rudy.martono@gmail.com
Guest
Posts: n/a
 
      08-07-2006
I think I have found the solution.
Someone posted the same question, and the solution is to use memcpy to
copy the values from the jchar* buffer into a wchar_t buffer.
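
Roughly what I have in mind, as an untested sketch in plain C. The helper name dup_as_wide is mine, and uint16_t stands in for jchar so it compiles without jni.h. It only makes sense where wchar_t is 16 bits wide, as on Windows, so it refuses to copy otherwise.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate n 16-bit units (uint16_t stands in for
 * jchar) into a freshly allocated, NUL-terminated wchar_t string.
 * Returns NULL if wchar_t is not 16 bits wide or if allocation fails.
 * The caller releases the result with free(). */
static wchar_t *dup_as_wide(const uint16_t *src, size_t n)
{
    wchar_t *dst;
    assert(src != NULL);
    if (sizeof(wchar_t) != sizeof(uint16_t)) {
        return NULL;   /* a raw byte copy would scramble the text */
    }
    dst = malloc((n + 1) * sizeof(wchar_t));
    if (dst == NULL) {
        return NULL;
    }
    memcpy(dst, src, n * sizeof(wchar_t));
    dst[n] = L'\0';    /* do not rely on the source ending in NUL */
    return dst;
}
```

The explicit terminator is there because, as far as I can tell, the buffer returned by GetStringChars is not guaranteed to be NUL-terminated, while _wstat expects a terminated string.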

I will do more testing and post the result.

Thank you,

Rudy



 