Hashing function different values on different OS ?

 
 
Lawrence
 
      02-17-2007
Hi all, I use a simple function to create a SHA hash of a file for
a utility I'm writing.

The function is here:
public static String digest(File file) throws
        FileNotFoundException, IOException, NoSuchAlgorithmException {
    MessageDigest sha;
    sha = MessageDigest.getInstance("sha");
    DigestInputStream din = new DigestInputStream(
            new BufferedInputStream(new FileInputStream(file)), sha);

    while (din.read() != -1) {}
    din.close();

    return sha.digest().toString();
}

I send a file over a network (LAN) between a Mac and a Windows
computer, both running my application.
I sent zip files, MP3s, JPEGs, BMPs, TXT, TIFF, GIF, and videos, and they
all transferred perfectly, but the
resulting hash is different for the same file.
How weird is that? Maybe the name of the file matters? It shouldn't.

 
Luc The Perverse
 
      02-17-2007
If you weren't using Java, I'd say it could be an endian problem.

Test with a small file, dump the hex to the screen on both machines, and compare.

I'm intrigued

--
LTP




 
Richter~9.6
 
      02-17-2007
Have you tried zipping up the contents before moving it and unzipping
it on the target machine?

Regards,
Richard

 
Alex Hunsley
 
      02-17-2007
Like Luc, I suspected endian problems for a moment, but Java's
standard data streams use network byte order (big endian), so Java
operating at both ends should match up OK.
Could it be something to do with how MessageDigest may be doing any seeding?
Alex


 
Eric Sosman
 
      02-17-2007
Have you examined the way you "send the file" over the
network? Note that Mac and Windows use different conventions
to mark the ends of lines in text files, so "the same" text
will be represented by different byte sequences on the two
machines. Transport mechanisms like FTP make the conversion
automatically, so you may not have noticed it happening.
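
To make the point about line endings concrete, here is a minimal sketch showing that "the same" text with different end-of-line bytes produces a different digest. The class and method names are invented for illustration, and it assumes SHA-1 is the algorithm intended by "sha" in the original post.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical demo class, not part of the original poster's program.
class LineEndingDigestDemo {

    // Hashes raw bytes with SHA-1 and returns the 20-byte digest.
    static byte[] sha1(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-1").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] lfText = "line one\nline two\n".getBytes(StandardCharsets.US_ASCII);
        byte[] crlfText = "line one\r\nline two\r\n".getBytes(StandardCharsets.US_ASCII);

        // Different bytes on disk, so the digests differ.
        System.out.println(Arrays.equals(sha1(lfText), sha1(crlfText)));       // false
        // Identical bytes always give an identical digest.
        System.out.println(Arrays.equals(sha1(lfText), sha1(lfText.clone()))); // true
    }
}
```

If a transfer tool rewrites line endings, this is the effect you would see; a byte-for-byte copy cannot cause it.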

--
Eric Sosman
(E-Mail Removed)
 
Paul Tomblin
 
      02-17-2007
In a previous article, Eric Sosman <(E-Mail Removed)> said:
>network? Note that Mac and Windows use different conventions
>to mark the ends of lines in text files, so "the same" text
>will be represented by different byte sequences on the two
>machines. Transport mechanisms like FTP make the conversion
>automatically, so you may not have noticed it happening.


Just to expand on that a bit: if you transfer using FTP and tell it that
the file is ASCII, it will convert the line endings, and if you tell it
that it's binary, it won't. Some FTP clients auto-detect what you're
sending and set the binary/ASCII flag correctly, but many don't, and if
you send a binary file without telling it that it's binary, it will end up
badly corrupted.


--
Paul Tomblin <(E-Mail Removed)> http://blog.xcski.com/
The way NT mounts filesystems is something I'd expect to find in a
barnyard or on a stock-breeding farm.
-- Mike Andrews
 
Lawrence
 
      02-17-2007
To answer your question, let me explain.
I transfer the file using my own Java program; I read simple chunks of
bytes and save them to new files.
Since both client and server are in Java and written by me, I believe
there shouldn't be any endian problem of any sort.
In the end the program is pretty simple: I compute a hash, I send
the hash with some other info such as file name and file size, then the
client connects back and requests the file by sending the hash, I look up
the file in a HashMap, and I send it in chunks of bytes.
I do check that if a chunk is not filled by the InputStream, I write
only the data actually read, on both client and server.
When the transfer is complete, the client checks that the file
received has the same hash that the server initially stated.
This check always fails, for any file type.
But I tried many types, including DMG disk images, RAR files,
JPEGs, videos, and zip files, and they all work afterwards.
I'm going to send a very small file and check the hex dumps on both
sides.
Will let you know...
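
For readers following along, the chunked copy described above might look roughly like the sketch below. The actual transfer code was never posted, so the class name, method name, and chunk-size parameter are assumptions; the only point taken from the post is that each write covers just the bytes that read() actually returned.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not the original poster's code.
class ChunkCopy {

    // Copies everything from in to out and returns the number of bytes copied.
    static long copy(InputStream in, OutputStream out, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] chunk = new byte[chunkSize];
        long total = 0;
        int n;
        // read() may fill only part of the buffer, so remember how much it returned.
        while ((n = in.read(chunk)) != -1) {
            out.write(chunk, 0, n); // write only the n bytes actually read
            total += n;
        }
        out.flush();
        return total;
    }
}
```

Writing the whole buffer instead of the first n bytes would pad the output with stale data and change the file, which is the mistake the post says it avoids.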

On Feb 17, 2:54 pm, (E-Mail Removed) (Paul Tomblin) wrote:
> In a previous article, Eric Sosman <(E-Mail Removed)> said:
>
> >network? Note that Mac and Windows use different conventions
> >to mark the ends of lines in text files, so "the same" text
> >will be represented by different byte sequences on the two
> >machines. Transport mechanisms like FTP make the conversion
> >automatically, so you may not have noticed it happening.

>
> Just to expand on that a bit, if you transfer using ftp and tell it that
> the file is ascii, it will convert the ends of lines, and if you tell it
> that it's binary it won't. Some ftp clients auto-detect what you're
> sending and set the binary/ascii flag correctly, but many don't, and if
> you send a binary file without telling it that it's binary, it will end up
> badly corrupted.


 
Lawrence
 
      02-17-2007
On Feb 17, 5:27 pm, "Lawrence" <(E-Mail Removed)> wrote:
> To answer your question let me explain.
> I transfer the file using my own java program, I use simple chunks of


Sorry for the bad quoting before.
I just opened a transferred file with a hex editor on both sides,
and the copies are identical.
So the problem is in the function.
For a file that contains the 4 characters "CIAO", hex [ 43 49 41 4F ],
on the Mac the hash is [B@425743
For the same file, on the Windows machine it is [B@472d48

Done again on the Mac it is [B@238016.
Done again on the Windows machine it is [B@3ae941

I don't understand... how is this possible?

Maybe there is something wrong with converting an array of bytes to a String?
That is the return statement in the method I posted.

Thanks folks
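
The "[B@..." values above are what Object.toString() produces for a byte array: the class name "[B", an "@", and the object's hash code in hex, none of which depends on the array contents. A small sketch (class and method names invented for illustration) shows the difference between comparing the strings and comparing the contents:

```java
import java.util.Arrays;

// Hypothetical demo class for illustration only.
class ArrayToStringDemo {

    // Compares two arrays by content, which is what a hash check needs.
    static boolean sameContent(byte[] x, byte[] y) {
        return Arrays.equals(x, y);
    }

    public static void main(String[] args) {
        byte[] a = {0x43, 0x49, 0x41, 0x4F}; // the bytes of "CIAO"
        byte[] b = a.clone();                // same contents, different object

        System.out.println(a.toString());        // "[B@" plus a hex hash code that varies per run
        System.out.println(b.toString());        // normally a different value
        System.out.println(sameContent(a, b));   // true
        System.out.println(Arrays.toString(a));  // [67, 73, 65, 79]
    }
}
```

So every call to the posted digest method returns a string describing a freshly allocated array, which is why even two runs on the same machine disagree.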




 
Lothar Kimmeringer
 
      02-17-2007
Lawrence wrote:

> return sha.digest().toString();


byte[].toString doesn't work the way you think.
You have to do something like this:

byte[] digest = sha.digest();
StringBuffer sb = new StringBuffer();
for (int i = 0; i < digest.length; i++) {
    // mask to 0..255 so negative bytes are not sign-extended
    if ((digest[i] & 0xff) < 16) {
        sb.append("0"); // pad single-digit values to two hex digits
    }
    sb.append(Integer.toHexString(digest[i] & 0xff));
    sb.append(" ");
}
return sb.toString();

I wrote this by hand without checking for errors, so the
correct result might be different.
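
Putting the original method and the hex conversion together, a complete version could look like the sketch below. It is one possible arrangement rather than the poster's final code: the class name is invented, it assumes Java 7 or later for try-with-resources, it names the algorithm "SHA-1" explicitly, and it emits the hex digits without separating spaces.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical class name; digest(File) mirrors the method from the first post.
class FileDigest {

    // Encodes each byte as exactly two lowercase hex digits.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            int v = b & 0xff;          // treat the byte as unsigned
            if (v < 16) {
                sb.append('0');        // keep a leading zero
            }
            sb.append(Integer.toHexString(v));
        }
        return sb.toString();
    }

    // Reads the file in blocks, feeds them to SHA-1, and returns the digest as hex.
    static String digest(File file) throws IOException, NoSuchAlgorithmException {
        MessageDigest sha = MessageDigest.getInstance("SHA-1");
        try (InputStream in = new FileInputStream(file)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                sha.update(buffer, 0, n); // hash only the bytes actually read
            }
        }
        return toHex(sha.digest());
    }
}
```

With this version, two byte-identical files should yield the same 40-character string on any machine, so the client-side comparison can use String.equals.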

BTW: When reading or writing binary data, don't use streams or
Readers/Writers that convert data, such as PrintStream
or InputStreamReader/OutputStreamWriter.
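
As an illustration of why converting classes are dangerous for binary data, the sketch below decodes bytes to characters and encodes them again, which is roughly what an InputStreamReader/OutputStreamWriter pair does. The class and method names are invented, and UTF-8 is an assumed charset; other charsets damage data in other ways.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class for illustration only.
class TextConversionDemo {

    // Decodes bytes as UTF-8 text and encodes the text back to bytes.
    static byte[] roundTripThroughUtf8(byte[] raw) {
        String decoded = new String(raw, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // First four bytes of a typical JPEG file; not valid UTF-8.
        byte[] binary = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xE0};
        byte[] after = roundTripThroughUtf8(binary);

        // Invalid sequences were replaced during decoding, so the bytes changed.
        System.out.println(Arrays.equals(binary, after)); // false

        // Plain ASCII text survives the round trip unchanged.
        byte[] ascii = "CIAO".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Arrays.equals(ascii, roundTripThroughUtf8(ascii))); // true
    }
}
```

Plain InputStream/OutputStream classes pass bytes through untouched, which is what a file transfer and a digest both need.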


Regards, Lothar
--
Lothar Kimmeringer E-Mail: (E-Mail Removed)
PGP-encrypted mails preferred (Key-ID: 0x8BC3CD81)

Always remember: The answer is forty-two, there can only be wrong
questions!
 
Lawrence
 
      02-17-2007
On Feb 17, 6:08 pm, Lothar Kimmeringer <(E-Mail Removed)>
wrote:
> Lawrence wrote:
> > return sha.digest().toString();

>
> byte[].toString doesn't work the way you think.
> You have to do something like this:
>
> byte[] digest = sha.digest();
> StringBuffer sb = new StringBuffer();
> for (int i = 0; i < digest.length; i++){
> if ((digest[i] & 0xff) < 16){
> sb.append("0");
> }
> sb.append(Integer.toHexString(digest[i] & 0xff));
> sb.append(" ");}
>
> return sb.toString();
>
> I wrote this by hand without checking for errors, so the
> correct result might be different.


Cool, I thought that converting an array to a String would always return
the same value, but I forgot that arrays are objects, and their toString()
output is derived from the object reference rather than the contents.
I will test your code (but I need to look back at the shift
operator and bitwise AND).
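
On the bitwise AND: the suggested loop uses no shift operator at all, only "& 0xff". Java bytes are signed, so a value such as 0xAB is stored as -85, and converting it to int sign-extends it. Masking with 0xff keeps only the low eight bits. A minimal sketch, with invented class and method names:

```java
// Hypothetical demo class for illustration only.
class ByteMaskDemo {

    // Converts one byte to exactly two lowercase hex digits.
    static String hexByte(byte b) {
        int unsigned = b & 0xff;                    // now in the range 0..255
        String hex = Integer.toHexString(unsigned);
        return unsigned < 16 ? "0" + hex : hex;     // pad values below 0x10
    }

    public static void main(String[] args) {
        byte b = (byte) 0xAB;                               // stored as -85
        System.out.println(Integer.toHexString(b));         // ffffffab (sign-extended)
        System.out.println(Integer.toHexString(b & 0xff));  // ab
        System.out.println(hexByte((byte) 0x0A));           // 0a
    }
}
```

Without the mask, every byte at or above 0x80 would print as eight hex digits instead of two.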

 