Reading the first MB of a binary file

 
 
Max Leason
      01-25-2009
Hi,

I'm attempting to read the first MB of a binary file and then compute an MD5
hash of it so that I can find the file later even if it has been moved or
renamed. These files are large (350-1400MB) video files that are often
located on a different computer, and I figure there is a low risk of two
files generating the same hash. The problem occurs in the read command, which
returns all \x00s. Any ideas why this is happening?

Code:
>>> open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(1024)
b'\x00\x00\x00\x00\x00\x00....\x00'
 
MRAB
      01-25-2009
Max Leason wrote:
> Hi,
>
> I'm attempting to read the first MB of a binary file and then compute
> an MD5 hash of it so that I can find the file later even if it has
> been moved or renamed. These files are large (350-1400MB) video files
> that are often located on a different computer, and I figure there is
> a low risk of two files generating the same hash. The problem occurs
> in the read command, which returns all \x00s. Any ideas why this is
> happening?
>
> Code:
> >>> open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(1024)
> b'\x00\x00\x00\x00\x00\x00....\x00'
>

You're reading only the first 1024 bytes, which is 1 KB rather than 1 MB.
Perhaps the first 1024 bytes of the file _are_ all zero!

Try reading more and checking those, e.g.:

>>> SIZE = 1024 ** 2
>>> open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(SIZE) == b'\x00' * SIZE
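
Putting the read and the hash together, a minimal sketch of the original goal might look like the following. The function name `head_md5` and the constant `CHUNK_SIZE` are made up for illustration; the sketch assumes Python 3 and uses only the standard `hashlib` module. It also reports whether the bytes read were all zero, since in that case the digest says nothing useful about the file.

```python
import hashlib

# 1 MiB, as opposed to the 1024 bytes (1 KiB) read in the question.
CHUNK_SIZE = 1024 ** 2


def head_md5(path, size=CHUNK_SIZE):
    """Return (hex_digest, all_zero) for the first `size` bytes of `path`.

    `all_zero` is True when at least one byte was read and every byte
    read was 0x00.
    """
    with open(path, "rb") as f:
        head = f.read(size)  # may return fewer bytes if the file is short
    all_zero = bool(head) and head.count(b"\x00") == len(head)
    return hashlib.md5(head).hexdigest(), all_zero
```

Calling `head_md5("Chuck.S01E01.HDTV.XViD-YesTV.avi")` would return the digest of the first MB together with the all-zero flag. Every file whose first MB is all zeros gets the same digest, so a True flag means the hash should not be used as an identifier.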
 
Marc 'BlackJack' Rintsch
      01-25-2009
On Sun, 25 Jan 2009 08:37:07 -0800, Max Leason wrote:

> I'm attempting to read the first MB of a binary file and then compute an
> MD5 hash of it so that I can find the file later even if it has been
> moved or renamed. These files are large (350-1400MB) video files that are
> often located on a different computer, and I figure there is a low risk
> of two files generating the same hash. The problem occurs in the read
> command, which returns all \x00s. Any ideas why this is happening?
>
> Code:
> >>> open("Chuck.S01E01.HDTV.XViD-YesTV.avi", "rb").read(1024)
> b'\x00\x00\x00\x00\x00\x00....\x00'


As MRAB says, maybe the first 1024 bytes actually *are* all zero. Wild
guess: that's a file created by a BitTorrent client which preallocates
its files, and the file above hasn't been downloaded completely yet!?

Ciao,
Marc 'BlackJack' Rintsch
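
One way to probe the preallocation guess is to look for the first non-zero byte in the file. The helper below is a hypothetical sketch (the name `first_nonzero_offset` is not from the thread) that scans the file in chunks, assuming Python 3. A result of `None`, or a very large offset, would be consistent with a preallocated and partly downloaded file, though it does not prove it.

```python
def first_nonzero_offset(path, chunk_size=1024 ** 2):
    """Return the offset of the first non-zero byte in `path`.

    Returns None if the file is empty or contains only 0x00 bytes.
    """
    offset = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return None  # reached end of file without finding data
            stripped = chunk.lstrip(b"\x00")
            if stripped:
                # Number of leading zero bytes in this chunk.
                return offset + len(chunk) - len(stripped)
            offset += len(chunk)
```

For a 350-1400MB file that really is all zeros this reads the whole file, so it may take a while over a network share.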
 