search and replace in a binary file

 
 
Rafal Konopka
Guest
Posts: n/a
 
      07-22-2006
I need to search and replace some strings in a binary file. When I try
something like the code below, it works fine. The thing is that
I'll need to use replacements that have more or fewer characters (like
150 replaced with 20, etc.). I know it requires some hairy bitwise
shifts but I have no idea how to do it.

TIA,

Rafal

&edit_up('myfile');

sub edit_up {

my ($infile) = @_;

undef $/;
open(F,$infile) || die "$infile: $!";
binmode(F);
my $OUT = "test\\";
if (!-d $OUT) {mkdir($OUT,07770);}

open(OF,">$OUT" . $infile);
binmode(OF);

while (read(F, $buf, 1024)) {
$buf =~ s/\[150\]/[100]/g;
print OF $buf;
}
close(F);
close(OF);
}
 
 
 
 
 
Tad McClellan
Guest
Posts: n/a
 
      07-22-2006
Rafal Konopka <(E-Mail Removed)> wrote:
> I need to search and replace some strings in a binary file. When I try
> something like the code below, it works fine. The thing is that
> I'll need to use replacements that have more or fewer characters (like
> 150 replaced with 20, etc.). I know it requires some hairy bitwise
> shifts but I have no idea how to do it.



If you read from one file, write to another file, and then rename
the 2nd file, then it requires no trickery at all.

Perl can do this for you, see "-i" in perlrun.pod and $^I in perlvar.pod,
though you might have to figure out how to binmode() the ARGV and ARGVOUT
filehandles.
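
For illustration, here is a minimal sketch of the read/write/rename approach done by hand rather than with -i. The sub name replace_in_file and the ".tmp" scratch suffix are made up for this example, and it assumes the file fits in memory:

```perl
use strict;
use warnings;

# Hypothetical helper: write a copy of $path with every literal occurrence
# of $from replaced by $to, then rename the copy over the original.
sub replace_in_file {
    my ($path, $from, $to) = @_;
    my $tmp = "$path.tmp";    # assumed scratch name in the same directory

    open(my $in, '<', $path) or die "$path: $!";
    binmode($in);
    open(my $out, '>', $tmp) or die "$tmp: $!";
    binmode($out);

    # Slurp the whole file; local $/ is restored when the do block ends.
    my $data = do { local $/; <$in> };
    $data = '' unless defined $data;
    close($in);

    # \Q...\E makes the search string literal, so "[" and "]" are not
    # treated as regex metacharacters.
    my $count = ($data =~ s/\Q$from\E/$to/g) || 0;

    print {$out} $data or die "$tmp: $!";
    close($out) or die "$tmp: $!";
    rename($tmp, $path) or die "rename $tmp to $path: $!";
    return $count;            # number of replacements made
}
```

Something like replace_in_file('myfile', '[150]', '[20]') would then change the length of the file without any bit shifting, because the output is written from scratch.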


> &edit_up('myfile');



edit_up('myfile');


You should not use ampersands on subroutine calls unless you know what
using ampersands on subroutine calls does, and what it does is what
you want to do. See perlsub.pod.
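
A small sketch of one thing the ampersand changes, as described in perlsub: called with & and no parentheses, the callee sees the caller's current @_. The sub names here are invented for the example:

```perl
use strict;
use warnings;

# Reports how many arguments it was given.
sub count_args { return scalar @_ }

# "&count_args" with no parentheses passes the caller's own @_ through.
sub call_with_ampersand { return &count_args }

# "count_args()" passes a fresh, empty argument list.
sub call_with_parens { return count_args() }

my $shared = call_with_ampersand(1, 2, 3);   # callee sees three arguments
my $fresh  = call_with_parens(1, 2, 3);      # callee sees none
```

The & form also bypasses prototypes, which is another reason to leave it off unless that behavior is wanted.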


> sub edit_up {
>
> my ($infile) = @_;
>
> undef $/;



local $/;

would be better...

... but $/ is not used for input via read() anyway, so there is no
need to set it to anything in particular.
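
For the cases where $/ does matter (reading with <FH> rather than read()), a sketch of why local is the safer choice: the change is undone automatically when the enclosing block exits. slurp_binary is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: return the entire contents of $path as raw bytes.
sub slurp_binary {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "$path: $!";
    binmode($fh);
    local $/;             # undefined only until this sub returns
    my $data = <$fh>;     # one read returns the whole file
    close($fh);
    return defined $data ? $data : '';
}
```

After slurp_binary() returns, $/ is back to whatever it was before, so line-oriented reads elsewhere in the program are unaffected. A plain "undef $/" would leave it undefined for the rest of the program.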


> open(F,$infile) || die "$infile: $!";



You check the return value from open(). Good. Very Good.


> binmode(F);
> my $OUT = "test\\";



It is good style to use single quotes on strings unless you want one
of the two extra things that double quotes give you (variable
interpolation and/or backslash escapes).

my $OUT = 'test\\';

If the pathname is not destined for a Windows "shell", as in this case,
then using forward slashes in paths is a Good Idea too:

my $OUT = 'test/';


> if (!-d $OUT) {mkdir($OUT,07770);}

^^^^^
^^^^^ those are some mighty
^^^^^ funny-looking permissions...


That will fail if $OUT is a file or pipe or link or ...

You probably want to test for existence rather than for directory-ness:

if ( !-e $OUT ) { mkdir($OUT,0777) }

or written more clearly:

mkdir $OUT, 0777 unless -e $OUT;
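
A slightly more defensive sketch, assuming you would rather get an error than silently continue when the name is already taken by something that is not a directory. ensure_dir is a made-up name. Note that 07770 is octal and asks for the setuid, setgid and sticky bits on top of rwxrwx---, and that whatever mode you pass is further masked by the umask:

```perl
use strict;
use warnings;

# Hypothetical helper: make sure $dir exists and is a directory.
sub ensure_dir {
    my ($dir) = @_;
    return 1 if -d $dir;                                  # already there
    die "$dir exists but is not a directory\n" if -e $dir;
    mkdir($dir, 0777) or die "mkdir $dir: $!";            # mode is masked by umask
    return 1;
}
```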


> open(OF,">$OUT" . $infile);



Now you are no longer checking the return value from open(). Not So Good.

It is now apparent that you _are_ reading from one file and writing to another,
so different lengths should not be a problem.

Did you try it with different lengths and experience a problem?


> binmode(OF);
>
> while (read(F, $buf, 1024)) {



You probably need to handle the case where your to-be-replaced value
is broken across buffer boundaries...
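
One possible way to handle that, sketched under the assumption that the search value is a fixed byte string (not a regex): hold back the last length($from) - 1 unmatched bytes of each buffer and prepend them to the next read. replace_stream and its parameters are invented for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in to $out in chunks, replacing each literal
# occurrence of $from with $to, including matches that straddle two reads.
sub replace_stream {
    my ($in, $out, $from, $to, $chunk_size) = @_;
    die "empty search string\n" unless length $from;
    my $keep  = length($from) - 1;   # longest partial match we might be holding
    my $buf   = '';
    my $count = 0;

    while (1) {
        my $n = read($in, my $chunk, $chunk_size);
        die "read: $!" unless defined $n;
        $buf .= $chunk if $n;

        # Replace every complete match currently in the buffer.
        my ($pos, $result) = (0, '');
        while ((my $hit = index($buf, $from, $pos)) >= 0) {
            $result .= substr($buf, $pos, $hit - $pos) . $to;
            $pos = $hit + length($from);
            $count++;
        }
        my $rest = substr($buf, $pos);   # unmatched input after the last match

        if ($n == 0) {                   # end of file: flush everything
            print {$out} $result, $rest;
            last;
        }

        # Hold back only unprocessed input, never replacement text, so a
        # replacement can never be matched a second time.
        my $tail = length($rest) < $keep ? length($rest) : $keep;
        print {$out} $result, substr($rest, 0, length($rest) - $tail);
        $buf = substr($rest, length($rest) - $tail);
    }
    return $count;
}
```

With both handles opened and binmode()d as in the original code, the loop body becomes a single call such as replace_stream(\*F, \*OF, '[150]', '[20]', 1024).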


> $buf =~ s/\[150\]/[100]/g;

^^^^^^

That search string looks suspiciously non-binary to me.


> print OF $buf;
> }
> close(F);
> close(OF);
> }



--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
 
 
 
 
Rafal Konopka
Guest
Posts: n/a
 
      07-22-2006
On Sat, 22 Jul 2006 09:56:44 -0500, Tad McClellan
<(E-Mail Removed)> wrote:

>Rafal Konopka <(E-Mail Removed)> wrote:
>>[...]

>If you read from one file, write to another file, and then rename
>the 2nd file, then it requires no trickery at all.
>
>Perl can do this for you, see "-i" in perlrun.pod and $^I in perlvar.pod,
>though you might have to figure out how to binmode() the ARGV and ARGVOUT
>filehandles.
>
>
>> &edit_up('myfile');

>
>
> edit_up('myfile');
>
>
>You should not use ampersands on subroutine calls unless you know what
>using ampersands on subroutine calls does, and what it does is what
>you want to do. See perlsub.pod.
>


I've been using them (ampersands) all my life, but I'll check out
perlsub.pod.


>... but $/ is not used for input via read() anyway, so there is no
>need to set it to anything in particular.


OK

>> if (!-d $OUT) {mkdir($OUT,07770);}

> ^^^^^
> ^^^^^ those are some mighty
> ^^^^^ funny-looking permissions...
>
>

typo

>Did you try it with different lengths and experience a problem?


Yes, that's the issue. The moment I replaced 150 with 20, I couldn't
open the file in the application.

>You probably need to handle the case where your to-be-replaced value
>is broken across buffer boundaries...


Exactly!

>
>> $buf =~ s/\[150\]/[100]/g;

> ^^^^^^
>
>That search string looks suspiciously non-binary to me.


It's just an example. Some of the replacements will be ASCII strings
(like the one above) and some will be binary characters (e.g.
chr(176)). The file itself is a binary file.

So how do I go about replacing one character with, say, two, or two
characters with one?
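
As far as the bytes themselves go, a Perl string simply grows or shrinks when the replacement has a different length. A minimal sketch, where replace_bytes is a hypothetical name and the two-byte replacement value is an arbitrary choice for the example:

```perl
use strict;
use warnings;

# Hypothetical helper: return a copy of $data with each literal $from
# replaced by $to. The two may have different lengths.
sub replace_bytes {
    my ($data, $from, $to) = @_;
    $data =~ s/\Q$from\E/$to/g;   # \Q...\E: treat $from as literal bytes
    return $data;
}

my $before = "21" . chr(176) . "C\x00\x01";                    # made-up sample bytes
my $after  = replace_bytes($before, chr(176), "\xC2\xB0");     # one byte becomes two
```

This only changes the byte stream. Whether the application that owns the file still accepts the result depends on the file format.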

Thanks for your suggestions.

Rafal
 
 
Tad McClellan
Guest
Posts: n/a
 
      07-22-2006
Rafal Konopka <(E-Mail Removed)> wrote:
> On Sat, 22 Jul 2006 09:56:44 -0500, Tad McClellan
><(E-Mail Removed)> wrote:
>
>>Rafal Konopka <(E-Mail Removed)> wrote:
>>>[...]

>>If you read from one file, write to another file, and then rename
>>the 2nd file, then it requires no trickery at all.



>>Did you try it with different lengths and experience a problem?

>
> Yes, that's the issue.



Did perl make the different length changes or not?


> The moment I replaced 150 with 20, I couldn't
> open the file in the application.

^^^^^^^^^^^^^^^^^^

That does not answer the question above.

Can you see the changes with a file dump or binary editor?


>>You probably need to handle the case where your to-be-replaced value
>>is broken across buffer boundaries...

>
> Exactly!



You say that as if it was mentioned in your original post. It wasn't;
I was just pointing out that you may have more than one problem to
work on.



> So how do I go about replacing one character with, say, two, or two
> characters with one?



I think you already know how: by outputting 2 characters instead
of 1.

My guess is that perl is making the changes that you need, but that
those changes are incompatible with your unnamed "application".



--
Tad McClellan SGML consulting
(E-Mail Removed) Perl programming
Fort Worth, Texas
 
 
Rafal Konopka
Guest
Posts: n/a
 
      07-22-2006
On Sat, 22 Jul 2006 13:38:31 -0500, Tad McClellan
<(E-Mail Removed)> wrote:

>Rafal Konopka <(E-Mail Removed)> wrote:
>> On Sat, 22 Jul 2006 09:56:44 -0500, Tad McClellan
>><(E-Mail Removed)> wrote:
>>


>>>Did you try it with different lengths and experience a problem?

>>
>> Yes, that's the issue.

>
>
>Did perl make the different length changes or not?
>
>
>> The moment I replaced 150 with 20, I couldn't
>> open the file in the application.

> ^^^^^^^^^^^^^^^^^^
>That does not answer the question above.
>
>Can you see the changes with a file dump or binary editor?
>


Yes, I can see the changes in the dump file.

I really know nothing about binary files. Having tried the
character-for-character replacement successfully, I tried asymmetric
replacements. While I could see them in the dump file, I could no
longer open the file in the application.

>I think you already know how: by outputting 2 characters instead
>of 1.


>My guess is that perl is making the changes that you need, but that
>those changes are incompatible with your unnamed "application".


Essentially, it all boils down to this: imagine I have to replace
"Jon" with "Jonathan" and, conversely, "William" with "Billy" in a Word
document. A straightforward search and replace is not going to
work, so how do I do it?

Rafal
 
 
Peter J. Holzer
Guest
Posts: n/a
 
      07-22-2006
On Sat, 22 Jul 2006 12:44:55 -0400, Rafal Konopka wrote:
> It's just an example. Some of the replacements will be ASCII strings
> (like the one above) and some will be binary characters (e.g.
> chr(176)). The file itself is a binary file.
>
> So how do I go about replacing one character with, say, two, or two
> characters with one?


There is no general solution. You need to know the format of the file
and take care to preserve the format when making changes. For example,
many binary formats use length fields. If you change the length of a
record, you have to update the length field, too. Some file formats also
use checksums to detect corruption - then you need to recompute the
checksum, too.
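
To make the length-field point concrete, here is a sketch for a purely hypothetical format in which the file is a sequence of records, each a 2-byte big-endian length followed by that many payload bytes. Real formats differ, so the layout, the sub name rewrite_records, and the absence of any checksum are all assumptions of this example:

```perl
use strict;
use warnings;

# Hypothetical format: [2-byte big-endian length][payload] repeated.
# Replaces $from with $to inside each payload and rewrites the length field.
sub rewrite_records {
    my ($blob, $from, $to) = @_;
    my ($out, $pos) = ('', 0);

    while ($pos < length $blob) {
        die "truncated length field\n" if $pos + 2 > length $blob;
        my $len = unpack('n', substr($blob, $pos, 2));
        die "truncated record\n" if $pos + 2 + $len > length $blob;

        my $payload = substr($blob, $pos + 2, $len);
        $pos += 2 + $len;

        $payload =~ s/\Q$from\E/$to/g;
        die "payload no longer fits a 16-bit length\n" if length($payload) > 0xFFFF;

        # 'n/a*' packs the new payload preceded by its own, updated length.
        $out .= pack('n/a*', $payload);
    }
    return $out;
}
```

A plain s/// over the whole file would have changed the payload but left the old length in place, which is the kind of inconsistency that can stop an application from opening the file.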

Many file formats are documented at http://www.wotsit.org/default.asp
If the file in question is in a proprietary format you may need to ask
the vendor for information or reverse engineer it.

hp


--
_ | Peter J. Holzer | > Wieso sollte man etwas erfinden was nicht
|_|_) | Sysadmin WSR | > ist?
| | | (E-Mail Removed) | Was sonst wäre der Sinn des Erfindens?
__/ | http://www.hjp.at/ | -- P. Einstein u. V. Gringmuth in desd

 
 
Martijn Lievaart
Guest
Posts: n/a
 
      07-23-2006
On Sat, 22 Jul 2006 15:15:35 -0400, Rafal Konopka wrote:

> Essentially, it all boils down to this: imagine I have to replace
> "Jon" with "Jonathan" and, conversely, "William" with "Billy" in a Word
> document. A straightforward search and replace is not going to
> work, so how do I do it?


Either:

1) Open the document as a COM/.NET object, do a search and replace.

2) Reverse engineer the binary format of the file and figure out what else
has to change.

M4
--
Redundancy is a great way to introduce more single points of failure.

 
 
 
 