medd
2010-06-15 11:30:43 UTC
Hi,
I need to locate a given sequence of bytes within a binary file. I have
not managed to do it efficiently, and I wanted to ask whether somebody
here has an idea.
I saw that there are no functions in IDL to look for a given sequence
within a byte array, but there are very powerful functions to look for
a sequence within a string using regular expressions. This is what I
tried:
; Variable where to read in the file
fcontent = BYTARR((FILE_INFO(fn)).size, /NOZERO)
OPENU, unit, fn, /GET_LUN ;, /SWAP_ENDIAN
READU, unit, fcontent
IF (STREGEX(STRING(fcontent), STRING(sequence_searched)) LT 0) THEN $
  print, 'sequence not found'
This works!! ... But only as long as the file does not contain a byte
with the value 0 (which, too bad!, it does...)
After searching for a while, I found in this forum (message "Null
terminated strings") and in the IDL help that a string is truncated at
the first byte with this value. This explains why this method fails,
but neither source proposes a solution... :(
Do you know some smart workaround? Or do you know other efficient ways
in IDL to locate a sequence of bytes within a binary file?
Thanks!
PS. I thought about replacing all 0's by 1's, but it is a really dirty
solution, which might find the sequence at the wrong place in case
there is a similar sequence which really contains a 1 instead...
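
For reference, one way to avoid the string conversion altogether is to search the byte array directly. The following is only a sketch, not a tested solution: FIND_BYTES is a made-up name, and it assumes that the file fits in memory and that both arguments are one-dimensional byte arrays. It uses WHERE to collect every position where the first byte matches, then discards candidates one byte at a time, so a byte with the value 0 is handled like any other value.

```idl
; Hypothetical helper (the name is an assumption): returns all start
; indices of SEQ within DATA, or -1 if the sequence does not occur.
FUNCTION find_bytes, data, seq
  n = N_ELEMENTS(data)
  m = N_ELEMENTS(seq)
  IF m GT n THEN RETURN, -1L

  ; Candidate starts: first byte matches and the whole sequence still fits
  idx = WHERE(data[0:n-m] EQ seq[0], count)

  ; Narrow down the candidates with each following byte of the sequence
  FOR k = 1L, m - 1 DO BEGIN
    IF count EQ 0 THEN BREAK
    keep = WHERE(data[idx + k] EQ seq[k], count)
    IF count GT 0 THEN idx = idx[keep]
  ENDFOR

  IF count EQ 0 THEN RETURN, -1L
  RETURN, idx
END
```

With the variables from the snippet above, and assuming sequence_searched is already a byte array, it could be called like this:

```idl
pos = find_bytes(fcontent, sequence_searched)
IF pos[0] LT 0 THEN print, 'sequence not found'
```

The loop runs over the length of the sequence rather than the length of the file, so the expensive work stays inside vectorized WHERE calls.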