Overclock.net › Forums › Software, Programming and Coding › Coding and Programming › C: looking for patterns in binary files
C: looking for patterns in binary files

post #1 of 3
Thread Starter 
I'm working on a small project in C where I have to parse a binary file in an undocumented format. As I'm quite new to C, I have two questions for more experienced programmers.

The first seems to be an easy one. How do I extract all the strings from the binary file and put them into an array? Basically I am looking for a simple implementation of the strings program in C.

When I open the binary file in any text editor I get a lot of rubbish with some readable strings mixed in. I can extract these strings using strings on the command line. Now I'd like to do something similar in C, like in the pseudocode below:

Code:
i = 0
while (!EOF) {
     if (string found) {
          put it into array[i]
          i++
     }
}
return i;   /* number of strings found, returned after the whole file is read */
The second problem is a little more complicated and is, I believe, the proper way of achieving the same thing. When I look at the file in a hex editor it's easy to notice some patterns. For example, each string is preceded by a byte of value 02 (0x02), followed by the length of the string and then the string itself. For example, in 02 18 52 4F 4F 54 4B 69 57 69 4B 61 4B 69 the string part is 52 4F 4F 54 4B 69 57 69 4B 61 4B 69 (ASCII "ROOTKiWiKaKi").

Now the function I'm trying to create would work like this:
Code:
while (!EOF) {
     for (i = 0; i < buffer_size; ++i) {
          if (buffer[i] == 0x02) {
               int n = buffer[i + 1];   /* the length byte */
               string = the next n bytes, read as char;
               put string into array;
               i += n + 1;   /* skip the length byte and the string */
          }
     }
}
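For anyone reading along, the loop above could be sketched in C roughly as follows. This is only a sketch under assumptions: the whole file is already in memory as `buf`, the byte after the 0x02 marker is a plain count of the string bytes that follow, and the names `STRING_TAG` and `parse_tagged_strings` are made up for illustration. Note that in the hex example above the length byte is 0x18 (24) while only 12 string bytes are shown, so the real meaning of that byte may differ.

```c
#include <stdlib.h>
#include <string.h>

#define STRING_TAG 0x02   /* marker byte described in the post */

/* Hypothetical helper: walk buf[0..len). For every STRING_TAG byte that is
 * followed by a length byte n and at least n more bytes, store a
 * NUL-terminated heap copy of those n bytes in out[].
 * Returns the number of strings stored (at most max_out).
 * The caller is responsible for free()ing each stored string. */
static size_t parse_tagged_strings(const unsigned char *buf, size_t len,
                                   char **out, size_t max_out)
{
    size_t count = 0, i = 0;

    while (i < len && count < max_out) {
        /* Not a marker, or no room left for a length byte: move on. */
        if (buf[i] != STRING_TAG || i + 1 >= len) {
            ++i;
            continue;
        }
        size_t n = buf[i + 1];           /* assumed: length in bytes */
        if (n > len - (i + 2)) {         /* record would run past the buffer */
            ++i;
            continue;
        }
        char *s = malloc(n + 1);
        if (s == NULL)
            break;                       /* out of memory: stop early */
        memcpy(s, buf + i + 2, n);
        s[n] = '\0';
        out[count++] = s;
        i += 2 + n;                      /* skip marker, length and string */
    }
    return count;
}
```

A 0x02 byte that happens to occur in unrelated data will be treated as a marker here, so the result probably needs a sanity check (for example, rejecting entries that contain non-printable bytes) until more of the format is understood.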
Thanks for any pointers.
buka
post #2 of 3
I'm assuming you're talking about the strings that's part of binutils. In that case it's GPL, so the source is there and waiting for you. On Debian and derivatives do 'apt-get source binutils' and find the strings.c file.
Rena
post #3 of 3
Thread Starter 
I've actually looked at the strings.c source, but their implementation is something like 1000 lines of code. What I eventually ended up doing was writing a function isString() which checks whether a given byte has a value within the range of printable ASCII characters; if three or more consecutive bytes pass that check, I treat them as a string.
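A minimal sketch of that idea, for reference. It is not the actual isString() code; the names `is_printable`, `extract_strings` and `MIN_RUN` are made up for illustration, "printable" is assumed to mean the bytes 0x20 through 0x7E, and the file is assumed to be already loaded into `buf`.

```c
#include <stdlib.h>
#include <string.h>

#define MIN_RUN 3   /* assumption: shortest run that counts as a string */

/* Hypothetical helper: nonzero if c is printable ASCII (space through '~'). */
static int is_printable(unsigned char c)
{
    return c >= 0x20 && c <= 0x7E;
}

/* Scan buf[0..len) and store heap-allocated, NUL-terminated copies of every
 * run of at least MIN_RUN printable bytes in out[].
 * Returns the number of strings stored (at most max_out).
 * The caller is responsible for free()ing each stored string. */
static size_t extract_strings(const unsigned char *buf, size_t len,
                              char **out, size_t max_out)
{
    size_t count = 0, start = 0;

    /* i == len acts as a virtual terminator so a run at the very end
     * of the buffer is flushed too. */
    for (size_t i = 0; i <= len && count < max_out; ++i) {
        if (i < len && is_printable(buf[i]))
            continue;                    /* still inside a run */
        size_t run = i - start;
        if (run >= MIN_RUN) {
            char *s = malloc(run + 1);
            if (s == NULL)
                break;                   /* out of memory: stop early */
            memcpy(s, buf + start, run);
            s[run] = '\0';
            out[count++] = s;
        }
        start = i + 1;                   /* next run begins after this byte */
    }
    return count;
}
```

Raising `MIN_RUN` cuts down on false positives from random bytes that happen to be printable; the strings program itself defaults to a minimum length of 4.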

I think I'm close to solving the second problem, with the main issue being to actually crack the file format.

Thanks for the help anyway. +rep
buka