Most image files begin with a header or marker segment that identifies the image type. The file extension is not a reliable indicator, especially if it has been changed, so you have to read the header to verify the real image format. BMP, PCX, JPEG, FLI/FLC, and AVI files include headers that define the image size, number of colors, and other information needed to display the image. Here is the header information to look for when identifying some of the more common image formats.
In the tables that follow, we’ll assume all offsets start at zero, all field sizes are in bytes, and all numeric values are stored with the least significant byte first.
Bitmap (BMP)
The first two bytes of a BMP file identify its image format.
Byte | 1 | 2 |
Hex | 42 | 4D |
Char | B | M |
Windows BMP files begin with a 54-byte header:
offset | size | description |
0 | 2 | signature, must be 4D42 hex |
2 | 4 | size of BMP file in bytes (unreliable) |
6 | 2 | reserved, must be zero |
8 | 2 | reserved, must be zero |
10 | 4 | offset to start of image data in bytes |
14 | 4 | size of BITMAPINFOHEADER structure, must be 40 |
18 | 4 | image width in pixels |
22 | 4 | image height in pixels |
26 | 2 | number of planes in the image, must be 1 |
28 | 2 | number of bits per pixel (1, 4, 8, or 24) |
30 | 4 | compression type (0=none, 1=RLE-8, 2=RLE-4) |
34 | 4 | size of image data in bytes (including padding) |
38 | 4 | horizontal resolution in pixels per meter (unreliable) |
42 | 4 | vertical resolution in pixels per meter (unreliable) |
46 | 4 | number of colors in image, or zero |
50 | 4 | number of important colors, or zero |
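The table above maps directly onto a fixed binary layout, so the whole header can be decoded in one step. The following is a minimal Python sketch, not a complete BMP reader. The function name and dictionary keys are made up for illustration, and width and height are read as signed values on the assumption that a negative height marks a top-down bitmap.

```python
import struct

# "<" = least significant byte first, no padding.
# H = 2-byte unsigned, I = 4-byte unsigned, i = 4-byte signed.
# The field order follows the offset table above (14 + 40 = 54 bytes).
BMP_HEADER = struct.Struct("<HIHHIIiiHHIIiiII")


def parse_windows_bmp_header(data: bytes) -> dict:
    """Decode the 54-byte Windows BMP header (hypothetical helper)."""
    if len(data) < BMP_HEADER.size:
        raise ValueError("need at least 54 bytes")
    (signature, file_size, _reserved1, _reserved2, data_offset, info_size,
     width, height, planes, bits_per_pixel, compression, image_size,
     _x_ppm, _y_ppm, colors_used, _colors_important) = BMP_HEADER.unpack_from(data)
    if signature != 0x4D42:          # bytes 42 4D, i.e. "BM"
        raise ValueError("not a BMP file")
    if info_size != 40:              # 40 = BITMAPINFOHEADER
        raise ValueError("not a Windows BMP header")
    return {
        "width": width,
        "height": height,
        "planes": planes,
        "bits_per_pixel": bits_per_pixel,
        "compression": compression,
        "data_offset": data_offset,
        "colors_used": colors_used,
    }
```

In practice you would pass it the first 54 bytes read from the file in binary mode.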
OS/2 BMP files begin with a 26-byte header:
offset | size | description |
0 | 2 | signature, must be 4D42 hex |
2 | 4 | size of BMP file in bytes (unreliable) |
6 | 2 | reserved, must be zero |
8 | 2 | reserved, must be zero |
10 | 4 | offset to start of image data in bytes |
14 | 4 | size of BITMAPCOREHEADER structure, must be 12 |
18 | 2 | image width in pixels |
20 | 2 | image height in pixels |
22 | 2 | number of planes in the image, must be 1 |
24 | 2 | number of bits per pixel (1, 4, 8, or 24) |
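Since both BMP variants share the first 14 bytes, the 4-byte structure size at offset 14 is what tells them apart. Here is a rough Python sketch of that check, with hypothetical function names; other header sizes exist in the wild and are simply reported as unknown here.

```python
import struct


def bmp_variant(data: bytes) -> str:
    """Return 'windows', 'os2', or 'unknown' from the size field at offset 14."""
    if len(data) < 18 or data[:2] != b"BM":
        return "unknown"
    (info_size,) = struct.unpack_from("<I", data, 14)
    return {40: "windows", 12: "os2"}.get(info_size, "unknown")


def os2_bmp_size(data: bytes) -> tuple:
    """Read the 2-byte width and height an OS/2 header keeps at offsets 18 and 20."""
    if len(data) < 26:
        raise ValueError("need at least 26 bytes")
    width, height = struct.unpack_from("<HH", data, 18)
    return width, height
```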
PCX Header Format:
PCX files begin with a 128-byte header:
offset | size | description |
0 | 1 | manufacturer byte, must be 10 decimal |
1 | 1 | PCX version number |
 | | 0 = PC Paintbrush version 2.5 |
 | | 2 = PC Paintbrush 2.8 with palette information |
 | | 3 = PC Paintbrush 2.8 without palette information |
 | | 4 = PC Paintbrush for Windows |
 | | 5 = PC Paintbrush 3.0 or later, PC Paintbrush Plus |
2 | 1 | run length encoding byte, must be 1 |
3 | 1 | number of bits per pixel per bit plane |
4 | 8 | image limits in pixels: Xmin, Ymin, Xmax, Ymax |
12 | 2 | horizontal dots per inch when printed (unreliable) |
14 | 2 | vertical dots per inch when printed (unreliable) |
16 | 48 | 16-color palette (16 RGB triples between 0-255) |
64 | 1 | reserved, must be zero |
65 | 1 | number of bit planes |
66 | 2 | video memory bytes per image row |
68 | 2 | 16-color palette interpretation (unreliable) |
 | | 0 = color or b&w, 1 = grayscale |
70 | 2 | horizontal screen resolution – 1 (unreliable) |
72 | 2 | vertical screen resolution – 1 (unreliable) |
74 | 54 | reserved, must be zero |
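The PCX header does not store the width and height directly; they have to be derived from the image limits. The sketch below shows one way to do that in Python. The function name and keys are illustrative only, and it assumes the limits are inclusive, so the width is Xmax - Xmin + 1.

```python
import struct


def parse_pcx_header(data: bytes) -> dict:
    """Pull the commonly needed fields out of a 128-byte PCX header (hypothetical helper)."""
    if len(data) < 128:
        raise ValueError("need at least 128 bytes")
    manufacturer, version, encoding, bits_per_plane = struct.unpack_from("<BBBB", data, 0)
    if manufacturer != 10:
        raise ValueError("not a PCX file")
    xmin, ymin, xmax, ymax = struct.unpack_from("<HHHH", data, 4)
    planes = data[65]
    (bytes_per_row,) = struct.unpack_from("<H", data, 66)
    return {
        "version": version,
        "encoding": encoding,
        "width": xmax - xmin + 1,                  # limits assumed inclusive
        "height": ymax - ymin + 1,
        "planes": planes,
        "bits_per_pixel": bits_per_plane * planes,  # total color depth
        "bytes_per_row": bytes_per_row,
    }
```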
JPEG Header Format:
Strictly speaking, JPEG files do not have formal headers, but fg_jpeghead() and fgi_jpeghead() return relevant information from the file’s start of frame segment. We call it a header for consistency with other image file formats.
To identify a JPEG file, start with the first two bytes of the file. The hex values FF D8 (the start of image, or SOI, marker) identify the start of the image file, and this is often enough to know that you have an actual JPEG file. The next two bytes are an application marker, typically FF E0. This marker can change depending on the application used to create or save the image; I have seen FF E1, the marker used for Exif data, in pictures created by Canon digital cameras. The two bytes after the marker hold the length of the segment and can be skipped. The next five bytes identify the application segment: typically 4A 46 49 46 (JFIF) followed by 00 to terminate the string. So the full sequence is FF D8 FF E0 <skip two bytes> 4A 46 49 46 00.
Byte | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
Hex | FF | D8 | FF | E0 | Skip | Skip | 4A | 46 | 49 | 46 | 00 |
Char | ÿ | Ø | ÿ | à | Skip | Skip | J | F | I | F | NUL |
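The byte sequence in the table can be checked in a few lines. This Python sketch accepts any application marker from FF E0 to FF EF rather than only FF E0, so files that start with an Exif segment are still recognized. The function name and its return convention are my own, not part of any library.

```python
def jpeg_app_identifier(data: bytes):
    """Return the application identifier (e.g. 'JFIF' or 'Exif').

    Returns None if the data does not start with the SOI marker, and an
    empty string if it is a JPEG whose first segment is not an application
    segment.
    """
    if len(data) < 4 or data[:2] != b"\xff\xd8":
        return None
    if data[2] != 0xFF or not (0xE0 <= data[3] <= 0xEF):
        return ""
    # Bytes 4-5 are the segment length; the identifier follows and ends at
    # the first zero byte.
    ident = data[6:16].split(b"\x00", 1)[0]
    return ident.decode("ascii", errors="replace")
```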
The header returned by fg_jpeghead() and fgi_jpeghead() has the following layout:
offset | size | description |
0 | 2 | JPEG SOI marker (FFD8 hex) |
2 | 2 | image width in pixels |
4 | 2 | image height in pixels |
6 | 1 | number of components (1 = grayscale, 3 = RGB) |
7 | 1 | horizontal/vertical sampling factors for component 1 |
8 | 1 | sampling factors for component 2 (if RGB) |
9 | 1 | sampling factors for component 3 (if RGB) |
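If you do not have a library call that fills in a header like this, the same values can be found by walking the file's segments until a start of frame marker turns up. The following Python sketch makes a few assumptions: the frame segment appears before the first start of scan marker, and the data is not truncated mid-segment. Note that in the file itself multi-byte values are stored most significant byte first, and the height comes before the width. The function name is hypothetical.

```python
import struct

# Start of frame markers (FF C0 to FF CF, minus C4, C8 and CC, which mean other things).
SOF_MARKERS = {0xC0, 0xC1, 0xC2, 0xC3, 0xC5, 0xC6, 0xC7,
               0xC9, 0xCA, 0xCB, 0xCD, 0xCE, 0xCF}


def jpeg_frame_info(data: bytes):
    """Return (width, height, components) from the first frame segment, or None."""
    if data[:2] != b"\xff\xd8":
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None                  # lost sync; give up rather than guess
        marker = data[pos + 1]
        if marker == 0xFF:               # fill byte before a marker
            pos += 1
            continue
        if marker == 0xDA:               # start of scan: image data begins
            return None
        (length,) = struct.unpack_from(">H", data, pos + 2)
        if marker in SOF_MARKERS:
            if pos + 10 > len(data):
                return None
            _precision, height, width, components = struct.unpack_from(">BHHB", data, pos + 4)
            return width, height, components
        pos += 2 + length                # skip the marker plus the segment body
    return None
```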
TIFF (Tag Image File Format):
The image header for a TIFF image is a fixed 8-byte segment that always occurs at the beginning of the file. So that TIFF images can be read properly on both PCs (Intel processors) and Macintosh computers, the header must indicate a byte order, which it does in the first two bytes of the file. These are either hex 49 49 (II) for the Intel format, with the least significant byte first, or 4D 4D (MM) for the Macintosh format, which was based on Motorola processors and stores the most significant byte first. The next two bytes hold the number 42 (hex 2A) as a 16-bit value in that byte order, so they read 2A 00 in an II file and 00 2A in an MM file. This number should never change; the TIFF specification describes it as an arbitrary but carefully chosen number. The remaining four bytes give the offset of the first image directory and can be ignored here. For our purpose of identifying the image format, the first three bytes of an II file, shown below, are all that we need to look at.
Byte | 1 | 2 | 3 |
Hex | 49 | 49 | 2A |
Char | I | I | * |
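The table shows the Intel (II) case. A check that also covers Motorola (MM) files has to read the number 42 as a 16-bit value in whichever byte order the first two bytes declare. A small Python sketch of that idea follows; the function name and return values are chosen just for this example.

```python
import struct


def tiff_byte_order(data: bytes):
    """Return 'little' (II), 'big' (MM), or None if this is not a TIFF header."""
    if len(data) < 4:
        return None
    if data[:2] == b"II":
        order, fmt = "little", "<H"
    elif data[:2] == b"MM":
        order, fmt = "big", ">H"
    else:
        return None
    # The magic number 42 is stored in the byte order named above.
    (magic,) = struct.unpack_from(fmt, data, 2)
    return order if magic == 42 else None
```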
GIF (Graphics Interchange Format)
GIF is a compressed image format. It uses LZW, a lossless compression method from the same Lempel-Ziv family as the algorithms used by zip and gzip. Lossless compression ensures that no data is lost and the image is not degraded when it is stored. GIF files are largely used for animated images, and in the early years of the internet you would have been hard pressed to find a website that did not use some form of animated GIF.
To identify a GIF file, read the first three bytes of the file. The three bytes that follow hold the version, either 87a or 89a.
Byte | 1 | 2 | 3 |
Hex | 47 | 49 | 46 |
Char | G | I | F |
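In code, the GIF check is about as simple as it gets. This Python sketch also looks at the three version bytes that follow the signature, on the assumption that only the 87a and 89a versions are in use; the function name is my own.

```python
def gif_version(data: bytes):
    """Return '87a' or '89a' for a GIF file, or None for anything else."""
    if data[:3] != b"GIF":
        return None
    version = data[3:6]
    if version in (b"87a", b"89a"):
        return version.decode("ascii")
    return None
```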
AVI Header Format:
AVI files contain a 56-byte header, starting at offset 32 within the file.
offset | size | description |
0 | 4 | time delay between frames in microseconds |
4 | 4 | data rate of AVI data |
8 | 4 | padding multiple size, typically 2048 |
12 | 4 | parameter flags |
16 | 4 | number of video frames |
20 | 4 | number of preview frames |
24 | 4 | number of data streams (1 or 2) |
28 | 4 | suggested playback buffer size in bytes |
32 | 4 | width of video image in pixels |
36 | 4 | height of video image in pixels |
40 | 4 | time scale, typically 30 |
44 | 4 | data rate (frame rate = data rate / time scale) |
48 | 4 | starting time, typically 0 |
52 | 4 | size of AVI data chunk in time scale units |
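All fourteen fields in the AVI table are 4 bytes wide, which makes the header easy to unpack in one call. The Python sketch below reads it from offset 32 as described above. It also checks for the RIFF and AVI markers at offsets 0 and 8; that check is my addition and is not part of the table. The field names are made up to mirror the table's descriptions.

```python
import struct

AVI_HEADER_OFFSET = 32
AVI_HEADER = struct.Struct("<14I")   # fourteen 4-byte fields = 56 bytes

AVI_FIELDS = (
    "microsec_per_frame", "max_data_rate", "padding_multiple", "flags",
    "total_frames", "preview_frames", "streams", "suggested_buffer_size",
    "width", "height", "time_scale", "data_rate", "start_time", "data_length",
)


def parse_avi_header(data: bytes) -> dict:
    """Decode the 56-byte AVI header found at offset 32 (hypothetical helper)."""
    if len(data) < AVI_HEADER_OFFSET + AVI_HEADER.size:
        raise ValueError("need at least 88 bytes")
    if data[:4] != b"RIFF" or data[8:12] != b"AVI ":
        raise ValueError("not an AVI file")
    values = AVI_HEADER.unpack_from(data, AVI_HEADER_OFFSET)
    return dict(zip(AVI_FIELDS, values))
```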
Examples
This is a basic Clarion example that checks the header of a file against its extension for some of the file formats mentioned above. This code will not compile on its own.
VerifyImageType PROCEDURE (STRING fileNameAndPath, *STRING myErrorDescription)
imageFileExt    STRING(4), AUTO
imageFileName   STRING(MAX_PATH_AND_FILENAME_LENGTH), AUTO
IMAGE_FILE      FILE, DRIVER('DOS'), NAME(imageFileName), PRE(IMFIL)
Record            RECORD
fileBuffer          STRING(10)
                  END
                END
  CODE
  imageFileName = fileNameAndPath
  myErrorDescription = ''
  !Extract the file extension - GetfileExtension is not a built in function
  IF ~GetfileExtension(imageFileName, imageFileExt)
    !Report Error
    myErrorDescription = 'Error getting file extension'
    Return(FALSE)
  END
  Open(IMAGE_FILE, ReadOnly)
  IF ErrorCode()
    !Report Error
    myErrorDescription = 'Error opening ' & Clip(imageFileName)
    Return(FALSE)
  END
  !Read the first record, which is the first 10 bytes of the file
  Set(IMAGE_FILE)
  Next(IMAGE_FILE)
  IF ErrorCode()
    myErrorDescription = 'No data found in image file'
    Close(IMAGE_FILE)
    Return(FALSE)
  END
  CASE Upper(Clip(imageFileExt))
  OF 'JPG'
  OROF 'JPEG'
    IF IMFIL:fileBuffer[1:2] = '<0FFh,0D8h>'
      !Success, File was recognized, Perform some task.
      Close(IMAGE_FILE)
      Return(TRUE)
    ELSE
      !Report error and Perform some task
      myErrorDescription = 'Image header does not match the file extension'
      Close(IMAGE_FILE)
      Return(FALSE)
    END
  OF 'BMP'
    IF IMFIL:fileBuffer[1:2] = '<042h,04Dh>'
      !Success, File was recognized, Perform some task.
      Close(IMAGE_FILE)
      Return(TRUE)
    ELSE
      !Report error and Perform some task
      myErrorDescription = 'Image header does not match the file extension'
      Close(IMAGE_FILE)
      Return(FALSE)
    END
  OF 'TIFF'
  OROF 'TIF'
    !Intel (II) files start 49 49 2A; Motorola (MM) files start 4D 4D 00 2A
    IF IMFIL:fileBuffer[1:3] = '<049h,049h,02Ah>' OR (IMFIL:fileBuffer[1:2] = '<04Dh,04Dh>' AND IMFIL:fileBuffer[4] = '<02Ah>')
      !Success, File was recognized, Perform some task.
      Close(IMAGE_FILE)
      Return(TRUE)
    ELSE
      !Report error and Perform some task
      myErrorDescription = 'Image header does not match the file extension'
      Close(IMAGE_FILE)
      Return(FALSE)
    END
  OF 'GIF'
    IF IMFIL:fileBuffer[1:3] = '<047h,049h,046h>'
      !Success, File was recognized, Perform some task.
      Close(IMAGE_FILE)
      Return(TRUE)
    ELSE
      !Report error and Perform some task
      myErrorDescription = 'Image header does not match the file extension'
      Close(IMAGE_FILE)
      Return(FALSE)
    END
  ELSE
    !Handle unsupported images if needed.
    myErrorDescription = 'Error, non supported image extension encountered'
    Close(IMAGE_FILE)
    Return(FALSE)
  END
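For comparison, here is roughly the same check as a table-driven Python sketch. Instead of branching on the extension first, it identifies the format from the leading bytes and then compares that with what the extension claims. The names SIGNATURES, EXTENSIONS, identify_image, and extension_matches are all invented for this example.

```python
# Leading bytes for each format covered by the example above.
SIGNATURES = (
    (b"\xff\xd8", "JPEG"),
    (b"BM", "BMP"),
    (b"II*\x00", "TIFF"),      # Intel byte order
    (b"MM\x00*", "TIFF"),      # Motorola byte order
    (b"GIF", "GIF"),
)

EXTENSIONS = {"JPG": "JPEG", "JPEG": "JPEG", "BMP": "BMP",
              "TIF": "TIFF", "TIFF": "TIFF", "GIF": "GIF"}


def identify_image(header: bytes):
    """Return the format name for the given leading bytes, or None if unrecognized."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None


def extension_matches(file_name: str, header: bytes) -> bool:
    """True if the file extension is supported and agrees with the header bytes."""
    ext = file_name.rpartition(".")[2].upper()
    expected = EXTENSIONS.get(ext)
    return expected is not None and identify_image(header) == expected
```

Keeping the signatures in one table means a new format only needs a new entry rather than another branch.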