Decompression and validation

OK, we’ve got the data compressed, but it’s not of much use to us unless we can decompress it again, so let’s move on to decompression.
A quick look through the manual shows us that the decompression function is called uncompress and has the following signature:

int uncompress (Bytef *dest, uLongf *destLen, const Bytef *source, uLong sourceLen);

This matches the compress function signature and as such makes porting the call to VB simple:

Private Declare Function uncompress Lib "ZLibWAPI.dll" ( _
ByRef dest As Any, ByRef destLen As Long, _
ByRef source As Any, ByVal sourceLen As Long) As Long

We’ll carry on from the previous chapter’s code and simply decompress the compressed data buffer (assuming the compression operation succeeded).
ZLib requires the output buffer to be large enough to store all the decompressed data, so whenever you compress data with the library you must also store the size of the original buffer.

Note: uncompress returns Z_BUF_ERROR if the supplied buffer is too small, so it is possible to decompress a ZLib-compressed buffer without knowing its uncompressed size by retrying with progressively larger buffers until it no longer returns Z_BUF_ERROR. This approach is very inefficient, since the data has to be decompressed again on every attempt, so it should only be used as a last resort. In most cases you should already know the length of the decompressed data.
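One way to make sure the original size is always available is to store it in front of the compressed data. The following is a minimal sketch in Python rather than VB, using Python’s built-in zlib module as a stand-in for ZLibWAPI.dll. The 4-byte little-endian length header and the helper names pack_with_size and unpack_with_size are assumptions made for illustration; they are not part of the ZLib format or of this tutorial’s code.

```python
import struct
import zlib


def pack_with_size(data: bytes) -> bytes:
    """Return a 4-byte little-endian length header followed by the compressed data."""
    return struct.pack("<I", len(data)) + zlib.compress(data)


def unpack_with_size(blob: bytes) -> bytes:
    """Read the stored length, decompress the rest, and check the sizes agree."""
    (original_len,) = struct.unpack_from("<I", blob, 0)
    # The third argument is only the initial output buffer size; Python grows
    # the buffer if needed, unlike a raw uncompress call. It must be positive.
    restored = zlib.decompress(blob[4:], zlib.MAX_WBITS, max(original_len, 1))
    if len(restored) != original_len:
        raise ValueError("decompressed size does not match the stored size")
    return restored


sample = b"hello zlib " * 100
blob = pack_with_size(sample)
restored = unpack_with_size(blob)
print(len(sample), len(blob), restored == sample)
```

In VB the same idea would amount to writing FileSize out next to the compressed buffer and reading it back before sizing the output buffer.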

In this case we know the size of the decompressed data (the file data) so we can simply allocate the buffer:

Dim DecompressBuf() As Byte, DecompressLen As Long

...

DecompressLen = FileSize
ReDim DecompressBuf(0 To (DecompressLen - 1)) As Byte

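Now simply call the decompression function. DecompressLen is passed by reference: on entry it holds the size of the output buffer, and on return it holds the number of bytes actually written: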

RetVal = uncompress(DecompressBuf(0), DecompressLen, CompressBuf(0), CompressLen)
If (RetVal = Z_OK) Then
    Debug.Print "Decompression succeeded, result size: " & DecompressLen & " bytes"
Else
    Debug.Print "Decompress failed.."
End If

Assuming all went well, the size of the decompressed buffer should be the same as the original file size.

To further verify that the data is indeed correct we can use what’s known as a cyclic redundancy check, or CRC, which takes a buffer and performs some mathematics on each byte to get a final result. If the two buffers differ, their CRC values will almost certainly differ too (a single changed byte is always caught), which lets us detect whether the data survived the round trip intact.
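To see this property in action without touching the VB project, here is a small sketch using Python’s built-in zlib module, which exposes a crc32 function with the same name as the one in the ZLib library. The buffer contents and variable names are made up for the example.

```python
import zlib

# A made-up 1 KB buffer standing in for the file data.
original = bytes(range(256)) * 4

# Copy it and flip a single bit in one byte to simulate corruption.
damaged = bytearray(original)
damaged[100] ^= 0x01

crc_original = zlib.crc32(original)
crc_damaged = zlib.crc32(bytes(damaged))

# The two checksums differ, so the corruption is detected.
print(hex(crc_original), hex(crc_damaged), crc_original != crc_damaged)
```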

The ZLib library exposes two checksum functions. The first, crc32, performs a full CRC check on the data, whereas the second, adler32, computes an Adler-32 checksum, which is much quicker but less reliable (strictly speaking it is not a CRC at all). If you want to find out more about how the two work, the full source code is available: you’ll find the CRC implemented in crc32.c and its corresponding header file, and the Adler checksum in adler32.c.
The two function signatures can be found near the bottom of the programmer’s manual:

uLong adler32 (uLong adler, const Bytef *buf, uInt len);
uLong crc32 (uLong crc, const Bytef *buf, uInt len);

Again these are pretty simple to port to VB, each taking an initial value then a pointer to a data buffer and its length:

Private Declare Function adler32 Lib "ZLibWAPI.dll" ( _
ByVal adler As Long, ByRef buf As Any, ByVal length As Long) As Long
Private Declare Function crc32 Lib "ZLibWAPI.dll" ( _
ByVal crc As Long, ByRef buf As Any, ByVal length As Long) As Long

These functions take an initial value and use it as the base from which to calculate the checksum of the given buffer. This allows you to checksum a multi-part buffer piece by piece, feeding the result of each call into the next, rather than having to pass the entire thing in one go.

The only question is which initial value to use for the first piece of data, and whether it even matters. The answer depends on what you’re using the check for. If you simply want to check within your own application, then as long as you specify the same initial value for both the source check and the destination check, it really doesn’t matter which value you pick. If, however, you’re receiving a CRC calculated by another application (common in things such as network transfers, where data is prone to ‘go missing’ or get corrupted), you must be sure to use the same initial value as the other application. You could get the other application to send you its initial value, but that value could itself be corrupted in transit. Luckily there is a better way.
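The multi-part behaviour can be sketched as follows, again in Python with its built-in zlib module rather than in VB. The helper name crc32_in_chunks and the chunk size are hypothetical; the point is only that chaining the running value through each call gives the same answer as a single call over the whole buffer.

```python
import zlib


def crc32_in_chunks(data: bytes, chunk_size: int = 1024) -> int:
    """Compute a CRC-32 piece by piece, passing the running value into each call."""
    crc = zlib.crc32(b"")  # starting value as reported by the library
    for offset in range(0, len(data), chunk_size):
        crc = zlib.crc32(data[offset:offset + chunk_size], crc)
    return crc


payload = b"The quick brown fox jumps over the lazy dog" * 200

# Chunked and one-shot calculations agree.
print(crc32_in_chunks(payload, 100) == zlib.crc32(payload))
```

The VB equivalent would be to call crc32 once per block, passing the previous return value as the first argument each time.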

If you call either function with a null pointer for the buffer, it simply returns its preferred initial value, so as long as the other application does the same you know you’re both starting from the correct value.
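As a quick cross-check outside VB, Python’s built-in zlib module wraps the same two functions. It does not expose the null-pointer call, but checksumming an empty buffer with the default starting value returns that starting value unchanged, which makes the two initial values easy to inspect. This is a sketch for illustration only; the variable names are made up.

```python
import zlib

# An empty buffer leaves the starting value untouched, so these calls
# reveal the default initial value each function begins from.
crc_start = zlib.crc32(b"")
adler_start = zlib.adler32(b"")

print("crc32 starts at", crc_start)      # prints: crc32 starts at 0
print("adler32 starts at", adler_start)  # prints: adler32 starts at 1
```

Note that the two functions do not share the same starting value, which is a good reason to ask the library rather than assume zero for both.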

For this test we’ll use the full CRC method; the Adler method works in exactly the same way. Go ahead and find the initial value:

Dim FileCRC As Long

...

' Get initial value
FileCRC = crc32(0, ByVal 0&, 0)

With the version of the library I’m using, the initial value of the CRC is zero. However, it’s always best to get the library to tell you rather than hard-coding it, since this could (but shouldn’t) change in future versions.
Once you have the initial value you can pass it in along with the file buffer to compute the file’s CRC:

FileCRC = crc32(FileCRC, FileData(0), FileSize)

Now calculate the CRC of the decompressed buffer (I’ll be using a slightly more condensed version by getting the initial value inline) and compare them:

Dim DecompressCRC As Long

...

DecompressCRC = crc32(crc32(0, ByVal 0&, 0), DecompressBuf(0), DecompressLen)

Debug.Print "File CRC: 0x" & Hex(FileCRC) & ", Decompressed CRC: 0x" & _
Hex(DecompressCRC) & " (" & IIf(FileCRC = DecompressCRC, "Match", "Different") & ")"

As long as all went well, the CRCs should match.
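For comparison, the whole round trip from this chapter can be sketched in a few lines of Python using its built-in zlib module. This is not a port of the VB code, just a hedged illustration of the same steps (compress, decompress, checksum both buffers, compare); the function name describe_round_trip is hypothetical.

```python
import zlib


def describe_round_trip(original: bytes) -> str:
    """Compress and decompress a buffer, then report both CRC-32 values."""
    restored = zlib.decompress(zlib.compress(original))
    file_crc = zlib.crc32(original)
    restored_crc = zlib.crc32(restored)
    verdict = "Match" if file_crc == restored_crc else "Different"
    return (f"File CRC: 0x{file_crc:08X}, "
            f"Decompressed CRC: 0x{restored_crc:08X} ({verdict})")


print(describe_round_trip(b"The quick brown fox jumps over the lazy dog"))
```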

