Allegro.cc - Online Community

Credits go to bamccaig for helping out!
[Python] CRC32, problems due to the lack of unsigned types
Mika Halttunen
Member #760
November 2000

I've been teaching myself Python for the last few days, and it seems like a very nice language. :) I might even use PyQt4 for my CSSTint rewrite. But before I really start writing that, I wanted to learn more Python, so I decided to write a tool for handling my MPAK files (my two latest C++ games use the format to pack all the gfx, sounds, etc. into a single file; it's really just a simple WAD-like format). I have a very crappy command-line tool (written in C++) that can create .MPK files, list their contents and extract them.

So far I've got the listing to work in the Python version of MPAK, but the CRC32 checksum is giving me problems. More specifically, since I can't use an unsigned int in Python, the CRC32 wraps to the negative side, and so my MPAK integrity check fails.

I found a way to make the Python-calculated CRC match the one written into the file, but it's hacky, and Python prints a deprecation warning when it executes the code. :P

With a small .mpk file it works, but with a larger one it shows a negative CRC (it wraps around because the file is larger, I guess). I read the CRC from the file and it's 0x82245193L; then I calculate it for the same file and get -0x7ddbae6d. Those numbers come from tomatoes.mpk (9.2 MB); for a smaller .mpk file (bootstrap.mpk from Funguloids, 251 KB) both read the same.
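As a quick check on the numbers above, the two values look like the same 32 bits read once as unsigned and once as signed, so the sign most likely depends on whether the top bit of the checksum happens to be set rather than on the file's size. A minimal sketch (the variable names are only for illustration):

```python
# Values quoted in the post above
stored = 0x82245193      # read from the file as an unsigned 32-bit value
computed = -0x7ddbae6d   # signed result reported by the CRC calculation

# Masking with 0xffffffff keeps the low 32 bits, which reinterprets the
# signed value as an unsigned one
as_unsigned = computed & 0xffffffff

print(as_unsigned == stored)
```

Adding 2**32 to the negative value gives the same result, since Python integers do not overflow.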

Here's some code:

# Read the CRC32 checksum and the file table header offset
buffer = f.read(8)
crc32, headerOffset = struct.unpack("<LL", buffer)
crc32 = int(crc32)
print "CRC32:", hex(crc32)

# Check that the CRC32 matches
checksum = computeCRC(f, 8)
print "Checksum:", hex(checksum)
if checksum != crc32:
  f.close()
  errorMsg("Checksum doesn't match; perhaps a corrupted package?")

Here's the function that calculates the CRC32, using binascii.crc32.

# Compute the CRC32 for the file, starting at given offset
def computeCRC(f, offset):
  origPos = f.tell()
  f.seek(offset)
  crc = 0

  # Compute a running CRC32 for the file in 16kb chunks
  while True:
    buffer = f.read(16384)
    if buffer == "": break # End of file

    crc = binascii.crc32(buffer, crc)

  f.seek(origPos)
  #crcbuf = struct.pack("<L", crc) # Notice these!
  #crc = struct.unpack("<L", crcbuf)[0]
  return crc

If I uncomment the two lines marked above, it works, but Python spits out the following message:

./mpak.py:60: DeprecationWarning: struct integer overflow masking is deprecated
crcbuf = struct.pack("<L", crc)

So, how can I make this work? :) I'd like to avoid using deprecated stuff.. I can post the original C++ code too, if needed.
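One possible way around the warning, sketched here rather than taken from the thread, is to mask the final result with 0xffffffff instead of round-tripping it through struct. The name compute_crc, the chunk_size parameter and the in-memory demo file are assumptions made for this example. In Python 2, binascii.crc32 can return a negative number; in Python 3 it is documented to always return an unsigned value, so the mask is harmless there.

```python
import binascii
import io


def compute_crc(f, offset, chunk_size=16384):
    """Return the CRC32 of f from offset to end of file as an unsigned value."""
    orig_pos = f.tell()
    f.seek(offset)
    crc = 0

    # Running CRC32 over fixed-size chunks
    while True:
        chunk = f.read(chunk_size)
        if not chunk:  # End of file
            break
        crc = binascii.crc32(chunk, crc)

    f.seek(orig_pos)
    # Keep only the low 32 bits so the result is never negative
    return crc & 0xffffffff


# Hypothetical stand-in for a package file: 8 header bytes, then the
# payload "123456789", whose standard CRC-32 check value is 0xCBF43926
demo = io.BytesIO(b"\x00" * 8 + b"123456789")
print(compute_crc(demo, 8) == 0xCBF43926)
```

Masking once at the end leaves the running value passed back into binascii.crc32 untouched, so the loop behaves exactly as in the original function.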

---------------------------------------------
.:MHGames | @mhgames_ :.

bamccaig
Member #7,536
July 2006

I don't know anything about Python (except that it uses whitespace in the syntax :-X), but this was returned by Google. It might be helpful? :-/ The alternative solution, which some Google results seem to support, is storing your 32-bit unsigned integer (C) in a 64-bit [signed] integer (Python), which should for all values that fit in a 32-bit unsigned integer (C) be positive in a 64-bit [signed] integer (Python). :-/ This is assuming that your C values are 32-bit unsigned integers and Python has support for 64-bit integers. ;) In more general terms, if possible, use the next largest signed integer in Python to enforce the unsigned-ness of the value. :-/

Mika Halttunen
Member #760
November 2000

Thank you very much bamccaig! I glanced over the Python Library Reference but missed ctypes.. :P The following code works as it should:

# Check that the CRC32 matches (needs "from ctypes import c_uint32")
checksum = c_uint32(0)
checksum.value = computeCRC(f, 8)
print "Checksum:", hex(checksum.value)
if checksum.value != crc32:
  f.close()
  errorMsg("Checksum doesn't match; perhaps a corrupted package?")

Anybody more experienced in Python care to comment, is this a correct way to do this? :) Thanks again!
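For comparison, here is a small sketch suggesting that the ctypes approach and a plain bit mask give the same number. ctypes integer types do no overflow checking, so c_uint32 simply keeps the low 32 bits. The variable names are illustrative only.

```python
from ctypes import c_uint32

signed_crc = -0x7ddbae6d  # the negative value quoted earlier in the thread

via_ctypes = c_uint32(signed_crc).value  # wraps into the unsigned 32-bit range
via_mask = signed_crc & 0xffffffff       # same effect without ctypes

print(via_ctypes == via_mask)
```

Both should work for this check. The mask avoids an extra import, while c_uint32 may read more clearly to someone coming from C.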

---------------------------------------------
.:MHGames | @mhgames_ :.
