Sega surely loves their weird compression formats :P Anyway, this time I am trying to figure out their LZ01 compression format, and I am lost. I assume it is based on LZ77.
This is the header (thanks drx):
0x00 - 0x03: 4C5A 3031 (LZ01)
0x04 - 0x07: The compressed file size (4 bytes)
0x08 - 0x0B: The uncompressed file size (4 bytes)
0x10: The start of the data
So the header is 0x10 bytes long. The sliding window length is 2110 bytes (why this number, I have no idea).
The good thing is that some files that are uncompressed in the DS version are LZ01 compressed in the PS2 version (why would the PS2 version's files need to be compressed anyway?), so at least I know what the uncompressed data is. So, here are 2 files, each of which has an uncompressed version and a compressed version:
drop3ez.dat (LZ01 compressed)
drop4ez.dat (LZ01 compressed)
(There are a few bytes that are different between the compressed & uncompressed versions, but they occur before byte 2048.)
Anybody have any ideas on this compression format?
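For anyone following along in code, here is a minimal sketch of reading that header. It assumes the two size fields sit at 0x04 and 0x08 and are little-endian unsigned 32-bit integers; the thread does not state the byte order, so that part is a guess. The function name `parse_lz01_header` is made up for this example.

```python
import struct

MAGIC = b"LZ01"
HEADER_SIZE = 0x10  # the compressed data starts right after this


def parse_lz01_header(blob: bytes) -> dict:
    """Read the 0x10-byte LZ01 header described in the thread."""
    if len(blob) < HEADER_SIZE:
        raise ValueError("file is shorter than the 0x10-byte header")
    if blob[0:4] != MAGIC:
        raise ValueError("missing LZ01 magic")
    # Assumption: both sizes are little-endian unsigned 32-bit values.
    compressed_size, uncompressed_size = struct.unpack_from("<II", blob, 0x04)
    return {
        "compressed_size": compressed_size,
        "uncompressed_size": uncompressed_size,
        "data_offset": HEADER_SIZE,
    }
```

If the sizes come out absurdly large on a real file, the byte order assumption is the first thing to revisit.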
First off, the header looks like this:
0x00-0x03: LZ01 (ASCII)
0x04-0x07: Compressed size
0x08-0x0b: Uncompressed size
0x0c-0x0f: ?
0x10: start of the compressed data
Now, the way the compressed data is organized, it starts with a control byte. The control byte is made up of 8 bits (of course). Starting from the rightmost bit:
1 - pure, uncompressed copy (copy one byte from the source buffer, at its current pointer, to the destination buffer, at its current pointer)
0 - compressed
E.g. if we have FF 01 02 03 04 05 06 07 08, then FF is the control byte, and all its bits are 1, so we copy 01 02 03 04 05 06 07 08 directly to the decompression buffer.
If we have 5F 03 00 0F 00 38 EB F0 D2 EB F0, the control byte is 01011111, so we copy the first five bytes directly (03 00 0F 00 38). Then we encounter a compression flag (which is two bytes: EB F0). Then we copy D2 directly, then a compression flag again (EB F0).
Now, I don't have time to decipher what the compression flags do, but if you don't figure it out soon, I'll try to have a go at it. Hope that helps.
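The control-byte walk described in this post can be sketched roughly as follows. It only splits the stream into literal bytes and two-byte compression flags without interpreting the flags. The function name `split_tokens` and the token labels are invented for the example, and the input is assumed to be the data after the 0x10-byte header.

```python
def split_tokens(data: bytes):
    """Yield ("literal", byte) or ("word", (b0, b1)) tokens from the stream."""
    pos = 0
    while pos < len(data):
        control = data[pos]
        pos += 1
        for bit in range(8):  # rightmost (least significant) bit first
            if pos >= len(data):
                return
            if (control >> bit) & 1:
                # 1 = copy one byte straight through
                yield ("literal", data[pos])
                pos += 1
            else:
                # 0 = a two-byte compression flag follows
                if pos + 1 >= len(data):
                    return  # truncated stream, stop quietly
                yield ("word", (data[pos], data[pos + 1]))
                pos += 2


# The second example from the post: five literals, a flag, a literal, a flag.
example = bytes.fromhex("5F03000F0038EBF0D2EBF0")
for kind, value in split_tokens(example):
    print(kind, value)
```

Writing every control byte as FF and following it with 8 raw bytes would be one way to produce a valid but uncompressed stream under this scheme.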
That does help a lot; at least it is now known how it stores the data. And now I know how to make a "poor man's" LZ01 compressed file. I still don't get how those compression flags work. I noticed that EB F0 seems to refer to 00 00 00, so that may mean something.
Ok, first I fixed the links in the first post. Next, I took a look at the file over the past few days, and it definitely uses a variation of the LZSS compression method. So, EB F0 seems to be telling the decompressor "Go to offset 1 in the decompressed data and repeat for 3 bytes" or something similar to that, although it still confuses me. Help plz.
OK, I fully cracked it and coded a decompressor (that works). I'll put it up in a few hours (I'm in a hurry).
Ok, the decompressor is attached to this post. To use it, drag & drop the file you want to decompress onto the .exe, and it will decompress it and create a new file with the extension .out.
The compression words work like this, reading the two bytes as hex digits xx yz (so for EB F0, xx = EB, y = F, z = 0):
offset = y*256+xx + 18
count = z+3
The decompressor copies (count) bytes from the buffer at (offset).
Note: y*256+xx is a signed, 12-bit number. You *have* to sign extend it to whatever integer width you're using to make it work. The best way to sign extend it is to do this: (v^0x800)-0x800 (where v is y*256+xx and ^ is XOR).
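As a rough illustration of those formulas (this is not the attached .exe, just a sketch of the description above), the decoder below sign extends the 12-bit value, adds 18, and copies byte by byte so overlapping copies work. Two things are assumptions: the offset is treated as an absolute position in the output, and any position before the start of the output reads as 00, which matches the earlier observation that EB F0 seems to refer to 00 00 00. The names `decode_word` and `decompress` are made up here.

```python
def decode_word(b0: int, b1: int):
    """Turn the two bytes xx yz into (offset, count)."""
    raw = ((b1 >> 4) << 8) | b0         # y*256 + xx
    signed = (raw ^ 0x800) - 0x800      # sign extend the 12-bit value
    return signed + 18, (b1 & 0x0F) + 3


def decompress(data: bytes, out_size: int) -> bytes:
    """Decode the data that follows the 0x10-byte header."""
    out = bytearray()
    pos = 0
    while pos < len(data) and len(out) < out_size:
        control = data[pos]
        pos += 1
        for bit in range(8):            # rightmost bit first
            if pos >= len(data) or len(out) >= out_size:
                return bytes(out[:out_size])
            if (control >> bit) & 1:
                out.append(data[pos])   # literal byte
                pos += 1
                continue
            if pos + 1 >= len(data):
                return bytes(out[:out_size])  # truncated word
            offset, count = decode_word(data[pos], data[pos + 1])
            pos += 2
            for i in range(count):
                src = offset + i
                # Assumption: anything outside the bytes written so far is 00.
                out.append(out[src] if 0 <= src < len(out) else 0)
    return bytes(out[:out_size])


print(decode_word(0xEB, 0xF0))
print(decompress(bytes.fromhex("5F03000F0038EBF0D2EBF0"), 12).hex(" "))
```

For EB F0 the arithmetic is 0xFEB sign extended to -21, plus 18, giving offset -3 and count 3, i.e. the three bytes just before the start of the output.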
Thanks drx, that decompressor really helps. Though it seems to have an error with some larger files. This file, for example: LZ01 compressed file. The original looks like this: Original File. But your program decompresses it as this: LZ01 decompressed file. The error starts at 0x8AC (offset 2220). I would have shown you this file before, but I wasn't sure if Sega had fixed the errors in this file (which they didn't).
Yeah, I don't know why that happens, and I won't have time to check this for a while. The dictionary/buffer seems to be only 0x800 or 0x1000 wide, using a signed pointer.
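One way to read the 0x1000 guess, sketched here purely as an assumption and not confirmed anywhere in this thread: treat the 12-bit value as an unsigned index into a zero-filled 0x1000-byte circular window whose write position starts at 0x1000 - 18. For short files this gives the same bytes as the signed offset plus 18, but it keeps working once the output grows past the range a signed 12-bit pointer can reach. `WINDOW_SIZE`, `START_INDEX` and `decompress_windowed` are names invented for this sketch.

```python
WINDOW_SIZE = 0x1000             # assumption: the wider of the two guesses
START_INDEX = WINDOW_SIZE - 18   # assumption: derived from the +18 bias


def decompress_windowed(data: bytes, out_size: int) -> bytes:
    """Decode the post-header data using a circular window (speculative)."""
    window = bytearray(WINDOW_SIZE)  # zero-filled, so early references read 00
    write = START_INDEX
    out = bytearray()

    def emit(byte: int) -> None:
        nonlocal write
        out.append(byte)
        window[write] = byte
        write = (write + 1) % WINDOW_SIZE

    pos = 0
    while pos < len(data) and len(out) < out_size:
        control = data[pos]
        pos += 1
        for bit in range(8):         # rightmost bit first
            if pos >= len(data) or len(out) >= out_size:
                return bytes(out[:out_size])
            if (control >> bit) & 1:
                emit(data[pos])      # literal byte
                pos += 1
                continue
            if pos + 1 >= len(data):
                return bytes(out[:out_size])  # truncated word
            b0, b1 = data[pos], data[pos + 1]
            pos += 2
            index = ((b1 >> 4) << 8) | b0     # unsigned 12-bit window index
            count = (b1 & 0x0F) + 3
            for i in range(count):
                emit(window[(index + i) % WINDOW_SIZE])
    return bytes(out[:out_size])
```

Under this reading a pointer such as 0x9B2 addresses output offset 2500 instead of a negative position, which might explain errors that only show up a little past 0x800. That is a guess to test against the sample files, nothing more.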
I'll try to fix it then, although it would be nice if you could fix it. It seems like the window size is most likely 0x800. EDIT: Either Java hates me, or I suck at coding. Some numbers are decompressed correctly, but others aren't. I'll just wait till you have the time to fix it.
Screw what I had in this post before. I'll just wait till you have the time to fix it, as I absolutely cannot make a working decompressor. I should also mention that a few bytes are different in the new compressed & decompressed files I posted. They happen before offset 0x800 though. (And I do not believe the one error in the fever chain after that offset.)
I want to say that although it's been a month since the last post, I still haven't forgotten about this and still need help with the decompressor. EDIT: Never mind, I got it to decompress correctly. I just need to put in the sliding window. The problem was that the XX variable wasn't converted to a signed byte.