don't click here

Working with Konami-specific LZ compression...

Discussion in 'Technical Discussion' started by Travelsonic, Dec 14, 2017.

  1. Travelsonic

    Travelsonic

    Member
    827
    20
    18
    Deep into my efforts to reverse engineer Konami's Dance Dance Revolution games, working with game data has been one area of particular interest.

    Data like step data is compressed using a Konami-specific variation of LZ compression, and there is a tool that a member of the Bemani community had made to decompress data compressed this way.

    My question is, how difficult would it be to figure out how to reverse that process, that is, use the code he wrote for decompressing files, and figure out/write, an algorithm that utilizes Konami's specific implementation of LZ compression? I think my problem might be partially with understanding of what I am doing in working with the LZ format, might also be with whether I have the information needed to recreate the dictionary of strings needed to properly encode the data.... as well as my tendency to overthink things (and feel like an idiot in the process)... it would be nice to succeed in this endeavor, due to the avenues it would open up for customizing the game.

    Here is an example of data compressed with this LZ compression, and uncompressed. The data is the stepdata to Burning Heat! (3 Option Mix) from DDRMAX2 -DanceDanceRevolution 7thMIX-
    [​IMG]
     
  2. vladikcomper

    vladikcomper

    Tech Member
    205
    134
    43
    Sonic Warped
    To answer your question, LZ-based compressors are relatively easy to code, if you know the essentials. They follow the same "base" algorithm, as described by Lempel-Ziv (hence the name of algorithm family, LZ), and once you learn all technical details and nuances of a specific format/implementation (for instance, bit codes of compression control bytes, their lengths etc), you'll be able to write your own compressor for that format.
    Lucky for you, I may have something that may solve your problem (perhaps, with a tiny bit of fine-tuning)

    I did my own research on Konami-specific LZ(SS) implementation back in 2013, with both decompression and compression tools developed. It looks like I never made a proper release, though description of their algorithm should be available on my pastebin.
    However, I'm not entirely sure if this is exactly the same compression variant as that you're showing here, but judging by the compressed data, it looks pretty identical.

    Konami compression algorithm that I reverse-engineered comes from 1994's title, Contra Hard Corps, but I believe they re-used the same compression on other platforms; as even GBA and NDS titles seem to use similar compression techniques.
    Here's the description of format:

    Code (ASM):
    1.  
    2. Header:
    3.     First two bytes store uncompressed data size.
    4.  
    5. Block:
    6.     Block starts with 8-bit description field, used to determinate if the current byte in the stream is a raw byte or a decompression flag.
    7.         If bit 0 is fetched, the next byte is treated as raw byte and send to the uncompressed stream
    8.         If bit 1 is fetched, the next byte is treated as a decompression flag.
    9.        
    10.     For example, if the description field fetched is %0001010, the following bytes in stream will be decoded as follows:
    11.         RR FF.. RR FF.. RR RR RR RR
    12.     where RR are raw bytes and FF are decompression flags. Notice, the decompression flags activate different decompression modes that read bytes further, so these sequencies sometimes take more than one byte. Once control is returned to the main decompression loop, the next byte after the whole sequence is read, not the one after the flag itself.
    13.    
    14.     Decompression flags consist of one or two bytes. They execute different decompression modes, which may read the following bytes from stream. The upper bits activate different decompression modes with different formats, represented as follows:
    15.    
    16.     %0ddnnnnn dddddddd  Uncompressed stream copy (Mode 1)
    17.                     dd..dddddddd = Displacement (0..-1023)
    18.                     nnnnn = Number of bytes to copy (3..33)
    19.    
    20.     %10nndddd       Uncompressed stream copy (Mode 2)
    21.                     nn = Number of bytes to copy (2..5)
    22.                     dddd = Displacement (0..-15)
    23.  
    24.     %11nnnnnn       Compressed stream copy
    25.                     nnnnnn = Number of bytes to copy (8..71)
    26.    
    27.     If decompression flag $1F is fetched, decompression is terminated.
    To accentuate, the only difference I can see between this and your samples is absence of two-byte header, which only stores uncompressed data size.

    Now, here are compression tools that I developed back in 2013, both compressor and decompressor's sources and executables included:
    https://1drv.ms/u/s!AnfNFA8QMT8TbmHaxsk_vHhwQ_4
    I should warn you though, these tools are really old, and the code is far from being good or optimal. The compressor usually produce slightly worse results (in terms of compression) compared to the original compressor that Konami used. But... at least it should work~

    Just drag & drop whatever you want to compress or decompress on the executables, they'll do the job. Since my research originated from Konami's title Contra Hard Corps, compressed files produced will have .contra extension on them.
    On important note, if your compressed data doesn't include 2-byte header, add it manually of modify the tools to ignore it; compressed files produce shall have this header as well, so again, strip it out if your implementation doesn't use it. If the format in Dance Dance Revolution is identical, this should work for you, but don't always expect recompressed data to match originally compressed (as I said, my compressor fails to pull certain optimizations right).
     
  3. Travelsonic

    Travelsonic

    Member
    827
    20
    18
    Here is the source code for the decompression tool that I mentioned in my original post
    https://github.com/SaxxonPike/scharfrichter/blob/master/Scharfrichter/Compression/BemaniLZ.cs
    wonder if any of that would be usefu, in terms of the raw numerical values used (since that code is in C#, as opposed to C++] in adapting your code to work with the type used in Konami's bemani games.
    (more discombobulated than usual, finishing up my semester... that time of the semester. ~_~ )