Utility KENSSharp

Discussion in 'Engineering & Reverse Engineering' started by FraGag, Feb 26, 2011.

  1. FraGag

    FraGag

    Tech Member
    KENSSharp is a set of libraries written in C# that contain compressors and decompressors for the Kosinski, Enigma, Nemesis and Saxman compression formats. It is a port of flamewing's port, which is included in Sonic 2 Special Stage Editor (http://code.google.com/p/s2-ssedit/). This port was motivated by the desire to make S2LVL (index.php?showtopic=24572) work on other implementations of the Common Language Infrastructure, such as Mono, in which P/Invoke is not available.

    sonicblur independently ported the compressors and decompressors from the original KENS library to C# in a project called KensNET; see next post.

    KENSSharp is licensed under the GNU Lesser General Public License (http://www.gnu.org/licenses/lgpl.html), version 3 or later.
     
  2. sonicblur

    sonicblur

    Oldbie
    Sorry I didn't notice this sooner. My port you mentioned was finished that weekend as promised:
    http://www.sappharad.com/junk/KensNETv1.zip

    All 4 compressors and decompressors are giving identical results for me. Feel free to integrate anything into your codebase if you want.
    However, as you know, I took a different strategy (line-by-line, as close to the original code as possible but with some bugfixes to the original code), so it's not following the same structure. Cleanup shouldn't be difficult now that it compiles and works. If needed, I'd be glad to add some additional signatures to each for working with memory streams instead of files.
     
  3. FraGag

    FraGag

    Tech Member
    QUOTE (sonicblur @ Mar 4 2011, 08:05 PM)
        Sorry I didn't notice this sooner. My port you mentioned was finished that weekend as promised:
        http://www.sappharad.com/junk/KensNETv1.zip

        All 4 compressors and decompressors are giving identical results for me. Feel free to integrate anything into your codebase if you want.
        However, as you know I took a different strategy (line-by-line, as close to the original code as possible but with some bugfixes to the original code) so it's not following the same structure. Cleanup shouldn't be difficult now that it compiles and works.
    I've looked at it briefly, and I find that the code is barely comprehensible; the comments explain what's being done but not why. Of course, it's a problem with the original code, so you're not to blame here. flamewing's code, on the other hand, is more structured and I was able to better relate to the descriptions I read on Sega Retro. (However, flamewing's code for Nemesis decompression didn't work right, so I had to change some things to make it work correctly.) Starting from your code to make it more structured would be pointless when I can just start from more structured code like flamewing's. Furthermore, flamewing improved some compressors to give smaller results (in particular, revision 31 in s2-ssedit improved Nemesis compression), so that's another reason why I prefer using his code as a base.

    However, flamewing didn't make <strike>an Enigma</strike> a Saxman compressor and decompressor, so I'll probably use your code as a base for that. I feel a bit hypocritical for encouraging you to port the original code but not using it though...
     
  4. sonicblur

    sonicblur

    Oldbie
    QUOTE (FraGag @ Mar 4 2011, 07:15 PM)
        I've looked at it briefly, and I find that the code is barely comprehensible; the comments explain what's being done but not why. Of course, it's a problem with the original code, so you're not to blame here. flamewing's code, on the other hand, is more structured and I was able to better relate to the descriptions I read on Sega Retro. (However, flamewing's code for Nemesis decompression didn't work right, so I had to change some things to make it work correctly.) Starting from your code to make it more structured would be pointless when I can just start from more structured code like flamewing's. Furthermore, flamewing improved some compressors to give smaller results (in particular, revision 31 in s2-ssedit improved Nemesis compression), so that's another reason why I prefer using his code as a base.
    All of the comments were also from the original, but I believe you're aware of that. I literally just pasted the original code into a .cs class and fixed each line one-by-one.

    I agree that in the end, your implementation is probably a better thing to do. When I was struggling to figure out why the Saxman decompressor wasn't working, comments or even documentation on the format (which I wasn't able to find) would've helped. In the process of fixing that code though, I eventually started to understand how the format worked. But a nice structure and comments would have helped.

    So keep up the good work. I just wanted to push something out quickly in the event you didn't continue and since I could finish it in a day. In the end, someone using the library doesn't need to care how awful the code inside is as long as it's working. I only briefly looked at yours, but the fact that it's broken up nicely is already a big improvement.
     
  5. flamewing

    flamewing

    Emerald Hunter Tech Member
    France
    Sonic Classic Heroes; Sonic 2 Special Stage Editor; Sonic 3&K Heroes (on hold)
    QUOTE (FraGag @ Mar 4 2011, 10:15 PM)
        However, flamewing didn't make an Enigma compressor and decompressor,
    I think you mean Saxman compressor/decompressor, as I have indeed done the Enigma compressor and decompressor (direct link: https://code.google.com/p/s2-ssedit/source/browse/trunk/src/lib/enigma.cc).

    QUOTE (sonicblur @ Mar 4 2011, 10:32 PM)
        I agree that in the end, your implementation is probably a better thing to do. When I was struggling to figure out why the Saxman decompressor wasn't working, comments or even documentation on the format (which I wasn't able to find) would've helped. In the process of fixing that code though, I eventually started to understand how the format worked. But a nice structure and comments would have helped.
    Yeah, this is the single biggest hurdle I am facing in rewriting the Saxman compressor and decompressor: no documentation at all (it is also why I am delaying working on it).
     
  6. FraGag

    FraGag

    Tech Member
    Nemesis compression and decompression are now implemented.

    QUOTE (flamewing @ Mar 6 2011, 10:59 AM)
        QUOTE (FraGag @ Mar 4 2011, 10:15 PM)
            However, flamewing didn't make an Enigma compressor and decompressor,
        I think you mean Saxman compressor/decompressor, as I have indeed done the Enigma compressor and decompressor (direct link: https://code.google.com/p/s2-ssedit/source/browse/trunk/src/lib/enigma.cc).
    Indeed, I meant Saxman (de)compressor, sorry. And for the record, I'm keeping an eye on the s2-ssedit repository with Commit Monitor (http://tools.tortoisesvn.net/CommitMonitor.html), so if you happen to find a way to improve the existing compressors or fix bugs, I'll know about it. :P Speaking of which...

    QUOTE (FraGag @ Mar 4 2011, 08:15 PM)
        (However, flamewing's code for Nemesis decompression didn't work right, so I had to change some things to make it work correctly.)
    The way you're checking for codes (at "Find out if the data so far is a nibble code") doesn't make sense. During my tests, I use Green Hill Zone's first pattern set. Its header defines codes 000 and 001 (these are the shortest codes in the header). In your decompressor, as soon as one bit is read, it will match one of those codes, because you only consider the numeric value of the code. I used a binary tree instead, similar to the node class you use for encoding (and in fact, KENSSharp has 2 such classes, one used for encoding and the other used for decoding).
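The binary-tree approach described above can be sketched roughly as follows. This is a minimal illustration in C++, not the code from KENSSharp or s2-ssedit: `CodeNode`, `add_code` and `decode` are hypothetical names, and the symbols 10 and 11 are arbitrary placeholders for whatever the header would assign to codes 000 and 001.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical decoding tree: one node per bit prefix.
struct CodeNode {
    std::unique_ptr<CodeNode> child[2]; // child[0] follows a 0 bit, child[1] a 1 bit
    int symbol = -1;                    // >= 0 only on nodes that end a code
};

// Registers a code given as a numeric value plus its length in bits.
void add_code(CodeNode& root, unsigned value, int length, int symbol) {
    assert(length >= 1 && length < 32);
    CodeNode* node = &root;
    for (int bit = length - 1; bit >= 0; --bit) {
        std::unique_ptr<CodeNode>& next = node->child[(value >> bit) & 1];
        if (!next) next.reset(new CodeNode());
        node = next.get();
    }
    node->symbol = symbol;
}

// Walks the tree one bit at a time; a symbol is emitted only when a
// complete code has been read, never on a shorter prefix of it.
std::vector<int> decode(const CodeNode& root, const std::vector<int>& bits) {
    std::vector<int> out;
    const CodeNode* node = &root;
    for (int b : bits) {
        node = node->child[b].get();
        if (!node) break;               // bit pattern is not in the table
        if (node->symbol >= 0) {
            out.push_back(node->symbol);
            node = &root;               // start over for the next code
        }
    }
    return out;
}
```

With codes 000 and 001 registered, a single 0 bit only moves the walk to an inner node, so nothing is matched until the third bit arrives.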
     
  7. flamewing

    flamewing

    Emerald Hunter Tech Member
    QUOTE (FraGag @ Mar 12 2011, 06:28 AM)
        The way you're checking for codes (at "Find out if the data so far is a nibble code") doesn't make sense. During my tests, I use Green Hill Zone's first pattern set. Its header defines codes 000 and 001 (these are the shortest codes in the header). In your decompressor, as soon as one bit is read, it will match one of those codes, because you only consider the numeric value of the code. I used a binary tree instead, similar to the node class you use for encoding (and in fact, KENSSharp has 2 such classes, one used for encoding and the other used for decoding).
    Oops, you are absolutely correct; I was extending the prefix-free property too far. I should also have been checking the code size, not just its value. I have fixed it in SVN now, so thanks for the report.
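The difference between matching on value alone and matching on value plus size can be illustrated with a small lookup table. This is only a sketch under assumed names (`CodeTable`, `matches_by_value`, `lookup`); the actual fix in the s2-ssedit repository may be structured differently.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical table: the key is (code length in bits, code value).
typedef std::map<std::pair<int, unsigned>, int> CodeTable;

// Value-only check: reports a match as soon as the numeric value of the
// bits read so far equals the value of any code, whatever its length.
bool matches_by_value(const CodeTable& table, unsigned value) {
    for (CodeTable::const_iterator it = table.begin(); it != table.end(); ++it)
        if (it->first.second == value) return true;
    return false;
}

// Length-aware check: the bits read so far must agree with a code in
// both length and value. Returns the symbol, or -1 when there is no match.
int lookup(const CodeTable& table, int length, unsigned value) {
    CodeTable::const_iterator it = table.find(std::make_pair(length, value));
    return it == table.end() ? -1 : it->second;
}
```

For a table holding 000 and 001, `matches_by_value` already fires after a single 0 bit, while `lookup(table, 1, 0)` correctly reports no match.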
     
  8. FraGag

    FraGag

    Tech Member
    I've just committed the code for Enigma compression and decompression.

    flamewing, guess what... I found another problem! In your Enigma compressor, you have this:
    Code (Text):
    unsigned short next = unpack[pos+1];
    int delta = int(next) - int(v);
    if (delta == -1 || delta == 0 || delta == 1)
    {
        flush_buffer(buf, bits, mask, packet_length);
        size_t cnt = 1;
        unsigned short prev = next;
        next += delta;
        for (size_t i = pos + 2; i < unpack.size() && cnt < 0xf; i++)
        {
            if (next == unpack[i])
            {
                if (delta == 1 && prev == incrementing_value)
                    break;
                next += delta;
                cnt++;
            }
            else
                break;
        }
    I'm not too sure what you're trying to do with prev, but you're comparing it to incrementing_value too late (and neither prev nor incrementing_value changes in the inner loop, so testing this in the loop is pointless). Right after initializing next, you should check if it's equal to incrementing_value and output an inline copy of a single word. In the inner loop, I compare unpack[i] to incrementing_value and leave the loop if they're equal. I used tilemaps\Title Screen.bin from Sonic 1 for my tests, and my code now outputs a much smaller file when recompressed (222 bytes instead of 272 bytes with your implementation, and 269 bytes with The Sega Data Compressor), because the incrementing word is actually used correctly.
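A simplified sketch of the run scan described above, written in C++ to match the quoted snippet. The function name `inline_run_length` and its counting convention are assumptions made for illustration (it counts only the words after `unpack[pos]`, whereas the quoted code starts `cnt` at 1); it is not the code from KENSSharp or s2-ssedit.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts how many words after unpack[pos] continue the sequence
// unpack[pos] + delta, unpack[pos] + 2*delta, ...
// The scan stops at the first word equal to incrementing_value, so that
// word stays available for the incrementing-word encoding instead of
// being swallowed by the inline run. Requires pos < unpack.size().
std::size_t inline_run_length(const std::vector<std::uint16_t>& unpack,
                              std::size_t pos, int delta,
                              std::uint16_t incrementing_value,
                              std::size_t max_count = 0xf) {
    std::size_t cnt = 0;
    std::uint16_t expected = static_cast<std::uint16_t>(unpack[pos] + delta);
    for (std::size_t i = pos + 1; i < unpack.size() && cnt < max_count; ++i) {
        if (unpack[i] != expected || unpack[i] == incrementing_value)
            break;
        expected = static_cast<std::uint16_t>(expected + delta);
        ++cnt;
    }
    return cnt;
}
```

A result of 0 covers the case where the very next word already equals the incrementing value, which is the check the post says must happen before the loop.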
     
  9. flamewing

    flamewing

    Emerald Hunter Tech Member
    QUOTE (FraGag @ Mar 14 2011, 08:50 AM)
        I'm not too sure what you're trying to do with prev, but you're comparing it to incrementing_value too late (and neither prev nor incrementing_value changes in the inner loop, so testing this in the loop is pointless). Right after initializing next, you should check if it's equal to incrementing_value and output an inline copy of a single word. In the inner loop, I compare unpack[i] to incrementing_value and leave the loop if they're equal. I used tilemaps\Title Screen.bin from Sonic 1 for my tests, and my code now outputs a much smaller file when recompressed (222 bytes instead of 272 bytes with your implementation, and 269 bytes with The Sega Data Compressor), because the incrementing word is actually used correctly.
    You are correct, of course; this happened because I gave much less love to the Enigma compressor than to the Nemesis compressor. My intention was to do what you did, but, as you noted, I botched the logic in my implementation; at least this isn't a mistake that caused compression errors. This misuse of the incrementing word was the first mistake I saw in the original implementation: the word was computed as the one that gave the longest incremental run, but the compressor failed to take full advantage of it because of the inline incrementing runs. I back-ported your changes. Thanks again.
     
  10. FraGag

    FraGag

    Tech Member
    Saxman compression and decompression are done. KENSSharp is now complete!

    At MainMemory's request, I've added an option to Enigma and Moduled Kosinski to choose between big endian and little endian, in order to support <strike>Sonic CD PC and</strike> Sonic & Knuckles Collection.
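To illustrate what such a byte-order option changes, here is a minimal C++ sketch of reading a 16-bit word either way. The names `Endianness` and `read_word` are hypothetical, and which fields of the Enigma and Moduled Kosinski formats are actually affected is not stated in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

enum class Endianness { Big, Little }; // hypothetical option type

// Reads one 16-bit word starting at data[offset] in the requested byte
// order. Requires offset + 1 < data.size().
std::uint16_t read_word(const std::vector<std::uint8_t>& data,
                        std::size_t offset, Endianness order) {
    std::uint16_t first = data[offset];
    std::uint16_t second = data[offset + 1];
    return order == Endianness::Big
        ? static_cast<std::uint16_t>((first << 8) | second)
        : static_cast<std::uint16_t>((second << 8) | first);
}
```

The bytes 0x12 0x34 read as 0x1234 in big-endian order and as 0x3412 in little-endian order.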
     
  11. MainMemory

    MainMemory

    Have no fear...Amy Rose is here! Tech Member
    SonLVL
    AFAIK Sonic CD PC doesn't use any of the MD compression formats.

    But S&KC does, so thanks.
     
  12. MainMemory

    MainMemory

    Have no fear...Amy Rose is here! Tech Member
    I seem to have found a bug: Kosinski decompress "mappings/16x16/EHZ.bin" from the Sonic 2 disassembly, compress as Enigma, decompress. The result doesn't match the original decompressed file.
     
  13. FraGag

    FraGag

    Tech Member
    QUOTE (MainMemory @ Mar 25 2011, 07:07 PM)
        I seem to have found a bug: Kosinski decompress "mappings/16x16/EHZ.bin" from the Sonic 2 disassembly, compress as Enigma, decompress. The result doesn't match the original decompressed file.
    Right, there was a little bug in the Enigma decompressor. It's now fixed.
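The kind of round-trip check that exposes a bug like this one can be sketched as below. The codec here is a toy run-length scheme standing in for a real compressor; it is not Enigma, and all names (`Bytes`, `toy_compress`, `toy_decompress`, `round_trips`) are made up for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Stand-in codec (run length, value pairs), NOT Enigma: it only gives
// the round-trip check something to exercise.
Bytes toy_compress(const Bytes& in) {
    Bytes out;
    for (std::size_t i = 0; i < in.size();) {
        std::size_t run = 1;
        while (i + run < in.size() && in[i + run] == in[i] && run < 255) ++run;
        out.push_back(static_cast<std::uint8_t>(run));
        out.push_back(in[i]);
        i += run;
    }
    return out;
}

Bytes toy_decompress(const Bytes& in) {
    Bytes out;
    for (std::size_t i = 0; i + 1 < in.size(); i += 2)
        out.insert(out.end(), static_cast<std::size_t>(in[i]), in[i + 1]);
    return out;
}

// True when decompress(compress(data)) reproduces data byte for byte.
template <typename Compress, typename Decompress>
bool round_trips(Compress compress, Decompress decompress, const Bytes& data) {
    return decompress(compress(data)) == data;
}
```

Running such a check over real game data (for example, every decompressed mapping file in a disassembly) tests the compressor and decompressor together; a mismatch alone does not say which side is at fault.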
     
  14. MainMemory

    MainMemory

    Have no fear...Amy Rose is here! Tech Member
    I've written a command-line interface for KENSSharp; its source code will be available as soon as the repository works again.
    Code (Text):
    Usage: kenssharp [options] input output

    Arguments:

        -h, --help              Shows this help screen.
        -c, --compress=FORMAT   Compresses a file with the specified FORMAT.
        -d, --decompress=FORMAT Decompresses a file with the specified FORMAT.
        -r, --recompress=FORMAT Decompresses and recompresses a file with the
                                specified FORMAT. If output file is not given,
                                input file will be recompressed in place.
        -s, --same-filename     The output file name will be the same as the
                                input, with an extension indicating the type of
                                compression: .kos, .eni, .nem, .sax, .kosm or
                                .unc.
        -l, --little-endian     Uses little endian (Intel) byte order
                                for Enigma and Moduled Kosinski formats.
        -n, --no-size           Do not include size in Saxman compressed file.

    Formats:

        Kosinski, kos, k    The general-purpose Kosinski compression
                            format.
        Enigma, eni, e      The Enigma compression format for plane
                            mappings.
        Nemesis, nem, n     The Nemesis compression format for art tiles.
        Saxman, sax, s      The Saxman compression format used by Sonic the
                            Hedgehog 2's sound driver and music files.
        ModuledKosinski,    The general-purpose Moduled Kosinski
        KosinskiModuled,    compression format used by Sonic 3 & Knuckles.
        mkos, kosm, mk, km
    Download binaries.
     
  15. MainMemory

    MainMemory

    Have no fear...Amy Rose is here! Tech Member
    I've just finished writing a shell extension for Windows Explorer that adds a submenu to the context menu for all files, allowing you to compress and decompress data. It looks like this:
    [Image: screenshot of the context submenu]
    You should be aware that the kenssharp program is invoked with the -s option, so the output file will have the same name as the input file, with the extension indicating the type of compression.
    You can download the installer here. I have not tested the installer or the extension itself on a 32-bit platform, so let me know if it doesn't work.
     
  16. FireRat

    FireRat

    Nah. Fuck it. Misfit
    Chile
    Mobius Evolution 2
    HOLY GODDAMN SHIT!!!
    ... Because this is going to be pretty useful when compressing a lot of files at once (I'm talking about more than 500 files) without needing to copy derecmp, enter the DOS shell, and all that.
     
  17. Clownacy

    Clownacy

    Tech Member
    https://github.com/sonicretro/KENSSharp/releases/tag/v1.1

    A new release is out, now with Comper compression added.
     
  18. Clownacy

    Clownacy

    Tech Member
    https://github.com/sonicretro/KENSSharp/releases/tag/v1.2

    Here's another release. It fixes a fair few bugs, and adds support for Kosinski+ compression.
     
  19. Clownacy

    Clownacy

    Tech Member
    https://github.com/sonicretro/KENSSharp/releases/tag/v1.3

    Kosinski+ has been updated to its newer format. This improves decompression time (on the Mega Drive, that is).