Optimized KosDec and NemDec, considerably faster decompression UPDATE: New improved Kosinski and Nemesis compressors by flamewing
#17
Posted 08 November 2013 - 04:38 PM
It actually had a pretty good compression ratio when compressing the tiles of Green Hill, Hidden Palace (S2) and a custom tileset: only around 5 KB more each, compared to Nemesis.
UFTC can be found here: https://github.com/s...ree/master/uftc
The compression tool needs to be compiled though.
EDIT: I've compiled the tool, and here it is: http://trox.binary-d...public/uftc.exe
flamewing, on 08 November 2013 - 10:03 AM, said:
Unless you can fit all the decompressed art in main RAM, it would not be suitable for that: you can't start decompressing a part of the file until you have decompressed everything before it, which is not very good for DPLCs. If you were to break the file into the contiguous chunks of art for each DPLC frame, the compression ratio would probably be bad as there is not enough space to find good matches.
For DPLCs, a better choice would probably be Sik's UFTC format -- or would be, except FileDen seems to have gone the way of the dodo. Its compression ratio is much worse, but it allows decompression of specific tiles.
This post has been edited by HCKTROX: 21 January 2014 - 11:31 PM
#18
Posted 08 November 2013 - 04:50 PM
Wow, this is awesome! Amazing job, seriously.
Once SonLvL has proper support for this (if it doesn't already), I want to convert all the Nemesis art in my hack. And the level art too, depending on how much it affects performance.
#19
Posted 08 November 2013 - 06:04 PM
If you're referring to the Comper format, well, the only way any new compression formats are getting support is if someone else (usually FraGag) writes a pure .NET library for it.
#20
Posted 09 November 2013 - 06:35 PM
Quote
Quote
Curious, would this thing be feasible to compress Sonic's art (or any other DPLCable art)?
Unless you can fit all the decompressed art in main RAM, it would not be suitable for that: you can't start decompressing a part of the file until you have decompressed everything before it, which is not very good for DPLCs. If you were to break the file into the contiguous chunks of art for each DPLC frame, the compression ratio would probably be bad as there is not enough space to find good matches.
Flamewing's right, you cannot compress Sonic's art as is. You need to compress each frame separately, and for that you need to assemble all the tiles transferred by DPLCs for each frame. Different frames may share the same group of tiles to save space, but in your reassembled separate frames you'll have to duplicate them. Nevertheless, I considered doing something like this earlier. Despite the need for duplicated tiles, this may be worth a try. There aren't too many of them, at least in Sonic 1's tileset, and the compression ratio could theoretically make up for that loss.
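To make the cost of per-frame splitting concrete, here is a minimal Python sketch. It is not taken from any of the tools discussed in this thread: the function names and the sample data are hypothetical, and each DPLC frame is simplified to a plain list of tile indices (real DPLC entries encode tile runs differently). It only shows how per-frame blobs would be assembled and how many tiles end up stored twice.

```python
TILE_SIZE = 32  # bytes per 8x8 tile at 4 bits per pixel


def assemble_frames(art: bytes, frames: list[list[int]]) -> list[bytes]:
    """Build one contiguous blob per frame from the tile indices it uses."""
    blobs = []
    for tile_indices in frames:
        blob = b"".join(
            art[i * TILE_SIZE:(i + 1) * TILE_SIZE] for i in tile_indices
        )
        blobs.append(blob)
    return blobs


def duplicated_tile_count(frames: list[list[int]]) -> int:
    """Number of extra tile copies created by storing each frame separately."""
    total_refs = sum(len(frame) for frame in frames)
    unique = len({i for frame in frames for i in frame})
    return total_refs - unique


# Hypothetical data: 4 tiles of art, 3 frames, tile 1 shared by two frames.
art = b"".join(bytes([n]) * TILE_SIZE for n in range(4))
frames = [[0, 1], [1, 2], [3]]

blobs = assemble_frames(art, frames)
extra_tiles = duplicated_tile_count(frames)
print([len(b) for b in blobs], extra_tiles)
```

Each blob could then be passed to the compressor on its own; whether the better ratio outweighs the duplicated tiles has to be measured on real art.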
Also, you may easily use Comper compression instead of certain uncompressed art DPLCs.
Objects like Spin Dash dust and shields in Sonic 3 load an entirely unique set of tiles for every animation frame they display. Therefore, each such frame can be fully compressed separately, and no shared tiles will be encountered. Decompression should be fast enough to cause no lag, just like the uncompressed art. However, a lot of adjustments to the DPLC code would have to be made to handle the new way of delivering art. Luckily, each object has its own DPLC-related code, at least in Sonic 2.
I would recommend compressing DPLCs only if you're running out of space or dealing with a really huge set of sprites with unique tiles each. I shall run some tests soon to see how well my compression can be used in Sonic games.
#21
Posted 09 November 2013 - 07:54 PM
SO... I tried to run the Comper compressor on my laptop... got greeted by one of those black console windows for a split second, then nothing... This isn't something new, but I never know what to do about this. Surely I cannot be the only one???
#22
Posted 09 November 2013 - 08:34 PM
All compressors provided by flamewing are command-line tools, which means they only work from a command prompt, where you should pass 2 or 3 arguments to the compressor.
To use them, open Command Prompt and change the current directory to where the compressors' executable files are, like so:
cd D:\Sega\Dev\Tools\KENS_FL\
Now you can run any executable file from that folder just by typing its name (even without the extension). For example, if you want to execute Comper's compressor (compcmp.exe), simply type in: compcmp. However, this won't get you anywhere, because, in order to make it work, you need to pass arguments to the compressor. Executing it without them simply shows a short help message - the one that popped up for a split second when you launched it right from Windows Explorer.
Now, as for the arguments.
If you want to compress file A.bin to Comper format, execute the following command in the command prompt:
compcmp A.bin A.bin.comper
The output will be A.bin.comper (obviously). You may choose any name you like. I prefer extra extensions to identify compression format more easily.
If you want to decompress B.bin.comper, do this:
compcmp -x B.bin.comper B.bin
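If you have many files to process, the two commands above can be generated from a script. The following Python sketch only builds the argument lists in the same shape as the commands shown above; the helper names are hypothetical, and the fallback output name used when the input lacks a .comper extension is my own assumption, not something the tool prescribes.

```python
COMPRESSOR = "compcmp"  # executable name as used in the commands above


def compress_command(src: str) -> list[str]:
    """Mirror `compcmp A.bin A.bin.comper`."""
    return [COMPRESSOR, src, src + ".comper"]


def decompress_command(src: str) -> list[str]:
    """Mirror `compcmp -x B.bin.comper B.bin`."""
    suffix = ".comper"
    if src.endswith(suffix):
        out = src[:-len(suffix)]
    else:
        out = src + ".out"  # assumption: arbitrary name for other inputs
    return [COMPRESSOR, "-x", src, out]


print(" ".join(compress_command("A.bin")))
print(" ".join(decompress_command("B.bin.comper")))
```

Each list can be handed to `subprocess.run(cmd, check=True)` from the folder that contains the executable.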
If you don't like dealing with command prompt, you may check out my own original compressor tools (the compcmp.exe from the recent KENS was written by flamewing himself): https://dl.dropboxus...20Compressor.7z
It works simply: just drag and drop the file you want to compress onto the executable. The output will be a file of the same name, but with an extra .comper extension. This compressor, however, works slowly due to debug output. It was also compiled with MinGW and may not work on some machines if the GCC dynamic link library is absent -- this can be easily fixed by finding the said DLL file and placing it into the compressor's folder (its name will be displayed in the error message).
This post has been edited by vladikcomper: 09 November 2013 - 08:36 PM
#23
Posted 14 November 2013 - 09:51 PM
So, I was running the tools to have some statistics to show and I discovered that the KosM encoder was far from optimal, as it was not properly managing the internal padding of each module. I also improved the Nemesis encoder a bit by using some new heuristics. And finally, I found a case (that does not occur in practice) where the Saxman encoder could be improved: if the file started with a sequence of zeroes. I updated the download link in the previous post; but here it is, for convenience.
Anyway, stats: I reencoded all Sonic 1 Nemesis art, 256x256 blocks and its sole Kosinski art file; for Sonic 2, I reencoded all Nemesis art, all Kosinski art, all 16x16 and 128x128 blocks and all Saxman songs; and for S&K, I reencoded all Kosinski art, all Nemesis art and all moduled Kosinski art. I did this for both original KENS and the latest versions of my tools (in the download linked to above). The results:
| Data | Uncompressed | Original | KENS | Mine | Original ratio | KENS ratio | My ratio | KENS/Original | Mine/Original | Mine/KENS |
|---|---|---|---|---|---|---|---|---|---|---|
| S1 256x256 blocks | 335872 B | 70896 B | 70511 B | 66300 B | 21.11% | 20.99% | 19.74% | 99.46% | 93.52% | 94.03% |
| S1 Nemesis art | 361344 B | 168306 B | 166251 B | 164198 B | 46.58% | 46.01% | 45.44% | 98.78% | 97.56% | 98.77% |
| S1 Kosinski art | 4096 B | 1424 B | 1419 B | 1366 B | 34.77% | 34.64% | 33.35% | 99.65% | 95.93% | 96.26% |
| S1 Total | 701312 B | 240626 B | 238181 B | 231864 B | 34.31% | 33.96% | 33.06% | 98.98% | 96.36% | 97.35% |
| S2 Saxman music | 29430 B | 21935 B | 21933 B | 21819 B | 74.53% | 74.53% | 74.14% | 99.99% | 99.47% | 99.48% |
| S2 16x16 blocks | 42128 B | 30688 B | 30630 B | 30342 B | 72.84% | 72.71% | 72.02% | 99.81% | 98.87% | 99.06% |
| S2 128x128 blocks | 262144 B | 85104 B | 84599 B | 81098 B | 32.46% | 32.27% | 30.94% | 99.41% | 95.29% | 95.86% |
| S2 Nemesis art | 329216 B | 158758 B | 157823 B | 154990 B | 48.22% | 47.94% | 47.08% | 99.41% | 97.63% | 98.20% |
| S2 Kosinski art | 238164 B | 104816 B | 104515 B | 100446 B | 44.01% | 43.88% | 42.18% | 99.71% | 95.83% | 96.11% |
| S2 Total | 901082 B | 401301 B | 399500 B | 388695 B | 44.54% | 44.34% | 43.14% | 99.55% | 96.86% | 97.30% |
| S&K Nemesis art | 264000 B | 113268 B | 108641 B | 107274 B | 42.90% | 41.15% | 40.63% | 95.91% | 94.71% | 98.74% |
| S&K KosM art | 444544 B | 205564 B | 204595 B | 197198 B | 46.24% | 46.02% | 44.36% | 99.53% | 95.93% | 96.38% |
| S&K Kosinski art | 304096 B | 126736 B | 126350 B | 120586 B | 41.68% | 41.55% | 39.65% | 99.70% | 95.15% | 95.44% |
| S&K Total | 1012640 B | 445568 B | 439586 B | 425058 B | 44.00% | 43.41% | 41.98% | 98.66% | 95.41% | 96.70% |
All told, you can save 8762 B in S1, 12606 B in S2 and 20510 B in S3&K.
Edit: Added in a further improved version of the KosM encoder and updated the table accordingly -- it takes into consideration the 3-byte end-of-compression sequence when managing the padding for a module. It is as good as it gets now.
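For anyone who wants to check the figures, the savings and ratios in the "Total" rows can be recomputed with a few lines of Python. This is just a sketch of the arithmetic: the byte counts are copied from this post, and the variable and function names are my own.

```python
# Byte counts copied from the "Total" rows:
# (uncompressed, original compressed, compressed with the new tools)
totals = {
    "S1": (701312, 240626, 231864),
    "S2": (901082, 401301, 388695),
    "S&K": (1012640, 445568, 425058),
}


def saved_bytes(game: str) -> int:
    """Bytes saved by the new tools compared to the original files."""
    _, original, mine = totals[game]
    return original - mine


def ratio_percent(game: str) -> float:
    """New compressed size as a percentage of the uncompressed size."""
    uncompressed, _, mine = totals[game]
    return round(mine / uncompressed * 100, 2)


for game in totals:
    print(game, saved_bytes(game), ratio_percent(game))
```

The savings are simply the "Original" column minus the "Mine" column, and each ratio column is a compressed size divided by the uncompressed size.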
This post has been edited by flamewing: 15 November 2013 - 07:56 AM
#24
Posted 28 November 2013 - 01:51 AM
So, I'm trying out implementing the Decompressor code... and I've a question.
In the Nemesis decompression code, above NemDec, there's a block labeled NemDec_RAM:
Should I just ignore this and port the code into Sonic 1 as is? Or is NemDec_RAM supposed to be used in certain instances?
#25
Posted 28 November 2013 - 08:03 AM
It would appear that NemDec_RAM (or NemDecToRAM as Sonic 2 calls it) is in fact present and unused in Sonic 1. Look here (taken from HG disasm):
NemDec:
		movem.l	d0-a1/a3-a5,-(sp)
		lea	(loc_1502).l,a3
		lea	(vdp_data_port).l,a4
		bra.s	loc_145C
; ===========================================================================
		movem.l	d0-a1/a3-a5,-(sp)
		lea	(loc_1518).l,a3
Above the '='s you can see NemDec. Below them, with no label, you can see NemDec_RAM. This is actually used in Sonic 2. Its use is similar to KosDec in the sense that, to use it, the source is loaded into a0 and the RAM destination is loaded into a4 (a1 in KosDec), so you can easily swap them out, as I did with CompDec.
To answer your question, no, you don't need NemDec_RAM unless you're planning to replace some Kosinski art/mappings/anything, and therefore their decompression routines, with Nemesis. In my experience, NemDec_RAM, KosDec and CompDec are fully interchangeable (though Sonic 2's 16x16s are giving me some trouble). So choose whichever suits your needs best: small file size or fast decompression speed.
EDIT: Grammar!
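As a rough analogy for the two entry points, here is a Python sketch of one shared decoder body with two interchangeable output routines, in the way the listing loads a different routine address into a3 for each entry point. It is only an illustration: the `decode` function below passes its input straight through instead of unpacking Nemesis data, and every name in it is hypothetical.

```python
def decode(source: bytes, write_row) -> None:
    """Stand-in for the shared decoder body (no real Nemesis unpacking here)."""
    for offset in range(0, len(source), 4):
        # One row of 8 pixels at 4 bits per pixel is 4 bytes.
        write_row(source[offset:offset + 4])


def decode_to_port(source: bytes, port_log: list) -> None:
    """Analogue of NemDec: every row goes to the same fixed output port."""
    decode(source, port_log.append)


def decode_to_ram(source: bytes, ram: bytearray, dest: int) -> None:
    """Analogue of NemDec_RAM: rows are stored at an advancing RAM address."""
    cursor = dest

    def write_row(row: bytes) -> None:
        nonlocal cursor
        ram[cursor:cursor + len(row)] = row
        cursor += len(row)

    decode(source, write_row)


data = bytes(range(8))
port = []
decode_to_port(data, port)
ram = bytearray(16)
decode_to_ram(data, ram, 4)
print(port, ram.hex())
```

The decoder itself never needs to know where its output ends up, which is why a RAM-targeting variant can be swapped for another RAM-targeting decompressor with the same calling convention.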
This post has been edited by Clownacy: 28 November 2013 - 10:27 AM
#26
Posted 28 November 2013 - 10:54 AM
My understanding is that Nemesis is designed for, and should only be used with, 8x8 tiles. Kosinski and Comper can be used for any type of data, and Enigma is for plane mappings or 16x16 blocks (Sonic 1 uses it for them).
#27
Posted 28 November 2013 - 11:04 AM
That said, would it be feasible to simply oust Nemesis entirely in favor of Kosinski (or Comper, now that we have it)? I'm guessing part of the reason that's not done... is a filesize issue???
#28
Posted 28 November 2013 - 12:37 PM
I think the main reason is... well, you know how NemDec_RAM could be replaced because Kos/CompDec had the same results (source (a0) decompressed to a RAM address (a4/a1))? Well, NemDec is a different story: for example, it writes to VRAM and involves the VDP data port. I don't see that working with KosDec and CompDec in their current states. Maybe Sonic 3 found a way around that, having replaced a good chunk of its art (Nemesis included) with KosinskiM. Aside from that, I'm not too sure either, though I can't imagine it being space limitations.
#29
Posted 28 November 2013 - 03:44 PM
The main issue in ousting Nemesis in favor of Kosinski (or Moduled Kosinski, or Comper, or a putative Moduled Comper) is simple -- RAM. As is, Kosinski (et al) can only decompress to RAM, while Nemesis can decompress directly to VRAM. Moduled Kosinski would be a viable alternative if you could get a buffer with $1000 bytes in main RAM -- or you use a modified encoder, such as mine, which allows you to specify a smaller buffer size for KosM, with a corresponding penalty in compression ratio.
It is entirely possible to free $1000 bytes in main RAM for KosM in S1 and S2 -- S2 being easier because it uses a z80 driver, which frees up a lot of RAM right there -- but you need to juggle around an awful lot of addresses to get the full $1000.
In theory, since it uses words internally, a modified Comper decoder could be written using VRAM writes and VRAM copies, thus being able to decompress directly to VRAM. I am not sure if a VRAM copy would work with Kosinski or not -- I need to test how VRAM copies would do what is required when copying an overlapping byte stream to be sure. I am not sure how fast this would be, but it would probably be a lot slower. Which might be OK for a PLC-like mechanism, I guess.
#30
Posted 28 November 2013 - 09:43 PM
Kos_decomp_buffer = ramaddr( $FFFFD000 ) ; $1000 bytes ; each module in a KosM archive is decompressed here and then DMAed to VRAM
Ah... you mean this. Pulled it from S3K. Ok. I think I understand completely now. This greatly helps me with my hack. Thanks guys!
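To get a feel for what the buffer size means in practice, here is a small Python sketch of the arithmetic. It assumes that every module except the last decompresses to exactly one buffer's worth of data; the function name and the example file size are hypothetical.

```python
MODULE_SIZE = 0x1000  # bytes; size of Kos_decomp_buffer in the line above
TILE_SIZE = 32        # bytes per 8x8 tile at 4 bits per pixel


def module_sizes(total_bytes: int, module_size: int = MODULE_SIZE) -> list[int]:
    """Uncompressed size of each module, assuming all but the last fill the buffer."""
    full, rest = divmod(total_bytes, module_size)
    return [module_size] * full + ([rest] if rest else [])


tiles_per_module = MODULE_SIZE // TILE_SIZE

# Hypothetical art file of $2800 bytes (320 tiles).
default_split = module_sizes(0x2800)
smaller_split = module_sizes(0x2800, 0x800)
print(tiles_per_module, default_split, len(smaller_split))
```

A smaller buffer means more modules, and so more decompress-then-DMA rounds, for the same art, on top of the compression ratio penalty flamewing mentions.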