Sik, on Nov 4 2007, 06:50 AM, said:
Taking advantage of the bump: would you like to keep calling it Kosinski or not? Seriously. After examining Allegro's packfiles, the Kosinski format turned out to be just a somewhat improved version of LZSS (improved by the fact that small offsets require fewer bytes). I'm not kidding. Go and check the LZSS file in the Allegro sources and read the comment at the start. It describes how the packfile compression works. It's very similar.
LZ77 is a method/theory of compression. LZSS is a painfully obvious addition to LZ77 compression, which barely warrants mentioning IMO.
Kosinski is an implementation of LZ77 compression. There are many others, and they are all similar. PRS compression is also LZ77 based. LZ77 is one of the most popular compression methods. In fact, of the 8 or so Sonic-related compression formats I've worked on, I think 6 of them have been LZ77 based. Being based on the same compression theory means they are going to have a lot in common. They are unique implementations however, and there are a lot of implementation details which are not defined by the compression method, such as how the bit tags are embedded, how the offset/count pairs are specified, additional offset/count formats and how they are indicated, the number of bits for the copy count vs the offset in each pair, how the end of file is marked, etc. There's also a lot of careful measurement and testing that goes into choosing precisely how the offset/count pairs are balanced to produce the best compression ratios, and the kind of data you are compressing affects the choice greatly.
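To make those implementation details concrete, here is a rough sketch of a toy LZ77/LZSS-style format in Python. To be clear, this is NOT Kosinski, PRS, or Allegro's packfile format; every choice in it is an assumption made up for illustration: flag bytes with 8 tags each (least significant bit first, 1 = literal), 16-bit big-endian offset/count pairs, a minimum match of 3, and an offset field of 0 as the end marker. The `offset_bits` parameter is the balance between offset and count mentioned above.

```python
MIN_MATCH = 3  # assumed: shorter matches are cheaper to store as literals


def compress(data: bytes, offset_bits: int = 12) -> bytes:
    """Greedy encoder for the toy format (slow brute-force match search)."""
    count_bits = 16 - offset_bits
    max_offset = (1 << offset_bits) - 1          # offset 0 is reserved
    max_match = MIN_MATCH + (1 << count_bits) - 1
    tokens = []                                  # (is_literal, payload)
    pos = 0
    while pos < len(data):
        best_len, best_off = 0, 0
        for off in range(1, min(pos, max_offset) + 1):
            length = 0
            while (length < max_match and pos + length < len(data)
                   and data[pos + length - off] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, off
        if best_len >= MIN_MATCH:
            word = (best_off << count_bits) | (best_len - MIN_MATCH)
            tokens.append((0, word.to_bytes(2, "big")))
            pos += best_len
        else:
            tokens.append((1, data[pos:pos + 1]))
            pos += 1
    tokens.append((0, b"\x00\x00"))              # end marker: offset field 0

    out = bytearray()
    for i in range(0, len(tokens), 8):           # one flag byte per 8 tokens
        group = tokens[i:i + 8]
        flags = 0
        for bit, (is_literal, _) in enumerate(group):
            flags |= is_literal << bit
        out.append(flags)
        for _, payload in group:
            out += payload
    return bytes(out)


def decompress(blob: bytes, offset_bits: int = 12) -> bytes:
    """Decoder; must be given the same offset_bits the encoder used."""
    count_bits = 16 - offset_bits
    out = bytearray()
    i = 0
    while True:
        flags = blob[i]
        i += 1
        for bit in range(8):
            if (flags >> bit) & 1:               # literal byte
                out.append(blob[i])
                i += 1
            else:                                # offset/count pair
                word = int.from_bytes(blob[i:i + 2], "big")
                i += 2
                offset = word >> count_bits
                if offset == 0:
                    return bytes(out)
                count = (word & ((1 << count_bits) - 1)) + MIN_MATCH
                for _ in range(count):           # byte by byte: runs may overlap
                    out.append(out[-offset])
```

Both functions implement the same LZ77 idea for any `offset_bits`, but a stream written with `offset_bits=12` is unreadable with `offset_bits=8`. Change the tag order, the pair layout, or the end marker and you get yet another incompatible format built on the same method.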
When you get down to it, there are only a handful of actual methods for lossless data compression. The high compression formats we use today such as zip, rar, 7z, etc. all rely on the same basic compression methods. These more advanced "superformats" simply add header information which allows them to choose between a variety of methods and select the best, or even combine several methods, to achieve the best compression ratios throughout a file. I could write a paper called "Uber Compression" which defines this compression method in a generic way. It wouldn't do away with the need to keep the distinction between rar and zip however. We name things by their implementation. Theories and methods are only useful to academics.
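The "header picks the method" idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how zip, rar, or 7z actually lay out their data: the one-byte tag and the `METHODS` table are invented here, and the codecs are simply the ones in Python's standard library.

```python
import bz2
import lzma
import zlib

# Hypothetical method table: tag byte -> (encoder, decoder).
METHODS = {
    0: (lambda d: d, lambda d: d),        # stored, no compression
    1: (zlib.compress, zlib.decompress),  # DEFLATE (LZ77 + Huffman)
    2: (bz2.compress, bz2.decompress),    # BWT based
    3: (lzma.compress, lzma.decompress),  # LZMA
}


def pack(data: bytes) -> bytes:
    """Try every method and keep the smallest result, prefixed by its tag."""
    tag, blob = min(
        ((tag, enc(data)) for tag, (enc, _) in METHODS.items()),
        key=lambda pair: len(pair[1]),
    )
    return bytes([tag]) + blob


def unpack(blob: bytes) -> bytes:
    """Read the tag byte and dispatch to the matching decoder."""
    _, dec = METHODS[blob[0]]
    return dec(blob[1:])
```

The wrapper adds nothing new to compression theory, yet anything written by `pack` can only be read by something that knows this exact tag layout, which is the point about naming things by their implementation.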
This post has been edited by Nemesis: 04 November 2007 - 09:28 AM