So a while ago I made some SPA maps for competition purposes; now I have made more for 8-bit Sonic 2, Chaos and Triple Trouble. In the process I had to figure out the encoding for the maps, blocks, tiles, and palettes, so here it is. If there are other things people care about regarding the games, there's a good chance I came across them as well. My tools included the Emukon/Mekaw debugging emulators, an SMS disassembler I got from SMSPower (use it with caution, as it left out some important sections of code for me), a hex editor, and this document on using the hardware. My research was directed first at cracking STT (Triple Trouble) and then extending that knowledge to Chaos (SC) and 8-bit Sonic 2 (S2), since they're mostly the same engine; the main points of difference were hardcoded addresses and 2/Chaos's map compression being somewhat inferior. Also, for what it's worth, I'm under no illusion that many other people will care about this tome I'm about to write - I'm writing it largely for my own future reference, and thought I might as well dress it up for general consumption. :P All numbers cited are hex (apart from pixel dimensions and color counts, which are decimal), and the terms "MS" and "LS" refer to most significant and least significant.
Here are the map images and PHP image-generating source. Rename the source to sttrom.php, edit the first set of variables to have working ROM filenames, put it in a live directory and it will work. I haven't made it functional on TSC's server to avoid stressing it out. If you just want a map, call sttrom.php?level=#, where the # is the level offset indicated in the filename of each map .png. Add &tiles=1 to get an output of the level's tiles, &blocks=1 to see the map block components, &gray=1 to turn off palettes, and/or &game=chaos or &game=s2gg to switch to Sonic Chaos or 2 instead of Triple Trouble. I assume Game Gear ROMs, so don't try sticking in an SMS Chaos ROM. There are some minor map differences, so I might adapt the code to it at some later point if it's not too much hassle.
I'm not sure to what degree this is standard, since SMS cartridges use their own bankswitch methods, but the Sonic cartridges map the first 8000 bytes of the ROM directly to addresses 0000-7FFF and use 8000-BFFF as a bankswitched segment, which is selected by writing the bank number to FFFF. As such, any data at a ROM offset of 8000 or above will have the actual ingame address of (offset) % 4000 + 8000, i.e. 4135F -> 935F in bank 10 (which is highly annoying when trying to find specific references).
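The address conversion above can be sketched in a few lines of Python. The function names are made up for illustration, and the bank number is assumed to be simply the ROM offset divided by 4000.

```python
BANK_SIZE = 0x4000   # size of one switchable bank
SLOT_BASE = 0x8000   # ingame address where the switchable bank appears

def rom_to_ingame(offset):
    """Return (bank, ingame address) for a ROM file offset."""
    if offset < SLOT_BASE:
        return None, offset  # fixed region: address equals offset
    return offset // BANK_SIZE, offset % BANK_SIZE + SLOT_BASE

def ingame_to_rom(bank, address):
    """Return the ROM file offset for a bank number and ingame address."""
    if address < SLOT_BASE:
        return address
    return bank * BANK_SIZE + (address - SLOT_BASE)

bank, address = rom_to_ingame(0x4135F)
print(f"bank {bank:02X}, address {address:04X}")
```

Going the other way needs the bank number as well, since the same ingame address exists in every bank.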
A summary: graphics are composed of 8x8 tiles, which are assembled into 8x16 sprites (not covered yet) or into 32x32 pixel blocks of 4x4 tiles. A rectangular array of blocks is what composes the map. There are also two 16 color palettes, one for blocks only and one for sprites and blocks.
As for the things I haven't done yet: sprites in Chaos/STT are represented only by their IDs, which include all badniks, most monitors and some rings. S2 objects are missing entirely. The water level is also absent; it's just a palette switch after a particular raster line.
The first point of interest in the ROM is the 8x8 tile methodology. The set of tile addresses is an index hardcoded in the ROM [STT:6B678|SC:7CEF|S2:7903], at which a set of seven bytes for each level gives:
Tile Bank
??
??
LS Tile Address
MS Tile Address
??
??
Skip to that tile address, and you get
??
??
LS Number of Tiles
MS Number of Tiles
LS Guide Address (offset from Tile Address)
MS Guide Address (offset from Tile Address)
henceforth - Tile Data
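A rough Python sketch of reading those two structures follows. It assumes the seven-byte entries sit back to back and are indexed by level number, that the tile address is an ingame address in the 8000-BFFF range, and that the ?? bytes can be skipped; the function names are hypothetical.

```python
import struct

def read_tile_index_entry(rom, index_base, level):
    """Return (bank, ingame tile address) from a level's 7-byte index entry."""
    entry = rom[index_base + 7 * level : index_base + 7 * level + 7]
    bank = entry[0]
    (address,) = struct.unpack_from("<H", entry, 3)  # LS byte first
    return bank, address

def read_tile_header(rom, bank, address):
    """Parse the 6-byte header found at the tile address."""
    tile_offset = bank * 0x4000 + (address - 0x8000)  # assumes banked address
    tile_count, guide_rel = struct.unpack_from("<HH", rom, tile_offset + 2)
    return {
        "tile_count": tile_count,
        "guide_start": tile_offset + guide_rel,  # offset from the tile address
        "data_start": tile_offset + 6,           # tile data follows the header
    }
```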
For some reason STT checks the game mode byte, and if it sees Time Attack (TA) mode it overrides the usual zone/act tile selection and instead forces the address equivalent to level 29, whereas the TA mode level itself is level 30. I can't see why they made that special case behaviour instead of just correcting the address for level 30, but they did.
You need to work with both the tile data and the "guide" data at the same time - the "guide" is a set of 2-bit flags, 4 per byte, from LS to MS, that indicate what to do with the tile data, as follows:
0 - The tile is blank and uses no data
1 - The tile is a direct uncompressed copy of 20 bytes
2 - The tile is compressed
3 - The tile is compressed and an XOR scheme should be applied to it (see PHP source)
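Unpacking the guide can be sketched like this in Python; the function name is only for illustration.

```python
def guide_flags(guide_bytes, tile_count):
    """Return one 2-bit flag (0-3) per tile, lowest bit pair of each byte first."""
    flags = []
    for i in range(tile_count):
        byte = guide_bytes[i // 4]                  # four flags per guide byte
        flags.append((byte >> (2 * (i % 4))) & 0b11)
    return flags

# E4 is 11 10 01 00 in binary, so reading from LS to MS gives 0, 1, 2, 3.
print(guide_flags(bytes([0xE4]), 4))
```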
The compression is of variable length, and as such there's no way to jump to a specific tile in the ROM; they have to be decompressed in order. The compression is relatively simple: each compressed tile starts with a 20-bit bitmask (four bytes), in which each bit indicates whether to load a byte of data (1) or just push in 00 (0). Here's an example, in which * indicates a compressed 0. Note that it processes bitmask word pairs "backwards", as with addresses: within each mask byte the LS bit is used first, so the low nibble is handled before the high nibble.
Bitmask bytes as stored: D5 7E 65 57
Nibbles in the order used: 5 D E 7 5 6 7 5
Resulting tile data: FF**FF**|C0**C03F|**3F3FC0|3FC0FF**|FC**FF**|**3FFF**|3FC0FF**|0F**FF**
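Here is a minimal Python sketch of that bitmask expansion for flag 2 tiles, checked against the example above. The function name is hypothetical, and the XOR pass for flag 3 tiles is not covered since it is only described in the PHP source.

```python
def decompress_tile(stream, pos=0):
    """Expand one compressed tile; return (32 tile bytes, next read position)."""
    # Four mask bytes read LS byte first, then tested from the LS bit upward.
    mask = int.from_bytes(stream[pos:pos + 4], "little")
    pos += 4
    out = bytearray()
    for bit in range(32):
        if (mask >> bit) & 1:
            out.append(stream[pos])  # 1: take the next data byte
            pos += 1
        else:
            out.append(0x00)         # 0: compressed zero
    return bytes(out), pos

example = bytes.fromhex("D57E6557" "FFFFC0C03F3F3FC03FC0FFFCFF3FFF3FC0FF0FFF")
tile, next_pos = decompress_tile(example)
print(tile.hex().upper())
```

The returned position is where the next tile's data begins, which is why tiles can only be decoded in order.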
The set of palette IDs is at [STT:3BF1C|SC:7F3C|S2:7BA1], and this is just a straight list of two bytes per level: the background palette and the object palette. The palettes themselves are at [STT:3B6FC|SC:3B653|S2:26E45], and adding the palette ID * 20 (the size of a palette) will take you to the one you're looking for.
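As a small Python sketch of that lookup, assuming both tables are addressed by plain ROM offset and that the list is indexed by level number (the function and parameter names are invented for the example):

```python
PALETTE_SIZE = 0x20  # bytes per palette

def level_palette_offsets(rom, id_table, palette_base, level):
    """Return the ROM offsets of a level's (background, object) palettes."""
    bg_id = rom[id_table + 2 * level]        # first byte: background palette ID
    obj_id = rom[id_table + 2 * level + 1]   # second byte: object palette ID
    return (palette_base + bg_id * PALETTE_SIZE,
            palette_base + obj_id * PALETTE_SIZE)
```

For STT this would be called with id_table=0x3BF1C and palette_base=0x3B6FC.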
The index of map data can be found at [STT:434F|SC:5139|S2:546A]. Add 2*zone bytes to that and you'll find the zone index; go to that address, add 2*act and you get the address of your level's map data. For the curious, here's what IIRC is a complete set of STT and somewhat less of a complete set from SC. The ??s are almost certainly special stages and ending sequences.
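The two-step lookup might look like this in Python. It assumes the pointers are 2 bytes, LS byte first, and point into the fixed 0000-7FFF region so they can be used directly as ROM offsets; the function name is hypothetical.

```python
import struct

def level_record_address(rom, index_base, zone, act):
    """Follow the zone pointer, then the act pointer, to a level's map record."""
    (zone_index,) = struct.unpack_from("<H", rom, index_base + 2 * zone)
    (record,) = struct.unpack_from("<H", rom, zone_index + 2 * act)
    return record
```

For STT, index_base would be 0x434F.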
STT:
      [block][ map ] wdth
TH1 - 1100 8012 C080 A800 58FF 9804 0000 0800 3014 3802 334D
TH2 - 1100 8012 B489 A800 58FF 9804 0000 0800 3014 3802 334D
TH3 - 1100 801A 49B0 4000 C0FF C001 0000 0800 3007 3802 034E
SP1 - 1180 9B12 5A92 A800 58FF 9804 0000 0800 3014 3802 334D
SP2 - 1180 9B12 159C 8000 80FF 8003 0000 0800 300F 3803 634D
SP3 - 1180 9B12 159C 8000 80FF 8003 0000 0800 300F 3803 634D
MJ1 - 1B00 8012 AAA5 8000 80FF 8003 0000 0800 300F 3803 634D
MJ2 - 1B00 8012 45AF 8000 80FF 8003 0000 0800 300F 3803 634D
MJ3 - 1B00 8014 80B2 4000 C0FF C001 0000 0800 3007 3803 034E
RW1 - 1400 8014 A5B5 8000 80FF 8003 0000 0800 300F 3803 634D
RW2 - 1400 8015 0080 8000 80FF 8003 0000 0800 300F 3803 634D
RW3 - 1400 8012 23B9 6000 A0FF A002 0000 0800 300B 3804 A34D
TP1 - 14E0 9915 6686 8000 80FF 8003 0000 0800 300F 3803 634D
TP2 - 14E0 9915 3490 3000 D0FF 5001 0000 0800 3005 3809 934E
TP3 - 14E0 9915 0B9A 4000 C0FF C001 0000 0800 3007 3801 034E
AD1 - 1B00 9C15 109B 8000 80FF 8003 0000 0800 300F 3803 634D
AD2 - 1B00 9C15 7AA5 6000 A0FF A002 0000 0800 300B 3804 A34D
AD3 - 1B00 9C15 E3AF A800 58FF 9804 0000 0800 3014 3802 334D
SS1 - 1940 8315 68B5 A800 58FF 9804 0000 0800 3014 3802 334D
SS3 - 1940 8315 3BB9 3000 D0FF 5001 0000 0800 3005 3809 934E
SS5 - 1940 831A ABB2 8000 80FF 8003 0000 0800 300F 3803 634D
Intro-1100 8012 0080 4000 C0FF C001 0000 0800 3007 3800 034E
End - 1100 8016 2AB1 A800 58FF 9804 0000 0800 3014 3802 334D
SC:
      [block][ map ] wdth
TH1 - 1100 8012 0080 8000 80FF 8003 0000 0800 300F 1003 4E5B
TH2 - 1100 8012 A98B 8000 80FF 8003 0000 0800 300F 1003 4E5B
TH3 - 1100 8012 2C98 5000 B0FF 3002 0000 0800 3009 1001 1E5C
GP1 - 1140 9612 3B9C A000 60FF 6004 0000 0800 3013 1002 1A5B
GP2 - 1140 9612 16AA 8000 80FF 8003 0000 0800 300F 1003 4E5B
GP3 - 1140 9612 ADB8 5000 B0FF 3002 0000 0800 3009 1001 1E5C
SE1 - 1140 AA15 0080 8000 80FF 8003 0000 0800 300F 1003 4E5B
SE2 - 1140 AA15 C88C 8000 80FF 8003 0000 0800 300F 1003 4E5B
SE3 - 1140 AA14 60B4 8000 80FF 8003 0000 0800 300F 1003 4E5B
MGH1- 1400 8015 D499 8000 80FF 8003 0000 0800 300F 1003 4E5B
MGH2- 1400 8015 B8A6 8000 80FF 8003 0000 0800 300F 1003 4E5B
MGH3- 1400 8015 FBB2 7800 88FF 4803 0000 0800 300E 1002 8E5B
AP1 - 1420 9317 0080 A800 58FF 9804 0000 0800 3014 1002 EA5A
AP2 - 1420 9317 C187 8000 80FF 8003 0000 0800 300F 1003 4E5B
AP3 - 1420 9315 A6B8 5000 B0FF 3002 0000 0800 3009 1001 1E5C
EE1 - 14E0 A417 3A8F 8000 80FF 8003 0000 0800 300F 1003 4E5B
EE2 - 14E0 A417 739B 8000 80FF 8003 0000 0800 300F 1003 4E5B
EE3 - 14E0 A417 53A8 7000 90FF 1003 0000 0800 300D 1003 D45B
The block/map addresses are of the form
Bank
LS Address
MS Address
Note that Sunset Park 2-3 have the same map, and if you've played the game, you'll recall that there's not actually a transition between them unless you die.
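To make the record layout concrete, here is a hedged Python sketch that pulls the block address, map address and width out of one row of the table. Reading the width as a 2-byte value (LS byte first) right after the map address is my interpretation of the "wdth" column header; the remaining bytes are left alone, and the function name is invented.

```python
def parse_level_record(record):
    """Decode the block address, map address and width from a level record."""
    def to_rom(bank, addr):
        return bank * 0x4000 + (addr - 0x8000)  # banked address to ROM offset

    block_bank, map_bank = record[0], record[3]
    block_addr = int.from_bytes(record[1:3], "little")
    map_addr = int.from_bytes(record[4:6], "little")
    return {
        "block_offset": to_rom(block_bank, block_addr),
        "map_offset": to_rom(map_bank, map_addr),
        "width": int.from_bytes(record[6:8], "little"),  # assumed 2-byte width
    }

# The STT TH1 row from the table.
th1 = bytes.fromhex("1100 8012 C080 A800 58FF 9804 0000 0800 3014 3802 334D")
info = parse_level_record(th1)
print({key: hex(value) for key, value in info.items()})
```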
Hit up that block address and you find a 200 byte collection of 2 byte offsets from the "main" block address. These 100 blocks correspond to the block numbers that will be used in the map; the first block is 00, second 01, etc. Some of the blocks will be identical offsets, which I assume is for the purpose of tagging them with additional effects such as breakability or hidden springage. At the offset is a collection of 20 bytes which form the actual block; each 2-byte segment defines a tile according to the SMS's specification found in the document I mentioned at the beginning, including the tile ID, palette, v/h flipping and bg/fg. The tile order is the same as printed English, left to right, top to bottom.
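A Python sketch of reading one block follows. The bit layout used for each 2-byte tile entry (bits 0-8 tile ID, bit 9 horizontal flip, bit 10 vertical flip, bit 11 palette, bit 12 foreground priority) is the usual SMS name table format as I understand it from the hardware docs, not something verified against these games, and the function name is hypothetical.

```python
def read_block(rom, block_base, block_id):
    """Return the 16 tile entries of one block, left to right, top to bottom."""
    entry = block_base + 2 * block_id
    offset = int.from_bytes(rom[entry:entry + 2], "little")
    start = block_base + offset  # offsets are relative to the block address
    tiles = []
    for i in range(16):
        word = int.from_bytes(rom[start + 2 * i : start + 2 * i + 2], "little")
        tiles.append({
            "tile": word & 0x1FF,             # assumed: bits 0-8
            "hflip": bool(word & 0x200),      # assumed: bit 9
            "vflip": bool(word & 0x400),      # assumed: bit 10
            "palette": (word >> 11) & 1,      # assumed: bit 11
            "foreground": bool(word & 0x1000),  # assumed: bit 12
        })
    return tiles
```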
With the map compression comes the one substantive difference I found between the game engines. Once uncompressed, all the map schemes are identical - left-to-right, top-to-bottom, like the printed word again, across the whole map, with the width column in the above table of hex specifying how many blocks wide the map is. The SC map compression is pretty basic - just load all the data into RAM unless you come across an FF, which indicates that the next two bytes are a map piece and how many times to copy that piece. FFFF00 indicates that the map is over. Sonic 2's map loading is substantially similar to Chaos's, except that it uses FD to signal an identical string instead of FF, in identical strings the length comes before the byte to copy instead of after, and there's no indicator that the map is done - it just absorbs 1000 bytes regardless of whether they're garbage.
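Both run-length schemes can be sketched in Python as below. The function names are made up, and for Sonic 2 I'm assuming the 1000-byte limit applies to the uncompressed output rather than to the bytes read.

```python
def decompress_sc(data):
    """Sonic Chaos: FF <byte> <count> is a run; FF FF 00 ends the map."""
    out = bytearray()
    pos = 0
    while True:
        if data[pos] != 0xFF:
            out.append(data[pos])  # plain byte, copied as-is
            pos += 1
            continue
        value, count = data[pos + 1], data[pos + 2]
        pos += 3
        if value == 0xFF and count == 0:
            return bytes(out)      # FFFF00 terminator
        out.extend(bytes([value]) * count)

def decompress_s2(data, size=0x1000):
    """Sonic 2: FD <count> <byte> is a run; stops after `size` output bytes."""
    out = bytearray()
    pos = 0
    while len(out) < size:
        if data[pos] == 0xFD:
            count, value = data[pos + 1], data[pos + 2]
            out.extend(bytes([value]) * count)
            pos += 3
        else:
            out.append(data[pos])
            pos += 1
    return bytes(out[:size])       # trim a run that overshoots the limit

print(decompress_sc(bytes.fromhex("0102FF050307FFFF00")).hex())
```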
STT map compression is a tad more complicated; I just found the routine in the code and basically copied it word for word. It takes in an eight-bit bitmask whenever its current bitmask runs dry and tests the bits from LS to MS; for each bit it tests, if that bit is 1 it copies in a byte of data, and if 0 that indicates that there is compressed data. The compression scheme works by copying data from a prior point in the map that has already been loaded. It reads in two bytes for directions: the MS half of the second byte + 3 indicates how many bytes to copy, and is then papered over with F so that the two bytes together (LS byte first) can define the offset from the current position (which will always be backwards and at most 1000 because of the F and two's complement). If the two bytes are 0000, then the map is done.
Here's an example, from the first level of STT:
F1FE FFFF FFFF FFFF FEFE 0D05
So F1 is the bitmask control byte, which put backwards in binary according to how the game uses it would be 10001111. OK, so we're loading in one byte, doing three compressed sequences, then loading in four more.
So take in that FE. Now for the compression, we read in FFFF. That third F, the MS half of the second byte, indicates we're doing F+3 or 12 bytes of copying. It gets papered over with an F (no effect in this instance), so our offset is FFFF, or back one byte.
What this means is that it will copy the FE over one byte, then the "read from" address will move over into that FE that was just copied and the "copy to" address to the next empty space, and so repeating the process another 11 times we'll get 11 extra FEs. This behavioral quirk only happens when the offset is FFFF; otherwise you'd be copying in, say, the previous 2 bytes over and over if the offset was FEFF, the previous 3 if FDFF, or just a plain block if it was any offset larger than the number of bytes we're copying.
According to the next two bits in the bitmask, we're doing two more FFFF compression sequences, so that gives us 24 more FEs. The last four bits in the bitmask give us four straight bytes, so we copy in FEFE0D05, and then we load in another bitmask and start the process again. The result so far is 39 FEs in a row (1 + 12 + 24 + 2), with 0D05 at the end.
Anyway, if you're interested in this compression scheme you can check out the compressed source data at the map addresses listed in the big block of hex above, and you can see the uncompressed output map in Emukon or Mekaw by loading up the appropriate level and pointing the memory viewer to C001, which is where it starts. I thought it was pretty nifty.
So my map program just makes two images, assigns them 32-color palettes, reads all the 8x8 tiles in and creates all the 32x32 blocks pixel-by-pixel in the block image, copies them into the map image according to the map, and calls it a day.
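The block-to-map copying step can be sketched without any image library by expanding the map into a grid of tile entries. This is only an outline of the idea in Python, not a port of the PHP; the names are invented, and blocks is assumed to map a block number to its 16 tile entries in left-to-right, top-to-bottom order.

```python
def tile_grid(map_bytes, width, blocks):
    """Expand a map of block numbers into rows of tile entries (4x4 per block)."""
    height = len(map_bytes) // width
    grid = [[None] * (width * 4) for _ in range(height * 4)]
    for i in range(width * height):
        bx, by = i % width, i // width        # block position within the map
        for t, entry in enumerate(blocks[map_bytes[i]]):
            grid[by * 4 + t // 4][bx * 4 + t % 4] = entry
    return grid

blocks = {0: list(range(16)), 1: list(range(100, 116))}
grid = tile_grid(bytes([0, 1]), 2, blocks)
print(grid[0])
```

Each grid cell then stands for one 8x8 tile to draw at pixel position (column * 8, row * 8).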
If there are any other inquiries I'll be happy to take them.
This post has been edited by Rolken: 02 March 2008 - 12:33 AM



