
Sonic & Knuckles Collection C port

Discussion in 'Engineering & Reverse Engineering' started by BenoitRen, Jan 11, 2024.

  1. Sonic 1 and 2 possibly because of music issues?
     
  2. MainMemory

    MainMemory

    Has-Been Modder Tech Member
    4,819
    408
    63
    Myself
    At least part of the inconsistency in the ports themselves can be ascribed to different teams working on them. Sonic CD, as I recall from a thread by one of the developers, was done by a team at Intel to show off their new multimedia library DINO (which got superseded by DirectX only a year later). Sonic & Knuckles Collection was worked on by the Japanese PC software company H.I.C. And of course Sonic 3D Blast and Sonic R were done by Traveller's Tales themselves.
     
  3. Bobblen

    Bobblen

    Member
    470
    242
    43
    Sorry I'm completely derailing the thread but I would be completely down with a 'Sonic 1 & 2 Collection' built in the same way as SKC with its own funky MIDI remixes of the music :-D

    But anyway, Sonic & Knuckles Collection, C port. Way above my head technically, but it sounds great. Maybe you'll unearth another debug feature nobody noticed before like with CD.
     
  4. BenoitRen

    BenoitRen

    Tech Member
    956
    580
    93
    I haven't found any unknown debug features so far, and I'm almost done with decompiling the initialisation.

    The only odd thing I've found that I suspect MainMemory has already made use of is that there is logic to select between BGM and MIDI. Maybe they had planned an option to use recorded Mega Drive music at one point for computers that were strong enough.
    There were two teams involved, actually. There was Intel's team, which worked on DINO, and there was a Japanese (Sega?) team that converted the 68000 ASM to C. An employee from Intel's team would regularly fly over to Japan to coordinate.
     
  5. MainMemory

    MainMemory

    Has-Been Modder Tech Member
    4,819
    408
    63
    Myself
    I haven't made any use of the BGM.DLL feature, I simply inject the mod loader in place of MIDIOUT.DLL and have it return a custom interface for music that supports multiple formats (SMPS, MIDI, and streamed audio).
     
  6. BenoitRen

    BenoitRen

    Tech Member
    956
    580
    93
    I vaguely remember posting that they don't have the same size. The full version is 5.25 MB, while the trial version is 5.12 MB. That's why I use the retail version with Ghidra.

    Earlier today I remembered the claim that the trial version could become the retail version through an INI flag. While I did finally edit the INI file in the right location to fix the game's speed, setting the GameMode key to 0 didn't unlock the full game.

    My curiosity was piqued, so I disassembled the trial version and made my way to where the INI file is read. There I saw that the code for reading the GameMode key was removed. Instead, the values for demo mode and trial mode are hardcoded.

    I used Ghidra to patch the ASM and exported the patched EXE. It worked! The game thinks it's a retail version now. It even saves.

    But as this is a trial version, it would make sense that most data past the end of the demo wouldn't be shipped. What will happen once I clear Angel Island act 2?

    upload_2024-7-27_0-20-22.png
    "'sup" - Knuckles

    The game continued like normal. Nothing seems to be missing (so far). Makes one wonder what the reason for the EXE size difference is.

    EDIT: I beat the boss of Hydrocity act 1, and the game crashed.
    EDIT2: I guess the only acts that are playable are those that are shown in attract mode. Through level select, I could also play: Marble Garden act 1, Flying Battery act 1, and Sandopolis act 1.
     
    Last edited: Jul 27, 2024
  7. BenoitRen

    BenoitRen

    Tech Member
    956
    580
    93
    The C++ wrapper's initialisation code has been decompiled. Find it at the project's new Git repo.

    I was initially nervous about the code being C++ instead of C, as the former tends to be more labour intensive to decompile. However, my worries were unfounded. The only feature from the language that's used is encapsulation. There's no inheritance, overriding, or templates. The classes properly segment parts of the code, and provide automatic clean-up of resources.

    The only real unknown from the initialisation is "MidiOut.dll". As it's a custom DLL, I have no idea what the interface it returns from the function GetMidiInterface looks like.

    Also, suggestions for better names are welcome. For example, I know there are two initSound functions in there.
     
  8. MainMemory

    MainMemory

    Has-Been Modder Tech Member
    4,819
    408
    63
    Myself
    This is the basic layout of the class:
    Code (Text):
    class MidiInterface
    {
    public:
        // hwnd = game window
        virtual BOOL init(HWND hwnd);
        // id = song to be played + 1 (well, +1 compared to the sound test id,
        //      it's the ID of the song in the MIDIOUT.DLL's resources)
        // bgmmode = 0 for FM synth, 1 for General MIDI
        virtual BOOL loadSong(short id, unsigned int bgmmode);
        virtual BOOL playSong();
        virtual BOOL stopSong();
        virtual BOOL pauseSong();
        virtual BOOL resumeSong();
        // pct = percentage of delay between beats the song should be set to.
        //       lower = faster tempo
        virtual BOOL setTempo(unsigned int pct);
    };
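For a plain C port, one way to model the class above is a table of function pointers. This is only a minimal sketch: `MidiInterfaceC`, `BOOL_T`, `HWND_T`, the `stub*` functions and `playFromSoundTestId` are hypothetical names, the backend merely records what was requested, and the binary layout and calling convention of the real object returned by GetMidiInterface are not modelled here.

```c
#include <stddef.h>

/* Portable stand-ins for the Win32 types (assumptions for this sketch). */
typedef int   BOOL_T;   /* stands in for BOOL */
typedef void *HWND_T;   /* stands in for HWND */

/* Hypothetical C view of the interface: one function pointer per method. */
typedef struct MidiInterfaceC {
  BOOL_T (*init)(HWND_T hwnd);
  BOOL_T (*loadSong)(short id, unsigned int bgmmode);
  BOOL_T (*playSong)(void);
  BOOL_T (*stopSong)(void);
  BOOL_T (*pauseSong)(void);
  BOOL_T (*resumeSong)(void);
  BOOL_T (*setTempo)(unsigned int pct);
} MidiInterfaceC;

/* State of the stub backend; it plays nothing, it only remembers requests. */
static short        g_loadedSong = 0;
static unsigned int g_bgmMode    = 0;
static int          g_playing    = 0;
static unsigned int g_tempoPct   = 100;

static BOOL_T stubInit(HWND_T hwnd) { (void)hwnd; return 1; }
static BOOL_T stubLoadSong(short id, unsigned int bgmmode) {
  g_loadedSong = id;
  g_bgmMode = bgmmode;
  return 1;
}
static BOOL_T stubPlaySong(void)   { g_playing = 1; return 1; }
static BOOL_T stubStopSong(void)   { g_playing = 0; return 1; }
static BOOL_T stubPauseSong(void)  { g_playing = 0; return 1; }
static BOOL_T stubResumeSong(void) { g_playing = 1; return 1; }
static BOOL_T stubSetTempo(unsigned int pct) { g_tempoPct = pct; return 1; }

static const MidiInterfaceC g_stubMidi = {
  stubInit, stubLoadSong, stubPlaySong, stubStopSong,
  stubPauseSong, stubResumeSong, stubSetTempo
};

const MidiInterfaceC *getStubMidiInterface(void) { return &g_stubMidi; }

/* The song ID is the sound test ID + 1, per the comment on loadSong. */
BOOL_T playFromSoundTestId(const MidiInterfaceC *midi, short soundTestId,
                           unsigned int bgmmode) {
  if (!midi->loadSong((short)(soundTestId + 1), bgmmode)) return 0;
  return midi->playSong();
}
```

Swapping the stub table for another backend would leave callers such as `playFromSoundTestId` unchanged, which is the main appeal of this layout.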
     
  9. BenoitRen

    BenoitRen

    Tech Member
    956
    580
    93
    Next decompilation milestone has been pushed: the (main) window procedure (WndProc).

    I'm still nervous about naming, and some of the game's own naming doesn't make it easier.
    • The game's INI file can have a key named "GameMode" that determines whether the game is in retail, demo, or trial mode.
    • Another key is "SonicGameMode", which determines which of three games in the collection to start.
    Then there's another, internal-only, value. 0 means to start the game based on SonicGameMode. 1 starts the game in Special Stage Mode. 2 starts the game at the Level Select. I eventually settled on naming this "StartGameMode".
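As a small illustration of the internal value described above, it could be written as an enum. This is a sketch only: the numeric values come from the post, while the enumerator names and `describeStartGameMode` are hypothetical.

```c
/* Internal start-up selector. Only the values 0, 1 and 2 come from the
   post; the names are made up for this sketch. */
typedef enum StartGameMode {
  START_FROM_SONIC_GAME_MODE = 0, /* start the game chosen by SonicGameMode */
  START_SPECIAL_STAGE        = 1, /* Special Stage Mode */
  START_LEVEL_SELECT         = 2  /* Level Select */
} StartGameMode;

/* Returns a short human-readable label, handy for logging. */
const char *describeStartGameMode(StartGameMode mode) {
  switch (mode) {
    case START_FROM_SONIC_GAME_MODE: return "game chosen by SonicGameMode";
    case START_SPECIAL_STAGE:        return "Special Stage Mode";
    case START_LEVEL_SELECT:         return "Level Select";
  }
  return "unknown";
}
```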

    Not sure what I'm going to decompile next. There are the different dialog windows (Change Controls, Use JoyStick, Sound Test) that have their own procedures, but that's not very interesting.
     
  10. Bobblen

    Bobblen

    Member
    470
    242
    43
    Just to note that starting the game at the level select is what happens when you enable debug mode (DebugMode=1 in s3k.ini). It also enables the numpad keys (which map to Mega Drive A, B, C) as well as edit mode. Special Stage Mode can be selected from the menu bar.
     
  11. BenoitRen

    BenoitRen

    Tech Member
    956
    580
    93
    I had managed to extract resources from Windows executables before, but they were always in binary form. Then I found a paper on decompiling resources, and they recommended Resource Hacker. Great tool.

    Then I figured I could at least decompile the Sound Test dialog. So I did. Code has been pushed.

    Kind of weird that they used the variable for the Mega Drive CPU's d0 register to pass the ID of the sound effect to be played. They didn't do that for background music. As a consequence, they have to back up the current d0 value before the Sound Test dialog opens, then restore it once it's closed:
    Code (C):
    int openSoundTestDialog(HWND hWnd) {
      BOOL_00856e10 = FALSE;
      stopBgm();
      // The sound test passes the sound effect ID through d0, so save it first.
      ULONG mdRegisterD0Backup = g_mdRegisterD0;
      int ret = DialogBoxParam(g_hInstance, MAKEINTRESOURCE(IDD_SOUNDTEST_BASE + g_langMode), hWnd, (DLGPROC)soundTestDlgProc, 0);
      BOOL_00856e10 = TRUE;
      restartBgm();
      pauseBgm();
      g_mdRegisterD0 = mdRegisterD0Backup;

      return ret;
    }
     
  12. BenoitRen

    BenoitRen

    Tech Member
    956
    580
    93
    Aside from decompiling the dialogs, the only path forward seemed to be jumping over the great C++ wrapper/inline ASM divide to the games' start-up code, which makes me nervous. So I decompiled the other dialogs instead!

    The way the program's memory management seems to work is that the C++ wrapper reserves virtual chunks of memory at certain locations so the inline ASM can freely use those regions of memory. Those regions of memory being the Mega Drive's video RAM and its main RAM. It's certainly not using a pointer to index it!

    In porting the inline ASM to C, presumably I'd have to use those pointer indexes by casting them to structs that represent the Mega Drive's memory regions. Which means I'll have to adapt what I've ported so far.
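A minimal sketch of what such a representation could look like in C, assuming the standard Mega Drive sizes of 64 KiB for main RAM and 64 KiB for video RAM. `MdMemory`, `g_md`, `mdReadWord` and `mdWriteWord` are hypothetical names, not taken from the project. The accessors keep the 68000's big-endian byte order in the raw bytes; plain struct fields are the simpler option when the byte layout doesn't have to match the original.

```c
#include <stdint.h>

#define MD_RAM_SIZE  0x10000u  /* 64 KiB of main RAM */
#define MD_VRAM_SIZE 0x10000u  /* 64 KiB of video RAM */

/* Hypothetical container for the emulated memory regions. */
typedef struct MdMemory {
  uint8_t ram[MD_RAM_SIZE];
  uint8_t vram[MD_VRAM_SIZE];
} MdMemory;

static MdMemory g_md;

/* Reads a 16-bit word stored big-endian, as the 68000 would have stored it.
   A 16-bit offset can never leave a 64 KiB region; offset + 1 wraps to 0. */
uint16_t mdReadWord(const uint8_t *mem, uint16_t offset) {
  uint16_t next = (uint16_t)(offset + 1u);
  return (uint16_t)((mem[offset] << 8) | mem[next]);
}

/* Writes a 16-bit word in big-endian order. */
void mdWriteWord(uint8_t *mem, uint16_t offset, uint16_t value) {
  uint16_t next = (uint16_t)(offset + 1u);
  mem[offset] = (uint8_t)(value >> 8);
  mem[next]   = (uint8_t)(value & 0xFFu);
}
```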

    I think I'm going to concentrate on porting the Level Select entry point of the games, which is reached by enabling the debug features, as I imagine that'll take the least amount of conversion to get running. I don't want to make the same mistake again of working on a project for over a year and still have nothing to show.
     
  13. BenoitRen

    BenoitRen

    Tech Member
    956
    580
    93
    One of the first things the program does when starting the actual game is set the Mega Drive's region to "overseas". It does this by setting bit 7 of mdstatus (name leaked through Sonic Jam). I know Tails's name is dependent on this value, so would unsetting this bit actually work here? I patched the program to find out:

    upload_2024-8-5_20-22-20.png

    It does!

    Makes you wonder why they didn't set this value based on the region. It's not like they didn't know how. In Trial Mode, when you clear Angel Island act 2, a window pops up thanking you for playing the demo. The message that appears there is Japanese if your Windows's system language is Japanese. If it's not, the message is in English.
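The bit manipulation involved is small enough to sketch. The code below assumes only what is described above (bit 7 of mdstatus set means "overseas"); the helper names, including the one that picks the region from the system language, are hypothetical and not from the game.

```c
#include <stdint.h>

#define MD_STATUS_OVERSEAS 0x80u  /* bit 7 of mdstatus, per the post */

/* Sets bit 7: the game then behaves as on an overseas console. */
uint8_t mdSetOverseas(uint8_t mdstatus) {
  return (uint8_t)(mdstatus | MD_STATUS_OVERSEAS);
}

/* Clears bit 7: the game then behaves as on a Japanese console. */
uint8_t mdSetDomestic(uint8_t mdstatus) {
  return (uint8_t)(mdstatus & ~MD_STATUS_OVERSEAS);
}

int mdIsOverseas(uint8_t mdstatus) {
  return (mdstatus & MD_STATUS_OVERSEAS) != 0;
}

/* Hypothetical: what choosing the region from the system language could
   have looked like. The real program always sets the bit. */
uint8_t mdStatusForSystem(uint8_t mdstatus, int systemIsJapanese) {
  return systemIsJapanese ? mdSetDomestic(mdstatus) : mdSetOverseas(mdstatus);
}
```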
     
  14. BenoitRen

    BenoitRen

    Tech Member
    956
    580
    93
    After setting the virtual console's region, the program jumps to one of the start-up routines depending on the start-up mode and the chosen game. They do mostly the same things, most of which were already identified thanks to the labels MainMemory shared based on the old disassembly. But I quickly found out that most functions are stubbed.
    • Testing the checksum: stubbed.
    • The variable cartridge (name also leaked through Sonic Jam) is set. This is set to 1 when playing Sonic & Knuckles alone.
    • Routine to count the number of cycles between VBlanks (presumably to know if it's running on a PAL console): stubbed.
    • Initialising the sound driver: stubbed. Everything related to sound is handled by Windows-facing code.
    • Initialising the controllers: stubbed. Also handled by Windows-facing code.
    • Loading the savedata: this code has mostly survived. But instead of interfacing with SRAM, it interfaces with Windows-facing code that loads and saves the savefile.
    • Sets the gamemode value, which decides the next state of the game's main state machine.
    • Enters the game loop.
    The rest of this post will be dedicated to the code loading the savedata. Because, while the code did mostly survive, it was also added to.

    I can post walls of text, but it'll be easier with some pseudo-code. Here's what it looks like:
    Code (Text):
    routine load_savedata:
      if sonic 3 mode then goto load_sonic3_savedata
      if cartridge == 1 then exit (Sonic & Knuckles)
      read competition mode savedata
      if data is corrupt then copy default competition mode savedata

      if sonic 3 mode
        read sonic 3 savedata
        if data is corrupt then copy default sonic 3 savedata

      if not sonic 3 mode
        read sonic 3 & knuckles savedata
        if no active save slot found then copy default sonic 3 & knuckles savedata
    What's this? Why is there code for Sonic 3 in there if early on it leaves for another routine in that mode? That's right, the Sonic 3-exclusive code in this routine is dead code.

    How about the Sonic 3-exclusive routine?
    Code (Text):
    routine load_sonic3_savedata:
      read competition mode savedata
      if data is corrupt then copy default competition mode savedata that's identical to S3K's but stored elsewhere

      read sonic 3 savedata but place it at a different offset than the dead code version of this code
      if data is corrupt then copy default sonic 3 savedata but not from the same place as the dead code version
    So, not only do we have dead code, but the game contains two identical chunks of default competition mode savedata, and two identical chunks of default Sonic 3 savedata.
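To make the dead branch easier to see, here is the control flow of the pseudo-code above as a small C sketch. Nothing is actually read or written; each step only sets a flag in a hypothetical `LoadTrace` struct, and all names are made up for the illustration.

```c
#include <stdbool.h>

/* Records which steps ran; purely illustrative. */
typedef struct LoadTrace {
  bool tookSonic3Routine;  /* jumped to load_sonic3_savedata */
  bool readCompetition;    /* competition mode savedata was read */
  bool readSonic3Inline;   /* Sonic 3 branch inside load_savedata ran */
  bool readS3K;            /* Sonic 3 & Knuckles savedata was read */
} LoadTrace;

static void loadSonic3Savedata(LoadTrace *trace) {
  trace->tookSonic3Routine = true;
  trace->readCompetition = true;
}

void loadSavedata(bool sonic3Mode, int cartridge, LoadTrace *trace) {
  if (sonic3Mode) {           /* Sonic 3 leaves for its own routine here */
    loadSonic3Savedata(trace);
    return;
  }
  if (cartridge == 1) {       /* Sonic & Knuckles alone: nothing to load */
    return;
  }
  trace->readCompetition = true;

  if (sonic3Mode) {           /* can never be true at this point: dead code */
    trace->readSonic3Inline = true;
  }
  if (!sonic3Mode) {
    trace->readS3K = true;
  }
}
```

Whatever combination of mode and cartridge is passed in, `readSonic3Inline` stays false.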

    I think there was a problem coordinating the team when these changes were done...
     
    Last edited: Aug 7, 2024
  15. BenoitRen

    BenoitRen

    Tech Member
    956
    580
    93
    My journey has taken me to the Nemesis decompression code. Spent the better part of an afternoon porting the part that builds the codetable, and got this:
    Code (C):
    void bitdevr2(unsigned char** pSource) {
      unsigned short data = *(*pSource)++;
      while (data != 255) {  // 0xFF ends the codetable description
        unsigned short byte1 = data;  // low nibble holds the palette index
        while (!((data = *(*pSource)++) & 0x80)) {
          unsigned char byte2 = data;
          unsigned char palette_index = byte1 & 0xF;
          // Parentheses are needed here: >> binds tighter than &.
          unsigned char indexrepeat_cnt = (byte2 & 0x70) >> 4;
          unsigned char codebits_cnt = byte2 & 0xF;
          unsigned short codetable_entry = codebits_cnt << 8 | indexrepeat_cnt << 4 | palette_index;
          if (codebits_cnt == 8) {
            // Full 8-bit code: exactly one table slot.
            unsigned short code = *(*pSource)++;
            g_mdRam.bitdevwk[code] = codetable_entry;
          }
          else {
            // Shorter code: fill every slot that starts with this code.
            unsigned short code = *(*pSource)++ << (8 - codebits_cnt);
            unsigned short cnt = 1 << (8 - codebits_cnt);
            do {
              g_mdRam.bitdevwk[code++] = codetable_entry;
            } while (--cnt != 0);
          }
        }
      }
    }
    Ideally, I'd have a variable for the value expressed by "8 - codebits_cnt", but I don't know a good name for it. I know it's used to shift the code variable left so its most significant bit is bit 7, but I don't know why the same value is used to construct a copy count for the codetable entry.
     
  16. MarkeyJester

    MarkeyJester

    You smash your heart against the rocks Resident Jester
    2,316
    565
    93
    Japan
    That variable counter is to create an entropy tree that's quick and easy for the CPU to extract.

    Let's say your bitfield is 01001110, and let's say the 01 at the beginning are the entropy bits needed to collect the value from the tree. It'd cost CPU time to AND/mask off those other bits and shift the 01 into place such that you can collect the value from the table.

    So instead, they construct the table/tree such that they repeat the value needed for 01 about $40 times, and shove it into the table at index $40 to $7F (or 01000000 to 01111111). So when you read the value at index 01001110, it'll collect the value needed for 01, and the remaining 001110 are technically ignored. You might find the table will also contain a shift value as well, and this particular example will hold 2. After the value is collected, the 01001110 is shifted up by 2 to 001110?? ready for the next set of bits to be read.

    Obviously, the binary pattern is variable and dependent on the compressed data's tree, and each entry in the table/tree is a word; the above is just an example for simplicity's sake.
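The example above can be tried out with a few lines of C. This is a sketch of the lookup-table idea only, not the game's code: `g_codeTable`, `fillCodeTable` and `lookupCode` are hypothetical names, and the entry value 0x0205 assumes the entry layout from the bitdevr2 port earlier in the thread (code length in bits 8 and up, palette index in the low nibble) with an arbitrary palette index of 5.

```c
#include <stdint.h>

/* One entry per possible value of the next 8 bits of the stream. */
static uint16_t g_codeTable[256];

/* Stores `entry` at every 8-bit index that begins with the `length`-bit
   prefix `code` (1 <= length <= 8). A 2-bit code therefore occupies
   1 << (8 - 2) = 64 consecutive slots. */
void fillCodeTable(uint8_t code, unsigned length, uint16_t entry) {
  unsigned shift = 8u - length;
  unsigned first = ((unsigned)code << shift) & 0xFFu;
  unsigned count = 1u << shift;
  for (unsigned i = 0; i < count; i++) {
    g_codeTable[(first + i) & 0xFFu] = entry;
  }
}

/* No masking or shifting needed: the next 8 bits are the index, and the
   bits after the prefix simply don't affect which entry is found. */
uint16_t lookupCode(uint8_t nextBits) {
  return g_codeTable[nextBits];
}
```

After `fillCodeTable(0x1, 2, 0x0205)`, looking up 01001110 ($4E) returns 0x0205, and its upper byte (2) is the number of bits to shift the bitfield by before the next lookup. This also shows why "8 - codebits_cnt" appears twice in the port: once as the shift that left-aligns the code, and once as the exponent of the number of slots sharing that prefix.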
     
  17. BenoitRen

    BenoitRen

    Tech Member
    956
    580
    93
    Instead of waiting until I've decompiled and ported all the code necessary to get to the level select, I've decided I'm going to push segmented parts bit by bit.

    Now available: the code for loading the savedata that I've discussed above, and the Nemesis decompression routines (which, based on the information we have, were called bitdevr internally).
     
    Last edited: Aug 9, 2024
  18. BenoitRen

    BenoitRen

    Tech Member
    956
    580
    93
    I've now also ported the Enigma decompression routines. The main routine is more straightforward than Nemesis's, but the routine to retrieve the inline code value was more annoying to port, in part because it's not as well documented. It used ASM tricks that relied on CPU flags being (un)set for flow control. There's also a backwards-jumping goto statement in there that I don't know how to get rid of without reordering the code.

    In the original code, when the decompression routine is done, the source pointer is set back one or two bytes based on the shift value. Then, if the memory address is odd, the source pointer is set forward one byte so it's even. The disassembly doesn't say why, and Sega Retro's documentation doesn't even mention it, so I'm left scratching my head. I don't think it's relevant to the C version, so I left that part out.
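For reference, the adjustment described above amounts to the following. This is a sketch under stated assumptions: the function name is hypothetical, and the rewind amount is left as a parameter because the exact condition on the shift value isn't spelled out here. One thing that is certain about the hardware is that the 68000 can only access words and longwords at even addresses; whether that is the reason for this step is not confirmed.

```c
#include <stdint.h>

/* Rewinds the source address by `rewind` bytes (1 or 2, chosen by the
   caller from the shift value), then rounds an odd result up to the next
   even address. */
uint32_t enigmaFinalSourceAddress(uint32_t address, unsigned rewind) {
  address -= rewind;
  if (address & 1u) {
    address += 1u;
  }
  return address;
}
```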

    That's all I've got to say about Enigma, but I've still got something to say about how S&KC uses Nemesis.

    Nemesis decompression isn't called directly. There are two prologue routines to use depending on if the game wants to extract the tiles to main RAM or VRAM. In S&KC, the prologue routine for extraction to VRAM was modified to extract to main RAM. As a result, the two prologue routines are identical. The routines that write the rows of pixels to VRAM still exist, but they're unused.
     
  19. BenoitRen

    BenoitRen

    Tech Member
    956
    580
    93
    Today's yield: palette operations (fadein, fadeout, colorset) and the color table.
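As background for the palette code, a Mega Drive colour is a 16-bit word laid out as 0000 BBB0 GGG0 RRR0, three bits per channel. The sketch below unpacks such a word into 8-bit RGB. `Rgb8` and `mdColorToRgb` are hypothetical names, and the linear scaling from 3 to 8 bits is an assumption made for the example, not a claim about what the port does.

```c
#include <stdint.h>

typedef struct Rgb8 {
  uint8_t r, g, b;
} Rgb8;

/* Unpacks a Mega Drive colour word (0000 BBB0 GGG0 RRR0) and scales each
   3-bit channel linearly to the 0..255 range (assumed scaling). */
Rgb8 mdColorToRgb(uint16_t color) {
  unsigned r = (color >> 1) & 7u;
  unsigned g = (color >> 5) & 7u;
  unsigned b = (color >> 9) & 7u;
  Rgb8 out;
  out.r = (uint8_t)(r * 255u / 7u);
  out.g = (uint8_t)(g * 255u / 7u);
  out.b = (uint8_t)(b * 255u / 7u);
  return out;
}
```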

    After the level select palette are two sets of 8 colours that aren't referred to by colortbl. The first one isn't referenced anywhere else in the code either, but the second is written to by ev1710. It seems to be an event taking place in a version of Hidden Palace's Master Emerald shrine. Oddly enough, surrounding code refers to the level select palette.

    Speaking of the level select, I've made it to where it waits for one of the players to press the Start button. Meaning I'm getting close to making my proof of concept a reality.
     
  20. BenoitRen

    BenoitRen

    Tech Member
    956
    580
    93
    I've added the ported level select code for Sonic 3 & Knuckles. Sonic 3's is separate.

    As for the overall progress:
    • Timing-related functions have been decompiled. It wasn't fun, because doubles are used for increased precision, and those have their own ASM instructions.
    • Most of the sound code has been decompiled, except for one object that seems to hold data related to sound samples that I haven't figured out yet.
    • A lot of graphics code has been decompiled as well, and I've started on the code responsible for getting pixels on the screen. To my horror, these functions were either separately compiled with heavy optimisation, or they were hand-coded in assembly. They will take a while to figure out.
     