So, I've been digging at those ELF files for the Gems Collection version of Sonic CD. While objdump could be used to extract function and variable names, I noticed that it wasn't able to parse the .debug section. There were definitely more symbols to extract, such as structure member names. Today, I did some more digging, and found this thread, which talked about my very issue, but for DDRMAX2. The very last post then took me to this post, which showed that someone had managed to extract data from the .debug section, and provided a batch script showing how they did it. Gems Collection happened to have been compiled and linked with the same tools as DDRMAX2, and I found an archive of those tools on archive.org, with cracks. So, I installed them and ran the Sonic CD ELF files from the PS2 version through the linker's disassembler, and... It worked. It's pure beauty. A much more complete dump of symbols with disassembled code from Sonic CD is now available.
The SMPS source code leak has this for the 68K version:
Code (Text):
rcunt EQU $01 ; delay count work
cuntst EQU $02 ; delay store
And this for the Z80 version:
Code (Text):
RCUNT EQU se_mode+1 ; For DLEAY COUNT NOTE LENGTH
;----------- 20 ----------------------
CUNTST EQU RCUNT+1 ;
Probably just a typo by people who are unaware of the English word. Kind of the inverse of how the Disney film The Emperor's New Groove had to rename its main character from Manco to Kuzco because of what that name means in Japanese.
I don't know about Japanese, but in Spanish, you have to have lost your arm, or at least your hand, to be "manco". Not as unfortunate as the SCD one, though.
I'm not sure what's going on, but tools like IDA and objdump cannot parse the .debug section, even though it is apparently in a DWARF format, which is why I went through this process at all. objdump was able to retrieve the filenames and line numbers from there, but for some reason couldn't recognize anything else. They're able to parse all the symbols in the other sections, though.
There's not much I can do with those, as the only ELF files I can find don't seem to be directly related to the games themselves, and even then, they appear to have had their symbols stripped.
I'd be interested to see if S0.DAT (Sonic the Fighters executable) produces anything useful, if you wouldn't mind humoring me. If nothing directly related to the game is in there, perhaps it could reveal something interesting about its emulation.
Boooo.... but if you find some time, a confirmed executable with symbols does exist in the PS2 Virtua Fighter 2 - Sega Ages 2500 release, and I do know that Sonic Gems shared the same emulation codebase. The StF and VF2 executables for PS2 both contain the same string, "MW MIPS C Compiler (2.4.1.01).PlayStation2". I just feel any more information might be helpful in me trying to unravel aspects of that game.
I do link to the archive of the IDE and compiler/linker in the first post, so you may wanna check that out.
I know, I just wanted to be lazy and have it done by someone who's done it before already lol. Sigh, I don't use Windows, so I will see what rabbit I can pull out of my hat for this. EDIT: Which of the installers should I be using?
Well I found the objdmp executable but how did you manage to spit out all the individual files like that?
You can run the linker separately. It's in the "PS2_Tools/Command_Line_Tools" folder in the installation path.
My internet is crapping out on me and I didn't see this second post. Ah, so it seems .debug is stripped out of the executable and gives no new info. Boo!
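For anyone who wants to check whether an executable still carries a .debug section before setting up the old toolchain, here is a minimal Python sketch. It assumes a 32-bit little-endian ELF (which is what PS2 executables normally are) and only reads the section name table; `section_names` and the `game.elf` path are made-up names for illustration, not part of any tool mentioned in this thread.

```python
import struct


def section_names(data):
    """Return the section names of a 32-bit little-endian ELF image (bytes)."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 1 or data[5] != 1:
        raise ValueError("only 32-bit little-endian ELF is handled here")

    # Fields of the ELF32 file header that describe the section header table.
    (e_shoff,) = struct.unpack_from("<I", data, 0x20)
    e_shentsize, e_shnum, e_shstrndx = struct.unpack_from("<HHH", data, 0x2E)

    def header(index):
        # An ELF32 section header is ten 32-bit words; word 0 is the name
        # offset, word 4 the file offset and word 5 the size of the section.
        return struct.unpack_from("<10I", data, e_shoff + index * e_shentsize)

    strtab = header(e_shstrndx)
    table = data[strtab[4]:strtab[4] + strtab[5]]

    names = []
    for index in range(e_shnum):
        start = header(index)[0]
        end = table.index(b"\0", start)
        names.append(table[start:end].decode("ascii", "replace"))
    return names


# Hypothetical usage:
# with open("game.elf", "rb") as f:
#     print(".debug" in section_names(f.read()))
```

If ".debug" is missing from the returned list, the debug info has been stripped and the linker's disassembler will have nothing extra to show.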
So, after finding the time to dig through this some more... not only are there more symbols than what I initially found, but also every function's arguments, their local variables, the structures used, and other information attached to them, like their types, and all kinds of other stuff that was used for debugging. Here's an excerpt:
Code (Text):
00011781:<116>TAG_compile_unit
00011787 AT_sibling(000118f9)
0001178d AT_low_pc(010075e0)
00011793 AT_high_pc(010077a0)
00011799 AT_stmt_list(00003e18)
0001179f AT_language(LANG_C)
000117a5 AT_producer(MW MIPS C Compiler)
000117ba AT_name(C:\project\GEMS\application\SonicCD\src\ps2\main\ENEMY.C)
000117f5:<107>TAG_global_subroutine
000117fb AT_sibling(000118f5)
00011801 AT_low_pc(010075e0)
00011807 AT_high_pc(010077a0)
0001180d AT_fund_type(FT_void)
00011811 AT_global_refs_block(<8>8f fc 00 00 b2 fc 00 00 )
0001181d AT_restore_S0(<6> OP_BASEREG(29) OP_DREF8)
00011827 AT_restore_S1(<12> OP_BASEREG(29) OP_CONST(16) OP_ADD OP_DREF8)
00011837 AT_return_addr(<12> OP_BASEREG(29) OP_CONST(32) OP_ADD OP_DREF8)
00011847 AT_restore_SP(<11> OP_REG(29) OP_CONST(64) OP_ADD)
00011856 AT_name(ka_move)
00011860:<45>TAG_formal_parameter
00011866 AT_sibling(0001188d)
0001186c AT_mod_u_d_type(<5>MOD_pointer_to (0000fcd8))
00011875 AT_location(<11> OP_BASEREG(29) OP_CONST(48) OP_ADD)
00011884 AT_name(pActwk)
0001188d:<24>TAG_lexical_block
00011893 AT_sibling(000118f1)
00011899 AT_low_pc(010075e0)
0001189f AT_high_pc(010077a0)
000118a5:<42>TAG_local_variable
000118ab AT_sibling(000118cf)
000118b1 AT_mod_u_d_type(<5>MOD_pointer_to (0000fcd8))
000118ba AT_location(<5> OP_REG(17))
000118c3 AT_name(pPlayerwk)
000118cf:<30>TAG_local_variable
000118d5 AT_sibling(000118ed)
000118db AT_fund_type(FT_signed_short)
000118df AT_location(<5> OP_REG(16))
000118e8 AT_name(d0)
The numbers on the left are basically the "location" of the information listed on that line.
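As a rough idea of how a dump like this could be loaded for further processing, here is a small Python sketch. It assumes the tool writes one "offset:<length>TAG_..." header or one "offset AT_...(...)" attribute per line, which is my reading of the excerpt rather than a documented format; `parse_dump` and the dictionary layout are made up for illustration.

```python
import re

# Header line, e.g. "000118a5:<42>TAG_local_variable".
ENTRY_RE = re.compile(r"^([0-9a-fA-F]{8}):<(\d+)>(TAG_\w+)$")
# Attribute line, e.g. "000118c3 AT_name(pPlayerwk)".
ATTR_RE = re.compile(r"^([0-9a-fA-F]{8})\s+(AT_\w+)\((.*)\)$")


def parse_dump(text):
    """Group the lines of a text dump into entries keyed by their offset."""
    entries = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = ENTRY_RE.match(line)
        if match:
            offset = int(match.group(1), 16)
            current = {
                "offset": offset,
                "length": int(match.group(2)),  # decimal in the excerpt
                "tag": match.group(3),
                "attrs": {},
            }
            entries[offset] = current
            continue
        match = ATTR_RE.match(line)
        if match and current is not None:
            # Keep the raw text between the outer parentheses.
            current["attrs"][match.group(2)] = match.group(3)
    return entries


SAMPLE = r"""
000118a5:<42>TAG_local_variable
000118ab AT_sibling(000118cf)
000118b1 AT_mod_u_d_type(<5>MOD_pointer_to (0000fcd8))
000118ba AT_location(<5> OP_REG(17))
000118c3 AT_name(pPlayerwk)
000118cf:<30>TAG_local_variable
000118d5 AT_sibling(000118ed)
000118db AT_fund_type(FT_signed_short)
000118df AT_location(<5> OP_REG(16))
000118e8 AT_name(d0)
"""

for offset, entry in parse_dump(SAMPLE).items():
    print(f"{offset:08x} {entry['tag']} {entry['attrs'].get('AT_name')}")
```

In the excerpt, an entry's offset plus the number in angle brackets lands on the next entry (0x118a5 + 42 = 0x118cf), so that number looks like the entry's size in bytes, which gives a handy sanity check when walking the entries.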
"AT_fund_type" and "AT_mod_u_d_type" indicate a type of variable. "MOD_pointer_to" means that it's a pointer, and they can be repeated to indicate the number of layers (i.e. "MOD_pointer_to MOD_pointer_to FT_char" is the same as char**). Any time that has a hex number instead is pointer to one of those location numbers found on the left, and that will give you the actual type info. "AT_name" is the symbol name, of course. In the "TAG_compile_unit" section, the name is the full path name of the source file that the subsequent sections were compiled from. Unfortunately, it seems that structure names were not kept...? They're just labelled as "anonX". I wonder if I can write myself a quick tool to convert all of this info into something more legible... or maybe something already exists, considering this is definitely DWARF. EDIT: It's DWARF v1... apparently that's why there's been some trouble, because some tools don't even support it. EDIT 2: Found a tool that actually does what I wanted to do a bit. Will come back with dumps soon-ish.