Hello folks, back from a 10 year hiatus from this site. I'm making this post mainly to document what I do and what it has to do with the Sonic franchise.

Long story short, I found out a little while back that Sonic Advance 1 (I only examined the US ROM) appears to have been compiled with a version of gcc around 2.95. Here's proof:

Original assembly:

Code (Text):
    THUMB_FUNC_START sub_08001930
sub_08001930: @ 0x08001930
    ldr r3, _08001948
    ldr r2, [r3]
    cmp r2, #0x7f
    bgt _08001950
    ldr r1, _0800194C
    lsls r0, r2, #2
    adds r0, r0, r1
    ldr r0, [r0]
    adds r1, r2, #1
    str r1, [r3]
    b _08001952
    .align 2, 0
_08001948: .4byte 0x03001B3C
_0800194C: .4byte 0x03001220
_08001950:
    movs r0, #0
_08001952:
    bx lr

Proposed C code:

Code (Text):
#include "global.h" // ignore this, i have it included in my test setup for basic types and such.

extern s32 gUnknown_03001B3C;
extern u32 gUnknown_03001220[];

u32 sub_08001930(void)
{
    int retVar;

    if(gUnknown_03001B3C > 0x7f)
        return 0;

    retVar = gUnknown_03001220[gUnknown_03001B3C++];
    return retVar;
}

Output asm on -O2 -mthumb-interwork for gcc 2.95.1 (using the gcc included with GBA SDK):

Code (Text):
sub_08001930:
    push {lr}
    ldr r3, .L5
    ldr r2, [r3]
    cmp r2, #0x7f
    bgt .L3 @cond_branch
    ldr r1, .L5+0x4
    lsl r0, r2, #0x2
    add r0, r0, r1
    ldr r0, [r0]
    add r1, r2, #0x1
    str r1, [r3]
    b .L4
.L5:
    .word gUnknown_03001B3C
    .word gUnknown_03001220
.L3:
    mov r0, #0x0
.L4:
    pop {r1}
    bx r1

If you noticed the extra push/pop, I actually figured that out a little while ago. It's seemingly a difference between the GBA SDK's gcc and stock gcc, so if you build a stock gcc 2.95.1 for an ARM Thumb target, the output will match.

I honestly don't know if it would be worth it to build a disassembly of Sonic Advance and take the matching decompilation approach like pret does, but it might produce some interesting findings. I can't say for certain whether the later Advance titles use the same language, codebase, or even a similar compiler, but if they do, they might be worth looking into.
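For anyone who wants to poke at the logic of the proposed C without a GBA toolchain, here is a minimal host-side sketch. A few things in it are my assumptions and not taken from the ROM: the `s32`/`u32` typedefs stand in for whatever `global.h` provides, the two globals are defined locally instead of being `extern`, and the array length of 0x80 is only inferred from the `> 0x7f` bounds check.

```c
#include <stdint.h>

/* Assumed stand-ins for the basic types that global.h would provide. */
typedef int32_t s32;
typedef uint32_t u32;

/* Defined locally so this builds on a PC; in the game these live in IWRAM.
   The length 0x80 is an assumption inferred from the "> 0x7f" check. */
s32 gUnknown_03001B3C;
u32 gUnknown_03001220[0x80];

/* Returns the entry at the current index and advances the index,
   or returns 0 without touching the index once it is past 0x7f. */
u32 sub_08001930(void)
{
    int retVar;

    if (gUnknown_03001B3C > 0x7f)
        return 0;

    retVar = gUnknown_03001220[gUnknown_03001B3C++];
    return retVar;
}
```

Note that the comparison is signed (the original uses `bgt`), so a negative index would pass the check. This sketch only mirrors that behavior; it makes no claim about what the table actually holds.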
I think if it's possible to recompile it 1:1, then a matching decompilation would definitely be the way to go in my book.
Where are you getting your original disassembly from? Is the disassembler not specifically targeted for GNU GCC?
I use https://github.com/camthesaxman/gbadisasm to pull functions via a cfg that can be defined via IDA scripts. I don't need IDA though. You can use it to casually view the assembly of blobs by repeatedly adding function addresses to your config and re-running the disassembly with cam's program.
Very interesting. I'd love to mess around with a fully decompiled version of this. There's always the Android port source, but that's in Java, which I'm neither familiar with nor interested in learning for one hobby project.