Double posting, because I think it's warranted. As I said in the previous post, I found a tool called dwarf2cpp that parses DWARF v1 data and generates C/C++ skeletons from it. That makes it much easier to analyze variables, structures, and function prototypes along with their local variables, and it also sets up the folder structure of the source code. No actual code is decompiled; it just dumps those things. Should be a good resource for a possible decompilation in the future maybe?

Download
GitHub Repository

Some samples of what it generated:
Spoiler

Here's a sample decompilation I did with this information (note: structure names and constant names had to be made up):

Code (Text):
void action(void)
{
    actwkt *pActwk;
    i32 i;

    pActwk = actwk;
    for (i = 0; i < ACTWK_SLOTS; ++i) {
        if (pActwk->actno != 0) {
            act_tbl[pActwk->actno](pActwk);
        }
        ++pActwk;
    }
}

void speedset(actwkt *pActwk)
{
    i32u xpos;
    i32u ypos;
    i16u spd;

    ypos = pActwk->yposi;
    xpos = pActwk->xposi;
    spd = pActwk->xspeed;
    xpos.l += (spd.w << 8);
    spd = pActwk->yspeed;
    if (!(pActwk->actfree[PLAYCTRL] & 8)) {
        if (spd.w >= 0 || (!(pActwk->actfree[PLAYCTRL] & 2) || spd.w >= -0x800)) {
            if (!(pActwk->actfree[PLAYCTRL] & 4)) {
                pActwk->yspeed.w += 0x38;
            }
        }
    }
    if (pActwk->yspeed.w >= 0) {
        if (pActwk->yspeed.w >= 0x1000) {
            pActwk->yspeed.w = 0x1000;
        }
    }
    ypos.l += spd.w << 8;
    pActwk->xposi.l = xpos.l;
    pActwk->yposi.l = ypos.l;
}

void speedset2(actwkt *pActwk)
{
    i32u xpos;
    i32u ypos;
    i32 spd;
    i32 actwkno;
    i16 d1;

    xpos = pActwk->xposi;
    ypos = pActwk->yposi;
    spd = pActwk->xspeed.w;
    if (pActwk->cddat & 8) {
        actwkno = pActwk->actfree[PLAYRIDE];
        if (actwk[actwkno].actno == 0x1E) {
            d1 = -0x100;
            if (!(pActwk->cddat & 1)) {
                d1 = -d1;
            }
            spd += d1;
        }
    }
    spd <<= 8;
    xpos.l += spd;
    spd = pActwk->yspeed.w;
    spd <<= 8;
    ypos.l += spd;
    pActwk->xposi = xpos;
    pActwk->yposi = ypos;
}
Hi Devon. Could you please explain how you went about creating your sample decompilation? I can't find good references for reading PS2 MIPS assembly. Until now I've only had experience reading NES 6502 assembly.

action() is a global function, so it appears in multiple files. So far I've figured out that the first instructions do general housekeeping to update the stack and return addresses. The first actual instruction seems to be:

Code (Text):
lui a0,hi(actwk+257)

According to a reference, lui means "To load a constant into the upper half of a word." So here the high part of actwk is loaded into the high part of a0 (a function argument register)? What does +257 mean here? Can you please get me started? Thanks!
To be honest, I have a very, very bare grasp of MIPS assembly. I just used Ghidra to provide a base decompilation of the function, and then attempted to follow the assembly code to make it more accurate to what was actually programmed. The line numbers associated with addresses dumped via dwarf2cpp also helped with grouping instructions.
I'm not familiar with PS2 MIPS, but in most assembly languages that would be the offset to move to after loading the constant address actwk. Think of actwk as an array, with actwk itself being the base offset of the array. It's saying the constant to load is the 257th element of the array in actwk. Looking further at the code, it seems actwk is a structure containing many values byte-packed into chunks, so that is the offset of some smaller value, i.e. xposi or yposi or something else in actwk.
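As general MIPS background (not specific to this binary), a 32-bit address is usually built from two 16-bit halves: lui sets the upper half of a register, and a following addiu adds a sign-extended lower half. The sketch below shows that arithmetic. The helper names addr_hi, addr_lo and lui_addiu, and the sample address in the note after it, are made up for illustration; what the disassembler prints inside hi(...) and lo(...) may be annotated differently.

```c
#include <assert.h>
#include <stdint.h>

/* Upper half, as it would go into the lui immediate. Adding 0x8000 first
   compensates for addiu sign-extending its 16-bit immediate. */
static uint16_t addr_hi(uint32_t addr)
{
    return (uint16_t)((addr + 0x8000u) >> 16);
}

/* Lower half, as it would go into the addiu immediate (signed 16-bit). */
static int16_t addr_lo(uint32_t addr)
{
    uint32_t raw = addr & 0xFFFFu;
    /* Map 0x8000..0xFFFF onto -32768..-1 without relying on
       implementation-defined narrowing conversions. */
    return (raw >= 0x8000u) ? (int16_t)((int32_t)raw - 0x10000)
                            : (int16_t)raw;
}

/* What the instruction pair computes:
       lui   reg, hi        reg = hi << 16 (lower half zero)
       addiu reg, reg, lo   reg = reg + sign_extend(lo)          */
static uint32_t lui_addiu(uint16_t hi, int16_t lo)
{
    uint32_t reg = (uint32_t)hi << 16;
    return reg + (uint32_t)(int32_t)lo;
}
```

For a made-up address such as 0x0035B580, addr_hi gives 0x0036 and addr_lo gives -19072 (0xB580 read as a signed 16-bit value), and lui_addiu(0x0036, -19072) rebuilds 0x0035B580. The upper half is one higher than the top 16 bits of the address because the lower half is negative.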
I've found the actual action function using the referenced memory address. I'm stuck on the following instructions:

Code (Text):
lui s0,hi(actwk+256)
addiu s0,s0,lo(actwk+13696)

I've determined that the struct type that actwk points to is 74 bytes, which is a strange size. +256 would mean the fourth element, member cddat. Then +13696 would mean the start of struct number 185, which doesn't make sense, as the array of actwk structs has 128 elements. I must be doing something wrong. No idea how that translates to:

Code (Text):
pActwk = actwk;

EDIT: Looking back at the symbol table, actwk's size is 8704. Divided by 128, that would mean each struct is 68 bytes. When I recount the members, I get 66 bytes, so there must be two bytes of padding.

EDIT2: Yup, there were two padding bytes:

Code (Text):
struct anon0 {
    unsigned char actno;        // 1
    unsigned char actflg;       // 1
    unsigned short sproffset;   // 2
    _anon3** patbase;           // 4
    _anon5 xposi;               // 4
    _anon5 yposi;               // 4
    _anon9 xspeed;              // 2
    _anon9 yspeed;              // 2
    _anon9 mspeed;              // 2
    unsigned char sprhsize;     // 1
    unsigned char sprvsize;     // 1
    unsigned char sprhs;        // 1
    unsigned char sprpri;       // 1
    unsigned char patno;        // 1 (followed by 1 padding byte)
    _anon9 mstno;               // 2
    unsigned char patcnt;       // 1
    unsigned char pattim;       // 1
    unsigned char pattimm;      // 1
    unsigned char colino;       // 1
    unsigned char colicnt;      // 1
    unsigned char cddat;        // 1
    unsigned char cdsts;        // 1
    unsigned char r_no0;        // 1
    unsigned char r_no1;        // 1 (followed by 1 padding byte)
    _anon9 direc;               // 2
    _anon9 userflag;            // 2
    unsigned char dummy[2];     // 2
    unsigned char actfree[22];  // 22
};

Finally, my decompilation of the action function:

Code (Text):
void action()
{
    _anon0* pActwk = actwk;
    for (int i = 0; i < 128; ++i) {
        if (pActwk->actno != 0) {
            act_tbl[(pActwk->actno - 1) << 2](pActwk);
        }
        ++pActwk;
    }
}
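The size arithmetic above (66 bytes of members plus 2 padding bytes giving 68, and 128 × 68 = 8704) can be double-checked by letting a compiler lay the struct out. This is only a sketch under assumptions: pos32 and val16 are made-up stand-ins for _anon5 and _anon9, the 4-byte patbase pointer is replaced by a uint32_t so that the host's pointer size does not matter, and the host is assumed to align 16-bit and 32-bit integers the usual way (to 2 and 4 bytes).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef union { int32_t l; } pos32;  /* hypothetical stand-in for _anon5 (4 bytes) */
typedef union { int16_t w; } val16;  /* hypothetical stand-in for _anon9 (2 bytes) */

struct actwk_layout {
    uint8_t  actno;
    uint8_t  actflg;
    uint16_t sproffset;
    uint32_t patbase;      /* stand-in for the 4-byte pointer on the target */
    pos32    xposi;
    pos32    yposi;
    val16    xspeed;
    val16    yspeed;
    val16    mspeed;
    uint8_t  sprhsize;
    uint8_t  sprvsize;
    uint8_t  sprhs;
    uint8_t  sprpri;
    uint8_t  patno;        /* compiler inserts 1 padding byte after this */
    val16    mstno;
    uint8_t  patcnt;
    uint8_t  pattim;
    uint8_t  pattimm;
    uint8_t  colino;
    uint8_t  colicnt;
    uint8_t  cddat;
    uint8_t  cdsts;
    uint8_t  r_no0;
    uint8_t  r_no1;        /* compiler inserts 1 padding byte after this */
    val16    direc;
    val16    userflag;
    uint8_t  dummy[2];
    uint8_t  actfree[22];
};

/* Compile-time checks: the build fails if the layout differs. */
_Static_assert(sizeof(struct actwk_layout) == 68, "expected 68-byte struct");
_Static_assert(offsetof(struct actwk_layout, mstno) == 28, "padding after patno");
_Static_assert(offsetof(struct actwk_layout, direc) == 40, "padding after r_no1");

/* Total size of an array of `slots` entries, in bytes. */
static size_t actwk_array_bytes(size_t slots)
{
    return slots * sizeof(struct actwk_layout);
}
```

With 128 slots, actwk_array_bytes returns 8704, which matches the size listed in the symbol table. With this layout, a byte offset such as the one in a load or store instruction can be compared against offsetof for each member to see which field it touches.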
I've made lots of progress the past week, reading the disassembly with help from Ghidra. I've decompiled almost all the code that's meant to go in the action.c file. There's just one function left, but it uses an undefined global variable called @113 and an existing stack variable that's probably set by the calling function. The calling function is not part of action.c, so I'm moving on to other files until I have more data.

For some reason Ghidra is bugged regarding global variables. It seems to have multiplied their addresses by 2, which not only makes them point to incorrect addresses, but also makes them fall outside of the memory range. This makes it impossible to name them or assign types to them. I have no idea how to resolve this.
I want to ask: how did you all manage to get the DWARF v1 data properly interpreted by Ghidra? From what I've seen, Ghidra only natively supports DWARF versions 2 and above. An extension exists to handle this sort of thing, but to my knowledge it does not work on newer versions of Ghidra (or at the very least, I cannot recompile it with current tools as described here).