
Sonic 2 Split Disassembly

Discussion in 'Engineering & Reverse Engineering' started by FraGag, Oct 6, 2008.

  1. (quoting from the Sonic 1 disassembly thread to avoid taking that off-topic)

    Hmm, do you mind elaborating on this a bit more? It's something we can look into fixing.
  2. Double posting but whatever.

    What do you guys think of creating a downloadable version of the current revision of this disassembly, similar to what Hivebrain's done for the Sonic 1 disasm? Are there any particular changes you'd like to make before any such step is taken, or is it fine in its current state?
  3. MoDule


    Tech Member
    Procrastinating from writing bug-fix guides
    I'm all for it. I've still got a few things I haven't committed yet, though I really only feel I need to ask about one:
    What's everyone's opinion on giving all the unknown RAM addresses generic unk_*(address) names? It would make rearranging the RAM layout a lot easier. If everyone's okay with that I'll go ahead and commit my changes (though I'm still missing a few addresses used during the special stages which I can't make heads or tails of).
  4. MarkeyJester


    Nothing's Impossible Resident Jester
    Well, let's say for example this line during the Vertical Blanking Interval (Sonic 1):

    Code (ASM):
            writeVRAM   v_spritetablebuffer,$280,vram_sprites
    ...which is equivalent to the original assembly instructions:

    Code (ASM):
            lea ($C00004).l,a5
            move.l  #$94019340,(a5)
            move.l  #$96FC9500,(a5)
            move.w  #$977F,(a5)
            move.w  #$7800,(a5)
            move.w  #$0083,($FFFFF640).w
            move.w  ($FFFFF640).w,(a5)
    This sets up the VDP to DMA the sprite data from the sprite table buffer in RAM ($00FFF800) to the VDP's sprite table in VRAM (at $F800, incidentally), so the 68000 doesn't have to copy the data word by word.
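    The seven words in the listing above can be derived from the three macro arguments. Here is a minimal Python sketch, assuming the usual VDP register layout for DMA (registers $13 to $17 hold the length and source); the function and argument names are made up for illustration and are not part of either disassembly:

```python
# Sketch: derive the VDP command words that a 68k-RAM-to-VRAM DMA needs.
# The function name and argument names are hypothetical.

def vram_dma_words(source, length, dest):
    """Return the 16-bit words written to the VDP control port, in order."""
    words_len = (length >> 1) & 0xFFFF      # DMA length is counted in words
    words_src = (source & 0xFFFFFF) >> 1    # DMA source is a word address
    return [
        0x9400 | (words_len >> 8),            # reg $14: length, high byte
        0x9300 | (words_len & 0xFF),          # reg $13: length, low byte
        0x9600 | ((words_src >> 8) & 0xFF),   # reg $16: source, middle byte
        0x9500 | (words_src & 0xFF),          # reg $15: source, low byte
        0x9700 | ((words_src >> 16) & 0x7F),  # reg $17: source, high bits
        0x4000 | (dest & 0x3FFF),             # VRAM write, low 14 address bits
        0x0080 | ((dest >> 14) & 0x3),        # DMA flag + top 2 address bits
    ]

words = vram_dma_words(0xFFF800, 0x280, 0xF800)
print(" ".join(f"${w:04X}" for w in words))
```

    With the buffer at $FFF800, a length of $280 bytes and a VRAM destination of $F800, this prints the same seven words as the hand-written listing: $9401 $9340 $96FC $9500 $977F $7800 $0083.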

    The issue I find here is that although the first one (using a macro) is by all means easier to edit (hell, it's rather useful in all honesty), most people would use and edit it their entire lives without understanding WHY or even HOW it works. Their knowledge is therefore limited, and they may never come to understand the VDP and its registers. The possibilities with the VDP are almost endless, and having the original code on hand would help people learn how and why it works.

    That was basically what I meant, not to say that I'm "complaining" per se, but I felt that it should at least be noted for future reference.
  5. I'm okay with it (and I'm kinda used to it from IDA anyway). We should probably add a readme.txt of sorts explaining some of the major changes that have been made (such as the dynamic IDs system and the new way of defining RAM equates) so that newcomers don't get overwhelmed.

    I see your point, but the benefits far outweigh the drawbacks in this case IMO, and if someone's keen on learning how to program the VDP manually they can always take a look at the macro source or a document such as genvdp.txt.
  6. FraGag


    Tech Member
    Yes please. There should be no literal absolute RAM address left in the source, so that absolutely nothing breaks when moving stuff around. Ideally, some offsets on registers that point to RAM should also be labelled somehow (possibly using subtraction with 2 labels), but it might be a bit more difficult to do if it's not obvious what the register can point to.
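    To illustrate why label differences survive a rearranged layout, here is a rough Python sketch. It models RAM equates as an ordered list of (name, size) pairs; every variable name and size in it is hypothetical and does not come from the disassembly:

```python
# Sketch: RAM equates built from an ordered layout instead of literal addresses.
# Every variable name and size below is hypothetical.

RAM_BASE = 0xFFFF0000

def build_equates(layout, base=RAM_BASE):
    """Assign consecutive addresses to (name, size_in_bytes) entries."""
    equates, address = {}, base
    for name, size in layout:
        equates[name] = address
        address += size
    return equates

def offset(equates, label, base_label):
    """Register-relative offset written as a difference of two labels."""
    return equates[label] - equates[base_label]

layout = [("Object_RAM", 0x40), ("Player_flags", 2), ("Ring_count", 2)]
before = build_equates(layout)

# Move Ring_count in front of Player_flags: the label difference follows along.
layout[1], layout[2] = layout[2], layout[1]
after = build_equates(layout)

print(offset(before, "Ring_count", "Object_RAM"))
print(offset(after, "Ring_count", "Object_RAM"))
```

    A literal offset would have kept pointing at the old position after the swap, whereas the label difference is recomputed from the new layout.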
  7. MoDule


    Tech Member
    Procrastinating from writing bug-fix guides
    Well, I guess I can go ahead and commit what I've got. I'm afraid there might still be some RAM addresses missing (besides the special stage ones, that is). I don't suppose there's a smart way to find RAM addresses that don't have the $FFFF prefix? I mean stuff like this:
    Code (ASM):
    ; word_917A:
    OptionScreen_Choices:
            dc.l (3-1)<<24|(Player_option&$FFFFFF)
            dc.l (2-1)<<24|(Two_player_items&$FFFFFF)
            dc.l ($80-1)<<24|(Sound_test_sound&$FFFFFF)
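    One possible heuristic, sketched in Python: scan the binary for longwords whose low 24 bits fall in the RAM range but whose top byte isn't $FF. This is only a rough filter and will produce false positives (any data that happens to look like an address), so each hit still needs checking by hand. The function name, the RAM range and the sample address are assumptions for illustration:

```python
import struct

# Sketch: spot longwords whose low 24 bits look like a 68k RAM address.
# The function name, the RAM range, and the sample data are assumptions.

RAM_START, RAM_END = 0xFF0000, 0xFFFFFF

def find_packed_ram_refs(data):
    """Yield (offset, top_byte, address) for each even-aligned candidate longword."""
    for pos in range(0, len(data) - 3, 2):
        (value,) = struct.unpack_from(">I", data, pos)
        top, low = value >> 24, value & 0xFFFFFF
        # Skip plain $FFxxxxxx addresses; those already carry the prefix.
        if top != 0xFF and RAM_START <= low <= RAM_END:
            yield pos, top, 0xFF000000 | low

# An entry packed the same way as the table above: (count-1)<<24 | (address & $FFFFFF).
# $FFFFFF86 is a made-up address, not a real equate.
entry = (3 - 1) << 24 | (0xFFFFFF86 & 0xFFFFFF)
sample = struct.pack(">I", entry)

for pos, top, address in find_packed_ram_refs(sample):
    print(f"offset {pos}: top byte ${top:02X}, address ${address:08X}")
```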
  8. Can we shorten some of the equate names a bit? E.g. ehz_1 instead of emerald_hill_zone_act_1, GMID_* instead of GameModeID_*, and bit_* instead of button_* and btn_* instead of button_*_mask (copying Hive's naming scheme for this one).
  9. ICEknight


    Researcher Researcher
    What for? This way you don't need to look up what stuff like GMID stands for...
  10. Because having to type GameModeID_ContinueScreen in your own code would get kinda annoying (though admittedly GMID_* is only 6 characters shorter). emerald_hill_zone_act_1 vs. ehz_1 is a saving of 18 characters, however, and I'm sure everyone knows what ehz is.
  11. FraGag


    Tech Member
    I definitely agree for the zones/acts constants. For GameModeIDs, I'm not sure; on one hand, as ICEknight said, acronyms/abbreviations may be cryptic until you look them up. On the other hand, they always get moved to Game_Mode. Having "game mode" spelled out twice on the same line is redundant, but then shortening "GameModeID" to "GMID" can lead to conflicts if something else is shortened to "GMID". That said, when I made all those IDs, I used shorter names for the others, so we might as well just go with GMID...
  12. Spanner


    The Tool Member
    United Kingdom
    Sonic Hacking Contest
    A universal labelling scheme would make some matters easier for both SVN disassemblies. This would at least make a start to it.
  13. All right, cool. What about the button constants?
  14. ICEknight


    Researcher Researcher
    ...Perhaps we could start by using the known labels from the original source files by Sonic Team? We have a few of them, don't we?
  15. Andlabs


    「いっきまーす」 Wiki Sysop
    Writing my own MD/Genesis sound driver :D
    We have label names found in the symbol tables of the S&KC exe. That said, I dunno if a universal naming scheme would extend nicely to other games. Maybe it should stay with S1-3K only, but first we need to finish up our S&K split. And even if we do that, would having to abandon descriptive labels for the somewhat constrained ones Sonic Team used be worth it?

    As far as abbreviations go, I don't see how abbreviating zone names would not be common sense. As far as everything else goes, you are probably better off leaving them alone.
  16. Hivebrain


    53.4N, 1.5W
    I thought about that, but decided against it because it wouldn't really be helpful to us. Programmers don't always use labels with clear meanings to anyone but themselves, especially if they don't expect their code to be made public.
  17. FraGag


    Tech Member
    bit_* is too generic. Perhaps btnb_* (button bit) and btnm_* (button mask)? We could also drop the underscore entirely.
  18. Sounds good to me, although I'd suggest keeping the underscore for consistency with other equates. We should also probably have btnm_ABC and btnm_dir (or btnm_UDLR, whichever one is preferred).
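    For reference, the bit and mask constants relate as mask = 1 << bit, so the masks can be derived from the bit numbers rather than typed in separately. Here is a small Python sketch of that relationship, assuming the bit order the Sonic disassemblies use for the held-buttons byte (Up in bit 0 through Start in bit 7); the dictionary layout is only for illustration:

```python
# Sketch: derive button masks from bit numbers so the two can never disagree.
# The bit positions are an assumption (Up = bit 0 ... Start = bit 7);
# the btnb_/btnm_ names mirror the scheme proposed above.

btnb = {"up": 0, "down": 1, "left": 2, "right": 3, "B": 4, "C": 5, "A": 6, "start": 7}

# One mask per button: a single set bit at that button's position.
btnm = {name: 1 << bit for name, bit in btnb.items()}

# Combined masks for common tests.
btnm_dir = btnm["up"] | btnm["down"] | btnm["left"] | btnm["right"]
btnm_ABC = btnm["A"] | btnm["B"] | btnm["C"]

print(f"btnm_dir = ${btnm_dir:02X}, btnm_ABC = ${btnm_ABC:02X}")
```

    Under that assumed bit order, btnm_dir comes out as $0F and btnm_ABC as $70.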
  19. Alriightyman


    I am back... from the dead! Tech Member
    Somewhere in hot, death Florida
    0101001101101111011011100110100101100011 00000010: 0101001100000011 01000101011001000110100101110100011010010110111101101110
    I'd say btnm_dir. And keep the underscore. It helps make it easier to read.
  20. I've shortened the equates and was commenting the Enigma decompression routine when I ran into a little hitch:

    Code (ASM):
        move.w  d5,d1
        move.w  d6,d7
        sub.w   a5,d7   ; subtract length in bits of inline copy value
        bhs.s   $$enoughBits    ; branch if a new word doesn't need to be read
        move.w  d7,d6
        addi.w  #16,d6
        neg.w   d7  ; calculate bit deficit
        lsl.w   d7,d1   ; and make space for that many bits
        move.b  (a0),d5 ; get next byte
        rol.b   d7,d5   ; and rotate the required bits into the lowest positions
        add.w   d7,d7
        and.w   EniDec_Masks-2(pc,d7.w),d5
        add.w   d5,d1   ; combine upper bits with lower bits
    $$maskValue:
        move.w  a5,d0   ; get length in bits of inline copy value
        add.w   d0,d0
        and.w   EniDec_Masks-2(pc,d0.w),d1  ; mask value appropriately
        add.w   d3,d1   ; add starting art tile
        move.b  (a0)+,d5
        lsl.w   #8,d5
        move.b  (a0)+,d5    ; get next word
        rts
    ; ===========================================================================
    $$enoughBits:
        beq.s   $$justEnough    ; if the word has been exactly exhausted, branch
        lsr.w   d7,d1   ; get inline copy value
        move.w  a5,d0
        add.w   d0,d0
        and.w   EniDec_Masks-2(pc,d0.w),d1  ; and mask it appropriately
        add.w   d3,d1   ; add starting art tile
        move.w  a5,d0
        bra.s   EniDec_FetchByte
    ; ===========================================================================
    $$justEnough:
        moveq   #16,d6  ; reset shift value
        bra.s   $$maskValue
    ; End of function EniDec_FetchInlineValue
    AS complains about symbol undefined errors in the second pass even though I don't have any non-temporary labels in between the named temporary ones and so there shouldn't be any issues. Anybody know why this is happening? Using nameless temporary symbols instead works fine but I'd prefer to keep the named ones.