Code Data Logger analysis

Discussion in 'Engineering & Reverse Engineering' started by evilhamwizard, Mar 29, 2017.

  1. evilhamwizard

    I have a bit of a secret tool that I've been using for a few years now. Emulators with code data logging support have existed for a while (such as the modification to Gens that Nemesis did eons ago, Exodus, and FCEUX), but not a lot of people know about Bizhawk's ability to create code data logs for Genesis, SNES, NES, PCE, Game Boy (Color), and Game Gear/Master System games. What code data logging does, for those that aren't aware, is record which parts of the ROM are executed as code and which are accessed as data as you play the game. This can be extremely useful for disassemblies where it's unknown where code and data live in the ROM. In my case, I've been using it to find unused/inaccessible data and code branches in games. To find unused data with these tools, you just have to play the game extensively (access every screen, every level, every path, every enemy, every in-game scenario, etc.).

    In Bizhawk's case, the code data logger (CDL) works by creating a file laid out like the original ROM (aside from Bizhawk's file headers), with flags stored at locations that correspond to locations in the original ROM. For Mega Drive games, these flags mark where code is executed by the 68k, data is accessed by the 68k, code is executed by the Z80, and data is accessed by the Z80. It even marks DMA data as well. Unlike Exodus, however, Bizhawk does not determine how the data/code it identifies is being used (for instance, it will not identify bytes representing pointers as anything other than 'data', whereas something like Exodus will properly identify the data as pointers and even form them into a table/array).
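    To make the flag idea concrete, here's a minimal sketch of sorting bytes into code/data/unknown. It assumes the Bizhawk headers have already been stripped so there is one flag byte per ROM byte, and the bit values are placeholders I made up for the example, not Bizhawk's real ones.

```python
# Hypothetical flag bits: placeholders for illustration, NOT Bizhawk's real values.
EXEC_68K = 0x01  # executed by the 68k
DATA_68K = 0x02  # read as data by the 68k
EXEC_Z80 = 0x04  # executed by the Z80
DATA_Z80 = 0x08  # read as data by the Z80
DMA = 0x10       # transferred by DMA


def classify(flag):
    """Return 'code', 'data', or 'unknown' for a single flag byte."""
    # Code wins if a byte was both executed and read. That is a choice
    # made for this sketch, not something the log format dictates.
    if flag & (EXEC_68K | EXEC_Z80):
        return "code"
    if flag & (DATA_68K | DATA_Z80 | DMA):
        return "data"
    return "unknown"


def classify_all(flags):
    """Classify every flag byte; index i corresponds to ROM offset i."""
    return [classify(f) for f in flags]


sample = bytes([EXEC_68K, EXEC_68K | DATA_68K, DATA_Z80, DMA, 0x00])
print(classify_all(sample))
```

    Anything that comes back 'unknown' was never touched while logging, which is exactly the stuff worth looking at.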

    There are two scenarios where Bizhawk fails to properly identify data. The first is when the game you're playing has a checksum routine that calculates a checksum using every byte in the ROM. This causes Bizhawk to mark everything as data even when it's code. The only workaround is to either turn on the CDL after the game has loaded, or to set breakpoints before and after the checksum routine so the CDL can be turned off while it runs. The other scenario where Bizhawk seems to have issues is certain audio data accessed by the Z80. Audio samples in games like Golden Axe 2/3, either the drum samples or voice samples, don't seem to be identified when they are played. Samples in games like Sonic 3, however, are identified correctly because of how the sound driver accesses the sample data. Other than that, Bizhawk does a good job of determining what is code and what is data in most scenarios.
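    For anyone wondering why a checksum routine trashes the log: as far as I know, the usual Mega Drive convention is to add up every 16-bit word from $200 to the end of the ROM and compare the result to the word stored at $18E in the header. A rough sketch of that convention is below (individual games can and do roll their own, so treat the offsets as the common case rather than a rule). Every byte past the header gets read, so every byte gets flagged as data.

```python
import struct

CHECKSUM_OFFSET = 0x18E  # header word holding the stored checksum (common convention)
CHECKSUM_START = 0x200   # summing starts right after the header


def megadrive_checksum(rom):
    """Sum big-endian 16-bit words from 0x200 to the end, wrapping at 16 bits."""
    total = 0
    end = len(rom) - (len(rom) % 2)  # ignore a trailing odd byte
    for offset in range(CHECKSUM_START, end, 2):
        (word,) = struct.unpack_from(">H", rom, offset)
        total = (total + word) & 0xFFFF
    return total


def stored_checksum(rom):
    """Read the checksum word out of the header."""
    return struct.unpack_from(">H", rom, CHECKSUM_OFFSET)[0]


# Tiny fake ROM: all zeroes except two words after the header.
rom = bytearray(0x400)
rom[0x200:0x204] = b"\x12\x34\x00\x01"
struct.pack_into(">H", rom, CHECKSUM_OFFSET, megadrive_checksum(rom))
print(stored_checksum(rom) == megadrive_checksum(rom))
```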

    There is one issue with using Bizhawk at the moment. The code data logger in its current state can't create a disassembly from a log file. So, scripts have to be made to convert the .cdl file into something that can be used in something like IDA Pro.
    As it turns out, I made a shitty Python script for Mega Drive that you can use to do just that.
    I also made one for Game Gear and Master System, but it doesn't work nearly as well because I can't figure out how to take care of ROM map segments.

    I haven't worked on the script in years and I remember leaving it in an odd state. It works, but probably not well. But this script will allow you to take a .cdl file and convert it to a .idc file for importing in IDA Pro. However, the script will only MakeCode/MakeData one byte at a time, so for .cdl files that are mostly identified you could be looking at a huge .idc output. You can use this to at least get a fundamental understanding of where unused data can be located in the ROM.
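    One way to keep the .idc output from ballooning is to collapse the per-byte results into runs and emit one IDC loop per run instead of one call per byte. This is only a sketch of that idea and not my actual script: the input is assumed to be a per-offset list of 'code'/'data'/'unknown' strings, and MakeCode/MakeByte are the older IDC names, so newer IDA versions may need the compatibility layer or different calls.

```python
def runs(kinds):
    """Collapse a per-byte list of kinds into (start, end, kind) runs, end exclusive."""
    out = []
    start = 0
    for i in range(1, len(kinds) + 1):
        if i == len(kinds) or kinds[i] != kinds[start]:
            out.append((start, i, kinds[start]))
            start = i
    return out


def to_idc(kinds):
    """Build IDC script text with one loop per run instead of one call per byte."""
    calls = {"code": "MakeCode", "data": "MakeByte"}  # older IDC function names
    lines = ["#include <idc.idc>", "static main() {", "    auto ea;"]
    for start, end, kind in runs(kinds):
        call = calls.get(kind)
        if call is None:
            continue  # leave untouched bytes for IDA (or you) to sort out
        lines.append(
            f"    for (ea = 0x{start:X}; ea < 0x{end:X}; ea = ea + 1) {call}(ea);"
        )
    lines.append("}")
    return "\n".join(lines) + "\n"


kinds = ["code", "code", "data", "unknown", "data"]
print(to_idc(kinds))
```

    IDA still gets the same per-byte calls, but the .idc grows with the number of runs rather than the size of the ROM.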

    Over the years I played through a few games using the CDL just to get an idea of how much of a ROM is actually used. I haven't bothered to look at some of the unidentified data in most of these, but there's a good chance that most of the games (besides the prototypes) avoided wasting space on the cart as much as possible. You can check out the games I've gone through below:

    Mega Drive:
    Aladdin (Prototype) (June 27th, 1993)
    • I believe I finished as much as I could for this one. I can't really remember.
    Bare Knuckle II (Beta)
    • I tried to access as much data that can be accessed in game.
    Captain Lang (Early Prototype)
    • I believe I finished as much as I could for this one. I can't really remember.
    Castle of Illusion Starring Mickey Mouse
    • I think this one is complete. I know it's not as complete as the Japanese one.
    Ex-Ranza
    • I believe I finished as much as I could for this one. I believe only half of the ROM is used.
    Golden Axe (W) (REV00)
    • I believe I finished as much as I could for this one. I can't really remember.
    Golden Axe III (J)
    • I believe I finished as much as I could for this one. If I recall, this is very close to being complete besides the audio samples mentioned earlier and maybe a few scenarios that are rare and hard to find.
    I Love Mickey Mouse - Fushigi no Oshiro Dai Bouken
    • Complete. Played the game with all difficulties.
    Juu-Ou-Ki (Altered Beast)
    • Complete. Played the game with all difficulties.
    Michael Jackson's Moonwalker (W) (REV00)
    • Complete. Played the game with all difficulties.
    Mickey Mania (Prototype, not the HPZ one)
    • Almost complete I think. The level select is inaccessible so the final stage can't be loaded. I can't remember if I loaded the secret stage or not.
    Ninja Gaiden (Beta)
    • Complete. There's a lot of unused data in this one.
    OutRun
    • Complete I think. Played the game with all difficulties.
    Pulseman
    • Complete.
    Quack Shot Starring Donald Duck (W) (REV00)
    • Complete. Played the game in its entirety twice in both English and Japanese.
    The Ren and Stimpy Show - Stimpy's Invention (Beta)
    • Completed as much as I could play.
    Revenge of Shinobi (Beta - Smash Pack)
    • Completed as much as I could play. There's extra data for sure, see the secret mini game I discovered a year or so ago.
    Ristar (Jul 1, 1994 prototype)
    • Very near complete I believe. I don't remember if I played the entire game with every difficulty or explored every case scenario, but it's almost there.
    Sega Channel Demo Cartridge #4 (2-16-94)
    • Complete.
    Sonic 3C (Prototype 0408 - Apr 08, 1994 prototype)
    • Complete, or at least close to it. Expect near completion for all Sonic games due to the collision array. I played through every zone as Sonic, Tails, and Knuckles. Went through every special stage and super emerald stage. Went through every sound test value, level select, and debug mode entry/placement (even in 2P). Used Super Sonic/Tails/Knuckles. Used debug mode to view all available tiles for each character (Super Sonic, or in fact any super form, crashes Bizhawk with certain corrupt tiles/mappings). Went through every menu in game. The only thing I did not attempt to do was get a perfect score by collecting every ring. I did get a perfect score in some of the special stages though. Only about 64% of the ROM is used, either way. What else does this ROM have!?
    Sonic the Hedgehog 2 (Beta 4)
    • Complete, or at least close to it. Expect near completion for all Sonic games due to the collision array. I played through every zone as Sonic/Tails. Went through every special stage. Went through every sound test value, level select, and debug mode entry/placement. Can't recall if I tried to hack Super Sonic. Went through every menu in game. The only thing I did not attempt to do was get a perfect score by collecting every ring.
    Sonic the Hedgehog 2 (Nick Arcade)
    • Complete, or at least close to it. Expect near completion for all Sonic games due to the collision array. I played through every zone as Sonic/Tails. Went through every sound test value, level select, and debug mode entry/placement. Went through every menu in game. The only thing I did not attempt to do was get a perfect score by collecting every ring. Only about half of the ROM is actually used, thanks to all the symbol/assembler trash in the ROM. What else is in here?
    Sonic the Hedgehog 2 (Simon Wai)
    • Complete, or at least close to it. Expect near completion for all Sonic games due to the collision array. I played through every zone as Sonic/Tails. Went through every sound test value, level select, and debug mode entry/placement. Went through every menu in game. The only thing I did not attempt to do was get a perfect score by collecting every ring. Only a little more than half of the ROM is actually used, thanks to all the leftover garbage in the ROM. What else is in here that might've been missed?
    Sonic the Hedgehog 3
    • Complete, or at least close to it. Expect near completion for all Sonic games due to the collision array. I played through every zone as Sonic/Tails. Went through every sound test value, level select, and debug mode entry/placement (even 2P). Went through every menu in game. The only thing I did not attempt to do was get a perfect score by collecting every ring. A little over 80% of the ROM is used. What else is in here?
    Sonic 3D Blast (Prototype 73, Jul 03, 1996)
    • I believe I finished as much as I could for this one. I can't really remember. This game loads the level tiles as you walk through the stage so there might be some parts of the maps that are unidentified. I think there's still some stuff left that needs to be explored, but not much. They really stripped the unused data out of these early prototypes.
    Street Fighter II' Turbo (Beta)
    • I believe I finished as much as I could for this one. I can't really remember.
    Streets of Rage (W) (REV00)
    • I believe this is complete.
    The Super Shinobi II (Early prototype)
    • I believe I finished as much as I could for this one. I can't really remember. Definitely lots of unused data in here.
    World of Illusion Starring Mickey Mouse & Donald Duck (Prototype)
    • I believe I finished as much as I could for this one. I can't really remember. I recall a lot of unused data. I posted about some of this over at TCRF.

    SNES:
    Earthworm Jim 2 (Beta)
    • I believe I finished as much as I could for this one. I can't really remember. Most of the ROM is used.
    Earthworm Jim 2
    • I believe I finished as much as I could for this one. I can't really remember.
    Lion King (Early prototype)
    • I believe I finished as much as I could for this one. This game is very broken and unfinished so there's unused stuff aplenty.

    Master System:
    Castle of Illusion Starring Mickey Mouse (Beta)
    • Completed. This is that auto demo that was released a few years back. Fun fact, this was used at SCES 1990 to showcase the Master System version almost a whole year before its release. The Mega Drive version was at the show too, in a nice early state. Some shots of both are in the SCES 1990 issue of EGM.
    Castle of Illusion Starring Mickey Mouse
    • Complete
    Sonic Chaos
    • Complete, or very close to it, due to the collision array. I played through the entire game twice as Tails and Sonic, got the bad ending as Sonic, and played through every special stage and got the good ending with Sonic. Got to all the secret menus and accessed every level both with the level select and by playing the game. Every sound in the sound test was played. Even got Sonic's hadouken to work.

    Game Gear:
    Sonic the Hedgehog (Beta)
    • Not even close to being complete. Only the first zone and act is playable, but data for all the other stages still exists in the ROM. It might be very close to the final version sans a few differences, but I'm surprised no one's bothered to look at this one closer.

    That's all I made that I could find. But I'm interested to see if someone can improve the conversion process or find something with the code data logger in their own favorite games.
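    If you just want a quick look at what a log missed without going through IDA at all, something like the sketch below would do. It assumes you already have one flag byte per ROM byte with the Bizhawk headers stripped, treats a zero flag as "never touched", and the min_len cutoff is an arbitrary number I picked to skip tiny gaps like padding between routines.

```python
def unlogged_ranges(flags, min_len=16):
    """Return (start, end) spans, end exclusive, where no flag was ever set."""
    ranges = []
    start = None
    for offset, flag in enumerate(flags):
        if flag == 0:
            if start is None:
                start = offset  # a new untouched span begins here
        elif start is not None:
            if offset - start >= min_len:
                ranges.append((start, offset))
            start = None
    # Close a span that runs to the end of the file.
    if start is not None and len(flags) - start >= min_len:
        ranges.append((start, len(flags)))
    return ranges


def percent_used(flags):
    """Percentage of bytes that had at least one flag set."""
    if not flags:
        return 0.0
    used = sum(1 for f in flags if f)
    return 100.0 * used / len(flags)


flags = bytes([1] * 4 + [0] * 8 + [2] * 4)
for start, end in unlogged_ranges(flags, min_len=4):
    print(f"${start:06X}-${end - 1:06X} ({end - start} bytes untouched)")
print(f"{percent_used(flags):.1f}% of the ROM was touched")
```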