Sonic 2 for PC, 3DS, & Wii

Discussion in 'Engineering & Reverse Engineering' started by Clownacy, Nov 24, 2017.

  1. Clownacy

    Clownacy

    Tech Member
    782
    7
    18
    Contest page copypaste:
    Overview
    You might consider the TaxStealth mobile version of Sonic 2 to be a port, but I'd beg to differ: that version was made by recreating Sonic 2 on top of the Retro Engine - while it may look and feel like the original, under the hood that is simply not the case. In that regard, it's about as much of a port as Sonic 2 HD is.

    With that out of the way, this is a port that was done by manually translating the original game's 68000 assembly code to C, and adapting it for PC and New 3DS. A similar thing was done by Sega back in the 90s to produce their PC ports of Sonic CD and Sonic 3 & Knuckles.

    Note that this is an extremely early proof-of-concept: there are barely any objects, there's no Tails, and Sonic cannot die. The only level currently available is Aquatic Ruin Zone Act 1. Debug Mode is enabled by default, so at least you won't get stuck anywhere.

    That said, to showcase the possibilities of this port, I've added some enhancements:
    • The game is in widescreen (in fact, the whole screen's been extended, from 320x224 to 400x240) - this is not possible on the Mega Drive for reasons that should be obvious.
    • Aquatic Ruin Zone's underwater section has an added ripple effect (actually ported from Sonic 3's Angel Island Zone) - while a ripple effect was in Sonic 1, it was removed from Sonic 2, likely for performance reasons.
    • The music has been replaced with the original demo tracks by Masato Nakamura - the original game didn't use these, primarily because of the massive amount of cartridge space they would take.

    Please note that this port is meant to be as accurate to the original game as possible (barring the above contest-only enhancements), so you may encounter a number of vanilla bugs, such as Sonic having trouble jumping out of shallow water and Debug Mode causing him to behave like he's underwater even when he isn't. These are intentional.

    Controls
    Keyboard:
    • WASD - Movement
    • O - A
    • P - B (Debug Mode)
    • [ - C

    PC gamepad:
    • D-pad / Left Stick - Movement
    • A / Cross - A
    • X / Square - A
    • Y / Triangle - B (Debug Mode)
    • B / Circle - C

    New 3DS:
    • D-pad / Circle Pad - Movement
    • B - A
    • Y - A
    • X - B (Debug Mode)
    • A - C

    Platform Support
    The PC port is Windows-only. It's 32-bit, and relies on SDL2, so it should be fairly compatible.

    The New 3DS port will not run on the original 3DS (XL), or the original 2DS. Also, because it's a homebrew CIA file, it will need to be installed with FBI, which requires your New 3DS have Custom Firmware. For the audio to work, you'll need to dump your DSP.

    Screenshots:
    [​IMG]
    [​IMG]
    [​IMG]

    Old video from back when this had EHZ:
    http://www.youtube.com/watch?v=z_SpvR4mUiQ

    I've been working on and off on a PC port of Sonic 2 for about two years; I originally started it in order to teach myself C and SDL2.

    Back in August, after tiring myself out with Cave Story modding, I finally picked up the project after a year-long hiatus, and put to use what I'd learnt in that time. This included the addition of level drawing, a short-lived rebase on OpenGL, and a port of the Sonic object. This turned the port from a silly little demo program called EGGMANQUEST into something actually resembling a port of Sonic 2.

    What, you think I'm joking?
    [​IMG]

    It started to look like the project was finally coming together, and with the Hacking Contest on the horizon, I thought I'd brush up what I had so far, and let it finally make its debut.

    While cleaning up the demo, I was reminded of my time porting my own game to the 3DS. Figuring it would demonstrate my port's portability (harr harr), and help set it apart from Sonic 2 HD, I began working on a 3DS version, ditching the (albeit crappy) OpenGL backend for a software renderer. Unfortunately, the renderer wasn't as efficient as it could be, limiting it to New 3DSs only.

    ...So why does it say 'Wii' in the thread's title?
    [​IMG]

    Basically, to prove a point, I dropped it on the Wii in an hour or two.

    Unfortunately, this was a few days after the contest deadline, so it couldn't be submitted. It was a really quick-and-dirty port, anyway: it doesn't have sound, and it's in 4:3 (why the hell does the Wii have a 4:3 framebuffer?), so I guess that's not a huge loss.

    I suppose I should say how you use the thing. Extract the contents of the 'Wii' folder to the base of your Wii's SD card, and that should be all for installation. Of course, you have to run it from the Homebrew Channel.

    Controls are...

    Wiimote:
    • D-pad - Movement
    • 1 - A
    • 2 - B (Debug Mode)
    • Minus - C
    • Home - Quit

    Wii Classic Controller/Wii U Pro Controller/Wii U Gamepad:
    • D-pad - Movement
    • B - A
    • Y - A
    • X - B (Debug Mode)
    • A - C
    • Home - Quit

    Download here.

    Also, as a bonus, I dug up two old builds of EGGMANQUEST I had lying around on Dropbox, in case anyone's curious:
    2015 version
    2016 version
     
  2. Krigo

    Krigo

    Robotics;Notes shill Member
    Holy crap.

    This is really impressive. Hands down my favorite pick for the SHC this year. I didn't see this coming at all.
     
  3. EnderWaffle

    EnderWaffle

    Ghostly Friend Member
    Just tested the Wii version, and I gotta say, this seems pretty good.
    Now if only I could get my 11.6.39 New 3DS homebrew'd, then I'd have the good version of a handheld Sonic 2. Yeah, I said it. I'm not too fond of Sonic 2 SMS.
     
  4. Clownacy

    Clownacy

    Tech Member
    Code (Text):
    - threading (fuck no)
    - proctex (lolwut)
    - linked list dirty rendering (wat)
    - Wii audio
    - SonLVL integration (external collision arrays)
    - More enemies
    - AND OR blitting (nope...)
    - Old 3DS Frameskip (pfft)
    - New plan: move the colouriser to the second core, optimise the hell out of the blitter, and Bob's your uncle
    - Blit different colours to different framebuffers, and render them with vertex colours? (nope)
    - Extra new plan: switch to tile-based drawing, and merge the colouring stage with the blitting stage, to reduce palette line calculations
    - Alright, let's be more clear: the current renderer is tile-scanline-based. It's simple, but slow as ass.
      My idea is to make the renderer emulate the VDP much more closely: by emulating planes, I can render
      entire tiles instead of scanlines. This allows me to cache palette line calculations for all 64 pixels.
      Also, by directly emulating planes, I can simply port the original game's level drawer, so that's cool.
      Anyway, as I said above, I could potentially merge the colouring stage with the blitting stage if I do
      so, though I'm not sure if that would be good for performance. I'm also not sure how I'd make priorities work.

    Attempt 9000000:
    So we start off with the two planes. We parse them, and put the high-priority ones into a list for later [NOPE].
    The low background doesn't need an alpha test: just set the first colour in every palette line to the
    background colour, and memcpy the pixels. The other layers need an alpha test.

    We'll be using a 'flooded buffer' for the indexed bitmap: room will be given so that entire tile-lines can be
    written, even if they go outside the bounds of the screen. This will allow us to do the AND OR trick with
    longwords, since tiles are 8 pixels/bytes wide. [old ARM doesn't like unaligned writes, so no]

    Then it will be the colourising stage that clips the flooded buffer, and rotates the framebuffer.

    Also, add a mode for the foreground, so it doesn't do the per-scanline scrolling stuff. [turns out I can't]
    This is a snippet of my to-do list for this port. For the last year I've been trying to figure out how to make this fast enough to run on the Old 3DS. As you can probably see, it didn't go well.
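
    For anyone wondering what the 'AND OR trick' in those notes refers to, here is a minimal sketch, assuming it means the classic masked blit: AND the destination with a transparency mask, then OR in the source. The function names, and the 8bpp layout with colour 0 as transparent, are assumptions made for illustration and not the port's actual code. Since the notes say old ARM doesn't like unaligned writes, the sketch goes through memcpy instead of casting pointers; whether a given compiler turns that into fast word accesses on the 3DS is not something this guarantees.

```c
#include <stdint.h>
#include <string.h>

/* Build the mask for one 8-pixel tile row (hypothetical helper):
   0xFF where the source pixel is transparent (colour 0), 0x00 elsewhere. */
static void make_row_mask(uint8_t *mask, const uint8_t *src)
{
    for (int i = 0; i < 8; ++i)
        mask[i] = (src[i] == 0) ? 0xFF : 0x00;
}

/* Blit one 8-pixel row of 8bpp indexed pixels as two 32-bit longwords.
   AND keeps the destination where the source is transparent,
   OR then drops in the opaque source pixels. */
static void blit_row_masked(uint8_t *dst, const uint8_t *src, const uint8_t *mask)
{
    uint32_t d[2], s[2], m[2];

    /* memcpy sidesteps unaligned 32-bit pointer accesses */
    memcpy(d, dst, 8);
    memcpy(s, src, 8);
    memcpy(m, mask, 8);

    d[0] = (d[0] & m[0]) | s[0];
    d[1] = (d[1] & m[1]) | s[1];

    memcpy(dst, d, 8);
}
```

    In practice the masks would be built once per tile rather than per blit, so the inner loop is just the two AND/OR pairs.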

    The 3DS is useless. For starters, its GPU doesn't support fragment shaders, so the one part of the drawing process I could use hardware-acceleration for - applying the palette - I now can't. So I'm stuck doing everything in software... except the CPU is slow as mud: a 268MHz ARMv6 with only one actually-usable core. Great. And to add insult to injury, the 3DS's framebuffer? It's sideways. So on top of blitting and colouring in software, I also have to rotate the framebuffer, which takes a good 20% off my cycle budget.
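
    To make that cost concrete, here is a rough sketch of a combined colour-and-rotate pass. It assumes an 8-bit indexed source image, a 256-entry table of 16-bit colours, and a framebuffer stored column by column with each column running bottom to top. The function name and the exact layout are illustrative assumptions, not the port's real code.

```c
#include <stdint.h>
#include <stddef.h>

#define SCREEN_W 400
#define SCREEN_H 240

/* Convert an indexed image to 16-bit colour while rotating it.
   Assumed layout: screen pixel (x, y) is stored at
   framebuffer[x * SCREEN_H + (SCREEN_H - 1 - y)]. */
static void colourise_and_rotate(uint16_t *framebuffer,
                                 const uint8_t *indexed,
                                 const uint16_t *palette)
{
    for (size_t x = 0; x < SCREEN_W; ++x)
    {
        /* Each screen column is one contiguous run in the framebuffer */
        uint16_t *column = &framebuffer[x * SCREEN_H];

        for (size_t y = 0; y < SCREEN_H; ++y)
            column[SCREEN_H - 1 - y] = palette[indexed[y * SCREEN_W + x]];
    }
}
```

    Writes are sequential here but reads stride through the source a whole row at a time, which is part of why a rotation like this is unfriendly to the cache.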

    For the past year I've been plodding away, trying every possible optimisation I could think of, but it wouldn't help. No simple optimisation could make up for the fact that the software renderer alone was using 300% of CPU time. Was it a lost cause? Surely it had to be possible: I mean, Sonic 1/2 3D run on Old 3DSs, and they're emulators. Not to mention they have to draw two screens per frame.

    So I started planning a complete rewrite of the renderer, this time emulating the VDP much more closely. As ironic as it sounds, emulating the VDP's planes is faster than skipping them entirely, so building a renderer around those could be just the breakthrough the port needed.
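
    As a rough idea of what drawing whole tiles looks like, here is a minimal sketch of rendering one 8x8 tile from a plane entry into an 8bpp indexed buffer. It assumes the standard Mega Drive formats (16-bit nametable entries, 32-byte 4bpp tiles with the left pixel in the high nibble); the function name and the destination buffer are hypothetical, and priority handling is left out entirely. The point is that the palette line is worked out once and reused for all 64 pixels.

```c
#include <stdint.h>

/* Draw one 8x8 tile. 'entry' is a nametable word:
   bit 15 priority (ignored here), bits 14-13 palette line,
   bit 12 V-flip, bit 11 H-flip, bits 10-0 tile index.
   'pitch' is the width of the destination buffer in bytes. */
static void draw_tile(uint8_t *dst, int pitch, const uint8_t *vram, uint16_t entry)
{
    const uint8_t *tile = &vram[(entry & 0x7FF) * 32];   /* 32 bytes per tile */
    const int palette_base = ((entry >> 13) & 3) * 16;   /* computed once per tile */
    const int hflip = (entry >> 11) & 1;
    const int vflip = (entry >> 12) & 1;

    for (int y = 0; y < 8; ++y)
    {
        const uint8_t *row = &tile[(vflip ? 7 - y : y) * 4]; /* 4 bytes per row */
        uint8_t *out = &dst[y * pitch];

        for (int x = 0; x < 8; ++x)
        {
            const int sx = hflip ? 7 - x : x;
            const uint8_t byte = row[sx / 2];
            const uint8_t pixel = (sx & 1) ? (byte & 0xF) : (byte >> 4);

            if (pixel != 0) /* colour 0 is transparent */
                out[x] = (uint8_t)(palette_base + pixel);
        }
    }
}
```

    Walking a plane is then a loop over its nametable entries calling this once per tile, instead of re-deriving the palette line for every 8-pixel scanline segment.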

    But something like this would take ages to write, and I've been burned before, spending days working on a fancy new drawing method, hoping to at least save some performance, only to have it thrown back in my face by making everything slower. I didn't want to try this until I had time to burn, and was sure it would work.

    So months passed, until finally August rolled around. I had some spare time, and my notes were all fresh in my memory, so I got to work. While the original software-renderer was developed on PC, and merely ported to the 3DS later, this one was running on there from its earliest test builds.

    Results were promising: a single plane took only a fraction of the CPU. Things got more complicated when I added sprites and the second plane, but I was still scraping by at about 90% of the CPU's threshold.

    Then I plugged it into the port. The game ran at half-speed.

    No, it wasn't the game itself using up the rest of the CPU (the engine only takes 1%), it was the compiler. Somehow, between the test program and the actual game, the compiler was giving up on whatever optimisations gave the test an edge. So the colour/rotate stage for instance went from 32% CPU time to 40%. When the test was already running at 90%, that's bad.

    After that, I was pretty much ready to give up on the 3DS. I've been stuck trying to get this working for a year, for crying out loud. But after some more optimisations - some good, others hackish abominations that should be accompanied by animal sacrifice - things started to look up: I found a faster way to draw sprites by using an incrementing pointer, I experimented with compiler flags, and even found a crazy way to optimise plane blits while also obeying priority with a 16KB LUT.

    All in all, if my checks are accurate, the newest build scrapes by at around 85%!

    So why go through this much effort to support some handheld from 2011? Well... why not? This is a port of a game from 1992 - it should be able to run on the 3DS. So I figure it makes a good benchmark: if it can't run on a 3DS, I'm doing something wrong.

    And I think it paid off: if I wasn't targeting the 3DS, the port would still be using its awful original software renderer. Heck, it would probably still be using its old OpenGL backend, calling multiple functions and sending numerous floats to the GPU just to draw 8 pixels of a scanline, when the software renderer can do the same thing by just copying two long ints.

    And if the 3DS isn't your thing? These optimisations benefit the other builds too: back when I released the SHC 2017 demo, I was using my desktop PC. Way more of a powerhouse than the flimsy laptop I use nowadays. And that demo? It doesn't run too great on this laptop. Now, with these optimisations (and way better use of SDL2's API...), the port uses only 15% of the CPU.

    With this nonsense finally behind me, I can move onto actually porting stuff again. But before that, I figured I'd clean a few things up, and make another release:

    As far as the game itself goes, nothing's really changed. It's still an empty ARZ Act 1. Still, a few bugs and inaccuracies have been fixed, and I've added pausing, so there's that.

    The Windows build, as well as being faster, now lets you resize the window. You can also enter fullscreen by pressing F1.

    As for the Wii build, I figured out how to get a vaguely 16:9 framebuffer (432x240), so that port's finally joined the widescreen club. Still no audio though.

    If you're interested, here are the download links:
    Windows
    3DS
    Wii

    On the 3DS, you can install straight from FBI using this QR code (select the 'Remote Install' option at the bottom):

    [​IMG]

    Installation instructions and controls are the same as before, except that the 1 & 2 buttons on the Wii have been swapped.
     
  5. Graxer

    Graxer

    Member
    This is a really cool project. It would be awesome to see the original code ported to more modern hardware.
     
  6. Fred

    Fred

    Formerly known as 'Neo' Oldbie
    1,483
    42
    28
    Portugal
    Sonic 3 Unlocked
    For the record, I really appreciate the writeup. Fascinating read.
     
  7. sonicblur

    sonicblur

    Oldbie
    1,320
    8
    18
    You complain about not having fragment shaders, yet you are drawing directly to the frame buffer so you wouldn't be using them in the first place. The 3DS hardware fully supports OpenGL ES 1.1, so why are you rotating the image by hand instead of using hardware for that? You just bind your generated image to a texture and draw it on a quadrilateral that fills the entire screen. This is what most of the emulators are doing. Rotation and scaling are free then. The rest of your post never indicates that you stopped drawing directly to the framebuffer, so I assume this to be the case.

    If you're looking for a free performance boost, you may want to consider trying that.
     
  8. Clownacy

    Clownacy

    Tech Member
    I have thought about it before, but the overhead of uploading a 400x240 texture every frame scares me off from trying. The 3DS's GPU uses some weird tiled format internally, that all textures have to be converted to. I'd have to convert the image at runtime... just like I already am. With fragment shaders, I'd at least be saving the cycles the colour stage would be using, so whether or not switching to a texture would make things faster wouldn't be a problem.
     
  9. Lapper

    Lapper

    Member
    1,546
    22
    18
    England
    Sonic Studio, Kyle & Lucy: WW, Freedom Planet
    Awesome, only just discovered. I love this sort of thing; it's what I'd be doing if I was smarter. :colbert:


    It's furiously accurate; however, there's definitely more trouble with jumping out of the water than in the original.
     
  10. Clownacy

    Clownacy

    Tech Member
    You're sure it's not just this bug? There's one area in the level where you're basically guaranteed to run into that bug, but the original game has a monitor there, so most people don't notice.

    EDIT: Oh wait, I double-checked the original game and you're right. Turns out I accidentally inverted a check. Thanks for pointing that out.

    EDIT2: Uploaded fixed builds.
     
  11. sonicblur

    sonicblur

    Oldbie
    Yikes, I wasn't aware of that weirdness. It's a shame the 2D hardware from DS mode isn't accessible in 3DS mode.

    Looking at what's out there, it sounds like gnu.hpp has a textureIndex() function created specifically for updating pixels inside a mapped texture. But that's a lot of math to be doing. Given the 3DS has much more memory than the original system, you should have enough RAM to avoid doing the excess math though. A pre-calculated 240x400 lookup table using 32-bit indexes into the destination texture is only 375KB. (You only need 24 bit, but of course that's not aligned) If you store your lookup table in the same pixel order as your video buffer then you can just loop and do a linear copy without having to do a bunch of calculations every frame.
    texture[lookupTable[i]] = input[i]

    Although at that point you're still copying every single pixel, so I guess there's no point in using the 3D hardware to rotate for you. That being said, you could use the same strategy for your frame buffer rotation if you're not already.
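
    A minimal sketch of that lookup-table idea applied to plain framebuffer rotation, assuming a 400x240 source in normal row order and a destination stored column by column, bottom to top. The names and the layout are assumptions for illustration; a table for a tiled texture format would be filled with different indices but used the same way.

```c
#include <stdint.h>
#include <stddef.h>

#define SCREEN_W 400
#define SCREEN_H 240
#define PIXELS (SCREEN_W * SCREEN_H)

/* 96,000 entries * 4 bytes = 384,000 bytes, i.e. 375KB */
static uint32_t rotate_lut[PIXELS];

/* Fill the table once at start-up. Entries are stored in source order,
   so entry i says where source pixel i lands in the destination. */
static void build_rotate_lut(void)
{
    size_t i = 0;

    for (size_t y = 0; y < SCREEN_H; ++y)
        for (size_t x = 0; x < SCREEN_W; ++x)
            rotate_lut[i++] = (uint32_t)(x * SCREEN_H + (SCREEN_H - 1 - y));
}

/* Per-frame copy: a single linear loop with no index maths */
static void copy_rotated(uint16_t *dst, const uint16_t *src)
{
    for (size_t i = 0; i < PIXELS; ++i)
        dst[rotate_lut[i]] = src[i];
}
```

    This trades per-pixel arithmetic for an extra 4-byte read per pixel, so whether it ends up faster depends on how the table behaves in the cache.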
     
  12. Clownacy

    Clownacy

    Tech Member
    Funny you mention that. I had that same idea about a rotation LUT a few days ago. It's been sitting in my todo list until I'm not so burned-out.