
Utility ClownMDEmu - The Greatest Mega Drive Emulator Ever (Someday)

Discussion in 'Technical Discussion' started by Clownacy, Jun 23, 2022.

  1. Clownacy

    Clownacy

    Tech Member
    1,160
    843
    93
    Thanks for reporting this. Oh boy is this one complicated:

    The YM2612's low-volume distortion quirk means that the audio output level is never actually 0, even during silence. This does not normally cause any audio to be produced, however, because sound is created by the audio level changing, not just by being non-zero. So why does this produce an audible noise in the emulator? The audio resampler is to blame: it expects to average a fixed number of samples (taps), but occasionally receives one fewer than usual. The normalisation, which assumes the usual number of taps, then over-corrects, attenuating the audio level slightly. Since sound is created by the audio level changing, this occasional attenuation creates a sound!

    To fix this, I will either have to make it so that the number of taps is always consistent or that the average always accounts for the number of taps.
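
    As a rough illustration of the second option, here is a minimal sketch that normalises by the number of taps actually received. It assumes a plain average in place of the real resampler's weighted kernel, and the function names are hypothetical rather than taken from the emulator:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: averages however many taps were actually received,
   so a constant input level always produces the same output level. */
static double AverageTaps(const double *taps, size_t tap_count)
{
    double sum = 0.0;
    size_t i;

    assert(tap_count > 0);

    for (i = 0; i < tap_count; ++i)
        sum += taps[i];

    return sum / (double)tap_count;
}

/* The behaviour described above: always dividing by the expected count,
   which attenuates the result whenever a tap is missing. */
static double AverageTapsFixed(const double *taps, size_t tap_count, size_t expected_count)
{
    double sum = 0.0;
    size_t i;

    assert(expected_count > 0);

    for (i = 0; i < tap_count; ++i)
        sum += taps[i];

    return sum / (double)expected_count;
}
```

    With a constant input of 0.25, the first function returns 0.25 whether it is given 7 or 8 taps, whereas the second dips to 0.21875 when only 7 of the expected 8 taps arrive. That dip is the kind of level change that becomes audible.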
     
    Last edited: Apr 2, 2025
  2. Clownacy

    Clownacy

    Tech Member
    1,160
    843
    93
    v1.3
    Try it in your web browser: clownmdemu.clownacy.com
    Download: Standalone, libretro

    This update mainly accumulates changes which have trickled in over the last couple of months; there are optimisations, a few bugfixes, and even AppImages for another CPU architecture.

    Performance Improvements
    I had received feedback from a user that the emulator was not running very well on his old laptop, so I set up my Raspberry Pi 3B+ as a low-end benchmark. As expected, the emulator struggled to run at full-speed.

    There were some easy optimisations that could be made, such as making the 68000 bus callbacks cleanly divide the address space into chunks based on the upper three bits, but the two main optimisations were more involved:
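
    To show what dividing the address space by the upper three bits can look like, here is a minimal sketch. It assumes the 68000's 24-bit address bus, so the upper three bits are bits 21 to 23, giving eight 2MiB chunks. The region names follow the commonly documented Mega Drive memory map and are illustrative only; they are not the emulator's actual identifiers:

```c
#include <assert.h>

/* Hypothetical region names, for illustration only. */
typedef enum BusRegion
{
    BUS_REGION_CARTRIDGE,
    BUS_REGION_EXPANSION,
    BUS_REGION_RESERVED,
    BUS_REGION_IO,
    BUS_REGION_VDP,
    BUS_REGION_WORK_RAM
} BusRegion;

/* Selects a region using only bits 21-23 of a 24-bit address,
   so the whole decode is a single shift and a switch. */
static BusRegion DecodeBusRegion(unsigned long address)
{
    /* The 68000 only exposes 24 address bits. */
    assert((address & ~0xFFFFFFUL) == 0);

    switch ((address >> 21) & 7)
    {
        case 0:
        case 1:
            return BUS_REGION_CARTRIDGE; /* 0x000000-0x3FFFFF */
        case 2:
        case 3:
            return BUS_REGION_EXPANSION; /* 0x400000-0x7FFFFF */
        case 4:
            return BUS_REGION_RESERVED;  /* 0x800000-0x9FFFFF */
        case 5:
            return BUS_REGION_IO;        /* 0xA00000-0xBFFFFF */
        case 6:
            return BUS_REGION_VDP;       /* 0xC00000-0xDFFFFF */
        default:
            return BUS_REGION_WORK_RAM;  /* 0xE00000-0xFFFFFF */
    }
}
```

    A compiler can turn a dense switch like this into a jump table, which avoids a chain of range comparisons on every bus access.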

    The first major optimisation was to make the YM2612 emulator only recompute the phase step delta when absolutely necessary, as it is quite a slow process. The emulator featured this optimisation in earlier versions, though it was removed when the low-frequency oscillator was added, as it broke some assumptions that the optimisation relied upon.

    The second major optimisation was to the VDP emulator, particularly its line blitter. When support for the Window Plane was greatly improved, the renderer was made much slower by the requirement to stop rendering a line partway through, to create the seam between the Window Plane and Plane A. However, this requirement can be bypassed by splitting the screen into two halves - one with the Window Plane, and one with Plane A. Each half can be rendered, and then combined afterwards to produce the final image. With this, there is no need to end the rendering of a line prematurely, lightening the renderer by many lines of code.

    Another optimisation was to offload the reference frontend's screen-upscaling logic to SDL3, using its fancy new 'SDL_SCALEMODE_PIXELART' option. This replaces the previous naive method of upscaling to the closest integer multiple using nearest-neighbour filtering, and then downscaling to the target resolution using bilinear filtering. Both methods allow pixel-art to be scaled without making the image either excessively blurry or excessively blocky.
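
    For the older two-step method, the first step needs an integer scale factor. The sketch below is one way to pick it, assuming that 'closest integer multiple' means the smallest multiple that is at least as large as the target (so that the second step only ever downscales). The function name is hypothetical:

```c
#include <assert.h>

/* Returns the smallest integer factor whose upscaled size covers the target size.
   Nearest-neighbour upscaling by this factor, followed by bilinear downscaling
   to the target size, is the two-step method described above. */
static unsigned int IntegerUpscaleFactor(unsigned int source_size, unsigned int target_size)
{
    unsigned int factor;

    assert(source_size > 0);

    /* Integer division rounded up. */
    factor = (target_size + source_size - 1) / source_size;

    /* Never scale below 1x, even for a zero-sized target. */
    return factor == 0 ? 1 : factor;
}
```

    For example, scaling a 224-line image to a 1080-line display gives a factor of 5 (1120 lines), which is then shrunk slightly to 1080.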

    The final optimisation was to overhaul the audio mixer to not perform expensive sinc resampling on all four audio sources individually (FM, PSG, PCM, and CDDA). Instead, all sources are combined together, and resampling is performed on that result. There are some concerns about distortion introduced by upsampling the sources to a common sample rate, but theoretically this is only an issue for the PCM sound source, as the others would only suffer distortion in the inaudible ultrasonic range. For PCM, a special solution to upsample it into the ultrasonic range without distortion may be required. Another upside to this change is that it allows the resampling to be offloaded to SDL3/libretro, completely bypassing the need for the emulator to bundle its own sinc resampler.

    According to Visual Studio's profiler, all of these optimisations combined make the emulator roughly a third faster, with the YM2612, VDP, and mixer optimisations saving roughly 10% of frame-time each.

    AArch64 AppImage Builds
    Since the Raspberry Pi 3B+ is an AArch64 (64-bit ARM) platform, I can use it to produce AppImages that should run on any AArch64 Linux device. Conveniently, support for producing non-x86-64 AppImages was added some time ago by a contributor!

    AppImage is a format used to distribute Linux software in a way that is simple and distro-independent, meaning that it is now easy for Linux users on AArch64 to try out the emulator without the need to build it from source code.

    Option to Disable Rewinding
    Rewinding is a feature that uses a lot of RAM: roughly 500MiB, in fact. This is terrible for low-end platforms like the Raspberry Pi 3B+ (which has only 1024MiB of RAM).

    Previously, while rewinding could be disabled, this could only be done at build-time, making it impossible to change for users that only had access to a pre-built executable. To solve this problem, rewinding can now be disabled at runtime; an option has been added to the options menu to control this. When rewinding is disabled, the RAM it requires is freed, allowing the emulator to use as little as 60MiB (most of which is used by the SDL library rather than the emulator itself). Disabling rewinding should also provide a slight performance boost.

    SRAM Improvements
    When support for SRAM was first added, in v1.0, it was limited to 16KiB, as that was the largest that I had ever heard of a game using. This has now been raised to 64KiB, as homebrew tends to require this.

    In addition, an off-by-one error was fixed which was preventing games that require exactly 16KiB of SRAM from saving.

    DAC Test Bit
    Long ago, it was discovered that the YM2612 has a hidden setting that allows the DAC channel to override all 6 FM channels. While somewhat limited in its uses, it does have the upside of allowing for a form of soft-panning. Since it is a very rudimentary feature, support for it has been added to ClownMDEmu. Its implementation was verified against Nuked OPN2, so it should be accurate to real hardware.

    An example of this feature in use can be found here:



    Low-Volume Distortion Improvement
    The YM2612's 'ladder effect' was previously slightly too quiet; due to me misreading some code in Nuked-OPN2, I had the noise set to 2 volume levels when it should have been 3. This has to do with how the YM2612 multiplexes its audio, with 3 out of 4 'slots' being occupied by noise. Since my emulator does not multiplex its audio (for performance reasons), it instead sums these slots together, meaning that it should be 3 noise values summed with the channel output.

    Adaptive V-Sync
    Recently, I noticed that SDL3 introduced the ability to enable adaptive V-sync, which is similar to regular V-sync except that it gracefully handles lag, allowing the program's speed to reduce gradually instead of delaying by entire frames. Not every platform supports this, so the emulator will fall back on regular V-sync if adaptive is unavailable.

    'inih' Library Removal
    The reference frontend currently saves its settings in an INI file. Since the introduction of this feature, the frontend has used the 'inih' library to read its settings. As of this update, said library has been removed, replaced with a minimal reimplementation. There were several reasons for this:
    • 'inih' is written in C, which was ideal back when the frontend was written in C (or C-style C++), but now that the frontend is written in modern C++, the library's C interface is unnecessarily clunky.
      • 'inih' does provide a C++ interface, but it is grossly overkill for the frontend's needs, and is likely incompatible with the INI files used by the frontend.
    • 'inih' is released under the 3-clause BSD licence, which annoyingly requires that a copy of the licence be distributed with every executable. Writing my own replacement means that I can license it under the same licence as the rest of the frontend, leaving me with one fewer licence to lug-around.
    • 'inih' is designed for embedded platforms, making minimal use of the stack and heap. This previously led to an issue where long file paths were prone to truncation, putting the INI parser into a corrupt state. Making 'inih' take advantage of system resources more freely requires a bunch of messy configuration, and, while its limits can be raised, they can never be completely removed.
    • 'inih' takes hundreds of lines of C code, while a minimal approximation can be written in just a couple dozen lines of modern C++ (it could be made even simpler if C++ had a standard scanf-style function).

    In the future, I would like to switch to the JSON format, though the frontend will still need an INI reader for backwards-compatibility. Ideally, this INI reader would require minimal maintenance and take up very little space; I believe the new INI reader meets these needs perfectly.

    PCM Low-Pass Filter
    In v1.2, accurate low-pass filters for the Mega Drive's FM and PSG sound sources were added, but the Mega CD's PCM was left unfiltered. This has now been corrected, and the PCM audio is filtered with a second-order low-pass filter with a cut-off point of roughly 8kHz.
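
    To give a feel for what a second-order low-pass filter does to a sample stream, here is a minimal sketch built from two cascaded one-pole (RC-style) stages. This is an assumption made for simplicity: the emulator's real filter topology and coefficients are not stated here, and a properly designed biquad would track the nominal cut-off more accurately. All names are hypothetical:

```c
#include <assert.h>

/* Hypothetical filter state: two cascaded one-pole low-pass stages. */
typedef struct LowPassFilter
{
    double coefficient; /* Smoothing factor shared by both stages. */
    double stage1;      /* Output of the first stage. */
    double stage2;      /* Output of the second stage. */
} LowPassFilter;

static void LowPassFilter_Initialise(LowPassFilter *filter, double cutoff_hz, double sample_rate_hz)
{
    /* Discretised RC filter: coefficient = dt / (RC + dt),
       where RC = 1 / (2 * pi * cutoff) and dt = 1 / sample rate. */
    const double pi = 3.14159265358979323846;
    const double omega = 2.0 * pi * cutoff_hz / sample_rate_hz;

    assert(cutoff_hz > 0.0 && sample_rate_hz > 0.0);

    filter->coefficient = omega / (omega + 1.0);
    filter->stage1 = 0.0;
    filter->stage2 = 0.0;
}

static double LowPassFilter_Apply(LowPassFilter *filter, double sample)
{
    /* Each stage moves a fraction of the way towards its input. */
    filter->stage1 += filter->coefficient * (sample - filter->stage1);
    filter->stage2 += filter->coefficient * (filter->stage1 - filter->stage2);
    return filter->stage2;
}
```

    A steady input passes through at full level once the filter settles, while a signal alternating every sample (the highest representable frequency) is strongly attenuated, which is the 'less harsh' effect on high frequencies.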

    As with the other two sound sources, this has the effect of reducing the volume of high-frequency sounds, making the audio less harsh and giving games their intended volume balancing.

    H-Scroll Fix and Optimisation
    A member of the Sonic Retro forum reported that some homebrew of his did not animate correctly in ClownMDEmu:
    [​IMG]

    The fine-scrolling that was intended for the text at the bottom of the screen was instead applied to a row in the middle of the screen. The cause of this issue had to do with the unusual combination of per-tile horizontal scrolling with a doubled vertical resolution.

    Typically, the Mega Drive renders at a resolution of 320x224, but it has a seldom-used feature that allows it to render at twice the vertical resolution. I only know of this feature being used by Sonic the Hedgehog 2 and Combat Cars, both for split-screen multiplayer.
    [​IMG][​IMG]

    Neither of these games uses this in tandem with another feature which allows lines to be scrolled in groups of 8. As can be expected, this mode instead scrolls lines in groups of 16 when the resolution is doubled; however, the code was failing to account for this.

    Whilst addressing this issue, I discovered that I could greatly simplify the relevant code using some basic bit-masking:

    Before
    Code (C):
    static cc_u16f GetHScrollTableOffset(const VDP_State* const state, const cc_u16f scanline, const TileInfo* const tile_info)
    {
        switch (state->hscroll_mode)
        {
            default:
                /* Should never happen. */
                assert(0);
                /* Fallthrough */
            case VDP_HSCROLL_MODE_FULL:
                return 0;

            case VDP_HSCROLL_MODE_INVALID:
                return ((scanline >> state->double_resolution_enabled) % 8) * 4;

            case VDP_HSCROLL_MODE_1CELL:
                return (scanline >> tile_info->height_power << tile_info->height_power) * 4;

            case VDP_HSCROLL_MODE_1LINE:
                return (scanline >> state->double_resolution_enabled) * 4;
        }
    }
    After
    Code (C):
    static cc_u16f GetHScrollTableOffset(const VDP_State* const state, const cc_u16f scanline)
    {
        static const cc_u8l masks[4] = {0x00, 0x07, 0xF8, 0xFF};
        return ((scanline >> state->double_resolution_enabled) & masks[state->hscroll_mode]) * 4;
    }
    These bit-masks help to illustrate how the invalid H-scroll mode works, as it simply uses the inverse of the per-cell mode's mask (just as the whole-screen mode uses the inverse of the per-line mode's mask).
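
    The effect of each mask is easiest to see with concrete numbers. The following is a standalone sketch of the same masking logic, using plain C types and hypothetical mode names in place of the emulator's own (the mode values are assumed to index the mask table in the order shown above):

```c
#include <assert.h>

/* Hypothetical stand-ins for the emulator's H-scroll mode values. */
enum
{
    HSCROLL_MODE_FULL = 0,    /* Whole screen: every line uses entry 0. */
    HSCROLL_MODE_INVALID = 1, /* Only the low 3 bits of the line survive. */
    HSCROLL_MODE_1CELL = 2,   /* Line rounded down to a multiple of 8. */
    HSCROLL_MODE_1LINE = 3    /* Every line has its own entry. */
};

static unsigned int HScrollTableOffset(unsigned int mode, unsigned int scanline, unsigned int double_resolution)
{
    static const unsigned char masks[4] = {0x00, 0x07, 0xF8, 0xFF};

    assert(mode < 4);

    /* In double-resolution mode, halve the scanline first, so that
       per-cell scrolling covers groups of 16 lines instead of 8. */
    return ((scanline >> double_resolution) & masks[mode]) * 4;
}
```

    For scanline 13 at normal resolution, the four modes give offsets of 0, 20, 32, and 52 respectively. Scanlines 26 and 27 at double resolution give the same results, since both are halved to 13 before masking.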

    Closing
    Really, I need to look into creating a build-bot for this project, as compiling the many executables and libretro cores takes far too much time! Not only are there so many different builds, but they are built on different platforms with different operating systems!

    The current sprawling mess of builds is as follows...
    • The latest Arch Linux (run on my laptop).
      • i686 Windows EXE.
      • Emscripten build.
      • i686 Windows libretro DLL.
      • x86_64 Windows libretro DLL.
    • The oldest supported Ubuntu, for compatibility with LTS Linux distros, as required by appimage.github.io (run in a VM on Arch Linux).
      • x86_64 Linux AppImage.
      • x86_64 Linux libretro SO.
    • The oldest supported Raspberry Pi OS Legacy, also for compatibility with LTS Linux distros (run on my Raspberry Pi 3B+).
      • AArch64 Linux AppImage.
      • AArch64 Linux libretro SO.

    While getting all of these to be built on just one platform would be ideal, at the very least it would be good to automate the process so that I do not have to produce each build manually. Not only do the builds have to be made, but they also need to be named, stripped, and, in the Emscripten port's case, uploaded to a server. Then there are these blog posts, which need to be written and then converted for posting on various forums. Altogether, the process of releasing an update for this emulator takes hours!
     
    Last edited: May 10, 2025
  3. Clownacy

    Clownacy

    Tech Member
    1,160
    843
    93
    v1.3.0.1
    Try it in your web browser: clownmdemu.clownacy.com
    Download: Standalone, libretro

    Just a hot-fix to patch an issue where out-of-bounds array accesses would occur when a game sets its window plane horizontal boundary past the edge of the screen.

    While I was at it, I optimised the VDP renderer some more by drawing Plane B to the scan-line all at once, instead of it being split across the window plane boundary.
     
  4. rata

    rata

    Member
    709
    87
    28
    Argentina
    Trying to be useful somehow.
    Billion dollar companies: make a hotfix in a menu after one week.

    Clownacy: makes a hotfix and optimises the renderer while he was at it.
     
  5. sics

    sics

    Member
    Hello, I wanted to express my gratitude for the recent changes you've implemented. What you're achieving with this emulator is truly remarkable. Regarding the audio experience, it has noticeably improved. However, I feel it's necessary to point out that the current maximum volume might be considered somewhat low compared to other emulators, something I hadn't noticed until now, as I've been using the amplified speaker output.

    Secondly, I'd like to take this opportunity to mention that the antipiracy screen in "Sonic the Hedgehog: SAGE 2010 EDITION" is activated, although I imagine that must be an issue with the hack itself.
     
  6. Clownacy

    Clownacy

    Tech Member
    1,160
    843
    93
    The low volume has to do with the combined Mega Drive and Mega CD having 4 audio sources, leaving the FM with only a fraction of the volume range. If the FM were given the full volume range, then it would be possible for the other audio sources to cause the combined audio to peak, causing distortion. I do not know how peaking works on a real Mega Drive, as I imagine that it occurs in the analogue domain, which I am extremely unfamiliar with.
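
    The trade-off can be sketched as follows, assuming signed 16-bit samples and an equal share of the range for each of the four sources. The emulator's real weighting is not stated here, and the function names are hypothetical:

```c
#include <assert.h>

/* Gives each source a quarter of the output range: the mix can never
   peak, but a single source on its own is four times quieter. */
static short MixWithHeadroom(short fm, short psg, short pcm, short cdda)
{
    /* Sum in a wider type so that the addition itself cannot overflow. */
    const long mixed = ((long)fm + psg + pcm + cdda) / 4;

    assert(mixed >= -32768L && mixed <= 32767L);

    return (short)mixed;
}

/* Gives each source the full range instead: a single source is at full
   volume, but loud passages hit the limits and are clipped (distorted). */
static short MixWithClamping(short fm, short psg, short pcm, short cdda)
{
    long mixed = (long)fm + psg + pcm + cdda;

    if (mixed > 32767L)
        mixed = 32767L;
    else if (mixed < -32768L)
        mixed = -32768L;

    return (short)mixed;
}
```

    With only the FM source at full scale (32767), the first function outputs 8191 while the second outputs 32767; with two sources at full scale, the second function flattens the peak at 32767.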
     
  7. Chibisteven

    Chibisteven

    Member
    1,384
    41
    28
    On real hardware, the Sega Genesis and the Sega CD both use solid-state amplifiers that clip when stuff gets too loud and slightly distort when frequencies go into the ultrasonic range (intermodulation distortion). When running a model 1 Sega Genesis through a model 1 Sega CD, the Genesis gets clipped at half its total volume capability, and higher-frequency content is subject to even more intermodulation distortion issues than the Sega Genesis itself has. The Sega CD audio as a whole is roughly half the volume level on its side. Running a Sega CD through the Sega Genesis introduces some very minor treble loss (i.e. high-frequency roll-off). As far as the 32X goes, it always goes through the Sega Genesis, and that track called "Tachy Touch" from Chaotix always clips regardless of the volume level of the Sega Genesis. The A/V out is more prone to distortion and clipping than the headphone out; it is actually similar to running audio through a Sega CD, but in mono, at twice the volume, and without the Sega CD's added intermodulation distortion. The RF out has both a loss of bass (low frequencies) and treble (high frequencies).
     
  8. Clownacy

    Clownacy

    Tech Member
    1,160
    843
    93
    So the Mega Drive audio is clipped to half of its usual range, and not just attenuated, meaning that the upper half of the volume range - which usually sounds fine - is now practically lost as the signal can never go above halfway, meanwhile the Mega CD audio naturally outputs at half of the volume range by default, without any clipping?
     
  9. Chibisteven

    Chibisteven

    Member
    1,384
    41
    28
    How it works on both my model 1 Sega CDs is the upper -8 to -6 dB range of the Genesis gets clipped or distorted by the Sega CD. And it can be bad and very audible. Peaks can't exceed beyond this point.

    Anything that plays on the Sega CD side of the setup, such as CDs or Sega PCM, is around -6 dB lower. Clipping is possible but usually doesn't happen, and if it does it's usually inaudible.

    On real hardware, it's also possible to connect the audio outputs of the Sega Genesis and Sega CD separately to an external mixer, bypassing those issues entirely. The official methods route audio through one or the other which, as you can see from my previous post, has downsides for audio quality depending on which one you go with and what games you're playing. The volume of the Genesis side can also be adjusted when routing through the Sega CD; this doesn't fix everything, but it can improve some things. The manual recommends a level of 7-8 for the Genesis when using the mixing cable, but this can be too loud for some Genesis games through the Sega CD's outputs. A level of 10 will be the worst of the A/V out and the worst of the Sega CD at the same time.

    Quick reference:
    Half = -6 dB
    Double = +6 dB

    Correction Edit: Clarified what I meant by half, misread the question.
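
    For reference, the figures in the quick reference follow from the usual amplitude-ratio definition of the decibel, assuming that 'half' and 'double' refer to signal amplitude rather than power:

```latex
\[
20\log_{10}\!\left(\tfrac{1}{2}\right) \approx -6.02\ \text{dB},
\qquad
20\log_{10}(2) \approx +6.02\ \text{dB}
\]
```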
     
    Last edited: Apr 30, 2025