
General Project Thread & Feedback

Discussion in 'Sonic 2 HD (Archive)' started by steveswede, Apr 29, 2010.

Thread Status:
Not open for further replies.
  1. LordOfSquad

    LordOfSquad

    bobs over baghdad Member
    5,201
    243
    43
    Winnipeg, MB
    making cool music no one gives a shit about
    As another layman, I can say that neither 'in' nor 'out' sounds all that great to me. They both sound pretty muddy and washed out.
     
  2. dsrb

    dsrb

    Member
    3,149
    0
    16
    Ironic since what I'm trying to combat is the possibility that your views are skewed, whether you know it or not—and that suggesting to people that they do a sighted test based on samples that may be from improperly acting hardware, and that such a test is a valid method of evaluation, is just inviting a multiplication of the problem.

    But yeah, whatever. I've said about as much as I can, or can be bothered, to say.

    Oh you! So British. :D
     
  3. saxman

    saxman

    Oldbie Tech Member
    To test the emulation. VGM is more accurate, but a terrible format. GYM is easier to work with, so it was added first.

    To provide some context, the point is the engine takes an existing sound and improves it. I provided the most dramatic example.
     
  4. Falk

    Falk

    Member
    1,570
    15
    18
    It's not that 'in' is better. It's that 'out' is worse. I can go all the way into spectral analysis and psychoacoustics if you want about inharmonicity and ear fatigue, which is science vs subjectivity but like you said, it's veered way off the purpose of this thread.
    edit: Was trying to find the part of your post that said "if you think 'in' was better" for quoting purposes but it's gone. Nevermind.

    This is so blatantly incorrect in approach on so many levels I don't even know where to start. First and foremost, the first overtone on a waveform with 20kHz fundamental would be 40kHz which is way beyond the range of human hearing, even assuming an unlimited sampling rate, meaning they'd essentially sound identical. At least 40% of adults past 25 won't even hear a 20kHz signal to begin with, especially those brought up in city areas/subways/etc.

    Secondly, following conventional DSP techniques especially getting into DACs and sample rate conversions, it's actually the other way around. Assuming the theoretical listener being able to hear up to 100kHz (DOG MAN TEST SUBJECT EXTRAORDINAIRE, please pardon the fact he had a terrible accident during birth) a 20kHz SQUARE wave would theoretically sound like a 20kHz SINE wave with a 44.1kHz sampling rate since the low-pass 22.05k Nyquist cutoff (which prevents aliasing) will obliterate all the overtones. Even the first one. (A sine wave has no overtones, only a fundamental)

    Thirdly, extrapolating from the above, a higher sampling rate benefits more complex waveforms due to overtone content. A sine wave is as simple a waveform as you can get. You cannot represent a square wave close to Nyquist frequency, but you absolutely can represent a sine wave close to Nyquist frequency. In other words, theory is in direct contradiction to your post.
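A minimal sketch of the overtone argument above, assuming an ideal square wave (which contains only odd harmonics, so its first overtone sits at three times the fundamental). The function name is made up for illustration. It lists which harmonics survive below the Nyquist frequency for a given sampling rate.

```python
def square_wave_harmonics(fundamental_hz, sample_rate_hz):
    """Return the odd harmonics of an ideal square wave that lie below Nyquist."""
    nyquist = sample_rate_hz / 2
    harmonics = []
    k = 1
    while k * fundamental_hz < nyquist:
        harmonics.append(k * fundamental_hz)
        k += 2  # an ideal square wave contains only odd harmonics
    return harmonics


# At 44.1kHz only the 20kHz fundamental fits, i.e. a plain sine remains.
print(square_wave_harmonics(20_000, 44_100))
# A 1kHz square wave keeps many more of its harmonics at the same rate.
print(square_wave_harmonics(1_000, 44_100))
```

With a 20kHz fundamental the list holds a single entry, which is the sense in which a band-limited 20kHz square wave is indistinguishable from a 20kHz sine at this sampling rate.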

    Yes. 100% this.
     
  5. Falk

    Falk

    Member
    1,570
    15
    18
    In the name of science here's something else stemming from this post.

    http://dl.dropbox.com/u/19357938/SineSweep.wav A sine wave sweeping from ~2kHz up to ~47kHz. at a sampling rate of 96kHz.
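For anyone who wants to build a comparable test file, here is a rough sketch. It is not how the linked file was made: it assumes a linear sweep, 16-bit mono output, and a caller-chosen duration, and the function names are hypothetical.

```python
import math
import struct
import wave


def linear_sine_sweep(f_start, f_end, duration_s, sample_rate, amplitude=0.5):
    """Return 16-bit samples of a sine whose frequency rises linearly."""
    n = int(duration_s * sample_rate)
    rate = (f_end - f_start) / duration_s  # sweep speed in Hz per second
    samples = []
    for i in range(n):
        t = i / sample_rate
        # Phase is the integral of the instantaneous frequency f_start + rate * t.
        phase = 2 * math.pi * (f_start * t + 0.5 * rate * t * t)
        samples.append(int(round(amplitude * 32767 * math.sin(phase))))
    return samples


def write_mono_wav(path, samples, sample_rate):
    """Write 16-bit mono PCM using the standard-library wave module."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

Calling `write_mono_wav("sweep_96k.wav", linear_sine_sweep(2_000, 47_000, 8.0, 96_000), 96_000)` would write a sweep covering the same range; the 8-second length is an arbitrary choice.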

    I don't know about you, but at 20kHz, it doesn't sound much like a square wave to me. In fact, it's pretty much inaudible. Despite abusing my ears quite a bit, I'd like to say I still have above-average range, and I can reliably hear up to 19kHz, which is more or less right smack after the 3sec mark. Hence I'm not too sure if you were serious when you said "have you ever tried X". In fact I'm actually wondering if -you've- tried it yourself.

    What's actually more interesting, though, is that I guarantee that on many consumer playback systems attempting to play this 96kHz sweep you -will- hear audible artifacts after the 3sec mark, where the sweep continues from ~19k up to ~47k, which technically is supposed to be completely inaudible to human hearing. This is most commonly audible as a very rapid sweep back down and up again at lower volume, where there's supposed to be silence. You aren't actually hearing above 20kHz - you're hearing much lower-frequency signals generated by aliasing from the quick-and-dirty realtime downsampling applied to anything not at the sound driver's default playback rate. In other words, using 96kHz when you don't absolutely need it (or aren't sure that the target delivery has the means to not butcher it on playback) is not a good idea.
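The "sweep back down and up again" can be worked out with simple folding arithmetic. This sketch assumes the worst case, where the playback path drops to 44.1kHz with no anti-aliasing filter at all; real drivers vary, and the function name is made up for illustration.

```python
def alias_frequency(tone_hz, sample_rate_hz):
    """Frequency at which a tone appears when sampled with no anti-alias filter."""
    folded = tone_hz % sample_rate_hz
    # Anything above Nyquist mirrors back into the 0..Nyquist range.
    return min(folded, sample_rate_hz - folded)


for tone in (19_000, 25_000, 35_000, 44_100, 47_000):
    print(tone, "->", alias_frequency(tone, 44_100))
```

As the input tone climbs from 22.05kHz to 44.1kHz the alias falls towards 0Hz, then rises again as the tone continues to 47kHz, which matches the down-then-up artifact described above.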

    http://dl.dropbox.com/u/19357938/SineSweep2.wav Here it is again, converted to 44.1kHz with Adobe Audition's cookie cutter tools (They aren't spectacular but they're decent). If you open it up in a waveform editor, you'll notice it abruptly goes down to silence at the 3sec mark. This is the low-pass (anti-aliasing) filter at work, which is designed to prevent the aliasing problems in accordance with the Nyquist theorem. What I wanted to point out with the second clip though, is that right up to that 3sec point, it's going to sound indistinguishable to practically anyone, save a select few and even then only on very specific playback setups.

    http://dl.dropbox.com/u/19357938/sine.png Lastly, here's your empirical evidence that there's no problem representing or playing back sine waves close to the Nyquist limit. The blocks represent the data stored of a sine wave close to Nyquist (19.5kHz on 44.1kHz sampling in this case). The line represents the waveform as it would be played back by any decent DAC that filters out everything above Nyquist. I have no idea how you're coming to the conclusion that it's 'basically square waves'.
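The same point can be checked numerically. The sketch below is an idealised stand-in for a DAC's reconstruction filter, not a model of any particular hardware: it uses a truncated Whittaker-Shannon (sinc) sum, and the `half_width` of 2000 samples is an arbitrary choice. It rebuilds a 19.5kHz sine from its 44.1kHz samples at points between the stored values.

```python
import math


def sinc(x):
    """Normalised sinc, sin(pi x) / (pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def reconstruct(sample_at, t, sample_rate, half_width=2000):
    """Truncated Whittaker-Shannon sum centred on time t (seconds)."""
    centre = round(t * sample_rate)
    total = 0.0
    for n in range(centre - half_width, centre + half_width + 1):
        total += sample_at(n) * sinc(t * sample_rate - n)
    return total


FS = 44_100     # sampling rate in Hz
TONE = 19_500   # sine frequency in Hz, close to the 22.05kHz Nyquist limit


def sample_at(n):
    """The stored sample values (the 'blocks' in the picture)."""
    return math.sin(2 * math.pi * TONE * n / FS)


# Compare the reconstruction with the true sine halfway between stored samples.
for k in (10, 11, 12):
    t = (k + 0.5) / FS
    print(round(reconstruct(sample_at, t, FS), 3),
          round(math.sin(2 * math.pi * TONE * t), 3))
```

The two printed columns should agree closely, with the small remaining difference coming from cutting the infinite sum short.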
     
  6. steveswede

    steveswede

    Member
    5,032
    1
    16
    Ask my hand
    Fighting against the Unitary State of Europe
    @Saxman

    Falk and dsrb are right about audio myths; I've seen no end of debates about this on the FL Studio forum (sorry, I chuckled at your 96kHz HD music due to it being way beyond human hearing; Scubasteve and Teeloops should have pointed that out for you, considering their knowledge of music). If you need more convincing than what Falk's and dsrb's brilliant posts have explained, it could be worth your while to sign up to KVR Audio and ask people there about it. I'm not saying that to prove a point, but it's good to understand something correctly.
     
  7. dsrb

    dsrb

    Member
    3,149
    0
    16
    As well as Steve's good suggestion there, I'll repeat my earlier recommendation of Hydrogenaudio. It has a number of very experienced members, and crucially they can be very patient in explaining things. More so than me, probably! ;)
     
  8. GerbilSoft

    GerbilSoft

    RickRotate'd. Administrator
    2,971
    76
    28
    USA
    rom-properties
    If that's the case, then why bother creating "high-definition" graphics for S2HD in the first place? Just use the hq4x filter on Gens. It's the same thing, right? (That, and it'd take a lot less time to implement.)

     
  9. Sik

    Sik

    Sik is pronounced as "seek", not as "sick". Tech Member
    6,718
    1
    0
    being an asshole =P
    Upscaling and doing interpolation doesn't make it sound better, it makes it sound worse. You're trying to generate data that doesn't exist and interpolation only muffles the waveform. And in fact, your analogy to 2xSAI is perfect - the outcome is crap, and that's because the filter is unable to generate the extra details one would expect from HD graphics.

    There's only one instance where upsampling is valid. Let's say we have an 11kHz sample. If you try to pass it as-is to the sound card, it'll sound completely muffled because the sound card (or the driver, if relevant) will apply a ridiculous amount of interpolation. If you upscale it to 44kHz without interpolation (that is, just repeat the samples), it'll sound extremely clear. In other words, interpolation only makes things sound more muffled, not higher quality.
     
  10. Falk

    Falk

    Member
    1,570
    15
    18
    I don't know if you're trolling but... no. O_o
     
  11. Black Squirrel

    Black Squirrel

    no reverse gear Wiki Sysop
    8,609
    2,485
    93
    Northumberland, UK
    steamboat wiki
    All that ZIP shows is that your converter multiplies the file size by five.

    At some point I'll be wanting to download this game. Don't really want to sit for longer for a negligible effect that's only detectable on the high-end sound cards that nobody needs nor wants (bar the professionals). The effect might be lovely but it's ultimately a bit pointless. I actually think the default output of an emulator such as Kega Fusion sounds better for Sonic games, even if it's "squeakier" than intended.


    I think the high definition equivalent of audio tends to be things like surround sound, not differences in frequency.
     
  12. saxman

    saxman

    Oldbie Tech Member
    Me and dsrb discussed all this today. After re-reading the first couple of posts, I realized I said some things that sound a bit harsh. This wasn't my intent, but I can't deny what I said. So I apologized to him for that. We also talked briefly about the substance of the discussion. It was productive.

    I can honestly say I did not expect a debate to erupt from this. I could continue to make points, but it all leads to the same place -- nowhere! That's why I opted out of the discussion, because it really takes away from what I wanted to do in the first place, and that is to talk about the sound engine we're using! I'm excited about it, and on a personal level, I feel very proud of what we've all been able to accomplish.

    Since there was a debate, I think it's fair that I modify what I wrote about the features. So here it is:




    * Genesis/Mega Drive sound emulation with support for VGM/VGZ and GYM playback

    * Compressor to allow for full, thick layer of sound

    * Tempo and pitch control to allow variations in sound (e.g. revving spindash)

    * OGG Vorbis playback support to showcase Tee's music

    * Interpolation used to resample and smoothen the audio for alternate sample rates
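
To make the last bullet concrete, here is a minimal sketch of resampling by linear interpolation. The thread does not say which interpolation method the engine actually uses, so this is only the simplest possible illustration, and the function name is hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation between neighbouring input samples."""
    if len(samples) < 2:
        return list(samples)
    out_len = int((len(samples) - 1) * dst_rate / src_rate) + 1
    out = []
    for i in range(out_len):
        pos = i * src_rate / dst_rate          # position in input-sample units
        left = min(int(pos), len(samples) - 2)  # keep left + 1 inside the list
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[left + 1] * frac)
    return out


# Doubling the rate of a simple ramp inserts the midpoints.
print(resample_linear([0.0, 1.0, 2.0, 3.0], 1, 2))
```

Linear interpolation is only a weak low-pass filter; resamplers aimed at audio quality generally use longer band-limited filters instead.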



    If anyone would like to ask me about specific points listed, feel free to do so. For instance, a question was asked about the use of GYM, and I explained it was implemented initially as a way of testing the emulation. That's the kind of discussion I'm interested in.
     
  13. Falk

    Falk

    Member
    1,570
    15
    18
    As long as this stops you from asking for people to listen to 20kHz waveforms.
     
  14. Hamneggs

    Hamneggs

    Official Breakfast of S2HD Member
    303
    0
    0
    TEXAS
    Networked lighting
    So are you guys still having the DirectX issues?
     
  15. Sik

    Sik

    Sik is pronounced as "seek", not as "sick". Tech Member
    6,718
    1
    0
    being an asshole =P
    I'm not trolling. If anything, the sound hardware is x_x When playing 11kHz samples they interpolate like hell to make it sound "nicer", but the result is extreme muffling. The interpolation is much smaller for higher sample rates. Therefore, if you take an 11kHz waveform, repeat every sample four times and feed it to the sound hardware as 44kHz, it will sound much less muffled.

    It doesn't make the original waveform sound better than it is, but rather it prevents it from sounding worse than it should.
     
  16. dsrb

    dsrb

    Member
    3,149
    0
    16
    But that's completely butchering the signal. The whole point of DAC-side interpolation is to round the waves off, thereby removing the post-Nyquist nonsense that results from square waves. Why would you deliberately circumvent this?!

    11.025 kHz sounds muffled because it is. A properly output signal at this rate will only contain frequencies up to 5.5125 kHz, which will naturally sound muffled to us, being used to music with hats, cymbals, and whatnot to 16 kHz and a bit beyond. If you want to get around this, find a better sample, rather than mutilating the signal into an obscene square wave that it was never supposed to be.
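The bandwidth figure above is just half the sampling rate. A tiny sketch of that arithmetic for a few common rates (the function name is made up for illustration):

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency a given sampling rate can represent."""
    return sample_rate_hz / 2


for rate in (11_025, 22_050, 44_100, 48_000, 96_000):
    print(rate, "Hz sampling ->", nyquist_hz(rate), "Hz bandwidth")
```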

    Edit: fixing sampling rate figures to proper accuracy
     
  17. Falk

    Falk

    Member
    1,570
    15
    18
    You've got it completely backwards. You're essentially thinking that interpolation is the problem and zero-order hold is the fix. In fact, zero-order hold, and the inharmonic content it generates, is the problem, and interpolation is the fix. The rest was pretty much covered by dsrb.

    Considering that zero-order hold is pretty much the easiest way to implement a DAC (and sounds bad) and interpolation methods have been refined over the years... decades really, to best rectify the problem I'm completely baffled as to how you could think the problem and solution are the other way around. The irony is you yourself said "You're trying to generate data that doesn't exist" because that's -exactly- what zero-order hold does. It generates frequencies above Nyquist, which should not exist.
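A small numerical check of the claim that zero-order hold creates content above the original Nyquist frequency. This is a sketch under simple assumptions: a pure sine occupying exactly 8 cycles of a 64-sample buffer, 4x upsampling, and a naive DFT. Linear interpolation stands in for "interpolation" here only because it is the simplest case; none of the names come from the thread.

```python
import cmath
import math


def dft_magnitude(signal, k):
    """Magnitude of DFT bin k, normalised by signal length."""
    n = len(signal)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(signal))
    return abs(acc) / n


def zero_order_hold(signal, factor):
    """Upsample by repeating every sample `factor` times."""
    return [x for x in signal for _ in range(factor)]


def linear_upsample(signal, factor):
    """Upsample by linear interpolation, treating the signal as periodic."""
    n = len(signal)
    out = []
    for i in range(n):
        a, b = signal[i], signal[(i + 1) % n]
        out.extend(a + (b - a) * j / factor for j in range(factor))
    return out


N, BIN, FACTOR = 64, 8, 4  # 8 full cycles in 64 samples, upsampled 4x
tone = [math.sin(2 * math.pi * BIN * i / N) for i in range(N)]

for name, up in (("zero-order hold", zero_order_hold(tone, FACTOR)),
                 ("linear", linear_upsample(tone, FACTOR))):
    wanted = dft_magnitude(up, BIN)       # the original tone
    image = dft_magnitude(up, N - BIN)    # first image, above the old Nyquist (bin 32)
    print(name, "image/tone ratio:", round(image / wanted, 3))
```

The zero-order-hold output carries a clearly nonzero image above the old Nyquist frequency, while even crude linear interpolation attenuates it much further. A proper band-limited interpolator would push it lower still.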

    edit: Might as well add that's why I'm baffled as to why this audio engine's 'interpolation' results in frequencies that weren't previously there, when interpolation is supposed to get rid of frequencies that shouldn't exist, but -this- dead horse has been beaten to a fine pulp by now and is more a matter of semantics.

    edit2: to make this post more useful:
    http://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem
    http://en.wikipedia.org/wiki/Zero-order_hold

    Not exactly the most layman of explanations but good enough.
     
  18. winterhell

    winterhell

    Member
    1,165
    7
    18
    Can we say that the results of sample repetition (like Sik said) and interpolation are similar to those of nearest neighbor and bilinear in imaging?

    btw personal opinions:
    I can rarely hear a difference between 192kbps and 320kbps mp3 and CD audio. But if it's recorded at 24-bit/48kHz and done correctly, it sounds better even if lossy.
    For the synthesized sound of Sonic 2, it will be better to run it directly at the target frequency instead of recording it and so on. Or are you talking only about the digital samples, like the "Sega" chant?
     
  19. Falk

    Falk

    Member
    1,570
    15
    18
    Quite right, although I'd add that it'd be more akin to a non-integer resize if you're going from 11,025 to e.g. 48kHz (which used to be quite common)

    Sample repetition, or 'nearest-neighbour', would look something like this when not every pixel of an input represents the same number of pixels on an output:
    [​IMG]
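The unevenness is easy to see by counting repeats. This sketch assumes the simplest form of sample repetition, where each output sample takes the input sample at the floored source position; the function name is hypothetical.

```python
from collections import Counter


def repeat_counts(src_rate, dst_rate, n_inputs):
    """How many output samples sample repetition takes from each input sample."""
    counts = Counter()
    j = 0
    while True:
        src_index = j * src_rate // dst_rate  # input sample used for output j
        if src_index >= n_inputs:
            break
        counts[src_index] += 1
        j += 1
    return [counts[i] for i in range(n_inputs)]


print(repeat_counts(11_025, 44_100, 8))  # integer ratio: every sample repeated equally
print(repeat_counts(11_025, 48_000, 8))  # non-integer ratio: a mix of 4s and 5s
```

With the non-integer ratio some input samples are held for four output samples and others for five, which is the audio counterpart of the uneven pixel widths in a non-integer nearest-neighbour image resize.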
     
  20. dsrb

    dsrb

    Member
    3,149
    0
    16
    Assuming you're referring to 24 bit / 48 kHz vs. 16 bit / 44.1 kHz:

    Congratulations on having superhuman hearing and/or terrible hardware. Please post ABX results.
     