The originals sound better. The aliasing provides a crunchiness and sharpness to the final output that drives emotional energy. That Zero Mission rhythm isn't intended to sound smooth and soft; the driving hard beats are an emotional tool for eliciting anxiety and anticipation from the player.
But this is a bit like those who use smoothing filters. It's ultimately about taste, but it should be recognized that unless the filter is attempting to accurately recreate the original hardware of the era then the original design intent is not being adhered to, and so something may be lost in the "enhancement".
A friend had this killer basement setup with a projector onto a huge canvas dropsheet. Plus the GameCube, and the GBA dock for it, so we were projecting those games meant for a 2 inch screen maybe 10-15 feet wide.
> The originals sound better. The aliasing provides a crunchiness and sharpness to the final output that drives emotional energy.
In the mid-1980s the first really affordable sampler was the Ensoniq Mirage, which used the Bob Yannes-designed ES5503 DOC (Digital Oscillator Chip) to generate its waveforms. It played back 8-bit samples and used a fairly simple phase accumulator that didn't do any form of interpolation (I don't count "leftmost neighbour" as interpolation). Particularly when you pitch it down, you get a rough, clanky, gritty "whine" to samples, that the analogue filters didn't necessarily do a lot to remove.
Later on they released the EPS which had 13-bit sampling. Why 13-bit? I don't know, I guess because the Emulator I and II used 8-bit samples but μ-law coding, giving effectively 13-bit equivalent resolution. It also used linear interpolation to smooth the "jumps" between samples, and even if you loaded in and converted a Mirage disk the "graininess" when you pitched things down was gone.
I'm currently writing some code to play back Mirage samples from disk images, and I've actually added a linear interpolator to it. Some things sound better with it, some things sound worse. I think I'll make it a front panel control, so you can turn it on and off as you want.
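For anyone who wants to hear the difference being described, here is a minimal sketch of a phase-accumulator sample reader with switchable linear interpolation. It is not the Mirage's or the EPS's actual logic, and the names `play`, `step` and `interpolate` are made up for illustration.

```python
def play(samples, step, n_out, interpolate=False):
    """Read `samples` with a fractional phase accumulator.

    step: source samples advanced per output sample (< 1 pitches down).
    The sample table is treated as a loop, which is an assumption made
    here to keep the sketch short.
    """
    out = []
    phase = 0.0
    n = len(samples)
    for _ in range(n_out):
        i = int(phase)            # integer part: index of the left neighbour
        frac = phase - i          # fractional part: position between neighbours
        a = samples[i % n]
        if interpolate:
            b = samples[(i + 1) % n]
            out.append(a + (b - a) * frac)   # straight line between a and b
        else:
            out.append(a)         # "leftmost neighbour": hold a until the index moves
        phase += step
    return out


# Pitching a two-sample loop down an octave (step = 0.5):
held = play([0, 100], 0.5, 4)                      # stair-steps
smooth = play([0, 100], 0.5, 4, interpolate=True)  # ramps between samples
```

The stair-steps in the held version are where the gritty "whine" comes from; the interpolated version trades that for a duller top end.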
> The originals sound better.
I don't think so, I think you're just getting a high end that isn't in the original audio. In the places where there are high frequencies the aliasing and the hiss just get in the way.
> that drives emotional energy
Seems like a hyperbolic rationalization.
The ‘improved’ versions sound muffled like I have water in my ears. Plus I’d rather hear the game as it was designed, artefacts and all.
I don't quite understand why the author is doing special handling for PSG versus PCM audio.
My GameBoy emulator generates one "audio sample" per clock tick (which is ~1 MHz, so massive 'oversampling'), decimates that signal down to like 100 ksample/sec, then uses a low-pass biquad filter or two to go down to 16 bit / 48 kHz and remove beyond-Nyquist frequencies. Doesn't have any of the "muffling" properties this guy is seeing, aside from those literally caused by the low-pass.
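A rough sketch of the "low-pass biquad, then drop samples" stage, for readers who haven't built one. This is not the emulator's code: the coefficients follow the widely used RBJ "Audio EQ Cookbook" low-pass formulas, and the rates in the usage line (96 kHz in, 20 kHz cutoff, keep every 2nd sample) are assumptions picked only to make the example concrete.

```python
import math


class BiquadLowPass:
    """Second-order low-pass, RBJ cookbook coefficients, direct form I."""

    def __init__(self, sample_rate, cutoff_hz, q=0.7071):
        w0 = 2.0 * math.pi * cutoff_hz / sample_rate
        alpha = math.sin(w0) / (2.0 * q)
        cos_w0 = math.cos(w0)
        a0 = 1.0 + alpha
        # All coefficients are normalised by a0 up front.
        self.b0 = (1.0 - cos_w0) / 2.0 / a0
        self.b1 = (1.0 - cos_w0) / a0
        self.b2 = self.b0
        self.a1 = -2.0 * cos_w0 / a0
        self.a2 = (1.0 - alpha) / a0
        self.x1 = self.x2 = self.y1 = self.y2 = 0.0

    def process(self, x):
        y = (self.b0 * x + self.b1 * self.x1 + self.b2 * self.x2
             - self.a1 * self.y1 - self.a2 * self.y2)
        self.x2, self.x1 = self.x1, x
        self.y2, self.y1 = self.y1, y
        return y


def lowpass_then_decimate(samples, sample_rate, cutoff_hz, factor):
    """Filter first, then keep every `factor`-th sample."""
    lp = BiquadLowPass(sample_rate, cutoff_hz)
    filtered = [lp.process(s) for s in samples]
    return filtered[::factor]


# Hypothetical usage: 96 kHz intermediate stream down to 48 kHz.
out = lowpass_then_decimate([0.0] * 200, 96000, 20000, 2)
```

A single biquad has a fairly gentle slope, which is presumably why the comment mentions "a filter or two"; cascading two instances steepens the roll-off.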
This is great stuff… basically, an easy way to get much higher quality audio out of a GBA emulator.
I’ll add some context here—why don’t more games run their audio at 32768 Hz, if that’s such a natural rate to run audio? The answer lies in how you fill the buffers. In any modern, sensible audio system, you can check how much space is available in the audio buffer and simply fill it. The GBA lacks a mechanism to query this. Instead, what you do is calculate this yourself, and figure out when to trigger additional audio DMA from the VBlank interrupt. You know the VBlank runs every 280896 cycles, and you know that the processor runs at 16777216 Hz, so you can do some math to calculate how much data is remaining in the audio DMA stream.
A lot of games simplify the math—it’s easier to start a new audio DMA in your VBlank handler, but that means running at a lower sample rate, which will sound pretty crispy.
YMMV, some people like the crispy aliased audio. If the audio weren’t crispy, the sound designers probably would have adjusted the samples to compensate. Other factors being equal, I’d rather listen to what the original artists heard when they were testing on real hardware, because that is probably closer to what they intended, even though it has a lot of artifacts in it.
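The buffer math above can be sketched in a few lines. The two constants are the ones quoted in the comment; the 1254-cycle sample period is only an illustrative value, chosen here because it divides the frame length evenly, not something taken from a particular game.

```python
from fractions import Fraction

CPU_HZ = 16_777_216          # CPU clock, as quoted above
CYCLES_PER_FRAME = 280_896   # cycles between VBlanks, as quoted above


def samples_per_frame(cycles_per_sample):
    """Exact number of audio samples consumed between two VBlanks."""
    return Fraction(CYCLES_PER_FRAME, cycles_per_sample)


def sample_rate(cycles_per_sample):
    """Sample rate in Hz for a given sample period in CPU cycles."""
    return CPU_HZ / cycles_per_sample


# 32768 Hz is one sample every 512 cycles, which does not fit a frame evenly:
at_32768 = samples_per_frame(512)     # 4389/8, i.e. 548.625 samples per frame

# A period that divides 280896 gives a whole number of samples per frame,
# so the VBlank handler can simply restart the DMA every frame:
at_1254 = samples_per_frame(1254)     # 224 samples per frame
rate_1254 = sample_rate(1254)         # a bit under 13379 Hz
```

With a fractional count like 548.625 the handler has to track the leftover eighths from frame to frame; with a whole count it doesn't, at the cost of a lower sample rate.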
> why don’t more games run their audio at 32768 Hz, if that’s such a natural rate to run audio?
I've written some code to play back 8-bit samples (and indeed to do wavetable, FM, and VA synthesis) on 8-bit Arduinos using the PWM to output 8-bit audio. That runs at 31373 Hz which is a pretty crazy sample rate.
Why?
Because the chip is clocked at 16MHz, and if you program the PWM for no prescaler and "phase correct" PWM where it counts up and back down, so you get a widening pulse in the middle of a "burst", then it counts 510 "steps" of the counter. It's an 8-bit counter so it counts from 0 to 255, then the next step counts back down to 254, and so to 0 again, when the next step takes it to 1.
And 16000000/510 is 31372.55 ;-)
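The same arithmetic as a tiny function, in case you want to try other clocks or prescalers. It assumes the up-and-down counting described above (0 up to the top value and back, so 2 × top steps per period); the function name and parameters are made up for illustration.

```python
def phase_correct_pwm_hz(clock_hz, prescaler=1, top=255):
    """PWM period rate when the counter runs 0..top and back down to 0."""
    steps_per_period = 2 * top          # 510 steps for an 8-bit counter
    return clock_hz / (prescaler * steps_per_period)


rate = phase_correct_pwm_hz(16_000_000)   # about 31372.55 Hz
```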
The crispy aliasing of the audio has always felt cozy to me. It’s also a bit of a signature of the system, like the wobbly polygons on PS1. I appreciate that there are ways to change the sound, but it feels a bit rude to label it broken or defective.
I suspect that without nostalgia, the fixed interpolation would absolutely sound better. Unfortunately, nostalgia. The lesson I'm taking away here is that, oh, the terrible resamplings are the aspect of faithful emulation that makes it sound like a GameBoy and not just sawtooths.
The reason the nearest neighbour interpolation can sound better is that the aliasing fills the higher frequencies of the audio with a mirror image of the lower frequencies. While humans are less sensitive to higher frequencies, you still expect them to be there, so some people prefer the "fake" detail from aliasing to them just being outright missing in a more accurate sample interpolation.
It's basically doing an accidental and low-quality form of spectral band replication: https://en.wikipedia.org/wiki/Spectral_band_replication which is used in modern codecs.
It's actually the other way round: Aliasing fills the lower frequencies with a mirror image of the higher frequencies. So where do the higher frequencies come from? From the upsampling that happens before the aliasing. _That_ makes the higher frequencies contain (non-mirrored!) copies of the lower frequencies. :-)
Oh yes you're correct, imaging would be the correct term for what's happening I think (aliasing is high -> low and imaging is low -> high)?
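A small experiment that shows the low -> high direction numerically, as a sketch rather than anything from the article: take a sine, double the sample rate by repeating each sample (nearest neighbour) or by linear interpolation, and look at how much energy ends up above the original Nyquist. The sizes `N = 64` and `k = 5` are arbitrary choices.

```python
import cmath
import math


def dft_mag(x, m):
    """Magnitude of DFT bin m, normalised by the signal length."""
    n = len(x)
    acc = sum(v * cmath.exp(-2j * math.pi * m * i / n) for i, v in enumerate(x))
    return abs(acc) / n


N, k = 64, 5   # one period buffer of N samples holding k cycles of a sine
src = [math.sin(2 * math.pi * k * i / N) for i in range(N)]

# 2x upsampling by repeating each sample (nearest neighbour / sample-and-hold).
held = [v for v in src for _ in range(2)]

# 2x upsampling by linear interpolation (buffer treated as periodic).
linear = []
for i in range(N):
    a, b = src[i], src[(i + 1) % N]
    linear += [a, (a + b) / 2]

# In the 2N-point spectrum the original Nyquist sits at bin N/2.
# The tone itself is at bin k; its first image lands at bin N - k.
tone_held = dft_mag(held, k)
image_held = dft_mag(held, N - k)      # clearly non-zero
image_linear = dft_mag(linear, N - k)  # much smaller
```

The image from sample repetition is roughly an eighth of the tone's level here, while linear interpolation pushes it down by about another factor of eight. That leftover image is the "fake" high end people are hearing.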
Impressive.
Audio was the thing I could never figure out on my Gameboy emulator. I couldn’t get it to pass basic tests, even without bothering to output sound on the computer.
The original sounds so much better...
The loss in high-frequency information is not worth the interpolation. Bass loses its crunch. Percussion fades into the background.
Besides, I personally prefer to play my vgm at the original sample rate, and my soundcard adjusts to the correct rate for each song through fb2k plugins.