I saw this mentioned by a YouTuber, who went on to say that many people believe that the SNES was originally intended to be backward-compatible with the NES.
IMHO that's not quite right — there are a lot of simple things Nintendo could have done to make the SNES much more backward-compatible with the NES. Those would have had to be done right at the beginning (and so would have stuck around as additional evidence of this), just as the few places that do line up (e.g. the controller read-out and its MMIO address) were certainly done right at the beginning.
Instead, I think what Nintendo was imagining was that — as long as the NES and SNES were both still "alive" in the market — companies would want to release NES and SNES versions of their games concurrently. And, with some careful planning, development studios would be able to maintain a single 6502 macro-assembler codebase that assembled to either an NES or an SNES target. Developers could either wrap MMIO address pokes / JSRs inside #ifdef-like macro structures; or just write two subroutines, one for NES and one for SNES, and branch at assembly time to decide which got assembled in.
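As a rough illustration of the idea — a hypothetical Python preprocessor standing in for a real assembler's macro facilities; the `.if`/`.else`/`.endif` directives and the register addresses shown are invented, not from any real game's build system:

```python
# Hypothetical sketch: an #ifdef-style preprocessor selecting NES- or
# SNES-specific lines from one shared 6502 assembly source.

def preprocess(source: str, target: str) -> list[str]:
    """Keep only the lines active for `target` ('NES' or 'SNES')."""
    out, stack = [], [True]  # stack of "currently emitting?" flags
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith(".if "):
            stack.append(stack[-1] and stripped.split()[1] == target)
        elif stripped == ".else":
            enclosing = all(stack[:-1])
            stack[-1] = enclosing and not stack[-1]
        elif stripped == ".endif":
            stack.pop()
        elif all(stack):
            out.append(line)
    return out

shared = """
    lda frame_counter      ; engine code shared by both targets
.if NES
    sta $2005              ; NES PPU scroll register
.else
    sta $210D              ; SNES BG1 scroll (illustrative address)
.endif
    rts
"""
print(preprocess(shared, "NES"))
```

Real assemblers of the era had varying degrees of this; the point is only that one source tree could emit two ROMs.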
This makes sense of where the NES and SNES line up in compatibility, and where they diverge: they line up where there's an obvious way to support both consoles with one block of instructions; they diverge where developers would have to write separate subroutines to support the different hardware anyway, so instruction-level / memory-map-level compatibility isn't so crucial.
This seems so obviously "possible" (though definitely challenging!) that I've always wondered whether there are any games that are actual examples of this — where the game came out at the same time in both NES/FC and SNES/SFC releases, and binary analysis reveals that the majority of the game's engine code is shared between the two, with differences mostly in the subroutines/coroutines composing the "graphics engine", and in the static data.
—————
On a separate note, I feel like it would be even simpler — and a lot more performant! — to ahead-of-time recompile a NES game to run on the SNES. The NES doesn't tend to do tricks with dynamic runtime code generation, so NES games are good candidates for being statically analyzed and transpiled. (And the SNES has such a similar architecture, that even those dynamic runtime tricks might map cleanly onto the SNES, too.)
What I think really happened is that they originally planned on it being backwards compatible, but then realized how many games depended on undocumented opcodes and bugs that had been 'fixed' in the 6502 mode of the 65816. At that point, you go back to your breadboards and go nuts removing anything that was there for NES compat but was holding the SNES back in some way. A few iterations of that and you have pretty much what the SNES was at launch.
Edit: Thinking about it more, that would also all make sense given their back-compat strategy past the SNES. On the GBA they literally just have a complete GameBoy SoC on the die. It's not accessible to GBA software, and is probably clock-gated in GBA mode, but it's there, with as few changes as possible from GB hardware, to run GB software. Then for the Wii, the companion ARM processor (Starlet, as it's known by the Wii hackers) will go so far as to patch problematic GameCube games on load, even though the Wii is very nearly identical hardware with a few extensions. It seems like they either drop in gate-for-gate compatible hardware, leave a back door for themselves to get really dirty with patching, or these days just keep an emulator around that they've re-QAed against each game they'll allow. They're very intentional with back compat in a way that feels like an ancient learned lesson internalized by the company.
This has always been my view; it's the only CPU I know of that met the requirements.
Other 16-bit consoles used an off-the-shelf 68000 combined with custom support chips. Nintendo made their own custom CPU silicon: an off-the-shelf soft-CPU core plus custom support logic for bus control and DMA.
And there simply weren't that many 16-bit soft-CPUs available for Nintendo to license. You couldn't license the 68000 HDL, nor the 8086 HDL.
"From a supplier Nintendo had a good working relationship with?" What do you mean by that? Nintendo did not have a working relationship with the supplier of the 65816 prior to development of the SNES.
Nintendo also tends to be pretty conservative when selecting technology for new products (Gunpei Yokoi's "lateral thinking with withered technology"[1]). It's entirely possible that the similarities between the NES and SNES are that they wanted to avoid any needless innovation -- keeps costs down, makes sourcing parts easier, you already have a lot of expertise in at least some of the hardware you're using, and your developers don't need to relearn as much.
Nintendo stuck with pretty similar PowerPC chips for three generations while Sony went MIPS->PowerPC->x86. And when they finally switched off of PowerPC, they went to ARM, which they'd been using for handheld systems for two decades.
There are no similarities between the two except for the 65816 having a 6502 compatibility mode. Everything else in the SNES is radically different because of innovation.
> The NES doesn't tend to do tricks with dynamic runtime code generation, so NES games are good candidates for being statically analyzed and transpiled.
Runtime code generation is rare, but dynamic dispatch (i.e. "JMP indirect" through RAM), jump tables, and certain instruction-level hacks are fairly common. This can make it tricky to reliably identify all subroutines or traces [1].
While not quite the same thing, I've read that the games in Super Mario All-Stars use basically the same assembly code as the original NES versions, just with updated graphics.
I'd be curious to know more about the development of Super Mario All-Stars. I've been playing both the original and All-Stars (on the Switch, if that matters) and the acceleration of Mario in SMB1 feels different. The original SMB and All-Stars were released only 8 years apart; if they didn't use the original code base, they likely had access to the original dev team.
> The actual mechanics behind the issue were discovered a while ago (the Y-reverse thing isn't right). Something to do with the block getting replaced immediately instead of a one-frame queue/delay.
Unfortunately I can't find a fuller explanation for this claim.
I wonder if this isn't a bug in the SMAS remake, but rather in the original game.
Maybe brick-blocks taking an extra frame to swap from tiles to sprites for their animations was something perceived as an unfixed bug (or a necessary compromise, due to per-frame CPU-time limits) at the time of the original's development — one with, perhaps, a "TODO" comment sitting there in the original assembly source. And the development team of the port noticed that, with either more dev time or the increased cycles-per-frame available on the SNES, the "bug" could be "fixed".
But perhaps — since that team wasn't aware that the change in physics this created was unexpected by the original devs — they thought that the physics resulting from "fixing" this "bug" were the originally-intended physics of the game (i.e. that the animation "bug" was a regression, and fixing it put the game back the way it was in some earlier build), rather than that they were introducing a regression themselves.
These sorts of things are the reason I've always wished version control got used more heavily in the gamedev industry back in the 80s. Then any leak of a game's code-base would actually have a productive use (rather than just serving as poison fruit to aspiring developers): it could be a treasure-trove of data for Software Engineering academics to study the in-industry SwEng practices that lead to "good" games vs "bad" ones.
(I've had a longstanding hypothesis that many "bad" games from the 80s/90s were "good" games for much of their development, but got broken very close to release as they tried to merge in tons of longstanding feature-branches — i.e. the exact problem Continuous Integration tries to address.)
Your link is about the behavior when Mario hits a brick, whereas OP was talking about the acceleration of Mario (I think he means running?!). Quite an interesting link, nonetheless!
Yes, but we often don't feel what we think we're feeling! The patch changes Mario's speed when he hits a brick, which will have implications for when he hits the ground.
I'm far from an expert player, but I really think the physics are identical once the patch is in place. I sometimes wonder if it was really a mistake at the factory—something they didn't catch until 20,000 PCBs in, and didn't want to change in revisions for consistency.
They did use the original source; the All-Stars source is included in the recent leaks of Nintendo software. You can see a lot of the original code and see what they changed. When I glanced at it, it seemed the SNES developers used a different level of indentation, so it was noticeable when new code was added in.
I think you’d find the opposite - that NES games use lots of weird unstructured or dynamic code tricks that make automated AOT static recompilation very difficult or impossible, in most cases. These things tend to be more common on older CPUs and in hand-written assembly code, not less.
Personally, I disagree with the author of that article's conclusion here, that the [only] "solution is to embed an interpreter runtime in the generated binary." You can statically recompile this code!
You "just" need to run a concolic interpreter over the code, generating new static instruction streams for each possible phi-node (branch) state, and then coalescing them back together where you find you've generated the same instruction stream. You can early-terminate these spec-exec paths when you find yourself back at an instruction postdominated by entirely static code [within the body of the unrolled runtime coroutine sequence].
Might sound like an unbounded exponential check, but in practice 1. merging back together will usually happen within, at most, five instructions, because these are almost always last-minute bit-packing-hacks to get code to fit within a given constrained ROM image size; and 2. there are always only two concolically-discoverable valid interpretations of a given instruction stream (if doing mid-instruction jumps), and only one concolically-discoverable valid interpretation of data-as-code / RAM-as-code. (NES games don't assemble and JIT arbitrary code streams. They don't have the work RAM for it. They're at most unpacking existing, non-arbitrary code streams; or taking something that is primarily a code stream, and then reusing it as data elsewhere, maybe a PRNG seed, or a 1-bit noise texture.)
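A toy sketch of the "two decodings that coalesce" case — the mini-ISA here is invented, not real 6502 encoding: decoding the same bytes from the official entry point and from a mid-instruction entry point yields two traces that disagree briefly, then merge back together within a few instructions.

```python
# Invented opcodes: 1-byte NOP (0x01), 2-byte LDA #imm (0x02 nn),
# 1-byte RTS (0x03).
SIZES = {0x01: 1, 0x02: 2, 0x03: 1}

def decode_from(code: bytes, entry: int):
    """Decode a straight-line trace from `entry`; return a list of
    (offset, opcode) pairs until RTS or end of code."""
    trace, pc = [], entry
    while pc < len(code):
        op = code[pc]
        trace.append((pc, op))
        if op == 0x03:          # RTS ends the trace
            break
        pc += SIZES[op]
    return trace

# The LDA at offset 0 hides a NOP at offset 1 inside its operand byte:
code = bytes([0x02, 0x01, 0x02, 0x01, 0x03])

a = decode_from(code, 0)   # "official" entry point
b = decode_from(code, 1)   # mid-instruction entry (e.g. a sneaky JMP)
# The two streams disagree at first, then coalesce at offset 2:
converge = next(off for off, _ in a if any(off == o for o, _ in b))
print(a, b, converge)
```

A recompiler would emit both prefixes as separate static blocks, then share everything from the convergence point onward.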
A large part of my job lately is decompiling and "reconstructing meaning" for object code compiled for lousy, underspecified abstract-machine architectures. There are a lot of modern compiler-theoretic tools and tricks that can make short work of "statically" recompiling all but the hardest of hard cases. They're sadly virtually unknown outside of academia, though (probably because they're based on various graph transformations of code proposed after the Dragon Book was published, which is the point in history where most of industry's knowledge of compilers seems to stop).
Well, I know a good deal about old video game systems but you clearly know more about (re/de)compilers than I do, so although still skeptical I’ll defer to your expertise, while vaguely gesturing towards “lots of people would love the performance gains of static recompilation in our emulators, if it’s feasible please advance the state of the art” :)
Pretty sure you’re still going to be stymied by things like cycle-counted code to line up I/O writes with the state of the hardware, though. I don’t suppose there’s tricks for that.
Cycle-accurate transpiling, now that sounds like a fun project. I think it would "only" have to be cycle-accurate between specific events, e.g. access to specific registers. (No idea how feasible this whole thing could be.)
> I think it would "only" have to be cycle-accurate between specific events, e.g. access to specific registers.
That’s correct — but one of those events is external interrupts (from the graphics, audio, or cartridge hardware), which complicates things a bit since your “checkpoints” can be anywhere in the code rather than just at a specific set of register accesses.
You could work around this by tracking cycles per-basic-block while running transpiled code, and falling back to cycle-accurate software emulation when you get close to a hardware interrupt.
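That bookkeeping could be sketched like this — block cycle costs, the IRQ period, and both "execution engines" are invented placeholders, not from any real recompiler:

```python
IRQ_PERIOD = 100  # cycles between (hypothetical) hardware interrupts

def run_block_fast(block):
    """Stand-in for a recompiled native block: no per-cycle checks."""
    return block["cycles"]

def run_block_stepped(block, budget):
    """Stand-in for the cycle-accurate interpreter fallback:
    executes up to `budget` cycles, returns cycles actually spent."""
    return min(block["cycles"], budget)

def run(blocks):
    cycles, irqs = 0, 0
    for block in blocks:
        until_irq = IRQ_PERIOD - (cycles % IRQ_PERIOD)
        if block["cycles"] < until_irq:
            cycles += run_block_fast(block)        # safely inside window
        else:
            # IRQ would fire mid-block: step cycle-accurately instead
            cycles += run_block_stepped(block, until_irq)
            irqs += 1                              # service interrupt here
            cycles += block["cycles"] - until_irq  # finish the block
    return cycles, irqs

total, fired = run([{"cycles": 40}, {"cycles": 40}, {"cycles": 40}])
print(total, fired)  # prints: 120 1
```

Only blocks that straddle an interrupt boundary pay the interpreter cost; everything else runs at recompiled speed.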
I remain skeptical (but would love to be proven wrong!).
Most of the point of recompiling (JIT or AOT) is to amortize instruction decode and dispatch costs versus an interpreter. If you have to do the cycle accurate bookkeeping anyway, you're most of the way to the overhead of an interpreter again. There'll be some savings around partial instruction streams that don't have effects outside the CPU core that can just be run as a burst, but I'm not sure it'll be a game changing amount.
The zero page is very small; I don't think this was done that much. You'd constantly be writing large gobs of stuff into it, and that'd be very slow.
That said, the stack was used as a fast indirect-jump address location pretty frequently, as described here, essentially abusing the fact that PHA and RTS are 1-byte instructions: http://www.6502.org/tutorials/6502opcodes.html#RTS
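For the curious, here's a tiny Python model of that dispatch trick — the CPU model and the handler addresses are invented for illustration, and the real 6502's stack lives in page $01:

```python
# On a real 6502 you push (target - 1) high byte then low byte with
# PHA, then RTS pops the address and resumes at address + 1.

class Cpu:
    def __init__(self):
        self.stack = []   # simplified; the 6502 stack is page $01
        self.pc = 0

    def pha(self, value):
        self.stack.append(value & 0xFF)

    def rts(self):
        lo = self.stack.pop()           # RTS pulls the low byte first...
        hi = self.stack.pop()           # ...then the high byte
        self.pc = ((hi << 8) | lo) + 1  # and resumes at address + 1

def jump_table_dispatch(cpu, table, index):
    """Indirect jump via RTS: entries are stored as (target - 1)."""
    target_minus_1 = table[index]
    cpu.pha(target_minus_1 >> 8)   # push high byte first
    cpu.pha(target_minus_1)        # push low byte second
    cpu.rts()

cpu = Cpu()
handlers = [0x8000 - 1, 0x8123 - 1, 0x9ABC - 1]  # illustrative addresses
jump_table_dispatch(cpu, handlers, 1)
print(hex(cpu.pc))  # prints: 0x8123
```

Exactly this sort of data-driven control flow is what makes static analysis of NES code tricky: the jump targets live in a table, not in the instruction stream.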
And really, since basically all PC-modifying instructions operate on either relative (b*), absolute-indirect (jmp), absolute (jmp, jsr), or stack-indirect (rts, rti) addressing, there's no special benefit you get from having code in the zero page anyways afaik.
> there's no special benefit you get from having code in the zero page anyways afaik.
You save one cycle off each instruction that modifies another instruction, which can be important for inner loops. See eg http://www.linusakesson.net/programming/gcr-decoding/index.p..., although that's a disk-induced realtime requirement, rather than hblank/beam-induced.
ah ok yeah that makes sense. If you're directly modifying the instructions then yeah there's a big benefit to the zero page. I think I misread the GP post and missed the self-modifying context somehow.
I wouldn't have thought so. Back then, version control was making copies on a floppy disk. A lot of developers didn't even have dedicated build machines, let alone the fancy CI/CD pipelines we have these days. Plus most studios had pretty tight deadlines. So if they did any code sharing, it would have been more like copying/pasting than assembly macros. In fact I wouldn't be surprised if the "copying/pasting" was literally someone printing the code out (on a dot-matrix printer) and an office junior manually punching those instructions back in on another machine.
It was a very different era back then. Tooling sucked and everything was a lot more manual. But we didn't mind, as we were still pushing the boundaries of our imaginations and didn't know any better.
I don't think this was really a practical thing, and I think there'd be other signs of an attempt at it if so. The final version of the SNES PPU would probably have resembled the NES PPU more, for example, in small ways like tile sizes at the very least. There's at least one mode on the SNES PPU that's kiiiinda like the NES PPU's standard behaviour, but it's just off enough to make reusing code for it fairly hard, I think.
And then I think you'd have seen a mapper for the NES that more closely resembled the 65816's native addressing, since that would definitely have eased such porting efforts. But even the MMC5, which was first used the same year the SNES came out, didn't move much in that direction.
Tbh I think you're also overestimating the assemblers they used at the time, both in terms of the expressiveness of the macros they had available and the kind of dead-code removal you might be imagining happening.
I think a much simpler explanation is just that they had it in mind as a possibility from the start and they made some design choices that were both easy and would have allowed it. They may have been leaving open the possibility of an adapter cart like the genesis power base converter (which had much less work to do) too. But they also decided fairly early on that it wasn't really worth it economically and nixed it early enough that it didn't make a huge impact on the architecture.
I'm also curious about what architectural similarities there might've been between the NES/SNES and the Game Boy, since there were quite a few common releases.
Probably not as many as you might assume. There's also the issue of the Game Boy running on an 8080 derivative, unlike the NES and Super NES, which ran on 6502 (and 65816) derivatives.
You might be interested in these articles covering each console's architecture:
Former SNES and GBA hobbyist developer here. The different CPUs aside - Z80 in the GB, 6502 in the NES, 65816 in the SNES and ARM in the GBA - the GB, just like the NES, shares very little with the SNES. The GBA is a different story: it is in so many ways like the SNES' sibling, but an improved sibling, most distinctly because of its open memory architecture allowing convenient and fast access to all of its custom hardware, whereas in the SNES the video and sound chip RAM is tucked away behind single-byte-sized memory channels you have to access by shoveling bytes manually (or by programmable DMA) back and forth between CPU-accessible RAM and video/sound chip RAM etc. This architectural choice was in my opinion the largest improvement the GBA brought, more so than moving away from the SNES' custom "sound computer" which was a bit of a hurdle.
Now that you mention the sound, I've always had this question: was the GBA sound system any good? My understanding is that the GBA only has the original GB sound chip plus two PCM channels, without any dedicated hardware synthesis or mixing, so all sound mixing has to run in software. The CPU was much faster than the SNES's or the Mega Drive's, but was it fast enough to make up for it?
GBA games were generally regarded as having lower sound quality than SNES games, although this perception was somewhat exaggerated by the poor quality of some SNES->GBA ports. It took a lot more CPU to get good audio out of the GBA, and I'm sure developers were taking the GBA's tinny speaker into account when deciding how much effort to invest in polishing the soundtrack. Hell, the GBA SP needed an adapter to use headphones!
It has the GB legacy sound and two 8-bit PCM channels (stereo wave output basically). It is definitely worse than the SNES - however, because of how fast the ARM is (relatively speaking) it has no problem doing software mixing to provide developers with effectively 4, 6 and even 8 channels of PCM audio for soundtracks and effects. Not as smooth sound, but totally sufficient.
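That software mixing looks roughly like this in spirit — illustrative Python, not actual GBA ARM code; real engines do this in fixed-point assembly and feed the result to the PCM FIFOs via DMA:

```python
# Sum N channels of signed 8-bit samples into one stream, saturating
# to the signed 8-bit range the GBA's PCM output expects. The channel
# data here is invented for illustration.

def mix(channels: list[list[int]]) -> list[int]:
    """Mix channels sample-by-sample, clamping to [-128, 127]."""
    length = len(channels[0])
    out = []
    for i in range(length):
        s = sum(ch[i] for ch in channels)
        out.append(max(-128, min(127, s)))  # saturate, don't wrap
    return out

# Four quiet channels mix cleanly; loud ones saturate:
print(mix([[30] * 4, [30] * 4, [30] * 4, [30] * 4]))
print(mix([[100] * 2, [100] * 2]))
```

The per-sample cost of this inner loop, times the output sample rate, is exactly the CPU budget question raised above.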
The Game Boy is much more similar to the Sega Master System & Game Gear than to the NES. Some games (e.g. Lemmings 2, Pinball Dreams, Spirou) were developed for both handhelds (or even for all 3 platforms) using a single assembly-language codebase.
The GameBoy is distinctly different from the (S)NES. It uses an 8080/Z80-ish CPU core instead of the 6502/65816, the audio and graphics hardware have different capabilities than the NES, etc. There were a lot of games that were both on the NES and the GB but they were rarely 1:1 ports for these and other reasons.
They're basically completely different other than broad architectural strokes. Different CPU ISA, the keys are sampled through a matrix rather than a bit banged shift register setup, audio channels are pretty different, GPU is internally pretty different (LCD allows pauses in the pixel stream unlike a CRT so the GPU uses that), etc.
The Super GameBoy is essentially full GameBoy hardware in a cart, that hijacks the SNES PPU for video output and uses SNES controllers for input. There is no hardware emulation going on.
Not sure if this is the right place to ask, but are there any retro emulator setups which allow you to play 2-player games over the internet?
I've heard something may be possible with RetroArch, but I'm surprised there's no Pi Zero with an easy multiplayer setup yet (for instance, to play 2-player Mario Tennis on the Game Boy).
RetroArch netplay worked very well for me. Hosted a game on one PC, it appeared in the online game browser, joined it from the 2nd PC.
In addition, it's very easy to keep your ROM inside the RetroArch folder, then send the whole folder to another person/computer. That way there's no confusion about ROM versions, and you can pre-configure the controls if the person you're sending the folder to isn't as tech-savvy.
Yes, but you need very good latency to the other player for it to work well. The typical setup is to use something like parsec where one person hosts the emulator and streams audio/video and the other person sends keyboard commands.
You can add a small delay to the host to even things out, but properly accounting for latency is a major reverse-engineering project which, afaik, has only been done for Super Smash Bros. Melee.
The readme is quite light on details. What is the performance? What's the SNES emulator? Is it this project as well? Could this run on the real thing, given that it sounds .exe-driven?
Pretty cool, but it definitely can't quite keep up. Just tried the Legend of Zelda, and it lags with flickering horizontal lines (not drawing fast enough?) on both snes9x and original hardware (via FXPAK Pro).
Just to put the achievement into perspective, this is code written for a console with a 1.79 MHz 6502 running on a 3.58 MHz 65816 (a 6502 with 16-bit capabilities crudely bolted on) -- that it even runs anywhere close to native speed and with so relatively few bugs is a monumental achievement considering that even though the SNES will run 6502 code, the audio and graphics hardware, etc., are entirely different between the consoles and have to be emulated.
I haven't tried this, but there is so much awesomeness in this idea. I wonder how many layers down one could pull off in the Nintendo line, even if you had to skip generations. For example could one do a Wii emulating GameCube emulating a SNES (skipping N64 as presumably too hard), emulating a NES.
This is because, largely speaking, the Wii is the same hardware as the GameCube (but with improved speed). That's why the Dolphin emulator, originally a GC emulator, was able to extend easily to also cover the Wii.
Right. That's why I didn't skip that generation in my example. My hunch was that, as a general rule for those earlier generations, it's not possible to emulate the immediately previous generation unless there was special support, but maybe you could go back two generations (this project being an exception). That's what led to the
Wii -> GameCube (similar hardware) -> [not N64] SNES -> NES (demonstrated by this project)
example. Heck, if console manufacturers kept two separate "every other generation" emulation branches (odd/even generations), and then made sure each console generation had extra hardware to play the immediately prior generation's games (as some sometimes do), then every generation could play every other generation. That would be nice for players, but maybe not so nice for the companies.
Arguably the Wii is emulating parts of the GameCube. I was pretty sure that certain MMIO banks on the GameCube side were being emulated by Starlet rather than backed by dedicated hardware blocks like the GameCube probably had. That's how the custom MIOS hacks floating around the internet allow USB controllers and mass storage to work with GameCube games.
The Wii already runs GC games (even centres the GC's miniature DVDs if they're loaded into the Wii's slot off centre), and supports GC controllers and GC memory cards.
The GameCube could emulate the N64, that's how games like Ocarina of Time were brought to the GameCube. Remember that the first N64 emulator, UltraHLE, was released in 1999 and could run on a computer less powerful than the GameCube: https://en.wikipedia.org/wiki/UltraHLE
Very cool and something that has been in demand for many years. Ideally this would be bundled into the SD2SNES firmware so you could load NES roms onto the SNES. https://sd2snes.de (it’s open source as well)
Spent at least three years of development, but couldn't be bothered to spend 15 minutes to write up how any living soul would use it without downloading and compiling the project to see help text from the .exe.
It converts an NES rom to an SNES rom. Here are the instructions from the source[1]:
Step 1, click "Open Nes" and select a game.
Step 2, (optional) select or create a profile.
Step 3, click "Save Snes", a file will be created in the same folder as your Nes game.
Step 4, play on Snes hardware or Snes emulator.
Step 5, (Optional) click "Load SRM" and select the SRM file(s) generated by the Nes emulator; feedback will be saved to the profile so the game can run faster after repeating steps 1-4.