RAM/visual limitations are quite easy to implement, in my opinion. 8-bit computers like the ZX Spectrum had RAM mapped to the screen, settings, code, etc. You could simulate that with a byte array: console code puts data there, and host code interprets it. One part of the host code reads a specific range of the array as pixels and displays it on screen (the GPU), another range is sound (the sound chip), another is sprites, the palette, etc. ROM can be a set of sprites, letters, and maybe some global parameters like the palette. I think the hardest restrictions are the console language and sandboxing.
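
Here is a rough sketch of the byte-array idea in Python. The memory map (addresses, screen size, one byte per pixel, a 4-color palette, a single sound register) and the names `poke`, `load_rom`, `render_frame` and `read_tone` are all made up for illustration. They are not taken from any real console.

```python
# Hypothetical memory map for an imaginary console.
# Every address and size below is an assumption chosen for illustration.
RAM_SIZE = 0x1000
VRAM_START, WIDTH, HEIGHT = 0x0000, 16, 8   # 1 byte per pixel: a palette index
PALETTE_START, PALETTE_COLORS = 0x0800, 4   # 3 bytes (R, G, B) per color
SOUND_START = 0x0900                        # 1 byte: tone register

ram = bytearray(RAM_SIZE)


def poke(addr, value):
    """Console side: write one byte into RAM (value is truncated to 8 bits)."""
    ram[addr] = value & 0xFF


def load_rom(data, dest):
    """Host side: copy ROM contents (palette, sprites, font...) into RAM."""
    ram[dest:dest + len(data)] = data


def render_frame():
    """Host side 'GPU': interpret the VRAM range as pixels using the palette."""
    frame = []
    for y in range(HEIGHT):
        row = []
        for x in range(WIDTH):
            index = ram[VRAM_START + y * WIDTH + x] % PALETTE_COLORS
            base = PALETTE_START + index * 3
            row.append(tuple(ram[base:base + 3]))  # (R, G, B)
        frame.append(row)
    return frame


def read_tone():
    """Host side 'sound chip': read the tone register."""
    return ram[SOUND_START]


# ROM here is just a palette: black, white, red, blue.
load_rom(bytes([0, 0, 0, 255, 255, 255, 255, 0, 0, 0, 0, 255]), PALETTE_START)

# Console code draws one red pixel at x=2, y=1 and sets a tone.
poke(VRAM_START + 1 * WIDTH + 2, 2)
poke(SOUND_START, 60)

frame = render_frame()
```

The console code only ever calls `poke`, so the RAM size and the memory map are the whole hardware limit. The host decides what each range means, and a real host would hand `frame` to whatever drawing library it uses.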