Chip8 Programming Questions Sticky

A topic by Internet Janitor created Aug 25, 2020 Views: 1,831 Replies: 57
HostSubmitted(+1)

This thread is for short questions about Chip8 programming and the Octo assembly language. If you have a more involved question, feel free to start a separate thread.

Submitted(+1)

I've noticed that a lot of the example games have a very obvious flicker when drawing a frame. Is that a limit of the engine?

HostSubmitted(+1)

That's a great question!

The Chip8 virtual machine runs at an arbitrary speed. On the original COSMAC VIP, different instructions took a different amount of time to execute, and execution was generally quite slow. On modern interpreters, including Octo, this is generally abstracted to simply executing some number of instructions per frame. Octo allows you to adjust this speed by using the "Speed" top menu. (30 or less could be considered loosely "realistic" if you want your program to run acceptably on an HP-48 calculator, and the VIP is a little closer to 15.)

While we don't know how fast instructions will execute, we do have one precise timing reference: the "delay" register. If nonzero, the delay register counts down 60 times a second. Thus, if we use delay as a reference, we can make a fast-running interpreter burn off excess time, and allow our games to run at the same apparent speed. There's an example of this delay loop in the Chip8 programming techniques guide, as well as in many examples.

Like the delay register, the Chip8 display is updated 60 times a second, no matter where the program is in execution or how much progress it has made in updating the display for the main loop. If objects are drawn in time on some frames and too late on other frames, they flicker.

So what do we do about it? Here are a few approaches:

  1. Increase emulation speed. Faster updates will often outright hide flicker. Some example programs suggest an appropriate speed to be run at; if the interpreter is too slow for a program, games will get sluggish and flicker more.
  2. Use a delay loop in combination with more speed.
  3. Minimize how much of the display you're redrawing on each frame. Using "clear" and redrawing everything each time is usually not a good approach for an action game. Use the xor-drawing feature of sprites to erase just the parts of the display you want to change before redrawing them.
  4. Minimize how long objects are erased for before being redrawn. The longer the gap between erasing and redrawing a sprite, the larger the chance it will look flickery. With specially-prepared sprites, it may be possible to "xor" the next frame of animation on top of the old one without erasing, giving buttery-smooth animation on even slow interpreters.
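
As a rough sketch of the "specially-prepared sprites" idea in item 4: if you store the XOR of two animation frames, drawing that difference on top of frame A turns it into frame B, and drawing it again turns it back, so the object is never absent from the screen. The sprite data, label names, and timing below are made up for illustration; frame B (0x3C 0x7E 0x7E 0x3C) never needs to be stored.

```octo
# Hypothetical two-frame animation using a precomputed XOR difference.
: frame-a   0x18 0x3C 0x3C 0x18      # frame A
: a-xor-b   0x24 0x42 0x42 0x24      # frame A XOR frame B, worked out by hand

: main
    va := 28  vb := 14
    i := frame-a
    sprite va vb 4                   # draw frame A once
    i := a-xor-b                     # sprite does not modify i, so set it once
    loop
        vc := 15  delay := vc        # hold each frame for 15 ticks (a quarter second)
        loop
            vc := delay
            if vc != 0 then
        again
        sprite va vb 4               # flips A to B, then B back to A, with no erase step
    again
```

This only works while the sprite stays in place; once it moves, you are back to erasing at the old position and drawing at the new one.
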
(1 edit) (+2)

> Use a delay loop in combination with more speed.

It seems that by doing this you could guarantee no flicker... I just glanced at the source, and the DT/ST (delay/sound) special registers are decremented right after the browser does a redraw (which is on a 60 FPS setTimeout)... so if you have a traditional update/render loop and just put off the render until immediately after DT hits 0, then as long as you're done drawing within 16ms you shouldn't see any flicker at all. I.e. for 60fps: render, set delay to 1, process input, etc... when delay is 0, render... rinse, repeat. That's my understanding from just a quick look.

To be clear I'm more interested in the VM on modern hardware... so if you're targeting say a HP-48 this might be much harder in practice.

HostSubmitted (1 edit) (+1)

Yep. If your interpreter is fast enough, and your rendering will never exceed the available time, doing what you describe will work. You could also initialize the delay timer before starting a render, so you're timing render+update and then burning off any excess in the sync loop.

How much speed a game requires is mostly up to the developer. Some very interesting ideas are possible with brute force. Games that have more modest requirements will be playable on a wider range of devices. Working within the performance envelope of old hardware can be a fun creative constraint, but you're welcome to ignore it if it isn't for you.

> You could also initialize the delay timer before starting a render, so you’re timing render+update and then burning off any excess in the sync loop.

Not sure I follow, but we both may be saying the same thing.

HostSubmitted(+1)

I was just drawing a distinction between

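loop
    # render
    vf := 1
    delay := vf         # start a one-tick (1/60 s) timer after rendering
    # update
    loop                # sync: spin until the timer reaches zero
        vf := delay
        if vf != 0 then
    again               # this "again" is skipped once delay hits 0
again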

and

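loop
    vf := 1
    delay := vf         # start the timer first, so it covers render + update
    # render
    # update
    loop                # sync: burn off whatever is left of the tick
        vf := delay
        if vf != 0 then
    again               # this "again" is skipped once delay hits 0
again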
Submitted(+2)

Something to consider, if you do happen to be targeting an HP48: actual pixel on-time is very important. This is because the pixels of its LCD are quite slow to transition between on and off states, and you are effectively drawing directly to the display. If your processing time per frame is very high, it's very helpful to avoid the pattern of

  1. Undraw Sprite at old location
  2. Do Heavy Calculations to choose new position
  3. Draw Sprite at new location
  4. Delay for FPS

and instead use a pattern that resembles

  1. Retain Copy of old location
  2. Do Heavy Calculations to choose new position
  3. Undraw Sprite at old location
  4. Draw sprite at new location
  5. Copy New Location over Old Location
  6. Delay for FPS

This approach could be helpful in other situations, too, such as when you're targeting e.g. 30 or even 15 fps due to massive processing requirements. While the above may seem obvious, the issue it solves usually seems insignificant in Octo. For example, both my own game, Octopeg, and the lander game Sub-Terr8nia, while perfect in Octo, were initially basically unplayable on a real HP48: each game's main sprite was basically invisible. Both games were modified from the first pattern to the second, and then both played perfectly.
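
For anyone who wants to see the second pattern as code, here is a minimal sketch. The register aliases, the `ball` sprite, and `move-ball` (a stand-in for the heavy calculations) are all invented for the example, and the 2-tick delay assumes a target of roughly 30 fps.

```octo
:alias px     va    # current position
:alias py     vb
:alias old-x  vc    # retained copy of the last drawn position
:alias old-y  vd

: ball  0x60 0xF0 0xF0 0x60

: main
    px := 10  py := 10
    old-x := px  old-y := py         # 1. retain a copy of the old location
    i := ball
    sprite px py 4                   # initial draw
    loop
        ve := 2  delay := ve         # start a 2-tick timer for this frame
        move-ball                    # 2. heavy calculations; only px/py change
        i := ball
        sprite old-x old-y 4         # 3. undraw at the old location...
        sprite px py 4               # 4. ...and redraw straight away
        old-x := px  old-y := py     # 5. copy new location over old
        loop                         # 6. delay for FPS
            ve := delay
            if ve != 0 then
        again
    again

: move-ball                          # hypothetical placeholder for the slow part
    px += 1
    py += 1
;
```

The sprite is only missing from the display for the gap between the two `sprite` calls, no matter how long `move-ball` takes.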

Submitted

Thanks for the tip, I'm considering targeting HP48. Quick question: Are the scrolling instructions heavy? How costly are they really? I have no idea how they work behind the scenes.

(2 edits)

I think this advice is relevant HP48 or not (as I read it)… though perhaps the phrase “heavy calculations” is doing all the work there… since on modern hardware you can just crank up the cycles per frame, making “heavy calculations” unimportant, whereas I imagine on the HP48 you’re much more limited to what you have.

OR perhaps the HP48 is writing to the screen real-time, instead of buffering? That would make a huge difference also… do you know?

The modern stuff (Octo) is all buffered so you can do anything you like so long as the screen is “ready” before the next 60Hz tick.

Submitted(+2)

It's important to note that Octo, while allowing us to ignore some things, also serves as a very good platform for developing games that would run on real hardware (from the eras where such hardware existed). Even though the games are written today, this jam typically receives a few submissions that have been created with this intent: running at realistic speeds and with the appropriate interpreter quirks. A good number of people are fairly passionate about retro computing and enjoy this aspect in particular, myself included. Sometimes the jam has been able to have the games demonstrated on the appropriate hardware, but emulators for the specific platforms also exist, which allow us to verify this to some degree.

To answer your question about the HP48 display - it's a little complicated? The HP48 display is updated from a buffer in the calculator's memory, but, it is on a line by line basis & spread out evenly across the entire 'frame' time - there's eg no flip() that updates the entire display all at once. The update rate for the whole display is effectively 64Hz, however the 'next' line is drawn at a rate of 4096Hz by the CPU, interrupting away from eg the interpreter as it goes, but allowing our code to run in between. This allows us to eg turn off a sprite halfway through it being drawn.

This is not quite the same as writing directly to the screen, but it shares some of the issues. The point where the line-by-line update coincides with when you 'delay' may also strobe across the display, as we have little to anchor it in place. If you might only have a sprite drawn into the buffer for eg 10% of that 64Hz tick, that may be the cause of the issues observed.

Even if that assessment is incorrect on a technical level, the strategy proposed earlier has been shown to make games significantly more playable on the HP48, and it's generally good advice that helps keep flickering to a minimum. If any given sprite is flickering in some way, it will significantly impact its vibrancy on screen on the HP48, as the pixels aren't super quick to respond in either direction. On the plus side, it also means that pixel queries to the display are relatively hidden.

Well, I hope that was interesting! The jam certainly attracts all sorts so, feel free to focus on the aspects that are most appealing to you!

(+2)

Hey, gearing up for the Jam and playing around with Octo a bit. One thing I'm trying to get my head around is macros and when and how they're allowed to be nested. As far as I can tell, if I try a macro invocation inside a calc or byte, it will be undefined - is that expected? Is there any way I can fully expand a (possibly nested) macro call, before it gets sent to a byte, or am I picturing things completely wrong?

HostSubmitted(+1)

Octo programs are assembled in one long linear pass. There are a few minor exceptions to this, like forward-referenced labels, but the conceptual model is not recursive or scoped. It works very much like a Forth.

A program is a sequence of tokens. Each statement, like "sprite" or "i := NNN", consumes some number of tokens from that sequence. The name of a previously-declared macro is a statement (an invocation), which consumes zero or more tokens (as arguments), and then dumps a list of tokens (the macro body) into the program's token sequence, substituting arguments as it goes. Another way to think about it is that macro invocations are always the head of a statement.

Constant expressions (inside { }) are their own separate domain from normal assembly, and macros usually only interact with them while substituting arguments as part of their expansion process. It's also worth noting that bindings created by ":calc" are global (much like aliases), and can be redefined as many times as needed over the course of assembly; they are not evaluated specially in relation to macro bodies or constrained to the scope of a macro where they appear.
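
To make that concrete, here is a small sketch (the names `entry`, `next-id`, and `table` are invented for the example). The macro is invoked as a statement, its argument is substituted into a constant expression during expansion, and the `:calc` binding it updates stays visible after the macro ends:

```octo
:calc next-id { 0 }                 # global binding; can be redefined later

:macro entry VALUE {
    :byte { next-id }               # emits the current id...
    :byte { VALUE * 2 }             # ...then the argument, substituted into an expression
    :calc next-id { next-id + 1 }   # rebinding is global, not scoped to this macro
}

: table
    entry 10      # assembles the bytes 0 20
    entry 21      # assembles the bytes 1 42

: main
    i := table
    load v3       # v0=0 v1=20 v2=1 v3=42
    loop again
```

Note the direction: `:byte` and `:calc` appear inside the macro body, rather than a macro invocation appearing inside `{ }`.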

Does that all make sense? Could you provide more context for what you're trying to accomplish?

(+2)

I'm weirdly interested in targeting the original CHIP8 implementation, with the goal being to ensure the game at least runs playably (if not particularly quickly) on the old hardware. It'll be a turn based game, so speed isn't critical. I of course lack the original hardware to test with. Questions about the "VIP" compatibility mode:

- Sprite drawing trigger vblank: does this mean no more than roughly 60 sprite calls per second, but other instructions have more leeway?
- Is 15 cycles/frame a reasonable approximation of the original hardware?

And a more general question: have I missed something, or is the only way to update I with a runtime value really to use self-modifying code? The resulting code structures feel quite awkward.

Submitted(+2)

Cool! I made two games last year that targeted the original COSMAC VIP CHIP-8 implementation, and I'm working on another now. Historical accuracy is definitely one of the things I enjoy most in Octojam.

"Sprite drawing trigger vblank" means that when Octo encounters a `sprite` instruction, it will halt and wait until the next frame is drawn before continuing execution, so drawing sprites will be a bottleneck for speed.

I think 7 or 15 cycles per frame is reasonable. It's all an approximation, of course, since in Octo all instructions take one "cycle" (AFAIK) while it varies between instructions on COSMAC VIP. Here's a table of real-world timing for each instruction.

The best approach is probably to test your game in the Emma02 emulator, which emulates the COSMAC VIP faithfully. As far as I remember it's a little hard to understand how to use it though.

HostSubmitted(+1)

It's true that self-modifying code is necessary if you want to load an arbitrary 12-bit address in one cycle. Depending on the specific circumstances, there may be simpler alternatives:

  • just use `i:=NNN` to set a base address and then add an offset with `i += vx`. This is sufficient for many kinds of array indexing.
  • If your array is no more than 256 bytes and elements are wider than 1 byte, you might be able to use pre-multiplied offsets.
  • If you need an array that is larger than 256 bytes and elements are wider than 1 byte, use `i:=NNN` to set a base address and then add your offset several times: twice for 2-byte-wide elements, 4 times for 4-byte-wide elements, etc.
  • If you only need to choose between a handful of 12-bit addresses, you could use conditionals or a jump table dispatched via `jump0`.

I have some examples and more discussion in the Chip8 programming techniques guide
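
As an illustration of the last bullet, here is one way a `jump0` table could look. The sprite labels and the `select-sprite` routine are hypothetical, and this assumes the classic behaviour where `jump0` adds v0 to the target address:

```octo
: sprite-a  0xF0 0x90 0x90 0xF0
: sprite-b  0x60 0x60 0x60 0x60
: sprite-c  0x90 0x60 0x60 0x90

: sprite-table               # each entry is 4 bytes: "i := NNN" plus a return
    i := sprite-a ;
    i := sprite-b ;
    i := sprite-c ;

: select-sprite              # in: v0 = index 0..2, out: i = address of that sprite
    v0 += v0
    v0 += v0                 # scale the index by the 4-byte entry size
    jump0 sprite-table       # the entry's return goes back to our caller

: main
    v0 := 2
    select-sprite            # i now points at sprite-c
    v1 := 10  v2 := 10
    sprite v1 v2 4
    loop again
```

Since the scaled index has to fit in v0, a table with 4-byte entries tops out at 64 entries.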

So, rather naively, I've just had an Octo tab open for a couple of days while I play about. However, today after undoing something, the program text switched to "loading...", at which point I couldn't get anything back (I even inspected local storage through the web developer tools). I assume I'm out of luck, but just thought I'd ask if there was any magic to get my code back, before I move to a more robust editor.

HostSubmitted(+1)

You may be out of luck, I'm afraid. Octo's local storage cache is a best-effort measure to rescue your programs from closed tabs. If you want your code to stick around for the long haul, your best bet is to save your code in a local text file or export a cartridge file.

Is there any more “developer like environment” hiding in the Octo repo somewhere? IE, I save my files on disk and the web browser auto-refreshes or at the very least I manually refresh it and my program is running the updated code from disk…

I could hack this together myself but wonder if anyone else has already done the work…

Submitted(+2)

There are syntax definitions for a few text editors here https://github.com/JohnEarnest/Octo and you can use the command line mode if that suits you better.

(+2)

No worries, was due a refactor anyway and this is a good reason to set up a proper build pipeline!

Submitted(+1)

What build pipeline did you end up with? I also lost some work to the dreaded "loading..." message the other day (although I had backed it up manually in a gist a few days before, so it wasn't so bad). I'm mostly using VSCode, and the Octo extension seems outdated and non-functional.

(+1)

Heh, possibly not for everyone, but behold (please don't judge my terrible Octo code, I'm trying, I'm trying):

Somebody has written an Emacs lisp CHIP-8 emulator so I can just work inside there where I'm most happy. Next steps would be to set up Flymake for on the fly compilation and error checking and then just hook it up to rebuild and rerun on every save, and seems like a nice productive environment. At the very least, I can't lose anything anymore.

HostSubmitted(+1)

The Octo mode I'm aware of for emacs is here, and it hasn't been updated with language features over time. If you make any improvements, I'd greatly appreciate it if you submitted pull requests to that repo or provided a link to a more up-to-date mode!

(+2)

Yup, I have considered switching yaks to shave that one, but also really want to finish the jam so I'm trying to control my natural urges!

Submitted(+2)

Hey there,

I have a question regarding labels in Octo. I want to have a look-up table of labels that point to data:

: my-data
  image-1
  image-2
  image-3
: image-1
  0xFF 0x00 0xFF # Et cetera

I use this table to look up the address of the right image and load that in "i" with some self-modifying code that adds a 0xA0 (load i) prefix. And it seemed to be working quite well so far.

However, now that my program has passed the 3.5k mark I'm getting an error "Value '<much>' for label 'image-1' cannot not fit in 12 bits!" (there's a small spelling mistake in that error, by the way). This has led me to realise that my table of labels doesn't really compile to a table of addresses, but to a table of call statements. And the fact that I AND those statements with the right nibble in my self-modifying code has kept that fact hidden from me.

So the question is... does Octo have a way to just put the address of a given label in the output binary? I looked through the manual, but didn't really find anything.

This example shows a table like this:

: table
  i := 0x123 ;
  i := 0x456 ;
  i := 0x789 ;
  i := 0xABC ;
  i := 0xDEF ;

Which would work with "i := long <label>;" too, I guess. But that would be six bytes per entry, instead of two. Seems like a waste :)

HostSubmitted(+2)

If you want to work with a "raw" 16-bit address, you can use a pair of ":byte" statements:

:const foo 0x1234
...
:byte { foo >> 8 }
:byte { foo }

When I need to do this, I usually generalize it into a macro:

:macro pointer ADDRESS {
    :byte { ADDRESS >> 8 }
    :byte { ADDRESS }
}
...
pointer foo
Submitted(+2)

Ah, that's a neat trick! But does that only work with constants? Or can I use it with labels too? :)

HostSubmitted(+2)

Yes- constants and labels are the same thing!

The only major limitation is that once you start operating on addresses with macros and doing compile-time calculations, you must declare those labels before referencing them in expressions. In some instances this may require you to use `:org` to assemble your program in a non-linear fashion.
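
Putting the pieces of this sub-thread together, a sketch of a pointer table plus a self-modifying load might look like the following. It assumes Octo's XO-CHIP mode, where `i := long NNNN` assembles to the bytes 0xF0 0x00 followed by the 16-bit address; the image data and the names `image-table`, `point-at-image`, and `patch` are invented for the example.

```octo
:macro pointer ADDRESS {
    :byte { 0xFF & ADDRESS >> 8 }    # high byte
    :byte { 0xFF & ADDRESS }         # low byte
}

: image-1  0xFF 0x00 0xFF
: image-2  0x00 0xFF 0x00

: image-table            # must come after the labels it refers to
    pointer image-1
    pointer image-2

: point-at-image         # in: v0 = table index, out: i = address of that image
    i := image-table
    i += v0
    i += v0              # entries are two bytes wide
    load v1              # v0 = high byte, v1 = low byte
    i := patch
    save v1              # overwrite the operand of the instruction below
    0xF0 0x00            # first half of "i := long ..." (XO-CHIP assumption)
: patch
    0x00 0x00            # second half, filled in at runtime
;

: main
    v0 := 1
    point-at-image       # i now points at image-2
    v2 := 8  v3 := 8
    sprite v2 v3 3
    loop again
```

Each table entry costs two bytes, and the only ordering constraint is that the table sits below the data it points to in the source.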

Submitted (1 edit) (+2)

Ahh, so that's what's going wrong. That's pretty annoying :/ I have a shit-ton of data that I need to reference in this way. I guess the data is going to the top of the file, cluttering up the place, if there is no other way.

I was just getting used to the jump0 trick, until I realised that I have 48 entries in the table and 47 * 6 = 282, which is more than v0 can hold... So v0 is overflowing and giving me errors... o_0'

If I want to have the data before my code, can I do something like this?

:org end-of-code
  <data>
:org 200
: main
  <code>
: end-of-code

This particular code gives me an "Undefined name 'end-of-code'".

Edit: Never mind! I don't have to move all the code under the data, I just have to move the look-up table under the data. Problem solved! Thanks guys :)

HostSubmitted (1 edit) (+2)

:org can't deal with a forward reference either. It would kinda require the assembler to contain a constraint solver to support that sort of thing in full generality.

For an XO-CHIP game, I'd recommend using a pair of macros something like the following:

:calc CODE_POS { 0x200  }
:calc DATA_POS { 0x1000 }
:macro to-code { :calc DATA_POS { HERE }  :org { CODE_POS } }
:macro to-data { :calc CODE_POS { HERE }  :org { DATA_POS } }

The idea is to reserve the low 4kb of RAM for code, and then position all of your data afterwards. By invoking these macros in turn, we can alternate between defining data and defining code, and thus keep related information together within our source code. Then it's easy enough to define "high RAM" addresses before we need to build pointers to them, for example.

: main
    jump draw-smile
to-data
: smile 0x50 0x00 0x88 0x70
to-code
: draw-smile
    i := long smile
    sprite v0 v0 4

This will often waste a bit of space in the gap between "code" and "data", but you can manually tweak the initial value of "DATA_POS" if things get really tight.

Make sense?

edit: And, yes, you're absolutely right: for your specific case you just have to put the lookup table in the right place. :)

Submitted(+1)

Thanks for the ideas! And the crazily quick replies! :)

(1 edit) (+2)

I think you’re looking for :byte:

:macro pointer ADDR {
	:byte { 0xFF & ADDR >> 8 }
	:byte { 0xFF & ADDR      }
}

Ha, beat me to it!

Submitted(+1)

Another beginner question about the Octo assembler... Is there such a thing as

:unpack long <address>

Because I need the full 16 bit address in v0 and v1, without prefix nibble. Am I missing something, or should this be a feature request? ;)

HostSubmitted (1 edit) (+1)

It's not a built-in feature, but you could write a macro:

:macro unpack-long ADDR {
  :calc hi { 0xFF & ADDR >> 8 }
  :calc lo { 0xFF & ADDR }
  v0 := hi
  v1 := lo
}

`:unpack` itself was added before macros existed, much like `:next`.

Submitted(+1)

That's interesting! The behaviour is a bit different though: I can :unpack a label that occurs later in my code, but with this macro I can only use labels that are placed before it in my code.

The reason I want to have a 16-bit unpack is to be able to dynamically point to data in the lower 61.5k from the code in the top 3.5k. I guess the advice would then still be to restructure the code with the whole :org to-data to-code trick..? :/

Why is it that :unpack only does 12 bits? Is it also from before the 65k expansion?

HostSubmitted(+1)

Yes; :unpack also predates XO-CHIP overall.

Submitted(+2)

Alright, I need to be able to do an "unpack-long" type of thing. Seeing as my file was ~3000 lines anyway, I decided to split it up into smaller files and recombine it in a build stage. This also allows me more flexibility in moving data and code around without getting annoyed by it.

So I'm trying do this:

:org 0x1000
  # Data here
:org 0x0200
: main
  # Code here
  loop
  again

But if you try this in Octo it will give you an error: "Data overlap. Address 0x0200 has already been defined". I guess there's an implicit ":org 0x0200" at the top of the file.

This works:

: main
    jump 0x0202
:org 0x1000
  # Data here
    
:org 0x0202
  # Code here
    loop
    again

But I mean, that's just plain annoying ;P Especially since an "unpack-long" is pretty much the first thing I want to do in my ": main". Do you have any suggestions on how to clean this up..?

HostSubmitted(+2)

Your first example, or anything like

:org 0x210
    0xAB 0xCD
:org 0x200
: main
    loop again

should work now.

I corrected a flaw in the assembler; give it another shot. It doesn't actually stem from any sort of implicit :org, but rather the fact that the "main" label is special. For programmer convenience, Octo will reserve space for a jump instruction at 0x200 to branch forward to main if it is not the first label defined in the program. Once you start re-orging things this gets a little fiddly, and it broke some assumptions I made very early in Octo's history. It would have worked fine if anything *except* main was the first definition in code-space, but now you can have those precious two bytes back.

Submitted(+1)

Oh no, you didn't just fix this issue in a matter of hours, just because I complained about it, did you? :) I hope I didn't inconvenience you too much... 

It works like a charm now, thanks for saving my two bytes ;P

(+2)

Now that is dedication! :-)

Submitted(+2)

Do you think you can explain how the "random" function works/how to use it? I've read the manual and tried to look through the code for Garlicscape, but I can't wrap my head around it.

Submitted (2 edits) (+2)

As far as I understand it, which is probably not that far ;), random takes a bitmask as an argument. The bitmask is a value that gets ANDed with a random number from 0 to 255. So:

v0 := random 0xFF   # v0 is a random number from 0 to 255
v0 := random 0x01   # v0 is randomly 0 or 1
v0 := random 0x07   # v0 is a random number from 0 to 7
v0 := random 0x08   # Careful: v0 is randomly 0 or 8 ;)
HostSubmitted (1 edit) (+2)

Exactly; the argument is an AND mask.

Sometimes it's clearer to write the argument to `random` in binary (e.g. `0b1100`) instead of hex or decimal. Then just imagine that every 1 in that mask could wind up as either 1 or 0 in the result.
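
One practical use of a mask with "holes" in it, as a sketch: picking a random position that is snapped to an 8-pixel grid on the 64x32 display. The `tile` sprite is made up for the example.

```octo
: tile  0xFF 0x81 0x81 0x81 0x81 0x81 0x81 0xFF

: main
    loop
        v0 := random 0b00111000   # x: one of 0 8 16 24 32 40 48 56
        v1 := random 0b00011000   # y: one of 0 8 16 24
        i := tile
        sprite v0 v1 8            # drawing on an occupied cell toggles it off again
        v2 := key                 # wait for a keypress before placing the next tile
    again
```

Because the low three bits of each mask are 0, the result is always a multiple of 8.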

Deleted post
Submitted

Treat i more like a pointer. You can advance it to get to another memory location. https://johnearnest.github.io/Octo/docs/Manual.html#statements

It’s easy enough to load it (using self modifying code)… but you can’t ever retrieve its value (that’s hidden inside the emulator)… so if you need that type of indirection then you manually need to use your OWN external pointer and track its value, loading it into i when necessary… which of course is slower.

Deleted post
Submitted

Try reading here, search for load and look at the examples how they are used: https://github.com/JohnEarnest/Octo/blob/gh-pages/docs/Chip8%20Programming.md

HostSubmitted(+1)

First, the simplest case: you have a single byte in memory and you want to load it into a register:

: data  0xAB
...
i := data     # set i to the base address
load v0       # load from i into v0

Slightly more complex: you have an array of up to 256 bytes, and you want to load one by index into a register:

: data  1 1 2 3 5 8 13
...
i := data    # set i to the base address
i += v1      # add an offset (in this case, the value in v1)
load v0      # load data[v1] into v0

Same as before, but now we want 2 bytes:

i := data    # set i to the base address
i += v1      # add an offset. items are two bytes wide...
i += v1      # (we alternatively could have pre-doubled v1.)
load v1      # v0=data[2*v1], v1=data[(2*v1)+1]

Are those cases clear?

Deleted post
Submitted(+1)

Another Octo related question: In the generated HTML file, the music doesn't start playing until I press a key. I assume that's one of those browser limitations, where you need a certain amount of "user engagement" before you're allowed to play music?  Or is it a bug in the Javascript?

HostSubmitted(+1)

Yeah; it's a browser limitation, I'm afraid. To start an audio context playing, you must kick it off as a direct result of a user input action. If the subsystem hasn't been set up yet, Octo will make an attempt in response to keyboard or touch input.

Submitted(+2)

Too bad, it ruins the atmosphere of the dramatic entrance to the game ;P But let me add to that: Octo is an amazing platform, and the XO-Chip extensions are a great addition to Chip-8. Thanks a lot for both! :)

Can you just begin the dramatic entrance AFTER the first time a key is seen?

Submitted(+1)

Yes, I could. I thought about adding another "Press any key" to the first screen (that doesn't have any sound, and currently just transitions after a given time). But then I'd have to redesign that screen to make room for the extra text. Ah well, you have to know when to stop sometimes ;) This is not an issue with Octo or my program, it's just a little annoyance because of the fact that we run it in a browser. That's okay with me.

How would I go about simulating an array or stack? Making a FILO stack is easy but a FIFO stack seems really complicated.

HostSubmitted

The Octo Repository has a few examples of different approaches to simple data structures and algorithms, and the FAQ has some simple examples of working with memory.

So long as the number of elements fits in a v-register, indexing into an "array" requires setting "i" to the base of the array, adding your offset, and then using load/save. A stack (FILO) requires persistent storage of the index of the top of the stack, perhaps in a reserved v-register. A queue (FIFO) is most easily implemented as a circular buffer, and requires two indices, for a "head" and "tail".
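
A minimal sketch of such a circular buffer, assuming a 16-byte queue so the indices can wrap with a simple AND. The names and register choices are hypothetical, and there is no overflow or underflow checking here:

```octo
:const QUEUE-MASK 0b1111     # 16 slots; a power of two so "& mask" wraps the index
:alias head  vc              # index of the next byte to read
:alias tail  vd              # index of the next free slot

: queue-data  0 0 0 0  0 0 0 0  0 0 0 0  0 0 0 0

: enqueue                    # in: v0 = value to append
    i := queue-data
    i += tail
    save v0
    tail += 1
    vf := QUEUE-MASK
    tail &= vf               # wrap around at the end of the buffer
;

: dequeue                    # out: v0 = oldest value (caller checks head != tail first)
    i := queue-data
    i += head
    load v0
    head += 1
    vf := QUEUE-MASK
    head &= vf
;

: main
    head := 0  tail := 0
    v0 := 7  enqueue
    v0 := 9  enqueue
    dequeue                  # v0 = 7
    dequeue                  # v0 = 9
    loop again
```

The queue is empty when head equals tail, so in practice it holds at most 15 values unless you track a separate count.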