OK, I now have another program I want to share. Or, kind of two. But one uses the other, so... The combination is one program? Maybe? I dunno.
This program also works in the default mode, and is primarily meant as a demonstration of a pretty old technique for handling code that won't fit in RAM. (I was introduced to it back in my DOS days, using Turbo Pascal... 5 I think? or 6?)
As such, this demonstration program basically goes the other way than my previous one, being far larger - in fact, as currently written, it fills most of the disk drive, taking up 240 addresses. Many of those are due to NILLISTs, though, which I included primarily because they made the development work easier. (More details below.)
The technique is known as overlays, and essentially entails loading only a part of the program at any one time, and then when a different part is needed, that part is loaded instead, on top of the other code, thus overlaying it - hence the name.
There's essentially two main parts of this program - one is the overlay loader, which stays resident in RAM at all times, and can be used by any program to load a chunk of code from the disk - while the other part is the demonstration program that uses overlays.
The overlay loader
In my implementation, the bootloader is also the overlay loader, since the task is essentially the same - in fact, the exact same code is used for the initial bootloading as for the overlay loading. Thus, my overlay loader is in ROM, not on disk - and isn't counted in the size of the demonstration program. This bootloader does still work for programs that don't use overlays, like the original win program.
Each overlay on disk is expected to have a one-word header, which contains the last address to be loaded. This works for non-overlay programs being bootloaded because they start at address 0, which makes the number of words to load be the same as the last address to load.
A few things worth noting:
- The loader initializes register 15 to 1, and expects you to never change it.
- The loader uses register 14 internally, and expects you to never change it.
- When called, the loader expects register 2 to be the disk address to load from.
- The loader changes registers 1, 2 and 3 (and 14), but leaves the rest untouched.
- Calling the loader is typically done with "JMP 3 9".
- Each overlay can be at most 22 words long (so it doesn't overwrite the loader) in the game's default mode.
- After loading an overlay, the loader always jumps to address 0.
Of course, all of these (except the last) can be ignored if you don't intend to use overlays. In that case, the loaded program can use 4 more words (26 total in default mode) before the loader has overwritten so much of itself that it can't load any more.
The main reason for using the suggested way of calling the loader is that it works even if the machine has been given more memory, as long as the loader itself has been adjusted accordingly (which you probably would want to do anyway to enable loading larger programs) - without needing to modify the program or overlay being loaded.
Here's the code for the {boot,overlay }loader:
SET 15 3 1 // R15=1 : initialize register for math use
MOVI 14 29 // R14=ram[29] : load the save-to-RAM instruction into a register
JMP 3 9 // jump to loader; R2 is already 0 which is where we should start
NILLIST 19
GETDATA 1 3 2 // R1=read(disk, R2) : read disk address of last instr. to load
MATH 1 3 5 // R3=R1 : set loop max register to that disk address
SET 14 3 0 // R14[address]=0 : initialize RAM address
MOVO 14 29 // ram[29]=R14 : update the save-to-RAM instruction; start of loop
MATH 15 14 0 // R14++ : increment RAM address
MATH 15 2 0 // R2++ : increment disk address
GETDATA 1 3 2 // R1=read(disk, R2) : read program data from disk
MOVO 1 0 // ram[0]=R1 : save loaded data to RAM; note: self-modified
IFJMP 2 5 3 // loop: jump 5 back if R2<R3
JMP 1 0 // start the loaded program
The demonstration program
The demonstration program is another implementation of the win screen, this time by using hard-coded line drawing operations rather than storing the screen as an image and loading that.
It does this without modifying itself at all, other than by loading overlays - the disk doesn't contain even a single MOVO instruction, nor any disk read/write instructions. (It could, it just doesn't need to, so it doesn't.)
There are two library procedure overlays, one for drawing horizontal lines, and one for drawing vertical lines, with parameters for where to draw that line and with what color (using registers to pass the arguments).
The program starts by calling horz-line twice for the top and bottom borders, then vert-line twice for the left and right borders, then calls horz-line in a loop to fill the background. (Thus showing overlays can be reused.)
Then, for the characters, it has a simplified copy of the vert-line procedure, which it calls 10 times for the various lines of the characters, with a few explicitly set pixels here and there.
Thus, in total it ends up drawing 20 lines plus setting 3 pixels, all done with actual code (instead of looping though a list of lines to draw, as then it would just be another image file viewer, just for a vector image instead of a raster image).
Obviously, all this code doesn't fit in 32 words of RAM, nevermind the 26 that are the most my bootloader can load from disk.
So, instead, I split it up into 10 overlays that are each 22 words or less (since that's the most my loader can load without overwriting itself in the default mode).
The exact length of each overlay varies, and some of them take advantage of this by first doing some initialization and then loading a smaller overlay on top of that initialization code, which runs and then resumes the code of the first overlay without having to reload it.
Of course, having to stop to load the next bit of code means that this runs slower (with pauses) than if everything was in RAM, but in return it's not as limited by RAM size. (Also, this demo program could still be optimized a bit.)
For ease of development (not having to keep modifying the addresses everywhere if I change the size of an overlay), the disk is split up into chunks of 24 words, that each contain one overlay. This "wastes" some space between the overlays, which is filled with NILs, but made things easier on me, so I considered it worth the tradeoff. (Of course, if I had needed more than 10 chunks, I would have run out of space. Luckily, I didn't.)
I actually wrote each overlay in a separate file, and then copied the code into a combined file with the disk formatting (and editing in the actual disk addresses), but since I did have to go back and edit things sometimes, having the chunks was nice. The combined file is the most useful one, so it's the one I've included here.
Here's the code for the demonstration program:
// main 0: draw top and bottom border
// start address: 0 = 00000000000000000000000000000000
// length: 16
DATAC 00000000000000000000000000010000 // last address: 16
JMP 0 4
DATAC 11000000000000000000000000000000 // Start pixel top-left
DATAC 00000000000000000000000000010000 // End X position
DATAC 00000000000000000000000000011000 // Disk address of hline proc
MOVI 10 1 // R10=ram[1] : load start pixel top-left
MOVI 11 2 // R11=ram[2] : load end X position
MOVI 2 3 // R2=ram[3] : load disk address of hline proc
JMP 3 9 // Jump to overlay loader : load and run the hline proc
MATH 10 2 5 // R2=R10 : previous start pixel
MOVI 10 14 // R10=ram[14] : load start pixel bottom-left
MATH 10 3 5 // R3=R10 : new start pixel
IFJMP 1 0 1 // if R2!=R3 (old start pixel is not same as new) then run again
MOVI 2 15 // R2=ram[15] : load disk address of next part of main
JMP 3 9 // Jump to overlay loader : load and run the next part of main
DATAC 11000011100000000000000000000000 // Start pixel bottom-left
DATAC 00000000000000000000000001001000 // Address of next part of main
// last address: 16
NILLIST 7 // alignment for the next overlay part
//
// subproc: draw horizontal line
// inputs:
// R10: start pixel and color
// R11: end X position (not inclusive - will not be drawn)
// overwrites: R1, R2, R3
// start address: 24 = 00000000000000000000000000011000
// length: 8
DATAC 00000000000000000000000000100000 // last address: 32
MATH 11 3 5 // R3=R11 : copy end X position
MATH 10 1 5 // R1=R10 : copy start pixel pos/color
MATH 2 2 1 // R2=0 : clear R2 so start pos will be the only thing in it
PMOV 1 2 2 5 26 1 // R2=R1[posX] : copy start position
SETDATA 0 3 1 // write current pixel to screen
MATH 15 2 0 // R2++ : increment X
PMOV 2 1 28 31 26 0 // R1[posX]=R2 : copy next X position
IFJMP 2 3 3 // if R2<R3 (curX < endX) then continue loop
// last address: 32
NILLIST 15 // alignment for next overlay part
//
// subproc: draw vertical line
// inputs:
// R10: start pixel and color
// R11: end Y position (not inclusive - will not be drawn)
// overwrites: R1, R2, R3
// start address: 48 = 00000000000000000000000000110000
// length: 8
DATAC 00000000000000000000000000111000 // last address: 56
MATH 11 3 5 // R3=R11 : copy end Y position
MATH 10 1 5 // R1=R10 : copy start pixel pos/color
MATH 2 2 1 // R2=0 : clear R2 so start pos will be the only thing in it
PMOV 1 2 6 8 23 1 // R2[posY]=R1[posY] : copy start position
SETDATA 0 3 1 // write current pixel to screen
MATH 15 2 0 // R2++ : increment Y
PMOV 2 1 29 31 23 0 // R1[posY]=R2[posY] : copy next Y position
IFJMP 2 3 3 // if R2<R3 (curY < endY) then continue loop
// last address: 56
NILLIST 15 // alignment for next overlay part
//
// main 1: draw left and right border
// start address: 72 = 00000000000000000000000001001000
// length: 16
DATAC 00000000000000000000000001011000 // last address: 88
JMP 0 4
DATAC 11000000100000000000000000000000 // Start pixel top-left
DATAC 00000000000000000000000000000111 // End Y position
DATAC 00000000000000000000000000110000 // Disk address of vline proc
MOVI 10 1 // R10=ram[1] : load start pixel top-left
MOVI 11 2 // R11=ram[2] : load end Y position
MOVI 2 3 // R2=ram[3] : load disk address of vline proc
JMP 3 9 // Jump to overlay loader : load and run the vline proc
MATH 10 2 5 // R2=R10 : previous start pixel
MOVI 10 14 // R10=ram[14] : load start pixel top-right
MATH 10 3 5 // R3=R10 : new start pixel
IFJMP 1 0 1 // if R2!=R3 (old start pixel is not same as new) then run again
MOVI 2 15 // R2=ram[15] : load disk address of next part of main
JMP 3 9 // Jump to overlay loader : load and run the next part of main
DATAC 11111100100000000000000000000000 // Start pixel top-right
DATAC 00000000000000000000000001100000 // Address of next part of main
// last address: 88
NILLIST 7 // alignment for next overlay part
//
// main 2: draw background
// start address: 96 = 00000000000000000000000001100000
// overwrites: R1, R2, R3, R10, R11, R12
// length: 20
DATAC 00000000000000000000000001110100 // last address: 116
JMP 0 4
DATAC 10000100100000000000000000000000 // Start pixel top-left
DATAC 00000000000000000000000000001111 // End X position
DATAC 00000000000000000000000000011000 // Disk address of hline proc
MOVI 10 1 // R10=ram[1] : load start pixel top-left
MOVI 11 2 // R11=ram[2] : load end X position
MOVI 2 3 // R2=ram[3] : load disk address of hline proc
JMP 0 8 // jump to the rest of the initialization
MATH 15 12 0 // R12++ : increment Y position
PMOV 12 10 29 31 23 0 // R10[posY]=R12[posY] : copy new Y position
MATH 12 2 5 // R2=R12 : copy Y position for comparison
MOVI 3 18 // R3=ram[18] : load end Y position
IFJMP 1 0 3 // if P2<P3 (curY < maxY) then continue loop
MOVI 2 19 // R2=ram[19] : load disk address of next part of main
JMP 3 9 // Jump to overlay loader : load and run the next part of main
MATH 12 12 1 // R12=0 : clear value for insertion of Y position bits
PMOV 10 12 6 8 23 1 // R12[posY]=R10[posY] : copy start Y position
JMP 3 9 // Jump to overlay loader : load and run the hline proc
DATAC 00000000000000000000000000000111 // End Y position
DATAC 00000000000000000000000001111000 // Address of next part of main
// last address: 116
NILLIST 3 // alignment for next overlay part
//
// main 3: draw W, part 1 of 2
// start address: 120 = 00000000000000000000000001111000
// overwrites: R1, R2, R3
// length: 22
DATAC 00000000000000000000000010001110 // last address: 142
MOVI 1 12 // R1=ram[12] : copy start pixel pos/color
MOVI 3 13 // R3=ram[13] : copy end Y position
JMP 1 15 // jump to drawing routine
MOVI 3 13 // R3=ram[13] : copy end Y position
IFJMP 0 4 1 // if R2!=R3 then second line is done already
SET 1 0 01001010 // R1[0]=.. : increment X position, keep rest
SET 3 3 00000111 // R3[3]=7 : update end Y position
JMP 1 15 // jump to drawing routine
SETDATA 0 0 0100110110000000000000 // draw top pixel for center of W
SETDATA 0 0 0100111000000000000000 // draw bottom pixel for center of W
MOVI 2 14 // R2=ram[14] : load disk address of next part of main
JMP 3 9 // Jump to overlay loader : load and run the next part of main
DATAC 01000100100000000000000000000000 // Start pixel, top-left
DATAC 00000000000000000000000000000100 // end Y position, middle of W
DATAC 00000000000000000000000010010000 // Address of next part of main
MATH 2 2 1 // R2=0 : clear R2 so start pos will be the only thing in it
PMOV 1 2 6 8 23 1 // R2[posY]=R1[posY] : copy start position
SETDATA 0 3 1 // write current pixel to screen
MATH 15 2 0 // R2++ : increment Y
PMOV 2 1 29 31 23 0 // R1[posY]=R2[posY] : copy next Y position
IFJMP 2 3 3 // if R2<R3 (curY < endY) then continue loop
JMP 1 3 // jump back to resume drawing commands with next line
// last address: 142
NILLIST 1 // alignment for next overlay part
//
// main 4: draw W, part 2 of 2
// note: uses code from the previous part
// start address: 144 = 00000000000000000000000010010000
// overwrites: R1, R2, R3
// length: 13
DATAC 00000000000000000000000010011101 // last address: 157
MOVI 1 10 // R1=ram[10] : copy start pixel pos/color
MOVI 3 11 // R3=ram[11] : copy end Y position
JMP 1 15 // jump to drawing routine
MOVI 3 11 // R3=ram[11] : copy end Y position
IFJMP 0 4 1 // if R2!=R3 then second line is done already
SET 1 0 01010010 // R1[0]=.. : decrement X position, keep rest
SET 3 3 00000111 // R3[3]=7 : update end Y position
JMP 1 15 // jump to drawing routine
MOVI 2 12 // R2=ram[12] : load disk address of next part of main
JMP 3 9 // Jump to overlay loader : load and run the next part of main
DATAC 01010100100000000000000000000000 // Start pixel, top-right
DATAC 00000000000000000000000000000100 // end Y position, middle of W
DATAC 00000000000000000000000010101000 // Address of next part of main
// last address: 157
NILLIST 10 // alignment for next overlay part
//
// main 5: draw I, and draw N part 1 (first line)
// note: uses code from a previous part
// start address: 168 = 00000000000000000000000010101000
// length: 15
DATAC 00000000000000000000000010110111 // last address: 183
MOVI 1 11 // R1=ram[11] : copy start pixel pos/color for I
MOVI 3 13 // R3=ram[13] : copy end Y position
JMP 1 15 // jump to drawing routine
MATH 1 2 5 // R2=R1 : copy current pixel pos/color
MOVI 3 12 // R3=ram[12] : copy start pixel pos/color for N
IFJMP 0 4 2 // if R2>R3 (cur pix > N start) then second line is done already
MATH 3 1 5 // R1=R3 : copy start pixel pos/color for N
MOVI 3 13 // R3=ram[13] : copy end Y position
JMP 1 15 // jump to drawing routine
MOVI 2 14 // R2=ram[14] : load disk address of next part of main
JMP 3 9 // Jump to overlay loader : load and run the next part of main
DATAC 01011100100000000000000000000000 // Start pixel for I
DATAC 01100100100000000000000000000000 // Start pixel for N, top-left
DATAC 00000000000000000000000000000111 // end Y position
DATAC 00000000000000000000000011000000 // Address of next part of main
// last address: 183
NILLIST 8 // alignment for next overlay part
//
// main 6: draw N part 2 (diagonal)
// note: uses code from a previous part
// start address: 192 = 00000000000000000000000011000000
// length: 13
DATAC 00000000000000000000000011001101 // last address: 205
MOVI 1 10 // R1=ram[10] : copy start pixel pos/color
MOVI 3 11 // R3=ram[11] : copy end Y position
JMP 1 15 // jump to drawing routine
MOVI 3 11 // R3=ram[11] : copy end Y position
IFJMP 0 4 1 // if R2!=R3 then second line is done already
SET 1 0 01101110 // R1[0]=.. : increment X position, keep rest
SET 3 3 00000110 // R3[3]=6 : update end Y position
JMP 1 15 // jump to drawing routine
MOVI 2 12 // R2=ram[12] : load disk address of next part of main
JMP 3 9 // Jump to overlay loader : load and run the next part of main
DATAC 01101001000000000000000000000000 // Start pixel, part 1 of diagonal
DATAC 00000000000000000000000000000100 // end Y position, middle of N
DATAC 00000000000000000000000011011000 // Address of next part of main
// last address: 205
NILLIST 10 // alignment for next overlay part
//
// main 7: draw N part 3 (last line), and draw exclamation mark, and halt
// note: uses code from a previous part
// start address: 216 = 00000000000000000000000011011000
// length: 13
DATAC 00000000000000000000000011100101 // last address: 229
MOVI 1 10 // R1=ram[10] : copy start pixel pos/color for N line
MOVI 3 11 // R3=ram[11] : copy end Y position
JMP 1 15 // jump to drawing routine
MOVI 3 11 // R3=ram[11] : copy end Y position
IFJMP 0 4 1 // if R2!=R3 then second line is done already
MOVI 1 12 // R1=ram[12] : copy start pixel pos/color for excl. mark
SET 3 3 00000101 // R3[3]=5 : set end Y position
JMP 1 15 // jump to drawing routine
SETDATA 0 0 0111101100000000000000 // draw bottom point for excl. mark
HLT // We're done drawing the win screen, so halt here
DATAC 01110000100000000000000000000000 // Start pixel, top of N line
DATAC 00000000000000000000000000000111 // end Y position, bottom of N
DATAC 01111000100000000000000000000000 // Start pixel, top of excl. mark
// last address: 229
NILLIST 10 // alignment for next overlay part