Post by edorfaus in Code sharing megathread

Viewing post in Code sharing megathread

I think the start-of-disk function index system will mostly be useful for when you're doing it manually and referring to the functions by ID yourself.

If your assembler/compiler/preprocessor/IDE (or whatever you want to call it) lets you use "FUNC <name>" and inserts the code to jump there on its own, keeping track of the IDs for you - then it must also keep track of the addresses for those IDs, which suggests it can probably just as easily use those addresses directly, without needing the IDs or such an index table.

Though, I suppose we would want to minimize the number of instructions (or other resources) needed to do a call, to maximize what's available for other code, so I guess havin the table might help with that.

Note: the rest of this post ended up being a rather in-depth analysis of the available approaches that such an assembler/etc. could use. Feel free to correct me if I've missed something, but apparently there's no cut-and-dried "best" approach, and it depends on circumstances... Also, I originally intended to respond to other parts of your post as well, but this is too long already, so I'll add separate replies for those later.

Let's see now. I don't think the JMP to the loader routine can be avoided (unless you can put the call at the end of the memory area), so that's one instruction, but since it's unavoidable we can ignore it for the purpose of comparing the different approaches.

So the main thing we need is to somehow get the actual disk address to load the function from into a register for the loader to use.

With index table

Now, if we have and use an index table somewhere on disk, what would the code for that look like?

Well, to start with, it means we must have a GETDATA instruction to load the actual address from the index table, based on the index.

GETDATA supports an immediate (constant) value, so can we use that for the index, to load the address directly, with just one instruction?

*checks the docs* ... well, that constant value gets shifted up by 10 bits (documentation says 11, but I think that's incorrect, and 11 would be even worse for this). This means that it can only be used to load from addresses that are multiples of 1024. Which means we get one function per 4KiB of disk space, minus one for address 0 since that's where the initially bootloaded program is. Not really useful with the default 256-word disk.

So, we can't practically have the table index inside that one instruction, which means it must be stored either in memory or in a register.

The memory variant means having the function index stored in memory somewhere, probably loaded as part of the current function, and having the GETDATA instruction point to that address. The obvious way to do this would leave each call site using two memory addresses.

Trying to be clever and putting the instruction in the loader doesn't really help either, since we'll either be limited to one call per function, or have to spend instructions on modifying memory (either the instruction or the address with the index), which will probably take more addresses total.

The register variant means getting the function index into a register somehow, which will (generally) require at least one instruction, but in return it's possible to move the GETDATA instruction to the loader.

If we dedicate a register to it and limit ourselves to 256 functions (minus the initially bootloaded program), we can use SET to have it use just one instruction/memory address per call site.

If we don't dedicate a register to it, or need more functions, we have to either store the function index in memory somewhere (to use MOVI), or use multiple instructions (e.g. MATH to clear + SET) - so at least two addresses per call site. This is no better than the memory variant, so we can disregard this option.

So, with an index table, we end up with these options to choose from:
- two memory addresses per call site
- one memory address per call site, plus one in the loader and one register, with limitations

This is in addition to the space taken by the table itself, and the JMP for each call site.

Without index table

Now, if we don't have an index table, and just use the address directly, what can we do then?

Using MOVI will work, and will take two memory addresses - one for the instruction, one for the function address. Simple and straightforward.

Using SET can work, and will take just one instruction, but requires all functions to start at addresses before 256 (since we can only use 8 bits), and that we're careful about how the load-address register is used (since the other 3 bytes must remain 0).

However, the loader is currently using R2 for the load address (to use IFJMP), which I doubt we can be that careful with, so we'll need to use another one, and copy it into R2 as needed. That copy instruction can be in the loader, but this will essentially require a dedicated register.

It's also possible to use multiple SET instructions, to ease (or even eliminate) the limitations, but that means using between two and four instructions per call site, without being any better than using MOVI, so we can disregard this option.

Using PMOV or MATH essentially requires already having the address (or something that can easily be turned into the address) in a register, which seems rather unlikely in the general case, without spending extra instructions on it. Generating a function index using this method is somewhat more likely, but not much. So either way, this can only be used for special cases.

So, without an index table, we end up with these options to choose from:
- two memory addresses per call site
- one memory address per call site, plus one in the loader and one register, with limitations

This is in addition to the JMP for each call site, but avoids taking space for a table.

These options are very similar to the ones with a table, though not quite the same, in that the limitations of the second option are a bit different.

Summary of options

The first option for both uses the same number of instructions, addresses and clock cycles, which suggests that with those approaches, the table is an unnecessary indirection that just takes extra disk space.

The second option for both also uses the same number of instructions, addresses and clock cycles. But they have different limitations.

Now, if the disk is no larger than 256 addresses (like the default disk), then the limitations are void for both, since they are based on needing more than 8 bits to address things. So in this case the table again just takes extra disk space.

However, if the disk is larger than that, then the limitations come into play, and then the table is actually useful as it allows us to have functions stored anywhere on the disk instead of just in the first 256 addresses. However, even with the table, we are limited to at most 256 functions, minus the space taken by the initial program.

Which approach to choose

The above suggests that the choice of approach will mainly be based on which resources are most dear, and how many function calls you have.

First off, if you need more than 256 functions, you obviously need to use the first option, and thus don't need the function table.

If you need all the registers you can get, but have room to spare in memory, then again the first option is probably best.

If you can spare a register, but have lots of calls and little room in memory, then the second option is probably best, as it saves you one word of memory per function call.

If you have little disk space, and at least a couple calls, then the second option without a table is probably best, as it saves you both the table and one word per function call (not just in memory, but on disk).

If your dearest resource is your sanity (because you don't have an assembler/etc. that does this work for you), then I think the second option with a table is probably best, closely followed by the first option with a table, with the non-table options pretty far behind...

edorfaus6 years ago

Aaargh. Of course. Just moments after posting, and I realized I missed something rather significant... (The worst part is that I'm pretty sure I had noticed it early on and started to write something about it, but it apparently got lost somewhere while editing/expanding, and then I forgot about it...)

For both of the with-table options, I missed/forgot that GETDATA always puts the result in R1, while the loader expects to get the load address in R2, so it can use it with IFJMP without having to move it there first.

This means that those options require another instruction, to move the address from R1 to R2, either per call site or in the loader itself.

Thus, the with-table options are always more expensive (in both memory size and clock cycles) than the non-table options, instead of being equivalent.

... I do think that most of the post is still valid, though, including the last section.

CliffracerX6 years ago

PSA: There is an edit button, so in the future, if you post something and go "aw crap, I should've remembered to add [insert thing here]!" - you have a way out without double-posting. I know I've done that before - it happens to the best of us! At least this isn't Twitter or something, right? :P

I am inclined to agree with pretty much all your conclusions on efficiency analysis - the tables are pretty much only good for human sanity when developing without an IDE or similar. Having to search & replace for address changes is an absolute pain. I don't have much in the way of comments beyond that - brain is a bit numb from it being late where I am, and from programming a kernel for the past several hours. Will re-read everything in the morning when I'm less zombified. >.<

edorfaus6 years ago(+1)

Yeah, I know there's an edit button, but at the time I really didn't feel like doing the editing required, since it wasn't just a matter of "should have added this" - to edit in the correction I would have had to change and rewrite significant chunks of the text, such as the summary, and it was already late (getting towards early), and it would have taken enough time that others might have read the first version in the meantime, so I'd have to add something about the correction anyway, and... yeah.

I suppose I could have added my reply as an "EDIT:" at the bottom or something, but it felt more appropriate as a reply, since it was a pretty significant correction. (And in my mind, double-posting is something entirely different than responding to yourself. But I guess I might be out of touch on that.)

itch.io

Viewing post in Code sharing megathread

With index table

Without index table

Summary of options

Which approach to choose