UTF-8 (or legacy encoding) support

A topic by antics · created 47 days ago · Views: 275 · Replies: 7

I know that this has been asked before and that the rationale for not implementing it is to keep the code base small. I respect that, but Decker becomes much less accessible for users (especially kids) whose language is not English. In Sweden and Finland, for example, we use åäö, and there have been other posts about French and Japanese.

How much refactoring of the code would be needed to add UTF-8 support (or support for a legacy or multi-byte encoding)?

(+2)

Well, the problem is, that would preclude the use of simple bitmap fonts. As in, fonts that are nothing more than a (small) bitmap. How do you even map UTF codepoints to something like that?
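
For context on why the two don't mix easily, here is a minimal C sketch (not Decker's actual code; the names and sizes are made up for illustration) of how a fixed-width ASCII bitmap font is typically indexed. An arbitrary Unicode codepoint simply has no slot in a table like this.

```c
/* A minimal sketch (not Decker's actual code) of how a fixed-width
   ASCII bitmap font is usually indexed: the glyph for a character is
   found by simple arithmetic on its byte value. */
#include <stdint.h>

#define GLYPH_H 8
#define FIRST   32   /* first printable ASCII character (space) */
#define COUNT   96   /* printable ASCII range: 32..127           */

/* one 8x8 glyph = 8 row bytes; the whole font is a tiny bitmap */
static uint8_t font[COUNT][GLYPH_H];

static const uint8_t *glyph_for(unsigned char c) {
    if (c < FIRST || c >= FIRST + COUNT)
        return font['?' - FIRST];  /* out of range -> '?' */
    return font[c - FIRST];
}
```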

(+2)

First off, I applaud Decker’s goal of a small, effective code base and I’m totally fine with how it works. I absolutely adore it!

I’m just curious though if Extended ASCII has been considered.

If it's viable, the native Decker fonts wouldn't necessarily need to support it: a font imported with a special "Extended ASCII" property could have the character codes from 128 to 255 applied to that specific font's rendering. This would cover most of Western Europe, and UTF-8 wouldn't be needed.

However, Extended ASCII is 8-bit, versus regular ASCII being 7-bit. I have no idea how Decker deals with font characters internally, but I know that Decker renders characters beyond 127 as a question mark, so it handles bytes in that range at some level. This also means that Extended ASCII would have to be accepted in string values in Decker's scripting language.
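
Purely as an illustration of the proposal (this is hypothetical, not an existing Decker feature, and the "Extended ASCII" property is an assumed name), routing bytes 128 to 255 to a second glyph bank on fonts flagged that way might look roughly like this in C:

```c
/* Hypothetical sketch of the proposal above, not an existing Decker
   feature: a font flagged as "Extended ASCII" carries an extra glyph
   bank for bytes 128..255 instead of mapping them to '?'. */
#include <stdint.h>

typedef struct {
    const uint8_t (*low)[8];   /* glyphs for 32..127                */
    const uint8_t (*high)[8];  /* glyphs for 128..255, NULL if none */
} font_t;

static const uint8_t *glyph_for(const font_t *f, unsigned char c) {
    if (c >= 32 && c < 128)           return f->low[c - 32];
    if (c >= 128 && f->high != NULL)  return f->high[c - 128];
    return f->low['?' - 32];          /* current behaviour: '?'     */
}
```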

I apologize if Extended ASCII has already been addressed.


Antics, I know it's not ideal, but you could make your own all-uppercase font that includes accented characters. For example, typing in lowercase renders the normal Latin uppercase characters, while typing capital letters renders accented letters. If you're versed in programming outside of Decker, you could write a script to convert normal strings of text into strings that fit your custom font. As long as you don't need more than 26 accented character variations it would work (you could use other keyboard characters as well). All caps isn't bad to read on screen for games like RPGs and such, though it might not be good for visual novels with large amounts of text.

It’s something to consider, at least.

(+2)

I know some non-English decks have been made by basically redefining unused characters in the font to be characters with accents and such - e.g. you might not need an @, #, or $ in your text, so these can be repurposed as accented characters. And if you're writing outside Decker you can just do a find-and-replace to convert the text before pasting it in. It very much depends on how many accented characters that language needs, though. It doesn't work well for input, but if you're just looking to display text it'll work.
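
As a rough sketch of the "find and replace before pasting" step suggested in the last two posts, here is a small stand-alone C filter. The particular mapping (å to @, ä to #, and so on) is only an example and should match whichever unused characters you redefined in your Decker font; it assumes the input is UTF-8.

```c
/* Sketch of the conversion done outside Decker: reads UTF-8 text on
   stdin, writes remapped ASCII on stdout. The mapping table is just
   an example; adjust it to the characters you repurposed in your
   custom font. */
#include <stdio.h>
#include <string.h>

static const struct { const char *utf8; char ascii; } map[] = {
    { "å", '@' }, { "ä", '#' }, { "ö", '$' },
    { "Å", '{' }, { "Ä", '|' }, { "Ö", '}' },
};

int main(void) {
    char line[4096];
    while (fgets(line, sizeof line, stdin)) {
        for (const char *p = line; *p; ) {
            size_t n = sizeof map / sizeof map[0], i;
            for (i = 0; i < n; i++) {
                size_t len = strlen(map[i].utf8);
                if (strncmp(p, map[i].utf8, len) == 0) {
                    putchar(map[i].ascii);  /* emit the repurposed character */
                    p += len;
                    break;
                }
            }
            if (i == n) { putchar(*p); p++; }  /* copy everything else as-is */
        }
    }
    return 0;
}
```

Compile it to, say, `remap` (the name is arbitrary) and run `./remap < text.txt`, then paste the output into Decker.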

(+3)

I just want to chime in to signal interest in "more than ASCII". In my case this would be the German umlauts (äöüßÄÖÜ).

I would love to introduce Decker as a vehicle for tiny tools at my workplace. The situation there is probably typical for enterprise offices: Windows, Microsoft Office, and the only option for automation being Microsoft Power*. Another option would be Python, but it lacks a nice GUI and an all-in-one appearance.

(1 edit) (+2)

UTF-8 seems overkill, but ISO 8859-1 / Latin 1 would be amazing!

(+2)

If Lil would at least sling bytes around without mangling them, I could embed lil.js as a more approachable data-exploration language in a work website, or use lilt for a foreign-language TUI flashcard program. As it is, even `write["out" "тест"]` will write ???? to the file, and `"тест"[0] = "?"` is 1. This feels like an active limitation that Lil takes extra steps to implement, and it greatly limits its usefulness for me, even though I find Lil very interesting as a Q-like language.

In Decker, it could be a requirement that *displaying* strings clamps them to the bitmap's range, which would mangle Unicode only on display - giving the programmer a chance to map intended ranges of non-ASCII characters to a displayable range.

(+2)

The 'cl' function in c/lil.h clamps bytes to ASCII, replacing other characters with '?'. The 'clchar'/'clchars' functions in js/lil.js do the same. The 'str_add' function in c/lil.h also skips multi-byte codepoints when filling strings, which happens with `read["somefile"]`.

So with a few changes to stub some of this out, you can get a Lil that "just slings bytes around", but there might still be a few landmines where other code expects the clamped values.
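
As a rough illustration (a paraphrase for readers poking at the source, not the actual c/lil.h code), the clamping behaviour described above has approximately this shape, and stubbing it out amounts to swapping the clamp for a pass-through:

```c
/* Not the actual lil.h source; just the rough shape of the behaviour
   described above: a clamp that maps any byte outside printable ASCII
   to '?'. "Stubbing it out" means letting bytes through unchanged
   here (and not skipping multi-byte codepoints when filling strings). */
static char clamp_ascii(unsigned char c) {
    return (c >= 32 && c < 127) ? (char)c : '?';  /* current behaviour */
}

static char pass_through(unsigned char c) {
    return (char)c;                               /* "just sling bytes" */
}
```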