SilverpineGame
8 Posts · 37 Followers · A member registered 67 days ago

Recent community posts

I limited it to about 33 words as a lazy way to guarantee an upper bound on the number of tokens left after updating an NPC's long-term memory, without having to make multiple API calls just for tokenization; I didn't think people would type that much in one message.
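For context, the cap is nothing more elaborate than a word-count truncation applied before the message goes to the model. A minimal sketch of the idea (Python, hypothetical names, not the actual game code; the worst-case tokens-per-word value is an assumption that depends on the tokenizer):

MAX_WORDS = 33                    # word cap used as a rough token budget
WORST_CASE_TOKENS_PER_WORD = 4    # assumed worst case for the tokenizer

def cap_player_message(text: str) -> str:
    # Truncating to MAX_WORDS bounds the prompt at roughly
    # MAX_WORDS * WORST_CASE_TOKENS_PER_WORD tokens without having to
    # call a tokenization endpoint for every message.
    words = text.split()
    return " ".join(words[:MAX_WORDS])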

I'll find a way to uncap it again for 0.4, which will probably be ready on Monday.

I'm at a loss as to how this bug slipped through the cracks, since I sold a lot of items during testing, but I'll release a patch for it tomorrow.

I double-checked everything on my end, so my only guess at this point is that you're using an ancient CPU without AVX2 support, or that something is preventing KoboldCPP from creating the _MEI folder in your temp directory.

If you're using a very old CPU (pre-2013 for Intel, pre-2015 for AMD) you could try replacing your koboldcpp.exe with https://github.com/LostRuins/koboldcpp/releases/download/v1.76/koboldcpp_oldcpu.... and/or selecting "Use Vulkan (Old CPU)".
If you're not using an ancient CPU, you could try disabling your antivirus, but I doubt that's the reason.
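If you're not sure whether your CPU has AVX2, one quick way to check (assuming you have Python installed; the py-cpuinfo package is just a convenient tool, not part of the game):

import cpuinfo  # pip install py-cpuinfo

# get_cpu_info() returns a dict; "flags" lists supported instruction-set
# extensions in lowercase, e.g. "avx2" on CPUs that support it.
flags = cpuinfo.get_cpu_info().get("flags", [])
print("AVX2 supported" if "avx2" in flags else "No AVX2 - use the oldcpu build or Vulkan (Old CPU)")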

This seems very exotic, so I can't look into it until I'm back home, but it should work if you select Vulkan instead of cuBLAS.

If you only have 6 GB of graphics memory combined with 16 GB of RAM, it might choke while loading the model if your other applications are taking up too much RAM.

If you have exactly 10 GB of graphics memory, the game will fully offload the model to the GPU, which should take 9.17 GB, but I have no way of verifying that this actually works under real-world conditions since I only have 8 GB myself.

To see what the problem is, open a command prompt in the StreamingAssets folder (as shown in the picture) and run this:

koboldcpp.exe --model "model.gguf" --usecublas --gpulayers 17 --multiuser --skiplauncher --highpriority

Replace --usecublas with --usevulkan if you don't have an Nvidia GPU.
Replace 17 with 27 if you have 8 GB of VRAM, or with 43 if you have 10 GB or more.
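For example, on a non-Nvidia card with 8 GB of VRAM, the adjusted command would be:

koboldcpp.exe --model "model.gguf" --usevulkan --gpulayers 27 --multiuser --skiplauncher --highpriority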

This way the window won't close when it throws the error.
If it's running out of memory, it should throw something like:

ggml_vulkan: Device memory allocation of size 1442316288 failed.
ggml_vulkan: vk::Device::allocateMemory: ErrorOutOfDeviceMemory
llama_kv_cache_init: failed to allocate buffer for kv cache
llama_new_context_with_model: llama_kv_cache_init() failed for self-attention cache

If this is the case, you can use the KoboldCPP GUI to manually adjust things until it works; the game just uses whatever KoboldCPP process is already running in the background if you launch the game while KoboldCPP is open.

Do let me know what the problem is, though, so I can fix it for the next version.

Yes.

It's just the vanilla instruct version of Mistral AI's NeMo 12B (compressed by bartowski). Nothing you type ever leaves your machine; dialogue is only stored in your save file.

Everything should set itself up automatically when you start the game, as long as the model file is in the right place.

If you don't have the game on an SSD, it might take a while before it's done loading.