Thank you so much for the feedback!
Unfortunately, Ollama doesn't have a "redo" function, and we apologize for that.
Anyway, the proposed model (quantized to 4 bits) is also available as a non-quantized fp16 version (16GB) and as an 8-bit quantized version (8.5GB). In general, quantization trades output quality for a smaller memory footprint and faster inference, so if you have enough RAM and a sufficiently powerful CPU, you could try one of the less quantized versions to see whether the issues you encountered are reduced while keeping a speed that suits your needs.
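If you're not sure which variant your machine can handle, you can check the size of each downloaded model and how much memory a loaded model is actually using. A minimal check using standard Ollama CLI commands (the exact output columns may vary between Ollama versions):
```
# List downloaded models and their sizes on disk
ollama list

# Show currently loaded models and their memory usage
ollama ps
```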
To download and use the 16GB model, replace the commands in the "pull" and "run" files with:
```
ollama pull tarruda/neuraldaredevil-8b-abliterated:fp16
ollama run tarruda/neuraldaredevil-8b-abliterated:fp16
```
To download and use the 8.5GB model, replace the commands in the "pull" and "run" files with:
```
ollama pull lstep/neuraldaredevil-8b-abliterated:q8_0
ollama run lstep/neuraldaredevil-8b-abliterated:q8_0
```
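If you switch to one of the larger variants, you can reclaim disk space by removing the 4-bit model you no longer need. A quick sketch (the `<model:tag>` placeholder is hypothetical; take the exact tag from the output of `ollama list`):
```
# Find the exact tag of the 4-bit model you pulled earlier
ollama list

# Remove it, substituting the tag shown by "ollama list"
ollama rm <model:tag>
```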
You can find information on these versions here: