Thank you so much for the feedback!
Unfortunately, Ollama doesn't have a "redo" function, and we apologize for that.
Anyway, the proposed model (quantized to 4 bits) is also available as a non-quantized fp16 version (16GB) and as an 8-bit quantized version (8.5GB). In general, quantization trades output quality for a smaller memory footprint and faster inference, so if you have enough RAM and a sufficiently powerful CPU, you could try one of the less quantized versions to see whether the issues you encountered are reduced while keeping a speed that suits your needs.
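If you're not sure which variant your machine can handle, you can check the size of each downloaded model and how much memory a loaded model is actually using. A minimal check using standard Ollama CLI commands (the exact output columns may vary between Ollama versions):
```
# List downloaded models and their sizes on disk
ollama list

# Show currently loaded models and their memory usage
ollama ps
```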
To download and use the 16GB model, replace the commands in the "pull" and "run" files with:
```
ollama pull tarruda/neuraldaredevil-8b-abliterated:fp16
ollama run tarruda/neuraldaredevil-8b-abliterated:fp16
```
To download and use the 8.5GB model, replace the commands in the "pull" and "run" files with:
```
ollama pull lstep/neuraldaredevil-8b-abliterated:q8_0
ollama run lstep/neuraldaredevil-8b-abliterated:q8_0
```
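If you switch to one of the larger variants, you can reclaim disk space by removing the 4-bit model you no longer need. A quick sketch (the `<model:tag>` placeholder is hypothetical; take the exact tag from the output of `ollama list`):
```
# Find the exact tag of the 4-bit model you pulled earlier
ollama list

# Remove it, substituting the tag shown by "ollama list"
ollama rm <model:tag>
```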
You can find information on these versions here: