submitted 1 year ago* (last edited 1 year ago) by darkeox@kbin.social to c/localllama@sh.itjust.works

Hi,

Just like the title says:

I'm trying to run:

With:

  • koboldcpp:v1.43 using HIPBLAS on a 7900XTX / Arch Linux

Running:

--stream --unbantokens --threads 8 --usecublas normal
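
For reference, a full invocation with those flags would look roughly like this (the model path is just a placeholder, since the actual file isn't shown above):

python koboldcpp.py /path/to/model.gguf --stream --unbantokens --threads 8 --usecublas normal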

I get very limited output with lots of repetition.

Illustration

I mostly didn't touch the default settings:

Settings

Does anyone know how I can make things run better?

EDIT: Sorry for multiple posts, Fediverse bugged out.

[-] rufus@discuss.tchncs.de 1 points 1 year ago* (last edited 1 year ago)

Try MythoMax. I've had good results with it, for storytelling and all kinds of other tasks. That will tell you whether the model itself is the problem. I think some of the merges or "super" variants had issues or were difficult to pull off correctly.

Also try the option --usemirostat 2 5.0 0.1. It overrides most sampling options and adjusts things automatically. In your case it should mostly help rule out misconfiguration as a possibility.
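
Combined with your current flags, that would look roughly like this (model path again a placeholder):

python koboldcpp.py /path/to/model.gguf --stream --unbantokens --threads 8 --usecublas normal --usemirostat 2 5.0 0.1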

[-] darkeox@kbin.social 1 points 1 year ago

I'll try that Model. However, your option doesn't work for me:

koboldcpp.py: error: argument model_param: not allowed with argument --model
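
That argparse error usually means the model file got passed twice, once as the positional model_param and once via --model, which koboldcpp treats as mutually exclusive. Assuming that's the cause here, passing the path only one way should avoid it, e.g.:

python koboldcpp.py --model /path/to/model.gguf --stream --unbantokens --threads 8 --usecublas normal --usemirostat 2 5.0 0.1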
