submitted 1 year ago by Blaed@lemmy.world to c/fosai@lemmy.world

From a recent PR by oobabooga:

This is what I get with 24 GB of VRAM (I haven't tested extensively, so it may be possible to go higher):

Model     | Params                                   | Maximum context
llama-13b | max_seq_len = 8192, compress_pos_emb = 4 | 6079 tokens
llama-30b | max_seq_len = 3584, compress_pos_emb = 2 | 3100 tokens

I also removed the chat_prompt_size parameter, since truncation_length can be reused for its purpose.

Now possible in text-generation-webui after this PR: https://github.com/oobabooga/text-generation-webui/pull/2875

I didn't do anything other than exposing the compress_pos_emb parameter implemented by turboderp here, which in turn is based on kaiokendev's recent discovery: https://kaiokendev.github.io/til#extending-context-to-8k
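For anyone curious what compress_pos_emb does conceptually, here is a rough sketch, assuming it works the way kaiokendev's write-up describes: the position index is divided by the compression factor before the rotary angles are computed, so a longer sequence is squeezed into the position range the model was trained on. The function name rotary_angles and the head_dim and base defaults are illustrative assumptions, not code taken from ExLlama or the PR.

```python
def rotary_angles(position, head_dim=128, base=10000.0, compress_pos_emb=1.0):
    """Return the rotary embedding angles for one token position.

    Hypothetical helper for illustration only. Dividing the position by
    compress_pos_emb is the assumed "compression" step.
    """
    scaled_position = position / compress_pos_emb
    # One angle per pair of dimensions, using the usual RoPE frequency schedule.
    return [
        scaled_position * base ** (-2 * i / head_dim)
        for i in range(head_dim // 2)
    ]


# With a factor of 4, position 8188 gets the same angles that
# position 2047 would get in an uncompressed 2048-token model.
compressed = rotary_angles(8188, compress_pos_emb=4)
native = rotary_angles(2047)
print(compressed == native)
```

Under this assumption, no position ever produces angles outside the range seen during training, which is why the factor is tied to max_seq_len / 2048.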

How to use it

  • Open the Model tab and set the loader to ExLlama or ExLlama_HF.

  • Set max_seq_len to a number greater than 2048. The length that you will be able to reach will depend on the model size and your GPU memory.

  • Set compress_pos_emb to max_seq_len / 2048. For instance, use 2 for max_seq_len = 4096, or 4 for max_seq_len = 8192.

  • Select the model that you want to load.

  • Set truncation_length accordingly in the Parameters tab. You can set a higher default for this parameter by copying settings-template.yaml to settings.yaml in your text-generation-webui folder, and editing the values in settings.yaml.

  • Those two new parameters can also be used from the command line. For instance: python server.py --max_seq_len 4096 --compress_pos_emb 2
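The steps above can be sketched as a small helper that picks compress_pos_emb for a given max_seq_len and assembles the command line. This is only a sketch: compress_factor and server_command are hypothetical names, and rounding up for lengths that are not a multiple of 2048 is an assumption chosen to match the llama-30b row above (3584 paired with 2). Only the two flags shown in the post are used.

```python
import math

NATIVE_CONTEXT = 2048  # base context length the post divides by


def compress_factor(max_seq_len, native=NATIVE_CONTEXT):
    """Return max_seq_len / 2048, rounded up (rounding is an assumption)."""
    if max_seq_len <= native:
        return 1
    return math.ceil(max_seq_len / native)


def server_command(max_seq_len):
    """Build the argument list for launching server.py with both flags."""
    return [
        "python", "server.py",
        "--max_seq_len", str(max_seq_len),
        "--compress_pos_emb", str(compress_factor(max_seq_len)),
    ]


for length in (4096, 8192, 3584):
    print(" ".join(server_command(length)))
```

The resulting list can be passed to subprocess.run from inside the text-generation-webui folder; remember to set truncation_length to match afterwards.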

ArkyonVeil@lemmy.world 3 points 1 year ago

Thanks for reposting the breakthroughs!

Makes me have to visit Reddit less for news.

It even rhymes, how neat is that.

this post was submitted on 27 Jun 2023