17

cross-posted from: https://lemmy.world/post/19925986

https://huggingface.co/collections/Qwen/qwen25-66e81a666513e518adb90d9e

Qwen 2.5 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B just came out, with some variants in some sizes just for math or coding, and base models too.

All Apache licensed, all 128K context, and the 128K seems legit (unlike Mistral).

And it's pretty sick, with a tokenizer that's more efficient than Mistral's or Cohere's and benchmark scores even better than llama 3.1 or mistral in similar sizes, especially with newer metrics like MMLU-Pro and GPQA.

I am running 34B locally, and it seems super smart!

As long as the benchmarks aren't straight up lies/trained, this is massive, and just made a whole bunch of models obsolete.

Get usable quants here:

GGUF: https://huggingface.co/bartowski?search_models=qwen2.5

EXL2: https://huggingface.co/models?sort=modified&search=exl2+qwen2.5

no comments (yet)
sorted by: hot top controversial new old
there doesn't seem to be anything here
this post was submitted on 18 Sep 2024
17 points (100.0% liked)

Free Open-Source Artificial Intelligence

2930 readers
1 users here now

Welcome to Free Open-Source Artificial Intelligence!

We are a community dedicated to forwarding the availability and access to:

Free Open Source Artificial Intelligence (F.O.S.A.I.)

More AI Communities

LLM Leaderboards

Developer Resources

GitHub Projects

FOSAI Time Capsule

founded 2 years ago
MODERATORS