That's very cool, any idea about tokens/sec performance and on what hardware? For reference my 3070 gets ~19-25 tokens/sec with llama3 7B.
Only works on Apple silicon. Am I reading that right?
No, they just mention that only Apple silicon is supported if you're using macOS
I tried running Ollama with the Mistral model. You need a good graphics card to run your own LLM; I had to wait 20 minutes for one full response.
Granted, the laptop I was running it on was garbage, but it really put into perspective how expensive running an LLM can be.
This shit won't be free forever.
AI Companions
A community to discuss AI-powered companionship, whether platonic, romantic, or purely utilitarian. Examples include Replika, Character AI, and ChatGPT. Talk about the software and hardware used to create companions, or discuss the phenomenon of AI companionship in general.
Tags:
(including but not limited to)
- [META]: Anything posted by the mod
- [Resource]: Links to resources related to AI companionship. Prompts and tutorials are also included
- [News]: News related to AI companionship or AI companionship-related software
- [Paper]: Works that present research, findings, or results on AI companions and their tech, often including analysis, experiments, or reviews
- [Opinion Piece]: Articles that convey opinions
- [Discussion]: Discussions of AI companions, AI companionship-related software, or the phenomenon of AI companionship
- [Chatlog]: Chats between the user and their AI Companion, or even between AI Companions
- [Other]: Whatever isn't part of the above
Rules:
- Be nice and civil
- Mark NSFW posts accordingly
- Criticism of AI companionship is OK as long as you understand where people who use AI companionship are coming from
- Lastly, follow the Lemmy Code of Conduct