this post was submitted on 14 Feb 2026
142 points (98.0% liked)

Technology

[–] wonderingwanderer@sopuli.xyz 9 points 23 hours ago

A token is basically a small chunk of text: a common word, a piece of a word, a bit of punctuation, or whitespace.

LLMs don't operate on whole words or phrases directly. The tokenizer splits text into pieces based on how frequently those pieces appear in its training data, so "Dave shot a hole in one at the golf course" mostly comes out as one token per common word, while a rarer word would get broken into smaller fragments. The idiomatic meaning of "a hole in one" isn't something the tokenizer knows about; the model picks that up later from the surrounding tokens.

That splitting step is called "tokenizing", and it's done by a tokenizer (typically byte-pair encoding or a variant like WordPiece or SentencePiece), so the same text can be split slightly differently depending on which tokenizer a model uses.
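For example, here's a rough sketch using the tiktoken library (one of the tokenizers OpenAI's models use; I'm assuming the cl100k_base encoding, and other tokenizers will split the sentence differently):

```python
# Rough sketch using the tiktoken library (pip install tiktoken).
# Other models use different tokenizers, so the exact split will vary.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # encoding used by GPT-4-era models

text = "Dave shot a hole in one at the golf course"
token_ids = enc.encode(text)                 # list of integers, one per token
pieces = [enc.decode([t]) for t in token_ids]

print(token_ids)
print(pieces)   # common words tend to be single tokens, rarer words get split up
```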

Each token then gets mapped to a vector (an embedding), and the model runs those vectors through stacks of transformer layers where attention heads work out how strongly each token relates to the others. That's what lets it predict a probability distribution for the next token, which is how it generates a response, one token at a time.
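To make the attention bit concrete, here's a minimal single-head sketch in numpy, with toy sizes and random weights (a real model has learned weights, many heads per layer, and dozens of layers):

```python
# Minimal sketch of scaled dot-product attention with numpy.
# Toy sizes and random weights, just to show the shape of the computation.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 5, 16                     # 5 tokens, 16-dimensional embeddings

x = rng.normal(size=(seq_len, d_model))      # token embedding vectors
W_q = rng.normal(size=(d_model, d_model))
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))

Q, K, V = x @ W_q, x @ W_k, x @ W_v

scores = Q @ K.T / np.sqrt(d_model)          # how much each token "attends" to the others
mask = np.triu(np.ones((seq_len, seq_len)), k=1).astype(bool)
scores[mask] = -np.inf                       # causal mask: can't look at future tokens

weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row

out = weights @ V                            # each token's new vector is a weighted mix
print(out.shape)                             # (5, 16)
```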

It's a bit more involved than that, of course: billions of learned parameters, big matrix multiplications, causal masks, softmax, dropout during training, and the "context window", which is the maximum number of tokens the model can take in at once. But that's the gist of it.
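And the generation loop is conceptually just this (a rough sketch: `model`, `CONTEXT_WINDOW`, and `generate` are made-up names standing in for the whole transformer stack, not a real API):

```python
# Conceptual sketch of next-token generation. `model` is a stand-in for the
# whole transformer stack: it returns one score (logit) per vocabulary entry
# for the next token. It is not a real library call.
import numpy as np

CONTEXT_WINDOW = 4096   # max number of tokens the model can see at once

def generate(model, token_ids, n_new_tokens):
    token_ids = list(token_ids)
    for _ in range(n_new_tokens):
        window = token_ids[-CONTEXT_WINDOW:]   # drop tokens that no longer fit
        logits = model(window)                 # one score per vocabulary token
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()                   # softmax -> probability distribution
        next_id = int(np.argmax(probs))        # greedy pick (real samplers vary)
        token_ids.append(next_id)
    return token_ids
```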

But a token is just the basic unit that gets run through those processes.