this post was submitted on 14 Feb 2026
142 points (98.0% liked)

Technology

[–] wonderingwanderer@sopuli.xyz 9 points 23 hours ago

A token is basically a small chunk of text: a common word, a piece of a word, a bit of punctuation, or whitespace.

LLMs don't operate on whole words or phrases directly. The tokenizer splits text into pieces based on how frequently those pieces appear in its training data, so "Dave shot a hole in one at the golf course" mostly comes out as one token per common word, while a rarer word would get broken into smaller fragments. The idiomatic meaning of "a hole in one" isn't something the tokenizer knows about; the model picks that up later from the surrounding tokens.

That splitting step is called "tokenizing", and it's done by a tokenizer (typically byte-pair encoding or a variant like WordPiece or SentencePiece), so the same text can be split slightly differently depending on which tokenizer a model uses.
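For example, here's a rough sketch using the tiktoken library (one of the tokenizers OpenAI's models use; I'm assuming the cl100k_base encoding, and other tokenizers will split the sentence differently):

```python
# Rough sketch using the tiktoken library (pip install tiktoken).
# Other models use different tokenizers, so the exact split will vary.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # encoding used by GPT-4-era models

text = "Dave shot a hole in one at the golf course"
token_ids = enc.encode(text)                 # list of integers, one per token
pieces = [enc.decode([t]) for t in token_ids]

print(token_ids)
print(pieces)   # common words tend to be single tokens, rarer words get split up
```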

Each token then gets mapped to a vector (an embedding), and the model runs those vectors through stacks of transformer layers where attention heads work out how strongly each token relates to the others. That's what lets it predict a probability distribution for the next token, which is how it generates a response, one token at a time.
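To make the attention bit concrete, here's a minimal single-head sketch in numpy, with toy sizes and random weights (a real model has learned weights, many heads per layer, and dozens of layers):

```python
# Minimal sketch of scaled dot-product attention with numpy.
# Toy sizes and random weights, just to show the shape of the computation.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 5, 16                     # 5 tokens, 16-dimensional embeddings

x = rng.normal(size=(seq_len, d_model))      # token embedding vectors
W_q = rng.normal(size=(d_model, d_model))
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))

Q, K, V = x @ W_q, x @ W_k, x @ W_v

scores = Q @ K.T / np.sqrt(d_model)          # how much each token "attends" to the others
mask = np.triu(np.ones((seq_len, seq_len)), k=1).astype(bool)
scores[mask] = -np.inf                       # causal mask: can't look at future tokens

weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row

out = weights @ V                            # each token's new vector is a weighted mix
print(out.shape)                             # (5, 16)
```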

It's a bit more involved than that, of course: billions of learned parameters, big matrix multiplications, causal masks, softmax, dropout during training, and the "context window", which is the maximum number of tokens the model can take in at once. But that's the gist of it.
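And the generation loop is conceptually just this (a rough sketch: `model`, `CONTEXT_WINDOW`, and `generate` are made-up names standing in for the whole transformer stack, not a real API):

```python
# Conceptual sketch of next-token generation. `model` is a stand-in for the
# whole transformer stack: it returns one score (logit) per vocabulary entry
# for the next token. It is not a real library call.
import numpy as np

CONTEXT_WINDOW = 4096   # max number of tokens the model can see at once

def generate(model, token_ids, n_new_tokens):
    token_ids = list(token_ids)
    for _ in range(n_new_tokens):
        window = token_ids[-CONTEXT_WINDOW:]   # drop tokens that no longer fit
        logits = model(window)                 # one score per vocabulary token
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()                   # softmax -> probability distribution
        next_id = int(np.argmax(probs))        # greedy pick (real samplers vary)
        token_ids.append(next_id)
    return token_ids
```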

But a token is just the basic unit that gets run through those processes.