this post was submitted on 26 May 2026

I just jailbroke my Kindle and have a few EPUBs, and I thought maybe this would be a good time to change my approach to vocabulary.

What I'd like to do is learn the vocabulary for my reading before I read it, instead of after, or as I'm reading it.

My dream piece of software would do the following:

  1. Resolve all words down to their most basic form (i.e., singular for nouns, infinitive for verbs, etc.). My language is French.

  2. Count occurrences of each word.

  3. Filter out words I already know.

  4. Define the words with a bilingual dictionary to English, including the original context sentence.

  5. Make Anki cards for me to study.

  6. (God-tier programming: also include idiomatic expressions as vocabulary.)

Does this exist?

Edit: Or help me assemble a pipeline to get all these tasks done separately.

[–] emb@lemmy.world 3 points 1 hour ago

JPDB.io does something like this for Japanese. I'm not sure you can really import books, but it basically combines some kind of parser with a dictionary API, an example sentence corpus, and its own spaced repetition system.

There's gotta be something along those lines out there for most languages, but I can't say I know of the tools. Honestly, the breaking-down-into-a-base-word part is probably in the dictionary's domain: if you give it a conjugated verb, it should usually be able to tell you the infinitive. Some ambiguities need context, though, and I'm not sure how to account for that.
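For what it's worth, the count-and-filter steps from the original list are the easy part once you have base forms. Here's a rough sketch in plain Python, not a tested tool. `TOY_LEMMAS` is a made-up stand-in table so the example runs on its own; for real French text you'd swap `lemmatize` for something language-aware (spaCy's French models expose a `lemma_` attribute on each token, for instance). The tokenizer is deliberately naive and splits on apostrophes and hyphens, so "l'homme" becomes "l" and "homme".

```python
import re
from collections import Counter

# Hypothetical toy lemma table, only here so the sketch is runnable.
# A real pipeline would get base forms from a French-aware lemmatizer.
TOY_LEMMAS = {
    "chats": "chat",
    "mange": "manger",
    "mangent": "manger",
    "les": "le",
    "la": "le",
}


def lemmatize(word):
    """Return the base form if the toy table has one, else the word itself."""
    return TOY_LEMMAS.get(word, word)


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters (accents included)."""
    return re.findall(r"[^\W\d_]+", sentence.lower())


def unknown_vocab(text, known):
    """Return (lemma, count, first context sentence), most frequent first."""
    counts = Counter()
    context = {}
    for sentence in split_sentences(text):
        for word in tokenize(sentence):
            lemma = lemmatize(word)
            if lemma in known:
                continue  # skip words already known
            counts[lemma] += 1
            context.setdefault(lemma, sentence)  # keep the first sentence seen
    return [(lemma, n, context[lemma]) for lemma, n in counts.most_common()]


if __name__ == "__main__":
    sample = "Le chat mange. Les chats mangent la souris."
    for row in unknown_vocab(sample, known={"le"}):
        print(row)
```

The known-word set could just be a text file with one base form per line. This doesn't solve the ambiguity problem at all, since a lookup table can't tell which meaning a form has in a given sentence.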

AnkiConnect lets you tap into the Anki APIs, and Wiktionary or (from a quick search) Collins should have a dictionary API available for French-English. If the dictionary APIs are good, you could probably get pretty far with basic sentence parsing.
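The Anki side could look roughly like this, assuming the AnkiConnect add-on is installed and Anki is running. AnkiConnect listens on `http://127.0.0.1:8765` by default and takes JSON requests such as `addNote`. The deck name and tag below are placeholders I made up, the deck has to exist already, and "Basic" with `Front`/`Back` fields is Anki's stock note type in an English install. The definition is just passed in as a string; where it comes from is left open.

```python
import json
import urllib.request

ANKI_CONNECT_URL = "http://127.0.0.1:8765"  # AnkiConnect's default local address


def build_add_note(lemma, definition, context, deck="French::Reading"):
    """Build an AnkiConnect addNote request (deck name and tag are placeholders)."""
    back = f"{definition}<br><br><i>{context}</i>"
    return {
        "action": "addNote",
        "version": 6,
        "params": {
            "note": {
                "deckName": deck,
                "modelName": "Basic",
                "fields": {"Front": lemma, "Back": back},
                "tags": ["pre-reading"],
            }
        },
    }


def send(request, url=ANKI_CONNECT_URL):
    """POST a request to AnkiConnect and return its result, raising on error."""
    data = json.dumps(request).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    if reply.get("error"):
        raise RuntimeError(reply["error"])
    return reply["result"]


if __name__ == "__main__":
    note = build_add_note("souris", "mouse", "Les chats mangent la souris.")
    print(json.dumps(note, ensure_ascii=False, indent=2))
    # send(note)  # uncomment with Anki open and AnkiConnect installed
```

Building the request separately from sending it means you can dump all the cards to a file and look them over before anything touches your collection.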

But yeah, feels like there's gotta be something ready made for it, wish I knew and could point you in a direction.