this post was submitted on 09 Dec 2025

How on planet Earth can I change this pdf to epub? I tried everything I could think of in Calibre, but the problem is that the pdf has 2 columns of text per page, plus footnotes on each page. When it converts to epub it just prints each line of each text column as a line of text, which makes it totally lose its meaning. Footnotes are also just added as regular text, as part of a supremely incoherent story with aggressive punctuation.

Has anybody been able to solve this before?

[–] Edie@hexbear.net 2 points 2 weeks ago* (last edited 2 weeks ago) (1 children)

Tesseract doesn't support PDF input, so you'll need some other program like ocrmypdf (which I have used; it uses Tesseract), or you can extract each page to its own image (which I have also done, but I forget how right now).
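For the "each page to its own image" route, here is a rough Python sketch of one way it could go. It assumes poppler's `pdftoppm` and the `tesseract` CLI are installed and on the PATH; the function names, the `pages` work directory and `book.pdf` are made up for illustration, and this is not necessarily the method used above.

```python
import subprocess
from pathlib import Path


def rasterize_cmd(pdf: str, prefix: str, dpi: int = 300) -> list[str]:
    """Build the pdftoppm command that renders every page to <prefix>-NN.png."""
    return ["pdftoppm", "-png", "-r", str(dpi), pdf, prefix]


def ocr_cmd(image: str, lang: str = "eng") -> list[str]:
    """Build the tesseract command for one image; tesseract adds .txt itself."""
    out_base = str(Path(image).with_suffix(""))
    return ["tesseract", image, out_base, "-l", lang]


def ocr_pdf_pages(pdf: str, workdir: str = "pages") -> list[Path]:
    """Render each page to a PNG, OCR it, and return the .txt paths in page order."""
    Path(workdir).mkdir(exist_ok=True)
    prefix = str(Path(workdir) / "page")  # hypothetical naming scheme
    subprocess.run(rasterize_cmd(pdf, prefix), check=True)
    # pdftoppm zero-pads page numbers, so a plain sort keeps page order
    images = sorted(Path(workdir).glob("page-*.png"))
    for image in images:
        subprocess.run(ocr_cmd(str(image)), check=True)
    return [image.with_suffix(".txt") for image in images]
```

Calling `ocr_pdf_pages("book.pdf")` would leave one `.txt` file per page in `pages/`. Whether the two columns come out in the right reading order depends on how Tesseract segments each page, so it is worth checking a page or two by hand first.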


This user is suspected of being a cat. Please report any suspicious behavior.

[–] fort_burp@feddit.nl 2 points 2 weeks ago

Thanks again! You're the best :)

This looks like exactly what I need. After getting the formatting right with k2pdfopt, I can use ocrmypdf to get it back to text form, then just Ctrl+A, copy it into Writer and export as epub, since the pdf size is like 15x the epub size.
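A possible shortcut for the copy-paste step, sketched in Python under some assumptions: ocrmypdf has a `--sidecar` option that writes the recognized text to a plain-text file, which could then be opened in Writer directly. The file names and the helper function below are hypothetical, and `--force-ocr` is only needed if the reflowed PDF still carries a text layer.

```python
import subprocess


def ocrmypdf_cmd(src: str, dst: str, sidecar: str,
                 lang: str = "eng", force: bool = False) -> list[str]:
    """Build an ocrmypdf command that also writes the OCR text to `sidecar`."""
    cmd = ["ocrmypdf", "-l", lang, "--sidecar", sidecar]
    if force:
        # Re-OCR pages even if they already contain text
        cmd.append("--force-ocr")
    cmd += [src, dst]
    return cmd


def run_ocr(src: str, dst: str, sidecar: str) -> None:
    """Run ocrmypdf; raises CalledProcessError if the tool reports a failure."""
    subprocess.run(ocrmypdf_cmd(src, dst, sidecar), check=True)
```

For example, `run_ocr("reflowed.pdf", "reflowed_ocr.pdf", "book.txt")` would produce both a searchable PDF and `book.txt`, assuming `reflowed.pdf` is the output of the reflow step.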