HACKER Q&A
📣 behnamoh

PDF format sucks. Why didn't the alternatives catch on?


PDF is full of bugs. Editing PDF files (even a simple task like replacing a font with something else) is painfully difficult, ugly, and hacky.

Why hasn't the industry switched to better solutions yet? Is it because PDF is what you end up with regardless of the typesetting program you used (e.g., Word, LaTeX, Markdown, or HTML via "Save as PDF")? It seems to me that there must be a better way.


  👤 gettalong Accepted Answer ✓
The content of a PDF file is not like the content of, say, an HTML or ODT file. With those formats you write plain text with formatting instructions, and the application has to do all the layout work, like glyph positioning (which is already a hard task), paragraph layout (Where to break the lines? How many lines for widows? ...) and so on.

A PDF file is essentially pre-rendered. The application creating the PDF file has to do all the work mentioned above, and the PDF itself just contains instructions saying which glyph to render at which exact position on the page.

This makes displaying or printing a PDF much easier (though still a hard task). It is also why editing PDFs is hard: the additional information, like what is a paragraph or a heading, is usually not available.
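
To make the "pre-rendered" point concrete, here is a minimal Python sketch, not how any real PDF writer works. The function name layout_to_pdf_ops and all of its parameters are made up for illustration, and line breaking uses a fixed character count instead of real glyph widths. The operators it emits (BT, Tf, Tm, Tj, ET) are the standard PDF text operators.

```python
import textwrap


def layout_to_pdf_ops(text, font="F1", size=12, left=72, top=720,
                      chars_per_line=40, leading=14):
    """Lay out `text` and return PDF-style text operators, one string per entry.

    Hypothetical helper: a real producer measures glyph widths, handles
    kerning, hyphenation, widows, and so on. Here we only wrap by characters.
    """
    ops = ["BT", f"/{font} {size} Tf"]  # begin text object, select font and size
    y = top
    for line in textwrap.wrap(text, width=chars_per_line):
        # Backslashes and parentheses must be escaped inside PDF literal strings.
        escaped = (line.replace("\\", r"\\")
                       .replace("(", r"\(")
                       .replace(")", r"\)"))
        # Tm sets an absolute position; Tj shows the string at that position.
        ops.append(f"1 0 0 1 {left} {y} Tm ({escaped}) Tj")
        y -= leading  # move down one line for the next piece of text
    ops.append("ET")  # end text object
    return ops


if __name__ == "__main__":
    paragraph = "A PDF file is essentially pre-rendered (see above). " * 3
    for op in layout_to_pdf_ops(paragraph):
        print(op)
```

Each output line is an independent "put this string at x, y" instruction. Nothing records that the lines once formed a single paragraph, so changing the font or a word means redoing the layout from information the file no longer contains.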

FYI: Tagged PDF carries all that structural information, and there is ongoing work to allow, for example, reflowing of PDFs on smaller devices.


👤 Finnucane
Better in what way? Sure, editing a PDF sucks, but why are you even doing that? The purpose of a PDF is to preserve formatting precisely. If you don’t need that, you could use something else. You could output everything as an EPUB file, using HTML.

👤 thesuperbigfrog
An impressive feature set, widely available tools, standardization, and marketing:

https://en.wikipedia.org/wiki/History_of_PDF