Markdown to PDF: missing pieces from various approaches, and beyond HTML
Pacharapol Withayasakpunt
Posted on November 4, 2020
Let me say this first: the best way to create a PDF from Markdown is via web technology (Chrome / Puppeteer), because it is the closest to WYSIWYG (What You See Is What You Get). But it is not perfect.
It currently lacks at least one PDF-specific feature (and possibly more): Table of Contents / Bookmarks.
feature request: add option to generate TOC for pdf output #1778
Since now headers and footers with page numbers work, I now desperately miss an option to generate a Table of Contents (TOC) out of the h1- h7 headers when generating a pdf file (i.e like wkhtmltopdf is doing this). The TOC should be at the start of the pdf and it should not only be clickable (jump to the page) but also generate the outline pdf element, so that the TOC is displayed in the contents view in any viewer. Although this may sound complicated if this functionality is implemented at the right place it is not that complicated (take a look on how wkhtmltopdf is implementing this).
Before posting here I tried a couple of workarounds to achieve this. Some dead ends:
- CSS3 target_counter() -> proposed a long time ago and only some specialised tools do it, to be honest I've given up to think that it will be implemented in chrome some day (reference issue is now: https://bugs.chromium.org/p/chromium/issues/detail?id=368053 )
- Find a tool or tools to extract the table of contents from the generated pdf and generate a preface.pdf with the TOC which to merge in-front of the original pdf ... With https://github.com/qpdf/qpdf I was able to generate a readable and searchable "pdf text-file" so that theoretically it was possible to find the header in the text file and via reverse search and the added comments find out on which page it is etc. etc...
Any chance to get this soon? Thanks Ognian
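Until something like this lands in Chrome, one partial workaround is to build a linked table of contents from the Markdown headings before exporting. Below is a minimal TypeScript sketch of that idea. The names `TocEntry`, `slugify`, `extractHeadings` and `renderToc` are hypothetical, and the slug rule is an assumption: the heading ids your Markdown renderer generates may differ. It only produces an in-document list of links, not the PDF outline (bookmarks) requested above, and whether the links stay clickable depends on the renderer.

```typescript
// One entry per Markdown heading found in the source.
interface TocEntry {
  level: number; // 1 for "#", 2 for "##", ...
  title: string;
  slug: string; // hypothetical anchor id; real renderers may differ
}

// Assumption: a simple slug rule (lowercase, drop punctuation, spaces to hyphens).
function slugify(title: string): string {
  return title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9\s-]/g, "")
    .replace(/\s+/g, "-");
}

// Collect ATX headings ("# Title"), skipping anything inside fenced code blocks.
function extractHeadings(markdown: string): TocEntry[] {
  const entries: TocEntry[] = [];
  let inFence = false;
  for (const line of markdown.split("\n")) {
    // A line starting with three backticks or tildes opens or closes a fence.
    if (/^\s*(`{3}|~{3})/.test(line)) {
      inFence = !inFence;
      continue;
    }
    if (inFence) continue;
    const match = /^(#{1,6})\s+(.+?)\s*#*\s*$/.exec(line);
    if (match) {
      const title = match[2];
      entries.push({ level: match[1].length, title, slug: slugify(title) });
    }
  }
  return entries;
}

// Two spaces of indentation per nesting level.
function indent(depth: number): string {
  return new Array(depth + 1).join("  ");
}

// Render the entries as a nested Markdown list of links.
function renderToc(entries: TocEntry[]): string {
  if (entries.length === 0) return "";
  const minLevel = Math.min(...entries.map((e) => e.level));
  return entries
    .map((e) => `${indent(e.level - minLevel)}- [${e.title}](#${e.slug})`)
    .join("\n");
}
```

Calling `renderToc(extractHeadings(source))` on the Markdown source gives a list you can paste at the top of the document before exporting.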
And one of the best tools to create PDF is Visual Studio Code, if you know how to use Markdown Preview Enhanced properly. (I've just noticed that I can use this in Atom as well.)
The trick is, when previewing Markdown, right-click on the preview area to see
- Open in browser, to tweak using Inspect Element
- Chrome (Puppeteer) >> PDF, as a shortcut to export to PDF. (You will also need Puppeteer.)
You can use custom CSS.
Indeed, some CSS rules are specific to printing, and you can customize them for Markdown Preview Enhanced (MPE).
I currently recommend this LESS.
html, body {
  box-sizing: border-box;
  height: 100%;
  width: 100%;
}

.markdown-preview {
  box-sizing: border-box;
  position: relative;

  @media print, screen {
    section {
      display: flex;
      flex-direction: column;

      &[vertical-center] {
        min-height: 100%;
        justify-content: center;
      }

      &[horizontal-center] {
        align-items: center;
        text-align: center;
      }
    }

    section + *, h1 {
      page-break-before: always;
    }

    h1, h2, h3, h4, h5, h6 {
      page-break-after: avoid;
    }

    article {
      page-break-inside: avoid;
    }
  }
}
Also, you can import your own LESS.
Importing other file types is also possible.
Markdown inside HTML, in order to use CSS
This is possible natively with Markdown-it, by leaving a blank line (that is, at least two newlines) after the opening div.
<div class="center">

## Hello World

</div>
Customizing Puppeteer, with YAML frontmatter
https://shd101wyy.github.io/markdown-preview-enhanced/#/puppeteer?id=configure-puppeteer
So, I made it like this.
---
id: print
class: 'title'
puppeteer:
  margin:
    top: 2cm
    bottom: 2cm
    left: 2cm
    right: 2cm
---
@import "/_styles/print.less"
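To sanity-check margins like these, it can help to work out how much room is left for content. Here is a minimal TypeScript sketch, assuming an A4 page (210 mm by 297 mm; your export may use a different page format) and only the `mm`, `cm` and `in` units. The `Margin` type and the `toMillimetres` and `printableArea` helpers are hypothetical names for illustration, not part of Puppeteer or MPE.

```typescript
// Margins as they appear in the front matter, e.g. "2cm".
interface Margin {
  top: string;
  bottom: string;
  left: string;
  right: string;
}

// Millimetres per unit, for the units handled by this sketch.
const MM_PER_UNIT: { [unit: string]: number } = { mm: 1, cm: 10, in: 25.4 };

// Parse a length such as "2cm" into millimetres.
function toMillimetres(length: string): number {
  const match = /^(\d+(?:\.\d+)?)(mm|cm|in)$/.exec(length.trim());
  if (!match) {
    throw new Error(`Unsupported length: ${length}`);
  }
  return parseFloat(match[1]) * MM_PER_UNIT[match[2]];
}

// Width and height left for content after subtracting the margins.
function printableArea(pageWidthMm: number, pageHeightMm: number, margin: Margin) {
  return {
    widthMm: pageWidthMm - toMillimetres(margin.left) - toMillimetres(margin.right),
    heightMm: pageHeightMm - toMillimetres(margin.top) - toMillimetres(margin.bottom),
  };
}

// A4 page with the 2cm margins from the front matter above.
const area = printableArea(210, 297, { top: "2cm", bottom: "2cm", left: "2cm", right: "2cm" });
console.log(area);
```

With 2cm on every side, an A4 page is left with 170 mm by 257 mm for content, which is worth keeping in mind when sizing wide tables or images.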
Going beyond HTML
Actually, I have already figured out ways to go beyond HTML, including
- Extending Markdown with template engines. EJS has nice syntax-highlighting inside Markdown in VSCode
- Non-Markdown/HTML - LaTeX or ConTeXt - via pandoc, or natively
- PDF manipulation libraries. Some of my recommendations are