Converting website to ebook format


The Calibre ebook creation/management package supports converting from HTML to various ebook formats, including .mobi (Kindle) and .epub (most other e-readers). Calibre can be used either interactively or via the command line. I prefer the command line:

ebook-convert.exe index.htm ebook.mobi --authors "Frank Revelo" --max-toc-links "1000" --insert-blank-line --insert-blank-line-size "1" --chapter "//*[name()='h2']" --chapter-mark "pagebreak" --breadth-first

ebook-convert.exe index.htm ebook.epub --authors "Frank Revelo" --max-toc-links "1000" --insert-blank-line --insert-blank-line-size "1" --chapter "//*[name()='h2']" --chapter-mark "pagebreak" --breadth-first
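Since the two commands differ only in the output file name, they can be driven from a small script. The following is a minimal Python sketch, not part of Calibre: the names COMMON_OPTIONS, build_command and convert_all are made up for illustration, and it assumes "ebook-convert.exe" is reachable on the system path.

```python
import subprocess

# Options shared by both conversions, copied from the commands above.
COMMON_OPTIONS = [
    "--authors", "Frank Revelo",
    "--max-toc-links", "1000",
    "--insert-blank-line",
    "--insert-blank-line-size", "1",
    "--chapter", "//*[name()='h2']",
    "--chapter-mark", "pagebreak",
    "--breadth-first",
]


def build_command(source, target, executable="ebook-convert.exe"):
    """Return the argument list for one ebook-convert run (hypothetical helper)."""
    return [executable, source, target] + COMMON_OPTIONS


def convert_all(source="index.htm", targets=("ebook.mobi", "ebook.epub")):
    """Run ebook-convert once per target; raises CalledProcessError on failure."""
    for target in targets:
        subprocess.run(build_command(source, target), check=True)
```

Calling convert_all() from the directory containing index.htm should produce both output files. Passing the arguments as a list avoids shell quoting problems with the XPath expression.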

Comments:

Ebooks do not handle tables well, because tables cannot be reflowed easily. Wherever possible, replace tables with nested lists.
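One possible way to turn table data into a nested list is sketched below in Python. This is only an illustration: the function name table_to_nested_list is made up, and it assumes the first column of each row works as a label for that row, which will not be true of every table.

```python
from html import escape


def table_to_nested_list(headers, rows):
    """Render table data as a nested HTML list.

    The first cell of each row becomes a top-level item; the remaining
    cells become "header: value" sub-items beneath it.
    """
    lines = ["<ul>"]
    for row in rows:
        lines.append(f"  <li>{escape(row[0])}")
        lines.append("    <ul>")
        # Pair each remaining cell with its column header.
        for header, cell in zip(headers[1:], row[1:]):
            lines.append(f"      <li>{escape(header)}: {escape(cell)}</li>")
        lines.append("    </ul>")
        lines.append("  </li>")
    lines.append("</ul>")
    return "\n".join(lines)
```

For example, table_to_nested_list(["Format", "Extension"], [["Kindle", ".mobi"]]) gives a top-level "Kindle" item with an "Extension: .mobi" sub-item, which reflows on a narrow screen like any other text.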

To use Calibre from the command line, the directory containing "ebook-convert.exe" must be in the system path. If Calibre does not add this directory to the system path during installation, then add it manually. In Windows, use Control Panel > System > Advanced > Environment Variables to show all system variables, then select the "Path" variable and modify it to include the Calibre directory.
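A quick way to check whether the path is set up correctly is to ask Python to look the program up. This is a minimal sketch using the standard library's shutil.which; the helper name find_converter is made up for illustration.

```python
import shutil


def find_converter(name="ebook-convert"):
    """Return the full path of the converter if it is on the system path, else None."""
    # On Windows, shutil.which also tries the extensions listed in PATHEXT,
    # so "ebook-convert" matches "ebook-convert.exe".
    return shutil.which(name)


path = find_converter()
if path is None:
    print("ebook-convert not found; add the Calibre directory to the system path")
else:
    print("ebook-convert found at", path)
```

After changing the system path, open a new command prompt before rerunning the check, since already open windows keep the old value.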

I found it useful to merge all HTML input files into a single file for debugging, using this Python program. The merged HTML file can also be used to create a PDF, as another ebook format. Simply bring up the merged HTML file in a browser, then print to PDF. It is also possible to use Calibre's "ebook-convert.exe" program to output directly to PDF format, but I found the results inferior to the merged HTML file method.
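For readers who want to try the merge step, here is a minimal Python sketch of one way to do it. It is not the program referred to above: the names extract_body, merge_documents and merge_files are made up, it assumes the input files are UTF-8, and it does not rewrite links that point from one input file to another.

```python
import re
from pathlib import Path

# Matches everything between the opening and closing body tags.
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def extract_body(html):
    """Return the content inside <body>...</body>, or the whole text if there is no body tag."""
    match = BODY_RE.search(html)
    return match.group(1) if match else html


def merge_documents(documents, title="Merged"):
    """Concatenate the bodies of several HTML documents into one document, in the order given."""
    bodies = "\n".join(extract_body(doc) for doc in documents)
    return (
        "<html><head><title>" + title + "</title></head>\n"
        "<body>\n" + bodies + "\n</body></html>\n"
    )


def merge_files(paths, output_path, title="Merged"):
    """Read each input file and write the merged document to output_path."""
    documents = [Path(p).read_text(encoding="utf-8") for p in paths]
    Path(output_path).write_text(merge_documents(documents, title), encoding="utf-8")
```

A call such as merge_files(["index.htm", "chapter1.htm"], "merged.htm") (file names hypothetical) writes a single file that can be opened in a browser and printed to PDF.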