How to compile and install wkhtmltopdf on Debian

(And, consequently, wkhtmltoimage.)


I found this procedure in a comment at the end of the wkhtmltopdf install guide (the guide itself didn’t work for me on Debian). Although I usually hate creating redundancy, I decided to copy it and write it up in human-readable form, so it is here for me whenever I need it and, obviously, to share it with you people.

Details
Worked with:
wkhtmltopdf-0.11.0_rc1.tar.bz2 source code package,
git://github.com/antialize/wkhtmltopdf.git for qt source,
on Debian 7.

Remove libqt4-dev and qt4-dev-tools, if present (they will be there if you have already tried the aptitude packages):

apt-get remove libqt4-dev qt4-dev-tools
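
If you are not sure whether those two packages are installed at all, you can ask dpkg first. This is only a sketch: the helper name filter_installed is made up for illustration, and it assumes dpkg-query reports installed packages with its usual "install ok installed" status.

```shell
#!/bin/sh
# Keep only packages whose dpkg status is "install ok installed".
# Input lines look like: "<package> <want> <error> <status>".
filter_installed() {
    awk '$2 == "install" && $3 == "ok" && $4 == "installed" { print $1 }'
}

# Only query dpkg where it exists (i.e. on Debian-like systems).
if command -v dpkg-query >/dev/null 2>&1; then
    dpkg-query -W -f='${Package} ${Status}\n' libqt4-dev qt4-dev-tools 2>/dev/null |
        filter_installed
fi
```

Any package name it prints is still installed; no output means there is nothing to remove.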

Then

git clone git://gitorious.org/~antialize/qt/antializes-qt.git wkhtmltopdf-qt
cd wkhtmltopdf-qt
git checkout 4.8.4
QTDIR=. ./bin/syncqt
rm -R ../wkqt #(you can use this if you have failed at least once, otherwise the directory is not even there)
./configure -nomake tools,examples,demos,docs,translations -opensource -prefix "../wkqt"
make -j3 #(2 hours for this one in my case, so be patient)
make install
cd ..
# download wkhtmltopdf-0.11.0_rc1.tar.bz2 into this directory
tar -xjf wkhtmltopdf-0.11.0_rc1.tar.bz2
mv wkhtmltopdf-0.11.0_rc1 wkhtmltopdf/
cd wkhtmltopdf/
../wkqt/bin/qmake #(this, surprisingly, had no output, anyway...)
make clean #(if necessary)
make
make install
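
One thing to watch in the steps above: the Qt prefix is the relative path ../wkqt, and a commenter further down found that the finished binaries still needed that wkqt directory to be around. A possible way to make the location explicit is to spell the prefix out as an absolute path before running configure. The sketch below only prints the command, it does not run it; BUILD_ROOT and configure_cmd are hypothetical names, and I have not rebuilt Qt this way myself.

```shell
#!/bin/sh
# BUILD_ROOT is a hypothetical variable: the directory that contains
# wkhtmltopdf-qt and will contain wkqt. Defaults to the current directory.
BUILD_ROOT="${BUILD_ROOT:-$(pwd)}"
QT_PREFIX="$BUILD_ROOT/wkqt"

# Print the configure line from the steps above with the given prefix.
configure_cmd() {
    printf './configure -nomake tools,examples,demos,docs,translations -opensource -prefix "%s"\n' "$1"
}

# Show the command; run it by hand from inside wkhtmltopdf-qt.
configure_cmd "$QT_PREFIX"
```

Run it from the directory that holds wkhtmltopdf-qt, then paste the printed line in place of the ./configure step.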

That should be all. Now just try it; for example, type:

wkhtmltopdf http://www.nytimes.com nytimes.pdf
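
To check that what came out really is a PDF, you can look at the first bytes of the file, which for a PDF are %PDF. A small sketch: is_pdf is a hypothetical helper name, and it assumes a head that supports -c, as the GNU one on Debian does.

```shell
#!/bin/sh
# Succeeds when the file starts with the PDF magic bytes "%PDF".
is_pdf() {
    [ "$(head -c 4 "$1" 2>/dev/null)" = "%PDF" ]
}

# Check a file given on the command line, e.g.: sh check-pdf.sh output.pdf
if [ -n "${1:-}" ]; then
    if is_pdf "$1"; then
        echo "$1 looks like a PDF"
    else
        echo "$1 is not a PDF"
    fi
fi
```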

If you want to know more about it, just ask for the help:

wkhtmltopdf -h
Name:
  wkhtmltopdf 0.10.0 rc2

Synopsis:
  wkhtmltopdf [GLOBAL OPTION]... [OBJECT]... 

Document objects:
  wkhtmltopdf is able to put several objects into the output file, an object is
  either a single webpage, a cover webpage or a table of content.  The objects
  are put into the output document in the order they are specified on the
  command line, options can be specified on a per object basis or in the global
  options area. Options from the Global Options section can only be placed in
  the global options area

  A page objects puts the content of a singe webpage into the output document.

  (page)? <input url/file name> [PAGE OPTION]...
  Options for the page object can be placed in the global options and the page
  options areas. The applicable options can be found in the Page Options and 
  Headers And Footer Options sections.

  A cover objects puts the content of a singe webpage into the output document,
  the page does not appear in the table of content, and does not have headers
  and footers.

  cover <input url/file name> [PAGE OPTION]...
  All options that can be specified for a page object can also be specified for
  a cover.

  A table of content object inserts a table of content into the output document.

  toc [TOC OPTION]...
  All options that can be specified for a page object can also be specified for
  a toc, further more the options from the TOC Options section can also be
  applied. The table of content is generated via XSLT which means that it can be
  styled to look however you want it to look. To get an aide of how to do this
  you can dump the default xslt document by supplying the
  --dump-default-toc-xsl, and the outline it works on by supplying
  --dump-outline, see the Outline Options section.

Description:
  Converts one or more HTML pages into a PDF document, using wkhtmltopdf patched
  qt.

Global Options:
      --collate                       Collate when printing multiple copies
                                      (default)
      --no-collate                    Do not collate when printing multiple
                                      copies
      --copies <number>               Number of copies to print into the pdf
                                      file (default 1)
  -H, --extended-help                 Display more extensive help, detailing
                                      less common command switches
  -g, --grayscale                     PDF will be generated in grayscale
  -h, --help                          Display help
  -l, --lowquality                    Generates lower quality pdf/ps. Useful to
                                      shrink the result document space
  -O, --orientation <orientation>     Set orientation to Landscape or Portrait
                                      (default Portrait)
  -s, --page-size <Size>              Set paper size to: A4, Letter, etc.
                                      (default A4)
  -q, --quiet                         Be less verbose
      --read-args-from-stdin          Read command line arguments from stdin
      --title <text>                  The title of the generated pdf file (The
                                      title of the first document is used if not
                                      specified)
  -V, --version                       Output version information an exit

Contact:
  If you experience bugs or want to request new features please visit 
  <http://code.google.com/p/wkhtmltopdf/issues/list>, if you have any problems
  or comments please feel free to contact me: see 
  <http://www.madalgo.au.dk/~jakobt/#about>
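
As a quick illustration of the global options listed above, here is how a few of them might be combined. It is only a sketch: render_landscape and the DRY_RUN switch are names made up for this example, the URL is a placeholder, and by default the function just prints the command instead of running it.

```shell
#!/bin/sh
# Sketch: combine --orientation, --page-size and --grayscale from the help.
# With DRY_RUN=1 (the default here) the command is printed, not executed.
render_landscape() {
    url=$1
    out=$2
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "wkhtmltopdf --orientation Landscape --page-size Letter --grayscale $url $out"
    else
        wkhtmltopdf --orientation Landscape --page-size Letter --grayscale "$url" "$out"
    fi
}

render_landscape http://example.com/ example.pdf
```

Set DRY_RUN=0 to actually produce the file once wkhtmltopdf is installed.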

This is the extended help:

Name:
  wkhtmltopdf 0.10.0 rc2

Synopsis:
  wkhtmltopdf [GLOBAL OPTION]... [OBJECT]... 

Document objects:
  wkhtmltopdf is able to put several objects into the output file, an object is
  either a single webpage, a cover webpage or a table of content.  The objects
  are put into the output document in the order they are specified on the
  command line, options can be specified on a per object basis or in the global
  options area. Options from the Global Options section can only be placed in
  the global options area

  A page objects puts the content of a singe webpage into the output document.

  (page)? <input url/file name> [PAGE OPTION]...
  Options for the page object can be placed in the global options and the page
  options areas. The applicable options can be found in the Page Options and 
  Headers And Footer Options sections.

  A cover objects puts the content of a singe webpage into the output document,
  the page does not appear in the table of content, and does not have headers
  and footers.

  cover <input url/file name> [PAGE OPTION]...
  All options that can be specified for a page object can also be specified for
  a cover.

  A table of content object inserts a table of content into the output document.

  toc [TOC OPTION]...
  All options that can be specified for a page object can also be specified for
  a toc, further more the options from the TOC Options section can also be
  applied. The table of content is generated via XSLT which means that it can be
  styled to look however you want it to look. To get an aide of how to do this
  you can dump the default xslt document by supplying the
  --dump-default-toc-xsl, and the outline it works on by supplying
  --dump-outline, see the Outline Options section.

Description:
  Converts one or more HTML pages into a PDF document, using wkhtmltopdf patched
  qt.

Global Options:
      --collate                       Collate when printing multiple copies
                                      (default)
      --no-collate                    Do not collate when printing multiple
                                      copies
      --cookie-jar <path>             Read and write cookies from and to the
                                      supplied cookie jar file
      --copies <number>               Number of copies to print into the pdf
                                      file (default 1)
  -d, --dpi <dpi>                     Change the dpi explicitly (this has no
                                      effect on X11 based systems)
  -H, --extended-help                 Display more extensive help, detailing
                                      less common command switches
  -g, --grayscale                     PDF will be generated in grayscale
  -h, --help                          Display help
      --htmldoc                       Output program html help
      --image-dpi <integer>           When embedding images scale them down to
                                      this dpi (default 600)
      --image-quality <integer>       When jpeg compressing images use this
                                      quality (default 94)
  -l, --lowquality                    Generates lower quality pdf/ps. Useful to
                                      shrink the result document space
      --manpage                       Output program man page
  -B, --margin-bottom <unitreal>      Set the page bottom margin (default 10mm)
  -L, --margin-left <unitreal>        Set the page left margin (default 10mm)
  -R, --margin-right <unitreal>       Set the page right margin (default 10mm)
  -T, --margin-top <unitreal>         Set the page top margin (default 10mm)
  -O, --orientation <orientation>     Set orientation to Landscape or Portrait
                                      (default Portrait)
      --output-format <format>        Specify an output format to use pdf or ps,
                                      instead of looking at the extention of the
                                      output filename
      --page-height <unitreal>        Page height
  -s, --page-size <Size>              Set paper size to: A4, Letter, etc.
                                      (default A4)
      --page-width <unitreal>         Page width
      --no-pdf-compression            Do not use lossless compression on pdf
                                      objects
  -q, --quiet                         Be less verbose
      --read-args-from-stdin          Read command line arguments from stdin
      --readme                        Output program readme
      --title <text>                  The title of the generated pdf file (The
                                      title of the first document is used if not
                                      specified)
      --use-xserver                   Use the X server (some plugins and other
                                      stuff might not work without X11)
  -V, --version                       Output version information an exit

Outline Options:
      --dump-default-toc-xsl          Dump the default TOC xsl style sheet to
                                      stdout
      --dump-outline <file>           Dump the outline to a file
      --outline                       Put an outline into the pdf (default)
      --no-outline                    Do not put an outline into the pdf
      --outline-depth <level>         Set the depth of the outline (default 4)

Page Options:
      --allow <path>                  Allow the file or files from the specified
                                      folder to be loaded (repeatable)
      --background                    Do print background (default)
      --no-background                 Do not print background
      --checkbox-checked-svg <path>   Use this SVG file when rendering checked
                                      checkboxes
      --checkbox-svg <path>           Use this SVG file when rendering unchecked
                                      checkboxes
      --cookie <name> <value>         Set an additional cookie (repeatable)
      --custom-header <name> <value>  Set an additional HTTP header (repeatable)
      --custom-header-propagation     Add HTTP headers specified by
                                      --custom-header for each resource request.
      --no-custom-header-propagation  Do not add HTTP headers specified by
                                      --custom-header for each resource request.
      --debug-javascript              Show javascript debugging output
      --no-debug-javascript           Do not show javascript debugging output
                                      (default)
      --default-header                Add a default header, with the name of the
                                      page to the left, and the page number to
                                      the right, this is short for:
                                      --header-left='[webpage]'
                                      --header-right='[page]/[toPage]' --top 2cm
                                      --header-line
      --encoding <encoding>           Set the default text encoding, for input
      --disable-external-links        Do not make links to remote web pages
      --enable-external-links         Make links to remote web pages (default)
      --disable-forms                 Do not turn HTML form fields into pdf form
                                      fields (default)
      --enable-forms                  Turn HTML form fields into pdf form fields
      --images                        Do load or print images (default)
      --no-images                     Do not load or print images
      --disable-internal-links        Do not make local links
      --enable-internal-links         Make local links (default)
  -n, --disable-javascript            Do not allow web pages to run javascript
      --enable-javascript             Do allow web pages to run javascript
                                      (default)
      --javascript-delay <msec>       Wait some milliseconds for javascript
                                      finish (default 200)
      --load-error-handling <handler> Specify how to handle pages that fail to
                                      load: abort, ignore or skip (default
                                      abort)
      --disable-local-file-access     Do not allowed conversion of a local file
                                      to read in other local files, unless
                                      explecitily allowed with --allow
      --enable-local-file-access      Allowed conversion of a local file to read
                                      in other local files. (default)
      --minimum-font-size <int>       Minimum font size
      --exclude-from-outline          Do not include the page in the table of
                                      contents and outlines
      --include-in-outline            Include the page in the table of contents
                                      and outlines (default)
      --page-offset <offset>          Set the starting page number (default 0)
      --password <password>           HTTP Authentication password
      --disable-plugins               Disable installed plugins (default)
      --enable-plugins                Enable installed plugins (plugins will
                                      likely not work)
      --post <name> <value>           Add an additional post field (repeatable)
      --post-file <name> <path>       Post an additional file (repeatable)
      --print-media-type              Use print media-type instead of screen
      --no-print-media-type           Do not use print media-type instead of
                                      screen (default)
  -p, --proxy <proxy>                 Use a proxy
      --radiobutton-checked-svg <path> Use this SVG file when rendering checked
                                      radiobuttons
      --radiobutton-svg <path>        Use this SVG file when rendering unchecked
                                      radiobuttons
      --run-script <js>               Run this additional javascript after the
                                      page is done loading (repeatable)
      --disable-smart-shrinking       Disable the intelligent shrinking strategy
                                      used by WebKit that makes the pixel/dpi
                                      ratio none constant
      --enable-smart-shrinking        Enable the intelligent shrinking strategy
                                      used by WebKit that makes the pixel/dpi
                                      ratio none constant (default)
      --stop-slow-scripts             Stop slow running javascripts (default)
      --no-stop-slow-scripts          Do not Stop slow running javascripts
      --disable-toc-back-links        Do not link from section header to toc
                                      (default)
      --enable-toc-back-links         Link from section header to toc
      --user-style-sheet <url>        Specify a user style sheet, to load with
                                      every page
      --username <username>           HTTP Authentication username
      --window-status <windowStatus>  Wait until window.status is equal to this
                                      string before rendering page
      --zoom <float>                  Use this zoom factor (default 1)

Headers And Footer Options:
      --footer-center <text>          Centered footer text
      --footer-font-name <name>       Set footer font name (default Arial)
      --footer-font-size <size>       Set footer font size (default 12)
      --footer-html <url>             Adds a html footer
      --footer-left <text>            Left aligned footer text
      --footer-line                   Display line above the footer
      --no-footer-line                Do not display line above the footer
                                      (default)
      --footer-right <text>           Right aligned footer text
      --footer-spacing <real>         Spacing between footer and content in mm
                                      (default 0)
      --header-center <text>          Centered header text
      --header-font-name <name>       Set header font name (default Arial)
      --header-font-size <size>       Set header font size (default 12)
      --header-html <url>             Adds a html header
      --header-left <text>            Left aligned header text
      --header-line                   Display line below the header
      --no-header-line                Do not display line below the header
                                      (default)
      --header-right <text>           Right aligned header text
      --header-spacing <real>         Spacing between header and content in mm
                                      (default 0)
      --replace <name> <value>        Replace [name] with value in header and
                                      footer (repeatable)

TOC Options:
      --disable-dotted-lines          Do not use dottet lines in the toc
      --toc-header-text <text>        The header text of the toc (default Table
                                      of Content)
      --toc-level-indentation <width> For each level of headings in the toc
                                      indent by this length (default 1em)
      --disable-toc-links             Do not link from toc to sections
      --toc-text-size-shrink <real>   For each level of headings in the toc the
                                      font is scaled by this facter (default
                                      0.8)
      --xsl-style-sheet <file>        Use the supplied xsl style sheet for
                                      printing the table of content

Page sizes:
  The default page size of the rendered document is A4, but using this
  --page-size optionthis can be changed to almost anything else, such as: A3,
  Letter and Legal.  For a full list of supported pages sizes please see 
  <http://doc.trolltech.com/4.6/qprinter.html#PageSize-enum>.

  For a more fine grained control over the page size the --page-height and
  --page-width options may be used

Reading arguments from stdin:
  If you need to convert a lot of pages in a batch, and you feel that
  wkhtmltopdf is a bit to slow to start up, then you should try
  --read-args-from-stdin,

  When --read-args-from-stdin each line of input sent to wkhtmltopdf on stdin
  will act as a separate invocation of wkhtmltopdf, with the arguments specified
  on the given line combined with the arguments given to wkhtmltopdf

  For example one could do the following:

  echo "http://doc.trolltech.com/4.5/qapplication.html qapplication.pdf" >> cmds
  echo "cover google.com http://en.wikipedia.org/wiki/Qt_(toolkit) qt.pdf" >> cmds
  wkhtmltopdf --read-args-from-stdin --book < cmds

Specifying A Proxy:
  By default proxy information will be read from the environment variables:
  proxy, all_proxy and http_proxy, proxy options can also by specified with the
  -p switch

  <type> := "http://" | "socks5://"
  <serif> := <username> (":" <password>)? "@"
  <proxy> := "None" | <type>? <serif>? <host> (":" <port>)?

  Here are some examples (In case you are unfamiliar with the BNF):

  http://user:password@myproxyserver:8080
  socks5://myproxyserver
  None

Footers And Headers:
  Headers and footers can be added to the document by the --header-* and
  --footer* arguments respectfully.  In header and footer text string supplied
  to e.g. --header-left, the following variables will be substituted.

   * [page]       Replaced by the number of the pages currently being printed
   * [frompage]   Replaced by the number of the first page to be printed
   * [topage]     Replaced by the number of the last page to be printed
   * [webpage]    Replaced by the URL of the page being printed
   * [section]    Replaced by the name of the current section
   * [subsection] Replaced by the name of the current subsection
   * [date]       Replaced by the current date in system local format
   * [time]       Replaced by the current time in system local format
   * [title]      Replaced by the title of the of the current page object
   * [doctitle]   Replaced by the title of the output document

  As an example specifying --header-right "Page [page] of [toPage]", will result
  in the text "Page x of y" where x is the number of the current page and y is
  the number of the last page, to appear in the upper left corner in the
  document.

  Headers and footers can also be supplied with HTML documents. As an example
  one could specify --header-html header.html, and use the following content in
  header.html:

  <html><head><script>
  function subst() {
    var vars={};
    var x=document.location.search.substring(1).split('&');
    for (var i in x) {var z=x[i].split('=',2);vars[z[0]] = unescape(z[1]);}
    var x=['frompage','topage','page','webpage','section','subsection','subsubsection'];
    for (var i in x) {
      var y = document.getElementsByClassName(x[i]);
      for (var j=0; j<y.length; ++j) y[j].textContent = vars[x[i]];
    }
  }
  </script></head><body style="border:0; margin: 0;" onload="subst()">
  Page <span class="page"></span> of <span class="topage"></span>
  </body></html>

  As can be seen from the example, the arguments are sent to the header/footer
  html documents in get fashion.

Outlines:
  Wkhtmltopdf with patched qt has support for PDF outlines also known as book
  marks, this can be enabled by specifying the --outline switch. The outlines
  are generated based on the <h?> tags, for a in-depth description of how this
  is done see the Table Of Contest section.

  The outline tree can sometimes be very deep, if the <h?> tags where spread to
  generous in the HTML document.  The --outline-depth switch can be used to
  bound this.

Table Of Content:
  A table of content can be added to the document by adding a toc object to the
  command line. For example:

  wkhtmltopdf toc http://doc.trolltech.com/4.6/qstring.html qstring.pdf

  The table of content is generated based on the H tags in the input
  documents. First a XML document is generated, then it is converted to HTML
  using XSLT.

  The generated XML document can be viewed by dumping it to a file using the
  --dump-outline switch. For example:

  wkhtmltopdf --dump-outline toc.xml http://doc.trolltech.com/4.6/qstring.html qstring.pdf

  The XSLT document can be specified using the --xsl-style-sheet switch. For
  example:

  wkhtmltopdf toc --xsl-style-sheet my.xsl http://doc.trolltech.com/4.6/qstring.html qstring.pdf

  The --dump-default-toc-xsl switch can be used to dump the default XSLT style
  sheet to stdout. This is a good start for writing your own style sheet

  wkhtmltopdf --dump-default-toc-xsl

  The XML document is in the namespace
  "http://code.google.com/p/wkhtmltopdf/outline" it has a root node called
  "outline" which contains a number of "item" nodes. An item can contain any
  number of item. These are the outline subsections to the section the item
  represents. A item node has the following attributes:

   * "title" the name of the section.
   * "page" the page number the section occurs on.
   * "link" a URL that links to the section.
   * "backLink" the name of the anchor the the section will link back to.

  The remaining TOC options only affect the default style sheet so they will not
  work when specifying a custom style sheet.

Contact:
  If you experience bugs or want to request new features please visit
  <http://code.google.com/p/wkhtmltopdf/issues/list>, if you have any problems
  or comments please feel free to contact me: see
  <http://www.madalgo.au.dk/~jakobt/#about>
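
The --read-args-from-stdin section of the help is worth a small example of its own. The sketch below turns a list of URLs into the one-invocation-per-line format the help describes; the function name urls_to_cmds, the page-N.pdf naming scheme and the example URLs are all assumptions of mine, not something wkhtmltopdf requires.

```shell
#!/bin/sh
# Turn a list of URLs (one per line on stdin) into argument lines of the
# form "URL page-N.pdf", one line per wkhtmltopdf invocation.
urls_to_cmds() {
    n=0
    while IFS= read -r url; do
        [ -n "$url" ] || continue   # skip blank lines
        n=$((n + 1))
        printf '%s page-%d.pdf\n' "$url" "$n"
    done
}

# Show the generated argument lines for two placeholder URLs.
printf '%s\n' 'http://example.com/a' 'http://example.com/b' | urls_to_cmds

# To convert for real, feed the lines to a single wkhtmltopdf process:
#   urls_to_cmds < urls.txt | wkhtmltopdf --read-args-from-stdin
```

This way wkhtmltopdf starts only once for the whole batch, which is the point of the option.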


Author: Giuseppe Urso

Giuseppe now lives in Haarlem with his shiny dog, Filippa. In 1982 he received his first home computer, a Commodore 64, followed by a Datasette and a 1541 Floppy Disk Drive. In 1999 he installed his first Linux distro (LRH6). In 2006 he switched to Debian as his favourite OS. Giuseppe Urso actively supports the Free Software Foundation and its founder, Richard Matthew Stallman; he talks to people to convince them to join the fight now, and about how important it is to use only Free Software. He works as an Infra Specialist at Hippo Enterprise Java Cms, an Open Source enterprise-class Content Management System and one of the coolest companies ever, in Amsterdam. He's always ready to install Debian on other people's computers for free.

11 thoughts on “How to compile and install wkhtmltopdf on Debian”

  1. Hi there to every body, it’s my first visit of this blog; this blog
    contains awesome and in fact fine information designed for readers.

      1. Hi, although I wrote it for debian 7 it shouldn’t be different for squeeze, can you try it first on a virtual machine? just in case…

        Let me know,
        Giuseppe

        1. Building on Debian 6:

          I didn’t remove libqt4-dev qt4-dev-tools – just being cautious, but I couldn’t work out why, and if that would break anything else… @Giuseppe – what is that step for?

          Then hit ‘Basic XLib functionality test failed!’ during configure. Fix was:
          $ sudo apt-get build-dep qt4-qmake

          Then all else worked fine. I used it with 0.12.1 – this was the download step:

          $ wget http://downloads.sourceforge.net/project/wkhtmltopdf/0.12.1/wkhtmltox-0.12.1.tar.bz2

          and obviously modify the references to wkhtmltopdf-0.11.0_rc1.tar.bz2 in subsequent steps.

          Many thanks Giuseppe for putting this together!

          Cheers,

          Dave

  2. Hey Giuseppe – wonder if you can help with some post-install issues…

    I used a temporary directory to download and run through your script, but now it seems that the executable /bin/wkhtmltoimage is linked directly to some libraries in that temporary directory. In the directory, I have these subdirectories:

    233M wkhtmltopdf
    1.9G wkhtmltopdf-qt
    94M wkqt

    and if I remove the temporary directory, I get this:

    $ wkhtmltoimage http://www.google.com google.png
    wkhtmltoimage: symbol lookup error: wkhtmltoimage: undefined symbol: _ZN9QListData11detach_growEPii

    I’m not even sure what it’s looking for. Do I need to keep all the 2+GB hanging around? Is there a final step or something I missed to put the libraries where they should be? Could this be the result of me not removing those pre-installed qt packages at step 1? Seems unlikely, but you never know…

  3. OK, by trial and error, it seems only the small wkqt directory (93MB) is required, and I suspect it’s only the lib subdirectory in there too.

    So is there an easy way to relocate that, or do I need to start again with the final location of wkqt/lib in mind? Maybe there’s an option in the configure / make chain that would allow a lib location to be specified / searched? I’m really not familiar at all with the build tools to know I’m afraid…

  4. make -j3 #(2 hours for this one in my case, so be patient) <-
    it took 4 days on my raspberry pi
