API
API stability
Everything described here is considered “public”: this is what you can rely on. We will try to maintain backward compatibility, although there is no hard promise until version 1.0.
Anything else should not be used outside of WeasyPrint itself: we reserve the right to change or remove it at any point. Use it at your own risk, or pin a specific WeasyPrint version in your setup.py or requirements.txt file.
Command-line API
-
weasyprint.__main__.main(argv=sys.argv)
The weasyprint program takes at least two arguments:
    weasyprint [options] <input> <output>
The input is a filename or URL to an HTML document, or - to read HTML from stdin. The output is a filename, or - to write to stdout.
Options can be mixed anywhere before, between, or after the input and output:
-
-e <input_encoding>, --encoding <input_encoding>
Force the input character encoding (e.g. -e utf8).
-
-f <output_format>, --format <output_format>
Choose the output file format among PDF and PNG (e.g. -f png). Required if the output is not a .pdf or .png filename.
-
-s <filename_or_URL>, --stylesheet <filename_or_URL>
Filename or URL of a user CSS stylesheet (see Stylesheet origins) to add to the document (e.g. -s print.css). Multiple stylesheets are allowed.
-
-r <dpi>, --resolution <dpi>
For PNG output only. Set the resolution in PNG pixels per CSS inch. Defaults to 96, which means that PNG pixels match CSS pixels.
-
--base-url <URL>
Set the base for relative URLs in the HTML input. Defaults to the input’s own URL, or the current directory for stdin.
-
-m <type>, --media-type <type>
Set the media type to use for @media. Defaults to print.
-
-a <file>, --attachment <file>
Add an attachment to the document. The attachment is included in the PDF output. This option can be used multiple times.
-
-p, --presentational-hints
Follow HTML presentational hints.
-
--version
Show the version number. Other options and arguments are ignored.
-
-h, --help
Show the command-line usage. Other options and arguments are ignored.
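As a rough sketch of how these options combine, the following Python snippet assembles a weasyprint command line that could be handed to subprocess. The helper names build_weasyprint_command and run_weasyprint are hypothetical and not part of WeasyPrint; only the flags documented above are used.

```python
import subprocess


def build_weasyprint_command(source, output, stylesheets=(),
                             output_format=None, base_url=None,
                             media_type=None):
    """Assemble an argv list for the weasyprint program (hypothetical helper)."""
    command = ["weasyprint"]
    for stylesheet in stylesheets:
        # -s may be repeated: one flag per user stylesheet.
        command += ["-s", stylesheet]
    if output_format is not None:
        # Needed when the output name does not end in .pdf or .png (e.g. "-").
        command += ["-f", output_format]
    if base_url is not None:
        command += ["--base-url", base_url]
    if media_type is not None:
        command += ["-m", media_type]
    # Options may appear anywhere, so input and output can simply come last.
    return command + [source, output]


def run_weasyprint(*args, **kwargs):
    """Run the command; assumes the weasyprint program is on PATH."""
    return subprocess.run(build_weasyprint_command(*args, **kwargs), check=True)
```

For instance, build_weasyprint_command("-", "-", output_format="pdf") describes reading HTML from stdin and writing a PDF to stdout, a case where -f is required because the output name carries no extension.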
Python API
-
class weasyprint.HTML(input, **kwargs)
Represents an HTML document parsed by html5lib.
You can just create an instance with a positional argument:
    doc = HTML(something)
The class will try to guess if the input is a filename, an absolute URL, or a file-like object.
Alternatively, use one named argument so that no guessing is involved:
- Parameters
filename – A filename, relative to the current directory, or absolute.
url – An absolute, fully qualified URL.
file_obj – A file-like object: any object with a read() method.
string – A string of HTML source. (This argument must be named.)
Specifying multiple inputs is an error: HTML(filename="foo.html", url="localhost://bar.html") will raise a TypeError.
You can also pass optional named arguments:
- Parameters
encoding – Force the source character encoding.
base_url – The base used to resolve relative URLs (e.g. in <img src="../foo.png">). If not provided, try to use the input filename, URL, or name attribute of file-like objects.
url_fetcher – A function or other callable with the same signature as default_url_fetcher(), called to fetch external resources such as stylesheets and images. (See URL fetchers.)
media_type – The media type to use for @media. Defaults to 'print'.
Note: In some cases, like HTML(string=foo), relative URLs will be invalid if base_url is not provided.
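The note about string input can be illustrated with a minimal sketch. It builds the named arguments for HTML() from a string of markup and a directory, so that relative URLs have something to resolve against. The helper names string_input_kwargs and parse_html_string are hypothetical; only the string and base_url arguments come from the description above.

```python
from pathlib import Path


def string_input_kwargs(html_source, base_dir):
    """Build named arguments for HTML() from a string and a directory."""
    # A directory URL needs a trailing slash so that relative URLs are
    # resolved inside the directory rather than next to it.
    base_url = Path(base_dir).resolve().as_uri() + "/"
    return {"string": html_source, "base_url": base_url}


def parse_html_string(html_source, base_dir):
    """Create an HTML object from a string, with an explicit base URL."""
    # Imported here so that the helper above can be used without WeasyPrint.
    from weasyprint import HTML

    return HTML(**string_input_kwargs(html_source, base_dir))
```

Passing exactly one input argument (string here) avoids the guessing done for positional input, and the explicit base_url lets a reference such as <img src="foo.png"> resolve to a file in base_dir.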
-
render(stylesheets=None, enable_hinting=False, presentational_hints=False, font_config=None)
Lay out and paginate the document, but do not (yet) export it to PDF or another format.
This returns a Document object which provides access to individual pages and various meta-data. See write_pdf() to get a PDF directly.
New in version 0.15.
- Parameters
stylesheets – An optional list of user stylesheets. List elements are CSS objects, filenames, URLs, or file-like objects. (See Stylesheet origins.)
enable_hinting (bool) – Whether text, borders and background should be hinted to fall at device pixel boundaries. Should be enabled for pixel-based output (like PNG) but not for vector-based output (like PDF).
presentational_hints (bool) – Whether HTML presentational hints are followed.
font_config (FontConfiguration) – A font configuration handling @font-face rules.
- Returns
A Document object.
-
write_pdf(target=None, stylesheets=None, zoom=1, attachments=None, presentational_hints=False, font_config=None)
Render the document to a PDF file.
This is a shortcut for calling render(), then Document.write_pdf().
- Parameters
target – A filename, file-like object, or None.
stylesheets – An optional list of user stylesheets. The list’s elements are CSS objects, filenames, URLs, or file-like objects. (See Stylesheet origins.)
zoom (float) – The zoom factor in PDF units per CSS units. Warning: All CSS units are affected, including physical units like cm and named sizes like A4. For values other than 1, the physical CSS units will thus be “wrong”.
attachments – A list of additional file attachments for the generated PDF document, or None. The list’s elements are Attachment objects, filenames, URLs, or file-like objects.
presentational_hints (bool) – Whether HTML presentational hints are followed.
font_config (FontConfiguration) – A font configuration handling @font-face rules.
- Returns
The PDF as a byte string if target is not provided or None, otherwise None (the PDF is written to target).
-
write_png(target=None, stylesheets=None, resolution=96, presentational_hints=False, font_config=None)
Paint the pages vertically to a single PNG image.
There is no decoration around pages other than those specified in CSS with @page rules. The final image is as wide as the widest page. Each page is below the previous one, centered horizontally.
This is a shortcut for calling render(), then Document.write_png().
- Parameters
target – A filename, file-like object, or None.
stylesheets – An optional list of user stylesheets. The list’s elements are CSS objects, filenames, URLs, or file-like objects. (See Stylesheet origins.)
resolution (float) – The output resolution in PNG pixels per CSS inch. At 96 dpi (the default), PNG pixels match the CSS px unit.
presentational_hints (bool) – Whether HTML presentational hints are followed.
font_config (FontConfiguration) – A font configuration handling @font-face rules.
- Returns
The image as a byte string if target is not provided or None, otherwise None (the image is written to target).
-
class weasyprint.CSS(input, **kwargs)
Represents a CSS stylesheet parsed by tinycss2.
An instance is created in the same way as HTML, except that the tree argument is not available. All other arguments are the same.
An additional argument called font_config must be provided to handle @font-face rules. The same fonts.FontConfiguration object must be used for different CSS objects applied to the same document.
CSS objects have no public attribute or method. They are only meant to be used in the write_pdf(), write_png() and render() methods of HTML objects.
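The requirement to share one font configuration can be sketched as follows. The function name render_with_fonts is hypothetical, and the import path weasyprint.fonts is taken from the fonts.FontConfiguration reference above; it may differ in other WeasyPrint versions.

```python
def render_with_fonts(html_source, css_source, target=None):
    """Render HTML with one user stylesheet that may contain @font-face rules."""
    # Imported lazily so that defining this function does not need WeasyPrint.
    from weasyprint import CSS, HTML
    from weasyprint.fonts import FontConfiguration

    # One configuration object is shared by the stylesheet and the rendering
    # call, so fonts declared in the stylesheet are known when laying out.
    font_config = FontConfiguration()
    stylesheet = CSS(string=css_source, font_config=font_config)
    return HTML(string=html_source).write_pdf(
        target, stylesheets=[stylesheet], font_config=font_config)
```

With target left as None the call returns the PDF as a byte string, as described for write_pdf().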
-
weasyprint.default_url_fetcher(url)
Fetch an external resource such as an image or stylesheet.
Another callable with the same signature can be given as the url_fetcher argument to HTML or CSS. (See URL fetchers.)
- Parameters
url (Unicode string) – The URL of the resource to fetch.
- Raises
An exception indicating failure, e.g. ValueError on a syntactically invalid URL.
- Returns
A dict with the following keys:
One of string (a byte string) or file_obj (a file-like object).
Optionally: mime_type, a MIME type extracted e.g. from a Content-Type header. If not provided, the type is guessed from the file extension in the URL.
Optionally: encoding, a character encoding extracted e.g. from a charset parameter in a Content-Type header.
Optionally: redirected_url, the actual URL of the resource if there were e.g. HTTP redirects.
Optionally: filename, the filename of the resource. Usually derived from the filename parameter in a Content-Disposition header.
If a file_obj key is given, it is the caller’s responsibility to call file_obj.close().
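A custom fetcher only has to accept a URL and return a dict of the shape described above. The sketch below serves a few resources from memory and delegates everything else to a fallback callable. The name make_url_fetcher and the memory: URL scheme are invented for this example.

```python
def make_url_fetcher(resources, fallback):
    """Return a callable usable as a url_fetcher argument.

    resources maps URLs to (bytes, mime_type) pairs held in memory;
    fallback is called for every other URL.
    """
    def fetch(url):
        if url in resources:
            body, mime_type = resources[url]
            # "string" carries the raw bytes of the resource.
            return {"string": body, "mime_type": mime_type}
        # Anything unknown is left to the fallback, which may raise.
        return fallback(url)

    return fetch
```

In practice the fallback would typically be weasyprint.default_url_fetcher, and the result would be passed as HTML(..., url_fetcher=fetch).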
-
class weasyprint.document.Document(pages, metadata, url_fetcher, font_config)
A rendered document, with access to individual pages ready to be painted on any cairo surface.
Typically obtained from HTML.render(), but can also be instantiated directly with a list of pages, a set of metadata, and a url_fetcher.
-
metadata
A DocumentMetadata object. Contains information that does not belong to a specific page but to the whole document.
-
url_fetcher
A url_fetcher for resources that have to be read when writing the output.
-
copy(pages='all')
Take a subset of the pages.
Examples:
Write two PDF files for odd-numbered and even-numbered pages:
    # Python lists count from 0 but pages are numbered from 1.
    # [::2] is a slice of even list indexes but odd-numbered pages.
    document.copy(document.pages[::2]).write_pdf('odd_pages.pdf')
    document.copy(document.pages[1::2]).write_pdf('even_pages.pdf')
Write each page to a numbered PNG file:
    # copy() takes an iterable of pages, so wrap the single page in a list.
    for i, page in enumerate(document.pages):
        document.copy([page]).write_png('page_%s.png' % i)
Combine multiple documents into one PDF file, using metadata from the first:
    # The loop over documents must come first in the comprehension.
    all_pages = [p for doc in documents for p in doc.pages]
    documents[0].copy(all_pages).write_pdf('combined.pdf')
-
resolve_links()
Resolve internal hyperlinks.
Links to a missing anchor are removed with a warning. If multiple anchors have the same name, the first is used.
- Returns
A generator yielding lists (one per page) like Page.links, except that target for internal hyperlinks is (page_number, x, y) instead of an anchor name. The page number is a 0-based index into the pages list, and x, y are in CSS pixels from the top-left of the page.
-
make_bookmark_tree()
Make a tree of all bookmarks in the document.
- Returns
A list of bookmark subtrees. A subtree is (label, target, children). label is a string, target is (page_number, x, y) like in resolve_links(), and children is a list of child subtrees.
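Because each subtree is a plain (label, target, children) tuple, the bookmark tree can be walked with a short recursive generator. The sketch below is illustrative only: flatten_bookmarks is a hypothetical helper and sample_tree is hand-written data in the documented shape, not the output of a real document.

```python
def flatten_bookmarks(bookmark_tree, depth=0):
    """Yield (depth, label, page_number) for each bookmark, depth-first."""
    for label, target, children in bookmark_tree:
        page_number, _x, _y = target
        yield depth, label, page_number
        yield from flatten_bookmarks(children, depth + 1)


# Hand-written sample data in the documented shape, for illustration only.
sample_tree = [
    ("Chapter 1", (0, 0, 0), [
        ("Section 1.1", (1, 0, 120.5), []),
    ]),
    ("Chapter 2", (3, 0, 0), []),
]

for depth, label, page_number in flatten_bookmarks(sample_tree):
    # page_number is a 0-based index, so add 1 for display.
    print("  " * depth + label, "p.", page_number + 1)
```

With a real document, the argument would be the return value of document.make_bookmark_tree().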
-
write_pdf(target=None, zoom=1, attachments=None)
Paint the pages in a PDF file, with meta-data.
PDF files written directly by cairo do not have meta-data such as bookmarks/outlines and hyperlinks.
- Parameters
target – A filename, file-like object, or None.
zoom (float) – The zoom factor in PDF units per CSS units. Warning: All CSS units are affected, including physical units like cm and named sizes like A4. For values other than 1, the physical CSS units will thus be “wrong”.
attachments – A list of additional file attachments for the generated PDF document, or None. The list’s elements are Attachment objects, filenames, URLs, or file-like objects.
- Returns
The PDF as a byte string if target is None, otherwise None (the PDF is written to target).
-
write_png(target=None, resolution=96)
Paint the pages vertically to a single PNG image.
There is no decoration around pages other than those specified in CSS with @page rules. The final image is as wide as the widest page. Each page is below the previous one, centered horizontally.
-
class weasyprint.document.DocumentMetadata(title=None, authors=None, description=None, keywords=None, generator=None, created=None, modified=None, attachments=None)
Contains meta-information about a Document that belongs to the whole document rather than specific pages.
New attributes may be added in future versions of WeasyPrint.
-
title
The title of the document, as a string or None. Extracted from the <title> element in HTML and written to the /Title info field in PDF.
-
authors
The authors of the document, as a list of strings. Extracted from the <meta name=author> elements in HTML and written to the /Author info field in PDF.
-
description
The description of the document, as a string or None. Extracted from the <meta name=description> element in HTML and written to the /Subject info field in PDF.
-
keywords
Keywords associated with the document, as a list of strings. (Defaults to the empty list.) Extracted from <meta name=keywords> elements in HTML and written to the /Keywords info field in PDF.
-
generator
The name of one of the software packages used to generate the document, as a string or None. Extracted from the <meta name=generator> element in HTML and written to the /Creator info field in PDF.
-
created
The creation date of the document, as a string or None. Dates are in one of the six formats specified in W3C’s profile of ISO 8601. Extracted from the <meta name=dcterms.created> element in HTML and written to the /CreationDate info field in PDF.
-
modified
The modification date of the document, as a string or None. Dates are in one of the six formats specified in W3C’s profile of ISO 8601. Extracted from the <meta name=dcterms.modified> element in HTML and written to the /ModDate info field in PDF.
-
class weasyprint.document.Page
Represents a single rendered page.
New in version 0.15.
Should be obtained from Document.pages but not instantiated directly.
-
width
The page width, including margins, in CSS pixels.
-
height
The page height, including margins, in CSS pixels.
-
bleed
The page bleed width, in CSS pixels.
-
paint(cairo_context, left_x=0, top_y=0, scale=1, clip=False)
Paint the page in cairo, on any type of surface.
- Parameters
cairo_context – Any cairocffi.Context object.
left_x (float) – X coordinate of the left of the page, in cairo user units.
top_y (float) – Y coordinate of the top of the page, in cairo user units.
scale (float) – Zoom scale in cairo user units per CSS pixel.
clip (bool) – Whether to clip/cut content outside the page. If false or not provided, content can overflow.
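As an illustration of the scale parameter, the sketch below paints one page onto its own cairo PDF surface. It assumes cairocffi is installed and relies on cairo PDF surfaces being measured in points (72 per inch) while a CSS pixel is 1/96 inch. The helper names css_px_to_points and paint_page_to_pdf are hypothetical.

```python
CSS_PIXELS_PER_INCH = 96
PDF_POINTS_PER_INCH = 72


def css_px_to_points(css_pixels, zoom=1):
    """Convert a length in CSS pixels to PDF points."""
    return css_pixels * zoom * PDF_POINTS_PER_INCH / CSS_PIXELS_PER_INCH


def paint_page_to_pdf(page, target, zoom=1):
    """Paint one Page onto its own cairo PDF surface (requires cairocffi)."""
    import cairocffi  # imported lazily; not needed for the conversion above

    # One CSS pixel expressed in cairo user units (points for a PDF surface).
    scale = css_px_to_points(1, zoom)
    surface = cairocffi.PDFSurface(
        target,
        css_px_to_points(page.width, zoom),
        css_px_to_points(page.height, zoom))
    context = cairocffi.Context(surface)
    # clip=True cuts off anything that overflows the page box.
    page.paint(context, left_x=0, top_y=0, scale=scale, clip=True)
    surface.show_page()
    surface.finish()
```

A file written this way lacks the bookmarks and hyperlinks that Document.write_pdf() adds, so direct painting is mostly useful for embedding pages in a larger cairo drawing.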