API
API stability
Everything described here is considered “public”: this is what you can rely on. We will try to maintain backward-compatibility, although there is no hard promise until version 1.0.
Anything else should not be used outside of WeasyPrint itself: we reserve
the right to change it or remove it at any point. Use it at your own risk,
or pin a specific WeasyPrint version in your setup.py
or requirements.txt file.
Command-line API
weasyprint.__main__.main(argv=sys.argv)
The weasyprint program takes at least two arguments:
weasyprint [options] <input> <output>
The input is a filename or URL to an HTML document, or - to read HTML from stdin. The output is a filename, or - to write to stdout.
Options can be mixed anywhere before, between, or after the input and output:
-e <input_encoding>, --encoding <input_encoding>
Force the input character encoding (e.g. -e utf8).
-f <output_format>, --format <output_format>
Choose the output file format among PDF and PNG (e.g. -f png). Required if the output is not a .pdf or .png filename.
-s <filename_or_URL>, --stylesheet <filename_or_URL>
Filename or URL of a user CSS stylesheet (see Stylesheet origins) to add to the document (e.g. -s print.css). Multiple stylesheets are allowed.
-r <dpi>, --resolution <dpi>
For PNG output only. Set the resolution in PNG pixels per CSS inch. Defaults to 96, which means that PNG pixels match CSS pixels.
--base-url <URL>
Set the base for relative URLs in the HTML input. Defaults to the input’s own URL, or the current directory for stdin.
-m <type>, --media-type <type>
Set the media type to use for @media. Defaults to print.
-a <file>, --attachment <file>
Add an attachment to the document. The attachment is included in the PDF output. This option can be used multiple times.
-p, --presentational-hints
Follow HTML presentational hints.
--version
Show the version number. Other options and arguments are ignored.
-h, --help
Show the command-line usage. Other options and arguments are ignored.
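The options above can be assembled programmatically when the program is driven from a script. The following is a minimal sketch, not part of WeasyPrint: the helper name build_weasyprint_command and its keyword arguments are hypothetical, and only the flags themselves come from the option list above.

```python
import subprocess


def build_weasyprint_command(input_path, output_path, stylesheets=(),
                             output_format=None, resolution=None,
                             media_type=None, base_url=None):
    """Assemble an argument list for the weasyprint program.

    Hypothetical helper: only the flags are taken from the documentation.
    """
    command = ["weasyprint"]
    for stylesheet in stylesheets:
        # -s may be repeated, once per user stylesheet.
        command += ["-s", stylesheet]
    if output_format is not None:
        command += ["-f", output_format]
    if resolution is not None:
        # Only meaningful for PNG output.
        command += ["-r", str(resolution)]
    if media_type is not None:
        command += ["-m", media_type]
    if base_url is not None:
        command += ["--base-url", base_url]
    # Options may appear anywhere, so input and output can simply go last.
    command += [input_path, output_path]
    return command


command = build_weasyprint_command(
    "report.html", "report.png", stylesheets=["print.css"], resolution=192)
print(command)
# To actually run it (requires the weasyprint program to be installed):
# subprocess.run(command, check=True)
```

Passing a list to subprocess.run avoids shell quoting problems with filenames that contain spaces.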
Python API
class weasyprint.HTML(input, **kwargs)
Represents an HTML document parsed by html5lib.
You can just create an instance with a positional argument:
doc = HTML(something)
The class will try to guess if the input is a filename, an absolute URL, or a file-like object.
Alternatively, use one named argument so that no guessing is involved:
- Parameters
filename – A filename, relative to the current directory, or absolute.
url – An absolute, fully qualified URL.
file_obj – A file-like object: any object with a read() method.
string – A string of HTML source. (This argument must be named.)
Specifying multiple inputs is an error: HTML(filename="foo.html", url="localhost://bar.html") will raise a TypeError.
You can also pass optional named arguments:
- Parameters
encoding – Force the source character encoding.
base_url – The base used to resolve relative URLs (e.g. in <img src="../foo.png">). If not provided, try to use the input filename, URL, or name attribute of file-like objects.
url_fetcher – A function or other callable with the same signature as default_url_fetcher(), called to fetch external resources such as stylesheets and images. (See URL fetchers.)
media_type – The media type to use for @media. Defaults to 'print'.
Note: In some cases like HTML(string=foo), relative URLs will be invalid if base_url is not provided.
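To see why base_url matters for string input, the sketch below resolves the relative reference from the example above with the standard library. It illustrates generic URL joining only; WeasyPrint's own resolution code may differ in details, and the example.com URL is made up for the illustration.

```python
from urllib.parse import urljoin

relative = "../foo.png"

# With a base URL, the relative reference resolves to an absolute URL.
with_base = urljoin("http://example.com/docs/page.html", relative)

# With no base (as can happen with string input), the reference stays
# relative, so there is nothing to fetch it from.
without_base = urljoin("", relative)

print(with_base)
print(without_base)
```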
render(stylesheets=None, enable_hinting=False, presentational_hints=False, font_config=None)
Lay out and paginate the document, but do not (yet) export it to PDF or another format.
This returns a Document object which provides access to individual pages and various meta-data. See write_pdf() to get a PDF directly.
New in version 0.15.
- Parameters
stylesheets – An optional list of user stylesheets. List elements are CSS objects, filenames, URLs, or file-like objects. (See Stylesheet origins.)
enable_hinting (bool) – Whether text, borders and background should be hinted to fall at device pixel boundaries. Should be enabled for pixel-based output (like PNG) but not for vector-based output (like PDF).
presentational_hints (bool) – Whether HTML presentational hints are followed.
font_config (FontConfiguration) – A font configuration handling @font-face rules.
- Returns
A Document object.
write_pdf(target=None, stylesheets=None, zoom=1, attachments=None, presentational_hints=False, font_config=None)
Render the document to a PDF file.
This is a shortcut for calling render(), then Document.write_pdf().
- Parameters
target – A filename, file-like object, or None.
stylesheets – An optional list of user stylesheets. The list’s elements are CSS objects, filenames, URLs, or file-like objects. (See Stylesheet origins.)
zoom (float) – The zoom factor in PDF units per CSS units. Warning: All CSS units are affected, including physical units like cm and named sizes like A4. For values other than 1, the physical CSS units will thus be “wrong”.
attachments – A list of additional file attachments for the generated PDF document or None. The list’s elements are Attachment objects, filenames, URLs or file-like objects.
presentational_hints (bool) – Whether HTML presentational hints are followed.
font_config (FontConfiguration) – A font configuration handling @font-face rules.
- Returns
The PDF as byte string if target is not provided or None, otherwise None (the PDF is written to target).
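The zoom warning can be made concrete with a little arithmetic. This is a sketch under two assumptions that the text above does not spell out: PDF units are points (1/72 inch) and a CSS pixel is 1/96 inch. Both helper functions are hypothetical and not part of WeasyPrint.

```python
A4_WIDTH_MM = 210.0
A4_HEIGHT_MM = 297.0


def zoomed_size_mm(width_mm, height_mm, zoom):
    """Physical size on paper of a CSS length pair after applying zoom."""
    return (width_mm * zoom, height_mm * zoom)


def css_px_to_pdf_points(css_px, zoom=1):
    """Convert CSS pixels to PDF points.

    Assumes 96 CSS pixels and 72 PDF points per inch.
    """
    return css_px * zoom * 72 / 96


# At zoom=1 an A4 page really is A4; at zoom=2 it is twice as large,
# which is why physical CSS units become "wrong".
print(zoomed_size_mm(A4_WIDTH_MM, A4_HEIGHT_MM, 1))
print(zoomed_size_mm(A4_WIDTH_MM, A4_HEIGHT_MM, 2))
print(css_px_to_pdf_points(96))
```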
write_png(target=None, stylesheets=None, resolution=96, presentational_hints=False, font_config=None)
Paint the pages vertically to a single PNG image.
There is no decoration around pages other than those specified in CSS with @page rules. The final image is as wide as the widest page. Each page is below the previous one, centered horizontally.
This is a shortcut for calling render(), then Document.write_png().
- Parameters
target – A filename, file-like object, or None.
stylesheets – An optional list of user stylesheets. The list’s elements are CSS objects, filenames, URLs, or file-like objects. (See Stylesheet origins.)
resolution (float) – The output resolution in PNG pixels per CSS inch. At 96 dpi (the default), PNG pixels match the CSS px unit.
presentational_hints (bool) – Whether HTML presentational hints are followed.
font_config (FontConfiguration) – A font configuration handling @font-face rules.
- Returns
The image as byte string if target is not provided or None, otherwise None (the image is written to target).
class weasyprint.CSS(input, **kwargs)
Represents a CSS stylesheet parsed by tinycss2.
An instance is created in the same way as HTML, except that the tree argument is not available. All other arguments are the same.
An additional argument called font_config must be provided to handle @font-face rules. The same fonts.FontConfiguration object must be used for different CSS objects applied to the same document.
CSS objects have no public attribute or method. They are only meant to be used in the write_pdf(), write_png() and render() methods of HTML objects.
weasyprint.default_url_fetcher(url)
Fetch an external resource such as an image or stylesheet.
Another callable with the same signature can be given as the url_fetcher argument to HTML or CSS. (See URL fetchers.)
- Parameters
url (Unicode string) – The URL of the resource to fetch.
- Raises
An exception indicating failure, e.g. ValueError on a syntactically invalid URL.
- Returns
A dict with the following keys:
One of string (a byte string) or file_obj (a file-like object).
Optionally: mime_type, a MIME type extracted e.g. from a Content-Type header. If not provided, the type is guessed from the file extension in the URL.
Optionally: encoding, a character encoding extracted e.g. from a charset parameter in a Content-Type header.
Optionally: redirected_url, the actual URL of the resource if there were e.g. HTTP redirects.
Optionally: filename, the filename of the resource. Usually derived from the filename parameter in a Content-Disposition header.
If a file_obj key is given, it is the caller’s responsibility to call file_obj.close().
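A custom fetcher only has to accept a URL and return a dict in the shape described above. The sketch below serves a few resources from memory and defers everything else to a fallback callable. The make_url_fetcher helper, the memory: scheme and the refuse fallback are all hypothetical, and whether WeasyPrint accepts such a scheme in practice is an assumption that would need checking.

```python
def make_url_fetcher(resources, fallback):
    """Return a fetcher serving in-memory resources, else calling fallback.

    resources maps URL strings to (bytes, mime_type) pairs.
    Hypothetical helper, not part of WeasyPrint.
    """
    def fetch(url):
        if url.startswith("memory:"):
            try:
                content, mime_type = resources[url]
            except KeyError:
                # Signal failure with an exception, as the fetcher contract asks.
                raise ValueError("No in-memory resource for %r" % url)
            # 'string' carries the raw bytes; 'mime_type' is optional.
            return {"string": content, "mime_type": mime_type}
        return fallback(url)
    return fetch


def refuse(url):
    """Stand-in fallback for this demo: block all other fetches."""
    raise ValueError("Fetching %r is not allowed" % url)


fetcher = make_url_fetcher(
    {"memory:style.css": (b"body { color: red }", "text/css")},
    fallback=refuse)
result = fetcher("memory:style.css")
print(result["mime_type"])
```

In real use the fallback would more likely be weasyprint.default_url_fetcher, and the resulting callable would be passed as the url_fetcher argument to HTML or CSS.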
class weasyprint.document.Document(pages, metadata, url_fetcher, font_config)
A rendered document, with access to individual pages ready to be painted on any cairo surface.
Typically obtained from HTML.render(), but can also be instantiated directly with a list of pages, a set of metadata, and a url_fetcher.
metadata
A DocumentMetadata object. Contains information that does not belong to a specific page but to the whole document.
url_fetcher
A url_fetcher for resources that have to be read when writing the output.
copy(pages='all')
Take a subset of the pages.
Examples:
Write two PDF files for odd-numbered and even-numbered pages:
    # Python lists count from 0 but pages are numbered from 1.
    # [::2] is a slice of even list indexes but odd-numbered pages.
    document.copy(document.pages[::2]).write_pdf('odd_pages.pdf')
    document.copy(document.pages[1::2]).write_pdf('even_pages.pdf')
Write each page to a numbered PNG file:
    # copy() takes a list of pages, so wrap the single page in a list.
    for i, page in enumerate(document.pages):
        document.copy([page]).write_png('page_%s.png' % i)
Combine multiple documents into one PDF file, using metadata from the first:
    # Loop over the documents first, then over each document's pages.
    all_pages = [p for doc in documents for p in doc.pages]
    documents[0].copy(all_pages).write_pdf('combined.pdf')
resolve_links()
Resolve internal hyperlinks.
Links to a missing anchor are removed with a warning. If multiple anchors have the same name, the first is used.
- Returns
A generator yielding lists (one per page) like Page.links, except that target for internal hyperlinks is (page_number, x, y) instead of an anchor name. The page number is a 0-based index into the pages list, and x, y are in CSS pixels from the top-left of the page.
make_bookmark_tree()
Make a tree of all bookmarks in the document.
- Returns
A list of bookmark subtrees. A subtree is (label, target, children). label is a string, target is (page_number, x, y) like in resolve_links(), and children is a list of child subtrees.
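Because each subtree is a plain (label, target, children) tuple, the bookmark tree can be walked with a short recursive generator. The sketch below runs on hand-written sample data in the documented shape rather than on a real document. The flatten_bookmarks helper is hypothetical.

```python
def flatten_bookmarks(subtrees, depth=0):
    """Yield (depth, label, page_number) for each bookmark, in document order."""
    for label, target, children in subtrees:
        page_number, x, y = target
        yield depth, label, page_number
        # Recurse into child subtrees one level deeper.
        yield from flatten_bookmarks(children, depth + 1)


# Hand-written sample in the shape described for make_bookmark_tree().
sample_tree = [
    ("Chapter 1", (0, 0, 50), [
        ("Section 1.1", (0, 0, 300), []),
        ("Section 1.2", (1, 0, 120), []),
    ]),
    ("Chapter 2", (2, 0, 50), []),
]

for depth, label, page_number in flatten_bookmarks(sample_tree):
    # page_number is a 0-based index, so add 1 for display.
    print("%s%s (page %d)" % ("  " * depth, label, page_number + 1))
```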
write_pdf(target=None, zoom=1, attachments=None)
Paint the pages in a PDF file, with meta-data.
PDF files written directly by cairo do not have meta-data such as bookmarks/outlines and hyperlinks.
- Parameters
target – A filename, file-like object, or None.
zoom (float) – The zoom factor in PDF units per CSS units. Warning: All CSS units are affected, including physical units like cm and named sizes like A4. For values other than 1, the physical CSS units will thus be “wrong”.
attachments – A list of additional file attachments for the generated PDF document or None. The list’s elements are Attachment objects, filenames, URLs, or file-like objects.
- Returns
The PDF as byte string if target is None, otherwise None (the PDF is written to target).
write_png(target=None, resolution=96)
Paint the pages vertically to a single PNG image.
There is no decoration around pages other than those specified in CSS with @page rules. The final image is as wide as the widest page. Each page is below the previous one, centered horizontally.
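The stacking rule described above (widest page sets the image width, pages follow one another vertically, each centered) can be sketched as a small layout computation. This is an illustration of the rule, not WeasyPrint's code: the png_layout helper is hypothetical, and rounding page sizes up to whole pixels is an assumption.

```python
import math


def png_layout(page_sizes, resolution=96):
    """Compute where each page lands in the combined PNG.

    page_sizes is a non-empty list of (width, height) in CSS pixels.
    Returns (image_width, image_height, positions) in PNG pixels, where
    positions holds one (left, top) pair per page.
    """
    # 96 PNG pixels per CSS inch means one PNG pixel per CSS pixel.
    scale = resolution / 96
    # Assumption: sizes are rounded up to whole pixels.
    sizes = [(math.ceil(w * scale), math.ceil(h * scale))
             for w, h in page_sizes]
    image_width = max(w for w, _ in sizes)
    positions = []
    top = 0
    for width, height in sizes:
        # Center narrower pages horizontally.
        left = (image_width - width) / 2
        positions.append((left, top))
        top += height
    return image_width, top, positions


print(png_layout([(800, 1000), (400, 500)], resolution=192))
```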
class weasyprint.document.DocumentMetadata(title=None, authors=None, description=None, keywords=None, generator=None, created=None, modified=None, attachments=None)
Contains meta-information about a Document that belongs to the whole document rather than specific pages.
New attributes may be added in future versions of WeasyPrint.
title
The title of the document, as a string or None. Extracted from the <title> element in HTML and written to the /Title info field in PDF.
authors
The authors of the document, as a list of strings. Extracted from the <meta name=author> elements in HTML and written to the /Author info field in PDF.
description
The description of the document, as a string or None. Extracted from the <meta name=description> element in HTML and written to the /Subject info field in PDF.
keywords
Keywords associated with the document, as a list of strings. (Defaults to the empty list.) Extracted from <meta name=keywords> elements in HTML and written to the /Keywords info field in PDF.
generator
The name of one of the software packages used to generate the document, as a string or None. Extracted from the <meta name=generator> element in HTML and written to the /Creator info field in PDF.
created
The creation date of the document, as a string or None. Dates are in one of the six formats specified in W3C’s profile of ISO 8601. Extracted from the <meta name=dcterms.created> element in HTML and written to the /CreationDate info field in PDF.
modified
The modification date of the document, as a string or None. Dates are in one of the six formats specified in W3C’s profile of ISO 8601. Extracted from the <meta name=dcterms.modified> element in HTML and written to the /ModDate info field in PDF.
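For the created and modified fields, a date string in W3C's ISO 8601 profile can be produced with the standard library. The sketch below emits the complete date-plus-time form with a time zone designator, which is one of the six formats. The w3c_datetime helper is hypothetical, and the sample date is arbitrary.

```python
from datetime import datetime, timedelta, timezone


def w3c_datetime(moment):
    """Format an aware datetime as YYYY-MM-DDThh:mm:ssTZD."""
    if moment.tzinfo is None:
        # The profile requires a time zone designator whenever a time is given.
        raise ValueError("a time zone is required for date-time values")
    return moment.isoformat(timespec="seconds")


created = datetime(1997, 7, 16, 19, 20, 30,
                   tzinfo=timezone(timedelta(hours=1)))
print(w3c_datetime(created))
```

The resulting string is what would go into the content attribute of a <meta name=dcterms.created> or <meta name=dcterms.modified> element.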
class weasyprint.document.Page
Represents a single rendered page.
New in version 0.15.
Should be obtained from Document.pages but not instantiated directly.
width
The page width, including margins, in CSS pixels.
height
The page height, including margins, in CSS pixels.
bleed
The page bleed width, in CSS pixels.
paint(cairo_context, left_x=0, top_y=0, scale=1, clip=False)
Paint the page in cairo, on any type of surface.
- Parameters
cairo_context – Any cairocffi.Context object.
left_x (float) – X coordinate of the left of the page, in cairo user units.
top_y (float) – Y coordinate of the top of the page, in cairo user units.
scale (float) – Zoom scale in cairo user units per CSS pixel.
clip (bool) – Whether to clip/cut content outside the page. If false or not provided, content can overflow.