Releases: jgm/pandoc
pandoc 3.5
-
Add command-line options `--list-of-figures`/`--lof` and `--list-of-tables`/`--lot` (#10029, Akash Patel). Currently only docx, latex, and context are affected by these options. Setting the `lof` and `lot` variables will also work for the formats that are currently supported.
-
Defaults files: interpolation of environment variables now works for the `to` and `from` fields (#8024). This is needed because these files can contain paths of custom readers/writers.
-
Docx reader:
- Reset lists after headers in same list `numId` (#10258). To accomplish this, we add a Heading constructor to BodyPart and include on it all the information list items have.
-
DocBook reader:
- Parse id, class, and tabstyle on tables (#10181, Erik Rask). Add parsing of id (xml:id), class, and tabstyle XML attributes for table and informaltable in the DocBook reader. The tabstyle value is put in the ‘custom-style’ attribute.
-
Dokuwiki reader:
- Be more forgiving about misaligned lists, like dokuwiki itself (#8863).
- Improve blockquote parsing in dokuwiki. Allow for quoted code blocks.
- Enable smart extension.
- Properly parse `--` and `---` as dashes.
- Fix block quote behavior (#6461). Blockquotes are not really block containers in DokuWiki; the lines are interpreted literally (so, e.g., you can’t start a list), and line breaks are added at the ends.
-
EPUB reader:
- Fix links to other files in the EPUB, making them internal links to a fragment derived from the filename (#10207). There was already code to handle links like `#foo`, but not to handle links like `ch0001.html#foo`.
-
LaTeX reader:
- Add em, ex, px, mu to list of units for dimension args (#10212).
-
ANSI writer:
- Fix subscripts (Evan Silberman).
-
DokuWiki writer:
- Don’t emit `<HTML>` tags (#7413). The use of these tags is now strongly discouraged for security reasons, and will be removed. We previously used them as a fallback for lists that could not be represented using DokuWiki syntax, e.g. ordered lists with fancy numbers or lists with multiple blocks in their items. We also used them for block quotes with multiple blocks as their contents. We now use the `<WRAP>` syntax (from the optional WRAP plugin) to handle lists with multiple blocks as their contents. A new method of handling block quotes with complex contents has the side benefit of also handling nested block quotes, which weren’t supported before. `<HTML>` and `<html>` tags are now used only for raw HTML blocks and inlines, and only if the `raw_html` extension is enabled. (It is now a valid extension for `dokuwiki`, though off by default.)
-
Docx writer:
- Support `--list-of-figures` and `--list-of-tables` (or `lof` and `lot` variables) (Akash Patel).
-
HTML writer:
- Don’t emit missing title/lang warnings if the template does not contain the `pagetitle` or `lang` variables, respectively (#9370).
-
LaTeX writer:
- Better fix for lists in definition lists (#10241). In commit a26ec96 we added an empty `\item[]` to the beginning of a list that occurs first in a definition list, to avoid having one item on the line with the label. This gave bad results in some cases (#10241) and there is a more idiomatic solution anyway: using `\hfill`.
- Avoid error on `refs` div with empty citations (#10185). If there are no citations, don’t emit an empty CSLReferences environment.
-
RST writer:
- Change bullet list hang from 3 to 2. This accords with the style in the RST reference docs.
- Handle cases where indented context starts with block quote (#10236). In these cases we emit an empty comment to fix the point from which indentation is measured; otherwise the block quote is not parsed as a block quote. This affects list items and admonitions.
- Don’t enclose the list table in a `.. table::`; this leads to doubled captions (#10226).
- Fix alignment of list table items corresponding to cells (#10227).
-
JATS template:
- Support `floats-group` (Albert Krewinkel, see #10196). The content of the `floats-group` variable is now rendered in a `<floats-group>` element when using the publishing or archiving tag sets.
-
LaTeX and Beamer templates:
- Split old default.latex into two templates, `default.latex` and `default.beamer`, factoring common parts into partials: `fonts.latex`, `common.latex`, `passoptions.latex`, `hypersetup.latex`, `after-header-includes.latex`.
- Make `default.beamer` the default template for beamer.
- Add `shorttitle`, `shortsubtitle`, `shortauthor`, `shortinstitute`, `shortdate` variables to beamer template (#10248, Thomas Hodgson).
- Make `--number-sections` work with beamer (#10245, Thomas Hodgson).
- Support a list of images for `titlegraphic` in beamer template (#10246, Thomas Hodgson). Title graphic options will be applied to each title graphic. Images will be separated by `\enspace`.
- Beamer theme options (#10243)
- Add theme options to beamer template: `colorthemeoptions`, `fontthemeoptions`, `innerthemeoptions`, `outerthemeoptions` (#10243, Thomas Hodgson).
- Don’t load amsmath, amssymb in beamer template. These are loaded by beamer automatically.
-
Text.Pandoc.SelfContained:
- Improve handling of links to remote CSS (#10261).
-
Text.Pandoc.Class:
- Allow extracting `data:` URIs even in PandocPure (`--sandbox`) (#10249).
- Export `extractURIData` [API change].
-
Text.Pandoc.PDF:
- Read `.toc` and `.log` files from output directory (#10186). When this is different from the input directory, this is where `.toc` and `.log` files are written.
-
Text.Pandoc.Shared:
- Modify `addPandocAttributes` for changes in commonmark-pandoc. The new commonmark-pandoc version automatically adds the attribute `wrapper="1"` on all Divs and Spans that are introduced just as containers for attributes that belong properly to their contents. So we don’t need to add the attribute here. This gives much better results in some cases. Previously the wrapper attribute was being added even for explicit Divs and Spans in djot, but it is not needed in these cases.
-
Text.Pandoc.Options:
- Add `writerListOfFigures` and `writerListOfTables` fields to `WriterOptions` (#8245, Akash Patel). [API change]
-
Text.Pandoc.App:
- Add `optListOfFigures` and `optListOfTables` to `Opt` (#8245) [API change].
-
Lua subsystem (Albert Krewinkel):
-
Update List module (#9835). The module now comes with a method `:at(index[, def])` that gives access to items by index: it accepts negative indices to count from the end, and returns the `def` value as a default if the list has no item at the given position. Furthermore, the list constructor `pandoc.List` now accepts iterators. E.g., `pandoc.List(text:gmatch '%S+')` returns the list of words in `text`.
-
Support character styling via `pandoc.layout`. The `Doc` values produced and handled by the `pandoc.layout` module can now be styled using `bold`, `italic`, `underlined`, or `strikeout`. The style is ignored in normal rendering, but becomes visible when rendering to ANSI output. The `pandoc.layout.render` function now takes a third parameter that defines the output style, either plain or ansi.
-
It is now possible to return a single filter from a filter file, e.g.

    -- Switch single- and double quotes
    return {
      Quoted = function (elem)
        elem.quotetype = elem.quotetype == 'SingleQuote' and
          'DoubleQuote' or 'SingleQuote'
        return elem
      end
    }

The filter must not contain numerical indexes, or it might be treated as a list of filters.
-
Add `list_of_figures` and `list_of_tables` to writer options (Akash Patel).
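
The indexing rules described above for `:at(index[, def])` can be modelled in a few lines. This is only a Python sketch of the stated semantics (1-based positions, negative indices counting from the end, a fallback value), not pandoc's Lua implementation; the function name `at` and the treatment of index 0 are assumptions made for illustration.

```python
def at(items, index, default=None):
    """Model of List:at(index[, def]) as described in the changelog.

    Positive indices are 1-based, negative indices count from the end,
    and `default` is returned when no item exists at that position.
    """
    n = len(items)
    if index > 0:
        pos = index - 1
    elif index < 0:
        pos = n + index
    else:
        # Assumption: index 0 names no item in a 1-based list.
        return default
    return items[pos] if 0 <= pos < n else default


# Analogous to pandoc.List(text:gmatch '%S+'): build the list of words.
words = "lorem ipsum dolor".split()
print(at(words, 1))         # first word
print(at(words, -1))        # last word
print(at(words, 5, "n/a"))  # out of range, so the default is returned
```

In a real filter the equivalent call would be made on a `pandoc.List` value in Lua.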
-
-
Use latest releases of commonmark, commonmark-pandoc, texmath, djot.
-
Stop depending on package SHA (Albert Krewinkel). Use `crypton` instead.
-
`linux/make_artifacts.sh`: add riscv64 support (Olivier Benz).
-
Fix invalid XML in `test/docx/normalize.docx` (#10242).
-
`doc/lua-filters.md`: list functions in `pandoc.utils` alphabetically (Albert Krewinkel).
-
MANUAL.txt:
pandoc 3.4
-
New output format: `ansi` (for formatted console output) (Evan Silberman). Most Pandoc elements are supported and printed in a reasonable way, if not always ideally. This version does no detection of terminal capabilities, nor does it fall back to different output styles for less-capable terminals.
-
Add command line options `--table-caption-position` and `--figure-caption-position`. These allow the user to specify whether to put captions above or below tables and figures, respectively. The following output formats are supported: HTML (and related such as EPUB), LaTeX (and Beamer), Docx, ODT/OpenDocument, Typst.
-
Change default `--pdf-engine` via HTML to WeasyPrint (#10142). `wkhtmltopdf` is deprecated. `weasyprint` is the easiest-to-install, maintained alternative. For better results, one might prefer `pagedjs-cli`.
-
Org reader:
- Fix parsing of src blocks with an `-i` flag (#10071, Albert Krewinkel). Tabs are now preserved in the contents of src blocks if the block has the `-i` flag.
-
RTF reader:
- Handle images inside `shp` contexts (#10145).
-
RST reader:
-
Improve simple table support (#10093). Multiline rows occur only when the first cell is empty; we were previously treating lines with any empty cell as row continuations. In addition, we no longer wrap multiline cells in Para if they can be represented as Plain. This is consistent with docutils behavior.
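
The continuation rule above (a line continues the previous row only when its first cell is empty) can be sketched as follows. This is a minimal Python model that assumes the lines have already been split into cells; the function name `merge_continuations` is hypothetical and the code is not the reader's actual implementation.

```python
def merge_continuations(rows):
    """Merge continuation lines of an RST simple table into logical rows.

    `rows` is a list of lists of cell strings. A line is a continuation
    only if its FIRST cell is empty; an empty cell elsewhere does not
    make the line a continuation.
    """
    merged = []
    for cells in rows:
        cells = [c.strip() for c in cells]
        if merged and cells[0] == "":
            # Append each non-empty cell to the matching cell of the previous row.
            merged[-1] = [
                "\n".join(part for part in (old, new) if part)
                for old, new in zip(merged[-1], cells)
            ]
        else:
            merged.append(cells)
    return merged


lines = [
    ["1", "first line"],
    ["", "second line"],  # first cell empty: continues row 1
    ["2", ""],            # only the second cell is empty: a new row
]
print(merge_continuations(lines))
```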
-
LaTeX reader:
-
Typst reader:
- Change how “block” elements are handled. Previously they were always parsed as divs. But actually they can occur in some “inline” contexts. Now we first try to parse them as inlines, and only as blocks if that fails. A surrounding Div or Span element is added only if there is an identifier.
-
HTML reader:
- Only parse main element’s contents (if present) (#10140). If main has an id or class, we include a div with that id or class; otherwise just the contents.
- Read TeX annotation in MathML content if present (#9971).
- Better handle KaTeX-generated math (#9971). KaTeX emits the MathML followed by a span with an HTML fallback. Previously pandoc was converting both. We now ignore the HTML fallback span, marked with class `katex-html`.
-
New module: Text.Pandoc.Writers.ANSI [API change] (Evan Silberman).
-
Docx writer:
- Add “SuppressAuthor” and “AuthorOnly” to citationMode when `+citations` is used (thomjur).
- Support `custom-style` attribute for docx table (Sebbones).
- Support `--number-offset`.
- Make table/figure rendering sensitive to caption position settings.
-
OpenDocument writer:
- Make table/figure rendering sensitive to caption position settings.
-
Typst writer/template:
- Implement figure caption positions by triggering a show rule in the default template, which determines caption positions for figures and tables globally.
- Don’t include trailing semicolon after `@` style citations with suffixes (#10148).
- Template: move header-includes before show doc (#9996, Gordon Woodhull).
-
LaTeX writer:
-
HTML writer/template:
- Make `<figcaption>` placement sensitive to caption position settings. For tables, `<caption>` must be the first element, and positioning is determined by CSS, so here we set a variable which the default template is sensitive to.
- Use `makeSectionsWithOffsets` for `writerNumberOffset`, instead of the old, inefficient code.
- Don’t add doc-biblioref role to every link in a citation; only to links to the bibliography (#10156).
- Add `data-` prefix when rendering `label` attribute (#10048).
-
Markdown writer:
-
Avoid emitting markdown caption if table has fallen back to raw HTML, which will then contain a `<caption>` tag (#10094).
-
Make math sensitive to `tex_math_gfm` extension (#9121). This means that in GFM output, the “new style” math will be used by default, e.g.

    $`x=y`$

    ```math
    x = y
    ```

To defeat this and get the older behavior, namely

    $x=y$

    $$x=y$$

one could use `-t gfm-tex_math_gfm`.
-
-
AsciiDoc writer:
- Add `link:` prefix when needed (#10105). AsciiDoc requires it except for `http`, `https`, `irc`, `mailto`, `ftp` schemes.
- Preserve original base level (#10062). We used to normalize so that the base level is always 1, but asciidoc no longer seems to care about that, and the behavior creates difficulties when we are converting fragments.
- Don’t emit empty figure caption (#10047).
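
The `link:` rule described above can be illustrated with a small Python sketch. It only models the stated rule (no prefix for the five listed schemes, `link:` otherwise); the function name `asciidoc_link` and the set name are hypothetical, and the real writer may handle further cases.

```python
from urllib.parse import urlsplit

# Schemes that, per the changelog, need no "link:" prefix in AsciiDoc.
BARE_SCHEMES = {"http", "https", "irc", "mailto", "ftp"}


def asciidoc_link(url, text):
    """Render an AsciiDoc link, adding the link: prefix when it is needed."""
    scheme = urlsplit(url).scheme.lower()
    prefix = "" if scheme in BARE_SCHEMES else "link:"
    return f"{prefix}{url}[{text}]"


print(asciidoc_link("https://pandoc.org", "Pandoc"))
print(asciidoc_link("chapter2.html", "next chapter"))
print(asciidoc_link("gemini://example.org/", "capsule"))
```

Relative targets have an empty scheme, so in this sketch they also receive the prefix.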
-
ODT writer:
- Add TableCaption to styles.xml (#10058, Ian Max Andolina).
-
LaTeX template:
- Fix wrong beamer color in (sub)section page (Jonathan).
-
Text.Pandoc.Options:
- Add `CaptionPosition` and new `WriterOptions` fields `writerFigureCaptionPosition` and `writerTableCaptionPosition` [API change].
-
Text.Pandoc.Opt:
- Change default for optNumberOffset to `[]`. This behaves the same as `[0,0,0,0,0]`.
- Add `Opt` fields `optFigureCaptionPosition` and `optTableCaptionPosition` [API change].
-
Text.Pandoc.Format: change `formatFromFilePaths` so that it is smarter about URLs. URLs are parsed, and we take the format from the path component, if present (#10141). This means that `https://emacs.org/` will be treated as HTML, while `https://emacs.org/sample.org` will be treated as Org.
-
Text.Pandoc.URI:
- Add unofficial `gemini:` to list of URI schemes (Pau RE).
-
Text.Pandoc.Shared:
- Add `makeSectionsWithOffsets` [API change].
- Remove `stripEmptyParagraphs` [API change] (Albert Krewinkel). This function is no longer used.
-
Text.Pandoc.Highlighting: Expose `formatANSI` [API change] (Evan Silberman).
-
Text.Pandoc.Writers.Shared: export `to{Sub,Super}scriptInline` [API change] (Evan Silberman).
-
Remove use of partial functions (e.g. `head`) in code.
-
Use latest skylighting-core, skylighting, doclayout, texmath, typst.
-
pandoc-lua-engine: Add accessors for several writer options, including some that were added in previous releases.
-
pandoc-server: Initialize some missing fields in WriterOptions: `writerEpubTitlePage`, `writerChunkTemplate`, `writerListTables`, `writerFigureCaptionPosition`, `writerTableCaptionPosition`.
-
CONTRIBUTING.md: Summarize steps for adding a new cli option.
-
MANUAL.txt:
- Clarify that the `--number-offset` option should only directly affect numbering of the first section heading in a document; subsequent headings will increment normally.
- Fix asciidoc link (#10039).
- Fix CSL Docs broken link (#10100, Tristano Ajmone).
- Document the use of `luatexja` when CJKmainfont is used with lualatex (#3873, Kolen Cheung).
- Add a `citations` (typst) section to the manual (#9127).
- Clarify that `citations` affects both input and output for `org`.
- Add note on `--citeproc` that you may need to disable `citations` extension on the output format (e.g., `-t markdown-citations`) to see the rendered citation (#9127, #10012).
-
INSTALL.md: reorganise info on static binaries and add conda-forge install options (#10098, #10069, Ian Max Andolina).
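
The `formatFromFilePaths` change listed in this release (taking the format from the path component of a URL) can be sketched roughly as below. This is a Python model, not the Haskell function: the extension table is a small illustrative subset, the name `format_from_source` is hypothetical, and falling back to HTML for any URL without a recognized extension is an assumption.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Illustrative subset only; pandoc's own extension table is much larger.
EXTENSION_FORMATS = {".org": "org", ".md": "markdown", ".html": "html", ".tex": "latex"}


def format_from_source(source):
    """Guess an input format from a file path or URL.

    For URLs, only the path component is inspected, so the host name
    (e.g. "emacs.org") never contributes an extension.
    """
    parts = urlsplit(source)
    if parts.scheme in ("http", "https"):
        ext = PurePosixPath(parts.path).suffix.lower()
        # Assumption: URLs without a known extension are treated as HTML.
        return EXTENSION_FORMATS.get(ext, "html")
    return EXTENSION_FORMATS.get(PurePosixPath(source).suffix.lower())


print(format_from_source("https://emacs.org/"))            # html
print(format_from_source("https://emacs.org/sample.org"))  # org
```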
pandoc 3.3
-
New cli option: `--link-images`. This causes images to be linked rather than embedded in ODT.
-
Allow `--number-sections` to take an optional `true|false` argument.
-
RTF reader:
- Handle `\*\shppict` without dropping image (#10025).
-
TWiki Reader:
- Recognize WikiWords as internal links (#9941).
- Avoid partial function.
-
Typst reader:
- Ignore ‘pad’ and just parse its body (#9958).
- Use typst 0.5.0.5. Fixes parsing of equations like `$1.$`.
-
Docx writer:
- Fix regression with nested lists (#9994). The bug affects e.g. ordered lists with bullet sublists; after the sublist the top-level list reverts to bullets instead of being properly numbered. This is a regression introduced in version 3.2.1.
-
BibTeX writer:
- Ensure that “literal” names are enclosed in braces (#9987).
-
Man writer:
- Use default middle header when metadata does not include `header` (#9943). This change causes pandoc to omit the middle header parameter when `header` is not set, rather than emitting `""`. The parameter is optional and man will use a default based on the section if it is not specified.
-
HTML templates: don’t load polyfill (#9918). This was added in a period when MathJax required polyfill. MathJax no longer recommends this and polyfill should no longer be necessary on any reasonably modern browser.
-
Translations:
- Add `ua.yaml` (Jens Oehlschlägel).
- Add a script (`tools/update-translations.py`) and Makefile target (`update-translations`) to update translation data automatically from babel and polyglossia upstream (Stephen Huan).
- Use this script to update language data, increasing the number of languages we cover (Stephen Huan). Fix a few small bugs in existing translations.
-
Fix some mistakes with Japanese language code (#9938). In several places we were mistakenly assuming that the BCP 47 code for Japanese language was `jp`. It is `ja`.
-
Text.Pandoc.Options:
- New field in WriterOptions: `writerLinkImages` [API change] (#9815).
-
Text.Pandoc.App.Opt:
- New field in Opt: `optLinkImages` [API change] (#9815).
-
Lua subsystem:
-
Keep `lpeg` and `re` as “loaded” modules (Albert Krewinkel). The modules `lpeg` and `re` are now treated as if they had been loaded with `require`. Previously the modules were only assigned to global values, but could be loaded again via `require`, thereby allowing the use of a system-wide installation. However, this proved to be confusing.

The old behavior can be restored by adding the following lines to the top of Lua scripts, or to the `init.lua` in the data dir.

    debug.registry()['_LOADED'].lpeg = nil
    debug.registry()['_LOADED'].re = nil
-
-
`pandoc-cli`: Include pandoc copyright in Lua version info (Albert Krewinkel).
-
`pandoc-cli`: Refer printing of version info to the Lua interpreter (Albert Krewinkel). The Lua interpreter no longer terminates when called with `-v` or `--version` arguments, thus improving compatibility with the default `lua` interpreter program.
-
Avoid partial functions in JATS reader, DocBook writer, Haddock reader.
-
Allow tls 2.1.x.
-
MANUAL.txt:
- Make documentation of extensions clearer (#9060).
- Fix section level for two Extensions entries.
-
lua-filters.md: Partially autogenerate docs for module `pandoc` (Albert Krewinkel). The documentation system isn’t powerful enough to generate the full documentation automatically.
pandoc 3.2.1
-
Fix `gfm_auto_identifiers` to replace emojis with their aliases, as documented (#9876).
-
CSV reader:
- Turn line breaks into LineBreaks not SoftBreaks (#9797).
-
Docx reader:
- Support task lists (#8211).
- Fix a small bug in parsing delimiters in numbered lists, which led to the default delimiter being used wrongly in some cases.
- Improve handling of captions.
- Support HorizontalRule. We support both pandoc-style and the style described on a Microsoft support page, an empty paragraph with a bottom border (#6285).
- React to `"left"` value on `jc` attribute.
- Handle column and cell alignments (#8551). We take the column alignments from the first body row.
- Fix a bug that caused comments inside insertions or deletions to be ignored (#9833).
-
HTML reader:
- Better handle non-`li` elements in `ul` and `ol` (#9809). For example, a `p` after a closed `li` will be incorporated into the previous `li`. This mirrors what browsers do with this invalid HTML.
-
LaTeX reader:
- Fix parsing of dimensions beginning with `.`, e.g. `\kern.1pt` (#9902).
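
To make the dimension fix above concrete, here is a small Python sketch of a pattern that accepts a number with or without digits before the decimal point, followed by a unit. The unit list and the name `parse_dimension` are assumptions for illustration; pandoc's LaTeX reader is written in Haskell and does not use this regular expression.

```python
import re

# Number: optional sign, then either "digits[.digits]" or ".digits".
# Assumption: the unit list is an illustrative set of common TeX units.
DIMENSION = re.compile(
    r"[+-]?(?:\d+(?:\.\d*)?|\.\d+)\s*"
    r"(?:pt|pc|in|bp|cm|mm|dd|cc|sp|em|ex|mu|px)"
)


def parse_dimension(text):
    """Return the dimension at the start of `text`, or None if there is none."""
    match = DIMENSION.match(text)
    return match.group(0) if match else None


print(parse_dimension(".1pt"))        # the case from \kern.1pt
print(parse_dimension("1.5em rest"))
print(parse_dimension("pt"))          # no number, so no dimension
```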
-
Markdown reader:
- Allow author-only textual citations (#7219). E.g. `-@reese2002` outside of brackets.
-
RST reader:
-
Textile reader:
- Don’t let spans begin right after a symbol (#9878).
-
Texinfo writer:
- Ensure proper escaping in all node/link contexts.
- Target node rather than anchor when possible in internal links.
- Remove illegal characters from internal link anchors (#6177).
- Use two commas not one in `@ref`.
- Don’t add anchors to headings. We don’t need them, now that we make internal links use the node.
- Avoid duplicate node names.
- Improve menus. Properly handle the case where the node name is different from the descriptive title.
-
Texinfo template: add variables for filename and version.
-
Typst reader:
-
Fix an incomplete pattern match (#9807).
-
Handle inline bodies ending in a parbreak. E.g.
`#strong[ test ]
-
-
ConTeXt template: remove `\setupbackend[export=yes]` (#9820).
-
Docx writer:
- Omit `jc` attribute on table cells with AlignDefault (#5662).
- Better formatting for task lists. Task lists are now properly formatted, with no bullet (#5198).
- Replace an expensive generic traverse to remove Space elements, for better performance.
- The new OpenXML template had spaces for metadata that need to be filled with OpenXML fragments with the proper shape. This patch ensures that everything is the right shape.
- Wrap figures with `id` in a bookmark (#8662).
- Add eastAsia font hints to `w:r` (#9817). We do this when the text in the run contains any CJK characters. This ensures that ambiguous code points (e.g. quotation marks) will be represented as “wide” characters when together with CJK characters.
- Clean up Abstract Title and Subtitle in default reference docx. Center Subtitle, remove color.
- Allow OpenXML templates to be used with `docx` (#8338, #9069, #7256, #2928). The `--reference-doc` option allows customization of styles in docx output, but it does not allow one to adjust the content of the output (e.g., changing the order in which metadata, the table of contents, and the body of the document are displayed), or to add boilerplate text before or after the document body. For these changes, one can now use `--template` with an OpenXML template. (See the default `openxml` template for a sample.) `--include-before-body` and `--include-after-body` can also now be used with `docx` output. The included files must be OpenXML fragments suitable for inclusion in the document body.
- New unexported module Text.Pandoc.Writers.Docx.OpenXML.
-
HTML writer:
-
Ensure URI escaping needed for `html4` (#9905). Unicode characters need not be escaped for html5, and still won’t be.
-
Don’t emit unnecessary classes in HTML tables (#9325, Thomas Soeiro). Pandoc used to emit a `header` class on the `tr` element that forms the table header. This is no longer needed, because `thead > tr` will do the same thing. Similarly, pandoc used to emit `even` and `odd` classes on `tr`s, allowing striped styling. This is no longer needed, because one can use e.g. `tbody tr:nth-child(2n)`.

Compatibility warning: users who relied on these classes to style tables may need to adjust their CSS.
-
-
JATS writer:
- Support `supplementary-material` in metadata for `jats_articlepublishing` (#9818).
-
LaTeX writer:
- New method for ensuring images don’t overflow (#9660). Previously we relied on graphicx internals and made global changes to Gin to force images to be resized if they exceed textwidth. This approach is brittle and caused problems with `\includesvg` (see #9660). The new approach uses a new macro `\pandocbounded` that is now defined in the LaTeX template. (Thanks here to Falk Hanisch in mrpiggi/svg#60.) The LaTeX writer has been changed to enclose `\includegraphics` and `\includesvg` commands in this macro when they don’t explicitly specify a width or height. In addition, the writer now adds `keepaspectratio` to the `\includegraphics` or `\includesvg` options if `height` is specified without width, or vice versa. Previously, this was set in the preamble as a global option. Users should attend to the following compatibility issues:
  - If custom templates are used with the new LaTeX writer, they will have to be updated to include the new `\pandocbounded` macro, or an error will be raised because of the undefined macro.
  - Documents that specify explicit dimensions for an image may render differently, if the dimensions are greater than the line width or page height. Previously pandoc would shrink these images to fit, but the new behavior takes the specified dimensions literally. In addition, pandoc previously always enforced `keepaspectratio`, even when width and height were both specified, so images with width and height specified that do not conform to their intrinsic aspect ratio will appear differently.
- Task lists must be unordered (#9185).
- Specify language option for `selnolig` and only include it if `english` or `german` is used (#9863). (This includes changes to the LaTeX template.) This should restore proper ligature suppression when lualatex is used.
- Fix `--toc-depth` with beamer output (#9861). Previously only top-level sections were ever included in the TOC, regardless of the setting of `--toc-depth`.
- Use `\linewidth` instead of `\columnwidth` or `\textwidth` for resizing figures, table cells, etc. in LaTeX (#9775). `\linewidth`, unlike the others, is sensitive to indented environments like lists.
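
The image rules described above for the LaTeX writer can be modelled in a short Python sketch. It assumes that "no explicit width or height" means neither dimension is given; the function name `latex_image` and its parameters are hypothetical, and the real writer emits additional options that are omitted here.

```python
def latex_image(path, width=None, height=None, command=r"\includegraphics"):
    """Model of the described output for an image in LaTeX.

    - Neither width nor height given: wrap the command in \\pandocbounded.
    - Exactly one of them given: add keepaspectratio to the options.
    - Both given: take the dimensions literally.
    """
    options = []
    if width is not None:
        options.append(f"width={width}")
    if height is not None:
        options.append(f"height={height}")
    if (width is None) != (height is None):
        options.append("keepaspectratio")
    option_text = f"[{','.join(options)}]" if options else ""
    call = f"{command}{option_text}{{{path}}}"
    if width is None and height is None:
        return rf"\pandocbounded{{{call}}}"
    return call


print(latex_image("fig.png"))
print(latex_image("fig.png", width="3in"))
print(latex_image("fig.png", width="3in", height="2in"))
```

A custom template that lacks a definition of `\pandocbounded` would fail on the first form, which is the compatibility issue noted above.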
-
LaTeX template: put `babel-lang` in options to beamer (#9868). This is required to make beamer use proper localized terms for things like “Section.”
-
Markdown writer:
-
Typst writer:
- Support ‘.typst:no-figure’ and ‘typst:figure:kind=kind’ attributes (#9778, Carlos Scheidegger). This extends support for fine-grained properties in Typst. If the `typst:no-figure` class is present on a Table, the table will not be placed in a figure. If the `typst:figure:kind` attribute is present, its value will be used for the figure’s `kind` (#9777). These features are documented in `doc/typst-property-output.md`.
-
Typst template:
-
Textile writer:
- Get rid of header, odd, even classes on `tr` (#9376).
-
Text.Pandoc.Class:
- `fillMediaBag`: Convert IOErrors to warnings when fetching absolute paths (#9859, Albert Krewinkel). This will allow many conversions that would have failed with an error to succeed (albeit without images or other needed resources).
-
Text.Pandoc.ImageSize:
- Don’t prefer exif width/height when they conflict with image width/height (#9871). That was a mistaken call in #6936. Usually when these values disagree, it is because the image has been resized by a tool that leaves the original exif values the same, so the width/height metadata are more likely to be correct than exif width/height.
-
Text.Pandoc.SelfContained:
- Strip CRs from XML before base64 encoding for data URI (so tests can work on Windows).
- Only create `<svg>` elements for SVG images when the image has the class `inline-svg`. Otherwise just use a `data` URI as we do with other images (#9787).
-
Lua subsystem (Albert Krewinkel):
- Split Init module into more modules. The module has grown unwieldy and is therefore split into three internal Haskell modules, `Init`, `Module`, and `Run`.
- Add function `pandoc.utils.run_lua_filter` (#9803).
- Add function `pandoc.template.get` (#9854, co-authored by Carsten Gips). The function allows one to specify a template with the same argument value that would be used with the `--template` command line parameter.
- Keep CommonState object in the registry. The state is an internal value and should be treated as such. The `PANDOC_STATE` global is merely a copy...
pandoc 3.2
-
Change to `--file-scope` behavior (#8741): previously a Div with an identifier derived from the filename would be added around the contents of each file. This caused problems for “chunking” files into chapters, e.g. in EPUB. We no longer add the surrounding Div. This cooperates better with chunking. Note, however, that if you have relied on the old behavior to link to the beginning of the contents of a file using its filename as identifier, that will no longer work.
-
Markdown reader:
- Allow repeated labels in numbered example lists. Previously if you tried to use the same label as an earlier example list item, you’d get a new number, not the old one, and references to the label would go to the second occurrence. Now an existing label will be reused, and no new number will be generated. Caveat: this only works reliably when the re-used example list item occurs by itself in a list, or occurs in a list of previously used example list items that occur in exactly the same order as previously.
- Fix `normalCite` so it doesn’t consume past a closing `]` boundary (#9710). This was causing an exponential performance bug on long lists of links containing potential emphasis characters.
- Generalize `inlinesInBalancedBrackets` to `inBalancedBrackets`, with a parameter for the inner parser.
- Auto-close unclosed divs (#9635). This applies to both fenced and HTML-ish varieties. Otherwise we face an exponential performance problem with backtracking. A warning is issued when a div is implicitly closed.
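
The label reuse for numbered example lists described above can be sketched as a simple numbering function. This Python model covers only the numbering rule (a repeated label gets its earlier number and does not advance the counter); the function name `number_examples` is hypothetical, and the caveat about list ordering is not modelled.

```python
def number_examples(labels):
    """Assign numbers to example list items.

    `labels` holds one entry per item: a label string, or None for an
    unlabelled item. A label that was seen before reuses its number,
    and no new number is generated for it.
    """
    numbers = {}
    result = []
    counter = 0
    for label in labels:
        if label is not None and label in numbers:
            result.append(numbers[label])
            continue
        counter += 1
        if label is not None:
            numbers[label] = counter
        result.append(counter)
    return result


# (@good), (@bad), (@), then (@good) again in a later list
print(number_examples(["good", "bad", None, "good"]))
```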
-
RST reader:
- Fix `figclass` and `align` annotations for figures (#7473, Gokul Rajiv).
-
LaTeX writer:
-
LaTeX reader:
-
LaTeX template: add `titlegraphicoptions` variable (#9207, Guilhem Saurel).
-
Docx reader:
-
RTF reader:
- Don’t try to handle non-default code pages (#9683). Emit a warning instead.
-
OpenDocument writer:
- Implement custom-style for spans (#9657).
-
Typst writer:
- Add blank line in definition lists with multiple definitions (see #9704).
- Property output (#9648, Gordon Woodhull). The Typst writer will pass on specially marked attributes as raw Typst parameters on selected elements. This allows extensive customization using filters. A separate document (`doc/typst-property-output.md`) has been added that provides extensive documentation and examples of the use of this feature.
-
Markdown writer:
- Don’t try to align columns in pipe tables with lines greater than COLUMNS. The alignment just reduces readability when the lines soft wrap.
- Don’t use `raw_attribute` syntax for raw blocks, unless there is no other option (see #9677). Macros in a `raw_attribute` block don’t get interpreted when it is read again by pandoc’s markdown reader.
-
ConTeXt writer:
- Replace deprecated `\sc` with `\setsmallcaps` (#9518, James P. Ascher).
-
Docx writer:
- Use conventional styles/indents for Word bullet lists (#7280).
-
`reference.docx`:
- Use current standard Word theme (#7280). This includes using the sans-serif font Aptos instead of the serif font Cambria, and default colors for headings.
- Remove duplicate `DefaultParagraphFont` in `styles.xml`.
-
New module Text.Pandoc.Transforms [API change] (Albert Krewinkel). This module exports the following functions which were formerly exported from Text.Pandoc.Shared: `headerShift`, `filterIpynbOutput`, `eastAsianLineBreakFilter`, as well as some functions that were previously not exported.
-
Text.Pandoc.Shared:
- `headerShift`, `filterIpynbOutput`, and `eastAsianLineBreakFilter` are no longer exported from this module; they are now exported from Text.Pandoc.Transforms (Albert Krewinkel).
-
Text.Pandoc.Error:
- Improve reporting of unsupported extensions errors (#9247, Albert Krewinkel).
-
Text.Pandoc.App:
- Move “transforms” after filters (#9664). This will mean that `--shift-heading-level-by` affects a heading added by `reference-section-title`.
-
Text.Pandoc.App.CommandLineOptions:
- Simplify output for `OptVersion`. Omit the information about versions of dependencies. We no longer emit version info at this level anyway; `pandoc-cli` intercepts and handles `--version`. This code would only be called if someone used the pandoc library function `handleWithOptInfo` in their own program.
-
Text.Pandoc.ImageSize:
- Export `ImageSize` datatype.
-
Text.Pandoc.SelfContained:
- Merge class attribute when both img and svg specify it (#9652, Carlos Scheidegger).
-
Text.Pandoc.Logging:
- Add `ScriptingInfo` constructor for `LogMessage` [API change] (Albert Krewinkel).
- Make `DocxParserWarning` a WARNING, not INFO [API change].
- Add `UnsupportedCodePage` constructor to `LogMessage` [API change].
- Add `UnclosedDiv` constructor for `LogMessage` [API change].
-
Lua subsystem (Albert Krewinkel):
- Add a `pandoc.log` module.
- Update to pandoc-lua-marshal version 0.2.7 (#8916). This fixes counterintuitive behavior of the `content` property on BulletList and OrderedList items. Unmarshalling of that field now matches the behavior of the constructor.
- Use newest zip module. This adds a `symlink` function to Entry objects, making it possible to check whether an entry represents a symbolic link.
- Improve `pandoc.json.decode` docs.
- Update and fix docs for `pandoc.types.Version` and `pandoc.utils.type`.
- Add new module `pandoc.image`. The module provides basic querying functions for image properties.
- Bump pandoc-lua-engine to 0.2.1.4.
-
Use latest KaTeX CDN asset (#9707, Salim B).
-
`pandoc-cli`: ensure UTF8 when emitting version info.
-
tools/update-lua-module-docs.lua: improve script-internal docs, cleanup (Albert Krewinkel).
-
Allow network 3.2.
-
Use latest versions of texmath, djot, skylighting-core, skylighting.
-
Fix command test for #9652.
-
Fix some typos in code comments (#9638, guqicun).
-
Command tests: include regular PATH after directory with the test executable (ensures that DLLs will be found on Windows).
-
MANUAL.txt:
- Document `handout` variable for beamer (#9742).
- Document formats affected by `--slide-level` (#9745).
- Update the list of required LaTeX packages (#9728, Albert Krewinkel).
- Use more descriptive link text for ODT (#9673).
- Add clarification about `toc-title` in `docx`, `pptx` (#9645).
- Better document truthiness for conditionals (#9661).
- Mention that `custom-style` works with ODT (Ian Max Andolina).
- Harmonize typographic dashes (#9688, Salim B). Standardize on `---` with no space.
-
INSTALL.md: Minor tweaks (#9705, Leo Heitmann Ruiz).
pandoc 3.1.13
-
Org reader:
- Fix treatment of `id` property under heading (#9639).
-
DocBook reader:
- Add empty title to admonition div if not present (#9569). This allows admonition elements (e.g. `<note>`) to work with `gfm` admonitions even if the `<title>` is not present.
-
DokuWiki reader:
-
Typst reader:
- Support Typst 0.11 table features: col/rowspans, table head and foot (#9588).
- Parse cell col/rowspans.
-
CSLJson writer:
- Put `$` or `$$` around math in `csljson` output (#9616).
-
ConTeXt writer:
- Fix options order with `\externalfigure`. The dimensions should come after the class if both are present.
-
Typst writer:
- Put label after Span, not before. Labels get applied to preceding markup item.
- Support Typst 0.11 table features (#9588): colspans, rowspans, cell alignment overrides, relative column widths, header and footer, multiple table bodies with intermediate headers. Row heads are not yet supported.
- The default typst template has been modified so that tables don’t have lines by default. As is standard with pandoc, we only add a line under a header or over a footer. However, a different default stroke pattern can easily be added in a template.
- More reliable escaping in inline `[..]` contexts (#9586). For example, we need to escape `[\1. April]` or it will be treated as an ordered list.
- Handle `unnumbered` on headings (#9585).
-
LaTeX writer:
- Fix math inside strikeout (#9597).
-
Text.Pandoc.Writers.Shared:
- Export `isOrderedListMarker` [API change].
-
Change lhs tests so they don’t use `--standalone`. This will avoid test failures due to minor changes in skylighting versions, e.g. #9589.
-
Use latest texmath, typst.
-
Require pandoc-lua-marshal 0.2.6 (#9613, Albert Krewinkel). Fixes an issue arising when the value of `content` properties on BlockQuote, Figure, and Div elements was an empty list.
-
Update lua-filters.md (#9611, Carlos Scheidegger).
pandoc 3.1.12.3
-
Markdown reader: Fix bug with footnotes at end of fenced div (#9576).
-
LaTeX reader:
- Improve tokenization of `@` (#9555). Make tokenization sensitive to `\makeatletter`/`\makeatother`. Previously we just always treated `@` as a letter. This led to bad results, e.g. with the sequence `\@`. E.g., `a\@ b` would parse as “ab” and `a\@b` as “a”.
- Make `withRaw` work inside `parseFromToks` (#9517). This is needed for raw environments to work inside table cells.
- Better handling of table colwidths (#9579). Previously the parser just failed if the column width specified in `p{}` wasn’t a multiple of `\linewidth`. This led to cases where content was skipped.
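The `@` tokenization issue (#9555) can be illustrated with a toy tokenizer. This is a minimal Python sketch, not pandoc's Haskell implementation: the function name `tokenize` and the `at_is_letter` parameter are made up for illustration, and only the standard TeX rule is assumed (a control word is a backslash followed by a run of letters and swallows trailing spaces, while a control symbol is a backslash followed by one non-letter).

```python
def tokenize(src, at_is_letter=False):
    """Split LaTeX source into control sequences and single characters.

    `at_is_letter` is the initial state; \\makeatletter and \\makeatother
    toggle it while scanning (hypothetical helper, for illustration only).
    """
    tokens = []
    i = 0
    while i < len(src):
        if src[i] != "\\":
            tokens.append(src[i])
            i += 1
            continue
        j = i + 1

        def is_letter(c):
            return (c.isascii() and c.isalpha()) or (c == "@" and at_is_letter)

        if j < len(src) and is_letter(src[j]):
            # Control word: consume all letters, then skip following spaces.
            while j < len(src) and is_letter(src[j]):
                j += 1
            name = src[i:j]
            while j < len(src) and src[j] == " ":
                j += 1
        else:
            # Control symbol: backslash plus exactly one character.
            j = min(j + 1, len(src))
            name = src[i:j]
        tokens.append(name)
        if name == "\\makeatletter":
            at_is_letter = True
        elif name == "\\makeatother":
            at_is_letter = False
        i = j
    return tokens


# Old behavior (always a letter): the space is swallowed, or "b" joins the name.
print(tokenize(r"a\@ b", at_is_letter=True))   # ['a', '\\@', 'b']
print(tokenize(r"a\@b", at_is_letter=True))    # ['a', '\\@b']
# Sensitive to \makeatletter: \@ is a control symbol by default.
print(tokenize(r"a\@ b"))                      # ['a', '\\@', ' ', 'b']
```

With `@` always treated as a letter, `a\@ b` loses its space and `a\@b` absorbs the `b` into an unknown control word, which matches the “ab” and “a” results described above.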
-
Typst writer:
- Add ‘kind’ parameter to figures with tables (#9574).
- Avoid unnecessary box around image in figure (#9236).
- Omit width/height in images unless explicitly specified (#9236). Previously we computed width/height for images that didn’t have size information, because otherwise typst would expand the image to fit page width. This typst behavior has changed in 0.11. This change fixes a bug in which images would sometimes overflow page margins, depending on their intrinsic size.
- Don’t add hard-coded `inset` to tables (#9580). Instead, set this globally in the default template, allowing it to be customized.
-
LaTeX template: Fix block headings support for unnumbered paragraphs (#9542, #6018, Oliver Fabel).
-
HTML templates: Replace polyfill provider (#9537, @SukkaW). Replace polyfill.io with cdnjs.cloudflare.com/polyfill. polyfill.io has been acquired by Funnull, and the service has become unstable.
-
Korean translations: delete colon in translation for ‘to’. This was invalid YAML, and not desired anyway, since a colon is added.
-
Use latest commonmark, commonmark-extensions. This fixes a 3.1.12 regression in parsing of commonmark/gfm autolinks (jgm/commonmark-hs#151).
-
Depend on djot 0.1.1.3, which fixes a serious parsing bug affecting regular paragraphs after lists.
-
Depend on latest skylighting, skylighting-core, typst-hs, texmath.
-
MANUAL.txt: Change broken link to IDML cookbook (#9563).
pandoc 3.1.12.2
-
Docx reader:
-
Markdown reader: fix regression in link parsing with wikilinks extensions (#9481). This fixes a regression introduced in 3.1.12.
-
Org reader/writer: support admonitions (#9475).
-
Org writer: omit extra blank line at end of quote block.
-
Typst writer: ensure that `-`, `+`, etc. are escaped at beginning of block (#9478). Our recent relaxing of escaping (#9386) caused problems for things like emphasized `-` characters that were rendered using `#strong[-]#`. This now gets rendered as `#strong[\-]`.
-
LaTeX writer: fix bug when a language is specified in two different ways (#9472). If you used `lang: de-DE` but then had a span or div with `lang=de`, the preamble would try to load `ngerman` twice, leading to an error. This fix ensures that a language is only loaded once.
-
Docx writer: Don’t copy over `footnotePr` in `settings.xml` from reference.docx (#9522).
-
EPUB writer: omit EPUB2-specific meta tag on EPUB3 (#9493). This caused a validation failure in epubs with cover images.
-
Lua: avoid crashing when an error message is not valid UTF-8 (Albert Krewinkel).
-
Text.Pandoc.SelfContained:
- Add `role="img"` to svgs.
- Add `aria-label` to svg elements with `alt` text if present. Screen readers ignore `alt` attributes on svg elements but do pay attention to `aria-label` (#9525).
-
Text.Pandoc.Shared: Fix regression in section numbering in `makeSections` (#9516). Starting with pandoc 3.1.12, unnumbered sections incremented the section number.
-
Text.Pandoc.Class: fix `openUrl` TLS negotiation (#9483). With the release of TLS 2.0.0, the TLS library started requiring Extended Main Secret for the TLS handshake. This caused problems connecting to zotero’s server and others that do not support TLS 1.3. This commit relaxes this requirement.
-
Depend on djot 0.1.1.0 (fixes rendering on multiline block attributes).
-
Use new releases of skylighting-format-blaze-html (#9520). Fixes auto-wrapping of long source lines in HTML print media.
-
Use new commonmark-extensions (fixes issue with the `rebase_relative_paths` extension when used with commonmark/gfm).
-
Makefile: improve epub-validation target (#9493). Use `--epub-cover-image` to catch issues that only arise with that.
pandoc 3.1.12.1
-
EPUB writer: omit EPUBv3-specific accessibility features on epub2 (#9469). Fixes a regression in 3.1.12.
-
More fixes for SVG ids with `--self-contained` (#9467). This generalizes the fix to #9420 so it applies to things like `style="fill(url(#..."` and should fix problems with SVGs including gradients.
-
Powerpoint writer: properly handle math in headings and tables (#9465). This ensures that paragraphs containing math are wrapped in a `mc:AlternateContent` node as required.
-
Makefile: make validate-epub check v2 output too.
pandoc 3.1.12
-
Add `djot` as input and output format. Djot is a light markup syntax (https://djot.net).
- New module Text.Pandoc.Readers.Djot [API change]. The function `readDjot` is also exported by Text.Pandoc.Readers.
- New module Text.Pandoc.Writers.Djot [API change]. The function `writeDjot` is also exported by Text.Pandoc.Writers.
-
`--number-sections` now uses the first digit for the number of the top-level section, no matter what its level. So if the top-level section is level-2, numbers will be `1`, `2`, etc. rather than `0.1`, `0.2`, as in the past (#5071). For some backwards compatibility, we revert to the old behavior when the `--number-offset` option is used.
-
DocBook reader:
- Better handling of `<procedure>` and `<substeps>` (#9341): `<procedure>` now gets parsed as an ordered list, and `<substeps>` as a sublist.
-
Man reader:
- Move spaces outside of emph/strong (#9445).
-
MediaWiki reader:
-
BibTeX reader:
- Support `pagetotal` in converting BibLaTeX.
-
Markdown reader:
- Fix wikilinks extensions to allow newlines in titles (#9454).
-
EPUB reader:
- Don’t put `#` characters in identifiers.
-
LaTeX reader:
-
Typst reader:
- Fix handling of `\overline` (#9294). Due to a typo, it was being incorrectly rendered as an `\underset`.
- Improve handling of inline `#quote` (#9413).
- Fix handling of `dot()`, `tilde()`, `ddot()` (jgm/typst-hs#38).
- Fix character used for `norm` (jgm/typst-hs#38).
-
Typst writer:
- Use reference form (e.g. `@jones2000[p. 30]`) for citations when possible.
- Use `#ref` or `@` for links with `reference-type="ref"` (#7463). This attribute is added to LaTeX `\cref`, for example.
- Improve citation support (#9452). Emit `form: "prose"` or `form: "year"` qualifiers if the citation is author-in-text or suppress-author. Strip initial comma from suffix, since typst will add an extra one.
- Unescape URI escapes in image paths (#9389).
- Handle labels and citation ids with spaces and other special characters (#9387). In these cases, we produce an explicit `label()` rather than using `<>` or `@`.
- Avoid producing illegal labels (#9387).
- Avoid unnecessary escapes (#9386).
-
LaTeX writer:
-
HTML writer:
- Add suffix to multiple footnote section ids, so they are unique (Sam May). This is necessary when `--reference-location` is `block` or `section`.
-
EPUB writer:
- Add ARIA roles for accessibility (#9378, Iacobus1983). Footnote references are given role “doc-noteref”, footnote text gets “doc-footnote”, and nav gets “doc-toc”.
- Ensure that an alt attribute is always added (#9354). This seems to be required by iBooks; even an empty alt attribute will satisfy it.
- Add `xml:lang` to package element (#9372).
- Add accessibility metadata to EPUB metadata (#9372, #9400, Iacobus1983 and John MacFarlane). Reasonable default values are used to ensure that pandoc’s EPUBs conform to the EU Accessibility Act requirements, but values can be overridden using metadata.
-
Docx writer:
-
Man writer:
-
Org writer:
- Escape special lines in code blocks (#9218, Albert Krewinkel).
-
Markdown writer:
- Use different width fences for nested divs (#9450). Outer divs have longer fences. This aids clarity for the reader, making it easier to see where the div ends. It also makes the output compatible with some other implementations, e.g. micromark, which require different-width fences for nesting.
- Fix output for pipe tables with a huge number of columns (#9346). Previously we got invalid pipe tables when the number of table columns exceeded the setting of `--columns`.
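The nested-div fence rule (#9450) can be sketched as follows. This is an illustrative Python model, not the writer's actual code: a div is represented as a hypothetical `(class_name, children)` tuple, and the choice of three colons for the innermost div plus one per enclosing level is an assumption consistent with "outer divs have longer fences".

```python
def height(div):
    """Number of div levels in this subtree (1 for a div with no nested divs)."""
    _, children = div
    nested = (height(c) for c in children if isinstance(c, tuple))
    return 1 + max(nested, default=0)


def render(div):
    """Render a (class_name, children) div; children are strings or divs."""
    name, children = div
    fence = ":" * (height(div) + 2)  # innermost div gets ":::"
    body = "\n\n".join(
        render(c) if isinstance(c, tuple) else c for c in children
    )
    return f"{fence} {name}\n{body}\n{fence}"


print(render(("outer", ["intro", ("inner", ["text"])])))
```

Because the closing fence of each div is as long as its opening fence and longer than any fence inside it, a reader (or a stricter parser) can tell which div a closing fence belongs to.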
-
Powerpoint writer:
- Fix regression in layout for slides with figures (#9442).
- Use internal column widths in pptx writer tables (#5706, Tomas Dahlqvist). The table writer used to only divide all available width evenly for all columns. In this update the code uses the incoming widths if they are available. If they are not set the earlier even distribution is used. Some of the golden templates are adjusted slightly because of different rounding when using the new calculation model.
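The column-width change for pptx tables (#5706) amounts to a fallback rule. The Python sketch below is an assumption-laden illustration: `column_widths`, its parameters, and the normalization by the sum of the relative widths are hypothetical and not taken from the writer's source.

```python
def column_widths(total, rel_widths):
    """Distribute `total` width units across columns.

    `rel_widths` holds one relative width per column, or None where the
    source table did not specify one (hypothetical representation).
    """
    n = len(rel_widths)
    if n == 0:
        return []
    if all(w for w in rel_widths):
        # All widths are set: scale them to fill the available width.
        scale = sum(rel_widths)
        return [round(total * w / scale) for w in rel_widths]
    # Otherwise fall back to the earlier even distribution.
    return [total // n] * n


print(column_widths(9000, [0.5, 0.25, 0.25]))  # [4500, 2250, 2250]
print(column_widths(9000, [None, None, None]))  # [3000, 3000, 3000]
```

Small rounding differences of the kind this sketch produces are the sort of thing that required adjusting the golden test files.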
-
Custom writers:
- Fix handling of common state (#9229, Albert Krewinkel). The CommonState (`PANDOC_STATE` in Lua) may change between the time that a custom writer script is first loaded and when the writer is run. However, the writer was always using the initial state, which led to problems, e.g. when the mediabag was updated in a filter, as those updates were not visible to the writer. The state is now updated right before the writer function runs.
-
Text.Pandoc.SelfContained:
-
ConTeXt template: support font fallback (#9361, Lawrence Chonavel).
-
Text.Pandoc.Shared:
- `addPandocAttributes`: use `wrapper` attribute, not `wrap`, for Divs and Spans added as wrappers to hold attributes on elements that do not accept them.
- `makeSections` behavior changes:
  - When the optional base level parameter is provided, we no longer ensure that the sequence of heading levels is gapless (#9398). Instead, we set the lowest heading level to the specified base level, and adjust the others accordingly. If an author wants to skip a level, e.g. from level 1 to level 3, they can do that. In general, the heading levels specified in the source document are preserved; `makeSections` only puts them into a hierarchical structure.
  - Section numbers are now assigned differently, as described above under `--number-sections` changes (#5071).
- Improve `makeSections` code for section number calculation.
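The two `makeSections` behavior changes can be modelled on a flat list of heading levels. This Python sketch is only an approximation of the rules described above, not the Haskell implementation: `rebase_levels` and `number_sections` are hypothetical names, and "top-level section" is assumed to mean the lowest heading level present in the document.

```python
def rebase_levels(levels, base):
    """Shift heading levels so the lowest equals `base`; gaps are preserved."""
    if not levels:
        return []
    shift = base - min(levels)
    return [level + shift for level in levels]


def number_sections(levels):
    """Dotted section numbers where the top-level section gets the first digit."""
    if not levels:
        return []
    top = min(levels)  # assumption: the lowest level present is "top-level"
    counters = []
    numbers = []
    for level in levels:
        depth = level - top + 1
        if len(counters) < depth:
            counters.extend([0] * (depth - len(counters)))
        else:
            del counters[depth:]  # leaving deeper sections resets their counters
        counters[depth - 1] += 1
        numbers.append(".".join(str(n) for n in counters))
    return numbers


print(rebase_levels([2, 4, 3], 1))    # [1, 3, 2]
print(number_sections([2, 3, 3, 2]))  # ['1', '1.1', '1.2', '2']
```

In the second call the document starts at level 2, yet the numbers begin at `1` rather than `0.1`, which is the change described under `--number-sections`.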
-
Text.Pandoc.Chunks:
- Autogenerate unique ids for sections missing them (#9383). This is needed for TOC generation to work properly. We can’t create TOC links if there are no ids. This fixes some EPUB validation issues we’ve been getting since switching over to Chunks for chunking.
- Improve `fixTOCTreePaths`. We weren’t adding ids for section headings that don’t head a chunk, but these headings are needed for a TOC.
-
Lua: catch encoding error in `pandoc.read` (#9385, Albert Krewinkel). Fixed a bug that could lead to an un-catchable error and program termination when `pandoc.read` was called with invalid UTF-8 input.
-
LaTeX template: support font fallback (lawcho). This support is LuaLaTeX-specific. See MANUAL.txt for documentation.
-
Text.Pandoc.Readers: Add `readMan` to exports [API change] (George Stagg).
-
Text.Pandoc.PDF:
- Reliably detect when TOC has changed (#9295). Sometimes the TOC changes but there are no warnings: this happens when no labels are present. In this case we must rerun LaTeX. So we now take the SHA1 hash of the TOC file and rerun LaTeX if it changes between runs.
- Increase maximum number of LaTeX runs to 4 (#9299). On some documents, 4 runs are needed (e.g. when a LastPage reference is used).
- Avoid `readFileLazy`, which caused improperly cleaned-up temp directories on Windows (#9460).
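The rerun logic described for TOC changes (#9295, #9299) can be sketched as a loop that compares SHA1 digests of the TOC file between runs. This is a Python illustration under stated assumptions, not the code in Text.Pandoc.PDF: `run_latex` and `read_toc` are hypothetical callbacks, where `run_latex()` returns whether the log asked for a rerun and `read_toc()` returns the TOC file's bytes or `None` if it does not exist.

```python
import hashlib


def toc_digest(read_toc):
    """SHA1 hex digest of the TOC contents, or None if there is no TOC file."""
    data = read_toc()
    return None if data is None else hashlib.sha1(data).hexdigest()


def run_until_stable(run_latex, read_toc, max_runs=4):
    """Run LaTeX until no rerun is requested and the TOC hash is unchanged.

    Returns the number of runs performed (at most `max_runs`).
    """
    before = toc_digest(read_toc)
    for run in range(1, max_runs + 1):
        wants_rerun = run_latex()
        after = toc_digest(read_toc)
        if not wants_rerun and after == before:
            return run
        before = after
    return max_runs
```

Comparing hashes catches the case mentioned above where the TOC changes but the log contains no rerun warning, for example when no labels are present.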
-
MANUAL.txt:
-
Makefile: Validate generated EPUB as part of prerelease checks.
-
Add validation for docx golden files to CI (Edwin Török).