Releases: jgm/pandoc

pandoc 3.5

05 Oct 21:13
@jgm jgm
  • Add command-line options --list-of-figures/--lof and --list-of-tables/--lot (#10029, Akash Patel). Only docx, latex, and context are affected by these options currently. Setting the lof and lot variables will also work for the formats that are currently supported.

  • Defaults files: interpolation of environment variables now works for to and from fields (#8024). This is needed because these files can contain paths of custom readers/writers.

  • Docx reader:

    • Reset lists after headers in same list numId (#10258). To accomplish this, we add a Heading constructor to BodyPart and include on it all the information list items have.
  • DocBook reader:

    • Parse id, class, and tabstyle on tables (#10181, Erik Rask). Add parsing of id (xml:id), class, and tabstyle XML attributes for table and informaltable in the DocBook reader. The tabstyle value is put in the ‘custom-style’ attribute.
  • Dokuwiki reader:

    • Be more forgiving about misaligned lists, like dokuwiki itself (#8863).
    • Improve blockquote parsing in dokuwiki. Allow for quoted code blocks.
    • Enable smart extension.
    • Properly parse -- and --- as dashes.
    • Fix block quote behavior (#6461). Blockquotes are not really block containers in DokuWiki; the lines are interpreted literally (so, e.g., you can’t start a list), and line breaks are added at the ends.
  • EPUB reader:

    • Fix links to other files in the EPUB, making them internal links to a fragment derived from the filename (#10207). There was already code to handle links like #foo, but not to handle links like ch0001.html#foo.
  • LaTeX reader:

    • Add em, ex, px, mu to list of units for dimension args (#10212).
  • ANSI writer:

    • Fix subscripts (Evan Silberman).
  • DokuWiki writer:

    • Don’t emit <HTML> tags (#7413). The use of these tags is now strongly discouraged for security reasons, and will be removed. We previously used them as a fallback for lists that could not be represented using DokuWiki syntax, e.g. ordered lists with fancy numbers or lists with multiple blocks in their items. We also used them for block quotes with multiple blocks as their contents. We now use the <WRAP> syntax (from the optional WRAP plugin) to handle lists with multiple blocks as their contents. A new method of handling block quotes with complex contents has the side benefit of also handling nested block quotes, which weren’t supported before. <HTML> and <html> tags are only for raw HTML blocks and inlines, and only if the raw_html extension is enabled. (It is now a valid extension for dokuwiki, though off by default.)
  • Docx writer:

    • Support --list-of-figures and --list-of-tables (or lof and lot variables) (Akash Patel).
  • HTML writer:

    • Don’t emit missing title/lang warnings if the template does not contain the pagetitle or lang variables, respectively (#9370).
  • LaTeX writer:

    • Better fix for lists in definition lists (#10241). In commit a26ec96 we added an empty \item[] to the beginning of a list that occurs first in a definition list, to avoid having one item on the line with the label. This gave bad results in some cases (#10241) and there is a more idiomatic solution anyway: using \hfill.
    • Avoid error on refs div with empty citations (#10185). If there are no citations, don’t emit an empty CSLReferences environment.
  • RST writer:

    • Change bullet list hang from 3 to 2. This accords with the style in the RST reference docs.
    • Handle cases where indented context starts with block quote (#10236). In these cases we emit an empty comment to fix the point from which indentation is measured; otherwise the block quote is not parsed as a block quote. This affects list items and admonitions.
    • Don’t enclose the list table in a .. table::; this leads to doubled captions (#10226).
    • Fix alignment of list table items corresponding to cells (#10227).
  • JATS template:

    • Support floats-group (Albert Krewinkel, see #10196). The content of the floats-group variable is now rendered in a <floats-group> element when using the publishing or archiving tag sets.
  • LaTeX and Beamer templates:

    • Split old default.latex into two templates, default.latex and default.beamer, factoring common parts into partials: fonts.latex, common.latex, passoptions.latex, hypersetup.latex, after-header-includes.latex.
    • Make default.beamer the default template for beamer.
    • Add shorttitle, shortsubtitle, shortauthor, shortinstitute, shortdate variables to beamer template (#10248, Thomas Hodgson).
    • Make --number-sections work with beamer (#10245, Thomas Hodgson).
    • Support a list of images for titlegraphic in beamer template (#10246, Thomas Hodgson). Title graphic options will be applied to each title graphic. Images will be separated by \enspace.
    • Add theme options to beamer template: colorthemeoptions, fontthemeoptions, innerthemeoptions, outerthemeoptions (#10243, Thomas Hodgson).
    • Don’t load amsmath, amssymb in beamer template. These are loaded by beamer automatically.
  • Text.Pandoc.SelfContained:

    • Improve handling of links to remote CSS (#10261).
  • Text.Pandoc.Class:

    • Allow extracting data: URIs even in PandocPure (--sandbox) (#10249).
    • Export extractURIData [API change].
  • Text.Pandoc.PDF:

    • Read .toc and .log files from output directory (#10186). When this is different from the input directory, this is where .toc and .log files are written.
  • Text.Pandoc.Shared:

    • Modify addPandocAttributes for changes in commonmark-pandoc. The new commonmark-pandoc version automatically adds the attribute wrapper="1" on all Divs and Spans that are introduced just as containers for attributes that belong properly to their contents. So we don’t need to add the attribute here. This gives much better results in some cases. Previously the wrapper attribute was being added even for explicit Divs and Spans in djot, but it is not needed in these cases.
  • Text.Pandoc.Options:

    • Add writerListOfFigures and writerListOfTables fields to WriterOptions (#8245, Akash Patel). [API change]
  • Text.Pandoc.App:

    • Add optListOfFigures and optListOfTables to Opt (#8245) [API change].
  • Lua subsystem (Albert Krewinkel):

    • Update List module (#9835). The module now comes with a method :at(index[, def]) that accesses items by index, accepts negative indices to count from the end, and returns the def value as a default if the list has no item at the given position. Furthermore, the list constructor pandoc.List now accepts iterators. E.g., pandoc.List(text:gmatch '%S+') returns the list of words in text.

    • Support character styling via pandoc.layout. The Doc values produced and handled by the pandoc.layout module can now be styled using bold, italic, underlined, or strikeout. The style is ignored in normal rendering, but becomes visible when rendering to ANSI output. The pandoc.layout.render function now takes a third parameter that defines the output style, either plain or ansi.

    • It is now possible to return a single filter from a filter file, e.g.

      -- Switch single- and double quotes
      return {
        Quoted = function (elem)
          elem.quotetype = elem.quotetype == 'SingleQuote'
            and 'DoubleQuote' or 'SingleQuote'
          return elem
        end
      }

      The filter must not contain numerical indexes, or it might be treated as a list of filters.

    • Add list_of_figures and list_of_tables to writer options (Akash Patel).
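
The indexing rules of the new :at method described in the List module entry above can be modeled outside pandoc. The Python sketch below is only an illustration, not pandoc's implementation: the helper name `at` is hypothetical, and treating index 0 as out of range is an assumption the changelog does not state. It mirrors Lua's 1-based indexing.

```python
def at(items, index, default=None):
    """Model of List:at(index[, def]) with 1-based, Lua-style indices.

    Negative indices count from the end (-1 is the last item).
    Assumption: index 0 and any out-of-range index yield the default.
    """
    n = len(items)
    if index < 0:
        index = n + index + 1  # map -1 to n, -2 to n - 1, ...
    if 1 <= index <= n:
        return items[index - 1]
    return default


# Rough analogue of pandoc.List(text:gmatch '%S+'): build the list from an iterator.
words = list(iter("the quick brown fox".split()))

first = at(words, 1)            # first item
last = at(words, -1)            # last item
missing = at(words, 9, "n/a")   # position does not exist, default is returned
```

In an actual filter the equivalent calls would be made on a pandoc.List value; the sketch only shows how positive, negative, and missing positions relate.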

  • Use latest releases of commonmark, commonmark-pandoc, texmath, djot.

  • Stop depending on package SHA (Albert Krewinkel). Use crypton instead.

  • linux/make_artifacts.sh: add riscv64 support (Olivier Benz).

  • Fix invalid XML in test/docx/normalize.docx (#10242).

  • doc/lua-filters.md: list functions in pandoc.utils alphabetically (Albert Krewinkel).

  • MANUAL.txt:

    • Clarify use of beamerarticle variable (#10250).
    • Add clarification to address user issues like #6704 (Yehuda Katz).

pandoc 3.4

10 Sep 17:33
@jgm jgm
  • New output format: ansi (for formatted console output) (Evan Silberman). Most Pandoc elements are supported and printed in a reasonable way, if not always ideally. This version does no detection of terminal capabilities, nor does it fall back to different output styles for less-capable terminals.

  • Add command line options --table-caption-position and --figure-caption-position. These allow the user to specify whether to put captions above or below tables and figures, respectively. The following output formats are supported: HTML (and related such as EPUB), LaTeX (and Beamer), Docx, ODT/OpenDocument, Typst.

  • Change default --pdf-engine via HTML to WeasyPrint (#10142). wkhtmltopdf is deprecated. weasyprint is the easiest-to-install, maintained alternative. For better results, one might prefer pagedjs-cli.

  • Org reader:

    • Fix parsing of src blocks with an -i flag (#10071, Albert Krewinkel). Tabs are now preserved in the contents of src blocks if the block has the -i flag.
  • RTF reader:

    • Handle images inside shp contexts (#10145).
  • RST reader:

    • Improve simple table support (#10093). Multiline rows occur only when the first cell is empty; we were previously treating lines with any empty cell as row continuations. In addition, we no longer wrap multiline cells in Para if they can be represented as Plain. This is consistent with docutils behavior.
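
The row-continuation rule for RST simple tables can be sketched as follows. This is a simplified Python model, not pandoc's parser: the function name `merge_continuation_rows` and the representation of a table as lists of cell strings are assumptions made for illustration.

```python
def merge_continuation_rows(lines):
    """Merge table lines into rows.

    A line continues the previous row only when its FIRST cell is empty.
    A line with an empty cell elsewhere starts a new row.
    """
    rows = []
    for cells in lines:
        if rows and cells[0].strip() == "":
            previous = rows[-1]
            # Append non-empty continuation text to the matching cells.
            rows[-1] = [
                old + "\n" + new if new.strip() else old
                for old, new in zip(previous, cells)
            ]
        else:
            rows.append(list(cells))
    return rows


table_lines = [
    ["1", "Second column of row 1."],
    ["2", "Second column of row 2."],
    ["", "Second line of paragraph."],  # first cell empty: continuation
    ["3", ""],                          # empty cell, but not the first: new row
]
rows = merge_continuation_rows(table_lines)
```

Under the earlier behavior described in the entry, the last line would also have been treated as a continuation because it contains an empty cell.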

  • LaTeX reader:

    • Math environments don’t have bracketed options (#10160).
    • Parse nested tabular environments (#4746).
  • Typst reader:

    • Change how “block” elements are handled. Previously they were always parsed as divs. But actually they can occur in some “inline” contexts. Now we first try to parse them as inlines, and only as blocks if that fails. A surrounding Div or Span element is added only if there is an identifier.
  • HTML reader:

    • Only parse main element’s contents (if present) (#10140). If main has an id or class, we include a div with that id or class; otherwise just the contents.
    • Read TeX annotation in MathML content if present (#9971).
    • Better handle KaTeX-generated math (#9971). KaTeX emits the mathml followed by a span with an HTML fallback. Previously pandoc was converting both. We now ignore the HTML fallback span, marked with class katex-html.
  • New module: Text.Pandoc.Writers.ANSI [API change] (Evan Silberman).

  • Docx writer:

    • Add “SuppressAuthor” and “AuthorOnly” to citationMode when +citations is used (thomjur).
    • Support custom-style attribute for docx table (Sebbones).
    • Support --number-offset.
    • Make table/figure rendering sensitive to caption position settings.
  • OpenDocument writer:

    • Make table/figure rendering sensitive to caption position settings.
  • Typst writer/template:

    • Implement figure caption positions by triggering a show rule in the default template, which determines caption positions for figures and tables globally.
    • Don’t include trailing semicolon after @ style citations with suffixes (#10148).
    • Template: move header-includes before show doc (#9996, Gordon Woodhull).
  • LaTeX writer:

    • Make table/figure rendering sensitive to caption position settings (#5116).
    • Preserve locator labels with --natbib (#10057).
  • HTML writer/template:

    • Make <figcaption> placement sensitive to caption position settings. For tables, <caption> must be the first element, and positioning is determined by CSS, so here we set a variable to which the default template is sensitive.
    • Use makeSectionsWithOffsets for writerNumberOffsets, instead of the old, inefficient code.
    • Don’t add doc-biblioref role to every link in a citation; only to links to the bibliography (#10156).
    • Add data- when rendering label attribute (#10048).
  • Markdown writer:

    • Avoid emitting markdown caption if table has fallen back to raw HTML, which will then contain a <caption> tag (#10094).

    • Make math sensitive to tex_math_gfm extension (#9121). This means that in GFM output, the “new style” math will be used by default, e.g.

      $`x=y`$
      
      ```math
      x = y
      ```
      

      To defeat this and get the older behavior, namely

      $x=y$
      
      $$x=y$$
      

      one could use -t gfm-tex_math_gfm.

  • AsciiDoc writer:

    • Add link: prefix when needed (#10105). AsciiDoc requires it except for http, https, irc, mailto, ftp schemes.
    • Preserve original base level (#10062). We used to normalize so that the base level is always 1, but asciidoc no longer seems to care about that, and the behavior creates difficulties when we are converting fragments.
    • Don’t emit empty figure caption (#10047).
  • ODT writer:

    • Add TableCaption to styles.xml (#10058, Ian Max Andolina).
  • LaTeX template:

    • Fix wrong beamer color in (sub)section page (Jonathan).
  • Text.Pandoc.Options:

    • Add CaptionPosition and new WriterOptions fields writerFigureCaptionPosition and writerTableCaptionPosition [API change].
  • Text.Pandoc.App.Opt:

    • Change default for optNumberOffset to []. This behaves the same as [0,0,0,0,0].
    • Add Opt fields optFigureCaptionPosition and optTableCaptionPosition [API change].
  • Text.Pandoc.Format: change formatFromFilePaths so that it is smarter about URLs. URLs are parsed, and we take the format from the path component, if present (#10141). This means that https://emacs.org/ will be treated as HTML, while https://emacs.org/sample.org will be treated as Org.
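
The URL handling described for formatFromFilePaths can be approximated with the following Python sketch. It is a model under assumptions, not the Haskell implementation: the function name `format_from_source`, the small extension table, and the choice to return None for unknown local files are all hypothetical. Only the two URL outcomes (HTML for https://emacs.org/, Org for https://emacs.org/sample.org) come from the entry above.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical subset of an extension-to-format table.
EXTENSION_FORMATS = {
    ".org": "org",
    ".md": "markdown",
    ".html": "html",
    ".tex": "latex",
}


def format_from_source(source):
    """Guess an input format from a file path or URL.

    For URLs, only the path component is inspected, so a host name
    ending in ".org" is not mistaken for an Org file extension.
    """
    parsed = urlparse(source)
    is_url = parsed.scheme in ("http", "https")
    path = parsed.path if is_url else source
    extension = posixpath.splitext(path)[1].lower()
    if extension in EXTENSION_FORMATS:
        return EXTENSION_FORMATS[extension]
    # Assumption: a URL without a recognizable extension is treated as HTML.
    return "html" if is_url else None


site = format_from_source("https://emacs.org/")
document = format_from_source("https://emacs.org/sample.org")
```

The key design point is that the extension is taken after parsing the URL, so the top-level domain never takes part in the lookup.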

  • Text.Pandoc.URI:

    • Add unofficial gemini: to list of URI schemes (Pau RE).
  • Text.Pandoc.Shared:

    • Add makeSectionsWithOffsets [API change].
    • Remove stripEmptyParagraphs [API change] (Albert Krewinkel). This function is no longer used.
  • Text.Pandoc.Highlighting: Expose formatANSI [API change] (Evan Silberman).

  • Text.Pandoc.Writers.Shared: export to{Sub,Super}scriptInline [API change] (Evan Silberman).

  • Remove use of partial functions (e.g. head) in code.

  • Use latest skylighting-core, skylighting, doclayout, texmath, typst.

  • pandoc-lua-engine: Add accessors for several writer options, including some that were added in previous releases.

  • pandoc-server: Initialize some missing fields in WriterOptions: writerEpubTitlePage, writerChunkTemplate, writerListTables, writerFigureCaptionPosition, writerTableCaptionPosition.

  • CONTRIBUTING.md: Summarize steps for adding a new cli option.

  • MANUAL.txt:

    • Clarify that the --number-offset option should only directly affect numbering of the first section heading in a document; subsequent headings will increment normally.
    • Fix asciidoc link (#10039).
    • Fix CSL Docs broken link (#10100, Tristano Ajmone).
    • Document the use of luatexja when CJKmainfont is used with lualatex (#3873, Kolen Cheung).
    • Add a citations (typst) section to the manual (#9127).
    • Clarify that citations affects both input and output for org.
    • Add note on --citeproc that you may need to disable citations extension on the output format (e.g., -t markdown-citations) to see the rendered citation (#9127, #10012).
  • INSTALL.md — reorganise info on static binaries and add conda-forge install options (#10098, #10069, Ian Max Andolina).
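
The --number-offset clarification in the MANUAL.txt entry above, together with the note that an empty optNumberOffset behaves like [0,0,0,0,0], can be illustrated with a small numbering model. This Python sketch is an assumption-laden simplification rather than pandoc's makeSectionsWithOffsets: the function name `number_sections` and the representation of a document as a list of heading levels are hypothetical.

```python
def number_sections(levels, offsets=()):
    """Return section numbers for a sequence of heading levels.

    The offsets seed the counters, so they directly affect only the
    first heading; later headings increment from there as usual.
    """
    counters = list(offsets)
    numbers = []
    for level in levels:
        while len(counters) < level:
            counters.append(0)      # missing offsets default to 0
        counters[level - 1] += 1    # advance the counter for this level
        del counters[level:]        # deeper levels restart
        numbers.append(".".join(str(c) for c in counters))
    return numbers


# A document that starts with a level-2 heading, numbered from 1.5.
shifted = number_sections([2, 2, 1, 2], offsets=[1, 4])
# No offsets and all-zero offsets give the same numbering.
plain = number_sections([1, 2, 2, 1])
zeros = number_sections([1, 2, 2, 1], offsets=[0, 0, 0, 0, 0])
```

With offsets 1,4 the first heading becomes 1.5, and the following headings continue as 1.6, 2, and 2.1 without any further adjustment.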

pandoc 3.3

29 Jul 00:13
@jgm jgm
  • New cli option: --link-images. This causes images to be linked rather than embedded in ODT.

  • Allow --number-sections to take an optional true|false argument.

  • RTF reader:

    • Handle \*\shppict without dropping image (#10025).
  • TWiki Reader:

    • Recognize WikiWords as internal links (#9941).
    • Avoid partial function.
  • Typst reader:

    • Ignore ‘pad’ and just parse its body (#9958).
    • Use typst 0.5.0.5. Fixes parsing of equations like $1.$.
  • Docx writer:

    • Fix regression with nested lists (#9994). The bug affects e.g. ordered lists with bullet sublists; after the sublist the top-level list reverts to bullets instead of being properly numbered. This is a regression introduced in version 3.2.1.
  • BibTeX writer:

    • Ensure that “literal” names are enclosed in braces (#9987).
  • Man writer:

    • Use default middle header when metadata does not include header (#9943). This change causes pandoc to omit the middle header parameter when header is not set, rather than emitting "". The parameter is optional and man will use a default based on the section if it is not specified.
  • HTML templates: don’t load polyfill (#9918). This was added in a period when MathJax required polyfill. MathJax no longer recommends this and polyfill should no longer be necessary on any reasonably modern browser.

  • Translations:

    • Add ua.yaml (Jens Oehlschlägel).
    • Add a script (tools/update-translations.py) and Makefile target (update-translations) to update translation data automatically from babel and polyglossia upstream (Stephen Huan).
    • Use this script to update language data, increasing the number of languages we cover (Stephen Huan). Fix a few small bugs in existing translations.
  • Fix some mistakes with Japanese language code (#9938). In several places we were mistakenly assuming that the BCP 47 code for Japanese language was jp. It is ja.

  • Text.Pandoc.Options:

    • New field in WriterOptions: writerLinkImages [API change] (#9815).
  • Text.Pandoc.App.Opt:

    • New field in Opt: optLinkImages [API change] (#9815).
  • Lua subsystem:

    • Keep lpeg and re as “loaded” modules (Albert Krewinkel). The modules lpeg and re are now treated as if they had been loaded with require. Previously the modules were only assigned to global values, but could be loaded again via require, which made it possible to use a system-wide installation. However, this proved to be confusing.

      The old behavior can be restored by adding the following lines to the top of Lua scripts, or to the init.lua in the data dir.

      debug.getregistry()['_LOADED'].lpeg = nil
      debug.getregistry()['_LOADED'].re = nil
      
  • pandoc-cli: Include pandoc copyright in Lua version info (Albert Krewinkel).

  • pandoc-cli: Defer printing of version info to the Lua interpreter (Albert Krewinkel). The Lua interpreter no longer terminates when called with -v or --version arguments, thus improving compatibility with the default lua interpreter program.

  • Avoid partial functions in JATS reader, DocBook writer, Haddock reader.

  • Allow tls 2.1.x.

  • MANUAL.txt:

    • Make documentation of extensions clearer (#9060).
    • Fix section level for two Extensions entries.
  • lua-filters.md: Partially autogenerate docs for module pandoc (Albert Krewinkel). The documentation system isn’t powerful enough to generate the full documentation automatically.

pandoc 3.2.1

24 Jun 21:37
@jgm jgm
  • Fix gfm_auto_identifiers to replace emojis with their aliases, as documented (#9876).

  • CSV reader:

    • Turn line breaks into LineBreaks not SoftBreaks (#9797).
  • Docx reader:

    • Support task lists (#8211).
    • Fix a small bug in parsing delimiters in numbered lists, which led to the default delimiter being used wrongly in some cases.
    • Improve handling of captions.
      • Turn captioned images into Figure elements. Closes #9391.
      • Improve the logic for associating elements with captions (#9358).
      • Ensure that captions that can’t be associated with an element aren’t just silently dropped (#9610).
    • Support HorizontalRule. We support both pandoc-style and the style described on a Microsoft support page, an empty paragraph with a bottom border (#6285).
    • React to "left" value on jc attribute.
    • Handle column and cell alignments (#8551). We take the column alignments from the first body row.
    • Fix a bug that caused comments inside insertions or deletions to be ignored (#9833).
  • HTML reader:

    • Better handle non-li elements in ul and ol (#9809). For example, a p after a closed li will be incorporated into the previous li. This mirrors what browsers do with this invalid HTML.
  • LaTeX reader:

    • Fix parsing of dimensions beginning with ., e.g. \kern.1pt (#9902).
  • Markdown reader:

    • Allow author-only textual citations (#7219). E.g. -@reese2002 outside of brackets.
  • RST reader:

    • Tighten up rules for when emphasis can start (#9805).
    • Support :cite: role with citeproc (#9904). A subset of the functionality of the sphinxcontrib-bibtex extension to Sphinx is supported.
  • Textile reader:

    • Don’t let spans begin right after a symbol (#9878).
  • Texinfo writer:

    • Ensure proper escaping in all node/link contexts.
    • Target node rather than anchor when possible in internal links.
    • Remove illegal characters from internal link anchors (#6177).
    • Use two commas not one in @ref.
    • Don’t add anchors to headings. We don’t need them, now that we make internal links use the node.
    • Avoid duplicate node names.
    • Improve menus. Properly handle the case where the node name is different from the descriptive title.
  • Texinfo template: add variables for filename and version.

  • Typst reader:

    • Fix an incomplete pattern match (#9807).

    • Handle inline bodies ending in a parbreak. E.g.

      #strong[
      test
      ]
      
  • ConTeXt template: remove \setupbackend[export=yes] (#9820).

  • Docx writer:

    • Omit jc attribute on table cells with AlignDefault (#5662).
    • Better formatting for task lists. Task lists are now properly formatted, with no bullet (#5198).
    • Replace an expensive generic traverse to remove Space elements, for better performance.
    • The new OpenXML template had spaces for metadata that need to be filled with OpenXML fragments with the proper shape. This patch ensures that everything is the right shape.
    • Wrap figures with id in a bookmark (#8662).
    • Add eastAsia font hints to w:r (#9817). We do this when the text in the run contains any CJK characters. This ensures that ambiguous code points (e.g. quotation marks) will be represented as “wide” characters when together with CJK characters.
    • Clean up Abstract Title and Subtitle in default reference docx. Center Subtitle, remove color.
    • Allow OpenXML templates to be used with docx (#8338, #9069, #7256, #2928). The --reference-doc option allows customization of styles in docx output, but it does not allow one to adjust the content of the output (e.g., changing the order in which metadata, the table of contents, and the body of the document are displayed), or adding boilerplate text before or after the document body. For these changes, one can now use --template with an OpenXML template. (See the default openxml template for a sample.) --include-before-body and --include-after-body can also now be used with docx output. The included files must be OpenXML fragments suitable for inclusion in the document body.
    • New unexported module Text.Pandoc.Writers.Docx.OpenXML.
  • HTML writer:

    • Ensure URI escaping needed for html4 (#9905). Unicode characters need not be escaped for html5, and still won’t be.

    • Don’t emit unnecessary classes in HTML tables (#9325, Thomas Soeiro). Pandoc used to emit a header class on the tr element that forms the table header. This is no longer needed, because thead > tr will do the same thing. Similarly, pandoc used to emit even and odd classes on trs, allowing striped styling. This is no longer needed, because one can use e.g. tbody tr:nth-child(2n).

      Compatibility warning: users who relied on these classes to style tables may need to adjust their CSS.

  • JATS writer:

    • Support supplementary-material in metadata for jats_articlepublishing (#9818).
  • LaTeX writer:

    • New method for ensuring images don’t overflow (#9660). Previously we relied on graphicx internals and made global changes to Gin to force images to be resized if they exceed textwidth. This approach is brittle and caused problems with \includesvg (see #9660). The new approach uses a new macro \pandocbounded that is now defined in the LaTeX template. (Thanks here to Falk Hanisch in mrpiggi/svg#60.) The LaTeX writer has been changed to enclose \includegraphics and \includesvg commands in this macro when they don’t explicitly specify a width or height. In addition, the writer now adds keepaspectratio to the \includegraphics or \includesvg options if height is specified without width, or vice versa. Previously, this was set in the preamble as a global option. Users should attend to the following compatibility issues:
      • If custom templates are used with the new LaTeX writer, they will have to be updated to include the new \pandocbounded macro, or an error will be raised because of the undefined macro.
      • Documents that specify explicit dimensions for an image may render differently, if the dimensions are greater than the line width or page height. Previously pandoc would shrink these images to fit, but the new behavior takes the specified dimensions literally. In addition, pandoc previously always enforced keepaspectratio, even when width and height were both specified, so images with width and height specified that do not conform to their intrinsic aspect ratio will appear differently.
    • Task lists must be unordered (#9185).
    • Specify language option for selnolig and only include it if english or german is used (#9863). (This includes changes to the LaTeX template.) This should restore proper ligature suppression when lualatex is used.
    • Fix --toc-depth with beamer output (#9861). Previously only top-level sections were ever included in the TOC, regardless of the setting of --toc-depth.
    • Use \linewidth instead of \columnwidth or \textwidth for resizing figures, table cells, etc. in LaTeX (#9775). \linewidth, unlike the others, is sensitive to indented environments like lists.
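
The decision rules stated in the image-overflow entry above (wrap in \pandocbounded when no dimension is given, add keepaspectratio when exactly one dimension is given) can be summarized in a short sketch. This Python model is hypothetical: the function name `include_graphics` and the exact option strings are assumptions, and the LaTeX that pandoc really emits may contain further options.

```python
def include_graphics(path, width=None, height=None):
    """Build an \\includegraphics command following the stated rules."""
    options = []
    if width is not None:
        options.append("width=" + width)
    if height is not None:
        options.append("height=" + height)
    # keepaspectratio only when exactly one of width/height is specified.
    if (width is None) != (height is None):
        options.append("keepaspectratio")

    command = "\\includegraphics"
    if options:
        command += "[" + ",".join(options) + "]"
    command += "{" + path + "}"

    # No explicit dimensions: let the template macro bound the image.
    if width is None and height is None:
        command = "\\pandocbounded{" + command + "}"
    return command


unsized = include_graphics("fig.png")
one_dimension = include_graphics("fig.png", width="5cm")
both_dimensions = include_graphics("fig.png", width="5cm", height="2cm")
```

The third case shows the compatibility issue noted above: when both dimensions are given they are taken literally, so the image is neither shrunk to fit nor forced to keep its aspect ratio.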
  • LaTeX template: put babel-lang in options to beamer (#9868). This is required to make beamer use proper localized terms for things like “Section.”

  • Markdown writer:

    • Don’t print extra caption when using implicit_figures.
    • Ensure blank line after HTML blocks in commonmark-based formats (#9792).
    • Fix bug rendering block quotes in lists (#9908).
  • Typst writer:

    • Support ‘.typst:no-figure’ and ‘typst:figure:kind=kind’ attributes (#9778, Carlos Scheidegger). This extends support for fine-grained properties in Typst. If the typst:no-figure class is present on a Table, the table will not be placed in a figure. If the typst:figure:kind attribute is present, its value will be used for the figure’s kind (#9777). These features are documented in doc/typst-property-output.md.
  • Typst template:

    • Add subtitle (#9747, Mickaël Canouil).
    • Use content rather than string for title, author, date, email (#9823). This allows formatting in title, author, date, and email fields. Since the PDF metadata requires a string, and typst only converts the title to a string (not the authors), we use
  • Textile writer:

    • Get rid of header, odd, even classes on tr (#9376).
  • Text.Pandoc.Class:

    • fillMediaBag: Convert IOErrors to warnings when fetching absolute paths (#9859, Albert Krewinkel). This will allow many conversions that would have failed with an error to succeed (albeit without images or other needed resources).
  • Text.Pandoc.ImageSize:

    • Don’t prefer exif width/height when they conflict with image width/height (#9871). That was a mistaken call in #6936. Usually when these values disagree, it is because the image has been resized by a tool that leaves the original exif values the same, so the width/height metadata are more likely to be correct than exif width/height.
  • Text.Pandoc.SelfContained:

    • Strip CRs from XML before base64 encoding for data URI (so tests can work on Windows).
    • Only create <svg> elements for SVG images when the image has the class inline-svg. Otherwise just use a data URI as we do with other images (#9787).
  • Lua subsystem (Albert Krewinkel):

    • Split Init module into more modules. The module has grown unwieldy and is therefore split into three internal Haskell modules, Init, Module, and Run.
    • Add function pandoc.utils.run_lua_filter (#9803).
    • Add function pandoc.template.get (#9854, co-authored by Carsten Gips). The function allows specifying a template with the same argument value that would be used with the --template command line parameter.
    • Keep CommonState object in the registry. The state is an internal value and should be treated as such. The PANDOC_STATE global is merely a copy...

pandoc 3.2

11 May 21:11
@jgm jgm
  • Change to --file-scope behavior (#8741): previously a Div with an identifier derived from the filename would be added around the contents of each file. This caused problems for “chunking” files into chapters, e.g. in EPUB. We no longer add the surrounding Div. This cooperates better with chunking. Note, however, that if you have relied on the old behavior to link to the beginning of the contents of a file using its filename as identifier, that will no longer work.

  • Markdown reader:

    • Allow repeated labels in numbered example lists. Previously if you tried to use the same label as an earlier example list item, you’d get a new number, not the old one, and references to the label would go to the second occurrence. Now an existing label will be reused, and no new number will be generated. Caveat: this only works reliably when the re-used example list item occurs by itself in a list, or occurs in a list of previously used example list items that occur in exactly the same order as previously.
    • Fix normalCite so it doesn’t consume past a closing ] boundary (#9710). This was causing an exponential performance bug on long lists of links containing potential emphasis characters.
    • Generalize inlinesInBalancedBrackets to inBalancedBrackets, with a parameter for the inner parser.
    • Auto-close unclosed divs (#9635). This applies to both fenced and HTML-ish varieties. Otherwise we face an exponential performance problem with backtracking. A warning is issued when a div is implicitly closed.
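
The new handling of repeated labels in numbered example lists, described in the first Markdown reader entry above, can be modeled as a lookup table from labels to numbers. The Python sketch below is an illustration only: the function name `number_examples` and the use of None for unlabeled items are assumptions, and it ignores the ordering caveat mentioned in the entry.

```python
def number_examples(labels):
    """Assign example numbers; a repeated label reuses its earlier number."""
    assigned = {}
    next_number = 1
    result = []
    for label in labels:
        if label is not None and label in assigned:
            # Existing label: reuse the number, do not advance the counter.
            result.append(assigned[label])
            continue
        if label is not None:
            assigned[label] = next_number
        result.append(next_number)
        next_number += 1
    return result


# Items labeled good, (unlabeled), bad, then good again.
numbers = number_examples(["good", None, "bad", "good"])
```

Under the previous behavior the final item would have received a fresh number (4 in this model) instead of reusing 1.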
  • RST reader:

    • Fix figclass and align annotations for figures (#7473, Gokul Rajiv).
  • LaTeX writer:

    • Use polytonicgreek instead of polutonikogreek with babel (#9698). polutonikogreek is outdated. Also recognize both in the LaTeX reader.
    • Improve treatment of math inside soul commands (#1294, #5529). soul commands (ul, hl, st) are very fragile and the math must be handled specially.
  • LaTeX reader:

    • Fix over-eager macro expansion in conditionals (#9676).
    • Parse flalign, flalign* math environments (#9679). We parse these as Math elements with an aligned environment. Semantically it’s not exactly the same, but better than falling back to raw LaTeX.
  • LaTeX template: add titlegraphicoptions variable (#9207, Guilhem Saurel).

  • Docx reader:

    • Issue warning rather than error when we can’t parse EndNote citations (see #8433).
    • Fix anchor in header after anchor (#9626, mbracke).
  • RTF reader:

    • Don’t try to handle non-default code pages (#9683). Emit a warning instead.
  • OpenDocument writer:

    • Implement custom-style for spans (#9657).
  • Typst writer:

    • Add blank line in definition lists with multiple definitions (see #9704).
    • Property output (#9648, Gordon Woodhull). The Typst writer will pass on specially marked attributes as raw Typst parameters on selected elements. This allows extensive customization using filters. A separate document (doc/typst-property-output.md) has been added that provides extensive documentation and examples of the use of this feature.
  • Markdown writer:

    • Don’t try to align columns in pipe tables with lines greater than COLUMNS. The alignment just reduces readability when the lines soft wrap.
    • Don’t use raw_attribute syntax for raw blocks, unless there is no other option (see #9677). Macros in a raw_attribute block don’t get interpreted when it is read again by pandoc’s markdown reader.
  • ConTeXt writer:

    • Replace deprecated \sc with \setsmallcaps (#9518, James P. Ascher).
  • Docx writer:

    • Use conventional styles/indents for Word bullet lists (#7280).
  • reference.docx:

    • Use current standard Word theme (#7280). This includes using the sans-serif font Aptos instead of the serif font Cambria, and default colors for headings.
    • Remove duplicate DefaultParagraphFont in styles.xml.
  • New module Text.Pandoc.Transforms [API change] (Albert Krewinkel). This module exports the following functions, which were formerly exported from Text.Pandoc.Shared: headerShift, filterIpynbOutput, eastAsianLineBreakFilter, as well as some functions that were previously not exported.

  • Text.Pandoc.Shared:

    • headerShift, filterIpynbOutput, and eastAsianLineBreakFilter are no longer exported from this module; they are now exported from Text.Pandoc.Transforms (Albert Krewinkel).
  • Text.Pandoc.Error:

    • Improve reporting of unsupported extensions errors (#9247, Albert Krewinkel).
  • Text.Pandoc.App:

    • Move “transforms” after filters (#9664). This will mean that --shift-heading-level-by affects a heading added by reference-section-title.
  • Text.Pandoc.App.CommandLineOptions:

    • Simplify output for OptVersion. Omit the information about versions of dependencies. We no longer emit version info at this level anyway; pandoc-cli intercepts and handles --version. This code would only be called if someone used the pandoc library function handleWithOptInfo in their own program.
  • Text.Pandoc.ImageSize:

    • Export ImageSize datatype.
  • Text.Pandoc.SelfContained:

    • Merge class attribute when both img and svg specify it (#9652, Carlos Scheidegger).
  • Text.Pandoc.Logging:

    • Add ScriptingInfo constructor for LogMessage [API change] (Albert Krewinkel).
    • Make DocxParserWarning a WARNING, not INFO. [API change].
    • Add UnsupportedCodePage constructor to LogMessage [API change].
    • Add UnclosedDiv constructor for LogMessage [API change].
  • Lua subsystem (Albert Krewinkel):

    • Add a pandoc.log module.
    • Update to pandoc-lua-marshal version 0.2.7 (#8916). This fixes counterintuitive behavior of the content property on BulletList and OrderedList items. Unmarshalling of that field now matches the behavior of the constructor.
    • Use newest zip module. This adds a symlink function to Entry objects, making it possible to check whether an entry represents a symbolic link.
    • Improve pandoc.json.decode docs.
    • Update and fix docs for pandoc.types.Version and pandoc.utils.type.
    • Add new module pandoc.image. The module provides basic querying functions for image properties.
    • Bump pandoc-lua-engine to 0.2.1.4.
  • Use latest KaTeX CDN asset (#9707, Salim B).

  • pandoc-cli: ensure UTF8 when emitting version info.

  • tools/update-lua-module-docs.lua: improve script-internal docs, cleanup (Albert Krewinkel).

  • Allow network 3.2.

  • Use latest versions of texmath, djot, skylighting-core, skylighting.

  • Fix command test for #9652.

  • Fix some typos in code comments (#9638, guqicun).

  • Command tests: include regular PATH after directory with the test executable (ensures that DLLs will be found on Windows).

  • MANUAL.txt:

    • Document handout variable for beamer (#9742).
    • Document formats affected by --slide-level (#9745).
    • Update the list of required LaTeX packages (#9728, Albert Krewinkel).
    • Use more descriptive link text for ODT (#9673).
    • Add clarification about toc-title in docx, pptx (#9645).
    • Better document truthiness for conditionals (#9661).
    • Mention that custom-style works with ODT (Ian Max Andolina).
    • Harmonize typographic dashes (#9688, Salim B). Standardize on --- with no space.
  • INSTALL.md: Minor tweaks (#9705, Leo Heitmann Ruiz).
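
The Markdown writer entry above (no column alignment in pipe tables whose lines exceed COLUMNS) can be illustrated with a minimal Python sketch. This is not pandoc's Haskell implementation; the function name render_pipe_table and the exact padding rules are assumptions made for illustration only.

```python
def render_pipe_table(header, rows, columns=72):
    """Render a pipe table as a list of lines.

    Cells are padded for alignment only if the padded lines fit in `columns`.
    Hypothetical helper; pandoc's real writer is more involved.
    """
    all_rows = [header] + rows
    # Column widths; the separator needs at least three dashes per column.
    widths = [max(3, *(len(r[i]) for r in all_rows)) for i in range(len(header))]
    # Length of one padded line: "| cell " per column plus the closing "|".
    padded_len = sum(w + 3 for w in widths) + 1
    align = padded_len <= columns

    def line(cells):
        if align:
            cells = [c.ljust(w) for c, w in zip(cells, widths)]
        return "| " + " | ".join(cells) + " |"

    if align:
        sep = "|" + "|".join("-" * (w + 2) for w in widths) + "|"
    else:
        sep = "|" + "|".join("---" for _ in widths) + "|"
    return [line(header), sep] + [line(r) for r in rows]


if __name__ == "__main__":
    print("\n".join(render_pipe_table(["a", "b"], [["1", "2"]])))
    print("\n".join(render_pipe_table(["a", "b"], [["x" * 100, "2"]])))
```

In the second call the padded table would be wider than 72 characters, so the sketch falls back to unpadded cells, which soft-wrap more legibly in an editor.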

pandoc 3.1.13

07 Apr 15:51
@jgm jgm
  • Org reader:

    • Fix treatment of id property under heading (#9639).
  • DocBook reader:

    • Add empty title to admonition div if not present (#9569). This allows admonition elements (e.g. <note>) to work with gfm admonitions even if the <title> is not present.
  • DokuWiki reader:

    • Link text cannot contain formatting (e.g., // is not italics) (#9630).
    • An explicitly empty link text ([[url|]]) works the same as an omitted link text (#9632).
  • Typst reader:

    • Support Typst 0.11 table features: col/rowspans, table head and foot (#9588).
    • Parse cell col/rowspans.
  • CSLJson writer:

    • Put $ or $$ around math in csljson output (#9616).
  • ConTeXt writer:

    • Fix options order with \externalfigure. The dimensions should come after the class if both are present.
  • Typst writer:

    • Put label after Span, not before. Labels get applied to preceding markup item.
    • Support Typst 0.11 table features (#9588): colspans, rowspans, cell alignment overrides, relative column widths, header and footer, multiple table bodies with intermediate headers. Row heads are not yet supported.
    • The default typst template has been modified so that tables don’t have lines by default. As is standard with pandoc, we only add a line under a header or over a footer. However, a different default stroke pattern can easily be added in a template.
    • More reliable escaping in inline [..] contexts (#9586). For example, we need to escape [\1. April] or it will be treated as an ordered list.
    • Handle unnumbered on headings (#9585).
  • LaTeX writer:

    • Fix math inside strikeout (#9597).
  • Text.Pandoc.Writers.Shared:

    • Export isOrderedListMarker [API change].
  • Change lhs tests so they don’t use --standalone. This will avoid test failures due to minor changes in skylighting versions, e.g. #9589.

  • Use latest texmath, typst.

  • Require pandoc-lua-marshal 0.2.6 (#9613, Albert Krewinkel). Fixes an issue arising when the value of content properties on BlockQuote, Figure, and Div elements was an empty list.

  • Update lua-filters.md (#9611, Carlos Scheidegger).
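
The two DokuWiki reader entries above (link text is taken literally, and [[url|]] behaves like [[url]]) can be sketched in a few lines of Python. The function name parse_dokuwiki_link is hypothetical, and using the target itself as the fallback label is an assumption of this sketch, not a statement about pandoc's output.

```python
def parse_dokuwiki_link(markup):
    """Split '[[target|label]]' into (target, label).

    Hypothetical helper for illustration; not pandoc's parser.
    """
    if not (markup.startswith("[[") and markup.endswith("]]")):
        raise ValueError("not a DokuWiki link")
    target, _, label = markup[2:-2].partition("|")
    target = target.strip()
    label = label.strip()
    # An explicitly empty label is treated like an omitted one.
    # Assumption: the fallback label is the target itself.
    # The label is kept literally, so "//" inside it is not italics.
    return target, (label or target)


if __name__ == "__main__":
    print(parse_dokuwiki_link("[[https://example.com|]]"))
    print(parse_dokuwiki_link("[[https://example.com|//literal//]]"))
```

Keeping the label literal matters because a bare URL shown as link text contains //, which would otherwise be read as the start of italics.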

pandoc 3.1.12.3

18 Mar 05:28
@jgm jgm
  • Markdown reader: Fix bug with footnotes at end of fenced div (#9576).

  • LaTeX reader:

    • Improve tokenization of @ (#9555). Make tokenization sensitive to \makeatletter/\makeatother. Previously we just always treated @ as a letter. This led to bad results, e.g. with the sequence \@. E.g., a\@ b would parse as “ab” and a\@b as “a”.
    • Make withRaw work inside parseFromToks (#9517). This is needed for raw environments to work inside table cells.
    • Better handling of table colwidths (#9579). Previously the parser just failed if the column width specified in p{} wasn’t a multiple of \linewidth. This led to cases where content was skipped.
  • Typst writer:

    • Add ‘kind’ parameter to figures with tables (#9574).
    • Avoid unnecessary box around image in figure (#9236).
    • Omit width/height in images unless explicitly specified (#9236). Previously we computed width/height for images that didn’t have size information, because otherwise typst would expand the image to fit page width. This typst behavior has changed in 0.11. This change fixes a bug in which images would sometimes overflow page margins, depending on their intrinsic size.
    • Don’t add hard-coded inset to tables (#9580). Instead, set this globally in the default template, allowing it to be customized.
  • LaTeX template: Fix block headings support for unnumbered paragraphs (#9542, #6018, Oliver Fabel).

  • HTML templates: Replace polyfill provider (#9537, @SukkaW). Replace polyfill.io with cdnjs.cloudflare.com/polyfill. polyfill.io has been acquired by Funnull, and the service has become unstable.

  • Korean translations: delete colon in translation for ‘to’. This was invalid YAML, and not desired anyway, since a colon is added.

  • Use latest commonmark, commonmark-extensions. This fixes a 3.1.12 regression in parsing of commonmark/gfm autolinks (jgm/commonmark-hs#151).

  • Depend on djot 0.1.1.3, which fixes a serious parsing bug affecting regular paragraphs after lists.

  • Depend on latest skylighting, skylighting-core, typst-hs, texmath.

  • MANUAL.txt: Change broken link to IDML cookbook (#9563).
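
The LaTeX reader entry above about @ tokenization can be made concrete with a small Python sketch. It is a simplified toy tokenizer, not pandoc's: the function name tokenize is hypothetical, it switches modes as soon as it reads \makeatletter or \makeatother, and it ignores TeX details such as skipping spaces after control words.

```python
def tokenize(src):
    """Split LaTeX source into control sequences and single characters.

    '@' counts as a letter only between \\makeatletter and \\makeatother.
    Simplified sketch; real TeX tokenization has more rules.
    """
    tokens, i, at_is_letter = [], 0, False
    while i < len(src):
        if src[i] != "\\":
            tokens.append(src[i])
            i += 1
            continue
        # Control word: backslash followed by a run of letters.
        j = i + 1
        while j < len(src) and (src[j].isalpha() or (at_is_letter and src[j] == "@")):
            j += 1
        # Control symbol: backslash followed by exactly one non-letter.
        if j == i + 1 and j < len(src):
            j += 1
        name = src[i:j]
        tokens.append(name)
        if name == "\\makeatletter":
            at_is_letter = True
        elif name == "\\makeatother":
            at_is_letter = False
        i = j
    return tokens


if __name__ == "__main__":
    print(tokenize(r"a\@b"))
    print(tokenize(r"\makeatletter a\@b\makeatother c"))
```

With @ always treated as a letter, a\@b yields the single control sequence \@b, which is why the b disappeared from the output before this fix; outside \makeatletter the sketch yields \@ followed by b.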

pandoc 3.1.12.2

01 Mar 06:00
@jgm jgm
  • Docx reader:

    • Ensure that table captions are counted (#9518).
    • Detect caption by style name not id (#9518). The styleId can change depending on the localization.
    • Avoid emitting empty paragraph where caption was.
  • Markdown reader: fix regression in link parsing with wikilinks extensions (#9481). This fixes a regression introduced in 3.1.12.

  • Org reader/writer: support admonitions (#9475).

  • Org writer: omit extra blank line at end of quote block.

  • Typst writer: ensure that -, +, etc. are escaped at beginning of block (#9478). Our recent relaxing of escaping (#9386) caused problems for things like emphasized - characters that were rendered using #strong[-]#. This now gets rendered as #strong[\-].

  • LaTeX writer: fix bug when a language is specified in two different ways (#9472). If you used lang: de-DE but then had a span or div with lang=de, the preamble would try to load ngerman twice, leading to an error. This fix ensures that a language is only loaded once.

  • Docx writer: Don’t copy over footnotePr in settings.xml from reference.docx (#9522).

  • EPUB writer: omit EPUB2-specific meta tag on EPUB3 (#9493). This caused a validation failure in epubs with cover images.

  • Lua: avoid crashing when an error message is not valid UTF-8 (Albert Krewinkel).

  • Text.Pandoc.SelfContained:

    • Add role="img" to svgs.
    • Add aria-label to svg elements with alt text if present. Screen readers ignore alt attributes on svg elements but do pay attention to aria-label (#9525).
  • Text.Pandoc.Shared: Fix regression in section numbering in makeSections (#9516). Starting with pandoc 3.1.12, unnumbered sections incremented the section number.

  • Text.Pandoc.Class: fix openUrl TLS negotiation (#9483). With the release of TLS 2.0.0, the TLS library started requiring Extended Main Secret for the TLS handshake. This caused problems connecting to zotero’s server and others that do not support TLS 1.3. This commit relaxes this requirement.

  • Depend on djot 0.1.1.0 (fixes rendering of multiline block attributes).

  • Use new releases of skylighting-format-blaze-html (#9520). Fixes auto-wrapping of long source lines in HTML print media.

  • Use new commonmark-extensions (fixes issue with the rebase_relative_paths extension when used with commonmark/gfm).

  • Makefile: improve epub-validation target (#9493). Use --epub-cover-image to catch issues that only arise with that.
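
The LaTeX writer fix above (a language specified in two ways is loaded only once) amounts to de-duplicating after mapping language tags to babel names. A minimal Python sketch follows; the table BABEL_NAMES and the function babel_languages are hypothetical, and only the de-DE/de to ngerman pairing is taken from the entry itself.

```python
# Hypothetical, deliberately tiny mapping from language tags to babel names.
# Only the "de-DE"/"de" -> "ngerman" pairing comes from the changelog entry.
BABEL_NAMES = {"de-DE": "ngerman", "de": "ngerman", "fr": "french"}


def babel_languages(tags):
    """Return babel language names for `tags`, each at most once, in order."""
    seen, result = set(), []
    for tag in tags:
        name = BABEL_NAMES.get(tag)
        if name is None or name in seen:
            continue  # unknown tag, or language already scheduled for loading
        seen.add(name)
        result.append(name)
    return result


if __name__ == "__main__":
    # Document language de-DE plus a span with lang=de and a div with lang=fr.
    print(babel_languages(["de-DE", "de", "fr"]))
```

De-duplicating on the babel name rather than on the tag is the point of the sketch: de-DE and de are different tags but resolve to the same package option.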

pandoc 3.1.12.1

18 Feb 01:42
@jgm jgm
  • EPUB writer: omit EPUBv3-specific accessibility features on epub2 (#9469). Fixes a regression in 3.1.12.

  • More fixes for SVG ids with --self-contained (#9467). This generalizes the fix to #9420 so it applies to things like style="fill(url(#..." and should fix problems with SVGs including gradients.

  • Powerpoint writer: properly handle math in headings and tables (#9465). This ensures that paragraphs containing math are wrapped in a mc:AlternateContent node as required.

  • Makefile: make validate-epub check v2 output too.

pandoc 3.1.12

15 Feb 18:18
@jgm jgm
  • Add djot as input and output format. Djot is a light markup syntax (https://djot.net).

    • New module Text.Pandoc.Readers.Djot [API change]. The function readDjot is also exported by Text.Pandoc.Readers.
    • New module Text.Pandoc.Writers.Djot [API change]. The function writeDjot is also exported by Text.Pandoc.Writers.
  • --number-sections now uses the first digit for the number of the top-level section, no matter what its level. So if the top-level section is level-2, numbers will be 1, 2, etc. rather than 0.1, 0.2, as in the past (#5071). For some backwards compatibility, we revert to the old behavior when the --number-offset option is used.

  • DocBook reader:

    • Better handling of <procedure> and <substeps> (#9341): <procedure> now gets parsed as an ordered list, and <substeps> as a sublist.
  • Man reader:

    • Move spaces outside of emph/strong (#9445).
  • MediaWiki reader:

    • Don’t make leading blanks underscores in image links (#9425).
    • Allow lowercase image: (#9424).
  • BibTeX reader:

    • Support pagetotal in converting BibLaTeX.
  • Markdown reader:

    • Fix wikilinks extensions to allow newlines in titles (#9454).
  • EPUB reader:

    • Don’t put # characters in identifiers.
  • LaTeX reader:

    • Improve treatment of \cref, \Cref (#7463). Use the reference-type ref+label and ref+Label. Also, associate with \vref ref instead of ref+page.
    • Limited support for \Cref (#7463).
    • Generate relative widths for \linewidth, \textheight (#9388).
  • Typst reader:

    • Fix handling of \overline (#9294). Due to a typo, it was being incorrectly rendered as an \underset.
    • Improve handling of inline #quote (#9413).
    • Fix handling of dot(), tilde(), ddot() (jgm/typst-hs#38).
    • Fix character used for norm (jgm/typst-hs#38).
  • Typst writer:

    • Use reference form (e.g. @jones2000[p. 30]) for citations when possible.
    • Use #ref or @ for links with reference-type="ref" (#7463). This attribute is added to LaTeX \cref, for example.
    • Improve citation support (#9452). Emit form: "prose" or form: "year" qualifiers if the citation is author-in-text or suppress-author. Strip initial comma from suffix, since typst will add an extra one.
    • Unescape URI escapes in image paths (#9389).
    • Handle labels and citation ids with spaces and other special characters (#9387). In these cases, we produce an explicit label() rather than using <> or @.
    • Avoid producing illegal labels (#9387).
    • Avoid unnecessary escapes (#9386).
  • LaTeX writer:

    • Make writer sensitive to empty_paragraphs extension (#9443).
    • Fix beamer highlighting (mh4ckt3mh4ckt1c4s).
    • Create valid table even when table is empty (#9350).
    • Set font fallback for babel main font (Max Heller).
    • Add some kerns where needed between quotes (#9371).
  • HTML writer:

    • Add suffix to multiple footnote section ids, so they are unique (Sam May). This is necessary when --reference-location is block or section.
  • EPUB writer:

    • Add ARIA roles for accessibility (#9378, Iacobus1983). Footnote references are given role “doc-noteref”, footnote text gets “doc-footnote”, and nav gets “doc-toc”.
    • Ensure that an alt attribute is always added (#9354). This seems to be required by iBooks; even an empty alt attribute will satisfy it.
    • Add xml:lang to package element (#9372).
    • Add accessibility metadata to EPUB metadata (#9372, #9400, Iacobus1983 and John MacFarlane). Reasonable default values are used to ensure that pandoc’s EPUBs conform to the EU Accessibility Act requirements, but values can be overridden using metadata.
  • Docx writer:

    • Restore ability to center-justify table (#9393). The fix to #5947 caused all tables to be left indented. This was necessary to avoid extra indentation in table cells when a table appeared in a list item. This change makes the changes conditional, so that they only affect tables in list items.
  • Man writer:

    • Fix bug with long URLs (#9458). URLs with more than 68 characters didn’t display properly because of wrapping.
    • Support (limited) syntax highlighting in code blocks (#9446). Currently only boldface and italics are supported. The monochrome style might be of use for those generating man pages.
  • Org writer:

    • Escape special lines in code blocks (#9218, Albert Krewinkel).
  • Markdown writer:

    • Use different width fences for nested divs (#9450). Outer divs have longer fences. This aids clarity for the reader, making it easier to see where the div ends. It also makes the output compatible with some other implementations, e.g. micromark, which require different-width fences for nesting.
    • Fix output for pipe tables with a huge number of columns (#9346). Previously we got invalid pipe tables when the number of table columns exceeded the setting of --columns.
  • Powerpoint writer:

    • Fix regression in layout for slides with figures (#9442).
    • Use internal column widths in pptx writer tables (#5706, Tomas Dahlqvist). The table writer used to only divide all available width evenly for all columns. In this update the code uses the incoming widths if they are available. If they are not set the earlier even distribution is used. Some of the golden templates are adjusted slightly because of different rounding when using the new calculation model.
  • Custom writers:

    • Fix handling of common state (#9229, Albert Krewinkel). The CommonState (PANDOC_STATE in Lua) may change between the time that a custom writer script is first loaded and when the writer is run. However, the writer was always using the initial state, which led to problems, e.g. when the mediabag was updated in a filter, as those updates were not visible to the writer. The state is now updated right before the writer function runs.
  • Text.Pandoc.SelfContained:

    • Fix id replacements in SVGs with clipping paths (#9420). This fixes --embed-resources when SVGs have clip-path attributes.
    • Fix size of duplicated SVGs with --embed-resources (#9439).
  • ConTeXt template: support font fallback (#9361, Lawrence Chonavel).

  • Text.Pandoc.Shared:

    • addPandocAttributes: use wrapper attribute, not wrap, for Divs and Spans added as wrappers to hold attributes on elements that do not accept them.
    • makeSections behavior changes:
      • When the optional base level parameter is provided, we no longer ensure that the sequence of heading levels is gapless (#9398). Instead, we set the lowest heading level to the specified base level, and adjust the others accordingly. If an author wants to skip a level, e.g. from level 1 to level 3, they can do that. In general, the heading levels specified in the source document are preserved; makeSections only puts them into a hierarchical structure.
      • Section numbers are now assigned differently, as described above under --number-sections changes (#5071).
    • Improve makeSections code for section number calculation.
  • Text.Pandoc.Chunks:

    • Autogenerate unique ids for sections missing them (#9383). This is needed for TOC generation to work properly. We can’t create TOC links if there are no ids. This fixes some EPUB validation issues we’ve been getting since switching over to Chunks for chunking.
    • Improve fixTOCTreePaths. We weren’t adding ids for section headings that don’t head a chunk, but these headings are needed for a TOC.
  • Lua: catch encoding error in pandoc.read (#9385, Albert Krewinkel). Fixed a bug that could lead to an un-catchable error and program termination when pandoc.read was called with invalid UTF-8 input.

  • LaTeX template: support font fallback (lawcho). This support is LuaLaTeX-specific. See MANUAL.txt for documentation.

  • Text.Pandoc.Readers: Add readMan to exports [API change] (George Stagg).

  • Text.Pandoc.PDF:

    • Reliably detect when TOC has changed (#9295). Sometimes the TOC changes but there are no warnings: this happens when no labels are present. In this case we must rerun LaTeX. So we now take the SHA1 hash of the TOC file and rerun LaTeX if it changes between runs.
    • Increase maximum number of LaTeX runs to 4 (#9299). On some documents, 4 runs are needed (e.g. when a LastPage reference is used).
    • Avoid readFileLazy, which caused improperly cleaned-up temp directories on Windows (#9460).
  • MANUAL.txt:

    • Harmonize spelling of Markdown and MultiMarkdown (#9402, Salim B).
    • Add <pre> to list of exceptions for markdown_in_html_blocks extension (#9305).
    • Add clarification to docs for --resource-path (#9417).
  • Makefile: Validate generated EPUB as part of prerelease checks.

  • Add validation for docx golden files to CI (Edwin Török).
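
The --number-sections change in this release (the first digit always numbers the top-level section, whatever its heading level) can be sketched as follows. This Python function is an illustration, not pandoc's makeSections; the name number_sections is hypothetical, and treating the smallest heading level in the document as the top level is an assumption of the sketch. Unnumbered sections and --number-offset are not modeled.

```python
def number_sections(levels):
    """Assign section numbers to a sequence of heading levels.

    The shallowest level present is numbered with the first digit,
    so a document that starts at level 2 gets 1, 2, ... not 0.1, 0.2.
    """
    if not levels:
        return []
    top = min(levels)  # assumption: top level = smallest level in use
    counters, numbers = [], []
    for level in levels:
        depth = level - top + 1
        if len(counters) < depth:
            # Going deeper: add zeroed counters for the new depth.
            counters += [0] * (depth - len(counters))
        else:
            # Going back up (or staying): drop counters of deeper levels.
            del counters[depth:]
        counters[depth - 1] += 1
        numbers.append(".".join(str(c) for c in counters))
    return numbers


if __name__ == "__main__":
    # A document whose top-level sections are level-2 headings.
    print(number_sections([2, 3, 3, 2]))
```

Under the old behavior the same headings would have been numbered relative to level 1, giving the leading 0 that this release removes.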