HTMLarkdown is an HTML-to-Markdown converter that can output HTML-syntax when required,
for example when center-aligning content or resizing images:

(image: Switching to HTML showcase)

Written completely in TypeScript.
Has many Jest tests, covering many edge-case conversions.

Leave an issue/PR if you can think of more!
For now, it is designed for GFM (GitHub Flavored Markdown).
Try it out at the demo site below!
https://evitanrelta.github.io/htmlarkdown

How is this different?

Switching to HTML-syntax

Whenever elements cannot be represented in markdown-syntax, HTMLarkdown will switch to HTML-syntax:

Input HTML

Output Markdown

<h1>Normal-heading is <strong>boring</strong></h1>

<h1 align="center">
  Centered-heading is <strong>da wae</strong>
</h1>

<p><img src="https://image.src" /></p>

<p><img width="80%" src="https://image.src" /></p>

# Normal-heading is **boring**

<h1 align="center">
  Centered-heading is <b>da wae</b>
</h1>

![](https://image.src)

<img width="80%" src="https://image.src" />

Note: The HTML-switching is controlled by the rules' Rule.toUseHtmlPredicate.
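
To make the idea of such a predicate concrete, here is a minimal sketch. It does not use the library's real types: `ElementLike`, `ToUseHtmlPredicate`, `alignedBlockNeedsHtml` and `resizedImageNeedsHtml` are hypothetical names invented for this illustration, and the real `Rule.toUseHtmlPredicate` may have a different signature.

```typescript
// Hypothetical, simplified stand-in for a DOM element (NOT the library's types).
interface ElementLike {
    tagName: string
    attributes: Record<string, string>
}

// A predicate that decides whether an element must be written in HTML-syntax.
type ToUseHtmlPredicate = (element: ElementLike) => boolean

// Markdown cannot express alignment, so aligned blocks need HTML-syntax.
const alignedBlockNeedsHtml: ToUseHtmlPredicate = (element) =>
    'align' in element.attributes

// Markdown images cannot be resized, so sized images need HTML-syntax.
const resizedImageNeedsHtml: ToUseHtmlPredicate = (element) =>
    'width' in element.attributes || 'height' in element.attributes

const plainHeading: ElementLike = { tagName: 'H1', attributes: {} }
const centeredHeading: ElementLike = {
    tagName: 'H1',
    attributes: { align: 'center' },
}
const resizedImage: ElementLike = {
    tagName: 'IMG',
    attributes: { width: '80%', src: 'https://image.src' },
}

console.log(alignedBlockNeedsHtml(plainHeading)) // false
console.log(alignedBlockNeedsHtml(centeredHeading)) // true
console.log(resizedImageNeedsHtml(resizedImage)) // true
```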

HTMLarkdown tries to use as little HTML-syntax as possible, mixing markdown and HTML if needed:

Input HTML

Output Markdown

<blockquote>
  <p align="center">
    Centered-paragraph
  </p>
  <p>Below is a horizontal-rule in blockquote:</p>
  <hr>
</blockquote>

> <p align="center">
>   Centered-paragraph
> </p>
> Below is a horizontal-rule in blockquote:
> 
> <hr>

Depending on the situation, HTMLarkdown will switch between markdown's backslash-escaping and HTML-escaping:

Input HTML

Output Markdown

<!-- In markdown -->
<p>&lt;TAG&gt;, **NOT BOLD**</p>

<!-- In in-line HTML -->
<p>
  <sup>&lt;TAG&gt;, **NOT BOLD**</sup>
</p>

<!-- In block HTML -->
<p align="center">
  &lt;TAG&gt;, **NOT BOLD**
</p>

\<TAG>, \*\*NOT BOLD\*\*

<sup>\<TAG>, \*\*NOT BOLD\*\*</sup>

<p align="center">
  &lt;TAG>, **NOT BOLD**
</p>
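
The two escaping styles above can be sketched roughly as follows. This is a simplified illustration, not the library's actual implementation: `escapeForMarkdown` and `escapeForBlockHtml` are hypothetical helpers that only handle the characters appearing in the example, whereas a real converter has to cover many more cases.

```typescript
/** Backslash-escape characters that markdown would otherwise interpret. */
function escapeForMarkdown(text: string): string {
    // Only `<` and `*` are handled in this sketch.
    return text.replace(/[<*]/g, (char) => `\\${char}`)
}

/** HTML-escape text that sits inside block HTML, where markdown is not parsed. */
function escapeForBlockHtml(text: string): string {
    // `&` must be replaced first, so the `&` of `&lt;` is not escaped again.
    return text.replace(/&/g, '&amp;').replace(/</g, '&lt;')
}

const sampleText = '<TAG>, **NOT BOLD**'

console.log(escapeForMarkdown(sampleText))
// => \<TAG>, \*\*NOT BOLD\*\*
console.log(escapeForBlockHtml(sampleText))
// => &lt;TAG>, **NOT BOLD**
```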

Handling of edge cases

HTMLarkdown adds separators between adjacent lists to prevent markdown-renderers from combining them:

Input HTML

Output Markdown

<ul>
  <li>List 1 > item 1</li>
  <li>List 1 > item 2</li>
</ul>
<ul>
  <li>List 2 > item 1</li>
  <li>List 2 > item 2</li>
</ul>

- List 1 > item 1
- List 1 > item 2

<!-- LIST_SEPARATOR -->

- List 2 > item 1
- List 2 > item 2
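
As a rough sketch of the idea, the snippet below joins already-converted lists with the separator comment shown above. `joinAdjacentLists` is a hypothetical helper written for this illustration; how the library detects adjacent lists and inserts the separator internally is not shown here.

```typescript
const LIST_SEPARATOR = '<!-- LIST_SEPARATOR -->'

/**
 * Join already-converted markdown lists, placing a separator comment
 * between them so that renderers do not merge them into one list.
 */
function joinAdjacentLists(lists: string[]): string {
    return lists.join(`\n\n${LIST_SEPARATOR}\n\n`)
}

const firstList = '- List 1 > item 1\n- List 1 > item 2'
const secondList = '- List 2 > item 1\n- List 2 > item 2'

console.log(joinAdjacentLists([firstList, secondList]))
```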

And more!
But this section is getting too long so...

Installation

npm install htmlarkdown

Usage

Markdown conversion (either from `Element` or `string`)

import { HTMLarkdown } from 'htmlarkdown'

/** Convert an element! */
const htmlarkdown = new HTMLarkdown()
const container = document.getElementById('container')
console.log(container.outerHTML)
// => '<div id="container"><h1>Heading</h1></div>'
htmlarkdown.convert(container)
// => '# Heading'


/** 
 * Or a HTML string! 
 * Whichever u prefer. It's 2022, I don't judge :^)
 */
const htmlString = `
<h1>Heading</h1>
<p>Paragraph</p>
`
const htmlStrWithContainer = `<div>${htmlString}</div>`
htmlarkdown.convert(htmlString)
// Set 2nd param 'hasContainer' to true, for container-wrapped string.
htmlarkdown.convert(htmlStrWithContainer, true)
// Both output => '# Heading\n\nParagraph'

Note: If an element is given to convert, it's deep-cloned before any processing/conversion.
Thus, you don't have to worry about it mutating the original element :)

Configuring

/** Configure when creating an instance. */
const htmlarkdown = new HTMLarkdown({
    htmlEscapingMode: '&<>',
    maxPrettyTableWidth: Number.POSITIVE_INFINITY,
    addTrailingLinebreak: true
})

/** Or on an existing instance. */
htmlarkdown.options.maxPrettyTableWidth = -1

Plugins

Plugins are of type (htmlarkdown: HTMLarkdown) => void.
They take in a HTMLarkdown instance and configure it by mutating it.

There are 2 plugin-options available in the options object: preloadPlugins and plugins.
The difference is:

preloadPlugins loads the plugins first, before your other options (like "presets").
This allows you to overwrite the plugins' changes:

const enableTrailingLinebreak: Plugin = (htmlarkdown) => {
    htmlarkdown.options.addTrailingLinebreak = true
}
const htmlarkdown = new HTMLarkdown({
    addTrailingLinebreak: false,
    preloadPlugins: [enableTrailingLinebreak],
})
htmlarkdown.options.addTrailingLinebreak // false

plugins loads the plugins after your other options.
Meaning, plugins can overwrite your options.

const enableTrailingLinebreak: Plugin = (htmlarkdown) => {
    htmlarkdown.options.addTrailingLinebreak = true
}
const htmlarkdown = new HTMLarkdown({
    addTrailingLinebreak: false,
    plugins: [enableTrailingLinebreak],
})
htmlarkdown.options.addTrailingLinebreak // true

You can also load plugins on existing instances:

htmlarkdown.loadPlugins([myPlugin])
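
The load order described in this section can be pictured with a small toy model. This is only a sketch under the assumption that the order is "preloadPlugins, then your options, then plugins" as stated above. `ToyConverter`, `ToyOptions` and `ToyPlugin` are hypothetical names and do not come from the library.

```typescript
interface ToyOptions {
    addTrailingLinebreak: boolean
}

type ToyPlugin = (instance: ToyConverter) => void

interface ToyConstructorOptions extends Partial<ToyOptions> {
    preloadPlugins?: ToyPlugin[]
    plugins?: ToyPlugin[]
}

class ToyConverter {
    options: ToyOptions = { addTrailingLinebreak: false }

    constructor({
        preloadPlugins = [],
        plugins = [],
        ...userOptions
    }: ToyConstructorOptions = {}) {
        // 1. Preloaded plugins run first, acting like presets.
        preloadPlugins.forEach((plugin) => plugin(this))
        // 2. The user's own options come next, overwriting the presets.
        Object.assign(this.options, userOptions)
        // 3. Regular plugins run last, so they can overwrite the user's options.
        plugins.forEach((plugin) => plugin(this))
    }
}

const toyEnableTrailingLinebreak: ToyPlugin = (instance) => {
    instance.options.addTrailingLinebreak = true
}

const withPreload = new ToyConverter({
    addTrailingLinebreak: false,
    preloadPlugins: [toyEnableTrailingLinebreak],
})
const withPlugins = new ToyConverter({
    addTrailingLinebreak: false,
    plugins: [toyEnableTrailingLinebreak],
})

console.log(withPreload.options.addTrailingLinebreak) // false
console.log(withPlugins.options.addTrailingLinebreak) // true
```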

Making a copy of an instance

The conversion of a HTMLarkdown instance solely depends on its options property.
Meaning, you can create a copy of an instance like this:

const htmlarkdown = new HTMLarkdown()
const copy = new HTMLarkdown(htmlarkdown.options)

Configuring rules/processes

See the "How it works" section for info on what the rules/processes do.

/**
 * Overwriting default rules/processes.
 * (does NOT include the defaults)
 */
const htmlarkdown = new HTMLarkdown({
    preProcesses: [myPreProcess1, myPreProcess2],
    rules: [myRule1, myRule2],
    textProcesses: [myTextProcess1, myTextProcess2],
    postProcesses: [myPostProcess1, myPostProcess2]
})

/**
 * Adding on to default rules/processes.
 * (includes the defaults)
 */
const htmlarkdown = new HTMLarkdown()
htmlarkdown.addPreProcess(myPreProcess)
htmlarkdown.addRule(myRule)
htmlarkdown.addTextProcess(myTextProcess)
htmlarkdown.addPostProcess(myPostProcess)

How it works

HTMLarkdown has 3 distinct phases:

1. Pre-processing
   The container-element that's received (and deep-cloned) by the convert method is passed consecutively to each PreProcess in options.preProcesses.
2. Conversion
   The pre-processed container-element is then recursively converted to markdown.
   Elements are converted by the Rules in options.rules.
   Text-nodes are converted by the TextProcesses in options.textProcesses.
   The output strings of the rules and text-processes are then appended to each other, to give the raw markdown.
3. Post-processing
   The raw markdown string is then passed consecutively to each PostProcess in options.postProcesses, to give the final markdown.
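
The three phases above can be sketched as a toy pipeline. This is a heavily simplified model and not the library's code: it works on a hypothetical `ToyNode` tree instead of real DOM nodes, and every type, rule and process below is invented for this illustration.

```typescript
// Hypothetical, simplified node types standing in for real DOM nodes.
type ToyNode =
    | { kind: 'element'; tag: string; children: ToyNode[] }
    | { kind: 'text'; text: string }
type ToyElement = Extract<ToyNode, { kind: 'element' }>

type ToyPreProcess = (container: ToyElement) => ToyElement
type ToyRule = { tag: string; toMarkdown: (innerContent: string) => string }
type ToyTextProcess = (text: string) => string
type ToyPostProcess = (markdown: string) => string

const toyPreProcesses: ToyPreProcess[] = [
    // Drop whitespace-only text nodes directly under the container.
    (container) => ({
        ...container,
        children: container.children.filter(
            (child) => child.kind !== 'text' || child.text.trim() !== ''
        ),
    }),
]
const toyRules: ToyRule[] = [
    { tag: 'h1', toMarkdown: (inner) => `# ${inner}\n\n` },
    { tag: 'p', toMarkdown: (inner) => `${inner}\n\n` },
    { tag: 'strong', toMarkdown: (inner) => `**${inner}**` },
]
const toyTextProcesses: ToyTextProcess[] = [
    // Backslash-escape asterisks found in text-nodes.
    (text) => text.replace(/\*/g, '\\*'),
]
const toyPostProcesses: ToyPostProcess[] = [(markdown) => markdown.trim()]

/** Phase 2: recursively convert a node and append the output strings. */
function toyConvertNode(node: ToyNode): string {
    if (node.kind === 'text') {
        return toyTextProcesses.reduce((text, step) => step(text), node.text)
    }
    const tag = node.tag
    const inner = node.children.map(toyConvertNode).join('')
    const rule = toyRules.find((candidate) => candidate.tag === tag)
    // Elements without a rule only pass on their inner content.
    return rule ? rule.toMarkdown(inner) : inner
}

function toyConvert(container: ToyElement): string {
    // Phase 1: pre-processing.
    const preProcessed = toyPreProcesses.reduce((el, step) => step(el), container)
    // Phase 2: conversion to raw markdown.
    const rawMarkdown = toyConvertNode(preProcessed)
    // Phase 3: post-processing.
    return toyPostProcesses.reduce((md, step) => step(md), rawMarkdown)
}

const toyContainer: ToyElement = {
    kind: 'element',
    tag: 'div',
    children: [
        { kind: 'text', text: '\n' },
        { kind: 'element', tag: 'h1', children: [{ kind: 'text', text: 'Heading' }] },
        {
            kind: 'element',
            tag: 'p',
            children: [
                { kind: 'text', text: 'Some ' },
                { kind: 'element', tag: 'strong', children: [{ kind: 'text', text: 'bold' }] },
                { kind: 'text', text: ' text' },
            ],
        },
    ],
}

console.log(toyConvert(toyContainer))
// => '# Heading\n\nSome **bold** text'
```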

Rule-processes flowchart
(image: the general conversion flow of HTMLarkdown)

Contributing

Bugs

HTMLarkdown is still under development, so there'll likely be bugs.

So the easiest way to contribute is to submit an issue (with the bug label), especially for any incorrect markdown-conversions :)

For any incorrect markdown-conversions, state the:

input HTML
current incorrect markdown output
expected markdown output

New conversions, ideas, features, tests

If you have any new element-conversions / ideas / features / tests that you think should be added, leave an issue with the feature or improve label!

feature label is for new features

improve label is for improvements on existing features

Understandably, there are gray areas on what is a "feature" and what is an "improvement". So just go with whichever seems more appropriate :)

Other markdown specs

Currently, HTMLarkdown has been designed to output markdown for GitHub specifically (ie. GFM).
BUT, if there's another markdown spec. that you'd like to design for (maybe as a plugin?), do leave an issue/discussion :D

Coding-related stuff

Code-formatting is handled by Prettier, so no need to worry bout it :)

Any new feature should

be documented via TSDoc
come with new unit-tests for it
and should pass all new/existing tests

As for which merging method to use, check out the discussion.

Contributors

So far it's just me, so pls send help! :^)

Roadmap

If you've any new ideas / features, check out the Contributing section for it!

Element conversions

Block-elements:

Headings (For now, only ATX-style)
Paragraph
Codeblock
Blockquote
Lists
(ordered, unordered, tight and loose)
(GFM) Table
(GFM) Task-list

(Below are some planned block-elements that don't have markdown-equivalent)
<span> (handled by a noop-rule)
<div> (For now, handled by a noop-rule)
Definition list (ie. <dl>, <dt>, <dd>)
Collapsible section (ie. <details>)

Text-formattings:

Bold (For now, only outputs in asterisks **BOLD**)
Italic (For now, only outputs in asterisks *ITALIC*)
(GFM) ~~Strikethrough~~
Code
Link (For now, only inline links)
^Superscript (ie. <sup>)
_Subscript (ie. <sub>)
Underline (ie. <u>, <ins>)
(didn't know underlines possible till recently)

Misc:

Images (For now, only inline links)
Horizontal-rule (ie. <hr>)
Linebreaks (ie. <br>)
Preserved HTML comments (Issue #25)

Features to be added:

Custom id attributes

Go to [section with id](#my-section)

<p id="my-section">
  My section
</p>

Reversing GitHub's Issue/PR autolinks

Input HTML	Output Markdown
<p> Issue autolink: <a href="https://github.com/user/repo/issues/7">#7</a> </p>	Issue autolink: #7
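
Since this feature is still only planned, the following is just one possible sketch of how such a link could be detected. `toIssueAutolink` is a hypothetical helper, not part of the library, and it only covers same-repo links of the shape shown in the table above.

```typescript
/**
 * Returns the autolink text (eg. '#7') if the link looks like a GitHub
 * Issue/PR autolink, or null if it should stay a normal markdown link.
 */
function toIssueAutolink(href: string, linkText: string): string | null {
    const match =
        /^https:\/\/github\.com\/[^/]+\/[^/]+\/(?:issues|pull)\/(\d+)$/.exec(href)
    if (match === null) return null
    const autolink = `#${match[1]}`
    // Only reverse the link when its text is exactly the autolink text.
    return linkText.trim() === autolink ? autolink : null
}

console.log(toIssueAutolink('https://github.com/user/repo/issues/7', '#7'))
// => '#7'
console.log(toIssueAutolink('https://github.com/user/repo/issues/7', 'this issue'))
// => null
```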

Ability to customise how codeblock's syntax-highlighting language is obtained from the <pre><code> elements

noop-rule:
They only pass-on their converted inner-contents to their parents.
They themselves don't have any markdown conversions, not even in HTML-syntax.

License

The MIT License (MIT).
So it's freeeeeee
