Some Awk scripts to generate documentation from Markdown-formatted comments in source code.
The d.awk script creates documentation for languages that use /* */
for multiline comments, like C, C++, Java, C#, JavaScript.
The file hashd.awk does the same, but for languages that use # symbols for comments, like Perl, Python, Ruby, and others.
For example, add a comment like this to your source file:
/**
 * My Project
 * ==========
 *
 * This is some _Markdown documentation_ in a `source
 * file`.
 */
int main(int argc, char *argv[]) {
    printf("hello, world");
    return 0;
}
Then use Awk to run the d.awk script on it like so:
# Run the script on a file:
./d.awk file.c > doc.html
# alternatively: awk -f d.awk file.c > doc.html
The text within the /** */ comment blocks is parsed as Markdown and rendered as HTML. Comments may also start with three slashes: /// Markdown here.
A typical use case is to bundle the d.awk script with your project's source and then add a docs target to the Makefile:
docs: api-doc.html
api-doc.html: header.h d.awk
	$(AWK) -f d.awk $< > $@
The script can also generate HTML from a normal Markdown document using the -v Clean=1 command-line option:
./d.awk -v Clean=1 README.md > README.html
There are additional scripts in the distribution:
- hashd.awk - Like d.awk, but for languages that use # symbols for comments.
- mdown.awk - Generates HTML from a normal Markdown file.
- xtract.awk - Extracts the Markdown comments of a source file.
- wrap.awk - Formats a Markdown text file to fit on a page.
It supports most of Markdown:
- Bold, italic and monospaced text.
- Both header styles
- Horizontal rules
- Ordered and Unordered lists
- Code blocks and block quotes
- Hyperlinks and images
- A large number of HTML tags can be embedded in a document
- The output has a dark mode toggle.
It also supports a number of extensions, mostly based on GitHub syntax:
- Fenced ```-style code blocks
  - You can specify a language according to GitHub's Syntax Highlighting rules, for example ```java
  - It uses Google's code-prettify library for the syntax highlighting.
  - This causes the generated HTML to pull in a third-party script. It can be disabled by specifying -vPretty=0 on the command line.
- Tables, using the same syntax as GitHub-flavoured markdown.
- Mermaid diagrams are supported through the same ```mermaid syntax as in GitHub-flavoured markdown.
  - This causes the generated HTML to pull in a third-party script. It can be disabled by specifying -vMermaid=0 on the command line.
- MathJax support for rendering mathematical expressions, using the same syntax as GitHub-flavoured markdown.
  - This causes the generated HTML to pull in a third-party script. It can be disabled by specifying -vMathjax=0 on the command line.
- [x] GitHub-style task lists
- MultiMarkdown-style footnotes and abbreviations.
- Backslash at the end of a line forces a line break.
- There is a special \![toc] mode that generates a Table of Contents automatically.
The file demo.c in the distribution serves as an example, user guide and test at the same time.
d.awk is inspired by the Javadoc and Doxygen tools, which generate HTML documentation from comments in source code.
It is meant for programming languages like C, C++ or JavaScript that use the /* */ syntax for comments (it will work with Java and C#, though the existence of bundled documentation tools for those languages makes it redundant).
It has two distinguishing features:
Firstly, it is written in the ubiquitous Awk language. You can distribute the d.awk script with your project's source code and your users will be able to generate documentation without requiring additional third-party tools.
Secondly, the documentation uses Markdown for text formatting, which has several advantages:
- It is well known and widely used.
- It reads easily and won't clutter your code comments with markup tokens.
The included Makefile demonstrates what the different scripts in the repository are and how they're meant to be used.
Comments must start with /**, and each line in the comment must start with a *. This lets you control which comments are included in the documentation.
To generate documentation from a file demo.c, run the d.awk script on it like so:
./d.awk demo.c > doc.html
Or to use it in clean mode, which treats the input file as a normal Markdown file:
./d.awk -v Clean=1 README.md > doc.html
The file demo.c in the distribution provides a demonstration of all the features and the supported syntax.
Configuration options can be set in the BEGIN block of the script, or passed to the script through Awk's -v command-line option:
- -v Title="My Document Title" to set the <title/> of the HTML.
- -v Clean=1 to treat the input file as a normal Markdown file. Use this option to create HTML documents from your project's README.md and related files.
- -v StyleSheet=style.css to use a separate file as style sheet.
- -v TopLinks=1 to have links to the top of the document next to headers.
- -v Pretty=0 to disable syntax highlighting. By default a ```lang-style block causes the generated HTML to pull in Google's code-prettify library to syntax highlight the block in the language lang. This switch disables that functionality.
- -vMermaid=0 to disable Mermaid diagrams.
- -vMathjax=0 to disable MathJax mathematical expression rendering.
- -v HideToCLevel=n specifies the level of the Table of Contents that should be collapsed by default. For example, a value of 3 means that headers above level 3 will be collapsed in the Table of Contents initially.
- -v classic_underscore=1 makes words_with_underscores behave as in old Markdown, where the underscores inside a word count as emphasis. The default behaviour is for words_like_this not to contain any emphasis.
The stylesheet for the output HTML can also be modified at the bottom of the script.
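As a rough illustration of how a BEGIN-block default and a -v override can interact, here is a minimal sketch. It is a tiny stand-alone Awk program wrapped in a shell function, not d.awk itself; the function name render_title and the fallback title "Documentation" are assumptions made up for this example.

```shell
# Sketch only: mimics the "default in BEGIN, override with -v" pattern.
# render_title and the "Documentation" fallback are hypothetical.
render_title() {
    # "$@" forwards any -v assignments to awk
    awk "$@" 'BEGIN {
        # -v assignments are applied before BEGIN runs, so only
        # fill in a default when the variable is still empty.
        if (Title == "") Title = "Documentation"
        print "<title>" Title "</title>"
    }'
}

render_title                          # falls back to the default
render_title -v Title="My Project"    # overridden from the command line
```

Because -v assignments are processed before the BEGIN block executes, a value given on the command line wins over the default without editing the script.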
hashd.awk is like d.awk, but generates documentation for programming languages that use # symbols for comments.
For example, to generate an HTML file from the comments at the top of the d.awk script, use this command:
./hashd.awk d.awk > d.awk.html
The first comment must start with two # symbols. The following is an example in Python:
##
# My Project
# ==========
#
# This is some _Markdown documentation_ in a `source
# file`.
#
print("Hello, World!")
If you have a language that uses a different symbol for comments, you can use this file and modify the regular expressions at the top to match your language's comment syntax.
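To give an idea of what such an adaptation might look like, here is a minimal stand-alone sketch for a language with -- line comments (such as Lua or SQL). It is not hashd.awk's actual code, and the convention it assumes (a documentation block opens with a line containing only --- and continues on lines starting with --) is hypothetical, as is the function name extract_dash_docs. It only extracts the comment text and does not render Markdown.

```shell
# Sketch only: extracts documentation comments for a "--" comment syntax.
# The "---" opener convention is an assumption made for this example.
extract_dash_docs() {
    awk '
    /^[ \t]*---[ \t]*$/ { indoc = 1; next }    # block opener
    indoc && /^[ \t]*--/ {
        sub(/^[ \t]*--[ \t]?/, "")             # strip the comment marker
        print
        next
    }
    { indoc = 0 }                              # any other line ends the block
    ' "$@"
}

extract_dash_docs <<'EOF'
---
-- My Project
-- ==========
print("Hello, World!")
-- an ordinary comment, not part of the documentation
EOF
```

The last comment is skipped because it does not follow a --- opener, which mirrors how the double ## opener lets you control which comments end up in the documentation.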
mdown.awk creates an HTML document from a Markdown file.
It is functionally equivalent to using d.awk with the -v Clean=1 command-line option.
For example, to generate HTML from this README.md file, type:
./mdown.awk README.md > README.html
The command-line options are the same as d.awk's.
xtract.awk extracts the comments from a source file, without processing them as Markdown.
./xtract.awk demo.c > demo.md
A use case is to extract the comments from a source file into a new Markdown document, such as a GitHub wiki page.
wrap.awk makes a Markdown document more readable by word wrapping long lines to fit into 80 characters.
For example, to use it on this README.md file, run:
cp README.md README.md~
./wrap.awk README.md~ > README.md
To specify a different width, use -v Width=60 from the command line.
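For readers curious how width-limited wrapping can be done in Awk, here is a minimal greedy sketch. It is not wrap.awk's actual code: it wraps each input line independently and ignores Markdown structure such as lists, headers and code blocks. The function name wrap_text is hypothetical; the Width variable name is borrowed from the option above.

```shell
# Sketch only: a greedy word wrapper, not the real wrap.awk.
wrap_text() {
    # $1 is the maximum line width; the text is read from stdin
    awk -v Width="$1" '
    {
        line = ""
        for (i = 1; i <= NF; i++) {
            if (line == "") {
                line = $i                       # first word on the line
            } else if (length(line) + 1 + length($i) <= Width) {
                line = line " " $i              # the word still fits
            } else {
                print line                      # flush and start a new line
                line = $i
            }
        }
        print line                              # blank lines pass through
    }'
}

echo "the quick brown fox jumps over the lazy dog" | wrap_text 15
```

A word longer than the width is left on a line of its own rather than being split.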
The license is officially the MIT-0 license (see the file LICENSE for details), but the individual scripts may be redistributed with this notice:
(c) 2016-2023 Werner Stoop
Copying and distribution of this file, with or without modification,
are permitted in any medium without royalty provided the copyright
notice and this notice are preserved. This file is offered as-is,
without any warranty.
The reasoning is that if you're just using one of the scripts in this repository to create documentation for your projects then I'd like for you to be able to include the script in your project without worries.
- https://en.wikipedia.org/wiki/AWK
- https://en.wikipedia.org/wiki/Markdown
- https://tools.ietf.org/html/rfc7764
- http://daringfireball.net/projects/markdown/syntax
- https://guides.github.com/features/mastering-markdown/
- http://fletcher.github.io/MultiMarkdown-4/syntax
- http://spec.commonmark.org
r-lyeh's stddoc.c also generates HTML documentation from Markdown comments in source code, but takes a very different approach to achieve it: It simply extracts the comments, and appends Markdeep's tags to the output.
Here is an Awk script that more or less achieves the same thing:
#! /usr/bin/awk -f
BEGIN { print "<meta charset=\"utf-8\">" }
/\/\*\*/ {
    sub(/^.*\/\*/,"");
    incomment=1;
}
incomment && /\*\// {
    incomment=0;
    sub(/[[:space:]]*\*\/.*/,"");
    sub(/^[[:space:]]*\*[[:space:]]?/,"");
    print
}
incomment && /^[[:space:]]*\*/ {
    sub(/^[[:space:]]*\*[[:space:]]?/,"");
    print
}
!incomment && /\/\/\// {
    sub(/.*\/\/\/[[:space:]]?/,"");
    print
}
END {
    print "<!-- Markdeep: -->";
    print "<style class=\"fallback\">body{visibility:hidden;white-space:pre;font-family:monospace}</style>";
    print "<script>markdeepOptions={tocStyle:\"auto\"};</script>";
    print "<script src=\"https://morgan3d.github.io/markdeep/latest/markdeep.min.js\" charset=\"utf-8\"></script>";
    print "<script>window.alreadyProcessedMarkdeep||(document.body.style.visibility=\"visible\")</script>"
}
Markdeep has significantly more features than d.awk, but the tradeoff is that it has some incompatibilities with GitHub-flavoured Markdown and it requires the markdeep.js file to be distributed with the documentation.
There is also TeXMe as an alternative to Markdeep.
yiyus' md2html.awk is an Awk script that generates HTML from Markdown with a much cleaner parser. I only discovered it long after I wrote my own Markdown parser.
Things I'd like to add in the future:
- wrap.awk adds too much whitespace to code blocks...
- It is known to not work with versions of mawk prior to 1.3.4 (the default Awk on Raspbian as of this writing is version 1.3.3). Please upgrade mawk, or use Gawk instead.
- The table of contents is in a <div> that ends up inside a <p>, which is incorrect.
- Google's code-prettify library is no longer maintained. I've been looking towards highlightjs as an alternative, but haven't made a decision yet.
- The Mermaid styles don't change when dark mode is toggled; fixing this turned out to be surprisingly difficult.