
Formatting HTML

Krzysiek Madejski edited this page May 17, 2018 · 3 revisions

DIndent

Indents HTML using regular expressions, with no extra dependencies. Apart from the indentation rules, it gives nicer output than HTML Tidy.

https://github.com/gajus/dindent

$indenter = new \Gajus\Dindent\Indenter();
$from = $indenter->indent($from);
$to = $indenter->indent($to);

Tidy

Devoted to cleaning up HTML code: closing tags, adding doctypes, etc. Formatting is only one of about a dozen features.

Leaves many notices in $tidy->errorBuffer that describe how well-formed the HTML is. This could be useful for giving more information about government websites.

Written in C; a PHP extension is available.
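One way to turn those notices into something reportable is to count them by severity. The following is a minimal sketch, not part of this project: `summarizeTidyMessages` is a hypothetical helper, and it assumes each notice follows the `line N column M - Severity: message` layout that Tidy typically emits. The sample buffer is illustrative only; with the extension loaded you would pass `(string) $tidy->errorBuffer` instead.

```php
<?php
// Hypothetical helper (not from the wiki): counts Tidy notices by severity.
// Assumes each line looks like "line N column M - Severity: message";
// lines that do not match this layout are skipped.
function summarizeTidyMessages(string $errorBuffer): array
{
    $counts = [];
    // Split on any newline style and drop empty lines
    foreach (preg_split('/\R/', $errorBuffer, -1, PREG_SPLIT_NO_EMPTY) as $line) {
        if (preg_match('/^line \d+ column \d+ - (\w+): /', $line, $m)) {
            $counts[$m[1]] = ($counts[$m[1]] ?? 0) + 1;
        }
    }
    return $counts;
}

// Illustrative sample, standing in for (string) $tidy->errorBuffer
$sample = "line 1 column 1 - Warning: missing <!DOCTYPE> declaration\n"
        . "line 1 column 1 - Warning: inserting missing 'title' element\n"
        . "line 3 column 5 - Error: <foo> is not recognized!";

$summary = summarizeTidyMessages($sample);
```

The resulting array maps each severity label to its count, which could be stored alongside each page revision as a rough well-formedness indicator.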

http://www.html-tidy.org/

Installation: sudo apt install libtidy-dev libtidy5 php-tidy

            $tidy_config = array(
                'output-html' => true,
                'markup' => true,
                'indent' => true,

                'drop-empty-elements' => false,
                'drop-empty-paras' => false,
                'merge-divs' => false,
                'merge-spans' => false
                // TODO do we want to show non-important changes?
            );

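            // Clean and re-indent the old version of the page
            $tidy = new \tidy();
            $tidy->parseString($from, $tidy_config, 'UTF8');
            $tidy->cleanRepair();
            $from = \tidy_get_output($tidy);

            // Apply the same configuration to the new version so both sides are comparable
            $tidy = new \tidy();
            $tidy->parseString($to, $tidy_config, 'UTF8');
            $tidy->cleanRepair();
            $to = \tidy_get_output($tidy);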

Performance

For the main KRS page, DIndent was 10x slower than Tidy, but for long pages Tidy's DOM building will be quite demanding.
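The page does not record how the timing was measured. A comparison like this could be reproduced with a small helper such as the sketch below. `averageSeconds` is a hypothetical name, and the stand-in formatter exists only so the snippet runs without either library installed; in practice you would pass closures wrapping the DIndent and Tidy calls shown above.

```php
<?php
// Hypothetical helper: average wall-clock seconds per call of a formatter.
function averageSeconds(callable $formatter, string $html, int $runs = 20): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $formatter($html);
    }
    return (microtime(true) - $start) / $runs;
}

// Stand-in formatter (an assumption for illustration); replace it with a
// closure that calls $indenter->indent($html) or the Tidy sequence.
$standIn = function (string $html): string {
    return trim($html);
};

$seconds = averageSeconds($standIn, '<div><p>test</p></div>');
```

Running both formatters on the same short and long pages would show whether the 10x gap holds as input size grows.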
