Commit 7b40373

Handle mixed case void tags (fixes snoyberg#167)
1 parent 9d7c4bd commit 7b40373

5 files changed: +27 -3 lines changed

html-conduit/ChangeLog.md

Lines changed: 4 additions & 0 deletions
@@ -1,5 +1,9 @@
 # ChangeLog for `html-conduit`
 
+## 1.3.2.2
+
+* Fix handling of mixed case void tags [#167](https://github.com/snoyberg/xml/issues/167)
+
 ## 1.3.2.1
 
 * Allow xml-conduit 1.9

html-conduit/html-conduit.cabal

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 Name: html-conduit
-Version: 1.3.2.1
+Version: 1.3.2.2
 Synopsis: Parse HTML documents using xml-conduit datatypes.
 Description: This package uses tagstream-conduit for its parser. It automatically balances mismatched tags, so that there shouldn't be any parse failures. It does not handle a full HTML document rendering, such as adding missing html and head tags. Note that, since version 1.3.1, it uses an inlined copy of tagstream-conduit with entity decoding bugfixes applied.
 Homepage: https://github.com/snoyberg/xml

html-conduit/src/Text/HTML/DOM.hs

Lines changed: 4 additions & 1 deletion
@@ -77,7 +77,10 @@ eventConduit' =
 toName l = XT.Name l Nothing Nothing
 closeStack = mapM_ (yield . XT.EventEndElement)
 
-isVoid = flip Set.member $ Set.fromList
+isVoid name = Set.member (T.toLower name) voidSet
+
+voidSet :: Set.Set T.Text
+voidSet = Set.fromList
 [ "area"
 , "base"
 , "br"
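
The hunk above replaces an exact-match set lookup with one that lowercases the tag name first. The following is a minimal sketch of that difference, not the library's actual module: `voidSet` is cut down to the three entries visible in the hunk, and `isVoidExact` is a hypothetical name introduced here to stand for the pre-fix behaviour. It assumes the `text` and `containers` packages are available and can be loaded into GHCi.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Set as Set
import qualified Data.Text as T

-- | Lowercase void element names. Abbreviated for illustration:
-- only the three entries visible in the diff hunk are listed here.
voidSet :: Set.Set T.Text
voidSet = Set.fromList ["area", "base", "br"]

-- | Hypothetical stand-in for the pre-fix lookup: exact,
-- case-sensitive membership, so "bR" and "BR" are not found.
isVoidExact :: T.Text -> Bool
isVoidExact name = Set.member name voidSet

-- | Lookup as written in the commit: normalise the name to
-- lowercase before checking membership.
isVoid :: T.Text -> Bool
isVoid name = Set.member (T.toLower name) voidSet
```

With these definitions, `isVoidExact "bR"` is `False` while `isVoid "bR"` and `isVoid "BR"` are both `True`. Only the lookup key is lowercased; the name passed in is left alone, which is consistent with the new test expecting elements named `bR` and `BR` in the parsed document.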

html-conduit/test/main.hs

Lines changed: 17 additions & 0 deletions
@@ -97,6 +97,23 @@ main = hspec $ do
 ]
 ]
 in H.parseLBS html @?= doc
+it "Mixed case br #167" $
+let html = "<html><head><title>foo</title></head><body><bR><p>Hello World</p><BR>done</body></html>"
+doc = X.Document (X.Prologue [] Nothing []) root []
+root = X.Element "html" Map.empty
+[ X.NodeElement $ X.Element "head" Map.empty
+[ X.NodeElement $ X.Element "title" Map.empty
+[X.NodeContent "foo"]
+]
+, X.NodeElement $ X.Element "body" Map.empty
+[ X.NodeElement $ X.Element "bR" Map.empty []
+, X.NodeElement $ X.Element "p" Map.empty
+[X.NodeContent "Hello World"]
+, X.NodeElement $ X.Element "BR" Map.empty []
+, X.NodeContent "done"
+]
+]
+in H.parseLBS html @?= doc
 
 it "doesn't double unescape" $
 let html = "<p>Hello &amp;gt; World</p>"

stack.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
+resolver: lts-18.5
 packages:
 - xml-conduit/
 - xml-hamlet/
 - html-conduit/
-resolver: lts-17.4
