diff --git a/README.md b/README.md
index b1ac107..d5506ef 100644
--- a/README.md
+++ b/README.md
@@ -32,24 +32,39 @@ to care about it.
 
 this shiz aims to do the same but for HTML, such that:
 
->
-X
+> X
 
 becomes:
 
-> <html><head><meta_httpequiv=utf8></meta>
-
-- a
-- b
-- c
-- d
-- e
-- f
-- g
-- h
-- i
-- j
-- k
-- l
+> <
+> html
+> >
+> <
+> head
+> >
+> <
+> meta
+> _
+> http
+> equiv
+> =
+> utf
+> 8
+> >
+> </
+> meta
+> >
+> a
+> b
+> c
+> d
+> e
+> f
+> g
+> h
+> i
+> j
+> k
+> l
 
 tokenizers for generation need to be able to decode reversibly,