Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The URL_RE doesn't work if the font URL contains a ")" close paren character #296

Open
alex-sherwin opened this issue Nov 19, 2023 · 4 comments
Labels
bug Something isn't working

Comments

@alex-sherwin
Copy link

alex-sherwin commented Nov 19, 2023

🐛 The bug

There are some packages, for example, the inter-ui NPM package (which is recommended by the official inter GH readme for NPM) which, very unfortunately, have parens in some of the folder names.

For example from the inter-ui package the fonts are under folders with parens in their names.

Example font file: inter-ui/Inter (web)/Inter-Regular.woff2

The URL_RE that fontaine is using is defined as

const URL_RE = createRegExp(
  exactly("url(").and(charNotIn(")").times.any().as("url")).and(")"),
  ["g"]
);

So no matter what it breaks with these URLs.

The problem is that source URL would get regex'd out as inter-ui/Inter (web and then the next check is to see if the source ends with one of the supported extensions (and since the regex mangled the URL, the extension was lost).

🛠️ To reproduce

https://stackblitz.com/edit/github-hdufeg?file=index.css

🌈 Expected behaviour

Parsing of the source url should support any valid URL characters

ℹ️ Additional context

No response

@alex-sherwin alex-sherwin added the bug Something isn't working label Nov 19, 2023
@alex-sherwin
Copy link
Author

alex-sherwin commented Nov 20, 2023

MDN has some details about the flexible varaitions that the CSS url() supports.

Based on the description and some googling, I'm not sure that this formal spec can be succinctly represented by a single regex.

Nearly all answers and examples for this specific parsing task that I can find use a naive variation of what this library is already doing.

@alex-sherwin
Copy link
Author

I’ve played with bringing in lightningcss and using a url visitor to aggregate all valid urls in a font face rule instead of the URL_RE behavior

I believe it works well, except, it will behave differently for the existing test case of invalid syntax on the family name when the quotes aren’t closed properly.

I’ll look to open a PR and possibly hide it behind a feature flag

@alex-sherwin
Copy link
Author

Using lightningcss to parse the @font-face rules definitely works, however, it would be in a stricter, more standards compliant way.

This means potential behavioral differences to how this library currently behaves, which is using some relatively naive regexes that make it usually do the right thing even when it's presented with relatively or very invalid @font-face rule definitions....

Happy to pursue this avenue if there's interest, otherwise, maybe a URL_RE update with a better regex (but still imperfect) would be sufficient

@alex-sherwin
Copy link
Author

alex-sherwin commented Nov 20, 2023

This may be a good regex to use for the URL_RE in lieu of using a full blown CSS AST parser:

url\((?:(?:"((?:\\.|[^"\\])*)")|(?:'((?:\\.|[^'\\])*)')|((?:[^'")\\]*(?:\\.[^'")\\]*)*)))\)

Seems to work reasonably well and accounts for most of the ways a valid CSS url() can be specified, including unquoted URLs with escaped parens, or quoted URLs with parens (no escaping needed, as per spec)

Works with the following input URLs

url("https://somesite.com/static/ok.ttf")
url("http://somesite.com/static/ok.ttf")
url("http://www.somesite.com:8080/static/ok.woff2")
url("https://www.somesite.com:443/static/ok.woff2")
url("https://ok.)woff")
url('https://somesite.com/static/ok.ttf')
url('http://somesite.com/static/ok.ttf')
url('http://www.somesite.com:8080/static/ok.woff2')
url('https://www.somesite.com:443/static/ok.woff2')
url('https://ok.)woff')
url(https://somesite.com/static/ok.ttf)
url(http://somesite.com/static/ok.ttf)
url(http://www.somesite.com:8080/static/ok.woff2)
url(https://www.somesite.com:443/static/ok.woff2)
url(https://ok.\)woff)
url(ok)
url("ok")
url('ok')
url('is )ok.woff')
url(what\ ablut)
url(isthis\)ok.woff)
url(../node_modules/some-font/with\ \(parens\)/ok.woff)
url("../node_modules/some-font-with (parens)/ok.woff")
url('../node_modules/some-font-with (parens)/ok.woff')
url(data:image/gif;base64,R0lGODlhEAAQAMQAAORHHOVSKudfOulrSOp3WOyDZu6QdvCchPGolfO0o\as+fa==)

See https://regex101.com/r/7eVGaW/2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant