Lapis is using the wrong request uri from ngx #706
Comments
I seem to remember |
From what that SO article notes, it would seem that the two major differences between |
Doing a bit more research, it appears that nginx as a whole prefers normalized URIs, citing security reasons among others. While this would be a breaking change, I think switching to |
Going beyond nginx and reading the specification for URIs (https://tools.ietf.org/html/rfc3986) in general, I think there have been some misunderstandings about what a URI actually is, based on our conversations on Discord over the past few days.
What is a URI?
https://tools.ietf.org/html/rfc3986#section-1
The spec defines a URI as a "simple and extensible means of identifying a resource", also noting that it has specific syntax and semantics that must be followed. Because it has syntax and semantics, a URI is not just a [unique] string such as the key in a key/value pair, but an actual path to a resource, relative or absolute.
https://tools.ietf.org/html/rfc3986#section-3
Most of what I'll be writing about and quoting will be about the
What is a URI's Path?
https://tools.ietf.org/html/rfc3986#section-3.3
This definition notes specifically that a URI's path is usually organized in a "hierarchical form". This is analogous to (but not necessarily equivalent to) a structured filesystem. Each segment in a path is typically thought of as a nested layer of directories, where the left-most segment encapsulates all segments to the right of it. Similar to a filesystem, we'd expect something akin to:
If we look at a typical blog-style path, we may see something like
Relative path segments such as
URI Equivalency
A significant part of our discussion was based around the notion that one URI should point to one resource, and aliases should not exist at all. This isn't necessarily wrong, but I believe the misunderstanding here is focused on what constitutes an alias.
https://tools.ietf.org/html/rfc3986#section-6.1
https://tools.ietf.org/html/rfc3986#section-6.2.1
https://tools.ietf.org/html/rfc3986#section-6.2.2
https://tools.ietf.org/html/rfc3986#section-6.2.2.2
https://tools.ietf.org/html/rfc3986#section-6.2.2.3
https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Generic_syntax
All of the rules above talk about URI references and normalization. URI references, as far as I can determine, are the raw URIs that come into a system and must be normalized to produce the URI target. This normalization process should be applied to every URI reference at its destination. We can think of this as similar to encoding/decoding JSON: the requesting client encodes JSON as the final step before sending the request, and the responding server decodes it as the first step after it is received. Because of this, reference URIs are not aliases; they are just raw data waiting to be normalized. It is only after normalization that we can determine if they are aliases.
Though I cannot find it specifically in the spec document, Wikipedia states that a path segment can be empty and thus many slashes can be adjacent to each other. Because of the semantics of path segments, these empty segments provide no information, hierarchical or otherwise, and can safely be compressed down to a single slash.
For URI normalization, the correct steps are as follows:
Note that these are the rules that are applied to the |
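As a rough illustration of the normalizations covered by the RFC sections linked above (percent-encoding normalization in 6.2.2.2 and path segment normalization in 6.2.2.3), here is a minimal Python sketch. It is not nginx's or Lapis's implementation: the function names (normalize_path and friends) are hypothetical, it only handles absolute paths, and the order of the steps is an assumption. Collapsing repeated slashes is the step argued for in this comment rather than something the RFC requires.

```python
import re

# Unreserved characters per RFC 3986 section 2.3.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)


def normalize_percent_encoding(path):
    """Decode escapes of unreserved characters; uppercase the hex of the rest."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        # Reserved (or non-ASCII) bytes stay encoded, e.g. %2f becomes %2F.
        return "%" + match.group(1).upper()

    return re.sub(r"%([0-9A-Fa-f]{2})", fix, path)


def collapse_slashes(path):
    """Assumption from this comment: empty segments carry no information."""
    return re.sub(r"/{2,}", "/", path)


def remove_dot_segments(path):
    """Simplified form of RFC 3986 section 5.2.4 for absolute paths only."""
    segments = path.split("/")[1:]  # drop the empty string before the leading "/"
    out = []
    for index, segment in enumerate(segments):
        is_last = index == len(segments) - 1
        if segment == ".":
            if is_last:
                out.append("")  # keep the trailing slash
        elif segment == "..":
            if out:
                out.pop()
            if is_last:
                out.append("")  # keep the trailing slash
        else:
            out.append(segment)
    return "/" + "/".join(out)


def normalize_path(path):
    """Hypothetical pipeline: decode, collapse slashes, then resolve dots."""
    path = normalize_percent_encoding(path)
    path = collapse_slashes(path)
    return remove_dot_segments(path)


print(normalize_path("/a/%7Euser/./b/../c//d%2fe"))  # -> /a/~user/c/d%2Fe
```

Note that the encoded slash (%2f) stays encoded in this sketch, because "/" is a reserved character and decoding it would change how the path splits into segments.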
ngx.var.uri normalization is stupid and wrong. It does not take into account the reserved characters. |
lapis/lapis/nginx.moon
Lines 77 to 89 in d3a16d1
Lapis is checking ngx.var.request_uri for the request's URI, but this variable only points to the main request and uses a raw URI string, which has some potential downsides (see the SO article below). Instead, Lapis should use ngx.var.uri, which is the current request URI, including any sub-requests. It is also normalized, so any funky issues with the URI, including any encoding, are removed.
As of right now, using ngx.location.capture() to make sub-requests is wholly broken and unusable, as it just loops the main request until the server throws a 500 error.
https://stackoverflow.com/questions/48708361/nginx-request-uri-vs-uri