Untangling the nginx location block matching algorithm

I'm new to nginx, and I'm trying to understand, in a human, rational way, how the location block is chosen for a given request. I have not found the documentation very enlightening.

'location' blocks match the path of the request URL in various ways: exact matches, prefix matches, regex matches. You can find a description of the algorithm used by nginx, but it's not very human-friendly, so here I've tried to explain it in simpler terms.

List of nginx location match priorities

The first match in this order will be chosen and processed:

  1. An exact string match: location = /foo
  2. The longest matching prefix, if that location is marked ^~: location ^~ /foo (a shorter ^~ match does not win when a longer plain prefix also matches)
  3. The first regex match that is nested within the single longest matching prefix location! See discussion below.
  4. The first other regex match, in config file order: location ~ regex
  5. The longest prefix match: location /foo
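
As a rough illustration of the list above, here is a minimal sketch with one location of each top-level kind. The port, the paths and the response strings are made up for the example, and return is used instead of try_files only so that each block identifies itself in the response.

```nginx
# Fragment of the http block; everything here is a hypothetical example.
server {
    listen 8080;  # assumed port, not from the article

    location = /foo      { return 200 "exact"; }         # priority 1
    location ^~ /static/ { return 200 "caret prefix"; }  # priority 2
    location ~ \.png$    { return 200 "regex"; }         # priority 4
    location /foo        { return 200 "plain prefix"; }  # priority 5
}

# Which block answers which request:
#   /foo           -> "exact"
#   /static/a.png  -> "caret prefix" (longest prefix is ^~, so regexes are skipped)
#   /foo/a.png     -> "regex"        (longest prefix /foo is plain, so regexes are checked)
#   /foobar        -> "plain prefix" (no regex matches)
```

Requesting each path with a tool such as curl against a test server is a quick way to check the order for yourself.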

NB: regex (regular expression) matches can be case sensitive (~) or case insensitive (~*). Within their priority group it is simply the first match, in config file order, that counts; case-sensitive ones don't win out over case-insensitive ones.
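
A small sketch of that point, using made-up patterns and response strings. Both regexes match a request for /logo.png, and the one listed first is chosen, whichever kind it is.

```nginx
# Hypothetical example: the first matching regex wins, regardless of case sensitivity.
location ~* \.PNG$ { return 200 "case-insensitive, listed first"; }
location ~  \.png$ { return 200 "case-sensitive, listed second"; }

# /logo.png -> "case-insensitive, listed first"
```

Swapping the two lines would make the case-sensitive block answer /logo.png instead.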

Nested location block processing

The outer level is considered first and one of the outer locations is chosen. Any locations nested within the chosen one are then matched. The third item above is the exception to this.
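
Setting that exception aside for a moment, here is a minimal sketch of the plain two-step selection with nested prefix locations. The paths and response strings are invented for the example. Note that the inner location is written with the full request path rather than a path relative to its parent.

```nginx
# Hypothetical example of an outer prefix location with a nested prefix location.
location /foo {
    # The inner location repeats the full path; it is not written as "/bar".
    location /foo/bar { return 200 "inner"; }

    return 200 "outer";
}

# /foo/bar/x -> the outer level picks /foo, then the nested /foo/bar -> "inner"
# /foo/baz   -> the outer level picks /foo, no nested location matches -> "outer"
```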

Nested location blocks containing regexes

The third item above is quite peculiar and quite specific, and it took me a while to understand. Take the following config. How will it respond to a request for /foo?

location     /foo {
  location ~ /foo { try_files /a.html /error.html; }
}
location     /fo  {
  location ~ /foo { try_files /b.html /error.html; }
}
location ~ /foo { try_files /c.html /error.html; }

Both prefix locations match, with the first one being the longest. All the regexes would match.

Nginx will respond with a.html because that regex is within the longest matching prefix location. This is useful because it means we can have global regexes that are overridden, or pushed down the priority list, for certain path prefixes.
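
A sketch of how that override might be used in practice. The /uploads/ prefix, the .php pattern and the responses are assumptions chosen for illustration, not something from a real setup.

```nginx
# Global regex on the outer level (hypothetical rule).
location ~ \.php$ { return 200 "global php rule"; }

# For requests under /uploads/, the nested regex is tried before the global one.
location /uploads/ {
    location ~ \.php$ { return 403; }
}

# /index.php     -> "global php rule"
# /uploads/x.php -> 403
```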

However, it's very specific. What page will be served by this example?

location     /foo {
  location ~ /foox { try_files /a.html /error.html; }
}
location     /fo  {
  location ~ /foo { try_files /b.html /error.html; }
}
location ~ /foo { try_files /c.html /error.html; }

The answer is c.html, despite the fact that location /fo matches and so does the regex nested inside it, which would return b.html. That does not happen because only the regexes in the longest prefix match (/foo) are ever considered. Its nested regex, /foox, does not match, so the un-nested regex on the outer level is chosen.

To make the point even clearer, take this final example, where a matching regex sits inside a matching prefix location but will never be selected:

location     /foo {
  location ~ /foox { try_files /a.html /error.html; }
}
location     /fo  {
  location ~ /foo { try_files /b.html /error.html; }
}
location ~ /foox { try_files /c.html /error.html; }

The result is a 404, not b.html. Following my priority list above will tell you why: on the outer level there are no exact matches and no ^~ matches; there are no matching regexes within the longest matching prefix (/foo); and there are no matching regexes on the outer level. So the match goes to the longest matching prefix block, /foo, but this does not provide a response for the given request, so it's a 404.

So the quirk is that regexes within the longest matching prefix block are processed ahead of the regexes on less-nested locations, while regexes nested within any shorter matching prefix block are never considered at all.

Comments

There should be a note somewhere that for nested location blocks, the inner urls are not relative to outer urls.

Murage replied on

Murage - yes, you're right, that's a helpful contribution, thanks!

Rich replied on
