Forum Discussion

Frank_30530
Feb 09, 2018

HTTP::path -normalized (TMOS 13) issue?

According to the Wiki, HTTP::path -normalized should do the following:

The normalization involves lower-casing, removing unnecessary directory traversals, and converting from Microsoft-style %uxxxx form to the standard %xx hex form; bytes not allowed in a URI are normalized to their percent-encoded representation, and bytes that are percent-encoded when they don't need to be are changed to their normal representation.

My experience (TMOS 13.1) is that:

  • Conversion to lower case does not work at all!
  • "%2E%2E" is converted to ".." (OK)
  • "/foo/../bar" is converted to "/bar" (OK)

However (FAIL):

  • "/foo/%2E%2E/bar" is (only) converted to "/foo/../bar".

It seems that directory traversal normalization is (incorrectly) performed before %-normalization. I would expect %-normalization to be done first, followed by directory traversal normalization.
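
The effect of the ordering can be illustrated outside BIG-IP with a small plain-Tcl sketch. The two procs below, decode_unreserved and remove_dot_segments, are hypothetical helpers written only for this illustration; they are simplified and are not F5's implementation. Running them in both orders on the failing path gives the expected result in one case and the observed result in the other.

```tcl
# Hypothetical helper: decode %XX only when it encodes an unreserved
# character (letters, digits, "-", ".", "_", "~"); keep everything else as-is.
proc decode_unreserved {path} {
    set out ""
    set len [string length $path]
    set i 0
    while {$i < $len} {
        set ch  [string index $path $i]
        set hex [string range $path [expr {$i + 1}] [expr {$i + 2}]]
        if {$ch eq "%" && [regexp {^[0-9A-Fa-f]{2}$} $hex]} {
            scan $hex %x code
            set dec [format %c $code]
            if {[regexp {^[A-Za-z0-9._~-]$} $dec]} {
                append out $dec
            } else {
                append out "%" $hex
            }
            incr i 3
        } else {
            append out $ch
            incr i
        }
    }
    return $out
}

# Hypothetical helper: resolve "." and ".." segments (simplified).
proc remove_dot_segments {path} {
    set stack {}
    foreach seg [split $path "/"] {
        if {$seg eq ".."} {
            # Drop the previous segment, but never the leading empty one.
            if {[llength $stack] > 1} {
                set stack [lrange $stack 0 end-1]
            }
        } elseif {$seg ne "."} {
            lappend stack $seg
        }
    }
    set result [join $stack "/"]
    if {$result eq "" && [string index $path 0] eq "/"} {
        return "/"
    }
    return $result
}

set raw "/foo/%2E%2E/bar"
# Decode first, then resolve traversals: prints /bar
puts [remove_dot_segments [decode_unreserved $raw]]
# Resolve traversals first, then decode: prints /foo/../bar
puts [decode_unreserved [remove_dot_segments $raw]]
```

The second order matches what I see on TMOS 13.1, which is why a second normalization pass is needed.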

So, in order to perform normalization correctly, I need to do this (I need to retain the original URI in the request to the pool member):

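  # Remember the original path so it can be restored afterwards
  set origPath [HTTP::path]
  # First pass: normalize, and lower-case explicitly
  set normPath [string tolower [HTTP::path -normalized]]
  # Write the result back so a second pass can resolve the decoded ".."
  HTTP::path $normPath
  set normPath [HTTP::path -normalized]
  # Restore the original path for the request to the pool member
  HTTP::path $origPath
  log local0. "Path  : \"$origPath\" --> \"$normPath\""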

I would have expected this to be sufficient, though:

  set normPath [HTTP::path -normalized]
  log local0. "Path  : \"[HTTP::path]\" --> \"$normPath\""

Is there a more elegant way to perform normalization on the URI/PATH?

And can this be fixed, please?

4 Replies

  • JG

    Apparently the system goes through one pass only, and two passes seem to be good enough for your situation.

    It would be better for the system to use a loop and break out when the output equals the input.

    You might want to open a support case so that F5 developers are aware of this issue.
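
    Until that is done in the system itself, the same loop can be sketched in an iRule. This is only a sketch built on the two-pass workaround from the original post: it assumes, as that workaround does, that HTTP::path returns the rewritten value within the same event, and the cap of 5 passes is an arbitrary safety limit chosen for illustration.

    ```tcl
    when HTTP_REQUEST {
        # Keep the original path so the pool member still sees it unchanged.
        set origPath [HTTP::path]
        set prev $origPath
        # Repeat normalization until the output equals the input.
        # The cap of 5 passes is an assumed safety limit, not an F5 value.
        for {set pass 0} {$pass < 5} {incr pass} {
            set normPath [HTTP::path -normalized]
            if {$normPath eq $prev} { break }
            HTTP::path $normPath
            set prev $normPath
        }
        # Restore the original path before the request is forwarded.
        HTTP::path $origPath
        log local0. "Path  : \"$origPath\" --> \"$normPath\""
    }
    ```

    For "/foo/%2E%2E/bar" this should stop on the third pass, once "/bar" comes back unchanged.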

     

  • As an addendum to the response from Jie:

    The canonical description of what normalization does is:

    1) The case of the scheme is normalized to lower-case.

    2) The hostname is normalized to lower-case. A possible trailing '.' is removed from the hostname.

    3) IP addresses are normalized. There are many ways to write an IP address (v4 and v6). These are canonicalized to a normal form.

    4) Ports are normalized. Unnecessary port annotations (like :80 on an http request) are removed.

    5) Paths are normalized, with extra unneeded directory traversals removed.

    6) Percent-encoding normalization. Bytes not allowed in a URI are normalized to their percent-encoded representation. Bytes percent-encoded when they don't need to be are changed to their normal representation. We convert from Microsoft-style %uxxxx form to the standard %xx hex form.

    7) Fragment specifiers are removed.

    So HTTP::uri -normalized and HTTP::path -normalized should not change the case of a URI.

    The documentation for HTTP::uri is probably clearer.

     

  • I got confirmation from F5 support that lower-casing is neither intended nor performed, and that the Wiki documentation (HTTP::path -normalized) is in error.

    I also got a bug ID concerning the order of "directory traversal normalization" and "%-normalization":

    Bug ID 708068 (HTTP: normalize directory traversal after percent-encoding)

    I have requested that this be added to the known issues section of the Wiki (HTTP::path).

     

    • JG

      Thanks for reporting back.