Fully Decode URI

Problem this snippet solves:

This sample decodes a URI repeatedly, until decoding no longer changes it, before inspection.

When checking a URI for special characters (quotes, percent signs, spaces, etc.), make sure the URI has been fully decoded first. A single pass may not be enough, because the characters that make up an encoded sequence can themselves be encoded. Decoding therefore has to be repeated until the result stops changing.
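As a small illustration of why one pass is not enough, the sketch below decodes a double-encoded value twice. The URI string is a made-up example, not taken from real traffic; only URI::decode and log are used, as in the iRules on this page.

```tcl
when HTTP_REQUEST {
   # Hypothetical double-encoded input: %25 is an encoded percent sign,
   # so %2527 hides an encoded single quote (%27).
   set encoded "/search?q=%2527"

   # First pass turns %25 into %, leaving /search?q=%27
   set once [URI::decode $encoded]

   # Second pass turns %27 into a single quote
   set twice [URI::decode $once]

   log local0. "Encoded: $encoded, one pass: $once, two passes: $twice"
}
```

A check for quotes run against $once would miss the quote that only appears in $twice.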

How to use this snippet:

when HTTP_REQUEST {
  # decode original URI.
  set tmpUri [HTTP::uri]
  set uri [URI::decode $tmpUri]

  # repeat decoding until the decoded version equals the previous value.
  while { $uri ne $tmpUri } {
    set tmpUri $uri
    set uri [URI::decode $tmpUri]
  }
  # log before rewriting, so that [HTTP::uri] still returns the original value.
  log local0. "Original URI: [HTTP::uri]"
  log local0. "Fully decoded URI: $uri"

  HTTP::uri $uri
}

Notes

Note that the unbounded while loop in the iRule under "How to use this snippet" may be dangerous. Due to a flaw in URI::decode (ID 337562, fixed in 10.2.1-HF1 and 11.0.0), for some URIs $uri will never equal $tmpUri, so the loop never terminates. The variables keep growing in length, eventually leading to a TMM core. Capping the number of loop iterations is a wise idea.

Implementation Details

This iRule requires LTM v10.0 or higher.

Code:

when HTTP_REQUEST {

   # Decode original URI
   set tmpUri [HTTP::uri]
   set uri [URI::decode $tmpUri]

   # On versions affected by bug ID 337562 (see Notes), set a max number
   # of iterations so that the loop cannot run forever
   set max 4
   set cnt 0

   # Repeat decoding until the decoded version equals the previous value
   # or max iterations is reached
   while { $uri ne $tmpUri && $cnt < $max } {
      set tmpUri $uri
      set uri [URI::decode $tmpUri]
      incr cnt
   }
   # Log before rewriting, so that [HTTP::uri] still returns the original value
   log local0. "Original URI: [HTTP::uri]"
   log local0. "Fully decoded URI: $uri"

   HTTP::uri $uri
}
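Once the URI is fully decoded, it can be inspected. The following is a minimal sketch of one way to do that, reusing the capped loop above. The blocking policy (answering with a 403 when a quote is found, or when the URI is still changing after the cap is reached) is an assumption made for illustration and is not part of the original snippet. This variant inspects the decoded value without rewriting the request URI.

```tcl
when HTTP_REQUEST {
   # Capped decode loop, as in the iRule above
   set tmpUri [HTTP::uri]
   set uri [URI::decode $tmpUri]
   set max 4
   set cnt 0
   while { $uri ne $tmpUri && $cnt < $max } {
      set tmpUri $uri
      set uri [URI::decode $tmpUri]
      incr cnt
   }

   # Assumed policy, for illustration only
   if { $uri ne $tmpUri } {
      # The cap was reached while the value was still changing,
      # so the URI may not be fully decoded
      log local0. "URI not fully decoded after $max extra passes: [HTTP::uri]"
      HTTP::respond 403 content "Forbidden"
   } elseif { ($uri contains "'") || ($uri contains "\"") } {
      # The fully decoded URI contains a quote character
      log local0. "Blocked URI containing a quote: $uri"
      HTTP::respond 403 content "Forbidden"
   }
}
```

Treating a URI that is still changing at the cap as suspicious avoids silently inspecting a partially decoded value.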

Tested this on version:

10.0
Published Mar 17, 2015
Version 1.0

1 Comment

  • Hi,

    I know that URI::decode decodes percent-encoding. Is there a way to decode other encoding mechanisms as well, such as UTF-8?