My first post on HTTP Parameter Pollution has been read by more than 1,500 people, and several other security portals have blogged about it (e.g., Security-Shell, PenTestIT, Dark Reading, ToolsWatch, Packet Storm and Security Focus). So far, PAPAS, our online HPP scanning service, has received 78 submissions and 1,388 unique visits.
I am happy to see that the security community seems to have some interest in our work. I thought it would be a good idea to write more on how our detection tool works. The paper is actually quite complete and comprehensive, but people are generally busy (or lazy ;)) and do not have too much time to go through all the details of a scientific paper. In the next couple of blog posts, I will explain the architecture and the algorithm we designed to detect HPP flaws in web applications.
PAPAS consists of four main components: an instrumented browser, a crawler, and two scanners.
The second component is a crawler that communicates with the browser through a bidirectional channel. The crawler uses this channel to tell the browser which URLs to visit and which forms to submit, and to retrieve the information the browser has collected.
To increase the depth to which a website can be scanned, the instrumented browser in PAPAS uses a number of simple heuristics to fill in forms automatically. For example, random 8-character alphanumeric values are inserted into password fields, and a default e-mail address is inserted into fields named email, e-mail, or mail.
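To make the two heuristics above concrete, here is a minimal Python sketch. It is not the PAPAS code: the function names, the `DEFAULT_EMAIL` value, and the case-insensitive match on the field name are assumptions made for illustration.

```python
import random
import string

# Hypothetical default address; the post does not say which one PAPAS uses.
DEFAULT_EMAIL = "user@example.com"
EMAIL_FIELD_NAMES = {"email", "e-mail", "mail"}


def random_alnum(length=8):
    """Return a random alphanumeric string of the given length."""
    alphabet = string.ascii_letters + string.digits
    return "".join(random.choice(alphabet) for _ in range(length))


def fill_field(name, field_type):
    """Pick a value for one form field, or None if no heuristic applies."""
    if field_type == "password":
        return random_alnum(8)
    # Assumption: field names are compared case-insensitively.
    if name.lower() in EMAIL_FIELD_NAMES:
        return DEFAULT_EMAIL
    return None
```

A crawler loop could call `fill_field(name, field_type)` for every input of a form and fall back to the field's default value whenever it returns `None`.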
For sites where the authenticated section needs to be scanned, the crawler can be given a regular expression that prevents it from visiting the log-out page (e.g., by excluding links that include the cmd=logout parameter). You can find this feature in the online service under the name “exclude regexp”.
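The effect of an exclude regexp can be sketched in a few lines of Python. The `filter_links` helper and the example URLs are hypothetical, and I assume here that the expression is searched anywhere in the URL, which may differ from how the online service matches it.

```python
import re


def filter_links(links, exclude_regexp=None):
    """Drop every link that matches the exclude regexp (if one is given)."""
    if not exclude_regexp:
        return list(links)
    pattern = re.compile(exclude_regexp)
    # Assumption: a match anywhere in the URL excludes the link.
    return [url for url in links if not pattern.search(url)]


links = [
    "http://example.com/index.php?cmd=profile",
    "http://example.com/index.php?cmd=logout",
]
to_visit = filter_links(links, r"cmd=logout")
```

With the expression `cmd=logout`, only the profile link remains in `to_visit`, so the crawler keeps its authenticated session.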
Every time the crawler visits a page, it passes the extracted information to the two scanners for analysis. The Parameter Precedence Scanner (P-Scan) is responsible for determining how the page behaves when it receives two parameters with the same name. The Vulnerability Scanner (V-Scan), in contrast, is responsible for testing the page to determine whether it is vulnerable to HPP attacks. V-Scan does this by attempting to inject a new parameter inside one of the existing ones and analyzing the output. The two scanners are written in Python and communicate with the instrumented browser over TCP/IP sockets.
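As a rough illustration of the two kinds of requests described above, the following Python sketch builds a URL with a duplicated parameter (the P-Scan idea) and a URL in which an encoded extra parameter is appended to an existing value (the V-Scan idea). The function names, the `foo=bar` payload, and the naive reflection check are my assumptions for this example; the analysis the real scanners perform is more involved.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def duplicate_parameter(url, name, new_value):
    """Append a second occurrence of `name` to the query string."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    pairs.append((name, new_value))
    return urlunsplit(parts._replace(query=urlencode(pairs)))


def inject_parameter(url, name, inj_name="foo", inj_value="bar"):
    """Append a URL-encoded '&foo=bar' to the value of parameter `name`."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    payload = "&%s=%s" % (inj_name, inj_value)
    pairs = [(k, v + payload if k == name else v) for k, v in pairs]
    # urlencode percent-encodes '&' and '=' inside the value.
    return urlunsplit(parts._replace(query=urlencode(pairs)))


def reflected_decoded(html, inj_name="foo", inj_value="bar"):
    """Naive stand-in check: does the decoded payload show up in the page?"""
    plain = "&%s=%s" % (inj_name, inj_value)
    escaped = "&amp;%s=%s" % (inj_name, inj_value)
    return plain in html or escaped in html


probe_dup = duplicate_parameter("http://example.com/page?id=1", "id", "2")
probe_inj = inject_parameter("http://example.com/page?id=1", "id")
```

Here `probe_dup` carries `id` twice, while `probe_inj` carries a single `id` whose value ends in `%26foo%3Dbar`. If the page later emits a link that contains the decoded `&foo=bar`, that is a hint that the parameter may be injectable.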
In the next post, I will go into the details of how these two scanners work. Thanks for your interest and see you next week!