Cross Site Scripting (XSS)
Cross Site Scripting attacks target the users of a web site: the attacker injects code into the web site, and the victim's browser executes it. Some XSS attacks are purely mischievous, but some can be quite malicious and cause financial damage to users.
XSS attacks generally fall into three categories:
Reflective Attacks
In reflective attacks (also known as non-persistent attacks), the attacker tricks the user into clicking on a link that has the attack embedded in it, usually as an HTTP GET variable but sometimes as a forged HTML form.
The vulnerable web application then incorporates the script into the page that loads in the user's browser, which then executes the script.
For a basic example, suppose a web application has a simple form that asks for a user's name and then says hello to the user. The vulnerable code might look like this:
$string = '<p>Hello ' . $_GET['name'] . ', Welcome to my site!</p>';
echo $string;
The attacker can pass a malicious script as part of the name variable in the link they send the user, and the script then gets embedded directly into the paragraph that the web application sends to the user.
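To make the attack concrete, the sketch below wraps the vulnerable line in a hypothetical function, greet(), and feeds it the kind of value an attacker might place in the link. The function name and the payload are illustrative assumptions, not code from any real application.

```php
<?php
// Hypothetical helper that mirrors the vulnerable code above:
// the name is concatenated into the markup with no escaping.
function greet(string $name): string
{
    return '<p>Hello ' . $name . ', Welcome to my site!</p>';
}

// Value an attacker might embed in the link as the "name" variable.
$payload = '<script>alert(1)</script>';

// The script element reaches the page intact, so the browser would run it.
$page = greet($payload);
echo $page, "\n";
```

The echoed paragraph contains a live script element, which is exactly what the victim's browser would receive.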
There are several ways the web application can sanitize the variable before it is used. It can pass the string through a series of regular expressions, through the strip_tags() function, or through HTML Purifier.
The problem is that attackers can be quite clever at finding ways to dodge the filters web applications use.
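As one illustration of a dodged filter, the sketch below assumes an application that calls strip_tags() but allows the b tag for basic formatting. The input string is a made-up example. strip_tags() does not touch the attributes of tags it allows, so an event handler attribute passes straight through.

```php
<?php
// Made-up input: an "allowed" tag carrying an event handler, plus a script tag.
$input = '<b onmouseover="alert(1)">hover me</b><script>alert(2)</script>';

// Allow <b> only. strip_tags() removes the script tags themselves,
// but it leaves attributes on allowed tags untouched.
$filtered = strip_tags($input, '<b>');

echo $filtered, "\n";
```

The script tags are gone, but the onmouseover attribute survives and would still run in the browser.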
The DOMDocument Solution
DOMDocument is a class based on libxml that allows manipulation of XML as an object using the Document Object Model (DOM). I fell in love with the class when I first found it because it just made so much logical sense to create a document as an object that can have every part manipulated right up until it is sent. The traditional method of echoing and printing individual parts of a page, including tags and attributes, as the script runs in a linear fashion just seemed broken.
A side effect of using DOMDocument for web application development is that it is virtually immune to this kind of attack. Take the following code example:
$dom = new DOMDocument('1.0', 'UTF-8');
$string = 'Hello ' . $_GET['name'] . ', Welcome to my site!';
$xmlPar = $dom->createElement('p', $string);
Even though there was no attempt to sanitize the name variable, no script embedded within that variable will be executed. Some filtering is still necessary: with no filtering, the form processing can break if the variable contains a named entity that is not natively declared in XML (for example, &nbsp;). Insertion of a script node, however, simply will not work. The script just becomes part of the text child node of the paragraph node, and the DOMDocument class escapes it properly when it creates the paragraph node.
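The behaviour can be checked with a short sketch. The $name value stands in for $_GET['name'], and the wrapping div element is an assumption made only so the paragraphs have a parent. The second half shows one possible way around the ampersand problem, using createTextNode(), which treats its argument as literal text.

```php
<?php
// Stand-in for $_GET['name'] carrying an attacker-supplied script.
$name = '<script>alert(1)</script>';

$dom = new DOMDocument('1.0', 'UTF-8');
$body = $dom->createElement('div');
$dom->appendChild($body);

// The value becomes a text child of the paragraph, not markup.
$xmlPar = $dom->createElement('p', 'Hello ' . $name . ', Welcome to my site!');
$body->appendChild($xmlPar);

// Serializing the paragraph shows the angle brackets escaped.
$escaped = $dom->saveXML($xmlPar);
echo $escaped, "\n";

// createTextNode() takes literal text, so a bare "&" is serialized as "&amp;".
$safePar = $dom->createElement('p');
$safePar->appendChild($dom->createTextNode('Fish & chips'));
$body->appendChild($safePar);
$withAmp = $dom->saveXML($safePar);
echo $withAmp, "\n";
```

The serialized paragraph carries the script only as escaped text, so a browser displays it rather than running it.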
Persistent Attacks
With persistent attacks, the malicious code is injected into the server’s data store and then served to any user who requests a page that uses that particular piece of data. There are three typical vectors by which this takes place:
- SQL Compromise
- User Account Compromise
- Vulnerable Form (including CSRF)
Since DOMBlogger uses prepared statements, using an attack against the database to inject malicious code is unlikely to work.
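For readers unfamiliar with prepared statements, here is a minimal sketch using PDO with an in-memory SQLite database. The comments table and the hostile input are hypothetical, and this is not DOMBlogger's actual database code. It assumes the pdo_sqlite extension is available.

```php
<?php
// In-memory SQLite database so the sketch needs no external server.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)');

// Input that tries to break out of the SQL string and add a second statement.
$input = "nice post'); DROP TABLE comments; --";

// The placeholder keeps the input as data; it is never parsed as SQL.
$stmt = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
$stmt->execute([':body' => $input]);

// The table still exists and holds the input verbatim.
$stored = $pdo->query('SELECT body FROM comments')->fetchColumn();
echo $stored, "\n";
```

The prepared statement stops the input from altering the query. It does not clean the stored text, so that text still has to be escaped when it is written into a page.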
With respect to user accounts, unless you are really hot you probably will not find me personally visiting your house or office to see if you have your password taped under your keyboard or to the back of your monitor. BTW, if I hit the nail on the head, that just goes to show you how common that practice is. DOMBlogger does, however, take some precautions to protect user accounts; see the Session Security page.
The third vector is a potential issue. Not all data can be added in the safe fashion described above, where nodes are created one at a time. Some cases allow more complex data to be entered, which is then imported into the DOM along with its child nodes.
HTML Purifier and Tidy
In those cases, DOMBlogger uses third-party filter utilities to sanitize the XML or HTML before it is imported into the DOM. Specifically, the data passes through HTML Purifier to remove script injection and through the PHP wrapper to libtidy in XML mode to ensure tag sanity.
The server that hosts domains for DOMBlogger.net manages its HTML Purifier install through PEAR, so checking that it is using the current version of HTML Purifier (and updating when it is not) is easy to do.
Even when malicious code is successfully injected into the DOMBlogger CMS, there is a very good chance it will be removed before the content is actually sent to the user by the server side Content Security Policy filter.
DOM Attacks
DOM-based attacks are a newer type of XSS and do not attack vulnerabilities in the server software itself. They attack the DOM in the web page, often in a manner similar to reflective XSS attacks.
In a nutshell, many modern web applications do a lot of their programming within the JavaScript environment of the client. The attacker crafts a URL that exploits a vulnerability in the client-side JavaScript, causing that JavaScript to take an action that was not intended.
This type of attack needs to be combated through a policy of best coding practices.
document.location
The most commonly referenced example of this attack takes advantage of document.location for the payload. A dummy variable is used to trigger client side action. For example, a web page address might look like:
http://www.example.tld/Welcome.html?name=Joe
The server does not do anything with the GET variable; it ignores it. What looks like a GET variable is there for JavaScript running within the page itself to extract and use in modifying the DOM.
The attacker tricks people into clicking on a link that looks like:
http://www.example.tld/Welcome.html?name=<script>...
The server never processes the payload; the JavaScript in the page does.
Proper filtering within the JavaScript would neuter the attack, but in the above example the real problem is that the web developer misused the URL. A URL exists to identify the host and path to a resource; using it to pass input to JavaScript is an abuse, and one that should be avoided when writing client-side JavaScript.
There is rarely a true need for JavaScript to look at what is in document.location and modify the DOM based on its contents. With the exception of anchor references (the # and what follows), everything in a web browser's URL is known to the server when the page is served, and any actions based on a URL should be handled server side.
Since browsers do not send the contents of an anchor reference to the server, if your JavaScript needs to know the anchor then you will need to get it from document.location, but that is the only case I can think of.
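A minimal sketch of handling the name variable on the server instead might look like the following. The function name welcomeParagraph(), the whitelist pattern, and the guest fallback are all assumptions chosen for illustration; a real application would pick rules that fit its own data.

```php
<?php
// Hypothetical server-side replacement for the client-side greeting.
function welcomeParagraph(string $name): string
{
    // Illustrative whitelist: letters, spaces, apostrophes and hyphens,
    // between 1 and 40 characters. Anything else falls back to "guest".
    if (!preg_match('/^[\p{L} \'-]{1,40}\z/u', $name)) {
        $name = 'guest';
    }
    // Escape on output as a second line of defence.
    $safe = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
    return '<p>Hello ' . $safe . ', Welcome to my site!</p>';
}

echo welcomeParagraph('Joe'), "\n";
echo welcomeParagraph('<script>alert(1)</script>'), "\n";
```

With the greeting built on the server, the page's JavaScript has no reason to read the query string at all.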
XMLHttpRequest
Ajax would not be possible without XMLHttpRequest. It allows JavaScript to send requests to a server and receive a response, which can then be incorporated into the document DOM.
Since it modifies the DOM, it is a potential source of DOM-based attacks. If the data received from the server is tainted with a payload, then the payload can be injected into the user's web page.
This is largely mitigated at the browser level by the Same Origin Policy. In a nutshell, XMLHttpRequest is only allowed to make requests to the same origin that the web page came from.
The DOM cannot be modified by data that does not come from the same protocol, host, and port as the original HTML document.
Unfortunately, there are many workarounds that give similar functionality in a cross-domain context (often referred to as XDM, not to be confused with the X Display Manager).
While there are legitimate uses for these workarounds, they are dangerous. DOMBlogger does not implement any of them and sets a Content Security Policy that forbids them. Any requests for data within JavaScript will be made to the same protocol, domain, and port the web page was served from.
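As a rough idea of what a same-origin-only policy can look like, the sketch below assembles a Content-Security-Policy header in PHP. The buildCsp() helper and the choice of directives are assumptions for illustration, not DOMBlogger's actual policy.

```php
<?php
// Hypothetical helper that assembles a same-origin-only policy string.
function buildCsp(): string
{
    $directives = [
        "default-src 'self'",  // same-origin fallback for every resource type
        "script-src 'self'",   // no inline or third-party scripts
        "connect-src 'self'",  // XMLHttpRequest may only contact the page's origin
    ];
    return implode('; ', $directives);
}

// Send the header only when running under a web server.
if (PHP_SAPI !== 'cli') {
    header('Content-Security-Policy: ' . buildCsp());
}
echo buildCsp(), "\n";
```

In a policy, 'self' matches the protocol, host, and port the document was served from, which lines up with the same-origin restriction described above.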