This post is part of the series 'Vulnerabilities'. Be sure to check out the rest of the blog posts of the series!
Many applications use XML files. For example, this blog has RSS and Atom feeds that are XML documents, applications can communicate using SOAP, and XML serialization is widely used. However, most applications use only a subset of XML features. Exploring the full specification reveals potential vulnerabilities when processing XML documents. Let's look at some of them and see how to prevent them in a .NET application.

#Denial of service (DoS) using custom XML entities
There are well-known entities in XML, such as &lt;, which is translated to <. You can also define custom entities:
XML
<?xml version="1.0" ?>
<!DOCTYPE samples [
<!ENTITY name "value">
]>
<test>&name;</test> <!-- Evaluated to <test>value</test> -->
You can also define entities that reference other entities:
XML
<?xml version="1.0" ?>
<!DOCTYPE samples [
<!ENTITY name "value">
<!ENTITY name2 "&name; &name;">
]>
<test>&name2;</test> <!-- Evaluated to <test>value value</test> -->
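Now, let's consider this small document (less than 1kB), a variant of the attack known as "billion laughs". Each entity references the previous one ten times, so the expanded text grows tenfold at every level. As written, with eight nested levels (lol2 to lol9), &lol9; expands to a string composed of 100 million "lol", about 300MB of text. The classic form of the attack uses one more level, which yields 1 billion "lol" and requires about 3GB of memory to store the expanded string.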
XML
<?xml version="1.0"?>
<!DOCTYPE lolz [
<!ENTITY lol "lol">
<!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
<!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
<!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
<!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
<!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
<!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
<!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
<!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>
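To see how quickly nested entities grow, here is a small back-of-the-envelope sketch. `ExpandedCopies` is a hypothetical helper, not part of any XML API. It assumes that every level references the previous one the same number of times.

```csharp
using System;

// Hypothetical helper: number of copies of the base entity produced by the
// outermost entity when each of `levels` nested entities references the
// previous one `fanOut` times.
static long ExpandedCopies(int levels, int fanOut)
{
    long copies = 1;
    for (int i = 0; i < levels; i++)
    {
        copies = checked(copies * fanOut); // throw on overflow instead of wrapping
    }
    return copies;
}

const int baseLength = 3; // length of "lol"

// The listing above nests eight levels (lol2 to lol9), ten references each.
long eightLevels = ExpandedCopies(levels: 8, fanOut: 10);
Console.WriteLine($"{eightLevels} copies, {eightLevels * baseLength} characters");

// One more level multiplies the result by ten again.
long nineLevels = ExpandedCopies(levels: 9, fanOut: 10);
Console.WriteLine($"{nineLevels} copies, {nineLevels * baseLength} characters");
```

Each extra level adds a single short line to the document but multiplies the expanded size by the number of references, which is why such a small file can exhaust memory.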
Another variation is the Quadratic Blowup attack. While not as effective as the previous attack, it defeats parsers that block nested entity expansion. Instead of defining multiple small, deeply nested entities, it defines one very large entity and refers to it many times, so the expanded size is the entity length multiplied by the number of references. A document of about 200kB can expand up to 2.5GB.
XML
<?xml version="1.0"?>
<!DOCTYPE QuadraticBlowup [
<!ENTITY a "aaaaaaaaaaaaaaaaaa...">
]>
<QuadraticBlowup>&a;&a;&a;&a;&a;&a;&a;&a;&a;...</QuadraticBlowup>
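The figures above can be reproduced with a quick calculation. The entity length and the number of references below are assumptions, chosen only because they match the quoted sizes. They are not taken from a real payload.

```csharp
using System;

// Assumed values that reproduce the sizes quoted above.
const long entityLength = 50_000;   // characters in the single large entity
const long referenceCount = 50_000; // number of "&a;" references in the body
const long referenceLength = 3;     // length of "&a;"

// Size of the document sent by the attacker (ignoring the few bytes of markup).
long documentSize = entityLength + referenceCount * referenceLength;

// Size of the text once every reference is replaced by the entity value.
long expandedSize = entityLength * referenceCount;

Console.WriteLine($"document: {documentSize} characters");
Console.WriteLine($"expanded: {expandedSize} characters");
```

Doubling both values only doubles the document size but multiplies the expanded size by four, hence the name of the attack.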
Entities can also reference external resources, which gives access to local or remote data. An attacker could use this to read the content of a local file or an internal URL. This is known as an XML External Entity (XXE) attack.
XML
<!DOCTYPE doc [
<!ENTITY localfile SYSTEM "c:\test.txt">
<!ENTITY remotefile SYSTEM "https://sample/">
]>
<doc>&localfile;</doc>
#Remote code execution using msxsl
XSLT is a language for transforming XML documents. What a stylesheet can do depends on the underlying engine. For instance, the XSLT engine of the .NET Framework supports the msxsl:script extension element, which lets a stylesheet run C# or JScript code during the transformation, giving an attacker access to local or remote resources.
XML
<!-- full code (XML + C#): https://gist.github.com/meziantou/2c377432b178ebab17a6802f189adce7 -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:msxsl="urn:schemas-microsoft-com:xslt"
                xmlns:user="http://dummy/ns">
  <msxsl:script language="C#" implements-prefix="user">
    <![CDATA[
    public string CustomCode()
    {
        return DateTime.Now.ToString();
    }
    ]]>
  </msxsl:script>
  <xsl:template match="/">
    <xsl:value-of select="user:CustomCode()"/>
  </xsl:template>
</xsl:stylesheet>
You can also copy the content of an existing XML file using document(...):
XML
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:copy-of select="document('file.xml')" />
  </xsl:template>
</xsl:stylesheet>
XSLT 2.0 introduced unparsed-text, which reads a non-XML file and returns its content as a string. With an engine that supports XSLT 2.0, a stylesheet can therefore read any file the process has access to. The XSLT processors built into .NET only implement XSLT 1.0, so this function is not available there.
XML
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:copy-of select="unparsed-text('file.txt')" />
  </xsl:template>
</xsl:stylesheet>
#Detect vulnerable XSLT engines
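XSLT also provides the system-property function, for example <xsl:value-of select="system-property('xsl:product-name')" />. Using this, you can gather information about the tool that processes the file and potentially detect a vulnerable engine. The xsl:version, xsl:vendor, and xsl:vendor-url properties exist in XSLT 1.0; xsl:product-name and xsl:product-version were added in XSLT 2.0.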
XML
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <xsl:value-of select="system-property('xsl:version')" />
    <xsl:value-of select="system-property('xsl:vendor')" />
    <xsl:value-of select="system-property('xsl:vendor-url')" />
    <xsl:value-of select="system-property('xsl:product-name')" />
    <xsl:value-of select="system-property('xsl:product-version')" />
  </xsl:template>
</xsl:stylesheet>
#How to prevent XML vulnerabilities in .NET
In summary, parsing XML/XSLT files can expose your application to:
- Denial of service
- Information disclosure
- Remote code execution
In .NET, there are many ways to process an XML document: XmlReader, XmlDocument, XDocument, XslCompiledTransform. Most of them are safe by default since .NET Framework 4 (some types, such as XmlDocument and XmlTextReader, only became safe by default in .NET Framework 4.5.2), but the default values could change in the future. I strongly recommend testing your code against each of these attacks.
Here are some settings to protect your code:
C#
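using System.Xml;

var readerSettings = new XmlReaderSettings()
{
    // Prohibit rejects any document that contains a DTD; Ignore skips the DTD without processing it
    DtdProcessing = DtdProcessing.Prohibit,
    XmlResolver = null, // Do not allow opening external resources
};
// Dispose the reader to release the underlying file
using var reader = XmlReader.Create("file.xml", readerSettings);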
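The same settings can protect the other APIs, because XDocument and XmlDocument both accept an existing XmlReader. The following is a minimal sketch: `CreateSafeReader`, `LoadXDocument`, and `LoadXmlDocument` are hypothetical helper names, and the XML is read from a string to keep the example self-contained.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper: builds a reader that rejects any DTD and never
// resolves external resources.
static XmlReader CreateSafeReader(TextReader input)
{
    var settings = new XmlReaderSettings
    {
        DtdProcessing = DtdProcessing.Prohibit,
        XmlResolver = null,
    };
    return XmlReader.Create(input, settings);
}

// LINQ to XML: XDocument.Load accepts an XmlReader.
static XDocument LoadXDocument(string xml)
{
    using var reader = CreateSafeReader(new StringReader(xml));
    return XDocument.Load(reader);
}

// DOM: XmlDocument.Load accepts an XmlReader too; the resolver is also
// cleared on the document itself.
static XmlDocument LoadXmlDocument(string xml)
{
    var document = new XmlDocument { XmlResolver = null };
    using var reader = CreateSafeReader(new StringReader(xml));
    document.Load(reader);
    return document;
}

XDocument doc = LoadXDocument("<test>value</test>");
Console.WriteLine(doc.Root?.Value);
```

Creating the reader in a single place means the hardened settings do not depend on the defaults of each API.").Root?.Value != "value") throw new Exception("XDocument");
if (LoadXmlDocument("<test>value</test>").DocumentElement?.InnerText != "value") throw new Exception("XmlDocument");
bool dtdRejected = false;
try { LoadXDocument("<!DOCTYPE d [<!ENTITY a \"b\">]><d>&a;</d>"); } catch (XmlException) { dtdRejected = true; }
if (!dtdRejected) throw new Exception("DTD accepted");
</test>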
C#
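using System.Xml.Xsl;

var settings = new XsltSettings()
{
    EnableScript = false, // Disallow execution of scripts (msxsl:script)
    EnableDocumentFunction = false, // Disallow the document() function
};
var xslTransform = new XslCompiledTransform(enableDebug: true);
// xsl is assumed to be an XDocument that contains the stylesheet
xslTransform.Load(xsl.CreateReader(), settings, stylesheetResolver: null);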
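To test your code against these attacks, you can feed it a malicious document and check that parsing fails. The sketch below uses a shortened entity bomb and a hypothetical `IsRejected` helper. It also shows `MaxCharactersFromEntities`, which caps entity expansion when you cannot prohibit DTDs entirely. The limit of 100 characters is an arbitrary value for the demonstration.

```csharp
using System;
using System.IO;
using System.Xml;

// A shortened entity bomb: two nested levels are enough for the check.
const string bomb = @"<?xml version=""1.0""?>
<!DOCTYPE lolz [
<!ENTITY lol ""lol"">
<!ENTITY lol2 ""&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;"">
<!ENTITY lol3 ""&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;"">
]>
<lolz>&lol3;</lolz>";

// Hypothetical helper: returns true when reading the whole document fails
// with an XmlException under the given settings.
static bool IsRejected(string xml, XmlReaderSettings settings)
{
    try
    {
        using var reader = XmlReader.Create(new StringReader(xml), settings);
        while (reader.Read()) { }
        return false;
    }
    catch (XmlException)
    {
        return true;
    }
}

// Option 1: refuse any document that contains a DTD.
var prohibitDtd = new XmlReaderSettings
{
    DtdProcessing = DtdProcessing.Prohibit,
    XmlResolver = null,
};

// Option 2: parse the DTD but cap the number of characters that entity
// expansion may produce (arbitrary limit for this demonstration).
var cappedExpansion = new XmlReaderSettings
{
    DtdProcessing = DtdProcessing.Parse,
    XmlResolver = null,
    MaxCharactersFromEntities = 100,
};

Console.WriteLine(IsRejected(bomb, prohibitDtd));
Console.WriteLine(IsRejected(bomb, cappedExpansion));
```

Both checks are meant to report that the document was rejected. Running similar checks in your unit tests helps you notice if a framework upgrade or a refactoring changes the behavior.", prohibitDtd)) throw new Exception("Plain document should be accepted");
</test>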