XSS Vulnerability in Google Code-in via Improperly Escaped JSON Data
Google Code-in is an annual online programming competition hosted by Google, designed to engage students worldwide in open-source software development. During a routine attempt to re-register on the platform, a security researcher discovered a significant Cross-Site Scripting (XSS) vulnerability stemming from improperly escaped JSON data.
Introduction to the Vulnerability
While filling out multiple text fields with malicious payloads, the researcher unexpectedly triggered immediate execution of these payloads upon form submission. Initially, this behavior was considered a self-XSS, as the scripts executed only for the current user. However, further exploration revealed a more critical issue: the same payload executed persistently and affected other users viewing the comments section.
In Google Code-in, users can submit tasks and comment on those tasks. When the payload was inserted into a comment, the malicious code executed every time the comment was loaded, which made this a stored XSS. The issue was reported promptly, and Google addressed it within one day.
Technical Analysis of the Payload and Root Cause
The vulnerability arose from how Google Code-in used <script type="application/json"> elements to transport user-generated data from the backend to the front end. Below is a simplified example of the JSON payload embedded in the HTML page:
<script type="application/json">
{"someData": true, "text": "hello world", "user": 123}
</script>
The attacker crafted a payload such as:
"'><script src=x></script>{{1-1}}
When this malicious input was added to the comment field, it was incorporated into the JSON structure without adequate escaping:
<script type="application/json">
{
  "someData": true,
  "comments": [{
    "id": 123,
    "text": "\"'><script src=x></script>{{1-1}}"
  }]
}
</script>
Although JSON serialization correctly escaped the double quotes, it left dangerous sequences such as </script> untouched. According to the HTML4 specification, the first occurrence of the sequence </ inside a <script> element ends the script block; in practice, browsers end it at the first </script. The JSON block therefore closes early, and the browser parses everything after that point as ordinary HTML, so attacker-controlled content lands in the page and enables the XSS attack.
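The early termination can be reproduced outside a browser. The following is a minimal sketch, assuming Node.js; `scriptContentEnd` is a hypothetical, deliberately rough stand-in for the HTML parser and is not how Google Code-in or any browser is implemented.

```javascript
// Comment text taken from the payload shown above.
const payload = `"'><script src=x></script>{{1-1}}`;

// JSON.stringify escapes the double quote but leaves "<", ">" and "/" untouched.
const embedded = JSON.stringify({ comments: [{ id: 123, text: payload }] });

// Rough stand-in for the HTML parser (assumption): the content of a script
// element ends at the first case-insensitive "</script".
function scriptContentEnd(content) {
  return content.toLowerCase().indexOf("</script");
}

const cut = scriptContentEnd(embedded);
console.log(embedded.slice(0, cut)); // what remains inside the JSON block
console.log(embedded.slice(cut));    // what falls back into HTML parsing
```

The second line of output begins with the closing tag followed by {{1-1}}, which is the part of the comment that ends up in the page as regular markup.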
Background: Parsing Context and Standards
The HTML4 specification explains that encountering </ inside a script block signals the end of that script element’s content. Developers therefore need to escape sequences like </script> inside script literals, for example by replacing </script> with <\/script>, to prevent premature termination.
“The first occurrence of the character sequence ‘</’ (end-tag open delimiter) is treated as terminating the end of the element’s content.” – HTML4 Specification
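One common way to apply this rule to embedded JSON is to replace HTML-significant characters with their JSON unicode escapes after serialization. The sketch below assumes Node.js; `serializeForScriptTag` is a hypothetical helper name, not a function from Google Code-in or from any library.

```javascript
// Hypothetical helper: serialize data for a <script type="application/json"> block.
// In JSON.stringify output, "<", ">" and "&" can only occur inside string
// literals, so swapping them for \uXXXX escapes keeps the JSON valid.
function serializeForScriptTag(data) {
  return JSON.stringify(data)
    .replace(/</g, "\\u003c")
    .replace(/>/g, "\\u003e")
    .replace(/&/g, "\\u0026");
}

const safe = serializeForScriptTag({
  text: `"'><script src=x></script>{{1-1}}`,
});

console.log(safe.includes("</script")); // false
console.log(JSON.parse(safe).text);     // the original comment text, unchanged
```

JSON.parse turns the escapes back into the original characters, so the front end receives the same data while the HTML parser never sees a closing tag. This only addresses script termination; it does not neutralize template expressions that are later rendered into the page.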
Modern Browser Security Measures and Bypass
Under typical circumstances, a Content Security Policy (CSP), sent by the site as a response header and enforced by modern browsers, restricts the sources from which scripts can be executed, thereby mitigating many XSS attempts. However, the use of AngularJS on the Google Code-in frontend introduced an additional attack surface.
AngularJS evaluates expressions such as {{1-1}} at runtime, which lets attackers execute JavaScript despite CSP rules. Because AngularJS 1.6 removed the expression sandbox, attackers can use the constructor.constructor technique to run arbitrary JavaScript:
{{constructor.constructor('alert("xss")')()}}
This approach circumvents CSP protections as it does not rely on external script source loading but evaluates expressions within Angular’s templating system.
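Why constructor.constructor is enough can be seen in plain JavaScript, without AngularJS. This is a minimal sketch, assuming Node.js; `scope` is a hypothetical stand-in for the object an expression is evaluated against, not AngularJS's actual scope implementation.

```javascript
// Stand-in (assumption) for the object an expression is evaluated against.
const scope = {};

// Every plain object reaches the Function constructor in two property lookups.
console.log(scope.constructor === Object);               // true
console.log(scope.constructor.constructor === Function); // true

// Function(body) compiles a string into a callable function.
const run = scope.constructor.constructor("return 1 - 1");
console.log(run()); // 0
```

In a browser, compiling a string through Function is itself governed by the page's CSP (the 'unsafe-eval' source expression), so whether this exact payload runs depends on the policy in place.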
Implications and Real-World Examples
- Stored XSS Risk: The vulnerability permitted persistent XSS attacks that could affect site administrators and mentors upon viewing task comments.
- Data Integrity: Malicious scripts could manipulate page content or hijack user sessions.
- Broader Impact: Similar vulnerabilities in popular platforms such as McDonald’s website have demonstrated the widespread risks of unescaped AngularJS expressions (Thomas Orlita, 2020).
Timeline of the Vulnerability Resolution
Date | Event |
---|---|
2018-10-30 | Vulnerability Reported to Google |
2018-10-31 | Fix Implemented by Development Team |
2018-11-01 | Initial Closure of Report |
2018-11-21 | Report Reopened and Accepted |
2018-11-21 | Priority Upgraded to P2 |
2018-12-11 | Researcher Rewarded |
2018-12-12 | Marked as Fully Resolved |
Preventing Similar JSON-Based XSS Vulnerabilities
To defend against XSS vulnerabilities originating from embedded JSON, organizations should adhere to the following best practices:
- Proper Contextual Escaping: Always escape sequences like </script> within JSON embedded in HTML to prevent premature script termination.
- Use Safe Data Serialization: Consider serializing JSON data into safe inline JavaScript objects or using methods like text/json with secure parsers.
- Implement Content Security Policy (CSP): Configure strict CSP rules including nonces or hashes to limit executable scripts.
- Limit Client-side Expression Evaluation: Reduce or sanitise AngularJS expressions and avoid exposing dangerous constructs.
- Adopt Modern Frameworks: Use up-to-date frameworks with built-in protections against XSS and expression injection.
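As an illustration of the CSP item above, a nonce-based policy can be generated per response. The sketch assumes Node.js and its built-in crypto module; `buildCsp` is a hypothetical helper, and the directive set shown is one possible strict policy rather than the one Google Code-in used.

```javascript
const crypto = require("crypto");

// Hypothetical helper: build a fresh nonce and a matching policy for one response.
function buildCsp() {
  // 16 random bytes, base64-encoded, regenerated for every response.
  const nonce = crypto.randomBytes(16).toString("base64");
  const header = [
    `script-src 'nonce-${nonce}' 'strict-dynamic'`,
    "object-src 'none'",
    "base-uri 'none'",
  ].join("; ");
  return { nonce, header };
}

const { nonce, header } = buildCsp();
console.log(header); // value for the Content-Security-Policy response header
```

The server would send `header` as the Content-Security-Policy header and put the same `nonce` value on each script tag it intends to allow. A policy like this limits injected script elements, but it does not by itself stop a framework from evaluating injected template expressions.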
Conclusion
The Google Code-in XSS case highlights a subtle yet impactful security risk arising from improperly escaped JSON within HTML script tags. Combining legacy HTML parsing behaviors with modern frontend frameworks like AngularJS can introduce complex vulnerabilities that circumvent standard protections such as CSP.
Vigilant encoding, comprehensive security policies, and awareness of framework-specific risks are essential to safeguarding web applications against these sophisticated XSS attack vectors.
Written by Thomas Orlita.