Web Application Exploits and Defenses

Want to beat the hackers at their own game?

  • Learn how hackers find security vulnerabilities!
  • Learn how hackers exploit web applications!
  • Learn how to stop them!

This codelab shows how web application vulnerabilities can be exploited and how to defend against these attacks. The best way to learn things is by doing, so you’ll get a chance to do some real penetration testing, actually exploiting a real application. Specifically, you’ll learn the following:

  • How an application can be attacked using common web security vulnerabilities, like cross-site scripting vulnerabilities (XSS) and cross-site request forgery (XSRF).
  • How to find, fix, and avoid these common vulnerabilities and other bugs that have a security impact, such as denial-of-service, information disclosure, or remote code execution.

To get the most out of this lab, you should have some familiarity with how a web application works (e.g., general knowledge of HTML, templates, cookies, AJAX, etc.).

Gruyere

This codelab is built around Gruyere /ɡruːˈjɛər/ - a small, cheesy web application that allows its users to publish snippets of text and store assorted files. “Unfortunately,” Gruyere has multiple security bugs ranging from cross-site scripting and cross-site request forgery, to information disclosure, denial of service, and remote code execution. The goal of this codelab is to guide you through discovering some of these bugs and learning ways to fix them both in Gruyere and in general.

The codelab is organized by types of vulnerabilities. In each section, you’ll find a brief description of a vulnerability and a task to find an instance of that vulnerability in Gruyere. Your job is to play the role of a malicious hacker and find and exploit the security bugs. In this codelab, you’ll use both black-box hacking and white-box hacking.

In black-box hacking, you try to find security bugs by experimenting with the application and manipulating input fields and URL parameters, trying to cause application errors, and looking at the HTTP requests and responses to guess server behavior. You do not have access to the source code, although understanding how to view source and being able to view HTTP headers (as you can in Chrome or with LiveHTTPHeaders for Firefox) is valuable. Using a web proxy like Burp or WebScarab may be helpful in creating or modifying requests.

In white-box hacking, you have access to the source code and can use automated or manual analysis to identify bugs. You can treat Gruyere as if it’s open source: you can read through the source code to try to find bugs. Gruyere is written in Python, so some familiarity with Python can be helpful. However, the security vulnerabilities covered are not Python-specific and you can do most of the lab without even looking at the code. You can run a local instance of Gruyere to assist in your hacking: for example, you can create an administrator account on your local instance to learn how administrative features work and then apply that knowledge to the instance you want to hack. Security researchers use both hacking techniques, often in combination, in real life.
We’ll tag each challenge to indicate which techniques are required to solve it:

  • Challenges that can be solved just by using black box techniques.
  • Challenges that require that you look at the Gruyere source code.
  • Challenges that require some specific knowledge of Gruyere that will be given in the first hint.

WARNING: Accessing or attacking a computer system without authorization is illegal in many jurisdictions. While doing this codelab, you are specifically granted authorization to attack the Gruyere application as directed. You may not attack Gruyere in ways other than described in this codelab, nor may you attack App Engine directly or any other Google service. You should use what you learn from the codelab to make your own applications more secure. You should not use it to attack any applications other than your own, and only do that with permission from the appropriate authorities (e.g., your company’s security team).

Setup

To access Gruyere, go to http://google-gruyere.appspot.com/start. AppEngine will start a new instance of Gruyere for you, assign it a unique id and redirect you to http://google-gruyere.appspot.com/123/ (where 123 is your unique id). Each instance of Gruyere is “sandboxed” from the other instances so your instance won’t be affected by anyone else using Gruyere. You’ll need to use your unique id instead of 123 in all the examples. If you want to share your instance of Gruyere with someone else (e.g., to show them a successful attack), just share the full URL with them including your unique id.

The Gruyere source code is available online so that you can use it for white-box hacking. You can browse the source code at http://google-gruyere.appspot.com/code/ or download all the files from http://google-gruyere.appspot.com/gruyere-code.zip. If you want to debug it or actually try fixing the bugs, you can download it and run it locally. You do not need to run Gruyere locally in order to do the lab.

Running locally

WARNING: Because Gruyere is very vulnerable, it includes some protection against being exploited by an external attacker when run locally. You’ll see these parts of the code marked DO NOT CHANGE. Gruyere only accepts requests from localhost and uses a random unique id in the URL. However, it’s difficult to fully protect against an external attack. And if you make changes to Gruyere you could make it more vulnerable to a real attack. Therefore, you should close other web pages while running Gruyere locally and you should make sure that no other user is logged in to the machine you are using.

To run Gruyere locally, you’ll first need to install Python 2.5, if you don’t already have it. Gruyere was developed and tested with version 2.5 and may not work with other versions of Python. You can download it from python.org. Download Gruyere itself from http://google-gruyere.appspot.com/gruyere-code.zip and unpack it to your local disk. Then to run the application, simply type:

$ cd <gruyere-directory>
$ ./gruyere.py

You’ll need to replace google-gruyere.appspot.com in all the examples with localhost:8008 in addition to replacing 123 with your unique id. Note that the unique id appears in a different location. There are a few other small differences between running Gruyere locally vs. accessing the instance on App Engine. The most obvious is that the App Engine version runs in a limited sandbox. So if you do something that puts Gruyere into an infinite loop, the monitor will detect it and kill it. That might not happen when you run it locally, depending on what the loop is doing.

Reset Button

As noted above, each instance is sandboxed so it can’t consume infinite resources and it can’t interfere with anyone else’s instance. Notwithstanding that, it is possible to put your Gruyere instance into a state where it is completely unusable. If that happens, you can push a magic “reset button” to wipe out all the data in your instance and start from scratch. To do this, visit this URL with your instance id:

http://google-gruyere.appspot.com/resetbutton/123

About the Code

Gruyere is small and compact. Here is a quick rundown of the application code:

  • gruyere.py is the main Gruyere web server
  • data.py stores the default data in the database. There is an administrator account and two default users.
  • gtl.py is the Gruyere template language
  • sanitize.py is the Gruyere module used for sanitizing HTML to protect the application from security holes.
  • resources/... holds all template files, images, CSS, etc.

Features and Technologies

Gruyere includes a number of special features and technologies which add attack surface. We’ll highlight them here so you’ll be aware of them as you try to attack it. Each of these introduces new vulnerabilities.

  • HTML in Snippets: Users can include a limited subset of HTML in their snippets.
  • File upload: Users can upload files to the server, e.g., to include pictures in their snippets.
  • Web administration: System administrators can manage the system using a web interface.
  • New accounts: Users can create their own accounts.
  • Template language: Gruyere Template Language (GTL) is a new language that makes writing web pages easy, as the templates connect directly to the database. Documentation for GTL can be found in gruyere/gtl.py.
  • AJAX: Gruyere uses AJAX to implement refresh on the home and snippets page. You should ignore the AJAX parts of Gruyere except for the challenges that specifically tell you to focus on AJAX.
    • In a real application, refresh would probably happen automatically, but in Gruyere we’ve made it manual so that you can be in complete control while you are working with it. When you click the refresh link, Gruyere fetches feed.gtl which contains refresh data for the current page and then client-side script uses the browser DOM API (Document Object Model) to insert the new snippets into the page. Since AJAX runs code on the client side, this script is visible to attackers who do not have access to your source code.

Using Gruyere

To familiarize yourself with the features of Gruyere, complete the following tasks:

  • View another user’s snippets by following the “All snippets” link on the main page. Also check out what they have their Homepage set to.
  • Sign up for an account for yourself to use when hacking. Do not use the same password for your Gruyere account as you use for any real service.
  • Fill in your account’s profile, including a private snippet and an icon that will be displayed by your name.
  • Create a snippet (via “New Snippet”) containing your favorite joke.
  • Upload a file (via “Upload”) to your account.

This covers the basic features provided by Gruyere. Now let’s break them!

Cross-Site Scripting (XSS)

Cross-site scripting (XSS) is a vulnerability that permits an attacker to inject code (typically HTML or Javascript) into the contents of a website not under the attacker’s control. When a victim views such a page, the injected code executes in the victim’s browser. The attacker has thus bypassed the browser’s same origin policy and can steal the victim’s private information associated with the website in question.

In a reflected XSS attack, the attack is in the request itself (frequently the URL) and the vulnerability occurs when the server inserts the attack in the response verbatim or incorrectly escaped or sanitized. The victim triggers the attack by browsing to a malicious URL created by the attacker. In a stored XSS attack, the attacker stores the attack in the application (e.g., in a snippet) and the victim triggers the attack by browsing to a page that renders the stored data without properly escaping or sanitizing it.

More details

To understand how this could happen: suppose the URL http://www.google.com/search?q=flowers returns a page containing the HTML fragment

<p>Your search for 'flowers'
returned the following results:</p>

that is, the value of the query parameter q is inserted verbatim into the page returned by Google. If www.google.com did not do any validation or escaping of q (it does), an attacker could craft a link that looks like this:

http://www.google.com/search?q=flowers+%3Cscript%3Eevil_script()%3C/script%3E

and trick a victim into clicking on this link. When a victim loads this link, the following page gets rendered in the victim’s browser:

<p>Your search for 'flowers<script>evil_script()</script>'
returned the following results:</p>

And the browser executes evil_script(). Since the page comes from www.google.com, evil_script() is executed in the context of www.google.com and has access to all the victim’s browser state and cookies for that domain.

Note that the victim does not even need to explicitly click on the malicious link. Suppose the attacker owns www.evil.example.com, and creates a page with an <iframe> pointing to the malicious link; if the victim visits www.evil.example.com, the attack will silently be activated.

XSS Challenges

Typically, if you can get Javascript to execute on a page when it’s viewed by another user, you have an XSS vulnerability. A simple Javascript function to use when hacking is the alert() function, which creates a pop-up box with whatever string you pass as an argument.

You might think that inserting an alert message isn’t terribly dangerous, but if you can inject that, you can inject other scripts that are more malicious. It is not necessary to be able to inject any particular special character in order to attack. If you can inject alert(1) then you can inject arbitrary script using eval(String.fromCharCode(...)).

Your challenge is to find XSS vulnerabilities in Gruyere. You should look for vulnerabilities both in URLs and in stored data. Since XSS vulnerabilities usually involve applications not properly handling untrusted user data, a common method of attack is to enter random text in input fields and look at how it gets rendered in the response page’s HTML source. But before we do that, let’s try something simpler.

File Upload XSS

Can you upload a file that allows you to execute arbitrary script on the google-gruyere.appspot.com domain?

Hint

You can upload HTML files and HTML files can contain script.

Exploit and Fix

To exploit, upload a .html file containing a script like this:

<script>
alert(document.cookie);
</script>

To fix, host the content on a separate domain so the script won’t have access to any content from your domain. That is, instead of hosting user content on example.com/username we would host it at username.usercontent.example.com or username.example-usercontent.com. (Including something like “usercontent” in the domain name avoids attackers registering usernames that look innocent like wwww and using them for phishing attacks.)

Reflected XSS

There’s an interesting problem here. Some browsers have built-in protection against reflected XSS attacks. There are also browser extensions like NoScript that provide some protection. If you’re using one of those browsers or extensions, you may need to use a different browser or temporarily disable the extension to execute these attacks.

At the time this codelab was written, the two browsers which had this protection were IE and Chrome. To work around this, Gruyere automatically includes an X-XSS-Protection: 0 HTTP header in every response, which is recognized by IE and will be recognized by future versions of Chrome. (It’s available in the developer channel now.) If you’re using Chrome, you can try starting it with the --disable-xss-auditor flag by entering one of these commands:

  • Windows: "C:Documents and SettingsUSERNAMELocal SettingsApplication DataGoogleChromeApplicationchrome.exe" --disable-xss-auditor
  • Mac: /Applications/Google Chrome.app/Contents/MacOS/Google Chrome --disable-xss-auditor
  • GNU/Linux: /opt/google/chrome/google-chrome --disable-xss-auditor

If you’re using Firefox with the NoScript extension, add google-gruyere.appspot.com to the allow list. If you still can’t get the XSS attacks to work, try a different browser.

You may think that you don’t need to worry about XSS if the browser protects against it. The truth is that the browser protection can’t be perfect because it doesn’t really know your application and therefore there may be ways for a clever hacker to circumvent that protection. The real protection is to not have an XSS vulnerability in your application in the first place.

Find a reflected XSS attack. What we want is a URL that when clicked on will execute a script.

Hint 1

What does this URL do?

http://google-gruyere.appspot.com/123/invalid

Hint 2

The most dangerous characters in a URL are < and >. If you can get an application to directly insert what you want in a page and can get those characters through, then you can probably get a script through. Try these:

http://google-gruyere.appspot.com/123/%3e%3c

http://google-gruyere.appspot.com/123/%253e%253c


http://google-gruyere.appspot.com/123/%c0%be%c0%bc


http://google-gruyere.appspot.com/123/%26gt;%26lt;


http://google-gruyere.appspot.com/123/%26amp;gt;%26amp;lt;


http://google-gruyere.appspot.com/123/74x3cu003cx3Cu003CX3CU003C


http://google-gruyere.appspot.com/123/+ADw-+AD4-

This tries > and < in many different ways that might be able to make it through the URL and get rendered incorrectly using: verbatim (URL %-encoding), double %-encoding, bad UTF-8 encoding, HTML &-encoding, double &-encoding, and several different variations on C-style encoding. View the resulting source and see if any of those work. (Note: literally typing >< in the URL is identical to %3e%3c because the browser automatically %-encodes those characters. If you want to send a literal > or < then you will need to use a tool like curl to send those characters in the URL.)

Exploit and Fix

To exploit, create a URL like the following and get a victim to click on it:

http://google-gruyere.appspot.com/123/<script>alert(1)</script>

To fix, you need to escape user input that is displayed in error messages. Error messages are displayed using error.gtl, but are not escaped in the template. The part of the template that renders the message is {{_message}} and it’s missing the modifier that tells it to escape user input. Add the :text modifier to escape the user input:

<div>{{_message:text}}</div>

This flaw would have been best mitigated by a design that escapes all output by default and only displays raw HTML when explicitly tagged to do so. There are also autoescaping features available in many template systems.
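
As an illustration, an escape-by-default substitution might look like the minimal sketch below (the render function and the :raw modifier are hypothetical, not part of GTL):

import re

_META_CHARS = {'&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;'}

def _escape(text):
  return ''.join(_META_CHARS.get(c, c) for c in text)

def render(template, values):
  """Substitutes {{name}} placeholders, escaping by default.

  Raw HTML must be requested explicitly with {{name:raw}}; anything a
  developer forgets to tag is escaped, so mistakes fail safe.
  """
  def _sub(match):
    name, _, modifier = match.group(1).partition(':')
    value = str(values.get(name, ''))
    return value if modifier == 'raw' else _escape(value)
  return re.sub(r'\{\{(\w+(?::\w+)?)\}\}', _sub, template)

print(render('<div>{{message}}</div>', {'message': '<script>alert(1)</script>'}))
# prints: <div>&lt;script&gt;alert(1)&lt;/script&gt;</div>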

Stored XSS

Now find a stored XSS. What we want to do is put a script in a place where Gruyere will serve it back to another user.

The most obvious place that Gruyere serves back user-provided data is in a snippet (ignoring uploaded files which we’ve already discussed.)

Hint 1

Put this in a snippet and see what you get:

<script>alert(1)</script>

There are many different ways that script can be embedded in a document.

Hint 2

Hackers don’t limit themselves to valid HTML syntax. Try some invalid HTML and see what you get. You may need to experiment a bit in order to find something that will work. There are multiple ways to do this.

Exploit and Fix

To exploit, enter any of these as your snippet (there are certainly more methods):

(1) <a onmouseover="alert(1)" href="#">read this!</a>

(2) <p <script>alert(1)</script>hello

(3) </td <script>alert(1)</script>hello

Notice that there are multiple failures in sanitizing the HTML. Snippet 1 works because onmouseover was inadvertently omitted from the list of disallowed attributes in sanitize.py. Snippets 2 and 3 work because browsers tend to be forgiving with HTML syntax and the handling of both start and end tags is buggy.

To fix, we need to investigate and fix the sanitizing performed on the snippets. Snippets are sanitized in _SanitizeTag in the sanitize.py file. Let’s block snippet 1 by adding "onmouseover" to the list of disallowed_attributes.

Oops! This doesn’t completely solve the problem. Looking at the code that was just fixed, can you find a way to bypass the fix?

Hint

Take a close look at the code in _SanitizeTag that determines whether or not an HTML attribute is allowed.

Exploit and Fix

The fix was insufficient because the code that checks for disallowed attributes is case sensitive and HTML is not. So this still works:

(1') <a ONMOUSEOVER="alert(1)" href="#">read this!</a>

Correctly sanitizing HTML is a tricky problem. The _SanitizeTag function has a number of critical design flaws:

  • It does not validate the well-formedness of the input HTML. As we see, badly formed HTML passes through the sanitizer unchanged. Since browsers typically apply very lenient parsing, it is very hard to predict the browser’s interpretation of the given HTML unless we exercise strict control on its format.
  • It uses blacklisting of attributes, which is a bad technique. One of our exploits got past the blacklist simply by using an uppercase version of the attribute. There could be other attributes missing from this list that are dangerous. It is always better to whitelist known good values.
  • The sanitizer does not do any further sanitization of attribute values. This is dangerous since URI attributes like href and src and the style attribute can all be used to inject javascript.

The right approach to HTML sanitization is to:

  • Parse the input into an intermediate DOM structure, then rebuild the body as well-formed output.
  • Use strict whitelists for allowed tags and attributes.
  • Apply strict sanitization of URL and CSS attributes if they are permitted.

Whenever possible it is preferable to use an already available known and proven HTML sanitizer.
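
For illustration only (again, prefer a proven sanitizing library), a whitelist-based sanitizer built on Python’s HTMLParser might look roughly like this; the tag and attribute whitelist here is an assumption, not Gruyere’s:

from HTMLParser import HTMLParser  # html.parser in Python 3
import cgi

# Assumed whitelist: tag name -> allowed attributes.
_ALLOWED = {'b': [], 'i': [], 'em': [], 'strong': [], 'a': ['href']}

class WhitelistSanitizer(HTMLParser):
  """Rebuilds the input as well-formed HTML, keeping only whitelisted
  tags and attributes and escaping everything else."""

  def __init__(self):
    HTMLParser.__init__(self)
    self.out = []
    self.open_tags = []

  def handle_starttag(self, tag, attrs):
    if tag not in _ALLOWED:
      return
    safe_attrs = ''
    for name, value in attrs:
      # Attribute values still need sanitizing; real sanitizers apply
      # strict URL checks here rather than this simple scheme test.
      if name in _ALLOWED[tag] and value and not value.lower().startswith('javascript:'):
        safe_attrs += ' %s="%s"' % (name, cgi.escape(value, True))
    self.out.append('<%s%s>' % (tag, safe_attrs))
    self.open_tags.append(tag)

  def handle_endtag(self, tag):
    while tag in self.open_tags:
      closed = self.open_tags.pop()
      self.out.append('</%s>' % closed)
      if closed == tag:
        break

  def handle_data(self, data):
    self.out.append(cgi.escape(data))

  def result(self):
    while self.open_tags:  # close anything left open
      self.out.append('</%s>' % self.open_tags.pop())
    return ''.join(self.out)

def sanitize(html):
  s = WhitelistSanitizer()
  s.feed(html)
  s.close()
  return s.result()

print(sanitize('<a ONMOUSEOVER="alert(1)" href="#">read this!</a>'))
# prints: <a href="#">read this!</a>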

Stored XSS via HTML Attribute

You can also do XSS by injecting a value into an HTML attribute. Inject a script by setting the color value in a profile.

Hint 1

The color is rendered as style='color:color'. Try including a single quote character in your color name.

Hint 2

You can insert an HTML attribute that executes a script.

Exploit and Fixes

To exploit, use the following for your color preference:

red' onload='alert(1)' onmouseover='alert(2)

You may need to move the mouse over the snippet to trigger the attack. This attack works because the first quote ends the style attribute and the second quote starts the onload attribute.

But this attack shouldn’t work at all. Take a look at home.gtl where it renders the color. It says style='{{color:text}}' and as we saw earlier, the :text part tells it to escape text. So why doesn’t this get escaped? In gtl.py, it calls cgi.escape(str(value)) which takes an optional second parameter that indicates that the value is being used in an HTML attribute. So you can replace this with cgi.escape(str(value),True). Except that doesn’t fix it! The problem is that cgi.escape assumes your HTML attributes are enclosed in double quotes and this file is using single quotes. (This should teach you to always carefully read the documentation for libraries you use and to always test that they do what you want.)

You’ll note that this attack uses both onload and onmouseover. That’s because even though W3C specifies that the onload event is only supported on body and frameset elements, some browsers support it on other elements. So if the victim is using one of those browsers, the attack always succeeds. Otherwise, it succeeds when the user moves the mouse. It’s not uncommon for attackers to use multiple attack vectors at the same time.

To fix, we need to use a correct text escaper that escapes single and double quotes too. Add the following function to gtl.py and call it instead of cgi.escape for the text escaper.

def _EscapeTextToHtml(var):
  """Escape HTML metacharacters.

  This function escapes characters that are dangerous to insert into
  HTML. It prevents XSS via quotes or script injected in attribute values.

  It is safer than cgi.escape, which escapes only <, >, & by default.
  cgi.escape can be told to escape double quotes, but it will never
  escape single quotes.
  """
  meta_chars = {
      '"': '&quot;',
      "'": '&#39;',  # Not &apos;
      '&': '&amp;',
      '<': '&lt;',
      '>': '&gt;',
      }
  escaped_var = ""
  for i in var:
    if i in meta_chars:
      escaped_var = escaped_var + meta_chars[i]
    else:
      escaped_var = escaped_var + i
  return escaped_var

Oops! This doesn’t completely solve the problem. Even with the above fix in place, the color value is still vulnerable.

Hint 1

Some browsers allow you to include script in stylesheets.

Hint 2

The easiest browser to exploit in this way is Internet Explorer which supports dynamic CSS properties.

Another Exploit and Fix

Internet Explorer’s dynamic CSS properties (aka CSS expressions) make this attack particularly easy.

To exploit, use the following for your color preference:

expression(alert(1))

While other browsers don’t support CSS expressions, there are other dangerous CSS properties, such as Mozilla’s -moz-binding.

To fix, we need to sanitize the color as a color. The best thing to do would be to add a new output sanitizing form to gtl, i.e., we would write {{foo:color}} which makes sure foo is safe to use as a color. This function can be used to sanitize:

SAFE_COLOR_RE = re.compile(r"^#?[a-zA-Z0-9]*$")

def _SanitizeColor(color):
  """Sanitizes a color, returning 'invalid' if it's invalid.

  A valid value is either the name of a color or # followed by the
  hex code for a color (like #FEFFFF). Returning an invalid value
  allows a style sheet to specify a default value by writing
  'color:default; color:{{foo:color}}'.
  """

  if SAFE_COLOR_RE.match(color):
    return color
  return 'invalid'

Colors aren’t the only values we might want to allow users to provide. You should do similar sanitizing for user-provided fonts, sizes, urls, etc. It’s helpful to do input validation, so that when a user enters an invalid value, you’ll reject it at that time. But only doing input validation would be a mistake: if you find an error in your validation code or a new browser exposes a new attack vector, you’d have to go back and scrub all previously entered values. Or, you could add the output validation which you should have been doing in the first place.

Stored XSS via AJAX

Find an XSS attack that uses a bug in Gruyere’s AJAX code. The attack should be triggered when you click the refresh link on the page.

Hint 1

Run curl on http://google-gruyere.appspot.com/123/feed.gtl and look at the result. (Or browse to it in your browser and view source.) You’ll see that it includes each user’s first snippet into the response. This entire response is then evaluated on the client side which then inserts the snippets into the document. Can you put something in your snippet that will be parsed differently than expected?

Hint 2

Try putting some quotes (") in your snippet.

Exploit and Fixes

To exploit, put this in your snippet:

all <span style=display:none>"
+ (alert(1),"")
+ "</span>your base

The JSON should look like

_feed(({..., "Mallory": "snippet", ...}))

but instead looks like this:

_feed({..., "Mallory": "all <span style=display:none>"
+ (alert(1),"")
+ "</span>your base", ...})

The injected text breaks what should be a single string literal into several separate Javascript expressions. Note that this exploit is written to be invisible both in the original page rendering (because of the <span style=display:none>) and after refresh (because it inserts only an empty string). All that will appear on the screen is all your base. There are bugs on both the server and client sides which enable this attack.

To fix, first, on the server side, the text is incorrectly escaped when it is rendered in the JSON response. The template says {{snippet.0:html}} but that’s not enough. This text is going to be inserted into the innerHTML of a DOM node so the HTML does have to be sanitized. However, that sanitized text is then going to be inserted into Javascript and single and double quotes have to be escaped. That is, adding support for {{...:js}} to GTL would not be sufficient; we would also need to support something like {{...:html:js}}.

To escape quotes, use \x27 and \x22 for single and double quote respectively. Replacing them with &#39; and &quot; is incorrect as those are not recognized in Javascript strings and will break quotes around HTML attributes.
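
A sketch of what the :js escaping step could look like (the function name is hypothetical; it escapes quotes as described here, plus < and > which matter for the next challenge):

def _EscapeStringForJs(var):
  """Escapes a value for embedding inside a quoted Javascript string.

  Quotes are replaced with Javascript hex escapes rather than HTML
  entities (which Javascript string literals do not understand), and
  backslash, <, > and & are escaped so the result can never be
  misread as markup by the browser.
  """
  replacements = {
      '\\': '\\\\',
      "'": '\\x27',
      '"': '\\x22',
      '<': '\\x3c',
      '>': '\\x3e',
      '&': '\\x26',
      }
  return ''.join(replacements.get(c, c) for c in var)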

Second, in the browser, Gruyere converts the JSON by using Javascript’s eval. In general, eval is very dangerous and should rarely be used. If it is used, it must be used very carefully, which is hardly the case here. We should be using the JSON parser, which ensures that the string does not include any unsafe content. The JSON parser is available at json.org.

Reflected XSS via AJAX

Find a URL that when clicked on will execute a script using one of Gruyere’s AJAX features.

Hint 1

When Gruyere refreshes a user snippets page, it uses

http://google-gruyere.appspot.com/123/feed.gtl?uid=value

and the result is the script

_feed((["user", "snippet1", ... ]))

Hint 2

This uses a different vulnerability, but the exploit is very similar to the previous reflected XSS exploit.

Exploit and Fixes

To exploit, create a URL like the following and get a victim to click on it:

http://google-gruyere.appspot.com/123/feed.gtl?uid=<script>alert(1)</script>

http://google-gruyere.appspot.com/123/feed.gtl?uid=%3Cscript%3Ealert(1)%3C/script%3E

This renders as

_feed((["<script>alert(1)</script>"]))

which surprisingly does execute the script. The bug is that Gruyere returns all gtl files as content type text/html and browsers are very tolerant of what HTML files they accept.

To fix, you need to make sure that your JSON content can never be interpreted as HTML. Even though literal < and > are allowed in Javascript strings, you need to make sure they don’t appear literally where a browser can misinterpret them. Thus, you’d need to modify {{...:js}} to replace them with the Javascript escapes \x3c and \x3e. It is always safe to write '\x3c\x3e' in Javascript strings instead of '<>'. (And, as noted above, using the HTML escapes &lt; and &gt; is incorrect.)

You should also always set the content type of your responses, in this case serving JSON results as application/javascript. This alone doesn’t solve the problem because browsers don’t always respect the content type: browsers sometimes do “sniffing” to try to “fix” results from servers that don’t provide the correct content type.

But wait, there’s more! Gruyere doesn’t set the content encoding either. And some browsers try to guess what the encoding type of a document is or an attacker may be able to embed content in a document that defines the content type. So, for example, if an attacker can trick the browser into thinking a document is UTF-7 then it could embed a script tag as +ADw-script+AD4- since +ADw- and +AD4- are alternate encodings for < and >. So always set both the content type and the content encoding of your responses, e.g., for HTML:

Content-Type: text/html; charset=utf-8
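
For example, with Python’s BaseHTTPServer (which Gruyere uses) a handler can declare both explicitly; the handler class and payload below are illustrative, not Gruyere’s actual code:

import BaseHTTPServer  # http.server in Python 3

class FeedHandler(BaseHTTPServer.BaseHTTPRequestHandler):
  """Illustrative handler that always declares content type and charset."""

  def do_GET(self):
    body = '_feed(["user", "snippet1"])'
    self.send_response(200)
    # Declaring both the type and the encoding leaves the browser no
    # reason to sniff either one.
    self.send_header('Content-Type', 'application/javascript; charset=utf-8')
    self.send_header('Content-Length', str(len(body)))
    self.end_headers()
    self.wfile.write(body)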

More about XSS

In addition to the XSS attacks described above, there are quite a few more ways to attack Gruyere with XSS. Collect them all!

XSS is a difficult beast. On one hand, a fix to an XSS vulnerability is usually trivial and involves applying the correct sanitizing function to user input when it’s displayed in a certain context. On the other hand, if history is any indication, this is extremely difficult to get right. US-CERT reports dozens of publicly disclosed XSS vulnerabilities involving multiple companies.

Though there is no magic defense to getting rid of XSS vulnerabilities, here are some steps you should take to prevent these types of bugs from popping up in your products:

  1. First, make sure you understand the problem.
  2. Wherever possible, do sanitizing via template features instead of calling escaping functions in source code. This way, all of your escaping is done in one place and your product can benefit from security technologies designed for template systems that verify their correctness or actually do the escaping for you. Also, familiarize yourself with the other security features of your template system.
  3. Employ good testing practices with respect to XSS.
  4. Don’t write your own template library :)

Client-State Manipulation

When a user interacts with a web application, they do it indirectly through a browser. When the user clicks a button or submits a form, the browser sends a request back to the web server. Because the browser runs on a machine that can be controlled by an attacker, the application must not trust any data sent by the browser.

It might seem that not trusting any user data would make it impossible to write a web application but that’s not the case. If the user submits a form that says they wish to purchase an item, it’s OK to trust that data. But if the submitted form also includes the price of the item, that’s something that cannot be trusted.

Elevation of Privilege

Convert your account to an administrator account.

Hint 1

Take a look at the editprofile.gtl page that users and administrators use to edit profile settings. If you’re not an administrator, the page looks a bit different. Can you figure out how to fool Gruyere into letting you use this page to update your account?

Hint 2

Can you figure out how to fool Gruyere into thinking you used this page to update your account?

Exploit and Fixes

You can convert your account to being an administrator by issuing either of the following requests:

  • http://google-gruyere.appspot.com/123/saveprofile?action=update&is_admin=True
  • http://google-gruyere.appspot.com/123/saveprofile?action=update&is_admin=True&uid=username (which will make any username into an admin)

After visiting this URL, your account is now marked as an administrator but your cookie still says you’re not. So sign out and back in to get a new cookie. After logging in, notice the ‘Manage this server’ link on the top right.

The bug here is that there is no validation on the server side that the request is authorized. The only part of the code that restricts the changes that a user is allowed to make are in the template, hiding parts of the UI that they shouldn’t have access to. The correct thing to do is to check for authorization on the server, at the time that the request is received.
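
A minimal sketch of such a server-side check (the function and the profiles dict are hypothetical stand-ins, not Gruyere’s actual handler):

class PermissionDenied(Exception):
  """Raised when a request tries an action it is not authorized for."""

def save_profile(profiles, uid, params, cookie_user):
  """Applies profile changes, enforcing authorization on the server side.

  'profiles' is a dict standing in for the database; 'cookie_user' is
  the profile of the authenticated requester.
  """
  profile = profiles[uid]

  if 'name' in params:  # harmless field: anyone may edit their own
    profile['name'] = params['name']

  if 'is_admin' in params:
    # Only an already-authenticated administrator may grant or revoke
    # admin rights, no matter what the submitted request claims.
    if not cookie_user.get('is_admin'):
      raise PermissionDenied('only administrators may change is_admin')
    profile['is_admin'] = params['is_admin'] == 'True'

  return profile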

Cookie Manipulation

Because the HTTP protocol is stateless, there’s no way a web server can automatically know that two requests are from the same user. For this reason, cookies were invented. When a web site includes a cookie (an arbitrary string) in an HTTP response, the browser automatically sends the cookie back to the server on the next request. Web sites can use the cookie to save session state. Gruyere uses cookies to remember the identity of the logged in user. Since the cookie is stored on the client side, it’s vulnerable to manipulation. Gruyere protects the cookies from manipulation by adding a hash to it. Notwithstanding the fact that this hash isn’t very good protection, you don’t need to break the hash to execute an attack.

Get Gruyere to issue you a cookie for someone else’s account.

Exploit and Fix

You can get Gruyere to issue you a cookie for someone else’s account by creating a new account with username "foo|admin|author". When you log into this account, it will issue you the cookie "hash|foo|admin|author||author" which actually logs you into foo as an administrator. (So this is also an elevation of privilege attack.)

Having no restrictions on the characters allowed in usernames means that we have to be careful when we handle them. In this case, the cookie parsing code is tolerant of malformed cookies and it shouldn’t be. It should escape the username when it constructs the cookie and it should reject a cookie if it doesn’t match the exact pattern it is expecting.

Even if we fix this, Python’s hash function is not cryptographically secure. If you look at Python’s string_hash function in python/Objects/stringobject.c you’ll see that it hashes the string strictly from left to right. That means that we don’t need to know the cookie secret to generate our own hashes; all we need is another string that hashes to the same value, which we can find in a relatively short time on a typical PC. In contrast, with a cryptographic hash function, changing any bit of the string will change many bits of the hash value in an unpredictable way. At a minimum, you should use a secure hash function to protect your cookies. You should also consider encrypting the entire cookie, as plain text cookies can expose information you might not want exposed.
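
A sketch of cookie signing with HMAC-SHA256 from the standard library (the function names and cookie layout are illustrative; hmac.compare_digest requires Python 2.7.7 or later):

import hashlib
import hmac

COOKIE_SECRET = 'replace-with-a-long-random-server-side-secret'  # assumption

def make_cookie(user, is_admin, is_author):
  """Signs the cookie payload so it can't be forged or tampered with.

  Note: the username must still be restricted so it can't contain '|',
  as discussed above.
  """
  payload = '%s|%s|%s' % (user, int(is_admin), int(is_author))
  sig = hmac.new(COOKIE_SECRET, payload, hashlib.sha256).hexdigest()
  return '%s|%s' % (sig, payload)

def parse_cookie(cookie):
  """Returns (user, is_admin, is_author), or None if the signature is bad."""
  sig, _, payload = cookie.partition('|')
  expected = hmac.new(COOKIE_SECRET, payload, hashlib.sha256).hexdigest()
  if not hmac.compare_digest(sig, expected):  # constant-time comparison
    return None
  user, is_admin, is_author = payload.split('|')
  return user, bool(int(is_admin)), bool(int(is_author))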

And these cookies are also vulnerable to a replay attack. Once a user is issued a cookie, it’s good forever and there’s no way to revoke it. So if a user is an administrator at one time, they can save the cookie and continue to act as an administrator even if their administrative rights are taken away. While it’s convenient to not have to make a database query in order to check whether or not a user is an administrator, that might be too dangerous a detail to store in the cookie. If avoiding additional database access is important, the server could cache a list of recent admin users. Including a timestamp in a cookie and expiring it after some period of time also mitigates against a replay attack.

Another challenge: Since account names are limited to 16 characters, it seems that this trick would not work to log in to the actual administrator account since "administrator|admin" is 19 characters. Can you figure out how to bypass that restriction?

Additional Exploit and Fix

The 16 character limit is implemented on the client side. Just issue your own request:

http://google-gruyere.appspot.com/123/saveprofile?action=new&uid=administrator|admin|author&pw=secret

Again, this restriction should be implemented on the server side, not just the client side.

Cross-Site Request Forgery (XSRF)

The previous section said “If the user submits a form that says they wish to purchase an item, it’s OK to trust that data.” That’s true as long as it really was the user that submitted the form. If your site is vulnerable to XSS, then the attacker can fake any request as if it came from the user. But even if you’ve protected against XSS, there’s another attack that you need to protect against: cross-site request forgery.

When a browser makes requests to a site, it always sends along any cookies it has for that site, regardless of where the request comes from. Additionally, web servers generally cannot distinguish between a request initiated by a deliberate user action (e.g., user clicking on “Submit” button) versus a request made by the browser without user action (e.g., request for an embedded image in a page). Therefore, if a site receives a request to perform some action (like deleting a mail, changing contact address), it cannot know whether this action was knowingly initiated by the user — even if the request contains authentication cookies. An attacker can use this fact to fool the server into performing actions the user did not intend to perform.

More details

For example, suppose Blogger is vulnerable to XSRF attacks (it isn’t). And let us say Blogger has a Delete Blog button on the dashboard that points to this URL:

http://www.blogger.com/deleteblog.do?blogId=BLOGID

Bob, the attacker, embeds the following HTML on his web page on http://www.evil.example.com:

<img src="http://www.blogger.com/deleteblog.do?blogId=alice's-blog-id"
    style="display:none">

If the victim, Alice, is logged in to www.blogger.com when she views the above page, here is what happens:

  • Her browser loads the page from http://www.evil.example.com. The browser then tries to load all embedded objects in the page, including the img shown above.
  • The browser makes a request to http://www.blogger.com/deleteblog.do?blogId=alice's-blog-id to load the image. Since Alice is logged into Blogger — that is, she has a Blogger cookie — the browser also sends that cookie in the request.
  • Blogger verifies the cookie is a valid session cookie for Alice. It verifies that the blog referenced by alice's-blog-id is owned by Alice. It deletes Alice’s blog.
  • Alice has no idea what hit her.

In this sample attack, since each user has their own blog id, the attack has to be specifically targeted to a single person. In many cases, though, requests like these don’t contain any user-specific data.

XSRF Challenge

The goal here is to find a way to perform an account changing action on behalf of a logged in Gruyere user without their knowledge. Assume you can get them to visit a web page under your control.

Find a way to get someone to delete one of their Gruyere snippets.

Hint

What is the URL used to delete a snippet? Look at the URL associated with the “X” next to a snippet.

Exploit and Fix

To exploit, lure a user to visit a page that makes the following request:

http://google-gruyere.appspot.com/123/deletesnippet?index=0

To be especially sneaky, you could set your Gruyere icon to this URL and the victim would be exploited when they visited the main page.

To fix, we should first change /deletesnippet to work via a POST request since this is a state changing action. In the HTML form, change method='get' to method='post'. On the server side, GET and POST requests look the same except that they usually call different handlers. For example, Gruyere uses Python’s BaseHTTPServer which calls do_GET for GET requests and do_POST for POST requests.

However, note that changing to POST is not enough of a fix in itself! (Gruyere uses GET requests exclusively because it makes hacking it a bit easier. POST is not more secure than GET but it is more correct: browsers may re-issue GET requests, which can result in an action getting executed more than once; browsers won’t reissue POST requests without user consent.) Then we need to pass a unique, unpredictable authorization token to the user and require that it get sent back before performing the action. For this authorization token, action_token, we can use a hash of the value of the user’s cookie appended to a current timestamp and include this token in all state-changing HTTP requests as an additional HTTP parameter. The reason we use POST over GET requests is that if we pass action_token as a URL parameter, it might leak via HTTP Referer headers. The reason we include the timestamp in our hash is so that we can expire old tokens, which mitigates the risk if it leaks.

When a request is processed, Gruyere should regenerate the token and compare it with the value supplied with the request. If the values are equal, then it should perform the action. Otherwise, it should reject it. The functions that generate and verify the tokens look like this:

def _GenerateXsrfToken(self, cookie):
  """Generates a timestamp and XSRF token for all state changing actions."""

  timestamp = str(time.time())
  return timestamp + "|" + str(hash(cookie_secret + cookie + timestamp))

def _VerifyXsrfToken(self, cookie, action_token):
  """Verifies an XSRF token included in a request."""

  # First, make sure that the token isn't more than a day old.
  (action_time, action_hash) = action_token.split("|", 1)
  now = time.time()
  if now - 86400 > float(action_time):
    return False

  # Second, regenerate it and check that it matches the user supplied value
  hash_to_verify = str(hash(cookie_secret + cookie + action_time))
  return action_hash == hash_to_verify

Oops! There are several things wrong with these functions.

What’s missing?

By including the time in the token, we prevent it from being used forever, but if an attacker were to gain access to a copy of the token, they could reuse it as many times as they wanted within that 24 hour period. The expiration time of a token should be set to a small value that represents the reasonable length of time it will take the user to make a request. This token also doesn’t protect against an attack where a token for one request is intercepted and then used for a different request. As suggested by the name action_token, the token should be tied to the specific state changing action being performed, such as the URL of the page. A better signature for _GenerateXsrfToken would be (self, cookie, action). For very long actions, like editing snippets, a script on the page could query the server to update the token when the user hits submit. (But read the next section about XSSI to make sure that an attacker won’t be able to read that new token.)
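
A sketch of a token scheme that addresses both problems by binding the token to a specific action and using a much shorter lifetime (the names, the use of HMAC, and the ten-minute window are assumptions, not Gruyere’s code):

import hashlib
import hmac
import time

TOKEN_LIFETIME_SECONDS = 600  # assumption: ~10 minutes is plenty for a form

def generate_action_token(secret, cookie, action):
  """Token bound to one user *and* one specific action, with a short expiry."""
  timestamp = str(int(time.time()))
  payload = '%s|%s|%s' % (cookie, action, timestamp)
  sig = hmac.new(secret, payload, hashlib.sha256).hexdigest()
  return '%s|%s' % (timestamp, sig)

def verify_action_token(secret, cookie, action, token):
  """Recomputes the token for this user and action and checks that it matches."""
  timestamp, _, sig = token.partition('|')
  try:
    age = time.time() - float(timestamp)
  except ValueError:
    return False
  if age < 0 or age > TOKEN_LIFETIME_SECONDS:
    return False
  payload = '%s|%s|%s' % (cookie, action, timestamp)
  expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
  return hmac.compare_digest(sig, expected)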

XSRF vulnerabilities exist because an attacker can easily script a series of requests to an application and then force a user to execute them by visiting some page. To prevent this type of attack, you need to introduce some value that can’t be predicted or scripted by an attacker for every account changing request. Some application frameworks have XSRF protection built in: they automatically include a unique token in every response and verify it on every POST request. Other frameworks provide functions that you can use to do that. If neither of these cases apply, then you’ll have to build your own. Be careful of things that don’t work: using POST instead of GET is advisable but not sufficient by itself, checking Referer headers is insufficient, and copying cookies into hidden form fields can make your cookies less secure.

Cross Site Script Inclusion (XSSI)

Browsers prevent pages of one domain from reading pages in other domains. But they do not prevent pages of a domain from referencing resources in other domains. In particular, they allow images to be rendered from other domains and scripts to be executed from other domains. An included script doesn’t have its own security context. It runs in the security context of the page that included it. For example, if www.evil.example.com includes a script hosted on www.google.com then that script runs in the evil context not in the google context. So any user data in that script will “leak.”

XSSI Challenge

Find a way to read someone else’s private snippet using XSSI.

That is, create a page on another web site and put something in that page that can read your private snippet. (You don’t need to post it to a web site: you can just create a .html file in your home directory and double click on it to open it in a browser.)

Hint 1

You can run a script from another domain by adding

<SCRIPT src="http://google-gruyere.appspot.com/123/..."></SCRIPT>

to your HTML file. What scripts does Gruyere have?

Hint 2

feed.gtl is a script. Given that, how can you get the private snippet out of the script?

Exploit and Fix

To exploit, put this in an html file:

<script>
function _feed(s) {
  alert("Your private snippet is: " + s['private_snippet']);
}
</script>
<script src="http://google-gruyere.appspot.com/123/feed.gtl"></script>

When the script in feed.gtl is executed, it runs in the context of the attacker’s web page and uses the _feed function which can do whatever it wants with the data, including sending it off to another web site.

You might think that you can fix this by eliminating the function call and just having the bare expression. That way, when the script is executed by inclusion, the response will be evaluated and then discarded. That won’t work because Javascript allows you to do things like redefine default constructors. So when the object is evaluated, the hosting page’s constructors are invoked, which can do whatever they want with the values.

To fix, there are several changes you can make. Any one of these changes will prevent currently possible attacks, but if you add several layers of protection (“defense in depth“) you protect against the possibility that you get one of the protections wrong and also against future browser vulnerabilities. First, use an XSRF token as discussed earlier to make sure that JSON results containing confidential data are only returned to your own pages. Second, your JSON response pages should only support POST requests, which prevents the script from being loaded via a script tag. Third, you should make sure that the script is not executable. The standard way of doing this is to append some non-executable prefix to it, like ])}while(1);</x>. A script running in the same domain can read the contents of the response and strip out the prefix, but scripts running in other domains can’t.
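
Here is a sketch of the prefix approach (the helper functions are hypothetical; in a real application the stripping would be done in Javascript by your own same-domain page):

JSON_PREFIX = "])}while(1);</x>\n"  # any prefix that is a Javascript syntax error

def wrap_json(json_text):
  """Server side: prepend a prefix so the response is a syntax error if it
  is ever pulled in through a <script> tag on another site."""
  return JSON_PREFIX + json_text

def unwrap_json(body):
  """Same-domain client code (shown here in Python for illustration)
  strips the prefix before parsing the JSON."""
  if not body.startswith(JSON_PREFIX):
    raise ValueError('unexpected response prefix')
  return body[len(JSON_PREFIX):]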

NOTE: Making the script not executable is more subtle than it seems. It’s possible that what makes a script executable may change in the future if new scripting features or languages are introduced. Some people suggest that you can protect the script by making it a comment by surrounding it with /* and */, but that’s not as simple as it might seem. (Hint: what if someone included */ in one of their snippets?)

There’s much more to XSSI than this. There’s a variation of JSON called JSONP which you should avoid using because it allows script injection by design. And there’s E4X (Ecmascript for XML) which can result in your HTML file being parsed as a script. Surprisingly, one way to protect against E4X attacks is to put some invalid XML in your files, like the </x> above.

Path Traversal

Most web applications serve static resources like images and CSS files. Frequently, applications simply serve all the files in a folder. If the application isn’t careful, the user can use a path traversal attack to read files from other folders that they shouldn’t have access to. For example, in both Windows and Linux, .. represents the parent directory, so if you can inject ../ in a path you can “escape” to the parent directory.

If an attacker knows the structure of your file system, then they can craft a URL that will traverse out of the installation directory to /etc. For example, if Picasa was vulnerable to path traversal (it isn’t) and the Picasa servers use a Unix-like system, then the following would retrieve the password file:

http://www.picasa.com/../../../../../../../etc/passwd

Information disclosure via path traversal

Find a way to read secret.txt from a running Gruyere server.

Amazingly, this attack is not even necessary in many cases: people often install applications and never change the defaults. So the first thing an attacker would try is the default value.

Hint 1

This isn’t a black box attack because you need to know that the secret.txt file exists, where it’s stored, and where Gruyere stores its resource files. You don’t need to look at any source code.

Hint 2

How does the server know which URLs represent resource files? You can use curl or a web proxy to craft request URLs that some browsers may not allow.

Exploit and Fix

To exploit, you can steal secret.txt via this URL:

http://google-gruyere.appspot.com/123/../secret.txt

Some browsers, like Firefox and Chrome, optimize out ../ in URLs. This doesn’t provide any security protection because an attacker will use %2f to represent / in the URL; or a tool like curl, a web proxy or a browser that doesn’t do that optimization. But if you test your application with one of these browsers to see if you’re vulnerable, you might think you were protected when you’re not.

To fix, we need to prevent access to files outside the resources directory. Validating file paths is a bit tricky as there are various ways to hide path elements like “../” or “~” that allow escaping out of the resources folder. The best protection is to only serve specific resource files. You can either hardcode a list or, when your application starts, crawl the resource directory and build a list of files. Then only accept requests for those files, as sketched below. You can even do some optimization here, like caching small files in memory, which will make your application faster. If you are going to try file path validation, you need to do it on the final path, not on the URL, as there are numerous ways to represent the same characters in URLs. Note: Changing file permissions will NOT work. Gruyere has to be able to read this file.
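
A sketch of the build-a-list-at-startup approach (paths and function names are assumptions; os.path.relpath requires Python 2.6 or later):

import os

RESOURCE_DIR = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'resources')

def build_resource_whitelist():
  """Walks the resource directory once at startup and records every file
  the server is willing to serve."""
  allowed = set()
  for dirpath, _, filenames in os.walk(RESOURCE_DIR):
    for name in filenames:
      full_path = os.path.join(dirpath, name)
      allowed.add(os.path.relpath(full_path, RESOURCE_DIR))
  return allowed

_ALLOWED_RESOURCES = build_resource_whitelist()

def open_resource(requested_path):
  """Serves only files recorded at startup; ../secret.txt, %2e%2e variants,
  ~ and anything else outside the list is rejected."""
  normalized = os.path.normpath(requested_path)
  if normalized not in _ALLOWED_RESOURCES:
    raise IOError('no such resource: %r' % requested_path)
  return open(os.path.join(RESOURCE_DIR, normalized), 'rb')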

Data tampering via path traversal

Find a way to replace secret.txt on a running Gruyere server.

Hint 1

Again, this isn’t a black box attack because you need to know about the directory structure that Gruyere uses, specifically where uploaded files are stored.

Hint 2

If I log in as user brie and upload a file, where does the server store it? Can you trick the server into uploading a file to ../../secret.txt?

Exploit and Fix

To exploit, create a new user named .. and upload your new secret.txt. You could also create a user named brie/../...

To fix, you should escape dangerous characters in the username (replacing them with safe characters) before using it. It was earlier suggested that we should restrict the characters allowed in a username, but it probably didn’t occur to you that "." was a dangerous character. It’s worth noting that there’s a vulnerability unique to Windows servers with this implementation. On Windows, filenames are not case sensitive but Gruyere usernames are. So one user can attack another user’s files by creating a similar username that differs only in case, e.g., BRIE instead of brie. So we need to not just escape unsafe characters but convert the username to a canonical form that is different for different usernames. Or we could avoid all these issues by assigning each user a unique identifier instead.
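
A sketch of that canonicalization step (the character whitelist is an assumption, not Gruyere’s actual rule):

import re

_SAFE_USERNAME_RE = re.compile(r'^[a-z0-9_]{1,16}$')

def canonicalize_username(username):
  """Lower-cases the name and rejects anything outside a small whitelist,
  so '..', 'brie/../..' and 'BRIE' can never reach the file system."""
  canonical = username.lower()
  if not _SAFE_USERNAME_RE.match(canonical):
    raise ValueError('invalid username: %r' % username)
  return canonical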

Oops! This doesn’t completely solve the problem. Even with the above fix in place, there is another way to perform this attack. Can you find it?

Hint

Are there any limits on the filename when you do an upload? You may need to use a special tool like curl or a web proxy to perform this attack.

Another Exploit and Fix

Surprisingly, you can upload a file named ../secret.txt. Gruyere provides no protection against this attack. Most browsers won’t let you upload that file but, again, you can do it with curl or other tools. You need the same kind of protection when writing files as you do on read.

As a general rule, you should never store user data in the same place as your application files, but that alone won’t protect against these attacks: if the user can inject ../ into the file path, they can traverse all the way to the root of the file system and then back down to the normal install location of your application (or even the Python interpreter itself).

Denial of Service

A denial of service (DoS) attack is an attempt to make a server unable to service ordinary requests. A common form of DoS attack is sending more requests to a server than it can handle. The server spends so much time servicing the attacker’s requests that it has very little time to service legitimate requests. Protecting an application against these kinds of DoS attacks is outside the scope of this codelab. And attacking Gruyere in this way would be interpreted as an attack on App Engine.

Hackers can also prevent a server from servicing requests by taking advantage of server bugs, such as sending requests that crash the server, make it run out of memory, or otherwise cause it to fail to serve legitimate requests. In the next few challenges, you’ll take advantage of bugs in Gruyere to perform DoS attacks.

DoS – Quit the Server

The simplest form of denial of service is shutting down a service. Find a way to make the server quit.

Hint

How does an administrator make the server quit? The server management page is manage.gtl.

Exploit and Fix

To exploit, make a request to http://google-gruyere.appspot.com/123/quitserver. You should need to be logged in as an administrator to do this, but you don’t.

This is another example of a common bug. The server protects against non-administrators accessing certain URLs but the list includes /quit instead of the actual URL /quitserver.

To fix, add /quitserver to the URLS only accessible to administrators:

_PROTECTED_URLS = [
    "/quitserver",
    "/reset"
]

Oops! This doesn’t completely solve the problem. The reset URL is in the protected list. Can you figure out how to access it?

Hint

Look carefully at the code that handles URLs and checks for protected ones.

Another Exploit and Fix

To exploit, use http://google-gruyere.appspot.com/123/RESET. The check for protected urls is case sensitive. After doing that check, it capitalizes the string to look up the implementation. This is a classic check/use bug where the condition being checked does not match the actual use. This vulnerability is worse than the previous one because it exposes all the protected urls.

To fix, put the security check inside the dangerous functions rather than outside them. That ensures that no matter how we get there, the security check can’t be skipped.
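
One way to do that in Python is to attach the check to the dangerous function itself, for example with a decorator; all of the names below are hypothetical:

class AccessDenied(Exception):
  """Raised when a request lacks the privileges for an action."""

def require_admin(handler):
  """Decorator: the authorization check lives on the dangerous function
  itself, so no quirk in URL routing (case differences, aliases, new
  entry points) can skip it."""
  def wrapper(cookie_user, *args, **kwargs):
    if not cookie_user or not cookie_user.get('is_admin'):
      raise AccessDenied('administrator access required')
    return handler(cookie_user, *args, **kwargs)
  return wrapper

@require_admin
def quit_server(cookie_user):
  print('shutting down...')  # stand-in for the real shutdown code

@require_admin
def reset(cookie_user):
  print('resetting database...')  # stand-in for the real reset code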

DoS – Overloading the Server

Find a way to overload the server when it processes a request.

Hint 1

You can upload a template that does this.

Hint 2

Every page includes the menubar.gtl template. Can you figure out how to make that template overload the server?

Exploit and Fix

To exploit, create a file named menubar.gtl containing:

[[include:menubar.gtl]]DoS[[/include:menubar.gtl]]

and upload it to the resources directory using a path traversal attack, e.g., creating a user named ../resources.

To fix, implement the protections against path traversal and uploading templates discussed earlier.
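
As additional defense in depth, the template expander itself can refuse to recurse without limit, so even a self-including template can’t hang the server. A sketch using a simplified include syntax (not gtl.py’s actual code):

import re

MAX_INCLUDE_DEPTH = 10  # assumption: legitimate templates never nest this deep

def expand(template, templates, depth=0):
  """Expands [[include:name]] directives; 'templates' is a dict of
  name -> template text standing in for the resources directory."""
  if depth >= MAX_INCLUDE_DEPTH:
    return '<!-- include depth limit exceeded -->'
  def _sub(match):
    return expand(templates.get(match.group(1), ''), templates, depth + 1)
  return re.sub(r'\[\[include:([\w.]+)\]\]', _sub, template)

# A self-including template now terminates instead of looping forever:
print(expand('[[include:menubar.gtl]]', {'menubar.gtl': 'menu [[include:menubar.gtl]]'}))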

NOTE: After performing the previous exploit, you’ll need to push the reset button.

More on Denial of Service

Unlike a well defined vulnerability like XSS or XSRF, denial of service describes a wide class of attacks. This might mean bringing your service down or flooding your inbox so you can’t receive legitimate mail. Some things to consider:

  • If you were evil and greedy, how quickly could you take down your application or starve all of its resources? For example, is it possible for a user to upload their hard drive to your application? Entering the attacker’s mindset can help identify DoS points in your application. Additionally, think about where the computationally and memory intensive tasks are in your application and put safeguards in place. Do sanity checks on input values.
  • Put monitoring in place so you can detect when you are under attack, and enforce per-user quotas and rate limiting to ensure that a small subset of users cannot starve the rest; a minimal rate-limiting sketch follows this list. Abusive patterns could include increased memory usage, higher latency, or more requests or connections than usual.
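
As a concrete illustration of the second point, here is a minimal per-user rate-limiting sketch; the one-minute window and quota value are assumptions, not recommendations:

import collections
import time

MAX_REQUESTS_PER_MINUTE = 60  # illustrative quota
_request_times = collections.defaultdict(list)  # user -> recent request timestamps

def allow_request(user):
  # Returns True if this user is still under their per-minute quota.
  now = time.time()
  recent = [t for t in _request_times[user] if now - t < 60]
  _request_times[user] = recent
  if len(recent) >= MAX_REQUESTS_PER_MINUTE:
    return False  # over quota: reject or throttle this request
  recent.append(now)
  return True

A real deployment would also need to share this state across server instances and log rejections so that attacks show up in monitoring.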

Code Execution

If an attacker can execute arbitrary code remotely on your server, it’s usually game over. They may be able to take control of the running program or potentially break out of the process to open a new shell on the computer. From there, it’s usually not hard to compromise the entire machine the server is running on.

Similar to information disclosure and denial of service, there is no single recipe or specific defense that prevents remote code execution. The program must validate all user input before handling it and, where possible, run with the least privileges it needs. This topic can’t be done justice in a short paragraph, but know that remote code execution is likely the scariest result a security bug can have, and it trumps any of the attacks above.

Code Execution Challenge

Find a code execution exploit.

Hint

You need to use two previous exploits.

Exploit and Fix

To exploit, make a copy of gtl.py (or sanitize.py) and add some exploit code. Now you can either upload a file named ../gtl.py or create a user named .. and upload gtl.py. Then, make the server quit by browsing to http://google-gruyere.appspot.com/123/quitserver. When the server restarts, your code will run.

This attack was possible because Gruyere has permission to both read and write files in the Gruyere directory. Applications should run with the minimal privileges possible.

Why would you attack gtl.py or sanitize.py rather than gruyere.py? When an attacker has a choice, they would usually choose to attack the infrastructure rather than the application itself. The infrastructure is less likely to be updated and less likely to be noticed. When was the last time you checked that no one had replaced python.exe with a trojan?

To fix, fix the two previous exploits.

More on Remote Code Execution

Even though there is no single or simple defense to remote code execution, here is a short list of some preventative measures:

  • Least Privilege: Always run your application with the least privileges it needs.
  • Application Level Checks: Avoid passing user input directly into commands that evaluate arbitrary code, like eval() or system(). Instead, use the user input as a switch to choose from a set of developer-controlled commands (see the sketch after this list).
  • Bounds Checks: Implement proper bounds checks for non-safe languages like C++. Avoid unsafe string functions. Keep in mind that even safe languages like Python and Java use native libraries.
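
To make the “switch” idea in the second point concrete, here is a minimal sketch; the command table and its contents are hypothetical, not part of Gruyere:

import subprocess

# User input selects from fixed, developer-controlled commands; it is never
# evaluated or passed to a shell.
ALLOWED_COMMANDS = {
    'disk_usage': ['df', '-h'],
    'uptime': ['uptime'],
}

def run_command(name):
  command = ALLOWED_COMMANDS.get(name)
  if command is None:
    raise ValueError('Unknown command: %r' % name)
  return subprocess.check_output(command)  # argument list fixed by the developer

# run_command('uptime')      # allowed
# run_command('rm -rf /')    # rejected: not in the table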

Hint 1

Look at all the files installed with Gruyere. Are there any files that shouldn’t be there?

Hint 2

Look for a .gtl file that isn’t referenced anywhere.

Exploit and Fixes

To exploit, you can use the debug dump page dump.gtl to display the contents of the database via the following URL:

http://google-gruyere.appspot.com/123/dump.gtl

To fix, always make sure debug features are not installed. In this case, delete dump.gtl. This is an example of the kind of debug feature that might be left in an application by mistake. If a debug feature like this is necessary, then it needs to be carefully locked down: only admin users should have access and only requests from debug IP addresses should be accepted.

This exploit exposes the users’ passwords. Passwords should never be stored in cleartext. Instead, you should use password hashing. The idea is that to authenticate a user, you don’t need to know their password, only be convinced that the user knows it. When the user sets their password, you store only a cryptographic hash of the password and a salt value. When the user re-enters their password later, you recompute the hash and if it matches you conclude the password is correct. If an attacker obtains the hash value, it’s very difficult for them to reverse that to find the original password. (Which is a good thing, since despite lots of advice to the contrary, users frequently use the same weak passwords for multiple sites.)
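
For illustration only, here is a minimal salted-hash sketch using Python’s standard library; the iteration count and salt size are placeholder values, and a real application should follow current password-storage guidance:

import hashlib
import hmac
import os

def hash_password(password, salt=None):
  # Returns (salt, digest); a new random salt is generated for each new password.
  if salt is None:
    salt = os.urandom(16)
  digest = hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'), salt, 100000)
  return salt, digest

def verify_password(password, salt, stored_digest):
  _, digest = hash_password(password, salt)
  return hmac.compare_digest(digest, stored_digest)  # constant-time comparison

salt, digest = hash_password('correct horse battery staple')
print(verify_password('correct horse battery staple', salt, digest))  # True
print(verify_password('wrong guess', salt, digest))                   # False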

Hint

You can upload a file of any type.

Exploit and Fixes

To exploit this, note that Gruyere allows the user to upload files of any type, including .gtl files, so the attacker can simply upload their own copy of dump.gtl (or a similar file) and then access it. In fact, as noted earlier, hosting arbitrary content on the server is a major security risk, whether it’s HTML, JavaScript, Flash, or something else. Allowing a file with an unknown file type may lead to a security hole in the future.

To fix, we should do several things:

  1. Only files that are part of Gruyere should be treated as templates.
  2. Don’t store user uploaded files in the same place as application files.
  3. Consider limiting the types of files that can be uploaded (via a whitelist; a minimal sketch follows this list).
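
Here is a minimal sketch of the whitelist idea from item 3; the allowed extensions are illustrative, not Gruyere’s actual policy:

import os

ALLOWED_UPLOAD_EXTENSIONS = {'.txt', '.jpg', '.png', '.gif'}  # example whitelist

def is_allowed_upload(filename):
  # Compare only the lower-cased extension, so Photo.JPG is treated like photo.jpg.
  _, ext = os.path.splitext(filename.lower())
  return ext in ALLOWED_UPLOAD_EXTENSIONS

print(is_allowed_upload('cat.png'))   # True
print(is_allowed_upload('dump.gtl'))  # False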

Hint 1

You can insert something in your private snippet which will display the contents of the database.

Hint 2

This attack is closely related to the previous ones. There is a bug in the code that expands templates that you can exploit.

Exploit and Fixes

There is a defect in Gruyere’s template expansion code that reparses expanded variables. Specifically, when expanding a block it expands variables in the block; then it parses the block as a template and expands variables again.

To exploit, add this to your private snippet:

{{_db:pprint}}

To fix, modify the template code so it never reparses inserted variable values. The defect in the code is due to the fact that ExpandTemplate calls _ExpandBlocks followed by _ExpandVariables, but _ExpandBlocks calls ExpandTemplate on nested blocks. So if a variable is expanded inside a nested block and contains something that looks like a variable template, it will get expanded a second time. That sounds complicated because it is complicated. Parsing blocks and variables separately is a fundamental flaw in the design of the expander, so the fix is non-trivial.

This exploit is possible because the template language allows arbitrary database access. It would be safer if the templates were only allowed to access data specifically provided to them. For example, a template could have an associated database query and only the data matched by that query would be passed to the template. This would limit the scope of a bug like this to data that the user was already allowed to access.

Hint 1

Can you figure out how to change the value of the private snippet in the AJAX response?

Hint 2

What happens if a JSON object has a duplicate key value?

Exploit and Fix

To exploit, create a user named private_snippet and create at least one snippet. The JSON response will then be {'private_snippet' : <user's private snippet>, ..., 'private_snippet' : <attacker's snippet>} and the attacker’s snippet replaces the user’s.

To fix, the AJAX code needs to make sure that the data only goes where it’s supposed to go. The flaw here is that the JSON structure is not robust. A better structure would be [<private_snippet>, {<user> : <snippet>, ...}].
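
The duplicate-key problem is easy to demonstrate. Python’s json module, like many JSON parsers, silently keeps the last value for a repeated key, which is why the attacker’s entry wins:

import json

# The attacker's duplicate key silently overrides the legitimate value.
response = '{"private_snippet": "real secret", "private_snippet": "attacker data"}'
print(json.loads(response))  # {'private_snippet': 'attacker data'}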

Hint 1

Look at what the script does to replace the snippets on the page. Can you get it to replace the sign in link?

Hint 2

Look at the AJAX code to see how it replaces each snippet, and then look at the structure of the home page and see if you can see what else you might be able to replace. (You can’t just replace the sign in link. You’ll have to replace a bit more.)

Exploit and Fix

To exploit, create a user named menu-right and publish a snippet that looks exactly like the right side of the menu bar:

<a href='http://evil.example.com/login'>Sign in</a>
| <a href='http://evil.example.com/newaccount.gtl'>Sign up</a>

If the user is already logged in, the menu bar will look wrong. But that’s ok, since there’s a good chance the user will just think they somehow accidentally got logged out of the web site and log in again.

To fix, the process of modifying the DOM needs to be made more robust. When user values are used as DOM element identifiers, you should ensure that there can’t be a conflict like the one here, for example by applying a prefix to user values (like id="user_"). Even better, use your own identifiers rather than user values.

This spoofing attack is easily detected when the user clicks Sign in and ends up at evil.example.com. A clever attacker could do something harder to detect, like replacing the Sign in link with a script that renders the sign in form on the current page with the form submission going to their server.

Buffer Overflow and Integer Overflow

A buffer overflow vulnerability exists when an application does not properly guard its buffers and allows user data to write past the end of a buffer. This excess data can modify other variables, including pointers and function return addresses, leading to arbitrary code execution. Historically, buffer overflow vulnerabilities have been responsible for some of the most widespread internet attacks, including the SQL Slammer, Blaster, and Code Red worms. The PS2, Xbox, and Wii have all been hacked using buffer overflow exploits.

While not as well known, integer overflow vulnerabilities can be just as dangerous. Any time an integer computation silently returns an incorrect result, the application will operate incorrectly. In the best case, the application fails. In the worst case, there is a security bug. For example, if an application checks that length + 1 < limit, that check will succeed when length is the largest positive integer value, because length + 1 silently wraps around; this can then expose a buffer overflow vulnerability.

This codelab doesn’t cover overflow vulnerabilities because Gruyere is written in Python, and therefore not vulnerable to typical buffer and integer overflow problems. Python won’t allow you to read or write outside the bounds of an array and integers can’t overflow. While C and C++ programs are most commonly known to expose these vulnerabilities, other languages are not immune. For example, while Java was designed to prevent buffer overflows, it silently ignores integer overflow.

Platform Vulnerabilities

Like all applications, Gruyere is vulnerable to platform vulnerabilities. That is, if there are security bugs in the platforms that Gruyere is built on top of, then those bugs also apply to Gruyere. Gruyere’s platform includes the Python runtime system and libraries, App Engine, the operating system that Gruyere runs on, and the client-side software (including the web browser) that users use to run Gruyere. While platform vulnerabilities are important, they are outside the scope of this codelab, as you generally can’t fix platform vulnerabilities by making changes to your application. Fixing platform vulnerabilities yourself is also not practical for most people, but you can mitigate your risks by being diligent in applying security updates as they are released by platform vendors.

SQL Injection

Just as XSS vulnerabilities allow attackers to inject script into web pages, SQL injection vulnerabilities allow attackers to inject arbitrary SQL into database queries. When a SQL query is executed it can either read or write data, so an attacker can use SQL injection to read your entire database as well as overwrite it, as described in the classic Bobby Tables XKCD comic. If you use SQL, the most important advice is to avoid building queries by string concatenation: use parameterized API calls instead. This codelab doesn’t cover SQL injection because Gruyere doesn’t use SQL.
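
Gruyere doesn’t use SQL, but as a minimal illustration of the advice above (the table and data here are made up), a parameterized query keeps hostile input as data rather than letting it rewrite the query:

import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE users (name TEXT, email TEXT)')
conn.execute('INSERT INTO users VALUES (?, ?)', ('alice', 'alice@example.com'))

name = "alice' OR '1'='1"  # hostile input

# Unsafe: building the query by concatenation would let the input rewrite it.
# query = "SELECT email FROM users WHERE name = '" + name + "'"

# Safe: the parameterized API treats the input purely as data.
rows = conn.execute('SELECT email FROM users WHERE name = ?', (name,)).fetchall()
print(rows)  # [] -- the hostile string matches no user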


Learn how to make web apps more secure. Do the Gruyere codelab.