blog.smart-java.nl
Ordina J-Technologies – Java Blog

Archive for August 2009




Cross Site Request Forgery: An introduction

By: Jan-Kees van Andel, 31 August 2009

What is it?

Cross-site request forgery (CSRF) is a type of attack on a web application, launched from another web page that is open in a different tab or window of the same browser. The server cannot tell legitimate requests from malicious ones, since both arrive from the user’s browser with the same cookies and look identical. So the server simply assumes the request is legitimate. This gives the attacker the opportunity to impersonate the user and execute commands without the user knowing it. Or, more abstractly, CSRF may violate integrity.

The attack is caused by three things.

  1. First, the fact that most websites completely rely on a session cookie to represent the user.
  2. Second, because browsers allow websites to invoke certain cross domain requests.
  3. And third, the browser appends all cookies that belong to a domain to every request to that domain.

Example

For example, you have a social networking site where you can add friends to your network. Let’s say everyone has a link on his/her personal page that can be used to add that person to your “friends” list.

The link may look like this:

<a href="/addfriend.do?userid=12345">Add friend</a>

The application uses the HttpSession (or a JAAS Subject, or whatever; the attack is not specific to Java) to determine the currently logged-in user and their userId. The userId of the new friend is taken from the query string. Both are combined and the relation is created. This mechanism trusts the integrity of the session, which is not a bad thing in itself.

So far so good…

Attacking the example website

But the above example has one major flaw. It assumes that a session safely represents the user and bases security checks on this assumption. For some reason, people think that the session is “safe”, just because it exists on the server. But there is one big issue with sessions. The server determines which session belongs to a certain user by a sessionId, which is stored in a cookie in the user’s browser (URL rewriting is also possible, but not common). From the server’s point of view, this cookie IS the user. So, when someone obtains the cookie value, that person is automatically authenticated. But that’s not the issue here. That would be a case of session hijacking (or session fixation). You can mitigate the risk of session hijacking using common security mechanisms, like SSL, and by ensuring that only proper input is accepted. But we’re talking about CSRF here.

CSRF is possible because all cookies that belong to a certain domain are sent with each request to that domain, regardless of which domain the request originates from, even when the originating web page is running in a different tab or window. SSL doesn’t help you here. If a call is made using HTTPS, the cookies are still appended to the request and a valid HTTPS request is executed.

Note: The details of cookie sharing differ per browser. For example, in Internet Explorer, when you use CTRL+N to open a new window, cookies (and thus sessions) are shared between the window and its “parent”, but when a new process is created (e.g. from the Start Menu or Quick Launch), the two act as two separate browsers and cookies are not shared. This distinction is often not visible to (or known by) the end user. But there are more issues, like the annoying IE8-I-share-cookies-between-tabs “feature”.

What can an attacker do with this knowledge? Consider the example above. If the social networking site permits HTML content on your personal page, the attacker can first create an account and then put the following HTML snippet on his personal page.

<img src="/addfriend.do?userid=[ATTACKER_USER_ID]" />

The browser sees this image tag and tries to load the image, effectively sending an HTTP GET request to the specified URL. It doesn’t matter that the URL doesn’t point to a real image; the processing on the server is done anyway. The request is sent from the victim’s browser, so the victim’s cookies are appended to the request.

Now, everyone who visits the attacker’s personal page becomes a friend of the attacker.
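
To make the flaw concrete, here is a minimal in-memory sketch of such a handler. It is not real servlet code: the class NaiveFriendService, its methods and the session id strings are hypothetical stand-ins for the HttpSession lookup described above. The point is only that the cookie value is the sole proof of identity, so a request the victim never intended is processed like any other.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of the social networking site (not a real servlet).
class NaiveFriendService {
    // sessionId (the cookie value) -> id of the logged-in user
    private final Map<String, Integer> sessions = new HashMap<>();
    // userId -> ids of that user's friends
    private final Map<Integer, Set<Integer>> friends = new HashMap<>();

    void login(String sessionId, int userId) {
        sessions.put(sessionId, userId);
    }

    // Models GET /addfriend.do?userid=... where the cookie is the only proof of identity.
    boolean addFriend(String sessionCookie, int friendUserId) {
        Integer currentUser = sessions.get(sessionCookie);
        if (currentUser == null) {
            return false; // no valid session: request rejected
        }
        // No check that the user actually intended this request.
        friends.computeIfAbsent(currentUser, k -> new HashSet<>()).add(friendUserId);
        return true;
    }

    boolean isFriend(int userId, int friendUserId) {
        return friends.getOrDefault(userId, new HashSet<>()).contains(friendUserId);
    }
}
```

If the victim (user 1) is logged in with session "victim-session" and the browser loads the attacker’s image tag, the call addFriend("victim-session", attackerId) succeeds, because the browser attached the victim’s cookie on its own.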

With a bit more effort, the attacker can do almost anything with the entire social networking site. SAMY is an example of an attack on MySpace which used XSS and CSRF techniques to do its thing.

Almost all major websites (like GMail) have had CSRF vulnerabilities, and many major websites still do.

How about POST?

The previous example uses an image to trigger an HTTP GET request. So you might think that POST is safer. Well, contrary to what many people claim, POST is not much safer than GET (from a networking perspective they are the same). Where we use an image to trigger a GET request, we can just as easily use a (hidden) form to trigger a POST request, as shown in the next snippet:

<form id="myForm" action="/addfriend.do" method="post">
  <input type="hidden" name="userid" value="[ATTACKER_USER_ID]" />
</form>
<script type="text/javascript">
  document.getElementById("myForm").submit();
</script>

Cross Site?

The risk of CSRF may be small or big. In the above example, the attack concentrated on the same website.

This is a case of a “Stored CSRF Vulnerability” (from www.isecpartners.com/documents/XSRF_Paper.pdf), which means that the attacker uses a security hole in the application itself to issue the HTTP requests. The attacker uses techniques like XSS as an “enabler” for their CSRF attack. Preventing this kind of attack is entirely the responsibility of the website. It just needs to prevent users from entering arbitrary HTML.

But CSRF also works cross domain. In that case, we talk about “Reflected CSRF” (also from www.isecpartners.com/documents/XSRF_Paper.pdf). In this case, your website itself may be perfectly safe (meaning it doesn’t accept HTML snippets like in the example) but an attacker may still send HTTP requests to your site from a different site, like a forum, blog, email, IM or whatever. Effectively this means your website can become the victim of a hole in another website. Luckily, this method often fails, since the user must be logged in on the targeted site and visit the malicious other site at the same time. How often this happens depends on the type of website. I’m sure most people don’t visit many dangerous websites when doing electronic banking, but it’s not uncommon to do this when logged in on webmail, especially since these kinds of websites often provide features like Single Sign On or long sessions.

But doesn’t the browser prevent this behavior?!?!?! Nope, as I’ll indicate in the following section.

Same Origin Policy

First implemented in Netscape Navigator 2.0, the Same Origin Policy is one of the main security mechanisms in modern browsers. Basically, it allows scripts that are located on the same domain to interact, while preventing interaction between scripts that are on different domains. By interaction, I mean accessing properties, invoking methods and other scripting mechanisms.

The Same Origin Policy is invoked when one of the following actions happens.

  • Requesting URLs with XMLHttpRequest (XHR)
  • Accessing frames and iframes
  • Accessing documents
  • Accessing cookies
  • Accessing browser windows

The list above is not complete, but contains the major actions that are validated against the Same Origin Policy.

This means that for example, it’s not allowed to invoke cross domain XHR requests. That’s good. It adds a serious layer of security and in most cases, you won’t need cross domain XHR requests.

But the Same Origin Policy doesn’t apply to everything. For example, it’s still possible to reference images, scripts and style sheets from different domains. It’s also possible to submit forms across domains.

I’m talking about domains a lot, so let’s see what I mean by a domain, using the following listing. It shows how some example URLs relate to the following web page: http://www.domain.com/pages/homepage.html. It also shows whether the Same Origin Policy allows document retrieval, and if not, why.

Other URL                                       Allowed?     Reason
http://www.domain.com/otherDirectory/page.html  Allowed      Different directory is allowed
https://www.domain.com/page.html                Not allowed  Different protocol
http://sub.domain.com/page.html                 Not allowed  Different subdomain
http://www.domain.com:8080/page.html            Not allowed  Different port

So a different domain can also mean: different protocol, different port or different subdomain.
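
The comparison behind this table can be sketched in a few lines of Java. This is only an illustration of the rule (same protocol, same host, same port), not how any browser implements it. The class and method names are made up, and the sketch assumes plain http and https URLs with their usual default ports.

```java
import java.net.URI;

class SameOrigin {
    // Returns the explicit port, or the scheme's default when the URL has none.
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    // Two URLs share an origin only if protocol, host and port all match.
    // The path (directory) plays no role in the comparison.
    static boolean sameOrigin(String a, String b) {
        URI x = URI.create(a);
        URI y = URI.create(b);
        return x.getScheme().equalsIgnoreCase(y.getScheme())
                && x.getHost().equalsIgnoreCase(y.getHost())
                && effectivePort(x) == effectivePort(y);
    }
}
```

Calling sameOrigin with http://www.domain.com/pages/homepage.html and each URL from the listing gives true for the first row and false for the other three.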

Also, be warned that there are several ways around the Same Origin Policy, for example using the Flash crossdomain.xml policy file. Flickr was vulnerable to this, as written here.

To summarize, the Same Origin Policy provides quite a lot of safety, but not enough to completely prevent CSRF attacks. It does provide a helping hand, as I will show in the last part.

For a more detailed discussion of the Same Origin Policy, see:
https://developer.mozilla.org/En/Same_origin_policy_for_JavaScript and
http://taossa.com/index.php/2007/02/08/same-origin-policy/ and
http://taossa.com/index.php/2007/02/17/same-origin-proposal/

HTTPOnly cookie flag

As some may know, HTTPOnly is a cookie flag that is respected by most major browsers. Some browsers added HTTPOnly support only recently, like Firefox, which has supported it only since version 3. Others, like IE, have supported the flag for ages.

It is included in the Set-Cookie HTTP response header, as follows:

Set-Cookie: <name>=<value>[; <Max-Age>=<age>]
[; expires=<date>][; domain=<domain_name>]
[; path=<some_path>][; secure][; HTTPOnly]

It is appended to the Set-Cookie value in just the same way as the Secure flag, which prevents the cookie from being sent across insecure (non-HTTPS) connections.

Basically, marking a cookie as HTTPOnly prevents any script from accessing it. The browser still includes an HTTPOnly cookie in the requests it makes to the server, but the cookie is not accessible from JavaScript (through the document.cookie property).
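
As a small illustration of where the flag ends up, the sketch below assembles a Set-Cookie header value by hand. The helper CookieHeader.build and its parameters are hypothetical, and it does no escaping or validation. In a real application you would normally let your container or framework write this header.

```java
class CookieHeader {
    // Builds a Set-Cookie header value; optional parts are appended only when requested.
    static String build(String name, String value, String path,
                        boolean secure, boolean httpOnly) {
        StringBuilder sb = new StringBuilder();
        sb.append(name).append('=').append(value);
        if (path != null) {
            sb.append("; path=").append(path);
        }
        if (secure) {
            sb.append("; secure");   // only send over HTTPS
        }
        if (httpOnly) {
            sb.append("; HttpOnly"); // hide the cookie from scripts
        }
        return sb.toString();
    }
}
```

For example, build("JSESSIONID", "abc123", "/", true, true) yields the value JSESSIONID=abc123; path=/; secure; HttpOnly.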

For more information, see the OWASP HTTPOnly documentation.

By the way, the Java Servlet specification only adds HttpOnly support in version 3.0, which is not final at the time of writing!

Fixes

The solution is actually quite simple. Just use unguessable URLs in your website. You can achieve this by including a random token in the query string or POST data of “important” requests. It’s not necessary to include it in requests that don’t change any state on the server; only state-modifying requests need to have the token.

Why not include it in simple GET requests? Simple: they don’t change anything, and the attacker has no way to read the information that is sent back to the browser, because of the Same Origin Policy. As indicated before, the Same Origin Policy prevents scripts from reading content from a remote page. Using XHR is also not possible, since the Same Origin Policy prevents cross domain requests using XHR. So, thanks to the Same Origin Policy, confidentiality is ensured. But to ensure integrity, some steps need to be taken by the developer.

For integrity, you need to implement a mechanism like the following: When you render an HTML form, be sure to include a hidden field with a random integrity token. You’ll also need to put this token into the session. When the form is submitted, you’ll need to compare the submitted token with the token in the session. If they are not the same, trigger an error or log the user off. Whatever you do, don’t let the request continue executing!
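
The mechanism described above could look roughly like this. It is a minimal sketch under a few assumptions: a plain Map stands in for the HttpSession, and the class name CsrfTokens and the attribute name "csrfToken" are invented for the example. Rendering the hidden field and rejecting the request are left to the surrounding application.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;

class CsrfTokens {
    private static final SecureRandom RANDOM = new SecureRandom();
    static final String SESSION_KEY = "csrfToken"; // hypothetical session attribute name

    // Creates a random token, stores it in the session and returns it
    // so it can be rendered into a hidden form field.
    static String issue(Map<String, Object> session) {
        byte[] bytes = new byte[32];
        RANDOM.nextBytes(bytes);
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        session.put(SESSION_KEY, token);
        return token;
    }

    // True only when the submitted token matches the one stored in the session.
    static boolean isValid(Map<String, Object> session, String submitted) {
        Object expected = session.get(SESSION_KEY);
        if (!(expected instanceof String) || submitted == null) {
            return false;
        }
        // MessageDigest.isEqual compares the full arrays instead of stopping
        // at the first mismatch.
        return MessageDigest.isEqual(
                ((String) expected).getBytes(StandardCharsets.UTF_8),
                submitted.getBytes(StandardCharsets.UTF_8));
    }
}
```

When rendering a form you would call issue(session) and write the result into the hidden field. When handling the submit you would call isValid(session, submittedValue) first and stop processing if it returns false.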

Also, don’t forget to include an integrity token in state-changing GET requests and Ajax calls and check appropriately!

There are a lot of variables in a successful CSRF prevention mechanism. In the next blog, I’ll give you some guidelines so you can pick the right ones for your website.




Client side performance tuning: Minimize HTTP requests

By: Jan-Kees van Andel, 23 August 2009

For some reason, when talking about web site/application performance, people think about the server side, like optimizing database queries and tuning connection pools. Yeah, of course, it’s important to tune the server side. But what people often don’t realize is that there’s lots to gain on the client side as well.

For the people who haven’t done so already, download Firefox with the Firebug plugin. You can activate Firebug using the F12 key.

Firebug has a really useful “Net” tab (which must be activated explicitly because it has quite an overhead). You can use this tab for debugging (inspecting headers and stuff), but it also gives you a good indication of your performance.

To give you an example, below is a screenshot of java.sun.com.

[Screenshot: Firebug “Net” tab for java.sun.com]

Analyzing the problem

What can a developer learn from this output?

  1. First, you can see that the server side part is quite fast. It only takes 220ms to write the entire page to the client (the first bar). They could maybe optimize a bit by flushing earlier, to have better network utilization, but this is probably the consequence of an architectural choice (I assume they’re using an MVC style approach like Struts, where you would first gather all data and then forward to a JSP page). It wouldn’t even be a big win, since the server side performance is only a small part of the total performance.
  2. Second, we can see a lot of activity is happening after the initial page request. By using the handy Firebug filters, I can easily see that 3 CSS style sheets, 17 JavaScripts and 38 images are fetched. This sums up to a total page load time of almost 4 seconds. In fact, there are so many requests that they don’t even fit on my screen.
  3. A third lesson we can learn from this output is that almost all requests are made sequentially. I’ve made a simple calculation. The total amount of data downloaded is 137KB and it took around 4 seconds. That’s around 35KB/s!!! As an indication, I normally download large files at about a megabyte per second. Talk about bad network utilization!
  4. Fourth, there are some moments where no resources are downloaded at all. This is because downloaded resources have to be parsed, executed, rendered or whatever. This also hurts parallelism. After all, the computations required to do something with a resource don’t involve network I/O, so it would be nice if the browser started downloading the next file. This would be way more efficient. Fixing this issue is not trivial in most cases, so I’m not gonna talk about it here. Steve Souders has blogged about it extensively.

Http connections in your browser

Before trying to fix the problem (which I’m not going to do, since it’s not my website), we need to know some things about how web browsers handle requests.

An important thing to know is that browsers only use a limited number of parallel connections to the same web server. This follows the HTTP 1.1 specification (RFC 2616), which states that a single client should not maintain more than 2 connections to any server. Note that browsers apply this limit per hostname rather than per IP address, so two hostnames that resolve to the same IP address each get their own set of connections. Below is a table with some popular browsers and their default parallel connection counts, using HTTP 1.1 (when using HTTP 1.0, the numbers may differ).

Browser                # of Connections
Internet Explorer <8   2
Internet Explorer >=8  6
Firefox <3             2
Firefox >=3            6
Safari                 4

References:
http://support.microsoft.com/?scid=kb%3Ben-us%3B282402&x=8&y=8
http://blogs.zdnet.com/Burnette/?p=565
http://www.stevesouders.com/blog/2008/03/20/roundup-on-parallel-connections/

You can change the settings of your browser. For example, in Firefox you can use about:config to increase the number of parallel connections. But since most visitors of your website won’t change this setting and just use a mainstream browser, you can’t rely on it. On my current project (online e-banking, so the end users are “normal” people, not whizkids), we have to support browsers back to IE6. This means we have to perform well in browsers that use only 2 parallel connections per web server. Also, statistics indicate that people who use old browsers are often not on high-bandwidth connections (or are on a corporate network).

So… we had to do some work to enhance user experience.

To wrap up: The issue is simply that too many resources are loaded. It shows up in the chatty waterfall chart.
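
A back-of-the-envelope model shows why the number of requests matters so much when only a few connections are available. The sketch below is a deliberate simplification and not a measurement: it assumes every request takes the same time and that the browser always keeps all its connections busy. The class name and the per-request time are hypothetical.

```java
class WaterfallEstimate {
    // Number of sequential "rounds" needed when at most `connections`
    // requests can be in flight at the same time.
    static int rounds(int requests, int connections) {
        return (requests + connections - 1) / connections; // ceiling division
    }

    // Rough load time estimate, assuming every request takes equally long.
    static int estimateMillis(int requests, int connections, int millisPerRequest) {
        return rounds(requests, connections) * millisPerRequest;
    }
}
```

The java.sun.com page fetched 3 + 17 + 38 = 58 resources. With 2 connections that is 29 rounds, with 6 connections it is 10. Aggregating the style sheets and scripts into one file each leaves 40 resources, or 20 rounds on 2 connections.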

The solution

With a bit of background knowledge, we can start looking at the solution. We need a way to increase the average download speed and so decrease the loading times. As the image shows, downloading many small files is not very efficient, so let’s start combining them. This saves some overhead, because we have to download fewer files. Also, as a nice side effect, compression techniques like GZIP are more efficient on large files.

But there are issues, because we developers like to create maintainable code in separate small files. And working with large project teams and version control systems is often a mess when you have large files that everyone is editing concurrently.

Luckily, the solution is easy. Keep the small files in version control and aggregate them into one big file at build time. On my project, I’ve used Yahoo’s YUICompressor, which not only aggregates, but also minifies the scripts and stylesheets. http://developer.yahoo.com/yui/compressor/

Since we use Apache Maven 2 as our build tool, I’ve integrated compression into our build using a Maven plugin which invokes YUICompressor. I’ve used http://alchim.sourceforge.net/yuicompressor-maven-plugin/

Using the default settings, the plugin only minifies the files. The minified files get the “-min” suffix.

Below is an example of a Maven 2 configuration where the listed files are minified and aggregated into one big “all.js” file. The normal files are still present, but I’ll get into that later.

<project>
  <build>
    <plugins>
      <plugin>
        <groupId>net.sf.alchim</groupId>
        <artifactId>yuicompressor-maven-plugin</artifactId>
        <executions>
          <execution>
            <goals>
              <goal>compress</goal>
            </goals>
          </execution>
        </executions>
        <configuration>
          <nosuffix>true</nosuffix>
          <aggregations>
            <aggregation>
              <insertNewLine>true</insertNewLine>
              <output>${project.build.directory}/${project.build.finalName}/js/all.js</output>
              <includes>
                <include>${project.build.directory}/${project.build.finalName}/js/jquery.js</include>
                <include>${project.build.directory}/${project.build.finalName}/js/util.js</include>
                <include>${project.build.directory}/${project.build.finalName}/js/defs.js</include>
                <include>${project.build.directory}/${project.build.finalName}/js/ajax.js</include>
                <include>${project.build.directory}/${project.build.finalName}/js/handlers.js</include>
              </includes>
            </aggregation>
          </aggregations>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

As you can see, I’ve explicitly specified the files to include. You can also use wildcards. I prefer the explicit list because now I’m 100% sure of the ordering in the final aggregation. With wildcards, on the other hand, a simple refactoring (like renaming a file) could silently break the application at runtime. By being explicit, I get an error telling me that a file is missing.
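
To show what “aggregate in a fixed order and fail on a missing file” boils down to, here is a small stand-alone sketch. It is not how the Maven plugin is implemented and it does no minification. The class name Aggregator is made up, and the newline after each file only mimics the idea behind the insertNewLine setting.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.List;

class Aggregator {
    // Concatenates the inputs in exactly the given order, adding a newline after each file.
    // Fails with NoSuchFileException as soon as a listed file does not exist,
    // instead of silently producing an incomplete result.
    static void aggregate(List<Path> inputs, Path output) throws IOException {
        StringBuilder all = new StringBuilder();
        for (Path input : inputs) {
            if (!Files.isRegularFile(input)) {
                throw new NoSuchFileException(input.toString());
            }
            all.append(new String(Files.readAllBytes(input), StandardCharsets.UTF_8));
            all.append('\n');
        }
        Files.write(output, all.toString().getBytes(StandardCharsets.UTF_8));
    }
}
```

Because the list is explicit, renaming util.js without updating the list makes the build step fail right away, which is the behaviour I want.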

CSS can also be aggregated and minified. And I must say, YUICompressor is really impressive. For example, it doesn’t just remove all comments, but determines per occurrence if it needs to be removed. For example, if you have a Safari Commented Backslash Hack v2 (http://perishablepress.com/press/2006/08/27/css-hack-dumpster/), it will not be removed, since that would break your CSS.

Below is an example where both the JS and CSS files are aggregated into two files: “all.js” and “all.css”.

<plugin>
  <groupId>net.sf.alchim</groupId>
  <artifactId>yuicompressor-maven-plugin</artifactId>
  <executions>
    <execution>
      <goals>
        <goal>compress</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <nosuffix>true</nosuffix>
    <aggregations>
      <aggregation>
        <insertNewLine>true</insertNewLine>
        <output>${project.build.directory}/${project.build.finalName}/js/all.js</output>
        <includes>
          <include>${project.build.directory}/${project.build.finalName}/js/jquery.js</include>
          <include>${project.build.directory}/${project.build.finalName}/js/util.js</include>
          <include>${project.build.directory}/${project.build.finalName}/js/defs.js</include>
          <include>${project.build.directory}/${project.build.finalName}/js/ajax.js</include>
          <include>${project.build.directory}/${project.build.finalName}/js/handlers.js</include>
        </includes>
      </aggregation>
      <aggregation>
        <insertNewLine>true</insertNewLine>
        <output>${project.build.directory}/${project.build.finalName}/css/all.css</output>
        <includes>
          <include>${project.build.directory}/${project.build.finalName}/css/global.css</include>
          <include>${project.build.directory}/${project.build.finalName}/css/main.css</include>
          <include>${project.build.directory}/${project.build.finalName}/css/buttons.css</include>
          <include>${project.build.directory}/${project.build.finalName}/css/menu.css</include>
          <include>${project.build.directory}/${project.build.finalName}/css/components.css</include>
        </includes>
      </aggregation>
    </aggregations>
  </configuration>
</plugin>

What about images?

Images can be aggregated too, but this is way more difficult than with scripts. You’ll have to use image sprites. First you create the sprite (you can do this with online tools if you want). Then the difficulties start, since you need some CSS tricks to “select” the image from the sprite using offsets. This is not only a real pixel-pain-in-the-ass; you’ll also enter the realm of cross browser issues here.

This page is a good starting point, but beware, depending on your situation, things may become difficult.
See: http://www.alistapart.com/articles/sprites.

On the other hand, you can of course begin small. For example, creating several sprites for buttons (with hovers and sliding doors you can turn 4 images into 1), boxes and logos. Every improvement is welcome.
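
The offset trick itself is just arithmetic. As a sketch, assuming a sprite laid out as a grid of equally sized cells, the helper below generates the CSS rule for one cell. The class SpriteCss, the selector and the file name are made up for the example. Real sprites often have differently sized images, which is where the pixel pain starts.

```java
class SpriteCss {
    // CSS rule that shows the cell at (column, row) of a sprite made of equally sized cells.
    static String rule(String selector, String spriteUrl,
                       int cellWidth, int cellHeight, int column, int row) {
        int x = -column * cellWidth;  // shift the sprite to the left
        int y = -row * cellHeight;    // shift the sprite upwards
        return selector + " { width: " + cellWidth + "px; height: " + cellHeight + "px; "
                + "background: url(" + spriteUrl + ") " + x + "px " + y + "px no-repeat; }";
    }
}
```

For a 16x16 icon in the third column of the second row, rule(".btn-save", "img/buttons.png", 16, 16, 2, 1) produces a background position of -32px -16px.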

Conclusion

Aggregation can greatly improve the performance of your website. On my project, I’ve decreased load times (with an empty cache) from 7 seconds to less than 3 seconds, using only aggregation and script/CSS compression. Using GZIP, caching and some other tweaks, I’m currently even lower. I’m still planning to implement the image sprites.

The moral of this blog is that there is more than the server side and you have to remember, it’s the end user who experiences your application. End users really hate hiccups, long load times and web pages that build up in strange ways. They will be annoyed and maybe even lose their trust in the integrity of the application. And when this happens, you’re in deep shit.

Also, remember: performance tuning differs from other disciplines, like security. With security, there are no (or at least few) compromises. With performance, on the other hand, you can (and should) be happy with every improvement, as the only valid performance measurement is the happiness of the end user.

Notes

  1. I didn’t remove the original files, since I want to be able to switch at runtime between aggregated files and the originals. This greatly simplifies debugging, especially in production.
  2. Check performance in all browsers. As indicated, browsers have different default values for several settings and different characteristics.
  3. Client side performance is often easy to test. You don’t need heavy load, like you would when doing server side performance testing. You can start with your development machine and a simple testing server is often good enough for a more thorough test.
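
The runtime switch from note 1 can be as simple as a flag that decides which script URLs a page references. The sketch below is one possible shape and not the code from my project: the class ScriptIncludes and the debug flag are hypothetical, and where the flag comes from (a request parameter, a system property) is left open. The file names are the ones from the Maven example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ScriptIncludes {
    // The original files, in the same order as in the aggregation.
    static final List<String> ORIGINALS = Collections.unmodifiableList(Arrays.asList(
            "js/jquery.js", "js/util.js", "js/defs.js", "js/ajax.js", "js/handlers.js"));

    // Returns the script URLs a page should reference:
    // the readable originals when debugging, the single aggregated file otherwise.
    static List<String> scripts(boolean debug) {
        return debug ? ORIGINALS : Collections.singletonList("js/all.js");
    }
}
```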