<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>blog.smart-java.nl &#187; Architectuur</title>
	<atom:link href="http://blog.smart-java.nl/blog/index.php/category/architectuur/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.smart-java.nl/blog</link>
	<description>Ordina J-Technologies - Java Blog</description>
	<lastBuildDate>Wed, 05 May 2010 20:06:33 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Cross Site Request Forgery: Implementation patterns</title>
		<link>http://blog.smart-java.nl/blog/index.php/2009/09/07/cross-site-request-forgery-implementation-patterns/</link>
		<comments>http://blog.smart-java.nl/blog/index.php/2009/09/07/cross-site-request-forgery-implementation-patterns/#comments</comments>
		<pubDate>Mon, 07 Sep 2009 19:01:50 +0000</pubDate>
		<dc:creator>Jan-Kees van Andel</dc:creator>
				<category><![CDATA[Architectuur]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[ajax]]></category>
		<category><![CDATA[CSRF]]></category>
		<category><![CDATA[XSS]]></category>

		<guid isPermaLink="false">http://blog.smart-java.nl/blog/?p=516</guid>
		<description><![CDATA[In the previous post, I showed you the tip of the CSRF iceberg and a simple prevention mechanism. In this post, I&#8217;ll elaborate on how to implement a successful CSRF prevention mechanism.
Ad-hoc token per form implementation
As stated, the solution is simple, just include a hidden field in every form, store the same [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://blog.smart-java.nl/blog/index.php/2009/08/31/cross-site-request-forgery-an-introduction/">the previous post</a>, I showed you the tip of the CSRF iceberg and a simple prevention mechanism. In this post, I&#8217;ll elaborate on how to implement a successful CSRF prevention mechanism.</p>
<h3>Ad-hoc token per form implementation</h3>
<p>As stated, the solution is simple: include a hidden field in every form, store the same value in the session and compare the two when the form is submitted.</p>
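<p>A minimal sketch of that comparison, assuming plain Java with the two token values already read from the session and from the submitted form. The class name <code>FormTokenCheck</code> and the method <code>isValid</code> are hypothetical and not part of any framework.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

class FormTokenCheck {

    // Compares the token kept in the session with the one posted in the hidden field.
    // Either value may be null (no token issued yet, or the field is missing from the form).
    static boolean isValid(String sessionToken, String submittedToken) {
        if (sessionToken == null || submittedToken == null) {
            return false;
        }
        byte[] expected = sessionToken.getBytes(StandardCharsets.UTF_8);
        byte[] actual = submittedToken.getBytes(StandardCharsets.UTF_8);
        // Byte-wise comparison; constant-time in current JDKs.
        return MessageDigest.isEqual(expected, actual);
    }

    public static void main(String[] args) {
        String sessionToken = "k3J9xQ"; // value stored when the form was rendered
        System.out.println(isValid(sessionToken, "k3J9xQ")); // same value submitted
        System.out.println(isValid(sessionToken, "forged")); // attacker guessed wrong
        System.out.println(isValid(sessionToken, null));     // hidden field missing
    }
}
```

<p>In a real application the first argument would come from the session and the second from the request parameter of the hidden field.</p>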
<p>Although it works, this fix has some disadvantages.</p>
<ol>
<li>First, security is not turned on by default. Every developer needs to be aware of the concept of CSRF and must not forget it when adding new forms. This also applies to developers who have to maintain the website in the future. They may not even know about CSRF in the first place and just remove &#8216;that silly token&#8217; from the form.</li>
<li>Second, it&#8217;s probably not the most elegant solution from a <a href="http://en.wikipedia.org/wiki/Separation_of_concerns">Separation of Concerns</a> perspective. Now, every form needs to be aware of security and you need to perform a check in your server-side controller logic.</li>
</ol>
<p>So let&#8217;s elaborate on some more mature solutions.</p>
<h3>What are the variables?</h3>
<p>As always in computer science, there is no single, one-size-fits-all, perfect solution. It always depends on the context. Well, let&#8217;s give it a bit of context!</p>
<ul>
<li>
<strong>Token freshness</strong><br />
How often is a new token generated? In other words, what&#8217;s the time-to-live of a token?
</li>
<li>
<strong>Token quality</strong><br />
How random should the token be? What algorithm should you use?
</li>
<li>
<strong>Token storage</strong><br />
You&#8217;ll need to store the token in two places to compare them. First, the HTML where your forms and links reside. And somewhere else&#8230;
</li>
<li>
<strong>Back buttons and bookmarking</strong><br />
How to deal with users pressing the back button and seeing a cached web page which they submit? How to deal with bookmarks? Also, you can have Single Sign On, paving the way for deep linking into your system without re-authentication.
</li>
<li>
<strong>Ajax</strong><br />
When using Ajax, many server side requests may occur before a full page refresh happens (if any). What to do with token management?
</li>
<li>
<strong>Punishment</strong><br />
What to do with hack attempts? Do you log an error and short-circuit the server side processing? Or do you log the user off?
</li>
<li>
<strong>Logging</strong><br />
How do you log the event? What kind of information do you include in the logging?
</li>
</ul>
<p>As you can see, there are many variables and all of them may impact your implementation. Most of them even influence each other. In the next sections, I&#8217;ll discuss each of them in isolation.</p>
<h3>Token freshness</h3>
<p>This is a major one. A choice needs to be made here. Do you generate one token when the user logs in or do you re-generate a new token each time an action is made?</p>
<p>While re-generating a new token with every request may be very secure, it&#8217;s not the most practical solution in reality. In practice, it doesn&#8217;t even add that much security, because the <a href="http://taossa.com/index.php/2007/02/08/same-origin-policy/">Same Origin Policy</a> takes care of the integrity of the token. I also wouldn&#8217;t worry about brute force attacks for the token, since the user will probably be logged off long before the attacker guesses right. But when we throw back buttons, Ajax, double-clicking users, etc. into the mix, changing the token during a session has serious disadvantages.</p>
<p>So, unless you have a really, really weak token (which will never happen unless you pick an integer between 0 and 1), I would generate the token once (at login) and reuse that value.</p>
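<p>A minimal sketch of the generate-once approach, assuming a small holder object that would be stored in the session. The class name <code>SessionTokenHolder</code> and its methods are hypothetical, and <code>UUID.randomUUID()</code> only stands in for whatever generator you choose.</p>

```java
import java.util.UUID;

class SessionTokenHolder {

    private String token; // null until the user logs in

    // Call once, right after a successful login.
    synchronized void onLogin() {
        token = UUID.randomUUID().toString(); // placeholder generator
    }

    // Forms, links and checks all read the same value for the rest of the session.
    synchronized String currentToken() {
        if (token == null) {
            throw new IllegalStateException("no token: user has not logged in");
        }
        return token;
    }

    public static void main(String[] args) {
        SessionTokenHolder holder = new SessionTokenHolder();
        holder.onLogin();
        String first = holder.currentToken();
        String second = holder.currentToken();
        // The token does not change between requests in the same session.
        System.out.println(first.equals(second));
    }
}
```

<p>Because the value never changes during the session, a cached page reached with the back button still carries a valid token.</p>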
<h3>Token quality</h3>
<p>First of all, we&#8217;re not talking about high end encryption stuff here. The only use of the token is turning the URL into a &#8220;random&#8221; URL. So simple tokens will do.</p>
<p>Important to note is that the attacker will probably never see any token, unless he logs into the system (the token is only used for secure pages). So we need something between an incremental counter and some heavy duty 1024 bit encryption.</p>
<p>I would say, keep it simple. Just use a Random generator or a simple <a href="http://en.wikipedia.org/wiki/Advanced_Encryption_Standard">AES</a> key, if it makes your security department happy. When using something like AES, remember to encode it using <a href="http://en.wikipedia.org/wiki/Base64">Base64</a>. It only needs to be one way encryption, since you can compare the encrypted values.</p>
<p>This is a simple AES token generator which also encodes the token using Base64 so it can be used in the HTML.</p>

<div class="wp_syntax"><div class="code"><pre class="java5" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">java.util.Random</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">javax.crypto.Cipher</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">javax.crypto.KeyGenerator</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">javax.crypto.SecretKey</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">import</span> <span style="color: #006699;">org.apache.commons.codec.binary.Base64</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> IntegrityTokenGenerator <span style="color: #009900;">&#123;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #006600; font-weight: bold;">void</span> main<span style="color: #009900;">&#40;</span><span style="color: #003399; font-weight: bold;">String</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> args<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        IntegrityTokenGenerator integrityTokenGenerator = <span style="color: #000000; font-weight: bold;">new</span> IntegrityTokenGenerator<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>integrityTokenGenerator.<span style="color: #006633;">generateToken</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>integrityTokenGenerator.<span style="color: #006633;">generateToken</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">out</span>.<span style="color: #006633;">println</span><span style="color: #009900;">&#40;</span>integrityTokenGenerator.<span style="color: #006633;">generateToken</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399; font-weight: bold;">String</span> ENC_TYPE = <span style="color: #0000ff;">&quot;AES&quot;</span><span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399; font-weight: bold;">SecretKey</span> key<span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399; font-weight: bold;">Cipher</span> cipher<span style="color: #339933;">;</span>
    <span style="color: #000000; font-weight: bold;">private</span> <span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399; font-weight: bold;">Random</span> random<span style="color: #339933;">;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> IntegrityTokenGenerator<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #003399; font-weight: bold;">KeyGenerator</span> keyGen = <span style="color: #003399; font-weight: bold;">KeyGenerator</span>.<span style="color: #006633;">getInstance</span><span style="color: #009900;">&#40;</span>ENC_TYPE<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            key = keyGen.<span style="color: #006633;">generateKey</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
            cipher = <span style="color: #003399; font-weight: bold;">Cipher</span>.<span style="color: #006633;">getInstance</span><span style="color: #009900;">&#40;</span>ENC_TYPE<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            cipher.<span style="color: #006633;">init</span><span style="color: #009900;">&#40;</span><span style="color: #003399; font-weight: bold;">Cipher</span>.<span style="color: #006633;">ENCRYPT_MODE</span>, key<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
            random = <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399; font-weight: bold;">Random</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #003399; font-weight: bold;">Exception</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            e.<span style="color: #006633;">printStackTrace</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">throw</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399; font-weight: bold;">RuntimeException</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;Error generating integrity token&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
&nbsp;
    <span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399; font-weight: bold;">String</span> generateToken<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #009900;">&#123;</span>
            <span style="color: #003399; font-weight: bold;">String</span> tokenString = <span style="color: #0000ff;">&quot;&quot;</span> + <span style="color: #003399; font-weight: bold;">System</span>.<span style="color: #006633;">nanoTime</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span> + <span style="color: #0000ff;">&quot;&quot;</span> + random.<span style="color: #006633;">nextInt</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #006600; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> tokenBytes = tokenString.<span style="color: #006633;">getBytes</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;UTF-8&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #006600; font-weight: bold;">byte</span><span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span> encoded = cipher.<span style="color: #006633;">doFinal</span><span style="color: #009900;">&#40;</span>tokenBytes<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">return</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399; font-weight: bold;">String</span><span style="color: #009900;">&#40;</span>Base64.<span style="color: #006633;">encodeBase64</span><span style="color: #009900;">&#40;</span>encoded<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #009900;">&#40;</span><span style="color: #003399; font-weight: bold;">Exception</span> e<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
            e.<span style="color: #006633;">printStackTrace</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
            <span style="color: #000000; font-weight: bold;">throw</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399; font-weight: bold;">RuntimeException</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;Error generating integrity token&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
        <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

<p>I used AES in this example, but you can use anything, like <a href="http://en.wikipedia.org/wiki/Blowfish_%28cipher%29">Blowfish</a>. Performance is not an issue, because of the small token length. I&#8217;m talking about less than a millisecond. Also, encryption strength is not very important, because you only need a simple token. Remember to Base64-encode binary tokens into plain text.</p>
<p>Also be sure to use a random value each time you generate a token. This way, the token sequence becomes less predictable. To make it a bit more secure, include some personal data in the token, like a customer ID. The combination of data makes the token even harder to guess. Don&#8217;t put any confidential data in the token. It doesn&#8217;t add any security and only increases the risk of information leakage.</p>
<p>And make sure not to throw overly specific exceptions. Just log them and throw a general exception. Details must never end up in the wrong hands!</p>
<p>One more note about the snippet. A big advantage of this approach is that you don&#8217;t have any key management. Keys may leak and key management puts another burden on the system administrators. This implementation generates a key once (when the class is instantiated) and reuses it. Java takes care of the details.</p>
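<p>For comparison, here is a shorter variant without a cipher. This is not the generator above but a swapped-in alternative: it assumes Java 8 or later, draws 16 bytes from <code>java.security.SecureRandom</code> and encodes them with the JDK&#8217;s own <code>java.util.Base64</code> instead of Commons Codec. The class name <code>RandomTokenSketch</code> is hypothetical.</p>

```java
import java.security.SecureRandom;
import java.util.Base64;

class RandomTokenSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    // 16 random bytes become a 22-character URL-safe string (padding removed),
    // so the token can go into hidden fields and query strings unchanged.
    static String newToken() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    public static void main(String[] args) {
        System.out.println(newToken());
        System.out.println(newToken());
    }
}
```

<p>Like the AES version, this needs no key management, and the URL-safe alphabet avoids escaping problems in links.</p>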
<h3>Token storage</h3>
<p>There are many ways to keep track of the current token for the currently logged in user, but the HttpSession and Cookies are the most common. Cookies decrease the amount of state your server must manage, but for some reason, developers think cookies are unsafe.</p>
<p>When used for storing CSRF prevention tokens, cookies can be safe. Why? Because an attacker cannot see the contents of the cookies. It&#8217;s the browser that appends the cookies to an HTTP request, but the attacker cannot see them. A mechanism like HttpOnly makes cookies more secure, because it prevents scripts from accessing them. But storing security tokens on the client is inherently risky, and that is probably the prime reason not to use cookies to store the tokens. You CAN do it, but you must be very careful AND rely on the browser to do the right thing. For example, older browsers that predate HttpOnly support, including early Firefox releases, ignore the flag and leave such cookies readable by scripts.</p>
<p>But why not store the token in an HttpSession attribute? Generate it once, put it in the session and then compare this token with the submitted token. It has a little bit of overhead, but that will probably be nothing compared with the rest of the session data.</p>
<p>I prefer the HttpSession over cookies by a mile. With session attributes, I don&#8217;t have to think about token confidentiality and with cookies I do. Easy choice.</p>
<h3>Back buttons and bookmarking</h3>
<p>For some reason, clients want us to build systems in web browsers that are not specifically designed for such a purpose. But it&#8217;s a fact of life, so we need to deal with it. Browsers have back/forward buttons, refresh buttons and bookmarks and allow users to navigate freely by using the address bar. If you want your users to be able to use these features, here are some notes.</p>
<p>Only check the CSRF prevention token on state-changing actions. Users rarely bookmark actions; they bookmark pages. When you only check the CSRF token when the form is submitted, the user can use a bookmark to navigate to the form. The server then renders a token into the form. The user submits the form and it works. You just need to give the server a chance to render a form before the user triggers an action. Using <a href="http://en.wikipedia.org/wiki/Post/Redirect/Get">post-redirect-get</a> makes it even more robust, because it prevents users from bookmarking the POST request.</p>
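<p>A minimal sketch of such a rule, under the assumption that the application changes state only on POST, PUT and DELETE requests (which is what post-redirect-get encourages). The class and method names are hypothetical.</p>

```java
import java.util.Locale;

class TokenCheckPolicy {

    // Assumption: GET requests only render pages, so bookmarks and the
    // back button keep working without a token.
    static boolean requiresToken(String httpMethod) {
        String method = httpMethod.toUpperCase(Locale.ROOT);
        return method.equals("POST") || method.equals("PUT") || method.equals("DELETE");
    }

    // A request passes when it needs no token, or when the submitted token matches.
    static boolean mayProceed(String httpMethod, String sessionToken, String submittedToken) {
        if (!requiresToken(httpMethod)) {
            return true;
        }
        return sessionToken != null && sessionToken.equals(submittedToken);
    }

    public static void main(String[] args) {
        System.out.println(mayProceed("GET", "abc", null));    // bookmarked page
        System.out.println(mayProceed("POST", "abc", "abc"));  // form with valid token
        System.out.println(mayProceed("POST", "abc", null));   // forged submit
    }
}
```

<p>If some GET URLs do change state, they would have to be listed explicitly and treated like the POST case.</p>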
<p>Also, don&#8217;t re-generate tokens with each request. When you want to support back buttons, you&#8217;ll have to be aware of the possibility that the rendered page may be a cached (stale) page which contains an outdated token. So, use the same token for the duration of the session.</p>
<p>You might also want to consider storing the token per user in a persistent store, so that when the same user navigates to the site (after a time of inactivity, maybe days or weeks) he gets the same token, thus fixing possible bookmarking problems.</p>
<p>If you implement your own home-brewed CSRF prevention mechanism, you can redirect the user to a proper page when the integrity token is absent or invalid. This way, the user doesn&#8217;t see some strange error but a proper page instead.</p>
<h3>Ajax</h3>
<p>When securing your website, you also need to secure your Ajax actions. After all, from the server&#8217;s perspective an Ajax request is the same as any other request. So you need to include CSRF tokens in the Ajax requests as well.</p>
<p>Luckily, with most JavaScript libraries, this is easy, since they provide a generic abstraction for Ajax requests and often provide a way to append generic parameters to a request.</p>
<p>The implementation is really easy: render the CSRF token to a script variable and append it to all requests, as shown below.</p>

<div class="wp_syntax"><div class="code"><pre class="jsp" style="font-family:monospace;">var globalIntegrityToken = '&lt;%=(String)session.getAttribute(&quot;integrityToken&quot;)%&gt;';</pre></div></div>

<p>Yeah I know that scriptlets are ugly, but I don&#8217;t care in this example. You can use EL or whatever you like.</p>
<p>Appending the token to the request is easy, for example using jQuery:</p>

<div class="wp_syntax"><div class="code"><pre class="javascript" style="font-family:monospace;">$.<span style="color: #660066;">ajax</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#123;</span>
  type<span style="color: #339933;">:</span> <span style="color: #3366CC;">'POST'</span><span style="color: #339933;">,</span>
  url<span style="color: #339933;">:</span> <span style="color: #3366CC;">'/changeEmail.do'</span><span style="color: #339933;">,</span>
  data<span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
    newEmail<span style="color: #339933;">:</span> $<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">'#emailaddress'</span><span style="color: #009900;">&#41;</span>.<span style="color: #660066;">val</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span>
    integrityToken<span style="color: #339933;">:</span> globalIntegrityToken
  <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>Again, reusing the same token for a longer period of time makes your job a lot easier. Otherwise, if you re-generate the token per request, you need a way to update all tokens that exist in the DOM when the request is completed. This has a performance impact on the client and makes the code a lot more complex. You&#8217;ll have to make sure you update all tokens, which include:</p>
<ul>
<li>The global variable, which is easy to update in a generic way</li>
<li>All hidden fields that contain a token. If you can easily find the fields, this should also be easy</li>
<li>Any tokens in querystrings. This may be a bigger challenge and you may end up using regular expressions to parse and replace the querystrings</li>
</ul>
<p>So, when you reuse the same token for the duration of the session (or longer) Ajax will probably not be difficult.</p>
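<p>To give an idea of the regular-expression work mentioned for querystrings, here is a rough sketch. In the browser this would be written in JavaScript; it is shown in Java only to stay in one language. The parameter name <code>integrityToken</code> follows the snippets above, and the class and method names are hypothetical.</p>

```java
import java.util.regex.Matcher;

class QueryStringTokenRewriter {

    // Replaces the value of the integrityToken parameter and leaves the rest
    // of the URL (other parameters, fragment) untouched.
    static String replaceToken(String url, String newToken) {
        return url.replaceAll(
                "([?&]integrityToken=)[^&#]*",
                "$1" + Matcher.quoteReplacement(newToken));
    }

    public static void main(String[] args) {
        System.out.println(replaceToken("/addfriend.do?userid=12345&integrityToken=old", "new"));
    }
}
```

<p>Every link on the page would need this treatment after each request, which is exactly the complexity a session-long token avoids.</p>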
<h3>Punishment</h3>
<p>What do you do when you see someone fiddling with the CSRF prevention token? First of all, don&#8217;t proceed with the action the user invokes. But do you log the user off?</p>
<p>If you can, I would suggest making a switch to turn the logoff mechanism on and off. Turning it off makes it easier to test, but you need to turn it on in production. If you&#8217;re afraid that you (or the sysadmin) will forget to turn the mechanism on, you can skip this switch.</p>
<p>Just be sure your prevention mechanism doesn&#8217;t give false positives, because users will be very annoyed if they are logged off often.</p>
<p>Logging the user off has another benefit. If a hacker is messing around with your website, users will get logged off frequently and complaints will reach your help desk. This gives you the opportunity to detect the attack. If you let the user continue working without notification, you will never know that something happened, unless your sysadmins proactively scan the log files. Unfortunately, I don&#8217;t see sysadmins do this kind of thing very often&#8230; <img src='http://blog.smart-java.nl/blog/wp-includes/images/smilies/icon_sad.gif' alt=':-(' class='wp-smiley' /> </p>
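<p>The logoff switch could be sketched as follows. The system property name <code>csrf.logoff.enabled</code>, the class and the enum are all hypothetical; the one deliberate choice is that the switch defaults to on, so forgetting to configure it leaves production protected.</p>

```java
class InvalidTokenPolicy {

    enum Outcome { REJECT_ONLY, REJECT_AND_LOG_OFF }

    private final boolean logOffEnabled;

    InvalidTokenPolicy(boolean logOffEnabled) {
        this.logOffEnabled = logOffEnabled;
    }

    // The requested action is never executed; the switch only decides
    // whether the session is ended as well.
    Outcome onInvalidToken() {
        return logOffEnabled ? Outcome.REJECT_AND_LOG_OFF : Outcome.REJECT_ONLY;
    }

    public static void main(String[] args) {
        // Only an explicit "false" turns the logoff behaviour off (e.g. on a test machine).
        boolean enabled = !"false".equals(System.getProperty("csrf.logoff.enabled"));
        System.out.println(new InvalidTokenPolicy(enabled).onInvalidToken());
    }
}
```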
<h3>Logging</h3>
<p>How do you log the event? I would suggest creating a separate security log file so that sysadmins can react to hack attempts. It&#8217;s also useful as a proof in a lawsuit. I would log as much as possible. You can think about:</p>
<ul>
<li>Date/time of course and also the timezone.</li>
<li>The entire HTTP request, with cookies, referrer, user agent, etc.</li>
<li>The source IP address. Although it may be spoofed or belong to a proxy, at least you can see whether it differs from the &#8220;real&#8221; user&#8217;s IP.</li>
<li>The invoked action, along with its parameters.</li>
<li>Any server side state, specific to your system.</li>
</ul>
<p>It might be a good idea to also log the first two items when the user authenticates so you can compare the two values.</p>
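<p>A rough sketch of such a log entry, assuming <code>java.util.logging</code> and Java 8&#8217;s <code>java.time</code>. The logger name <code>security.csrf</code>, the field layout and the class name are hypothetical, and only a few of the items listed above are included.</p>

```java
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.logging.Logger;

class SecurityLogSketch {

    // A dedicated logger name, so the handler configuration can route it to its own file.
    private static final Logger SECURITY_LOG = Logger.getLogger("security.csrf");

    // Strips line breaks so request data cannot forge extra log lines.
    static String clean(String value) {
        return value == null ? "-" : value.replaceAll("[\\r\\n]+", " ");
    }

    static String formatEntry(OffsetDateTime when, String sourceIp, String action,
                              String userAgent, String referrer) {
        return String.format("%s ip=%s action=%s userAgent=\"%s\" referrer=\"%s\"",
                when.format(DateTimeFormatter.ISO_OFFSET_DATE_TIME), // timestamp with UTC offset
                clean(sourceIp), clean(action), clean(userAgent), clean(referrer));
    }

    public static void main(String[] args) {
        SECURITY_LOG.warning(formatEntry(OffsetDateTime.now(), "203.0.113.7",
                "/changeEmail.do?newEmail=x@example.org", "ExampleBrowser/1.0",
                "http://evil.example/"));
    }
}
```

<p>Writing the same fields at login time makes the later comparison a matter of lining up two entries.</p>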
<h3>To summarize</h3>
<p>In most cases, you&#8217;ll want to use a static CSRF prevention token for the duration of the session. It makes Ajax implementations easier and gives browser features like back buttons and bookmarks a more natural behavior. When you do, implementing a working CSRF prevention mechanism may not be a big issue anymore.</p>
<p>The other variables don&#8217;t have this much impact, but be sure to pick a good combination, because some variable combinations may harm each other.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.smart-java.nl/blog/index.php/2009/09/07/cross-site-request-forgery-implementation-patterns/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Cross Site Request Forgery: An introduction</title>
		<link>http://blog.smart-java.nl/blog/index.php/2009/08/31/cross-site-request-forgery-an-introduction/</link>
		<comments>http://blog.smart-java.nl/blog/index.php/2009/08/31/cross-site-request-forgery-an-introduction/#comments</comments>
		<pubDate>Mon, 31 Aug 2009 20:31:28 +0000</pubDate>
		<dc:creator>Jan-Kees van Andel</dc:creator>
				<category><![CDATA[Architectuur]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[CSRF]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[XSS]]></category>

		<guid isPermaLink="false">http://blog.smart-java.nl/blog/?p=506</guid>
		<description><![CDATA[What is it?
Cross site request forgery (CSRF) is a type of attack, made to a web application by another web application, running in another tab or window in the same browser on the same computer. The server cannot see the difference between legitimate and malicious requests, since all HTTP requests look the same. Thus, the [...]]]></description>
			<content:encoded><![CDATA[<h3>What is it?</h3>
<p>Cross site request forgery (<a href="http://www.owasp.org/index.php/Cross-Site_Request_Forgery_%28CSRF%29">CSRF</a>) is a type of attack made on a web application by another web application, running in another tab or window in the same browser on the same computer. The server cannot see the difference between legitimate and malicious requests, since all HTTP requests look the same. Thus, the server (obviously) assumes a legitimate request. This gives the hacker the opportunity to impersonate the user and execute commands without the user knowing it. Or, more abstractly, CSRF may violate <a href="http://en.wikipedia.org/wiki/Information_security#Integrity">integrity</a>.</p>
<p>The attack is caused by three things.</p>
<ol>
<li>First, the fact that most websites completely rely on a session cookie to represent the user.</li>
<li>Second, because browsers allow websites to invoke certain cross domain requests.</li>
<li>And third, the browser appends all cookies that belong to a domain to every request to that domain.</li>
</ol>
<h3>Example</h3>
<p>For example, you have a social networking site where you can add friends to your network. Let&#8217;s say everyone has a link on his/her personal page that can be used to add that person to your &#8220;friends&#8221; list.
</p>
<p>
The link may look like this:
</p>
<pre class="HTML">
&lt;a href="/addfriend.do?userid=12345"&gt;Add friend&lt;/a&gt;
</pre>
<p>
The application uses the <a href="http://java.sun.com/javaee/5/docs/api/javax/servlet/http/HttpSession.html">HttpSession</a> (or a <a href="http://java.sun.com/j2se/1.4.2/docs/guide/security/jaas/JAASRefGuide.html#Subject">JAAS Subject</a>, or whatever; the attack is not specific to Java) to determine the currently logged-in user and that user&#8217;s userId. The userId of the new friend is taken from the querystring. Both are combined and the relation is created. This mechanism trusts the integrity of the session, which is not a bad thing in itself.
</p>
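<p>Reduced to its essence, such a handler might look like the sketch below. It assumes the framework has already resolved the session cookie to a user id; the class and method names are hypothetical.</p>

```java
class AddFriendSketch {

    // sessionUserId: resolved from the session cookie by the server.
    // friendUserId:  taken from the "userid" querystring parameter.
    static String addFriend(String sessionUserId, String friendUserId) {
        if (sessionUserId == null) {
            throw new IllegalStateException("not logged in");
        }
        // Nothing here checks where the request came from: the session alone is trusted.
        return sessionUserId + " is now friends with " + friendUserId;
    }

    public static void main(String[] args) {
        System.out.println(addFriend("42", "12345"));
    }
}
```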
<p>
So far so good&#8230;
</p>
<h3>Attacking the example website</h3>
<p>But the above example has one major flaw. It assumes that a session safely represents the user and bases security checks on this assumption. For some reason, people think that the session is &#8220;safe&#8221;, just because it exists on the server. But there is one big issue with sessions. The server determines which session belongs to a certain user by a sessionId, which is stored in a cookie in the user&#8217;s browser (URL rewriting is also possible, but not common). From the server&#8217;s point of view, this cookie <b>IS</b> the user. So, when someone obtains the cookie value, this person is automatically authenticated. But that&#8217;s not the issue here. That would be a case of <a href="http://www.owasp.org/index.php/Session_hijacking_attack">session hijacking</a> (or <a href="http://www.owasp.org/index.php/Session_Fixation">session fixation</a>). You can mitigate the risk of session hijacking using common security mechanisms, like SSL, and by ensuring that only proper input is accepted. But we&#8217;re talking about CSRF here.</p>
<p>CSRF is possible because all cookies that belong to a certain domain are sent with each request to that domain, regardless of which domain the request originates from, even when that web page is running in a different tab or window. SSL doesn&#8217;t help you here. If a call is made using HTTPS, the cookies are still appended to the request and a valid HTTPS request is executed.</p>
<p><i>Note: The details of cookie sharing differ per browser. For example, in Internet Explorer, when you use CTRL+N to open a new window, cookies (and thus sessions) are shared between the window and its &#8220;parent&#8221;, but when a new process is created (e.g. from the Start Menu or Quick Launch), the two act as two separate browsers and cookies are not shared. This distinction is often not visible to (or known by) the end user. But there are more issues, like the annoying IE8-I-share-cookies-between-tabs-&#8220;feature&#8221;.</i></p>
<p>What can an attacker do with this knowledge? Consider the example above. If the social networking site permits HTML content on your personal page, the attacker can first create an account and then put the following HTML snippet on his personal page.</p>
<pre class="HTML">
&lt;img src="/addfriend.do?userid=[ATTACKER_USER_ID]" /&gt;
</pre>
<p>The browser sees this image tag and tries to load the image, effectively sending an HTTP GET request to the specified URL. It doesn&#8217;t matter that the URL doesn&#8217;t point to a real image; the processing on the server is done anyway. The request is sent from the victim&#8217;s browser, so the victim&#8217;s cookies are appended to the request.</p>
<p>
Now, everyone who visits the attacker&#8217;s personal page becomes a friend of the attacker.
</p>
<p>With a bit more effort, the attacker can do almost anything with the entire social networking site. <a href="http://en.wikipedia.org/wiki/Samy_%28XSS%29">SAMY</a> is an example of an attack on MySpace which used <a href="http://en.wikipedia.org/wiki/Cross-site_scripting">XSS</a> and CSRF techniques to do its thing.</p>
<p>Almost all major websites (like <a href="http://blogs.zdnet.com/Google/?p=434">GMail</a>) have had CSRF vulnerabilities, and many major websites still do.</p>
<h3>How about POST?</h3>
<p>The previous example uses an image to trigger an HTTP GET request. So you might think that POST is safer. Well, contrary to what many people claim, POST is not much safer than GET (from a networking perspective they are the same). Where we use an image to trigger a GET request, we can just as easily use a (hidden) form to trigger a POST request, as shown in the next snippet:</p>
<pre class="HTML">
&lt;form id="myForm" action="/addFriend.do" method="post"&gt;
  &lt;input type="hidden" name="userid" value="[ATTACKER_USER_ID]" /&gt;
&lt;/form&gt;
&lt;script type="text/javascript"&gt;
  document.getElementById("myForm").submit();
&lt;/script&gt;
</pre>
<h3>Cross Site?</h3>
<p>The risk of CSRF may be small or big. In the above example, the attack stayed within a single website.</p>
<p>This is a case of a &#8220;Stored CSRF Vulnerability&#8221; (from <a href="http://www.isecpartners.com/documents/XSRF_Paper.pdf">www.isecpartners.com/documents/XSRF_Paper.pdf</a>), which means that the attacker uses a security hole in the application itself to invoke the HTTP methods. The attacker uses techniques like XSS as an &#8220;enabler&#8221; for the CSRF attack. Preventing this kind of attack is entirely the responsibility of the website: it needs to prevent users from entering arbitrary HTML.</p>
<p>But CSRF also works cross domain. In that case, we talk about &#8220;Reflected CSRF&#8221; (also from <a href="http://www.isecpartners.com/documents/XSRF_Paper.pdf">www.isecpartners.com/documents/XSRF_Paper.pdf</a>). In this case, your website itself may be perfectly safe (meaning it doesn&#8217;t accept HTML snippets like in the example) but an attacker may still invoke HTTP requests to your site using a different site, like a forum, blog, email, IM or whatever. Effectively this means your website can become the victim of a leak in another website. Luckily, this method often fails, since the user must be logged in on the targeted site and visit the malicious other site at the same time. How often this happens depends on the type of website. I&#8217;m sure most people don&#8217;t visit many dangerous websites while doing electronic banking, but it&#8217;s not uncommon to do this while logged in on webmail, especially since these kinds of websites often provide features like Single Sign On or long sessions.</p>
<p>But doesn&#8217;t the browser prevent this behavior?!?!?! Nope, as I&#8217;ll indicate in the following section.</p>
<h3>Same Origin Policy</h3>
<p>First implemented in Netscape Navigator 2.0, the Same Origin Policy is one of the main security mechanisms in modern browsers. Basically, it allows scripts that are located on the same domain to interact, while preventing interaction between scripts that are on different domains. By interaction, I mean accessing properties, invoking methods and other scripting mechanisms.</p>
<p>The Same Origin Policy is invoked when one of the following actions happens.</p>
<ul>
<li>Requesting URLs with <a href="http://en.wikipedia.org/wiki/XMLHttpRequest">XMLHttpRequest (XHR)</a></li>
<li>Accessing frames and iframes</li>
<li>Accessing documents</li>
<li>Accessing cookies</li>
<li>Accessing browser windows</li>
</ul>
<p>The list above is not complete, but contains the major actions that are validated against the Same Origin Policy.</p>
<p>This means that for example, it&#8217;s not allowed to invoke cross domain XHR requests. That&#8217;s good. It adds a serious layer of security and in most cases, you won&#8217;t need cross domain XHR requests.</p>
<p>But the Same Origin Policy doesn&#8217;t apply to everything. For example, it&#8217;s still possible to reference images, scripts and style sheets from different domains. It&#8217;s also possible to submit forms across domains.</p>
<p>I&#8217;m talking about domains a lot, so let&#8217;s see what I mean by a domain, using the following listing. It shows how some example URLs relate to the following web page: http://www.domain.com/pages/homepage.html. It also shows whether the Same Origin Policy allows document retrieval, and if not, why.
</p>
<table>
<thead>
<tr>
<th>Other URL</th>
<th>Allowed</th>
<th>Reason</th>
</tr>
</thead>
<tbody>
<tr>
<td>http://www.domain.com/otherDirectory/page.html</td>
<td>Allowed</td>
<td>Different directory is allowed</td>
</tr>
<tr>
<td>https://www.domain.com/page.html</td>
<td>Not allowed</td>
<td>Different protocol</td>
</tr>
<tr>
<td>http://sub.domain.com/page.html</td>
<td>Not allowed</td>
<td>Different subdomain</td>
</tr>
<tr>
<td>http://www.domain.com:8080/page.html</td>
<td>Not allowed</td>
<td>Different port</td>
</tr>
</tbody>
</table>
<p>So a different domain can also mean: different protocol, different port or different subdomain.</p>
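<p>The comparison in the table can be sketched in a few lines of Java. This is only an illustration of the rule (same protocol, same host, same port), not how any browser actually implements it. The class and method names are made up for this example, and the default-port handling assumes only http and https URLs.</p>

```java
import java.net.URI;

final class SameOriginCheck {

    // Fills in the default port, so that http://host and http://host:80 compare as equal.
    // Assumption for this sketch: only http and https URLs are passed in.
    static int effectivePort(URI uri) {
        if (uri.getPort() != -1) {
            return uri.getPort();
        }
        return "https".equalsIgnoreCase(uri.getScheme()) ? 443 : 80;
    }

    // Two URLs share an origin only if protocol, host and port all match.
    // The path (directory and file name) is deliberately ignored.
    static boolean sameOrigin(String a, String b) {
        URI first = URI.create(a);
        URI second = URI.create(b);
        return first.getScheme().equalsIgnoreCase(second.getScheme())
                && first.getHost().equalsIgnoreCase(second.getHost())
                && effectivePort(first) == effectivePort(second);
    }

    public static void main(String[] args) {
        String page = "http://www.domain.com/pages/homepage.html";
        String[] others = {
            "http://www.domain.com/otherDirectory/page.html",
            "https://www.domain.com/page.html",
            "http://sub.domain.com/page.html",
            "http://www.domain.com:8080/page.html"
        };
        for (String other : others) {
            String verdict = sameOrigin(page, other) ? "Allowed" : "Not allowed";
            System.out.println(other + " : " + verdict);
        }
    }
}
```

<p>Running the main method against the four URLs from the table should give the same Allowed/Not allowed verdicts as listed there.</p>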
<p>Also, be warned that there are several ways around the Same Origin Policy. One example is the Flash crossdomain.xml policy file. Flickr was vulnerable to this, as written <a href="http://blog.monstuff.com/archives/000302.html">here</a>.</p>
<p>To summarize, the Same Origin Policy provides quite a lot of safety, but not enough to completely prevent CSRF attacks. It does provide a helping hand, as I will show in the last part.</p>
<p>For a more detailed discussion of the Same Origin Policy, see:<br />
<a href="https://developer.mozilla.org/En/Same_origin_policy_for_JavaScript">https://developer.mozilla.org/En/Same_origin_policy_for_JavaScript</a> and<br />
<a href="http://taossa.com/index.php/2007/02/08/same-origin-policy/">http://taossa.com/index.php/2007/02/08/same-origin-policy/</a> and<br />
<a href="http://taossa.com/index.php/2007/02/17/same-origin-proposal/">http://taossa.com/index.php/2007/02/17/same-origin-proposal/</a>
</p>
<h3>HTTPOnly cookie flag</h3>
<p>As some may know, HTTPOnly is a cookie flag that is respected by most major browsers. Some browsers added HTTPOnly support only recently, like Firefox, which has supported it since version 3. Other browsers, like IE, have supported the flag for ages.</p>
<p>It is included in the Set-Cookie Http response header, as follows:</p>
<pre>
Set-Cookie: &lt;name&gt;=&lt;value&gt;[; Max-Age=&lt;age&gt;]
[; expires=&lt;date&gt;][; domain=&lt;domain_name&gt;]
[; path=&lt;some_path&gt;][; secure][; HTTPOnly]
</pre>
<p>It is appended to the Set-Cookie value in just the same way as the Secure flag, which prevents the cookie from being sent across insecure (non-HTTPS) connections.
</p>
<p>Basically, marking a cookie as HTTPOnly prevents any script from accessing it. An HTTPOnly cookie is only used by the browser itself, which includes it in the requests made to the server. It is not accessible from JavaScript (through the document.cookie property).</p>
<p>For more information, see the <a href="http://www.owasp.org/index.php/HTTPOnly">OWASP HTTPOnly documentation</a>.</p>
<p>Btw. The Java Servlet specification only added HttpOnly support in version 3.0, so it&#8217;s not final at the time of writing!</p>
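<p>As long as the Servlet API has no HttpOnly support, one option is to build the Set-Cookie header value by hand. Below is a minimal sketch of that idea. The class name, method name and cookie values are made up for this example, and it does no escaping or validation of the cookie value.</p>

```java
final class SessionCookieHeader {

    // Builds the value of a Set-Cookie header by hand.
    // The optional attributes and flags are appended after the name=value pair.
    static String build(String name, String value, String path, boolean secure, boolean httpOnly) {
        StringBuilder header = new StringBuilder();
        header.append(name).append('=').append(value);
        if (path != null) {
            header.append("; path=").append(path);
        }
        if (secure) {
            header.append("; secure");
        }
        if (httpOnly) {
            header.append("; HTTPOnly");
        }
        return header.toString();
    }

    public static void main(String[] args) {
        // Prints: Set-Cookie: JSESSIONID=abc123; path=/; secure; HTTPOnly
        System.out.println("Set-Cookie: " + build("JSESSIONID", "abc123", "/", true, true));
    }
}
```

<p>In a servlet, the resulting string could be passed to HttpServletResponse.addHeader("Set-Cookie", ...) instead of using the Cookie class. Whether that plays nicely with the session cookie your container sets on its own depends on the container, so treat this as a starting point only.</p>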
<h3>Fixes</h3>
<p>The solution is actually quite simple. Just use unguessable URLs in your website. You can achieve this by including a random token in the query string or POST data of &#8220;important&#8221; requests. It&#8217;s not necessary to include it in requests that don&#8217;t change any state on the server; only state-modifying requests need to have the token.</p>
<p>Why not include it in simple GET requests? Simple, because they don&#8217;t change anything and the hacker has no way to read the information that has been sent back to the browser, because of the Same Origin Policy. As indicated before, the Same Origin Policy prevents any script from reading content from a remote page. Using XHR is also not possible, since the Same Origin Policy prevents cross domain requests using XHR. So, thanks to the Same Origin Policy, <b>confidentiality</b> is ensured. But to ensure <b>integrity</b>, some steps need to be taken by the developer.</p>
<p>For integrity, you need to implement a mechanism like the following: When you render an HTML form, be sure to include a hidden field with a random integrity token. You&#8217;ll also need to put this token into the session. When the form is submitted, you&#8217;ll need to compare the submitted token with the token in the session. If they are not the same, trigger an error or log the user off. Whatever you do, don&#8217;t let the request continue executing!</p>
<p>Also, don&#8217;t forget to include an integrity token in state-changing GET requests and Ajax calls and check appropriately!</p>
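<p>The token mechanism described above could look roughly like this. It is a minimal sketch, not a complete implementation: a plain Map stands in for the HttpSession, and the class name, method names and the &#8220;csrfToken&#8221; attribute name are all made up for this example.</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

final class CsrfTokenSketch {

    private static final String SESSION_KEY = "csrfToken"; // hypothetical attribute name
    private static final SecureRandom RANDOM = new SecureRandom();
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Generates a random token, stores it in the session and returns it
    // so it can be rendered as a hidden form field.
    static String issueToken(Map<String, Object> session) {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        StringBuilder token = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            token.append(HEX[(b >> 4) & 0x0F]).append(HEX[b & 0x0F]);
        }
        session.put(SESSION_KEY, token.toString());
        return token.toString();
    }

    // Returns true only if the submitted token equals the one stored in the session.
    // A missing token on either side counts as a failed check.
    static boolean isValid(Map<String, Object> session, String submitted) {
        Object expected = session.get(SESSION_KEY);
        if (!(expected instanceof String) || submitted == null) {
            return false;
        }
        return MessageDigest.isEqual(
                ((String) expected).getBytes(StandardCharsets.UTF_8),
                submitted.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        Map<String, Object> session = new HashMap<String, Object>();
        String token = issueToken(session);
        System.out.println("<input type=\"hidden\" name=\"csrfToken\" value=\"" + token + "\" />");
        System.out.println("genuine form accepted: " + isValid(session, token));
        System.out.println("forged request accepted: " + isValid(session, null));
    }
}
```

<p>When isValid returns false, the request should be rejected before any state is changed. The forged request in the example fails because the attacker&#8217;s page has no way to know the token that is stored in the victim&#8217;s session.</p>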
<p>There are a lot of variables in a successful CSRF prevention mechanism. In the next blog, I&#8217;ll give you some guidelines so you can pick the right ones for your website.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.smart-java.nl/blog/index.php/2009/08/31/cross-site-request-forgery-an-introduction/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Client side performance tuning: Minimize HTTP requests</title>
		<link>http://blog.smart-java.nl/blog/index.php/2009/08/23/client-side-performance-tuning-minimize-http-requests/</link>
		<comments>http://blog.smart-java.nl/blog/index.php/2009/08/23/client-side-performance-tuning-minimize-http-requests/#comments</comments>
		<pubDate>Sun, 23 Aug 2009 20:02:28 +0000</pubDate>
		<dc:creator>Jan-Kees van Andel</dc:creator>
				<category><![CDATA[Architectuur]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[html]]></category>
		<category><![CDATA[http]]></category>
		<category><![CDATA[performance]]></category>

		<guid isPermaLink="false">http://blog.smart-java.nl/blog/?p=490</guid>
		<description><![CDATA[For some reason, when talking about web site/application performance, people think about the server side, like optimizing database queries and tuning connection pools. Yeah, of course, it&#8217;s important to tune the server side. But what people often don&#8217;t realize is that there&#8217;s lots to gain on the client side as well.
For the people that haven&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p>For some reason, when talking about web site/application performance, people think about the server side, like optimizing database queries and tuning connection pools. Yeah, of course, it&#8217;s important to tune the server side. But what people often don&#8217;t realize is that there&#8217;s lots to gain on the client side as well.</p>
<p>For the people that haven&#8217;t done so already, download <a href="http://www.mozilla-europe.org/nl/firefox/">Firefox</a> with the <a href="https://addons.mozilla.org/en-US/firefox/addon/1843">Firebug</a> plugin. You can activate Firebug using the F12 key.</p>
<p>Firebug has a really useful &#8220;Net&#8221; tab (which must be activated because it has quite an overhead). You can use this tab for debugging (inspecting headers and stuff), but it also gives you a good indication of your performance.</p>
<p>To give you an example, below is a screenshot of java.sun.com.</p>
<p><a href="http://blog.smart-java.nl/blog/wp-content/uploads/2009/08/firebug_net_java_sun_com.jpg"><img src="http://blog.smart-java.nl/blog/wp-content/uploads/2009/08/firebug_net_java_sun_com_thumb.JPG" alt="firebug_net_java_sun_com_thumb" title="firebug_net_java_sun_com_thumb" width="343" height="512" class="alignnone size-full wp-image-498" /></a></p>
<h3>Analyzing the problem</h3>
<p>What can a developer learn from this output?</p>
<ol>
<li>
First, you can see that the server side part is quite fast. It only takes 220ms to write the entire page to the client (the first bar). They could maybe optimize a bit by flushing earlier, to have better network utilization, but this is probably the consequence of an architectural choice (I assume they&#8217;re using an MVC style approach like Struts, where you would first gather all data and then forward to a JSP page). It wouldn&#8217;t even be a big win, since the server side performance is only a small part of the total performance.
</li>
<li>
Second, we can see a lot of activity is happening after the initial page request. By using the handy Firebug filters, I can easily see that 3 CSS style sheets, 17 JavaScripts and 38 images are fetched. This sums up to a total page load time of almost 4 seconds. In fact, there are so many requests that they don&#8217;t even fit on my screen.
</li>
<li>
A third lesson we can learn from this output is that almost all requests are made sequentially. I&#8217;ve made a simple calculation. The total amount of data downloaded is 137KB and it took around 4 seconds. That&#8217;s around 35KB/s!!! As an indication, I normally download (large files) at about a megabyte per second. Talk about bad network utilization!
</li>
<li>
Fourth, there are some moments where no resources are downloaded at all. This is because downloaded resources have to be parsed, executed, rendered or whatever. This also hurts parallelism. After all, the computations required to do something with a resource don&#8217;t involve network I/O, so it would be nice if the browser started downloading the next file. This would be way more efficient. Fixing this issue is not trivial in most cases, so I&#8217;m not gonna talk about it here. <a href="http://www.stevesouders.com/blog/2009/04/27/loading-scripts-without-blocking/">Steve Souders</a> has blogged about it extensively.
</li>
</ol>
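<p>The calculation from the third point, written out. The 137KB, the 4 seconds and the megabyte per second are the numbers mentioned above. The class and method names are made up, and I take a megabyte to be 1024KB.</p>

```java
final class ThroughputEstimate {

    // Average download speed in KB/s over the whole page load.
    static double kilobytesPerSecond(double totalKilobytes, double seconds) {
        return totalKilobytes / seconds;
    }

    // Fraction of the available bandwidth that was actually used.
    static double utilization(double actualKbPerSecond, double availableKbPerSecond) {
        return actualKbPerSecond / availableKbPerSecond;
    }

    public static void main(String[] args) {
        double actual = kilobytesPerSecond(137, 4);   // 34.25 KB/s
        double available = 1024;                      // assumption: 1 MB/s = 1024 KB/s
        System.out.println("Average speed: " + actual + " KB/s");
        System.out.println("Utilization: " + Math.round(utilization(actual, available) * 100) + "%");
    }
}
```

<p>So only a few percent of the connection is actually used while the page loads.</p>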
<h3>Http connections in your browser</h3>
<p>Before trying to fix the problem (which I&#8217;m not going to do, since it&#8217;s not my website), we need to know some things about how web browsers handle requests.</p>
<p>An important thing to know is that browsers only use a limited number of parallel connections to the same web server. By web server, I mean the same hostname: browsers apply the limit per hostname, not per IP address, so two hostnames count separately even when they resolve to the same IP address. The limit is in accordance with the HTTP 1.1 protocol, which states that a single client should not maintain more than 2 connections to the same server. Below is a table with some popular browsers and their default parallel connection counts, using HTTP 1.1 (when using HTTP 1.0, the numbers may differ).</p>
<pre>
Browser                # of Connections
Internet Explorer &lt;8   2
Internet Explorer &gt;=8  6
Firefox &lt;3             2
Firefox &gt;=3            6
Safari                 4
</pre>
<p>References:<br />
<a href="http://support.microsoft.com/?scid=kb%3Ben-us%3B282402&#038;x=8&#038;y=8">http://support.microsoft.com/?scid=kb%3Ben-us%3B282402&#038;x=8&#038;y=8</a><br />
<a href="http://blogs.zdnet.com/Burnette/?p=565">http://blogs.zdnet.com/Burnette/?p=565</a><br />
<a href="http://www.stevesouders.com/blog/2008/03/20/roundup-on-parallel-connections/">http://www.stevesouders.com/blog/2008/03/20/roundup-on-parallel-connections/</a></p>
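<p>To get a feeling for what these limits mean, here is a very rough back-of-the-envelope model. It assumes that all resources come from the same web server and that every request takes about the same time, which is of course not true in practice. The 58 resources are the 3 style sheets, 17 scripts and 38 images from the java.sun.com example. The class and method names are made up.</p>

```java
final class ConnectionRounds {

    // With a fixed number of parallel connections, the requests have to be
    // spread over at least ceil(requests / connections) sequential rounds.
    static int rounds(int requests, int connections) {
        return (requests + connections - 1) / connections;
    }

    public static void main(String[] args) {
        int resources = 3 + 17 + 38; // style sheets + scripts + images
        int[] limits = {2, 4, 6};
        for (int limit : limits) {
            System.out.println(limit + " connections: at least " + rounds(resources, limit) + " rounds");
        }
    }
}
```

<p>Under these assumptions, a browser with 2 connections needs at least 29 rounds for the 58 resources, against 10 rounds for a browser with 6 connections. Fewer requests means fewer rounds, whatever the browser.</p>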
<p>You can change the settings of your browser. For example, when using Firefox, you can use about:config to increase the number of parallel connections. But since most visitors of your website won&#8217;t change this setting and just use a mainstream browser, you can&#8217;t rely on this setting. On my current project (online ebanking, so the end users are &#8220;normal&#8221; people, not whizkids), we have to support browsers back to IE6. This means we have to perform well in browsers that use only 2 parallel connections per web server. Also, statistics indicate that people who use old browsers often are not on high bandwidth connections (or are on a corporate network).</p>
<p>So&#8230; we had to do some work to enhance user experience.</p>
<p>To wrap up: The issue is simply that too many resources are loaded. It shows up in the chatty waterfall chart.</p>
<h3>The solution</h3>
<p>With a bit of background knowledge, we can start looking at the solution. We need a way to increase the average download speed and so decrease the loading times. As the image shows, downloading many small files is not very efficient, so let&#8217;s start combining them. This saves some overhead, because we have to download fewer files. Also, as a nice side effect, compression techniques like GZIP are more efficient on large files.</p>
<p>But there are issues, because we developers like to create maintainable code in separate small files. And working with large project teams and version control systems is often a mess when you have large files that everyone is editing concurrently.</p>
<p>Luckily, the solution is easy. Just aggregate all files into one big file. On my project, I&#8217;ve used Yahoo&#8217;s YUICompressor which not only aggregates, but also minifies the scripts and stylesheets. <a href="http://developer.yahoo.com/yui/compressor/">http://developer.yahoo.com/yui/compressor/</a></p>
<p>Since we use Apache Maven 2 as our build tool, I&#8217;ve integrated compression into our build using a Maven plugin which invokes YUICompressor. I&#8217;ve used <a href="http://alchim.sourceforge.net/yuicompressor-maven-plugin/">http://alchim.sourceforge.net/yuicompressor-maven-plugin/</a></p>
<p>Using the default settings, the plugin only minifies the files. The minified files get the &#8220;-min&#8221; suffix.</p>
<p>Below is an example of a Maven 2 configuration where the listed files are minified and aggregated into one big &#8220;all.js&#8221; file. The normal files are still present, but I&#8217;ll get into that later.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;project<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;build<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;plugins<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;plugin<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;groupId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>net.sf.alchim<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/groupId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;artifactId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>yuicompressor-maven-plugin<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/artifactId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;executions<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;execution<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
            <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;goals<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;goal<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>compress<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/goal<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
            <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/goals<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/execution<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/executions<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>        
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;nosuffix<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>true<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/nosuffix<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;aggregations<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
            <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;aggregation<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;insertNewLine<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>true<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/insertNewLine<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;output<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/js/all.js<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/output<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;includes<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
                <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/js/jquery.js<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
                <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/js/util.js<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
                <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/js/defs.js<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
                <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/js/ajax.js<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
                <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/js/handlers.js<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
              <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/includes<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
            <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/aggregation<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/aggregations<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/plugin<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/plugins<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/build<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/project<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

<p>As you can see, I&#8217;ve explicitly specified the files to include. You can also use wildcards. I prefer the explicit way because now I&#8217;m 100% sure of the ordering in the final aggregation. With wildcards, on the other hand, a simple refactoring (like renaming a file) could silently break the application at runtime. By being explicit, I get an error telling me that a file is missing.</p>
<p>CSS can also be aggregated and minified. And I must say, YUICompressor is really impressive. For example, it doesn&#8217;t just remove all comments, but determines per occurrence if it needs to be removed. For example, if you have a Safari Commented Backslash Hack v2 (<a href="http://perishablepress.com/press/2006/08/27/css-hack-dumpster/">http://perishablepress.com/press/2006/08/27/css-hack-dumpster/</a>), it will not be removed, since that would break your CSS.</p>
<p>Below is an example where both the JS and CSS files are aggregated into two files: &#8220;all.js&#8221; and &#8220;all.css&#8221;.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;plugin<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;groupId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>net.sf.alchim<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/groupId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;artifactId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>yuicompressor-maven-plugin<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/artifactId<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;executions<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;execution<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;goals<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;goal<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>compress<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/goal<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/goals<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/execution<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/executions<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;nosuffix<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>true<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/nosuffix<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;aggregations<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;aggregation<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;insertNewLine<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>true<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/insertNewLine<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;output<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/js/all.js<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/output<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;includes<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/js/jquery.js<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/js/util.js<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/js/defs.js<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/js/ajax.js<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/js/handlers.js<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/includes<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/aggregation<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;aggregation<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;insertNewLine<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>true<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/insertNewLine<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;output<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/css/all.css<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/output<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;includes<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/css/global.css<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/css/main.css<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/css/buttons.css<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/css/menu.css<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
          <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>${project.build.directory}/${project.build.finalName}/css/components.css<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/include<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
        <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/includes<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/aggregation<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/aggregations<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/plugin<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

<h3>What about images?</h3>
<p>Images can be aggregated too, but this is much harder than with scripts. You&#8217;ll have to use image sprites. Creating the sprite is the easy part (you can do this with online tools if you want). The difficulties start afterwards, since you need some CSS tricks to &#8220;select&#8221; an image from the sprite using offsets. This is a real pixel-pain-in-the-ass, and you&#8217;ll also enter the realm of cross browser issues here.</p>
<p>This page is a good starting point, but beware, depending on your situation, things may become difficult.<br />
See: <a href="http://www.alistapart.com/articles/sprites">http://www.alistapart.com/articles/sprites</a>.</p>
<p>On the other hand, you can of course begin small. For example, creating several sprites for buttons (with hovers and sliding doors you can turn 4 images into 1), boxes and logos. Every improvement is welcome.</p>
<h3>Conclusion</h3>
<p>Aggregation can greatly improve the performance of your website. On my project, I&#8217;ve decreased load times (with an empty cache) from 7 seconds to less than 3 seconds, using only aggregation and script/CSS compression. Using GZIP, caching and some other tweaks, I&#8217;m currently even lower. I&#8217;m still planning to implement the image sprites.</p>
<p>The moral of this blog is that there is more to an application than the server side. Remember, it&#8217;s the end user who experiences your application. End users really hate hiccups, long load times and web pages that build up in strange ways. They will be annoyed and maybe even lose their trust in the integrity of the application. And when this happens, you&#8217;re in deep shit.</p>
<p>Also, remember that performance tuning differs from other disciplines, like security. With security, there are no (or at least few) compromises. With performance, on the other hand, you can (and should) be happy with every improvement, as the only valid performance measurement is the happiness of the end user.</p>
<h3>Notes</h3>
<ol>
<li>I didn&#8217;t remove the original files, since I want to be able to switch at runtime between the aggregated files and the originals. This greatly simplifies debugging, especially in production.</li>
<li>Check performance in all browsers. As indicated, browsers have different default values for several settings and different characteristics.</li>
<li>Client side performance is often easy to test. You don&#8217;t need heavy load, like you would when doing server side performance testing. You can start with your development machine and a simple testing server is often good enough for a more thorough test.</li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://blog.smart-java.nl/blog/index.php/2009/08/23/client-side-performance-tuning-minimize-http-requests/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Retrofit your webapp with generic XSS protection</title>
		<link>http://blog.smart-java.nl/blog/index.php/2009/06/25/retrofit-your-webapp-with-generic-xss-protection/</link>
		<comments>http://blog.smart-java.nl/blog/index.php/2009/06/25/retrofit-your-webapp-with-generic-xss-protection/#comments</comments>
		<pubDate>Thu, 25 Jun 2009 21:00:44 +0000</pubDate>
		<dc:creator>Jan-Kees van Andel</dc:creator>
				<category><![CDATA[Architectuur]]></category>
		<category><![CDATA[Security]]></category>
		<category><![CDATA[Web]]></category>
		<category><![CDATA[cross site scripting]]></category>

		<guid isPermaLink="false">http://blog.smart-java.nl/blog/?p=465</guid>
		<description><![CDATA[On my current project (online e-banking application for a medium/large scale bank), we needed to add Cross Site Scripting (XSS) protection afterwards. Well, actually, we had a working XSS protection mechanism in place, but the security auditors pointed out the implementation had flaws. But we&#8217;ll get to that later.
Cross Site Scripting?
First, let me explain what [...]]]></description>
			<content:encoded><![CDATA[<p>On my current project (online e-banking application for a medium/large scale bank), we needed to add <a href="http://www.owasp.org/index.php/Top_10_2007-Cross_Site_Scripting">Cross Site Scripting (XSS)</a> protection afterwards. Well, actually, we had a working XSS protection mechanism in place, but the security auditors pointed out the implementation had flaws. But we&#8217;ll get to that later.</p>
<h3>Cross Site Scripting?</h3>
<p>First, let me explain what Cross Site Scripting (XSS) is. XSS is a hack where malicious code (JavaScript/HTML) is injected into a trusted site. By trusted, I mean that the user trusts the site, which is usually the case with e-banking websites. At least, I hope so. <img src='http://blog.smart-java.nl/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>A simple example is a web site which contains a search box, like the following.</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">&lt;form action=&quot;/search.do&quot; method=&quot;GET&quot;&gt;
  &lt;input type=&quot;text&quot; name=&quot;searchString&quot; value=&quot;Enter something here&quot; /&gt;
  &lt;input type=&quot;submit&quot; value=&quot;Search&quot; /&gt;
&lt;/form&gt;</pre></div></div>

<p>When the user submits the form, a request is made to the following URL: /search.do?searchString=[THE_USER_INPUT]</p>
<p>The application responds with a page, containing the results, including a summary of the search criteria, and also the search form, prefilled with the search string the user entered, like here.</p>

<div class="wp_syntax"><div class="code"><pre class="jsp" style="font-family:monospace;">&lt;!-- Some HTML --&gt;
&nbsp;
&lt;form action=&quot;search.jsp&quot; method=&quot;GET&quot;&gt;
  &lt;input type=&quot;text&quot; name=&quot;searchString&quot; value=&quot;${param.searchString}&quot; /&gt;
  &lt;input type=&quot;submit&quot; value=&quot;Search&quot; /&gt;
&lt;/form&gt;
&nbsp;
&lt;table id=&quot;results&quot;&gt;
  &lt;!-- The search results --&gt;
&lt;/table&gt;
&nbsp;
&lt;!-- The rest of the site --&gt;</pre></div></div>

<p>So far, so good, right?</p>
<p><b>No, wrong!!!</b></p>
<p>Suppose a hacker enters the following value into the searchString box: </p>
<pre>"/&gt;&lt;script&gt;alert("hello")&lt;/script&gt;&lt;p id="</pre>
<p>You&#8217;ll see an alert box with the message &#8220;hello&#8221;. You have injected a script into your web page.</p>
<p>But why would anyone hack his own browser session? That&#8217;s not the danger here. The danger is that a hacker can create a URL like the following and send it to other people:</p>
<pre>http://www.yoursite.com/search.do?searchString="/&gt;&lt;script&gt;alert("hello")&lt;/script&gt;&lt;p id="</pre>
<p>When they click on it, the script gets executed in their browser.</p>
<p>Of course, an alert window is harmless, but what if I inject some DOM scripting to create a form where the user must enter his or her credentials for the attacked site? Or an image to automatically trigger a &#8220;Button click&#8221; (this is actually a case of <a href="http://www.owasp.org/index.php/Top_10_2007-Cross_Site_Request_Forgery">Cross Site Request Forgery</a>)? With such a hole, I can do almost anything on your page, without you knowing. <img src='http://blog.smart-java.nl/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><b>Let&#8217;s see what happens when the login page of your bank is vulnerable to XSS. Phishing becomes as easy as stepping on kittens!</b></p>
<p>As you&#8217;re probably aware by now, XSS is a great &#8220;enabler&#8221; for all kinds of other hacks. And the fun part is, hackers can spot XSS flaws easily, often using <a href="http://www.owasp.org/index.php/Category:OWASP_WebScarab_Project">automatic tools</a>. Not scared yet? In that case, I hope you&#8217;re not working on the website where I perform my electronic payments! <img src='http://blog.smart-java.nl/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  You <b>should</b> be scared of this kind of vulnerability.</p>
<h3>Countermeasures</h3>
<p>In theory, it is very easy to protect your website against XSS. Just escape all variables, according to the escaping rules that apply to the markup/script language you are about to put the variables in. <a href="http://commons.apache.org/lang/">Apache Commons Lang</a> has a utility for this purpose: the class <a href="http://commons.apache.org/lang/api-2.3/org/apache/commons/lang/StringEscapeUtils.html">StringEscapeUtils</a> with its method <a href="http://commons.apache.org/lang/api-2.3/org/apache/commons/lang/StringEscapeUtils.html#escapeHtml(java.lang.String)">escapeHtml</a> and some others. escapeHtml is probably the method you&#8217;ll need most, since most variables are output inside HTML tags.</p>
<p>Why is this fix enough? Well, just test it out with the example above. We modify the JSP to escape the input before rendering it to the client, as shown below.</p>

<div class="wp_syntax"><div class="code"><pre class="jsp" style="font-family:monospace;">&lt;%
  String param = request.getParameter(&quot;searchString&quot;);
  if (param == null) param = &quot;&quot;;
  String escaped = org.apache.commons.lang.StringEscapeUtils.escapeHtml(param);
%&gt;
&lt;form action=&quot;search.jsp&quot; method=&quot;GET&quot;&gt;
  &lt;input type=&quot;text&quot; name=&quot;searchString&quot; value=&quot;&lt;%=escaped%&gt;&quot;/&gt;
  &lt;input type=&quot;submit&quot; value=&quot;Search&quot;/&gt;
&lt;/form&gt;</pre></div></div>

<p>The difference is that I added an escaping routine here. Nothing fancy, except that the &#8220;obvious&#8221; happens. The user input is shown on the screen in exactly the same way the user has entered it.</p>
<p>You see, no validation or black-/whitelists are needed, which is great, since you don&#8217;t want to bother users with security.</p>
<p>There are some caveats, which become quite annoying later on.</p>
<ul>
<li>Timing is essential with regard to escaping. You cannot do it on your input parameters, since you&#8217;ll then end up with HTML-encoded text in your backend systems (that was the flaw I was talking about in the introduction). You also don&#8217;t want to be bothered with HTML encoding in your Java business logic, so input filtering is not the way to go. You definitely need to filter on the output, just before you render your page back to the client.</li>
<li>You must not escape a variable twice, since the user then ends up with HTML codes in his/her UI. When escaping &gt; twice, it first becomes &amp;gt; and after the second pass, it becomes &amp;amp;gt;, which is not only wrong with regard to usability, it also opens up some interesting new security holes.</li>
<li>When you persist user input unencoded, which you should, you also need to escape all variables that come from there before rendering the web page. Essentially, you just need to escape every variable that is inserted into the web page, since it may (directly or indirectly) be manipulated by a malicious user.</li>
</ul>
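<p>To make the double escaping caveat concrete, here is a minimal sketch. The class EscapeDemo and its hand-rolled escapeHtml are assumptions for illustration only: a simplified stand-in for StringEscapeUtils.escapeHtml that covers just four characters, so the example runs without any library.</p>

```java
// Hypothetical demo class, not part of the project described in this post.
final class EscapeDemo {

    // Simplified stand-in for StringEscapeUtils.escapeHtml: it only handles
    // the four characters needed here, the real method knows many more entities.
    static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String once = escapeHtml(">");
        String twice = escapeHtml(once); // the ampersand of the first pass gets escaped again
        System.out.println(once);   // prints &gt;
        System.out.println(twice);  // prints &amp;gt;
    }
}
```

<p>The browser renders the first result as the character the user typed, but the second one as the literal entity text, which is exactly the garbage a user sees after a double pass.</p>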
<h3>Implementation</h3>
<p>So, we need a way to escape all user input just before it is rendered. What are the options?</p>
<ul>
<li>
    <b>Custom JSP function</b><br />
    Probably the easiest way to escape special characters is by implementing a JSP custom function, like the JSF &#8220;<a href="http://java.sun.com/javaee/javaserverfaces/1.1_01/docs/tlddocs/h/outputText.html">outputText</a>&#8221; component: <br/>
<pre>&lt;h:outputText escape="true" value="&lt;script&gt;" /&gt;</pre>
<p>This one is easy to implement. It&#8217;s just a burden to edit all 200 JSP files (and the same amount of JSP Tag files) in our project and add the function at all places where a variable is used. The second issue is that you also need to be sure you don&#8217;t escape a variable twice, since that results in strange output and, ironically, new security holes. This is especially the case with tag attributes. Do you escape in the tag or in the calling code? You&#8217;ll need a strategy there. Finally, and probably most importantly, this solution will probably become a maintenance nightmare. JSPs are cluttered with these custom functions and future maintainers must never forget one when making changes. Something more generic would be nice.
  </li>
<li>
    <b>ELResolver?</b><br />
    <a href="http://java.sun.com/javaee/5/docs/api/javax/el/ELResolver.html">ELResolvers</a> were added to JSP in version 2.1. We use Tomcat 6, so that&#8217;s great. However, the ELResolver mechanism differs greatly from, for example, the JSF <a href="http://kazed.blogspot.com/2007/10/create-your-own-jsf-variable-resolver.html">VariableResolver</a> mechanism. VariableResolvers are implemented using <a href="http://en.wikipedia.org/wiki/Decorator_pattern">decoration</a>, which makes them very useful to intercept all variable access and modify the responses. ELResolvers are implemented as a <a href="http://en.wikipedia.org/wiki/Chain-of-responsibility_pattern">Chain of Responsibility</a> and the implementation makes it impossible to create a custom filtering mechanism for all variables. It is also not the right place for this kind of logic. Why? Imagine a JSP Tag file. In the calling JSP file, you use EL to pass an argument to the tag and in the JSP Tag file, you use EL to put the argument in the HTML. With an ELResolver solution, variables get escaped twice. Not good.
  </li>
<li>
    <b>Tomcat hooks, AOP, Javassist?</b><br />
    There are other ways to hook into the JSP/Servlet lifecycle. Containers provide vendor specific hooks for this, but unfortunately, Tomcat has no hooks that help us here. But we&#8217;re not defeated yet! We can use <a href="http://en.wikipedia.org/wiki/Aspect-oriented_programming">AOP</a> or a <a href="http://en.wikipedia.org/wiki/Javassist">bytecode library</a> to enhance some core Tomcat classes. Of course, we&#8217;re entering the black magic room here, but if it works&#8230; Unfortunately, these are no real options either. We can for example intercept all calls that go into the EL runtime by proxying all EL calls, but this has almost all of the same issues as a custom ELResolver. Darn!
  </li>
<li>
    <b>Servlet Filter?</b><br />
    But what&#8217;s wrong with a plain old <a href="http://java.sun.com/products/servlet/Filters.html">Servlet Filter</a> which escapes HTML special characters afterwards? The issue with a Servlet Filter is that it can&#8217;t possibly see the difference between legitimate and injected markup. The Filter either escapes all markup (including your own) or nothing (which renders it useless). After the JSP has been executed, you&#8217;ll end up with HTML. If there is an XSS attack, you are simply too late when you use a Servlet Filter. So, it has to do with timing. Let&#8217;s step back.
  </li>
<li>
    <b>Object graph XSS filtering!</b><br />
    There are two issues with most solutions presented.</p>
<ul>
<li>Timing: The solution is too early or too late, making it useless or wrong.</li>
<li>Number of executions: The solution has the risk of escaping the variable more than once, making it wrong and unsafe.</li>
</ul>
<p>
    So, what&#8217;s the best point in the request processing lifecycle to do the escaping? It turns out that just before the JSP is executed is the ideal moment. It&#8217;s ideal, because you&#8217;re not too early, so you won&#8217;t notice anything in your Java code. It&#8217;s also ideal, since you have full control over execution, making it easier to execute once and only once.<br/>
  </li>
</ul>
<h3>Object graph XSS filtering</h3>
<p>The object graph walker sounds like a solution. However, it&#8217;s a difficult one to implement.</p>
<p>The reason is the flexibility of our MVC framework, which allows arbitrary object graphs to be passed to the JSP. This is an issue, since you can&#8217;t just iterate over all attributes and escape them. For example, some attributes may be collections, maps or domain objects, containing possibly unsafe strings.</p>
<p>So we need an object walker. What features should it have?</p>
<ul>
<li>Cycle detection, to prevent infinite loops.</li>
<li>Ability to modify arbitrary objects, either using reflection (POJOs) or their API (Collections).</li>
</ul>
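<p>The two features above can be sketched roughly as follows. This is not the attached ObjectWalker code, just a minimal illustration under some assumptions: the names ObjectGraphEscaper and Account are hypothetical, the escaping routine is passed in as a function, and only Lists, Map values and non-final POJO fields are handled (arrays, Sets, Map keys and unmodifiable collections are left out).</p>

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.ListIterator;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

// Hypothetical sketch of an object graph walker that escapes every String once.
final class ObjectGraphEscaper {
    private final UnaryOperator<String> escaper;
    // Identity-based set of visited objects: cycle detection must not depend on equals().
    private final Set<Object> visited = Collections.newSetFromMap(new IdentityHashMap<>());

    ObjectGraphEscaper(UnaryOperator<String> escaper) {
        this.escaper = escaper;
    }

    // Strings are immutable, so a String is returned as an escaped copy and the
    // caller stores it back; every other object is walked and returned as is.
    @SuppressWarnings("unchecked")
    Object escape(Object node) {
        if (node == null) return null;
        if (node instanceof String) return escaper.apply((String) node);
        if (!visited.add(node)) return node; // seen before: stop, this breaks cycles
        if (node instanceof List) {
            ListIterator<Object> it = ((List<Object>) node).listIterator();
            while (it.hasNext()) it.set(escape(it.next())); // modify through the API
        } else if (node instanceof Map) {
            for (Map.Entry<Object, Object> e : ((Map<Object, Object>) node).entrySet()) {
                e.setValue(escape(e.getValue()));
            }
        } else {
            escapeFields(node); // POJO: modify through reflection
        }
        return node;
    }

    private void escapeFields(Object pojo) {
        // Stop at JDK classes, their internals are not ours to touch.
        for (Class<?> c = pojo.getClass(); c != null && !c.getName().startsWith("java."); c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                int m = f.getModifiers();
                if (Modifier.isStatic(m) || Modifier.isFinal(m) || f.getType().isPrimitive()) continue;
                try {
                    f.setAccessible(true);
                    f.set(pojo, escape(f.get(pojo)));
                } catch (IllegalAccessException ex) {
                    throw new IllegalStateException("Cannot escape field " + f, ex);
                }
            }
        }
    }
}

// Hypothetical domain object, only here to have something to walk.
class Account {
    String owner;
    List<String> notes;
    Account partner; // may point back to this object, forming a cycle
}
```

<p>You would create one ObjectGraphEscaper per request and call escape on each object you hand to the JSP. Because containers that were already visited are skipped, a shared collection is not escaped a second time.</p>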
<h3>Implementation sources</h3>
<p>A hack and slash implementation of this solution is attached. What do you guys think of it? Does it look like a workable solution?</p>
<p>P.S. I&#8217;m aware of the fact that I haven&#8217;t implemented any pattern or followed any best practice. It&#8217;s just a simple PoC.</p>
<p><a href='http://blog.smart-java.nl/blog/wp-content/uploads/2009/06/objectwalker.zip'>Click here to download the ObjectWalker sources</a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.smart-java.nl/blog/index.php/2009/06/25/retrofit-your-webapp-with-generic-xss-protection/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Can software developers build usable GUI&#8217;s?</title>
		<link>http://blog.smart-java.nl/blog/index.php/2009/04/17/can-software-developers-build-usable-guis/</link>
		<comments>http://blog.smart-java.nl/blog/index.php/2009/04/17/can-software-developers-build-usable-guis/#comments</comments>
		<pubDate>Fri, 17 Apr 2009 14:11:16 +0000</pubDate>
		<dc:creator>Vincent Hartsteen</dc:creator>
				<category><![CDATA[Algemeen]]></category>
		<category><![CDATA[Architectuur]]></category>
		<category><![CDATA[Boeken]]></category>
		<category><![CDATA[Web]]></category>

		<guid isPermaLink="false">http://blog.smart-java.nl/blog/?p=368</guid>
		<description><![CDATA[Most developers would answer this question the same way that Bob the Builder would when asked if he can fix it. &#8220;Yes I can&#8221; would be the reply. And I&#8217;m sure that Bob can. Why? Because Bob does his building based on a blueprint that tells him exactly what to build and what to use [...]]]></description>
			<content:encoded><![CDATA[<p>Most developers would answer this question the same way that Bob the Builder would when asked if he can fix it. &#8220;Yes I can&#8221; would be the reply. And I&#8217;m sure that Bob can. Why? Because Bob does his building based on a blueprint that tells him exactly what to build and what to use for building it.</p>
<p>In construction the architect is responsible for producing the blueprint. The architect talks to the customer in order to get a list of requirements for the building. For instance, will the building be used as office space or as a family home? Should it be a single-story or multi-story building, and how many rooms should it have? And of course, how much money is the customer willing to spend? Armed with the list of requirements, the architect uses his design skills and his technical knowledge to make a first sketch of the building. The architect discusses this sketch with his team of electrical, mechanical and structural engineers to find out if it is feasible from a technical point of view. For instance, is the construction mechanically strong enough, are the selected materials usable, etcetera. This should all result in a blueprint for a building that meets the customer&#8217;s requirements, is usable with respect to the needs of the customer and is aesthetically pleasing. Finally the construction workers take the blueprint and start building.</p>
<p>In software development a similar process takes place. In short, the information analyst and the customer discuss the requirements for the system. What functionality should the system offer, what is the expected load, etcetera. The information analyst works in close cooperation with the software architect. The software architect is responsible for making the technical decisions (which frameworks to use, what platform, how many nodes, etc.) such that the non-functional requirements (scalability, extensibility, manageability, etc.) can be met. The UML artifacts created by the information analyst and the software architect (e.g. use-case model, software architecture document, use-case realizations, etc.) are input for the developers.</p>
<p>One major difference between the blueprint for the construction workers and the UML artifacts for the developers is that the blueprint has usability embedded. The architect has knowledge about how to make a construction usable to its users. When there is a requirement that a house must have 2 bathrooms, the architect knows that to make them usable he could place one on the ground floor and the other on the first floor. He could have fulfilled the requirement by putting both bathrooms in the attic, but that would not have made them very usable. Very often a developer gets a requirement which basically comes down to: &#8220;the application must have a web-interface or a GUI and it should be user-friendly&#8221;. That&#8217;s it. So now it is up to the developers to figure out what &#8220;user-friendly&#8221; is.</p>
<p>Most developers have a technical mindset and find it hard to step into the role of the end user. It is the end user who decides if the application is user-friendly, and he will do so when the application lets him do his job without problems. So in order to design a user-friendly interface it must be clear what the application is used for and how it is used. Very often user-friendliness is misconceived as &#8220;protect the end user from making mistakes&#8221;. If end users only use the application every now and then, it can be valid to pop up a confirmation dialog. Expert users who use the application on a regular basis get annoyed by such dialogs.</p>
<p>There are currently lots of tools/frameworks that allow developers to build good-looking user interfaces: JavaFX, Flex, SAF, JSF, Wicket, and the list goes on. But good-looking is not the same as user-friendly. With the tooling mentioned before we could build a very nice 3D, animated, gradient color dialog box that is very annoying (remember the Office paperclip?). So basically we have the tools to create cool and good-looking user interfaces, but most of us lack the knowledge of how to build usable ones.</p>
<p>My point is that there should be a role in the software development team, just like an information analyst or a software architect, that is responsible for designing the interaction with the end user. Just like the architect in construction. I haven&#8217;t come across a person with such a role in the many projects I&#8217;ve worked on over the past years. Until that changes, it is up to us developers to design user interfaces, and for that we need to broaden our horizon and try to put ourselves in the end user&#8217;s seat.</p>
<p>A few years ago I read &#8220;About Face: The Essentials of User Interface Design&#8221; (first edition) by Alan Cooper, which got me interested in this subject (the 3rd edition is the most recent version). The book describes the problems that occur when developers start designing user interfaces and gives advice on how to improve on that, all illustrated with entertaining examples. If you get into the situation where you as a developer are responsible for the design of the user interface, this book might help you do a better job.</p>
<p>Vincent Hartsteen</p>
<p>N.B.: Currently I&#8217;m reading &#8220;The Inmates Are Running the Asylum&#8221;, also by Alan Cooper. This book describes the role of &#8220;interaction designers&#8221;. I&#8217;ll write up my findings when I&#8217;ve finished the book.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.smart-java.nl/blog/index.php/2009/04/17/can-software-developers-build-usable-guis/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Cyclic dependencies @ runtime</title>
		<link>http://blog.smart-java.nl/blog/index.php/2009/04/16/cyclische-dependencies-runtime/</link>
		<comments>http://blog.smart-java.nl/blog/index.php/2009/04/16/cyclische-dependencies-runtime/#comments</comments>
		<pubDate>Thu, 16 Apr 2009 11:01:40 +0000</pubDate>
		<dc:creator>Frank Verbruggen</dc:creator>
				<category><![CDATA[Architectuur]]></category>
		<category><![CDATA[Spring]]></category>
		<category><![CDATA[patterns]]></category>
		<category><![CDATA[spring]]></category>

		<guid isPermaLink="false">http://blog.smart-java.nl/blog/?p=353</guid>
		<description><![CDATA[In a layered project, there are usually separate layers defined for services, domain logic and data access. Now suppose a lower layer needs to use functionality that belongs in a higher layer, for example because the domain objects need to use [...]]]></description>
			<content:encoded><![CDATA[<p>In a layered project, there are usually separate layers defined for services, domain logic and data access. Now suppose a lower layer needs to use functionality that belongs in a higher layer, for example because the domain objects need to use a service. How do you keep your code cleanly object-oriented without introducing compile-time dependencies between your projects / layers? The design pattern presented here is an elegant object-oriented solution to this problem.</p>
<p>Take a DomeinObject as an object from the lower layer and a ServiceImplementation as an object from the higher layer. The ServiceImplementation has an interface method &#8216;someMethod()&#8217; from the ServiceInterface, which is needed to implement the method &#8216;businessMethod()&#8217; of the DomeinObject (see Figure 1, original class diagram).</p>
<p>N.B. The design pattern is worked out here with domain objects and service implementations, but that is just one way of applying it.</p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-<br />
<img src="http://blog.smart-java.nl/blog/wp-content/uploads/2009/04/originele-situatie.jpg" alt="" /><br />
&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-<br />
Figure 1, original class diagram<br />
&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-</p>
<p>Solve this as follows. Move the ServiceInterface out of the higher layer and into the lower layer. Create a RequiredInterfaceFactory that can hold an instance of the ServiceInterface. When the application starts, initialize the RequiredInterfaceFactory from the top layer with the ServiceImplementation (the Spring configuration in Listing 1, Spring configuration, is a good way to do this). Then implement businessMethod by abstracting the call to the ServiceImplementation as follows:</p>

<div class="wp_syntax"><div class="code"><pre class="java5" style="font-family:monospace;">RequiredInterfaceFactory.<span style="color: #006633;">getInstance</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">getServiceInterface</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span>.<span style="color: #006633;">someMethod</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></div></div>

<p>The class diagram then looks as shown in Figure 2, class diagram.</p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">&nbsp;</pre></div></div>

<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-<br />
Listing 1, Spring configuration<br />
&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-</p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-<br />
<img src="http://blog.smart-java.nl/blog/wp-content/uploads/2009/04/design-pattern.jpg" alt="" /><br />
&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-<br />
Figure 2, class diagram<br />
&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;-</p>
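<p>The pattern described above could be sketched in plain Java roughly as follows. This is a minimal illustration, not the original project code: the <tt>String</tt> return type of someMethod(), the setter name setServiceInterface, the IllegalStateException guard and the CyclicDependencyDemo class are all assumptions made for the example. In a real project the first three types would live in the lower layer and ServiceImplementation in the higher layer.</p>

```java
// Entry point, placed first so the file can be launched directly with `java`.
// It plays the role of the start-up code in the top layer (hypothetical class).
class CyclicDependencyDemo {
    public static void main(String[] args) {
        // Wire the implementation into the factory before anything else runs.
        RequiredInterfaceFactory.getInstance().setServiceInterface(new ServiceImplementation());
        System.out.println(new DomeinObject().businessMethod());
    }
}

// Lower layer: the interface has been moved down next to the domain object.
interface ServiceInterface {
    String someMethod(); // return type is an assumption for this sketch
}

// Lower layer: singleton that holds whatever implementation was supplied at start-up.
final class RequiredInterfaceFactory {
    private static final RequiredInterfaceFactory INSTANCE = new RequiredInterfaceFactory();
    private volatile ServiceInterface serviceInterface;

    private RequiredInterfaceFactory() {
    }

    static RequiredInterfaceFactory getInstance() {
        return INSTANCE;
    }

    // Called once from the top layer (by hand here, or by a DI container).
    void setServiceInterface(ServiceInterface serviceInterface) {
        this.serviceInterface = serviceInterface;
    }

    ServiceInterface getServiceInterface() {
        ServiceInterface current = serviceInterface;
        if (current == null) {
            // Makes the pattern's one drawback visible: the factory must be initialized first.
            throw new IllegalStateException("RequiredInterfaceFactory has not been initialized");
        }
        return current;
    }
}

// Lower layer: depends only on the interface and the factory, never on the implementation.
class DomeinObject {
    String businessMethod() {
        String fromService = RequiredInterfaceFactory.getInstance().getServiceInterface().someMethod();
        return "business(" + fromService + ")";
    }
}

// Higher layer: the only class that knows about both layers at compile time.
class ServiceImplementation implements ServiceInterface {
    @Override
    public String someMethod() {
        return "service result";
    }
}
```

<p>Note that DomeinObject compiles without ServiceImplementation on its classpath; only the start-up code needs to see both.</p>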
<p><strong>Advantages and disadvantages</strong></p>
<p>Using this design pattern has the following advantages:</p>
<ul>
<li>The lower layer needs no compile-time dependency on the higher layer</li>
<li>At runtime, interface methods of classes in the higher layer can still be used</li>
<li>Every object can use the interface regardless of the package it is in, so the code always retrieves the interface in the same way</li>
</ul>
<p>Using this design pattern has the following disadvantage:</p>
<ul>
<li>You must inject the ServiceImplementation into the RequiredInterfaceFactory before the rest of the application is used</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.smart-java.nl/blog/index.php/2009/04/16/cyclische-dependencies-runtime/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>How agile is architecture?</title>
		<link>http://blog.smart-java.nl/blog/index.php/2009/03/20/hoe-agile-is-architectuur/</link>
		<comments>http://blog.smart-java.nl/blog/index.php/2009/03/20/hoe-agile-is-architectuur/#comments</comments>
		<pubDate>Fri, 20 Mar 2009 14:33:59 +0000</pubDate>
		<dc:creator>Eric Jan Malotaux</dc:creator>
				<category><![CDATA[Algemeen]]></category>
		<category><![CDATA[Architectuur]]></category>
		<category><![CDATA[agile]]></category>

		<guid isPermaLink="false">http://blog.smart-java.nl/blog/?p=337</guid>
		<description><![CDATA[Ever since software has been developed, one of the biggest challenges has been making sure you build exactly what is needed. How do you keep the distance between customer and developer small while all kinds of forces are at work to increase it? Almost every method or programming language promises a solution to this problem.
Take [...]]]></description>
			<content:encoded><![CDATA[<p>Ever since software has been developed, one of the biggest challenges has been making sure you build exactly what is needed. How do you keep the distance between customer and developer small while all kinds of forces are at work to increase it? Almost every method or programming language promises a solution to this problem.</p>
<p>Take COBOL, one of the oldest programming languages: the name stands for COmmon Business Oriented Language. The idea was that business people could program in COBOL themselves. That never worked out, of course. COBOL is widely regarded as a difficult programming language, and COBOL programmers, although a dying breed, are still needed to maintain the enormous amount of COBOL code that remains in use.</p>
<p>SQL, same story. In SQL you tell the computer, or rather the database, what you want done with the data, but not how. By now SQL is considered too difficult even for average programmers, and Object-Relational Mappers exist to shield them from the &#8220;complexity&#8221; of SQL.</p>
<p>Next candidate: UML. We draw diagrams, pictures that users can understand or make themselves, add a code generator that translates the pictures into programs, and the user is back &#8220;in control&#8221;. This approach, although fairly recent, has fallen out of favour as well. See my earlier post &#8220;<a title="Wat is er nieuw aan Model-Driven Development?" href="http://blog.smart-java.nl/blog/index.php/2009/02/13/whats-new-about-model-driven-development/" target="_blank">What&#8217;s new about Model-Driven Development?</a>&#8221;. Using DSLs instead of UML helps a little, because a DSL is simply simpler. But the expectation some people have, that the domain-specific nature of a DSL means a domain expert can program with it, can safely be called naive given the earlier examples. If only because the domain of a DSL is almost never the domain of the user, but that of the programmer.</p>
<p>So new programming languages, or technical tools in general, apparently do not help to narrow the gap between user and programmer. Let&#8217;s take a completely different approach. We need architects who help the user map out the processes of his organisation and align the automation precisely with them. And we introduce a Service Oriented Architecture (SOA), so that all applications can easily communicate with each other through services, plus a process engine with which we can support the processes very directly by calling the services in the right order. Only, SOA experts never tire of stressing that a SOA is not easy to introduce in an organisation, and that the intended benefit (rapid adaptation of the automation to the changing demands of the business) will only be achieved after years. At least, if the introduction is done the right way and the SOA is properly embedded in the organisation. Governance is the magic word here.</p>
<p>It all sounds wonderful, but I don&#8217;t believe a word of it. First, how is that rapid adaptation to changing requirements if you first have to invest for years? Second, governance makes me think of &#8220;programming by committee&#8221;. Is that supposed to go faster than programming by one or two programmers? Third, the architects now sit between the user and the developers, so the distance has actually grown. With all due respect to the good ones, many of these architects have little affinity with the practical problems programmers face when implementing a SOA. I almost always hear architects complain that programmers do not stick to their guiding frameworks. I wonder why that is. Equally striking is that the organisations that maintain a separate architecture department are usually also the ones where the whole software development process has come to a virtual standstill.</p>
<p>Is there no hope at all, then? Yes, there is. Projects that work in a radically agile way, following eXtreme Programming, Scrum, Evo or whatever they are called, with an on-site customer, also achieve radically higher productivity and build what the customer wants. Besides the stories sounding convincing, I have experienced it myself. I liked it so much that I don&#8217;t really want to work any other way any more, and perhaps no longer can.</p>
<p>Despite the good results there is still a lot of resistance to agile methods. One frequently heard objection is that large systems cannot do without architecture, and that an agile way of working leads to chaotic, unmanageable systems. The first is true: architecture is needed. The second is not: agile does not necessarily lead to unmanageable chaos. But agile is no substitute for craftsmanship. Only the way the architecture comes about is completely different. Architecture is a by-product of software development. It emerges together with the software itself, not beforehand. Only then is there hope that the architecture fits the application like a tailored suit.</p>
<p>Systems are so complicated nowadays that they cannot do without architecture. That same complexity is also the reason the architecture can no longer be thought out entirely in advance. If that is attempted anyway, it leads to rigid systems that are hard to change, which is exactly what that same architecture is trying to avoid. After all, nothing needs to change any more, since it was all thought through beforehand, wasn&#8217;t it? In short: architecture is a must, but through an agile process.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.smart-java.nl/blog/index.php/2009/03/20/hoe-agile-is-architectuur/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Domain Driven Design in the Reference Architecture.</title>
		<link>http://blog.smart-java.nl/blog/index.php/2009/02/19/domain-driven-design-in-de-referentie-architectuur/</link>
		<comments>http://blog.smart-java.nl/blog/index.php/2009/02/19/domain-driven-design-in-de-referentie-architectuur/#comments</comments>
		<pubDate>Thu, 19 Feb 2009 15:01:02 +0000</pubDate>
		<dc:creator>Pieter van Boxtel</dc:creator>
				<category><![CDATA[Algemeen]]></category>
		<category><![CDATA[Architectuur]]></category>
		<category><![CDATA[Testing]]></category>

		<guid isPermaLink="false">http://blog.smart-java.nl/blog/?p=306</guid>
		<description><![CDATA[In the current version of the Reference Architecture we declared ourselves followers of Domain Driven Design. We have defined domain objects, but what do we actually do with Evans&#8217;s ideas? One of the goals for the next version is to see whether we can give the DDD patterns a place. [...]]]></description>
			<content:encoded><![CDATA[<p>In the current version of the Reference Architecture we declared ourselves followers of Domain Driven Design. We have defined domain objects, but what do we actually do with Evans&#8217;s ideas? One of the goals for the next version is to see whether we can give the DDD patterns a place. Below is an attempt, in which I do not describe or explain the patterns in much detail. For a good explanation of the patterns there is one unsurpassed source: the <a href="http://domaindrivendesign.org/books/index.html#DDD" target="_blank">book by Evans</a>.</p>
<p>Evans really describes two kinds of patterns. The first kind distinguishes types of objects and methods. These patterns are concretely visible in a domain model, and they are the ones we can really use in the reference architecture. The second kind are more like process patterns. They describe somewhat more abstract goals and ways to reach them. These patterns are hard to make concrete in a reference architecture and are not discussed further here.</p>
<p>The core of a DDD domain model consists of <strong>Entities (89)</strong> and <strong>Value Objects (97)</strong>. These are the classic OO objects with state and behaviour. Entities are objects with an “identity”. They are not interchangeable. They can change during their lifetime, but they remain unique. A person object is usually an Entity. A person grows older, his appearance changes, all kinds of attributes change, but a person keeps the same identity. Value Objects, by contrast, are interchangeable. In a customer contact system an address is nothing more than a collection of attributes. An address object can be reused, but it does not have to be. It does not matter if two address objects with the same address are instantiated in the system. And if a person and his address disappear from memory, it is not important that the same address object is found again when the person is requested anew. (For a land registry system this reasoning obviously does not hold&#8230;)</p>
<p>The distinction between Entities and Value Objects is valuable when designing an application. We can apply different design rules to these types of objects. For instance, it is a good idea to make Value Objects immutable. When an attribute needs to change we simply replace the object with a new instance. Fowler already defined the distinction between Entities and Value Objects in <a title="Patterns of Enterprise Application Architecture" href="http://www.martinfowler.com/books.html#eaa">Patterns of Enterprise Application Architecture</a>. He seems to suggest Value Objects mainly for general, reusable things such as Money, however. Evans uses the distinction more structurally by pushing every “classic” domain object into one of the two boxes. In an invoicing system, for example, we can model invoices as Entities and invoice lines as Value Objects.</p>
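<p>As a rough Java illustration of this distinction, using the person and address example from above: the class names, fields and the withStreet helper are assumptions made for this sketch and do not come from the Reference Architecture. The Value Object is immutable and compared by its attributes; the Entity is mutable and compared by its identity only.</p>

```java
import java.util.Objects;

// Entry point, placed first so the file can be launched directly with `java`.
class EntityVersusValueDemo {
    public static void main(String[] args) {
        Address first = new Address("Main Street 1", "Utrecht");
        Address second = new Address("Main Street 1", "Utrecht");
        // Two separate instances with the same attributes are interchangeable.
        System.out.println("addresses equal: " + first.equals(second));

        Person person = new Person(42L, "Jan", first);
        // "Changing" the address means replacing it with a new instance.
        person.moveTo(first.withStreet("Station Road 5"));
        // Same id, different attributes: still the same person.
        System.out.println("persons equal: " + person.equals(new Person(42L, "J. Jansen", second)));
    }
}

// Value Object: immutable, equality based on all attributes.
final class Address {
    private final String street;
    private final String city;

    Address(String street, String city) {
        this.street = Objects.requireNonNull(street);
        this.city = Objects.requireNonNull(city);
    }

    String street() {
        return street;
    }

    String city() {
        return city;
    }

    // Side-effect-free: returns a new Address and leaves this one untouched.
    Address withStreet(String newStreet) {
        return new Address(newStreet, city);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Address)) return false;
        Address other = (Address) o;
        return street.equals(other.street) && city.equals(other.city);
    }

    @Override
    public int hashCode() {
        return Objects.hash(street, city);
    }
}

// Entity: mutable state, equality based on identity (the id) only.
final class Person {
    private final long id;
    private String name;
    private Address address;

    Person(long id, String name, Address address) {
        this.id = id;
        this.name = name;
        this.address = address;
    }

    void rename(String newName) {
        this.name = newName;
    }

    void moveTo(Address newAddress) {
        this.address = newAddress;
    }

    Address address() {
        return address;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Person && ((Person) o).id == id;
    }

    @Override
    public int hashCode() {
        return Long.hashCode(id);
    }
}
```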
<p>Evans also gives some guidelines for designing Entities and Value Objects:</p>
<ul>
<li>Use <strong>Intention-Revealing Interfaces (246)</strong>, in which the names of classes and methods describe their effect and purpose.</li>
<li>Use <strong>Side-Effect-Free Functions (250)</strong>, factor them out into Value Objects and make them <strong>Closed Operations (268)</strong>.</li>
<li>By contrast, turn Entity methods that change state into commands that do not return domain classes. Use <strong>Assertions (255)</strong> to check the post-conditions and invariants of these methods.</li>
</ul>
<p><strong>Services (104)</strong> are objects with behaviour only. They contain methods that have no logical place in an Entity or Value Object. The name &#8216;Service&#8217; is somewhat confusing, since the term &#8216;service&#8217; is used in a different context in the Reference Architecture and in other architectures. &#8216;Domain Service&#8217; is perhaps a better name.</p>
<p>Entities and Value Objects are the objects in the domain model that are persisted. <strong>Factories (136)</strong> and <strong>Repositories (147)</strong> are the domain objects responsible for persistence. Factories create domain objects and Repositories provide search entry points. A Factory is in fact not a separate type of object. A Factory is realised by means of a Factory Method (GoF) on an Entity or on a Service, which may or may not have been created specifically for that purpose. By delegating from the Factory Method to an &#8216;add&#8217; method on a Repository we ensure that database logic is concentrated in one type of domain object.</p>
<p>The unit of persistence is an <strong>Aggregate (125)</strong>. This is a collection of one or more entities and value objects in which one entity, the root, acts as the point of contact. All non-root objects in the aggregate exist solely to serve the root, and objects outside the aggregate only have relations to the aggregate root.</p>
<p>Furthermore, domain objects can be grouped into <strong>Modules (109)</strong> to create structure in the domain model. Modules or packages are standard concepts in OO programming languages and modelling techniques.</p>
<p>Finally, a <strong>Specification (224)</strong> is a simple Value Object that checks one or more business rules for another object. Specifications are based on predicate logic, which means they can be combined as logical conditions into more complex business rules. A Specification can serve three purposes:</p>
<ol>
<li><em>Validation</em> of a business rule on an object. One application of this is the way Mod4J handles business rules in the domain DSL.</li>
<li>Specifying a <em>selection</em> criterion for selecting objects from a collection. One application of this is using a Specification as a parameter to a Repository.</li>
<li>Specifying the <em>construction</em> of a new object.</li>
</ol>
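<p>The first two purposes can be sketched in Java as follows. This is only an illustration under assumptions: the Invoice fields, the InvoiceSpecification interface with its and/or/not combinators, and the in-memory InvoiceRepository are all made up for this example and are not taken from Evans&#8217;s book, Mod4J or the Reference Architecture. The third purpose, construction, is not covered.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Entry point, placed first so the file can be launched directly with `java`.
class SpecificationDemo {
    public static void main(String[] args) {
        // Two simple business rules, combined into a more complex one.
        InvoiceSpecification unpaid = candidate -> !candidate.paid;
        InvoiceSpecification large = candidate -> candidate.amountInCents >= 100_000;
        InvoiceSpecification needsAttention = unpaid.and(large);

        Invoice acme = new Invoice("ACME", 250_000, false);

        // Purpose 1: validation of a rule on a single object.
        System.out.println("ACME needs attention: " + needsAttention.isSatisfiedBy(acme));

        // Purpose 2: selection, with the specification passed to a repository.
        InvoiceRepository repository = new InvoiceRepository(Arrays.asList(
                acme,
                new Invoice("Initech", 5_000, false),
                new Invoice("Globex", 300_000, true)));
        System.out.println("selected: " + repository.select(needsAttention).size());
    }
}

// Hypothetical domain object used only for this example.
final class Invoice {
    final String customer;
    final int amountInCents;
    final boolean paid;

    Invoice(String customer, int amountInCents, boolean paid) {
        this.customer = customer;
        this.amountInCents = amountInCents;
        this.paid = paid;
    }
}

// A specification is a predicate that can be combined with other predicates.
interface InvoiceSpecification {
    boolean isSatisfiedBy(Invoice invoice);

    default InvoiceSpecification and(InvoiceSpecification other) {
        return invoice -> this.isSatisfiedBy(invoice) && other.isSatisfiedBy(invoice);
    }

    default InvoiceSpecification or(InvoiceSpecification other) {
        return invoice -> this.isSatisfiedBy(invoice) || other.isSatisfiedBy(invoice);
    }

    default InvoiceSpecification not() {
        return invoice -> !this.isSatisfiedBy(invoice);
    }
}

// In-memory stand-in for a Repository that accepts a specification as its search criterion.
final class InvoiceRepository {
    private final List<Invoice> invoices;

    InvoiceRepository(List<Invoice> invoices) {
        this.invoices = new ArrayList<>(invoices);
    }

    List<Invoice> select(InvoiceSpecification specification) {
        List<Invoice> result = new ArrayList<>();
        for (Invoice invoice : invoices) {
            if (specification.isSatisfiedBy(invoice)) {
                result.add(invoice);
            }
        }
        return result;
    }
}
```

<p>A real Repository would more likely translate the specification into a query than filter in memory; the filtering loop here only keeps the sketch self-contained.</p>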
<p>The picture below shows the patterns discussed above and their relations:</p>
<p style="text-align: center;"><a href="http://blog.smart-java.nl/blog/wp-content/uploads/2009/02/ddd.png"><img class="size-medium wp-image-309 aligncenter" title="ddd" src="http://blog.smart-java.nl/blog/wp-content/uploads/2009/02/ddd.png" alt="" width="587" height="411" /></a></p>
<p>We can include the patterns mentioned above in the reference architecture without major consequences, as a replacement for / extension of the current types of business components. The translation then goes as follows:</p>
<ul>
<li>The current Reference Architecture has Business Processes, which appear comparable to Domain Services. Business Processes are not part of the domain model, however; they are not called by domain objects. The Reference Architecture furthermore states that every call to a domain object goes through a business process. In effect these Business Processes give us an extra “layer” that is redundant, since there is a mandatory service layer above it. Nothing prevents us from dropping that layer and defining domain services only where they add value.</li>
<li>The domain objects as we know them now can be divided into Entities and Value Objects.</li>
<li>Aggregates are currently mentioned in the implementation view as the unit of persistence, but they are not explicitly recognised as a modelling unit in the logical view.</li>
<li>Specifications are new compared with the reference architecture, although Mod4J already applies them for business rules.</li>
<li>Repositories currently sit in a separate layer, the data layer, and are called Data Access Logic components. However, as I argued in a <a title="Lagen in de referentiearchitectuur" href="http://blog.smart-java.nl/blog/index.php/2008/10/06/lagen-in-de-referentiearchitectuur/" target="_blank">previous blog post</a>, that is not justified and they belong in the domain layer.</li>
<li>According to that same blog post, the Service Agents and Data Service Agents as we know them now are the same thing and we can place them in a (new) layer, the Integration layer.</li>
</ul>
<p>My conclusion is that by applying the DDD patterns we can make the reference architecture more consistent and easier to understand. The design of an application gains more structure, while the result, the implementation, hardly changes compared with the current reference architecture. An additional advantage of applying DDD is that we can refer to Evans&#8217;s book “Domain Driven Design” for teaching the architecture.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.smart-java.nl/blog/index.php/2009/02/19/domain-driven-design-in-de-referentie-architectuur/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Scala: too hot to handle?</title>
		<link>http://blog.smart-java.nl/blog/index.php/2009/01/29/scala-too-hot-to-handle/</link>
		<comments>http://blog.smart-java.nl/blog/index.php/2009/01/29/scala-too-hot-to-handle/#comments</comments>
		<pubDate>Thu, 29 Jan 2009 18:30:48 +0000</pubDate>
		<dc:creator>Hedzer Westra</dc:creator>
				<category><![CDATA[Architectuur]]></category>
		<category><![CDATA[Java language]]></category>
		<category><![CDATA[Clojure]]></category>
		<category><![CDATA[Scala]]></category>

		<guid isPermaLink="false">http://blog.smart-java.nl/blog/?p=288</guid>
		<description><![CDATA[I ended the blog item of 21 January with: &#8220;I myself hope to come up soon with a review of &#8216;Programming in Scala,&#8217; the first book about this new and exciting programming language.&#8221;
Jan-Kees van Andel promptly responded: &#8220;I do not expect Scala to grow further in popularity. At the JVM Language Summit 2008 the general opinion was [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal">I ended the blog item of 21 January with: &#8220;I myself hope to come up soon with a review of &#8216;Programming in Scala,&#8217; the first book about this new and exciting programming language.&#8221;</p>
<p class="MsoNormal">Jan-Kees van Andel promptly responded: &#8220;I do not expect Scala to grow further in popularity. At the JVM Language Summit 2008 the general opinion was that Scala is not going to make it because it is too difficult. And that &#8216;too difficult&#8217; comes from very capable people (read: JVM language designers). They were mostly charmed by Clojure.&#8221;</p>
<p class="MsoNormal">Since I have set myself the goal of bringing Scala to the attention of my J-Tech colleagues this year, I obviously cannot let that pass <img src='http://blog.smart-java.nl/blog/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p class="MsoNormal">Speaking of JVM language designers: the author of Scala, Martin Odersky, is the inventor of Java generics and the developer of the Java 1.2 reference compiler. So he belongs to this group himself, and he is by no means done with Scala.</p>
<p class="MsoNormal">It recently became known that Java7 will not get closures, because no decision could be reached. At the end of August/beginning of September Jan-Kees still hopefully posted an article discussing the three proposals in progress, but unfortunately none of them will see the light of day in Java7. That in itself is a good argument to look at Scala: there closures are rock solid in the foundation of the language.</p>
<p class="MsoNormal">Of course Clojure has closures too (hence the name&#8230;), but if you look at Clojure&#8217;s syntax you immediately see what it is mainly based on: LISP. And not for nothing is that said to stand for Lots of Irritating Silly Parentheses (actually LISt Processing, but you see where I am going). LISP dates from the &#8217;50s/&#8217;60s of the previous millennium and understandably has a syntax that is by now outdated. Scala&#8217;s syntax, by contrast, is largely based on Java, which in my view makes the move Java -&gt; Scala easier.</p>
<p class="MsoNormal">Another point is that I lied a little when I said &#8220;this new [..] programming language&#8221;: Scala has been in development since 2001, and not by the least of scientists, and is therefore already very mature. Clojure only started in 2005.</p>
<p class="MsoNormal">Scala&#8217;s proposition is that it is a scalable language (&#8216;SCAlable LAnguage&#8217;). Not only in the sense of performance on multi-core systems, but also, and much more importantly, scalable to the type of application you want to build with it. That ranges from script through desktop application to enterprise system. The syntax supports all these applications in the same way. That said, Scala is a rich language, which you can use as you wish: as &#8216;Java on steroids&#8217; or as an almost pure functional language à la Haskell. If you want the latter and currently only know Java, that is indeed quite a job, and I can imagine you labelling the language &#8216;too difficult&#8217;. But if you choose the former and gradually make the language your own, you can &#8216;scale up&#8217;, so to speak, to a more functionally oriented programming style. It is all possible in Scala.</p>
<p class="MsoNormal">As for the functional programming style (FP): it really is fundamentally different from the imperative OO programming style that is preferred for Java, for example. So you have to learn it, and learning is always (to a greater or lesser extent) difficult. But to blame the language for that right away? If you move from C to Java you also have to learn a new style that is difficult at first, but in the end it is more than worth the effort. The same goes for FP. If you really want to taste &#8216;hardcore&#8217; pure FP, you could take a look at Haskell or ML. But it is not for nothing that, despite years of language development, those are still the domain of mathematically trained academics: programming in those languages quickly becomes very abstract (do I hear someone shouting &#8216;Monads!&#8217;?). Scala is much more &#8216;down to earth&#8217; and within reach of ordinary mortals.</p>
<p class="MsoNormal">If you are still not convinced that <a title="Scala" href="http://www.scala-lang.org/">Scala</a> really deserves more attention, check the <a title="Wiki" href="https://wiki.ordina.nl/confluence/display/JAVA/Scala">Wiki</a> I have set up (unfortunately only for Ordina employees). There you will find the PowerPoint of a presentation I recently gave at an architecture peer-review meeting, a draft cookbook and an Eclipse workspace with many small code examples. Do install the <a title="Scala plugin" href="http://www.scala-lang.org/scala-eclipse-plugin">Scala plugin</a> first!</p>
<p class="MsoNormal">A special meeting on Scala is in the pipeline. You can also expect additional blog items from me about, for example, Liftweb and Sweet, two web frameworks in Scala. By the way: I am still looking for reviewers for my cookbook! Who volunteers as a proofreader?</p>
<div id="attachment_289" class="wp-caption alignnone" style="width: 103px"><a href="http://blog.smart-java.nl/blog/wp-content/uploads/2009/01/hedzerwestra_93x1231.jpg"><img class="size-medium wp-image-289" title="hedzerwestra_93x1231" src="http://blog.smart-java.nl/blog/wp-content/uploads/2009/01/hedzerwestra_93x1231.jpg" alt="Hedzer Westra" width="93" height="123" /></a><p class="wp-caption-text">Hedzer Westra</p></div>
]]></content:encoded>
			<wfw:commentRss>http://blog.smart-java.nl/blog/index.php/2009/01/29/scala-too-hot-to-handle/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Flat file parsing with Spring Batch</title>
		<link>http://blog.smart-java.nl/blog/index.php/2008/11/07/flat-file-parsing-met-spring-batch/</link>
		<comments>http://blog.smart-java.nl/blog/index.php/2008/11/07/flat-file-parsing-met-spring-batch/#comments</comments>
		<pubDate>Thu, 06 Nov 2008 23:17:05 +0000</pubDate>
		<dc:creator>Richard Kettelerij</dc:creator>
				<category><![CDATA[Algemeen]]></category>
		<category><![CDATA[Architectuur]]></category>
		<category><![CDATA[Spring]]></category>
		<category><![CDATA[Tools/Frameworks]]></category>

		<guid isPermaLink="false">http://blog.smart-java.nl/blog/?p=240</guid>
		<description><![CDATA[Legacy systems often use flat files for data exchange. When integrating with such systems, these files have to be parsed. In that case you can of course get to work yourself with the Java Scanner or StringTokenizer, but it is probably wiser to use an existing framework. Spring Batch offers an elegant solution for this.
Would you [...]]]></description>
			<content:encoded><![CDATA[<p>Legacy systems often use flat files for data exchange. When integrating with such systems, these files have to be parsed. In that case you can of course get to work yourself with the Java <tt>Scanner</tt> or <tt>StringTokenizer</tt>, but it is probably wiser to use an existing framework. Spring Batch offers an elegant solution for this.</p>
<p>Would you like to know more about Spring Batch and batch processing in general? Then sign up for the upcoming <a href="http://joost.ordina.nl/Default.asp/id,410/cursusid,828/index.html">Special Meeting on 18 November</a>.</p>
<p><strong>File definition</strong><br />
As an example we use a fixed-width file from a legacy system of a fictional online video rental store. As is usual with this form of data exchange, the file consists of different kinds of records: a header record with metadata such as a company name and batch number, a footer record with a check number, and of course a number of records with the actual information, in this case film titles.<br />
<code><br />
H   OrdinaVideoStore.nl    12<br />
M   Shrek II<br />
M   Lord of War<br />
M   Godfather, The<br />
M   Kungfu Panda<br />
F   0000000000000000006<br />
</code><br />
<strong>Tokenizing</strong><br />
To parse the file above we first have to distinguish between the different kinds of records. Since the record type can be derived from the first character of each record, we use the <tt>PrefixMatchingCompositeLineTokenizer</tt>.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;bean</span> <span style="color: #000066;">id</span>=<span style="color: #ff0000;">&quot;movieFileLayout&quot;</span></span>
<span style="color: #009900;"><span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.springframework.batch.item.file.transform.PrefixMatchingCompositeLineTokenizer&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;tokenizers&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;map<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;entry</span> <span style="color: #000066;">key</span>=<span style="color: #ff0000;">&quot;H&quot;</span> <span style="color: #000066;">value-ref</span>=<span style="color: #ff0000;">&quot;movieHeaderRecordLayout&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;entry</span> <span style="color: #000066;">key</span>=<span style="color: #ff0000;">&quot;M&quot;</span> <span style="color: #000066;">value-ref</span>=<span style="color: #ff0000;">&quot;movieRecordLayout&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
			<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;entry</span> <span style="color: #000066;">key</span>=<span style="color: #ff0000;">&quot;F&quot;</span> <span style="color: #000066;">value-ref</span>=<span style="color: #ff0000;">&quot;movieFooterRecordLayout&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
		<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/map<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/bean<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
&nbsp;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;bean</span> <span style="color: #000066;">id</span>=<span style="color: #ff0000;">&quot;movieHeaderRecordLayout&quot;</span></span>
<span style="color: #009900;">	<span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.springframework.batch.item.file.transform.FixedLengthTokenizer&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;names&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;recordtype, videostore, batchid&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;columns&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;1,5-25,28-30&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/bean<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>

<p>This tokenizer passes each record, based on its type, to another <tt>LineTokenizer</tt>. As the configuration above shows, that tokenizer maps each fixed-width column to a corresponding column name. This is where the configuration options in Spring show their strength: you simply list the column ranges in the application context and Spring Batch does the rest. To make this work, Spring does need some help in the form of a <tt>CustomEditorConfigurer</tt>, which translates the range definitions (5-25, etc.) into meaningful <tt>Range</tt> objects.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;bean</span> <span style="color: #000066;">id</span>=<span style="color: #ff0000;">&quot;customEditorConfigurer&quot;</span></span>
<span style="color: #009900;">	<span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.springframework.beans.factory.config.CustomEditorConfigurer&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;customEditors&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
	  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;map<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
	    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;entry</span> <span style="color: #000066;">key</span>=<span style="color: #ff0000;">&quot;org.springframework.batch.item.file.transform.Range[]&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
	      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;bean</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.springframework.batch.item.file.transform.RangeArrayPropertyEditor&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
	    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/entry<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
	  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/map<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/bean<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>
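<p>To make the range notation concrete, here is a minimal plain-Java sketch of what happens conceptually when a definition such as <tt>1,5-25,28-30</tt> is applied to the header line of the sample file. It is an illustration only and not Spring Batch's own implementation: the class <tt>FixedWidthSketch</tt> and its methods are hypothetical names, and the trimming of padding is an assumption of this sketch.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java illustration of fixed-width tokenizing; not Spring Batch's own code. */
final class FixedWidthSketch {

    /** Parses "1,5-25,28-30" into {min, max} pairs (1-based, inclusive). */
    static List<int[]> parseRanges(String definition) {
        List<int[]> ranges = new ArrayList<>();
        for (String part : definition.split(",")) {
            String[] bounds = part.trim().split("-");
            int min = Integer.parseInt(bounds[0].trim());
            // A single column such as "1" is treated as the range 1-1.
            int max = bounds.length > 1 ? Integer.parseInt(bounds[1].trim()) : min;
            ranges.add(new int[] {min, max});
        }
        return ranges;
    }

    /** Cuts one token per range out of the line and strips the padding. */
    static List<String> tokenize(String line, List<int[]> ranges) {
        List<String> tokens = new ArrayList<>();
        for (int[] range : ranges) {
            // Clamp to the line length so that short lines do not throw.
            int start = Math.min(range[0] - 1, line.length());
            int end = Math.min(range[1], line.length());
            tokens.add(line.substring(start, end).trim());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String header = "H   OrdinaVideoStore.nl    12";
        List<String> tokens = tokenize(header, parseRanges("1,5-25,28-30"));
        System.out.println(tokens); // prints [H, OrdinaVideoStore.nl, 12]
    }
}
```

<p>The three tokens line up with the names <tt>recordtype</tt>, <tt>videostore</tt> and <tt>batchid</tt> from the header layout, which is the pairing the tokenizer hands on as a field set.</p>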

<p><strong>Mapping to domain objects</strong><br />
Next, the column/field definitions have to be translated into domain objects. For this purpose Spring Batch provides the <tt>FieldSetMapper</tt> interface. Although we could map fields to domain objects ourselves, we can leave this to Spring Batch by using a <tt>FieldSetMapper</tt> based on Spring’s <tt>BeanWrapper</tt>. We still have to distinguish between the different record types, however. Unlike the prefix-enabled tokenizer, there is no standard fieldset mapper for this, so we write a simple mapper ourselves:</p>

<div class="wp_syntax"><div class="code"><pre class="java" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">class</span> PrefixAwareMovieFieldSetMapper <span style="color: #000000; font-weight: bold;">implements</span> FieldSetMapper <span style="color: #009900;">&#123;</span>
	<span style="color: #000000; font-weight: bold;">private</span> FieldSetMapper headerFieldSetMapper<span style="color: #339933;">;</span>
	<span style="color: #000000; font-weight: bold;">private</span> FieldSetMapper movieFieldSetMapper<span style="color: #339933;">;</span>
	<span style="color: #000000; font-weight: bold;">private</span> FieldSetMapper footerFieldSetMapper<span style="color: #339933;">;</span>
&nbsp;
	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #003399;">Object</span> mapLine<span style="color: #009900;">&#40;</span>FieldSet fieldSet<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
		<span style="color: #000000; font-weight: bold;">final</span> <span style="color: #003399;">String</span> recordType <span style="color: #339933;">=</span> fieldSet.<span style="color: #006633;">readString</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;recordtype&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>recordType.<span style="color: #006633;">equals</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;H&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000000; font-weight: bold;">return</span> headerFieldSetMapper.<span style="color: #006633;">mapLine</span><span style="color: #009900;">&#40;</span>fieldSet<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000000; font-weight: bold;">else</span> <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>recordType.<span style="color: #006633;">equals</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;M&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000000; font-weight: bold;">return</span> movieFieldSetMapper.<span style="color: #006633;">mapLine</span><span style="color: #009900;">&#40;</span>fieldSet<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000000; font-weight: bold;">else</span> <span style="color: #000000; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>recordType.<span style="color: #006633;">equals</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;F&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
			<span style="color: #000000; font-weight: bold;">return</span> footerFieldSetMapper.<span style="color: #006633;">mapLine</span><span style="color: #009900;">&#40;</span>fieldSet<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
		<span style="color: #009900;">&#125;</span>
		<span style="color: #000000; font-weight: bold;">throw</span> <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #003399;">IllegalStateException</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">&quot;unknown record type&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
	<span style="color: #009900;">&#125;</span>
	<span style="color: #666666; font-style: italic;">// setters</span>
<span style="color: #009900;">&#125;</span></pre></div></div>

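<p>Once we configure this <tt>FieldSetMapper</tt>, the mapping from tokens to domain objects is complete. The configuration below only spells out the header mapper; the movie and footer mappers it refers to would presumably be set up along the same lines.</p>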

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;bean</span> <span style="color: #000066;">id</span>=<span style="color: #ff0000;">&quot;movieFileFieldSetMapper&quot;</span></span>
<span style="color: #009900;">	<span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;nl.ordina.batch.PrefixAwareMovieFieldSetMapper&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;headerFieldSetMapper&quot;</span> <span style="color: #000066;">ref</span>=<span style="color: #ff0000;">&quot;movieHeaderFieldSetMapper&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;movieFieldSetMapper&quot;</span> <span style="color: #000066;">ref</span>=<span style="color: #ff0000;">&quot;movieFieldSetMapper&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;footerFieldSetMapper&quot;</span> <span style="color: #000066;">ref</span>=<span style="color: #ff0000;">&quot;movieFooterFieldSetMapper&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/bean<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
&nbsp;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;bean</span> <span style="color: #000066;">id</span>=<span style="color: #ff0000;">&quot;movieHeaderFieldSetMapper&quot;</span> </span>
<span style="color: #009900;">   <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.springframework.batch.item.file.mapping.BeanWrapperFieldSetMapper&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;prototypeBeanName&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;headerRecord&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/bean<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
&nbsp;
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;bean</span> <span style="color: #000066;">id</span>=<span style="color: #ff0000;">&quot;headerRecord&quot;</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;nl.ordina.batch.model.MovieHeaderRecord&quot;</span></span>
<span style="color: #009900;"><span style="color: #000066;">scope</span>=<span style="color: #ff0000;">&quot;prototype&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span></pre></div></div>
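<p>For readers who wonder what a bean-wrapper based mapper does with a field set, the following sketch copies named fields onto a bean through its setters, using only the JDK's <tt>java.beans</tt> introspection. It is a simplified illustration and not the code of <tt>BeanWrapperFieldSetMapper</tt>. The nested <tt>HeaderRecord</tt> class is a hypothetical stand-in for <tt>MovieHeaderRecord</tt>, which is not shown in this post; its property names are assumed to match the column names of the header layout.</p>

```java
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration of name-based bean mapping; not Spring's BeanWrapper. */
final class BeanMappingSketch {

    /** Hypothetical stand-in for MovieHeaderRecord (assumed property names). */
    public static final class HeaderRecord {
        private String recordtype;
        private String videostore;
        private int batchid;

        public void setRecordtype(String recordtype) { this.recordtype = recordtype; }
        public void setVideostore(String videostore) { this.videostore = videostore; }
        public void setBatchid(int batchid) { this.batchid = batchid; }
        public String getRecordtype() { return recordtype; }
        public String getVideostore() { return videostore; }
        public int getBatchid() { return batchid; }
    }

    /** Creates a bean and sets every property for which a field with the same name exists. */
    static <T> T map(Map<String, String> fields, Class<T> type) {
        try {
            T bean = type.getDeclaredConstructor().newInstance();
            for (PropertyDescriptor property
                    : Introspector.getBeanInfo(type).getPropertyDescriptors()) {
                Method setter = property.getWriteMethod();
                String raw = fields.get(property.getName());
                if (setter == null || raw == null) {
                    continue; // read-only property or no matching field
                }
                // Only String and int are handled here; a real mapper converts many more types.
                Object value = property.getPropertyType() == int.class
                        ? (Object) Integer.valueOf(raw.trim())
                        : raw;
                setter.invoke(bean, value);
            }
            return bean;
        } catch (ReflectiveOperationException | IntrospectionException e) {
            throw new IllegalStateException("could not map fields onto " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("recordtype", "H");
        fields.put("videostore", "OrdinaVideoStore.nl");
        fields.put("batchid", "12");
        HeaderRecord header = map(fields, HeaderRecord.class);
        System.out.println(header.getVideostore() + " / batch " + header.getBatchid());
    }
}
```

<p>A fresh bean is created for every record, which is why the <tt>headerRecord</tt> bean in the configuration above is declared with prototype scope.</p>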

<p><strong>Parsing</strong><br />
Now that we have defined which characters map to which fields, and which fields map to which domain object, it is time to parse files. For this we use a <tt>FlatFileItemReader</tt>, into which we inject the tokenizer and fieldset mapper created earlier. Configured within a batch job, this looks as follows. When the job runs, the &#8220;DummyWriter&#8221; receives a corresponding domain object for every record in the source file.</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;bean</span> <span style="color: #000066;">id</span>=<span style="color: #ff0000;">&quot;videotheekJob&quot;</span> <span style="color: #000066;">parent</span>=<span style="color: #ff0000;">&quot;simpleJob&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;steps&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;list<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;bean</span> <span style="color: #000066;">id</span>=<span style="color: #ff0000;">&quot;printRecordStep&quot;</span> <span style="color: #000066;">parent</span>=<span style="color: #ff0000;">&quot;simpleStep&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;itemReader&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
	   <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;bean</span> <span style="color: #000066;">class</span>=<span style="color: #ff0000;">&quot;org.springframework.batch.item.file.FlatFileItemReader&quot;</span><span style="color: #000000; font-weight: bold;">&gt;</span></span>
	     <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;resource&quot;</span> <span style="color: #000066;">value</span>=<span style="color: #ff0000;">&quot;file:///videotheek/films.txt&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
	     <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;lineTokenizer&quot;</span> <span style="color: #000066;">ref</span>=<span style="color: #ff0000;">&quot;movieFileLayout&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
	     <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;fieldSetMapper&quot;</span> <span style="color: #000066;">ref</span>=<span style="color: #ff0000;">&quot;movieFileFieldSetMapper&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
	   <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/bean<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
	<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property</span> <span style="color: #000066;">name</span>=<span style="color: #ff0000;">&quot;itemWriter&quot;</span> <span style="color: #000066;">ref</span>=<span style="color: #ff0000;">&quot;dummyWriter&quot;</span> <span style="color: #000000; font-weight: bold;">/&gt;</span></span>
      <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/bean<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
    <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/list<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
  <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/bean<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></pre></div></div>
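<p>Stripped of all framework machinery, the step configured above boils down to a loop: read a line, decide by its prefix how to interpret it, and hand the result to a writer. The sketch below shows that flow in plain Java under simplifying assumptions. A <tt>BufferedReader</tt> stands in for the <tt>FlatFileItemReader</tt>, a <tt>Consumer</tt> stands in for the writer, and each record is mapped to a descriptive string rather than to a domain object. <tt>ReadLoopSketch</tt> and its methods are hypothetical names.</p>

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

/** Illustration of the read, map, write flow; not the Spring Batch step implementation. */
final class ReadLoopSketch {

    /** Interprets one line, choosing the record kind by its first character. */
    static String mapLine(String line) {
        // In the sample file the payload starts at column 5.
        String payload = line.length() > 4 ? line.substring(4).trim() : "";
        switch (line.charAt(0)) {
            case 'H': return "header: " + payload;
            case 'M': return "movie: " + payload;
            case 'F': return "footer: " + payload;
            default: throw new IllegalStateException("unknown record type: " + line);
        }
    }

    /** Reads all lines and passes one mapped item per non-empty line to the writer. */
    static void run(BufferedReader reader, Consumer<String> writer) {
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    writer.accept(mapLine(line));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String file = "H   OrdinaVideoStore.nl    12\n"
                + "M   Shrek II\n"
                + "F   0000000000000000006\n";
        // The "writer" simply prints each item, much like a dummy writer would.
        run(new BufferedReader(new StringReader(file)), System.out::println);
    }
}
```

<p>In the real job the same responsibilities are split over the tokenizer, the fieldset mapper and the item writer, so each of them can be swapped through configuration.</p>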

<p>As you may have noticed, the <tt>FlatFileItemReader</tt> points to a hardcoded file path. At the moment Spring Batch does not yet offer a way to make this configurable for multiple files. I have reported this in a <a href="http://jira.springframework.org/browse/BATCH-905">JIRA ticket</a> and attached a solution in the form of a <tt>DynamicMultiResourceItemReader</tt>.</p>
<p>Finally, in my opinion Spring Batch offers solid support for parsing various kinds of files. This is especially convenient in applications that already use the Spring Framework. But Spring Batch has more to offer and is certainly worth considering outside pure Spring applications as well.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.smart-java.nl/blog/index.php/2008/11/07/flat-file-parsing-met-spring-batch/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
