One of the biggest problems of web development right now is bloat.

Bloat manifests itself in several forms:

The question I want to talk about today is how this happens and what can be done about it. There are several issues we can tackle, and we can approach the problem from different sides. I will not talk about server optimization or different connections. I will talk about how front-end code turns bad and what can be done to avoid it.

There is a lot of talk about this and without a doubt there will be comments like "nothing new" and "this has been practice for years in Java" and so on, but if this is so common, how come we have to deal with bloat all the time? When was the last time you sat down on an existing project and immediately knew what was going on?

Reason#1: Wrong perception of time needed to accustom ourselves to a project

And this is actually the first reason for bloat: people don't get the time to analyze a system before they get allocated to it. From day one a new resource is supposed to be at 100% efficiency, and time spent on handover is considered time wasted. After all, we are all clever "coding guys" and should all work the same way and very logically, right?

There is no easy solution for this problem, other than making sure the people allocated to the problem have time to get to grips with it. As a lead developer or manager this means that you request handover time and a learning period for the developer allocated to the other product.

Reason#2: Maintenance without the right tools

If this time is not allocated the following code atrocities happen:

In CSS you will find constructs like these:

html body #content #mainsection li a:link{
  color:#333;
}
html body #content #mainsection li a:visited{
  color:#000;
}
html body #content #mainsection li a:hover{
  color:#999;
}
html body #content #mainsection li a:active{
  color:#999;
}

You artificially bump up the specificity because there seems to be no way to find out where the original setting came from (there is a way, but more on that later). Other classics are the use of !important all over the place, adding lots of nested DIVs with IDs to gain extra specificity, or even using style attributes to override what is already there.
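To see why a selector like the one above wins every time, it helps to count what it contains. The following is a rough sketch of how specificity is compared. The function names specificity and compareSpecificity are made up for this illustration, and the counting only handles simple selectors like the ones in this article (no :not(), no pseudo-elements, no attribute selectors).

```javascript
// Rough specificity counter for simple selectors.
// Assumption: no :not(), no ::pseudo-elements, no attribute selectors.
function specificity(selector) {
  const count = (pattern) => (selector.match(pattern) || []).length;
  // IDs such as #content
  const ids = count(/#[\w-]+/g);
  // classes (.foo) and pseudo-classes (:link, :hover)
  const classes = count(/[.:][\w-]+/g);
  // element names: a name at the start or right after whitespace or a combinator
  const elements = count(/(?:^|[\s>+~])[a-zA-Z][\w-]*/g);
  return [ids, classes, elements];
}

// Returns a positive number if selector a beats selector b,
// a negative number if b beats a, and 0 if they tie.
function compareSpecificity(a, b) {
  const sa = specificity(a);
  const sb = specificity(b);
  for (let i = 0; i < 3; i++) {
    if (sa[i] !== sb[i]) return sa[i] - sb[i];
  }
  return 0;
}

const bloated = specificity('html body #content #mainsection li a:link');
console.log(bloated.join(',')); // 2,1,4
```

Two IDs, one pseudo-class and four element names: any later rule has to carry at least as much weight to change the colour again, which is how the selectors keep growing.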

In JavaScript, the same happens when people randomly add new event handlers to elements because they have no clue why something happens. In a script environment that uses addEventListener instead of inline event handlers this is possible. The other, more common way is just to add an inline onchange, onclick or onmouseover event handler.
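A small sketch of how handlers pile up: every new function registered for the same event is added to the ones already there. The example uses a bare EventTarget as a stand-in for a DOM element so that it runs outside a browser; the handler names are invented for the illustration.

```javascript
// Stand-in for a link or form element; EventTarget is the interface
// that DOM elements inherit addEventListener from.
const link = new EventTarget();
const calls = [];

function originalHandler() { calls.push('original'); }
function quickFix() { calls.push('quick fix'); }

link.addEventListener('click', originalHandler);
// A later maintainer does not know about originalHandler and adds another one.
link.addEventListener('click', quickFix);
// Registering the identical function again is ignored.
link.addEventListener('click', originalHandler);

link.dispatchEvent(new Event('click'));
console.log(calls.join(', ')); // original, quick fix
```

Both handlers run on every click, and nothing in the markup hints that either of them exists.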

Add this several times and you have a very bloated, hardly maintainable product.

What is the solution for it? A little thing called Firebug, given to us by Joe Hewitt.

Firebug allows you to inspect any element and find out what its CSS is and, more importantly, where it was inherited from. For JavaScript, it allows you to inspect the DOM of every element and set breakpoints to inspect what is happening right here and now.

Reason#3: Bad or non-existent documentation

Another worthy cause of bloat is missing or simply bad documentation. Documentation is a pain to do, as it requires you both to know the system and to make it understandable to people who don't understand it. What more logical choice than to make the initial developers create the documentation from comments in the source code? That way you ensure that only people on the same level and with the same mindset understand it.

Really good documentation should be written by someone who gets the system explained to them and knows how to write. The original developer should only be there to help out in the case this person gets stuck and needs more information.

Good documentation should also not knock out the reader but invite her to play with the system. You can have your JavaDoc style documentation, but in order to make your system maintainable, there should be additional cookbook style documentation. Do not bother with "Hello world" examples, nobody needs them. Explain the system with real implementation examples, not a "perfect world scenario".

Reason#4: People do not read

This may sound harsh, but I can prove it. This is a comment on Amazon about my book.

"I keep running into a custom object in the code examples of the book called "DOMhelp". (…) For example, instead of using the actual DOM methods to get all the links on the page and loop through them, he shows you a line of code that just says "DOMhelp.getlinks". Yes, that line does the same thing by accessing his object and running the regular DOM functions, but what does it teach me? Nothing. That alone is a big enough annoyance to regret buying this book."

There is no function with that name, and chapter 4 explains that the library DOMhelp.js is assembled from all the tool methods used in chapters 1 to 3.

Another example: one of my clients had a third party company developing a site for them. They needed a JavaScript that showed some popups which were JS dependent. I did something like this for them:

popups = {
  // id of the section with the popup links
  popupsContainerId:'main',
  // id of the login show link
  loginLinkId:'loginshowtrigger',
  // id of the subscribe show link
  subscribeLinkId:'subscribeshowtrigger',
  // text of the login slideshow link 
  loginLabel:'how to login',
  // text of the subscribe slideshow link 
  subscribeLabel:'how to subscribe',

  init:function(){
    var o = document.getElementById(popups.popupsContainerId);
    if(o){
      popups.loginLink = document.createElement('a');
      popups.loginLink.setAttribute('href', '#');
      // appendChild needs a node, so the label string is wrapped in a text node
      popups.loginLink.appendChild(document.createTextNode(popups.loginLabel));
      popups.loginLink.id = popups.loginLinkId;
      popups.addEvent(popups.loginLink, 'click', popups.loginShow);
      o.appendChild(popups.loginLink);

      popups.subscribeLink = document.createElement('a');
      popups.subscribeLink.setAttribute('href', '#');
      popups.subscribeLink.appendChild(document.createTextNode(popups.subscribeLabel));
      popups.subscribeLink.id = popups.subscribeLinkId;
      popups.addEvent(popups.subscribeLink, 'click', popups.subscribeShow);
      o.appendChild(popups.subscribeLink);
    }	
  },
  loginShow:function(){…},
  subscribeShow:function(){…},
  addEvent:function(){…}
}
popups.addEvent(window,'load',popups.init);

When I got the full site back from the provider with the rest of the content and the styling, I found this in the HTML:

<a href="javascript:popups.loginShow()">Show login demo</a>
<a href="javascript:popups2.subscribeShow()">Show subscribe demo</a>

And my script mutated:

popups = {
  […]
  init:function(){
    var o = document.getElementById(popups.popupsContainerId);
    if(o){
      […]
      // o.appendChild(popups.loginLink);
      […]
      // o.appendChild(popups.subscribeLink);
    }	
  },
  loginShow:function(){…},
  subscribeShow:function(){…},
  addEvent:function(){…}
}
popups2 = {
  […]
  init:function(){
    var o = document.getElementById(popups.popupsContainerId);
    if(o){
      […]
      // o.appendChild(popups.loginLink);
      […]
      // o.appendChild(popups.subscribeLink);
    }	
  },
  loginShow:function(){…},
  subscribeShow:function(){…},
  addEvent:function(){…}
}
popups.addEvent(window,'load',popups.init);
popups2.addEvent(window,'load',popups2.init);

I don't put this down to people being stupid; I blame it on a natural instinct, especially among men, to want to find their own way out of a situation.

Here is a normal happening:

Clever companies learnt about this behaviour and gave up on written documentation. Instead they use cryptic iconography, hard to pronounce names and leisurely add too many or not enough assembly parts in their packaging. That way the assembly becomes more of an adventure and this instinct is satisfied.

Heilmann's law of documentation: your documentation is only as good as the worst recipient.

First follow-up fact: this is the only person that will contact you about your documentation.

Second follow-up fact: this is also the only person that will write about your product elsewhere.

Reason#5: Lack of awareness

Another cause of bloat is that people don't bother to understand a script or library before they implement it. Several times I've seen web sites that included Prototype, jQuery and YUI, as each had a demo script that looked cool or provided a certain effect.

Furthermore, these sites tended to include the wrong versions of the YUI. YUI components come in three flavours: a minified version, a build version and a debugging version.

The only version to use in a live system is the minified version. The build version is good if you want to see what is going on during development and the debug version already provides you with lots of debugging messages thrown out to the console or the logger widget.

What not many people seem to have found out is that the YUI is also available as a hosted version. These files are hosted on a distributed network of servers and are automatically minified and packed. If a user has been on another site with these script includes they will already be cached.

Another problem is that implementers don't realize that the scripting-enabled version of the site is not what every user can see or experience. While it is easy to show and hide a lot of the content and pack a lot into carousels and tabs, you should always check how much information a visitor without JS has to deal with. If that is just too much, then there is always the option to load the extra content with Ajax, either on demand or after the main document has loaded.
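Loading on demand can be sketched in a few lines. This is only an illustration under assumptions: createOnDemandLoader and fetchFragment are hypothetical names, the URL is a placeholder, and the function that actually requests a fragment is passed in so the idea stays independent of any particular Ajax library.

```javascript
// Returns a load(url) function that requests each fragment only once.
// fetchFragment is whatever performs the request and returns a promise,
// for example (url) => fetch(url).then((response) => response.text())
// in a browser that supports fetch.
function createOnDemandLoader(fetchFragment) {
  const cache = new Map();
  return function load(url) {
    if (!cache.has(url)) {
      // First request for this fragment: start loading and remember the promise.
      cache.set(url, fetchFragment(url));
    }
    // Later requests reuse the same promise instead of hitting the server again.
    return cache.get(url);
  };
}

// Usage with a fake request function so the sketch runs anywhere.
const load = createOnDemandLoader((url) => Promise.resolve('<p>content of ' + url + '</p>'));
load('/tabs/reviews.html').then((html) => console.log(html));
```

The initial document then only carries the content every visitor needs, and a tab or carousel panel is requested the first time somebody actually opens it.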

Code bloat also happens when people don't understand the technologies at their disposal and try to shoe-horn their specialized knowledge into the grander picture.

How many times have you seen constructs like this:

<ul>
  <li class="list-item">The Passenger</li>
  <li class="list-item currentlyplaying">Louie Louie</li>
  <li class="list-item">I want to conquer the world</li>
  <li class="list-item">Foxtrott Uniform Charlie Kilo</li>
</ul>

li.list-item{ padding:.5em; font-family:courier;color:#000; }
li.currentlyplaying{ color:#c00; }

CSS for the logically challenged:

Start with a global whitespace reset:

*{
  margin:0;
  padding:0;
  list-style:none;
  border:none;
}

Reason: no more surprises from "browser stylesheets" - there is no unstyled document on the web.

Now define the most common elements you use and give them a predefined style:

body{
  font-family:helvetica,arial,sans-serif;
  background:#fff;
  color:#333;
  padding:2em;
}
p,li {
  padding-bottom:.5em;
  line-height:1.3em;
}

h1,h2,h3,h4,h5,h6{
  padding-bottom:.5em;
}

Then override or extend these settings with special cases inside elements with IDs:


#nav li{
  padding:1em .5em;
}

#header p{
  border:1px solid #999;
  background:#ddd;
}

<ul id="playlist">
  <li>The Passenger</li>
  <li>Louie Louie</li>
  <li>I want to conquer the world</li>
  <li>Foxtrott Uniform Charlie Kilo</li>
</ul>

Then you can detect the exceptions to the rule that need extra formatting. For these you can use CSS classes.

<ul id="playlist">
  <li>The Passenger</li>
  <li class="currentlyplaying">Louie Louie</li>
  <li>I want to conquer the world</li>
  <li>Foxtrott Uniform Charlie Kilo</li>
</ul>

#playlist li{ padding:.5em; font-family:courier;color:#000; }
#playlist li.currentlyplaying { color:#c00; }

The best thing about Cascading Style Sheets is that they do cascade. Instead of redefining, try to reuse.

Reason#6: Failure to specialize

This reason is quite interesting as it is a bit of a double-edged sword. Code bloat can happen when you don't allow your code to fulfil a special role, but instead try to do everything at once.

This is a problem of dealing with expectations.

The same applies to libraries. In my book, the most important thing a JavaScript library should do is solve the random browser and language problems we have to work around every time we build something with JavaScript. If it does that, I am happy, as I know how to write JavaScript.
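The kind of browser problem meant here is, for example, that older versions of Internet Explorer offer attachEvent where other browsers offer addEventListener. A minimal sketch of a helper that hides this difference could look like the following; listen is a hypothetical name and not taken from any particular library.

```javascript
// Hypothetical helper that registers an event handler in whatever way
// the target supports.
function listen(target, type, handler) {
  if (target.addEventListener) {
    // W3C DOM event model
    target.addEventListener(type, handler, false);
  } else if (target.attachEvent) {
    // older Internet Explorer event model
    target.attachEvent('on' + type, handler);
  } else {
    // last resort: the plain event property
    target['on' + type] = handler;
  }
}
```

A helper of this size solves the browser difference and nothing else; everything built on top of it is still plain JavaScript.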

The most acclaimed JavaScript libraries however go much further; they extend the syntax of scripting, offer pre-made transitions and interfaces into the document via XPATH, CSS Selectors or whatever will come next.

The claim is that these libraries make development easier, but what this approach means is that I need to know the library and its methods to use it. I also need to ensure that the HTML stays the same and that is almost impossible.

Another overhead is that not all libraries use the same method names and parameter order. This makes it harder to jump from one to another and in my book that is not an aid to reach our goal faster.

Right now I can look for a JavaScript developer if I want to hire someone. In the near future, when we have used different libraries on different projects, we will have to find a JavaScript developer who knows all of them. The same thing happened with CMSs in the last few years. The question is whether we want that.

The other problem that very clever scripts have is that they allow for abusing them. Maybe it is not a bad idea that a multi-level-dropdown menu or transition-enhanced slide show with zoom options can be used only once per document and not as often as an inexperienced developer wants to use it. JavaScript should be an aid to create interfaces that help the users, not allow the maintainer to do whatever they want.

Sounds arrogant? Maybe, but I'd rather see my scripts not used than used in a fashion that makes a site inaccessible or grinds my computer down to a halt.

Reason#7: Lack of a front-end build process

My final reason for today is that web development is still not getting the kudos it should be getting. While there are code freezes on the backend and clever build scripts that turn the maintained code into live code, front-end code is still fair game to anybody who wants to have a go. How many times have you had to go into Vi and do a last minute change on a live server, or - even worse - how many times did you come into the office next day and someone else applied a quick fix over night?

HTML, CSS and JavaScript are still not considered real development skills. Everybody who works for a web development company has built a table or written some font tags at some point, and that supposedly makes them good enough to change things to what they seemingly need to be after you've messed up.

There are ways out of that situation, one of them proposed by the IWA/HWG in association with the W3C, allowing web developers to get trained as certified developers. For us this seems ludicrous, but it may give our trade a bit more glitz in the eyes of business owners.

The other proposal I have is to minify the heck out of your code when it gets deployed. Remove all whitespace, comments and indentation. Concatenate different CSS and JS includes into single files - there is a great script by Edward Eliot available for that at http://www.ejeliot.com/blog/73. Use CSS sprites to make sure that not too many images are used. This means that your web sites will be smaller and load faster. They also render faster, as there are fewer HTTP requests when the page is loaded, and the code is much harder to read and mess around with. Once you have minified the code, make sure to add some scary comments. You can either use a real build script that adds build comments or come up with your own inventions. Examples:

<!--  built on 12.03.07 12.33 GMT Checksum E5322AE - OK, build continues  -->
<!--  built on 14.03.07 08.00 Checksum error!- build failed, sent notification mail -->
<!--  built on 14.03.07 08.10 Checksum E5322F3 - fix initiated and successful -->
<!--  verified fix mstephens 14.03.07 09.00 -->  (mstephens should be someone in upper management) 

/* Internet Explorer hack to prevent colour bleed and sync loss*/

/* do not change order or Firefox will go into hasWrongRender mode */

<!-- Whitespace HTML rendering mode on, do not change! -->

/* Last change stopped Google indexing this, please don't change anything! */
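The minify-and-concatenate step itself does not need to be complicated. The following is a deliberately naive sketch for CSS only; minifyCss and build are made-up names, the build comment format is invented, and the regular expressions assume there are no comment markers or significant whitespace inside quoted strings. A real build process would use a proper minifier instead.

```javascript
// Naive CSS minifier: good enough for simple style sheets.
function minifyCss(css) {
  return css
    .replace(/\/\*[\s\S]*?\*\//g, '')   // remove comments
    .replace(/\s+/g, ' ')               // collapse runs of whitespace
    .replace(/\s*([{};,])\s*/g, '$1')   // no whitespace around punctuation
    .replace(/;}/g, '}')                // the last semicolon in a block is optional
    .trim();
}

// Concatenates several style sheets into one string, led by a build comment.
function build(sources, stamp) {
  const body = sources.map(minifyCss).join('');
  return '/* built on ' + stamp + ' */\n' + body;
}

const output = build([
  'a{ color:#333; }',
  '/* nav */\n#nav li{\n  padding:1em .5em;\n}'
], '12.03.07');
console.log(output);
// /* built on 12.03.07 */
// a{color:#333}#nav li{padding:1em .5em}
```

The maintained source files stay readable and commented; only the generated file that goes live is compressed.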

That's all, thank you! All that remains is to remind you of the Open Hack Day on 14th/15th/16th of June in Alexandra Palace. Sign up at http://www.hackday.org.