Intro to Load Balancing for Developers – The Gotchas

Well, now that we have discussed how you got here and what options load balancing offers you, let’s hit upon what you’re up against when porting your application to run under a load balancer. Many resources online say that most applications can be moved behind a load balancer unchanged. I beg to differ with this view: there are a host of issues that crop up even if your application is not retaining state information. So we’ll hop right into it. Note that all of these core issues have solutions, and some have many options for resolving them, but if you don’t know they exist, you’ll be blindsided by them. Forewarned is indeed forearmed.

If you’re new to this series, you can find the complete list of articles in the series on my personal page here

Logging

Your system no doubt uses logs for a variety of things, including management reporting, security auditing, and problem resolution. Most applications do, and load balancing makes utilizing logs harder. The problem is that your tools are all pointed at the logs on server1, and that used to be the entirety of your logs… But now you have logs on server1, server2, server3, and server4, and all must be combined in some way to give you accurate reporting. Some load balancers and most ADCs take care of this for you with centralized logging or even reporting replacement functionality. Some, but not all. There are third-party tools out there for log aggregation, some commercial, some open source. I’d get one and familiarize yourself with it while your application is still in test. If you can make a log aggregation tool work for you, then your code doesn’t need to change at all, and none of your functionality breaks… You just have to point your reporting mechanisms at the location you use to store your aggregated logs. One problem resolved.
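
To give a feel for what an aggregation tool does at its simplest, here is a minimal Python sketch. It leans on assumptions the setup above doesn’t promise: every log line starts with an ISO-8601 timestamp in the same time zone, and each server’s log is already in time order. The names `tag_lines` and `merge_logs` are made up for illustration, and a real aggregator also has to cope with clock skew, log rotation, and differing formats.

```python
import heapq
from typing import Dict, Iterable, Iterator, List, Tuple


def tag_lines(server: str, lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (timestamp, server, message) for each non-empty log line.

    Assumption: the first space-separated field is an ISO-8601 timestamp,
    so plain string comparison orders the entries by time.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        timestamp, _, message = line.partition(" ")
        yield (timestamp, server, message)


def merge_logs(logs: Dict[str, Iterable[str]]) -> List[str]:
    """Merge per-server logs into one time-ordered list of lines.

    Assumption: each individual log is already sorted by timestamp,
    which is what heapq.merge requires of its inputs.
    """
    streams = [tag_lines(server, lines) for server, lines in logs.items()]
    return [f"{ts} {server} {msg}" for ts, server, msg in heapq.merge(*streams)]


if __name__ == "__main__":
    sample = {
        "server1": ["2009-03-25T10:00:00Z GET /a", "2009-03-25T10:00:05Z GET /c"],
        "server2": ["2009-03-25T10:00:02Z GET /b"],
    }
    for entry in merge_logs(sample):
        print(entry)
```

Tagging each line with the server name keeps the one piece of information you lose when the files are combined: which box handled the request.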

Client IP tracking

Lori and I have an application that records the IP address of those who log in, which sits behind a load balancer (an ADC, actually). While the IP address recording was just thrown in there because we could, I actually started using that information for reporting on people logging in as guest. All it does (and all most functions of this type do) is pull the IP address out of the headers and throw it into a database. The problem is that our load balancer is a proxy for all users. That allows us to expose the Virtual IP and not actually expose any of our servers to the world except through the IP/Ports we dictate. Unfortunately, by virtue of being a full proxy, it replaces the source IP address with the load balancer’s own IP address. So it appeared that everyone in the world was logging into our app from the load balancer. Not the best situation for the reporting I was doing, and really not the best for our web server logs.

The easy fix for this one is to change your source code to work off of the X-Forwarded-For header and make certain that your load balancer is configured to insert it. Sadly, some load balancers don’t support this header, so you’ll have to think of something more inventive in those cases or eliminate the need to track the IP of users.
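
As a rough illustration of the code change, here is a small Python sketch. By common convention the X-Forwarded-For value is a comma-separated list with the original client first and each proxy appending the address it received the request from. The function name `client_ip` and the `trusted_proxies` set are hypothetical, and the trust rule (only believe the header when the connection came from your own load balancer) is an assumption added here because clients can send this header themselves.

```python
import ipaddress
from typing import Dict, Set


def client_ip(headers: Dict[str, str], remote_addr: str, trusted_proxies: Set[str]) -> str:
    """Return the best guess at the real client address.

    headers         -- request headers as a plain dict (hypothetical shape)
    remote_addr     -- address of the peer that opened the connection
    trusted_proxies -- addresses of your own load balancers (assumption)
    """
    # If the connection did not come from our load balancer, the header
    # could have been written by anyone, so ignore it.
    if remote_addr not in trusted_proxies:
        return remote_addr

    # Header names are case-insensitive, so look the header up that way.
    forwarded = ""
    for name, value in headers.items():
        if name.lower() == "x-forwarded-for":
            forwarded = value
            break

    # Walk the list from the right, skipping our own proxies; the first
    # address that is not ours is the one our load balancer actually saw.
    for candidate in reversed([part.strip() for part in forwarded.split(",")]):
        if not candidate:
            continue
        try:
            ipaddress.ip_address(candidate)
        except ValueError:
            break  # malformed entry: fall back to the connection address
        if candidate not in trusted_proxies:
            return candidate
    return remote_addr


if __name__ == "__main__":
    print(client_ip({"X-Forwarded-For": "203.0.113.7"}, "10.0.0.1", {"10.0.0.1"}))
```

Whatever framework you use, the same idea applies: log and store this value instead of the raw connection address.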

Persistence

Whoo boy, there are few words to make a developer with experience developing behind a load balancer shudder like persistence. Here’s the deal: if your app is tracking state, and that state is stored on the web/app server, then when the user returns and gets directed to a different server, you’ve lost all context for their experience. There are a variety of ways to fix this in any load-balanced environment, but they either aren’t optimal or require not just recoding, but re-architecting portions of your application. If you don’t maintain state, or use the browser to maintain state for you by passing it back with each request, then this is a total non-issue for you and you can move along. Because the client will supply context info each time it returns, you don’t have to worry about which server you’re going to.

Which brings us back to the first option – use the browser to track state. In large applications with lots of database interaction this option isn’t feasible, but in smaller applications that just have controls on a page being fed back with each submit, you can do this rather readily. It is more difficult in newer applications, such as AJAX apps and those using advanced .NET functionality, but it’s wholly doable. It just takes some forethought about how your application is used and what goes where.
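
Here is one hedged way browser-tracked state might look in Python: the state is serialized into a token that the page sends back (as a cookie or hidden field), and any server can read it. The signing step and the `SECRET_KEY` shared by all servers are assumptions added for this sketch, since a client can otherwise edit the state; the function names are hypothetical. Note that the token is signed, not encrypted, so don’t put secrets in it.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

# Assumption: every server behind the load balancer is configured
# with the same key, so any of them can verify a token.
SECRET_KEY = b"replace-with-a-key-shared-by-all-servers"


def pack_state(state: dict) -> str:
    """Serialize state into a signed token to hand to the browser."""
    payload = base64.urlsafe_b64encode(
        json.dumps(state, sort_keys=True).encode("utf-8")
    )
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def unpack_state(token: str) -> Optional[dict]:
    """Return the state from a token, or None if it is missing or tampered with."""
    payload, separator, signature = token.rpartition(".")
    if not separator:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison of the two signatures.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


if __name__ == "__main__":
    token = pack_state({"step": 3, "cart": [1, 2]})
    print(unpack_state(token))
```

Because nothing is stored on the server, it makes no difference which server the load balancer picks for the next request.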

The easiest solution to this whole problem is the sidearm/server affinity/persistent connections group of techniques. All of these options let you pass off a request to the server, and then always return to the same server (though how you return is different for each), but this introduces some issues of its own. For one, the ability to balance load amongst your servers is reduced because the load balancer makes its decisions only on the first trip to the server. With some advanced load balancing algorithms that take server feedback, these technologies can actually negate the benefits of load balancing. Still, this is the right solution for apps that have a pretty evenly spread load across all pages, so consider your client use cases and think about whether one of these technologies will solve your problem.
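
To make the trade-off concrete, here is a toy Python sketch of server affinity. It is not how any particular load balancer implements it: the `StickyBalancer` class, the use of round robin for the first decision, and the `client_id` key (think source IP or a cookie value) are all assumptions for illustration.

```python
import itertools
from typing import Dict, Iterable


class StickyBalancer:
    """Toy load balancer that pins each client to the first server it was given."""

    def __init__(self, servers: Iterable[str]) -> None:
        # Round robin is used only for a client's first request.
        self._next_server = itertools.cycle(list(servers))
        self._assignments: Dict[str, str] = {}

    def route(self, client_id: str) -> str:
        """Return the server for this client, choosing one only on the first trip."""
        if client_id not in self._assignments:
            self._assignments[client_id] = next(self._next_server)
        return self._assignments[client_id]


if __name__ == "__main__":
    lb = StickyBalancer(["server1", "server2"])
    for client in ["alice", "bob", "alice", "alice", "bob"]:
        print(client, "->", lb.route(client))
```

Notice that the balancing decision happens exactly once per client. If alice turns out to be a heavy user, every one of her requests still lands on the same server no matter how busy it gets.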

Another solution is to shift the storage of per-connection persisted data to the database. Unless you have a pretty high-end database server, both in hardware and software, this just moves the problem. If you’re running something like Oracle RAC, it’s a viable option, but if you’ve got a single-instance database on a low-to-mid-tier commodity server, you’re probably not going to be satisfied with this solution. It takes code to implement, and if you rearrange your code and then at the end discover that you have just switched the load and the single point of failure to your database, you will likely not be a happy camper. Thus, I don’t recommend this course, though in some situations it might be the right one.
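
For a sense of how much code this takes, here is a minimal Python sketch of a database-backed session store. SQLite from the standard library stands in for whatever shared database you actually run; the table layout and the function names are hypothetical, and a real deployment would need a database all of your app servers can reach, plus expiry of old sessions.

```python
import json
import sqlite3
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the session store and create its table if needed.

    Assumption: SQLite is only a stand-in here; in production this would
    be a connection to a database shared by every app server.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sessions ("
        "session_id TEXT PRIMARY KEY, data TEXT NOT NULL)"
    )
    return conn


def save_session(conn: sqlite3.Connection, session_id: str, state: dict) -> None:
    """Insert or overwrite the stored state for a session."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO sessions (session_id, data) VALUES (?, ?)",
            (session_id, json.dumps(state)),
        )


def load_session(conn: sqlite3.Connection, session_id: str) -> Optional[dict]:
    """Return the stored state for a session, or None if there is none."""
    row = conn.execute(
        "SELECT data FROM sessions WHERE session_id = ?", (session_id,)
    ).fetchone()
    return json.loads(row[0]) if row else None


if __name__ == "__main__":
    store = open_store()
    save_session(store, "abc123", {"user": "guest", "step": 2})
    print(load_session(store, "abc123"))
```

Every request now costs at least one extra database round trip, which is exactly the load shift described above.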

Finally, you could rewrite the app to not utilize state at all. This is more work if your app is already finished… But it is the most robust of all of these solutions. If you’re just designing an app that you hope to be huge, avoid the Fail Whale and write it this way from day one.

SSL persistence

Another nasty bit, which is very similar to the persistence issue above but has unique problems of its own, is SSL persistence. Yes indeedy, it’s passingly difficult to decrypt a stream that was encrypted for another server’s public key. And while this issue can be resolved by giving all the app servers that run a particular application the same cert (this is done, though I suspect SANS doesn’t approve), there are other issues. Like the fact that when a client comes back and is directed by the load balancer to a different server, there is no existing connection, so the client and server have to renegotiate. Some load balancers and most ADCs provide SSL termination to resolve this issue, terminating the SSL session at the load balancer and communicating from the load balancer to the backend server in the clear, because it is all on your private network. For security reasons, in some applications this is not a viable option, and even if it is, you need to check with your load balancing vendor to see if they support this mechanism. The most common solution to this problem is the sidearm/server affinity/persistent connections set of solutions mentioned above, because once a client connects to a server it is always redirected to that same server and all of these issues go away. Just test the effect this will have on your load balancing algorithm before going this route.

Other options and issues

Of course in this short blog post I can’t hit everything, but these are the major issues I’ve seen. And I’m not touching on the things that a full-blown ADC can do for you that load balancers don’t. I think I’ll take next week’s blog to review load balancing algorithms so you know which does what, then we’ll start to peel away the power of an ADC – which really is amazing in comparison to simple layer 4 (commonly called L4 by networking folks) load balancing.

Until next time,

Don.

Published Mar 25, 2009
Version 1.0

3 Comments

  • Don_MacVittie_1
    Historic F5 Account
    Hey Sriram!

    Very cool that you're getting things done that are moving into production!

    It's kind of hidden. Once you know where to look, it makes sense, but the first time it's a little painful. Log into the GUI, choose pools, select any pool, and then choose members. It's the first dropdown.

    I can argue that makes sense because the members are what is balanced... But I expected to find it at the top level of Pool when I first went looking.

    Tomorrow's article is about the different algorithms - all of the ones we support and ones with a following that we don't, mostly those used by popular software vendors.

    Hope that helps!

    Don.
  • Don_MacVittie_1
    Historic F5 Account
    Albert,

    Thanks for commenting! Mentioning RAC was certainly not meant as an exclusive reference, just wanted to mention someone most of my readers would recognize. It probably helps that they're a partner that I've been working with for the last few months, so RAC sprang to mind while writing.

    I just read the database scale-out document on your site, looks great, though that's of course a first-blush impression.

    Your example leaves me thinking of queue processing... How to ensure that only one server is servicing a DB-based queue... Hmmm. ;-)

    Thanks again,

    Don.
  • Don_MacVittie_1
    Historic F5 Account
    Rick:

    Agreed, but wanted to leave that for the ADC discussion because it is a more advanced solution. Will check out your stuff before I get there.

    Regards,

    Don.