Microservices and HTTP/2
It's all about that architecture. There's a lot of things we do to improve the performance of web and mobile applications. We use caching. We use compression. We offload security (SSL and TLS) to a proxy with greater compute capacity. We apply image optimization and minification to content. We do all that because performance is king. Failure to perform can be, for many businesses, equivalent to an outage with increased abandonment rates and angry customers taking to the Internet to express their extreme displeasure. The recently official HTTP/2 specification takes performance very seriously, and introduced a variety of key components designed specifically to address the need for speed. One of these was to base the newest version of the Internet's lingua franca on SPDY. One of the impacts of this decision is that connections between the client (whether tethered or mobile) and the app (whether in the cloud or on big-iron) are limited to just one. One TCP connection per app. That's a huge divergence from HTTP/1 where it was typical to open 2, 4 or 6 TCP connections per site in order to take advantage of broadband. And it worked for the most part because, well, broadband. So it wouldn't be a surprise if someone interprets that ONE connection per app limitation to be a negative in terms of app performance. There are, of course, a number of changes in the way HTTP/2 communicates over that single connection that ultimately should counteract any potential negative impact on performance from the reduction in TCP connections. The elimination of the overhead of multiple DNS lookups (not insignificant, by the way) as well as TCP-related impacts from slow start and session setup as well as a more forgiving exchange of frames under the covers is certainly a boon in terms of application performance. 
The ability to just push multiple responses to the client without having to play the HTTP acknowledgement game is significant in that it eliminates one of the biggest performance inhibitors of the web: latency arising from too many round trips. We've (as in the corporate We) seen gains of 2-3 times the performance of HTTP/1 with HTTP/2 during testing. And we aren't alone; there's plenty of performance testing going on out there, on the Internets, that is showing similar improvements. Which is why it's important (very important) that we not undo all the gains of HTTP/2 with an architecture that mimics the behavior (and performance) of HTTP/1.
Domain Sharding and Microservices
Before we jump into microservices, we should review domain sharding because the concept is important when we look at how microservices are actually consumed and delivered from an HTTP point of view. Scalability patterns (i.e. architectures) include the notion of Y-axis scale, which is a sharding-based pattern. That is, it creates individual scalability domains (or clusters, if you prefer) based on some identifiable characteristic in the request. User identification (often extracted from an HTTP cookie) and URL are the pieces of information most commonly used to shard requests and distribute them to achieve greater scalability. An incarnation of the Y-axis scaling pattern is domain sharding. Domain sharding, for the uninitiated, is the practice of distributing content to a variety of different host names within a domain. This technique was (and probably still is) very common to overcome connection limitations imposed by HTTP/1 and its supporting browsers. You can see evidence of domain sharding when a web site uses images.example.com and scripts.example.com and static.example.com to optimize page or application load time.
Connection limitations were by host (origin server), not domain, so this technique was invaluable in achieving greater parallelization of data transfers that made it appear, at least, that pages were loading more quickly. Which made everyone happy. Until mobile came along. Then we suddenly began to realize the detrimental impact of introducing all that extra latency (every connection requires a DNS lookup, a TCP handshake, and suffers the performance impacts of TCP slow start) on a device with much more limited processing (and network) capability. I'm not going to detail the impact; if you want to read about it in more detail I recommend reading some material from Steve Souders and Tom Daly or Mobify on the subject. Suffice to say, domain sharding has an impact on mobile performance, and it is rarely a positive one. You might think, well, HTTP/2 is coming and all that's behind us now. Except it isn't. Microservice architectures in theory, if not in practice, are ultimately a sharding-based application architecture that, if we're not careful, can translate into a domain sharding-based network architecture that ultimately negates any of the performance gains realized by adopting HTTP/2. That means the architectural approach you (that's you, ops) adopt to delivering microservices can have a profound impact on the performance of applications composed from those services. The danger is not that each service will be its own (isolated and localized) "domain", because that's the whole point of microservices in the first place. The danger is that those isolated domains will be presented to the outside world as individual, isolated domains, each requiring its own personal, private connection by clients.
Even if we assume there are load balancing services in front of each service (a good assumption at this point) that still means direct connections between the client and each of the services used by the client application because the load balancing service acts as a virtual service, but does not eliminate the isolation. Each one is still its own "domain" in the sense that it requires a separate, dedicated TCP connection. This is essentially the same thing as domain sharding as each host requires its own IP address to which the client can connect, and its behavior is counterproductive to HTTP/2*. What we need to do to continue the benefits of a single, optimized TCP connection while being able to shard the back end is to architect a different solution in the "big black box" that is the network. To be precise, we need to take advantage of the advanced capabilities of a proxy-based load balancing service rather than a simple load balancer.
An HTTP/2 Enabling Network Architecture for Microservices
That means we need to enable a single connection between the client and the server and then utilize capabilities like Y-axis sharding (content switching, L7 load balancing, etc.) in "the network" to maintain the performance benefits of HTTP/2 to the client while enabling all the operational and development benefits of a microservices architecture. What we can do is insert a layer 7 load balancer between the client and the local microservice load balancers. The client side maintains a single connection in the manner specified (and preferred) by HTTP/2 and requires only a single DNS lookup, one TCP session start up, and incurs the penalties from TCP slow start only once. On the service side, the layer 7 load balancer also maintains persistent connections to the local, domain load balancing services, which also reduces the impact of session management on performance.
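The routing decision made by that layer 7 load balancer can be sketched in a few lines. This is only an illustration in Python, not any product's configuration: the path prefixes, the pool names (cart-lb, catalog-lb, users-lb, web-lb) and the select_pool function are all made up for the example. The point is simply that one client-facing connection can fan out to per-service load balancers based on the URL.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str  # URL path prefix that identifies a service
    pool: str    # name of the local load balancer for that service


# Hypothetical mapping of API path prefixes to per-service load balancers.
ROUTES = [
    Route("/api/cart", "cart-lb"),
    Route("/api/catalog", "catalog-lb"),
    Route("/api/users", "users-lb"),
]
DEFAULT_POOL = "web-lb"  # hypothetical catch-all pool


def select_pool(path: str, routes=ROUTES, default: str = DEFAULT_POOL) -> str:
    """Pick the back-end pool for a request path; the longest matching prefix wins."""
    matches = [
        r for r in routes
        if path == r.prefix or path.startswith(r.prefix + "/")
    ]
    if not matches:
        return default
    return max(matches, key=lambda r: len(r.prefix)).pool
```

With these assumed routes, select_pool("/api/cart/42") picks cart-lb while anything unrecognized falls through to web-lb. The client never sees the difference; it still talks to one host over one connection.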
Each of the local, domain load balancing services can be optimized to best distribute requests for each service. Each maintains its own algorithm and monitoring configurations which are unique to the service to ensure optimal performance. This architecture is only minimally different from the default, but the insertion of a layer 7 load balancer capable of routing application requests based on a variety of HTTP variables (such as the cookies used for persistence or to extract user IDs or the unique verb or noun associated with a service from the URL of a RESTful API call) results in a network architecture that closely maintains the intention of HTTP/2 without requiring significant changes to a microservice based application architecture. Essentially, we're combining X- and Y-axis scalability patterns to architect a collaborative operational architecture capable of scaling and supporting microservices without compromising on the technical aspects of HTTP/2 that were introduced to improve performance, particularly for mobile applications. Technically speaking we're still doing sharding, but we're doing it inside the network and without breaking the one TCP connection per app specified by HTTP/2. Which means you get the best of both worlds - performance and efficiency.
Why DevOps Matters
The impact of new architectures - like microservices - on the network and the resources (infrastructure) that deliver those services is not always evident to developers or even ops. That's one of the reasons DevOps as a cultural force within IT is critical: it engenders a breaking down of the isolated silos between ops groups that exist (all four of them) and enables greater collaboration that leads to more efficient deployment, yes, but also more efficient implementations. Implementations that don't necessarily cause performance problems that require disruptive modification to applications or services.
Collaboration in the design and architectural phases will go a long way towards improving not only the efficacy of the deployment pipeline but the performance and efficiency of applications across the entire operational spectrum.
* It's not good for HTTP/1, either, as in this scenario there is essentially no difference** between HTTP/1 and HTTP/2.
** In terms of network impact. HTTP/2 still receives benefits from its native header compression and other performance benefits.
This Simple Trick Can Net You Faster Apps
I am often humbled by the depth of insight of those who toil in the trenches of the enterprise data center. At our Agility conference back in August, my cohort and I gave a presentation on the State of Application Delivery. One of the interesting tidbits of data we offered was that, over the course of the past year, our iHealth data shows a steady and nearly even split of HTTP and HTTPS traffic. To give you an example, my data from October was derived from over 3 million (3,087,211 to be precise) virtual servers. Of those, roughly 32% were configured to support HTTP, and another 30% were supporting HTTPS. Now, I’ve been looking at this data for more than a year, and it has stayed roughly the same with only slight variations up or down, but always within a couple percentage points of each other. But it wasn’t until a particularly astute customer spoke up that I understood why that split existed in the first place. After all, the rise of SSL Everywhere is well-documented. Our own data supports it, industry data supports it, and the move to support only TLS-enabled connections from browser via HTTP/2 is forcing it. But why, then, the split? “Redirects,” the customer told me, giving me a look that seemed to question how I had not seen that before. Indeed. The Curse of Knowledge strikes again. Once elucidated, it seems obvious. And of course, sites are going to encourage HTTPS but they aren’t going to sacrifice their web presence in doing so. That means gently herding millions of customers who have been taught to type in “http” to a more secure site. That’s what redirects do. But they do more than just enable a more secure application experience. They add the application experience’s evil nemesis to the equation. That’s right. [cue dramatic, spine-tingling music] Latency. You see, a redirect tells the browser “you know, you should load this URI instead”. And then the browser says, “okay, I’ll do that.” And then it has to basically start over.
The existing TCP connection (to the HTTP port) can’t be reused. A new one is needed, which means repeating the TCP handshake and then negotiating TLS or SSL on top of it. All this adds up to more time. It negatively affects the application experience by dragging out the connection process. This is particularly noticeable on mobile connections, where compute and bandwidth are often constrained, leading to “hanging pages” and other horrific web app loading experiences. Poor performance leads to abandonment. Abandonment leads to loss of revenue or conversions. And loss of either leads to, well, not a good place. But I wouldn’t be offering commentary on a problem if I didn’t have a solution cause, Midwestern gal here. Turns out you can eliminate redirects and their negative effect on the web application experience a couple of ways. First, and for those security minded folks the best, use HTTP Strict Transport Security (HSTS) headers instead. Once responses are received with HSTS headers, the browser is forced to subsequently behave in a manner compliant with the policy imparted. For example, it will automatically change any insecure (http) links to secure (https) links. That means http://mydomain.com/mystuff/ will automatically become https://mydomain.com/mystuff/. Once a browser sees an HSTS header from a site, it will not use HTTP for that site again for as long as the policy is valid. Even if you type it into the address bar and try to force it, it will refuse to do so, instead replacing it with HTTPS and making the request securely. By specifying a really long “max-age”, say a year (that’s 31,536,000 seconds for one non-leap year), you eliminate the drag on performance from future redirects, and ensure a faster, more pleasant application experience for not only mobile users, but all users. It’s just more likely that mobile customers will actually notice a difference, given the differences between mobile and tethered connectivity. Another option is to ensure that you aren’t relying on temporary redirects (HTTP 302).
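For the HSTS route, the header itself is tiny. As a rough sketch (Python is used here purely for illustration, and hsts_header is a made-up helper name, not part of any product), the one-year policy mentioned above works out like this:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def hsts_header(days: int = 365, include_subdomains: bool = False) -> tuple:
    """Build a Strict-Transport-Security header for a policy lasting `days` days."""
    value = f"max-age={days * SECONDS_PER_DAY}"
    if include_subdomains:
        # Optional directive: apply the policy to every subdomain as well.
        value += "; includeSubDomains"
    return ("Strict-Transport-Security", value)
```

Calling hsts_header() with the defaults yields a max-age of 31536000, the one-year figure quoted above. One caveat worth knowing: browsers only honor this header when it arrives over HTTPS, so the very first plain-HTTP visit still needs a redirect to get there. That makes the kind of redirect you use for that first hop matter too.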
You want to make sure you’re at least using permanent redirects (HTTP 301) to force browsers to use the secure location for as long as possible in the future. Permanent redirects are cached locally, so they can be lost due to cache cleaning, but they’re better than temporary redirects. Worried about the operational cost to update every web application server? Fear not, header insertion is (or should be) a basic capability of any application delivery solution you’re using for load balancing or web application security services. They can insert headers transparently into an HTTP response with a few lines of configuration or code, reducing the effort required to virtually (heh, pardon my pun) nothing. Neither the user nor the application should notice anything except for an improvement in overall performance. It’s a simple change, but one that can have a noticeable impact on the application experience (a.k.a. web performance).
Caching for Faster APIs
Pop quiz: Do you know what the number one driver cited in 2016 for networking investments was? No peeking! If you guessed speed and performance, you guessed right. If you guessed security don’t feel bad, it came in at number two, just ahead of availability. Still, it’s telling that the same things that have always driven network upgrades, improvements, and architectures continue to do so. We want fast, secure, and reliable networks that deliver fast, secure, and reliable applications. Go figure. The problem is that a blazing fast, secure, and reliable network does not automatically translate into a fast, secure, and reliable application. But it can provide a much needed boost. And I’m here to tell you how (and it won’t even cost you shipping and handling fees). The thing is that there have long been web app server (from which apps and APIs are generally delivered) options for both caching and compression. The other thing is that they’re often not enabled. Caching headers are part of the HTTP specification. They’re built in, but that means they’re packaged up with each request and response. So if a developer doesn’t add them, they aren’t there. Except when you’ve got an upstream, programmable proxy with which you can insert them. Cause when we say software-defined, we really mean software-defined. As in “Automation is cool and stuff, but interacting with requests/responses in real-time is even cooler.” So, to get on with it, there are several mechanisms for managing caching within HTTP, two of which are: ETag and Last-Modified.
ETag
The HTTP header “ETag” contains a hash or checksum that can be used to compare whether or not content has changed. It’s like the MD5 signature on compressed files or RPMs. While MD5 signatures are usually associated with security, they can also be used to determine whether or not content has changed. In the case of browser caching, the browser can make a request that says “hey, only give me new content if it’s changed”.
The server-side uses the ETag to determine if it has and if not, sends back an empty HTTP 304 response. The browser says “Cool” and pulls the content from its own local cache. This saves on transfer times (by reducing bandwidth and round trips if the content is large) and thus improves performance.
Last-Modified
This is really the same thing as an ETag but with timestamps, instead. Browsers ask to be served new content if it has been modified since a specific date. This, too, saves on bandwidth and transfer times, and can improve performance. Now, these mechanisms were put into place primarily to help with web-based content. Caching images and other infrequently changing presentation components (think style-sheets, a la CSS) can have a significant impact on performance and scalability of an application. But we’re talking about APIs, and as we recall, APIs are not web pages. So how do HTTP’s caching options help with APIs? Well, very much the same way, especially given that most APIs today are RESTful, which means they use HTTP. If I’ve got an app (and I’ve got lots of them) that depends on an API there are still going to be a lot of content types that are similar, like images. Those images can (and should) certainly be cached when possible, especially if the app is a mobile one. Data, too, for frequently retrieved content can be cached, even if it is just a big blob of JSON. Consider the situation in which I have an app and every day the “new arrivals” are highlighted. But they’re only updated once a day, or on a well-known schedule. The first time I open the menu item to see the “new arrivals”, the app should certainly go get the new content, because it’s new. But after that, there’s virtually no reason for the app to go requesting that data. I already paid the performance price to get it, and it hasn’t changed: neither the JSON objects representing the individual items nor the thumbnails depicting them.
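That “only give me new content if it’s changed” exchange looks roughly like the following on the server side. This is a simplified Python sketch, not a full implementation of HTTP conditional requests: make_etag and respond are names made up for the example, the ETag is assumed to be a SHA-256 hash of the body, and only a single exact If-None-Match value is handled (no lists, no weak validators).

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive an ETag from the response body (quoted, as HTTP expects)."""
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def respond(body: bytes, if_none_match: Optional[str] = None):
    """Return (status, headers, payload) for a GET, honoring If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None and if_none_match.strip() == etag:
        # Content unchanged: empty 304, the client serves its cached copy.
        return 304, headers, b""
    return 200, headers, body
```

The first request for the “new arrivals” JSON gets a 200 with the full payload and an ETag. Every later request that sends that ETag back in If-None-Match gets an empty 304 until the content actually changes.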
Using HTTP caching headers and semantics, I can ask “have you changed this yet?” and the server can quickly respond “Not yet.” That saves subsequent trips back and forth to download data while I click on fourteen different pairs of shoes* off the “new arrivals” list and then come back to browse for more. If the API developer hasn’t added the appropriate HTTP cues in the headers, however, you’re stuck grabbing and regrabbing the same content and wasting bandwidth as well as valuable client and server-side resources. An upstream programmable proxy can be used to insert them, however, and provide both a performance boost (for the client) and greater scalability (for the server). Basically, you can insert anything you want into the request/response using a programmable proxy, but we’ll focus on just HTTP headers right now. The basic pattern is:
    # ETag is a response header, so insert it on the way back to the client
    when HTTP_RESPONSE {
        HTTP::header insert "ETag" "my-computed-value"
    }
Really, that’s all there is to it. Now, you probably want some logic in there to not override an existing header because if the developer put it in, there’s a good reason. This is where I mini-lecture you on the cultural side of DevOps and remind you that communication is as critical as code when it comes to improving the deployment and delivery of applications. And there’s certainly going to be some other stuffs that go along with it, but the general premise is that the insertion of caching-related HTTP headers is pretty simple to achieve. For example, we could insert a Last-Modified header for any JPG image:
    when HTTP_RESPONSE {
        # MIME types are conventionally lowercase: image/jpeg
        if { [HTTP::header "Content-Type"] equals "image/jpeg" } {
            HTTP::header insert "Last-Modified" "timestamp value"
        }
    }
We could do the same for CSS, or JS, as well. And we could get more complex and make decisions based on a hundred other variables and conditions. Cause, software-defined delivery kinda means you can do whatever you need to do.
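The “don’t override what the developer already set” logic mentioned above is about one extra check. Here it is as a language-neutral illustration in Python rather than proxy code; insert_if_absent is a made-up helper, and headers are assumed to live in a plain dictionary. Header names are compared case-insensitively because HTTP treats them that way.

```python
def insert_if_absent(headers: dict, name: str, value: str) -> dict:
    """Return a copy of `headers` with `name` added only if it is not already set."""
    present = {key.lower() for key in headers}
    result = dict(headers)
    if name.lower() not in present:
        result[name] = value  # nothing there yet, so the proxy fills it in
    return result
```

If the app already sent an ETag (however it capitalized the name), the response passes through untouched. If it didn’t, the proxy’s value goes in.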
Another reason a programmable proxy is an excellent option in this case is because it further allows you to extend HTTP with unofficial functionality when servers do not. For example, there’s an unofficial “PURGE” method that’s used by Varnish for invalidating cache entries. Because it’s unofficial, it’s not universally supported by the web servers on which APIs are implemented. But a programmable proxy could be used to implement that functionality on behalf of the web server (cause that’s what proxies do) and relieve pressure on web servers to do so themselves. That’s important when external caches like memcached and Varnish enter the picture. Because sometimes it’s not just about caching on the client, but in the infrastructure. In any case, HTTP caching mechanisms can improve performance of APIs, particularly when they are returning infrequently changing content like images or static text. Not taking advantage of them is a lost opportunity.
* you shop for what you want, I’ll shop for shoes.
Optimizing IoT and Mobile Communications with TCP Fast Open
There's a lot of focus on the performance of mobile communications given the incredible rate at which mobile is outpacing legacy PC (did you ever think we'd see the day when we called it that?) usage. There's been tons of research on the topic ranging from the business impact (you really can lose millions of dollars per second of delay) to the technical mechanics of how mobile communications are impacted by traditional factors like bandwidth and RTT. Spoiler: RTT is more of a factor than is bandwidth in improving mobile app performance. The reason behind this isn't just because mobile devices are inherently more resource constrained or that mobile networks are oversubscribed or that mobile communications protocols simply aren't as fast as your mega super desktop connection; it's also because mobile apps (native ones) tend toward the use of APIs and short bursts of communication. Grab this, check for an update on that, do this quick interaction and return. These are all relatively small in terms of data transmitted, which means that the overhead from establishing a connection can actually take more time than the actual exchange. The RTT incurred by the three-step handshake slows things down. That same conundrum will be experienced by smart "things" that connect for a quick check-in to grab or submit small chunks of data. The connection will take longer than the data transmission, which seems, well, inefficient, doesn't it? Apparently other folks thought so too, and hence we have in Internet Draft form a proposed TCP mechanism to alleviate the impact of this overhead known as "TCP Fast Open".
TCP Fast Open Draft @ IETF
"This document describes an experimental TCP mechanism TCP Fast Open (TFO). TFO allows data to be carried in the SYN and SYN-ACK packets and consumed by the receiving end during the initial connection handshake, and saves up to one full round trip time (RTT) compared to the standard TCP, which requires a three-way handshake (3WHS) to complete before data can be exchanged. However TFO deviates from the standard TCP semantics since the data in the SYN could be replayed to an application in some rare circumstances. Applications should not use TFO unless they can tolerate this issue detailed in the Applicability section."
The mechanism relies on the existence of a cookie deposited with the client that indicates a readiness and willingness (and perhaps even a demand) to transmit some of the data in the initial SYN and SYN-ACK packets of the TCP handshake. The cookie is generated by the app (or gateway, the endpoint) upon request from the client. There's no special TCP behavior on this request, so it seems likely this would be handled during the "initial setup" of a thing. On subsequent communications in which the TFO cookie is present, the magic happens. The app (or gateway, the endpoint) recognizes the cookie and is able to grab the data and start processing - before the initial handshake is even complete. While the use of 'cookies' is more often associated with HTTP, it is also found within the realm of TCP (SYN cookies are a popular means of attempting to detect and prevent SYN flood attacks). Needless to say, such a mechanism is particularly of interest to service providers as their networks often act as gateways to the Internet for mobile devices. Reducing the time required for short-burst communications ultimately reduces the connections that must be maintained in the mobile network, thus relieving some pressure on the number of proxies - virtual or not - required to support the growing number of devices and things needing access. A word of caution, however. TFO is not designed for nor meant to be used for every application.
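To put the potential saving in numbers before looking at where it applies, here is a deliberately idealized back-of-the-envelope model in Python. It is not a measurement and not socket code: time_to_first_response_ms is a made-up function, and it assumes the client already holds a valid TFO cookie, the whole request fits in the SYN, and nothing is lost or retransmitted.

```python
def time_to_first_response_ms(rtt_ms: float, server_ms: float = 0.0,
                              fast_open: bool = False) -> float:
    """Idealized time from connect to first response byte for one small request."""
    # Standard TCP spends one round trip on the handshake before data flows;
    # with Fast Open the request rides along in the SYN, so that trip is saved.
    handshake_rtts = 0 if fast_open else 1
    request_rtts = 1  # request out, response back
    return (handshake_rtts + request_rtts) * rtt_ms + server_ms
```

On a 100 ms mobile round trip, a quick check-in goes from 200 ms to 100 ms in this model, which is the "up to one full round trip time" the draft talks about. For a thing that sends one small message and disconnects, that is half the wait.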
The draft clearly spells out applicability as being to those applications where initial requests from the client are smaller than the TCP maximum segment size (MSS). This is because otherwise the server still has to wait until after the handshake completes to gather the rest of the data and formulate a response. Thus any performance benefit would be lost. Proceed with careful consideration, therefore, in applying the use of TCP Fast Open but do consider it, particularly if data sets are small, as may be the case with things reporting in or checking for updates.
Measuring and Monitoring: Apps and Stacks
One of the charter responsibilities of DevOps (because it's a charter responsibility of ops) is measuring and monitoring applications once they're in production. That means both performance and availability. Which means a lot more than folks might initially think because generally speaking what you measure and monitor is a bit different depending on whether you're looking at performance or availability*. There are four primary variables you want to monitor and measure in order to have the operational data necessary to make any adjustments necessary to maintain performance and availability:
Connectivity
This determines whether or not upstream devices (ultimately, the client) can reach the app (IP). This is the most basic of tests and tells you absolutely nothing about the application except that the underlying network is reachable. While that is important, of course, connectivity is implied by the successful execution of monitors up the stack and thus the information available from a simple connectivity test is not generally useful for performance or availability monitoring. ICMP pings can also be detrimental in that they generate traffic and activity on systems that, in hyper-scale environments, can actually negatively impact performance.
Capacity
This measure is critical to both performance and availability, and measures how close to "full" the connection capacity (TCP) of a given instance is. These variables are measured against known values usually obtained during pre-release stress / load tests that determine how many connections an app instance can maintain before becoming overwhelmed and performance degrades.
App Status
This simple but important measure determines whether the application (the HTTP stack) is actually working. This is generally accomplished by sending an HTTP request and verifying that the response carries an HTTP 200 status. Any other response is generally considered an error. Systems can be instructed to retry this test multiple times and after a designated number of failures, the app instance is flagged as out of service.
Availability
This is often ignored but is key to determining if the application is responding correctly or not. This type of monitoring requires that the monitor be able to make a request and compare the actual results against a known "good" result. These are often synthetic transactions that test the app and its database connectivity to ensure that the entire stack is working properly.
App Status and Availability can be measured either actively or passively (in band). When measured actively, a monitor initiates a request to the application and verifies its response. This is a "synthetic" transaction; a "fake" transaction used to measure performance and availability. When measured passively, a monitor spies on real transactions and verifies responses without interference. It is more difficult to measure availability based on application content verification with a passive monitor than an active one as a passive monitor is unlikely to be able to verify responses against known ones because it doesn't control what requests are being made. The benefit of a passive monitor is that it isn't consuming resources on the app instance in order to execute a test and it is measuring real performance for real users. You'll notice that there's a clear escalation "up the stack" from IP -> TCP -> HTTP -> Application. That's not coincidental. Each layer of the stack is a critical component in the communication that occurs between a client and the application. Each one provides key information that is important to measuring both performance and availability. The thing is that while the application may be responsible for responding to queries about its status in terms of resource utilization (CPU, memory, I/O), everything else is generally collected external to the application, from an upstream service.
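The difference between an App Status check and an Availability check, and the retry-then-flag behavior, can be sketched as follows. This is an illustrative Python outline of an active monitor, not how any particular load balancer implements it. The probe is passed in as a function so the sketch stays independent of any HTTP library, and the names (Probe, in_service, the expected marker string) are all assumptions made for the example.

```python
from typing import Callable, Tuple

# A probe performs one synthetic request and returns (HTTP status, response body).
Probe = Callable[[], Tuple[int, str]]


def app_status_ok(status: int) -> bool:
    """App Status: the HTTP stack answered with a 200."""
    return status == 200


def availability_ok(status: int, body: str, expected: str) -> bool:
    """Availability: a 200 whose content also matches the known good result."""
    return app_status_ok(status) and expected in body


def in_service(probe: Probe, expected: str, max_failures: int = 3) -> bool:
    """Retry the probe; flag the instance out of service after max_failures misses."""
    for _ in range(max_failures):
        try:
            status, body = probe()
        except OSError:
            continue  # no connection at all counts as a failed attempt
        if availability_ok(status, body, expected):
            return True
    return False
```

Note the third case this catches that a plain status check would miss: an instance happily returning HTTP 200 with a maintenance page or an error message in the body fails the content comparison and gets pulled from rotation.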
Most often that upstream service is going to be a proxy or load balancer, as in addition to monitoring status and performance it needs those measurements to enable decisions regarding scale and availability. It has to know how many connections an app has right now because at some point (a predetermined threshold) it is going to have to start distributing load differently. Usually to a new instance. In a DevOps world where automation and orchestration are in play, this process can be automated or at least triggered by the recognition that a threshold has been reached. But only if the proxy is actually monitoring and measuring the variables that might trigger that process. But to do that, you've got to monitor and measure the right things. Simply sending out a ping every five seconds tells you the core network is up, available and working but says nothing about the capacity of the app platform (the web or application server) or whether or not the application is actually responding to requests. HTTP 500, anyone? It's not the case that you must monitor everything. As you move up the stack some things are redundant. After all, if you can open a TCP connection you can assume that the core network is available. If you can send an HTTP request and get a response, well, you get the picture. What's important is to figure out what you need to know - connectivity, capacity, status and availability - and monitor it so you can measure it and take decisive action based on that data. Monitoring and measuring of performance and availability should be application specific; that is, capacity of an app isn't just about the platform and what max connections are set to in the web server configuration. The combination of users, content, and processing within the application make capacity a very app-specific measurement. 
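The threshold decision itself is simple once the numbers are known. A minimal sketch, again in Python and with made-up names (capacity_used, should_scale_out) and a made-up 80% default:

```python
def capacity_used(active_connections: int, max_connections: int) -> float:
    """Fraction of an instance's known connection capacity currently in use."""
    if max_connections <= 0:
        raise ValueError("max_connections must be positive")
    return active_connections / max_connections


def should_scale_out(active_connections: int, max_connections: int,
                     threshold: float = 0.8) -> bool:
    """True once an instance crosses the threshold and load should go elsewhere."""
    return capacity_used(active_connections, max_connections) >= threshold
```

The interesting inputs are max_connections and threshold. Both should come from load testing that specific application, not from whatever the web server's default configuration happens to say.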
That means the systems that need that data must be aligned better with each application to ensure not only optimal performance and availability but efficiency of resources. That's one of the reasons traditionally "network" services like load balancing and proxies are becoming the responsibility of DevOps rather than NetOps.
* Many variables associated with availability - like system load - also directly impact performance and can thus be used as part of the performance equation.
Beyond Scalability: Achieving Availability
Scalability is only one of the factors that determine availability. Security and performance play a critical role in achieving the application availability demanded by business and customers alike. Whether the goal is to achieve higher levels of productivity or generate greater customer engagement and revenue, the venue today is the same: applications. In any application-focused business strategy, availability must be the keystone. When the business at large is relying on applications to be available, any challenge that might lead to disruption must be accounted for and answered. Those challenges include an increasingly wide array of problems that cost organizations an enormous amount in lost productivity, missed opportunities, and damage to reputation. Today's applications are no longer simply threatened by overwhelming demand. Additional pressures in the form of attacks and business requirements are forcing IT professionals to broaden their views on availability to include security and performance. For example, a Kaspersky study[1] found that “61 percent of DDoS victims temporarily lost access to critical business information.” A rising class of attack known as “ransomware” has similarly poor outcomes, with the end result being a complete lack of availability for the targeted application. Consumers have a somewhat different definition of “availability” than the one found in textbooks and scholarly articles. A 2012 EMA[2] study notes that “Eighty percent of Web users will abandon a site if performance is poor and 90% of them will hesitate to return to that site in the future” with poor performance defined as taking more than five seconds. The impact, however, of poor performance is the same as that of complete disruption: a loss of engagement and revenue. The result is that availability through scalability is simply not good enough.
Contributing factors like security and performance must be considered to ensure a comprehensive availability strategy that meets expectations and ensures business availability. Realizing this goal requires a trio of services comprising scalability, security and performance.

Scalability

Scalability is and likely will remain at the heart of availability. The need to scale applications and dependent services in response to demand is critical to maintaining business today. Scalability includes load balancing and failover capabilities, ensuring availability across the two primary failure domains – resource exhaustion and failure. Where load balancing enables the horizontal scale of applications, failover ensures continued access in the face of a software or hardware failure in the critical path. Both are equally important to ensuring availability and are generally coupled together. In the State of Application Delivery 2015, respondents told us the most important service – the one they would not deploy an application without – was load balancing. The importance of scalability to applications and infrastructure cannot be overstated. It is the primary leg upon which availability stands and should be carefully considered as a key criterion. Also important to scalability today is elasticity: the ability to scale up and down, out and back based on demand, automatically. Achieving that goal requires programmability, integration with public and private cloud providers as well as automation and orchestration frameworks, and an ability to monitor not just individual applications but their entire dependency chain to ensure complete scalability.

Security

If attacks today were measured like winds we’d be looking at a full-scale hurricane. The frequency, volume and surfaces for attacks have been increasing year by year and continue to surprise business after business after business. While security is certainly its own domain, it is a key factor in availability.
The goal of a DDoS, whether at the network or application layer, is, after all, to deny service; availability is cut off by resource exhaustion or oversubscription. Emerging threats such as “ransomware” as well as existing attacks with a focus on corruption of data, too, are ultimately about denying availability to an application. The motivation is simply different in each case. Regardless, the reality is that security is required to achieve availability. Whether it’s protecting against a crippling volumetric DDoS attack by redirecting all traffic to a remote scrubbing center or ensuring vigilance in scrubbing inbound requests and data to eliminate compromise, security supports availability. Scalability may be able to overcome a layer 7 resource exhaustion attack but it can’t prevent a volumetric attack from overwhelming the network and making it impossible to access applications. That means security cannot be overlooked as a key component in any availability strategy.

Performance

Although performance is almost always top of mind for those whose business relies on applications, it is rarely considered with the same severity as availability. Yet it is a key component of availability from the perspective of those who consume applications for work and for play. While downtime is disruptive to business, performance problems are destructive to business. The 8 second rule has long been superseded by the 5 second rule and recent studies support its continued dominance regardless of geographic location. The importance of performance to perceived availability is as real as scalability is to technical availability. 82 percent of consumers in a UK study[3] believe website and application speed is crucial when interacting with a business. Applications suffering poor performance are abandoned, which has the same result as the application simply being inaccessible, namely a loss of productivity or revenue.
After all, a consumer or employee can’t tell the difference between an app that’s simply taking a long time to respond and an app that’s suffered a disruption. There’s no HTTP code for that. Perhaps unsurprisingly a number of performance improving services have at their core the function of alleviating resource exhaustion. Offloading compute-intense functions like encryption and decryption as well as connection management can reduce the load on applications and in turn improve performance. These intertwined results are indicative of the close relationship between performance and scalability and indicate the need to address challenges with both in order to realize true availability.

It's All About Availability

Availability is as important to business as the applications it is meant to support. No single service can ensure availability on its own. It is only through the combination of all three services – security, scalability and performance – that true availability can be achieved. Without scalability, demand can overwhelm applications. Without security, attacks can eliminate access to applications. And without performance, end-users can perceive an application as unavailable even if it’s simply responding slowly. In an application world, where applications are core to business success and growth, the best availability strategy is one that addresses the most common challenges – those of scale, security and speed.

[1] https://press.kaspersky.com/files/2014/11/B2B-International-2014-Survey-DDoS-Summary-Report.pdf
[2] http://www.ca.com/us/~/media/files/whitepapers/ema-ca-it-apm-1112-wp-3.aspx
[3] https://f5.com/about-us/news/press-releases/gone-in-five-seconds-uk-businesses-risk-losing-customers-to-rivals-due-to-sluggish-online-experience

What Ops Needs to Know about HTTP/2
So HTTP/2 is official. That means all the talking is (finally) done and after 16 years of waiting, we've got ourselves a new lingua franca of the web. Okay, maybe that's pushing it, but we do have a new standard to move to that offers some improvements in the areas of resource management and performance that make it definitely worth taking a look at. For example, HTTP/2 is largely based on SPDY (that's Google's SPDY, for those who might have been heads down in the data center since 2009 and missed its introduction) which has proven, in the field, to offer some nice performance benefits. Since its introduction in 2009, SPDY has moved through several versions, resulting in the current (and according to Google, last) version of 3.1, and has shown real improvements in page load times, mostly due to a combination of a reduction in round trip times (RTT) and the use of header compression. An IDG research paper, "Making the Journey to HTTP/2", notes that "According to Google, SPDY has cut load times for several of its most highly used services by up to 43 percent. Given how heavily based it is on SPDY, HTTP/2 should deliver similarly significant performance gains, resulting in faster transactions and easier access to mobile users, not to mention reduced need for bandwidth, servers, and network infrastructure." But HTTP/2 is more than SPDY with a new name. There are a number of significant differences that, while not necessarily affecting applications themselves, definitely impact the folks who have to configure and manage the web and application servers on which those apps are deployed. That means you, ops guy. One of the biggest changes in HTTP/2 is that it is now binary on the wire instead of text. That's good news for transport times, but bad news because it's primarily the reason that HTTP/2 is incompatible with HTTP/1.1.
While browsers will no doubt navigate both protocols (separately, of course), thus alleviating any concern that end-users will be unable to access your apps if you move to the new standard, it's problematic for inter-app integration; i.e. all those external services you might use to build your app or site. A client that assumes HTTP/1.1 will not communicate with an HTTP/2 endpoint, and vice versa. Additionally, HTTP/2 introduces a dedicated header compression protocol, HPACK. While SPDY also supported header compression as a way to eliminate the overhead associated with redundant headers across lots of (an average of 80) requests per page, it fell back on standard DEFLATE (RFC 1951) compression, which is vulnerable to CRIME (yes, I'm aware of the hilarious irony in that acronym but it is what it is, right?).

Operational ramification: New header compression techniques will mean caches and upstream infrastructure which may act upon those headers will need to be able to speak HPACK.

If you haven't been using SPDY, you may also not be aware of the changes to request management. HTTP/1.1 allowed for multiple requests over the same (single) connection but even that was found to be inadequate as page complexity (in terms of objects needing to be retrieved) increased. Browsers therefore would open 2, 3 or 6 connections per domain in order to speed up page loads. This heavily impacted the capacity of a web/app server. If a web server could manage 5000 concurrent (TCP) connections, you had to divide that by the average number of connections opened per user to figure out the concurrent user capacity. SPDY - and HTTP/2 - are based on the use of a single connection, introducing parallel request and response flows in order to address performance in the face of extreme complexity. So that means a 1:1 ratio between users and (TCP) connections. But that doesn't necessarily mean capacity planning is simpler, as those connections are likely to be longer lived than many HTTP/1.1 connections.
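The capacity arithmetic above is simple enough to sketch. The function name is made up for illustration; the 5000-connection limit and the per-user connection counts come from the text, and the calculation deliberately ignores connection lifetime, which is exactly the caveat just raised.

```python
def concurrent_users(max_connections: int, connections_per_user: float) -> int:
    """Users a server can hold at once, given how many TCP connections each opens."""
    return int(max_connections // connections_per_user)


# HTTP/2 or SPDY (1 connection per user) versus typical HTTP/1.1 browser behavior.
for per_user in (1, 2, 6):
    print(per_user, concurrent_users(5000, per_user))
# prints:
# 1 5000
# 2 2500
# 6 833
```

The same server that tops out at 833 users when every browser opens six connections can, on paper, hold 5000 users at one connection each. How long each of those connections stays open is what the raw number leaves out.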
Idle time out values in web servers may need to be adjusted and capacity planning will need to take that into consideration.

Operational ramification: Idle time out values and maximum connections may need to be adjusted based on new TCP behavior.

Some of the other changes that may have an impact are related to security. In particular, there has been a lot of confusion over the requirement (or non-requirement) for security in HTTP/2. It turns out that the working group did not have consensus to require TLS or SSL in HTTP/2 and thus it remains optional. The market, however, seems to have other ideas as browsers currently supporting HTTP/2 do require TLS or SSL and indications are this is not likely to change. SSL Everywhere is the goal, after all, and browsers play a significant (and very authoritative) role in that effort. With that said, TLS is optional in the HTTP/2 specification. But of course since most folks are supportive of SSL Everywhere, it is important to note that when securing connections HTTP/2 requires stronger cryptography:

Ephemeral keys only
Preferring AEAD modes like GCM
Minimal key sizes 128 bit EC, 2048 bit RSA

This falls squarely in the lap of ops, as this level of support is generally configured and managed at the platform (web server) layer, not within the application. Because browsers are enforcing the use of secure connections, the implications for ops start reaching beyond the web server and into the upstream infrastructure.

Operational ramification: Upstream infrastructure (caches, load balancers, NGFW, access management) will be blinded by encryption and unable to perform their functions.

Interestingly, HTTP/2 is already out there. The BitsUp blog noted the day the HTTP/2 official announcement was made that "9% of all Firefox release channel HTTP transactions are already happening over HTTP/2. There are actually more HTTP/2 connections made than SPDY ones." So this isn't just a might be, could be in the future. It's real. Finally.
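For the stronger-cryptography requirements mentioned above, here is a rough sketch of what that configuration might look like at the platform layer, using Python's standard `ssl` module purely as a stand-in for whatever web server or proxy you actually run. The function name is invented, and the cipher string is one reasonable reading of "ephemeral keys, AEAD modes", not a vetted production policy.

```python
import ssl


def http2_server_context() -> ssl.SSLContext:
    """Build a TLS server context along the lines HTTP/2 expects (illustrative only)."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # HTTP/2 over TLS requires TLS 1.2 or newer.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # TLS 1.2 suites: ephemeral (ECDHE) key exchange with AES-GCM, an AEAD mode.
    # TLS 1.3 suites are managed separately by OpenSSL and are already AEAD-only.
    ctx.set_ciphers("ECDHE+AESGCM")
    # Offer HTTP/2 first via ALPN, with HTTP/1.1 as the fallback.
    ctx.set_alpn_protocols(["h2", "http/1.1"])
    # A real server would also call ctx.load_cert_chain(...) with its certificate.
    return ctx
```

Key sizes are a property of the certificate and key you load, so they are not visible in this sketch.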
For a deeper dive into the history of HTTP and how the protocol has evolved over time, feel free to peruse this HTTP/2 presentation.

F5 Friday: Should you stay with HTTP/1 or move to HTTP/2?
Application experience aficionados take note: you have choices now. No longer are you constrained to just HTTP/1 with a side option of WebSockets or SPDY. HTTP/2 is also an option, one that like its SPDY predecessor brings with it several enticing benefits but is not without obstacles. In fact, it is those obstacles that may hold back adoption according to IDG research, "Making the Journey to HTTP/2". In the research, respondents indicated several potential barriers to adoption including backward compatibility with HTTP/1 and the "low availability" of HTTP/2 services. In what's sure to be noticed as a circular dependency, the "low availability" is likely due to the "lack of backward compatibility" barrier. Conversely, the lack of backward compatibility with HTTP/1 is likely to prevent the deployment of HTTP/2 services and cause low availability of HTTP/2 services. Which in turn, well, you get the picture. This is not a phantom barrier. The web was built on HTTP/1 and incompatibility is harder to justify today than it was when we routinely browsed the web and were shut out of cool apps because we were using the "wrong" browser. The level of integration between apps and reliance on many other APIs for functionality pose a difficult problem for would-be adopters of HTTP/2 looking for the improved performance and efficacy of resource utilization it brings. But it doesn't have to. You can have your cake and eat it too, as the saying goes.

HTTP Gateways

What you want is something that sits in between all those users and your apps and speaks their language (protocol) whether it's version 1 or version 2. You want an intermediary that's smart enough to translate SPDY or HTTP/2 to HTTP/1 so you can gain the performance and security benefits without changing your applications or investing hundreds of hours in upgrading web infrastructure. What you want is an HTTP Gateway. At this point in the post, you will be unsurprised to learn that F5 provides just such a thing.
Try to act surprised, though, it'll make my day. One of the benefits of growing up from a load balancer to an application delivery platform is that you have to be fluent in the languages (protocols) of applications. One of those languages is HTTP, and so it's no surprise that at the heart of F5 services is the ability to support all the various flavors of HTTP available today: HTTP/1, SPDY, HTTP/2 and HTTP/S (whether over TLS or SSL). But more than just speaking the right language is the ability to proxy for the application with the user. Which means that F5 services (like SDAS) sit in between users and apps and can translate across flavors of HTTP. Is your mobile app speaking HTTP/2 or SPDY but your app infrastructure only knows HTTP/1? No problem. F5 can make that connection happen. That's because we're a full proxy, with absolute control over a dual-communication stack that lets us do one thing on the client side while doing another on the server side. We can secure the outside and speak plain-text on the inside. We can transition security protocols, web protocols, and network protocols (think IPv4 - IPv6). That means you can get those performance and resource-utilization benefits without ripping and replacing your entire web application infrastructure. You don't have to reject users because they're not using the right browser protocol and you don't have to worry about losing visibility because of an SSL/TLS requirement. You can learn more about F5's HTTP/2 and SPDY Gateway capabilities by checking out these blogs:

What are you waiting for?
F5 Synthesis: Your gateway to the future (of HTTP)
Velocity 2014 HTTP 2.0 Gateway (feat Parzych)
F5 Synthesis Reference Architecture: Acceleration

Programmability in the Network: Making up for lost time with user bribes
90% of your users want your site to perform well during peak periods, like the last month or so. That's according to a study by Compuware, in which they also found that most dissatisfied users will throw you under a bus if your app or site performs poorly while they're desperately trying to buy the right gift for their very demanding loved ones. In an application world driven by impatient and very vocal users, performance is king. Every year before Christmas we hear at least one horror story of a site that didn't perform up to expectations and is promptly made an example of by, well, everyone. I could probably fill this post with examples, but that's not really the point. The point is maybe there's a way to counteract that, in real time, thanks to the power of programmability in the network. Most apps are adept at offering up discounts and promotions in real-time. These are generally powered by the application in question and based on a variety of user identifiable data. It would seem like the thing to do would be to apply those kinds of discounts - in real time - to users that may be experiencing poor performance at that moment. As incentive to see it through and remain engaged. As proof you care about the quality of their user experience. The problem is that the app itself - the one instance in that giant cluster of instances - with which the user is interacting has no visibility into the overall performance being experienced by the user. It may know the database was a bit slower than normal, but it can't measure its own response time because, well, that has to happen after it responds and by that time, it's too late to inject a coupon or discount into the stream. What's needed is an external entity, something between the user and the app, that can identify in real-time that the app is taking longer than the typical holiday shopper will tolerate.
Something, oh, I don't know, that maybe sits upstream and virtualizes the app in its role providing availability and security for the app and the user. Like an app proxy. Only it's not enough to be just an app proxy, it has to be a programmable app proxy because it's going to have to do some fast talking to the services that offer up coupons and promotions so it can inject them into the response along with a "sorry we're a bit slow today, here's a little something for your patience." The basic premise is that the proxy is (1) programmable and (2) has the ability to track response time against an expected result. So you might base a discount on a response time range, say if the app took 2 seconds to respond you're going to offer a 5% coupon. If the app took 2-4 seconds, 10%, and more than 4 seconds? 20%. 'Cause you're awesome like that. The proxy, upon noting the response time, determines whether or not it should insert a promotional code. If so, it reaches out to the database in charge of tracking these things and notes that "Bob" got a "5% discount" for poor performance. Not only does this attempt to make up for a less than stellar user experience, but it also provides an accounting of how many times performance was an issue, at what times of the day, on what days and how much money was spent to keep the business of users that, based on most studies, were almost certainly going to bad-mouth you and give their money to your competitors. After the proxy does that thing it does, it inserts the appropriate code into the response and sends it on its way. A little customization here based on user name (if you've got that info) would be nice, by the way. Given this basic pattern, you can probably start imagining all sorts of interesting ways to interact with (reward or compensate) users based on a variety of variables like performance.
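Here is a minimal sketch of that pattern, just to show how little logic is involved. Everything named here is hypothetical: `call_app` stands in for forwarding the request to the app, `record_discount` for the database that tracks who got what, and the one-second "expected" baseline is an assumption (the tiers above don't say where "no coupon" ends). The 5/10/20 percent tiers follow the numbers in the text.

```python
import time
from typing import Callable

EXPECTED_SECONDS = 1.0  # assumed baseline below which no coupon is offered


def discount_percent(elapsed: float) -> int:
    """Map an observed response time (seconds) to a coupon size."""
    if elapsed <= EXPECTED_SECONDS:
        return 0
    if elapsed <= 2.0:
        return 5
    if elapsed <= 4.0:
        return 10
    return 20


def proxy_request(call_app: Callable[[], str],
                  record_discount: Callable[[str, int, float], None],
                  user: str) -> str:
    """Time the app from the outside and inject a coupon when it was slow."""
    start = time.monotonic()
    body = call_app()                      # the app does its thing
    elapsed = time.monotonic() - start     # only the intermediary can see this
    pct = discount_percent(elapsed)
    if pct:
        record_discount(user, pct, elapsed)  # accounting: who, how much, how slow
        body += (f"\nSorry we're a bit slow today, {user}. "
                 f"Here's {pct}% off for your patience.")
    return body
```

Because the tier function is separate from the timing wrapper, swapping in a hard cutoff or a free-shipping offer means changing one small function and leaving the proxy logic alone.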
The key is that some of those variables just aren't available to the app itself - only an intermediary might have that data readily available. The thing is that programmability in the network - particularly programmability based on something like node.js and its very robust set of packages - can be used to implement just about any kind of innovative service that enhances existing or offers new capabilities to the business. Certainly some vendor could bake this capability into a proxy and let you just configure a few data fields and voila! Except it's never that easy. You might want to offer discounts based on a range of performance while someone else just has a hard line above and below which they want to provide such an incentive. Someone else might want to offer free shipping instead, or who knows what? That's kind of the point behind programmability; it lets organizations design, deliver and tailor services based on their business objectives and priorities, and their particular situation. Programmability imbues the network with the capability to innovate new services that directly result in business value; that have a real impact on the bottom line. The question is no longer what can the network do for you, but what can you do with the network.