Cloud Scale
The cloud empowers you to alter your computing resources to meet your load requirements. You can alter your capacity both manually (by executing a command on a command line or through a web interface) and programmatically (through predefined changes in capacity or through software that automatically adjusts capacity to meet actual demand).
The ability to manually adjust capacity is a huge advantage over traditional computing. But the real power of scaling in the cloud lies in dynamic scaling.
Dynamic scaling
This term—which I sometimes also refer to as cloud scaling—enables software to adjust the resources in your infrastructure without your interactive involvement. Dynamic scaling can take the form of proactive scaling or reactive scaling.
Proactive scaling
This involves a schedule for altering your infrastructure based on projected demand. If you consider the application described back in Figure 7-1, we would configure our cloud management tools to run with a minimal infrastructure that supports our availability requirements during the early morning hours, add capacity in the late morning, drop back to the baseline until lunch, and so on. This strategy does not wait for demand to increase, but instead increases capacity based on a plan.
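A proactive schedule like the one just described can be expressed as a simple mapping from time of day to planned capacity. This is only a sketch: the hours, instance counts, and the baseline of two servers are assumptions for illustration, not values from the text.

```python
# Hypothetical proactive scaling schedule:
# (start hour inclusive, end hour exclusive) -> planned app server count.
# The baseline of 2 instances is an assumed availability minimum.
SCHEDULE = [
    (0, 9, 2),    # early morning: minimal infrastructure
    (9, 11, 6),   # late morning: add capacity for projected demand
    (11, 12, 2),  # drop back to the baseline until lunch
    (12, 14, 6),  # lunch traffic
    (14, 24, 2),  # rest of the day: baseline again
]

def target_capacity(hour: int) -> int:
    """Return the planned number of app servers for a given hour of day."""
    for start, end, count in SCHEDULE:
        if start <= hour < end:
            return count
    return 2  # fall back to the availability baseline
```

The point of the sketch is that capacity changes are driven entirely by the plan: nothing here inspects actual demand.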
Reactive scaling
In this strategy, your infrastructure reacts to changes in demand by adding and removing capacity on its own accord. In the capacity valuation thought experiment, an environment engaging in reactive scaling might have automatically added capacity when it detected the unexpected spike in activity on the CMO blog.
Managing reactive scaling
Reactive scaling is a powerful rope you can easily hang yourself with. It enables you to react quickly to unexpected demand, but if you do no capacity planning and instead rely solely on reactive scaling to manage a web application, you will probably end up hanging yourself with it.
The crudest form of reactive scaling is utilization-based. In other words, when your CPU, RAM, or another resource reaches a certain level of utilization, you add more of that resource to your environment. This makes the monitoring logic very simple, but realistically, adding that resource is what you actually need only a fraction of the time. We have already seen some examples of where this will fail:
- Increased application server load caused by an I/O-bound database server. Adding application servers only increases the load on the database server and further aggravates the situation.
- An attack that consumes whatever resources you throw at it. The result is a spiraling series of attempts to launch new resources while your costs go through the roof.
- An unexpected spike in web activity that taxes your infrastructure but only mildly impacts the end user experience. You know the activity will subside, but your monitor launches new instances simply because it perceives load.
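The crude utilization-based trigger is easy to sketch. The thresholds below are assumed values for illustration, not recommendations:

```python
def should_add_capacity(cpu_pct: float, ram_pct: float,
                        cpu_threshold: float = 80.0,
                        ram_threshold: float = 85.0) -> bool:
    """Crude utilization-based trigger: scale up whenever any
    resource crosses its threshold. Thresholds are assumptions."""
    return cpu_pct >= cpu_threshold or ram_pct >= ram_threshold
```

Note that this trigger fires identically in all three failure scenarios above; it cannot distinguish a genuine demand increase from an attack or a transient spike.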
A good monitoring system will provide tools that mitigate these potential problems with reactive scaling. I have never seen a system, however, that deals perfectly with the last scenario. It calls for an understanding of the problem domain and of the pattern of activity to determine that you should not launch new instances, and I know of no algorithmic substitute for human intuition in these decisions.
However your monitoring system defines its rules for reactive scaling, you should always have a governor in place. A governor places limits on how many resources the monitoring system can automatically launch, and thus how much money your monitoring system can spend on your behalf. In the case of the attack on your infrastructure, your systems would eventually end up grinding to a halt as you hit your governor limit, but you would not end up spending money adding an insane number of instances into your cloud environment.
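A governor can be sketched as a small bookkeeping object that caps automated launches. The instance limit and hourly rate below are hypothetical values:

```python
class Governor:
    """Caps how many instances automated scaling may launch,
    and thus how much money it can spend on your behalf."""

    def __init__(self, max_instances: int, hourly_rate: float):
        self.max_instances = max_instances  # hard ceiling (assumed policy)
        self.hourly_rate = hourly_rate      # assumed per-instance cost
        self.running = 0

    def approve_launch(self, requested: int) -> int:
        """Grant only as many launches as the cap allows; during an
        attack this is what keeps spending bounded."""
        granted = max(0, min(requested, self.max_instances - self.running))
        self.running += granted
        return granted

    def max_hourly_spend(self) -> float:
        """Worst-case hourly bill the monitoring system can run up."""
        return self.max_instances * self.hourly_rate
```

Once the cap is reached, further launch requests are denied: the system may grind to a halt under load, but the bill stops growing.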
A final issue of concern that affects both proactive and reactive scaling, though reactive scaling more so, is the fallibility of Amazon S3 and the AWS APIs. If you are counting on reactive scaling to make sure you have enough resources, an Amazon S3 or API outage at the moment your monitoring system tries to add capacity will undermine your plans, and your system will fail along with them.
Whatever tool you pick, it should minimally have the following capabilities (and all three services I mentioned have them):
- To schedule changes in capacity for your application deployments
- To monitor the deployments for excess (and less than normal) demand
- To adjust capacity automatically based on unexpected spikes or falloffs in demand
Monitoring involves a lot more than watching for capacity caps and switching servers off and on. In Chapter 6, I covered the role monitoring plays in watching for failures in the cloud and recovering from those failures. A notification system is obviously a big part of monitoring for failures. You should have full logging of any change in capacity—whether scheduled or not—and email notifications of anything extraordinary.
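The logging-and-notification requirement can be sketched with the standard library. The notification function here is a stand-in; the actual delivery mechanism (email via smtplib, a paging service, and so on) is an implementation detail the text leaves open.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("capacity")

def notify(subject: str, body: str) -> None:
    """Stand-in for an email alert; real delivery is environment-specific."""
    log.warning("NOTIFY: %s - %s", subject, body)

def record_capacity_change(old: int, new: int, scheduled: bool) -> dict:
    """Log every capacity change, scheduled or not, and notify on
    anything extraordinary (here: any unscheduled change)."""
    event = {"old": old, "new": new, "scheduled": scheduled,
             "notified": not scheduled}
    log.info("capacity %d -> %d (%s)", old, new,
             "scheduled" if scheduled else "unscheduled")
    if not scheduled:
        notify("Unscheduled capacity change",
               f"Capacity moved from {old} to {new} outside the plan.")
    return event
```

The key property is that every change leaves an audit trail, while only the unexpected ones interrupt a human.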
Compared to disaster recovery, it’s not as critical for capacity planning purposes that your monitoring server be outside the cloud. Nevertheless, it’s a very good idea, and a critical choice if bandwidth management is part of your monitoring profile.
The monitoring server checks on each individual server to get a picture of its current resource constraints. Each cloud instance runs a process capable of taking the vitals of that instance and reporting them back to the monitoring server. Most modern operating systems can serve as this status process for core system data, such as CPU, RAM, and other SNMP-accessible data. In addition, Java application servers support JMX interfaces that enable you to query the performance of your Java virtual machines.
For security purposes, I prefer having the monitoring server poll the cloud instances rather than making the cloud instances regularly report into the monitoring server. By polling, you can put your monitoring server behind a firewall and allow no incoming traffic into the server. It’s also important to keep in mind that you need the ability to scale the monitoring server as the number of nodes it must monitor grows.
The process that checks instance vitals must vary for each instance based on its function. A load balancer can be fairly dumb, and so all the monitor needs to worry about is the server’s RAM and CPU utilization. A database server needs to be slightly more intelligent: the vitals process must review disk I/O performance for any signs of trouble. The most difficult monitoring process supports your application servers. It should be capable of reporting not simply how much the instance’s resources are being utilized, but what the activity on the instance looks like.
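One way to express this per-role logic is a dispatch on server function. The metric names and thresholds below are illustrative assumptions:

```python
def check_vitals(role: str, vitals: dict) -> list[str]:
    """Return warnings appropriate to the server's function.
    Thresholds and metric names are assumptions for illustration."""
    warnings = []
    # Every role watches the basics; for a dumb load balancer,
    # CPU and RAM are all the monitor needs to worry about.
    if vitals.get("cpu_pct", 0) > 80:
        warnings.append("high CPU")
    if vitals.get("ram_pct", 0) > 85:
        warnings.append("high RAM")
    if role == "database":
        # A database server must also review disk I/O for signs of trouble.
        if vitals.get("disk_io_wait_pct", 0) > 20:
            warnings.append("I/O bound")
    elif role == "app":
        # App servers report what activity looks like, not just utilization.
        if vitals.get("requests_per_sec", 0) > 500:
            warnings.append("unusual request volume")
    return warnings
```

In a polling setup, the monitoring server would call a check like this against the vitals it pulls from each instance.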
The monitoring server then uses analytics to process all of that data. It knows when it is time to proactively add and remove scale, how to recognize unexpected activity, and how to trigger rules in response to unexpected activity.
The procurement process in the cloud
Whether you scale dynamically or through a human pulling the levers, the way you think about spending money in the cloud is very different from a traditional infrastructure. When you add new resources into your internal data center or with a managed services provider, you typically need to get a purchase order approved through your company procurement processes. Finance approves the purchase order against your department’s budget, and the order goes off to the vendor. You don’t spend $3,000 on a server unless that spend is cleared through Finance.
Nothing in the AWS infrastructure prevents you from executing ec2-run-instances just one time on an EC2 extra-large instance and spending $7,000 over the course of a year. Anyone who has access to launch new instances or alter the scaling criteria of your cloud management tools has full procurement rights in the cloud. There’s no justification that an IT manager needs to make to Finance; it’s just a configuration parameter in a program that Finance may never touch.
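The arithmetic behind that figure is worth making explicit. The $0.80/hour rate for an extra-large instance is an assumption consistent with the text's numbers, not a current price:

```python
# Hypothetical on-demand rate for an EC2 extra-large instance (assumed).
HOURLY_RATE = 0.80           # USD per hour
HOURS_PER_YEAR = 24 * 365    # 8,760 hours

annual_cost = HOURLY_RATE * HOURS_PER_YEAR
print(f"${annual_cost:,.0f}")  # roughly the $7,000/year figure in the text
```

One command, left running for a year, commits you to that spend with no purchase order in sight.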
Finance should therefore be involved in approving the monthly budget for the team managing the cloud infrastructure. Furthermore, controls should be in place to make sure any alterations in the resources you have deployed into the cloud are aligned with the approved budget. If you don’t put this human process in place, you may find Finance turning from the biggest supporter of your move into the cloud to your biggest critic.
Managing proactive scaling
A well-designed proactive scaling system enables you to schedule capacity changes that match your expected changes in application demand. When using proactive scaling, you should understand your expected standard deviation. You don’t need to get out the statistics textbooks…or worse, throw out this book because I mentioned a term from statistics. I simply mean you should roughly understand how much normal traffic deviates from your expectations. If you expect site activity of 1,000 page views/hour around noon and you are seeing 1,100, is that really unexpected? Probably not.
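The "is 1,100 really unexpected?" question is exactly what a standard deviation answers. This sketch uses the standard library; the historical page-view counts are invented for illustration:

```python
import statistics

# Hypothetical hourly page views observed around noon on past days.
history = [900, 1050, 980, 1100, 1010, 950, 1060]

mean = statistics.mean(history)    # about 1,007 views/hour
stdev = statistics.stdev(history)  # about 69 views/hour

def is_unexpected(observed: float, sigmas: float = 2.0) -> bool:
    """Treat traffic within a couple of standard deviations as normal."""
    return abs(observed - mean) > sigmas * stdev
```

With this history, 1,100 views/hour sits comfortably inside two standard deviations of the mean, so it is not a reason to change capacity; 1,300 would be.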
Your capacity for any point in time should therefore be able to handle the high end of your expected capacity with some room to spare. The most efficient use of your resources is just shy of their capacity, but scheduling things that way can create problems when your expectations are wrong—even when combined with reactive scaling. Understanding what constitutes that “room to spare” is probably the hardest part of capacity planning.