Description

Provides an overview of Vertical Auto Scaling.

Content / Solution:

Vertical Auto Scaling

Vertical Auto Scaling allows a user to create a set of rules that modify the amount of CPU or RAM allocated to an existing Cloud Server when a pre-defined threshold is breached. This allows users to scale the resources associated with a given server based on monitoring results. However, because any change to CPU or RAM requires a restart of the server, each scaling action incurs a brief period of downtime.

How to Use Vertical Auto Scaling

To take advantage of Vertical Auto Scaling, you will need to deploy a single Cloud Server in your Cloud environment with the Cores per Socket setting set to "1".  Although the server can be running or stopped, a stopped server cannot generate the utilization necessary to trigger a Vertical Auto Scaling rule, so you will want to ensure that it is running at all times. See the following article for details on how to deploy a Cloud Server:

Once the server has been deployed, you will need to enable Cloud Monitoring for the server and bring it into service.  See the following article for details on how to enable Cloud Monitoring:

Once you have deployed your Cloud Server, you will need to create a new sub-administrator account with the 'Network' and 'Server' roles in the Admin UI.  This account will be used to call the Cloud APIs that perform the Vertical Auto Scaling actions.  Since Auto Scaling leverages our existing Cloud APIs, you will be able to track all of the Auto Scaling activity (e.g. start server, stop server, etc.) tied to this account in the Administrative Logs found in the Admin UI.

You will be prompted to add the username and password of this sub-administrator account to the Auto Scaling Manager before you can create any Auto Scaling rules.  (If you have already added sub-administrator credentials to the Auto Scaling Manager, then you can bypass this step.)  See the following article for details on how to create a sub-administrator account:

You are now prepared to create your Vertical Auto Scaling rule in the Cloud Monitoring portal.  See the following article for details on how to create a Vertical Auto Scaling rule:

When a Vertical Auto Scaling event is triggered based on the thresholds specified in the rule, the monitoring system identifies the current configuration of the specified Cloud Server.  If the Auto Scaling rule was triggered by a breach of the maximum Auto Scaling threshold, then the system will ADD 1 CPU or 1 GB of RAM to the Server.  If the Auto Scaling rule was triggered by a breach of the minimum Auto Scaling threshold, then the system will REMOVE 1 CPU or 1 GB of RAM from the Server.  In both cases, the system repeats the action until the Server's configuration is compliant with the Auto Scaling rule(s).

The system will continue to perform these actions until one of these events occurs:

  1. Neither the maximum nor the minimum threshold is breached within the amount of time specified in the Auto Scaling rule.
  2. The minimum or maximum resource limit specified in the Auto Scaling rule has been reached.
  3. The Cloud Server has reached the minimum or maximum number of CPU or RAM resources allowed by the configuration of the data center.
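The scaling behavior described above can be sketched as a single decision step. This is only an illustrative model, not the Auto Scaling Manager's actual implementation: the Rule class, its field names, the next_allocation function, and the assumption that a rule carries both a lower and an upper resource limit are all hypothetical. The sketch shows the one-unit-at-a-time adjustment and the three stop conditions.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """Hypothetical model of a Vertical Auto Scaling rule (names are assumptions)."""
    upper_threshold: float  # utilization (%) above which one unit is added
    lower_threshold: float  # utilization (%) below which one unit is removed
    min_units: int          # lowest CPU count (or GB of RAM) the rule allows
    max_units: int          # highest CPU count (or GB of RAM) the rule allows


def next_allocation(current: int, utilization: float, rule: Rule,
                    dc_min: int, dc_max: int) -> int:
    """Return the allocation after one scaling step.

    The allocation changes by at most one unit (1 CPU or 1 GB of RAM) and is
    left unchanged when no threshold is breached or a limit has been reached.
    """
    # The effective bounds are the tighter of the rule limits and the
    # limits imposed by the data center configuration.
    floor = max(rule.min_units, dc_min)
    ceiling = min(rule.max_units, dc_max)

    if utilization > rule.upper_threshold and current < ceiling:
        return current + 1  # maximum threshold breached: add one unit
    if utilization < rule.lower_threshold and current > floor:
        return current - 1  # minimum threshold breached: remove one unit
    return current          # no breach, or a limit has been reached


# Example: a rule that keeps a server between 1 and 4 CPUs.
cpu_rule = Rule(upper_threshold=80.0, lower_threshold=20.0,
                min_units=1, max_units=4)
cpus = 2
for reading in (95.0, 92.0, 97.0, 50.0):
    cpus = next_allocation(cpus, reading, cpu_rule, dc_min=1, dc_max=8)
```

In this model each step that changes the allocation corresponds to one stop, reconfigure, and restart cycle of the Cloud Server, so a rule that steps through several units incurs downtime once per step.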

Limitations

Changing the amount of CPU or RAM assigned to a server requires a shutdown and restart of the server in order to update the resources.  You will need to ensure that your configuration is prepared for this downtime each time a scaling action occurs.

Important Note

Vertical Auto Scaling will only work if the Server is configured to use 1 core per socket. To change the "cores per socket" setting for your server, see How to Manage a Cloud Server.