This document provides a high-level overview of Cloud Backup Metering and Billing.
Cloud Backup customers are billed on the type and amount of storage consumed by their data backups at the back-end storage. Back-end storage consumption is metered on a regular basis. Currently, metering and billing data are produced daily and are available to customers as summary and detailed views in Usage Reports.
The amount of back-end storage consumed differs from the size of the customer's actual data being backed up due to a number of factors, including:
- Compression and deduplication
- Secondary Copies
- Data aging affected by Storage Policies
- Synthetic Full Backups from the automatic consolidation process
- Failure rate of backup jobs
Since early 2016, Cloud Backup has incorporated new Chargeback capabilities that reflect the actual (average) backup storage consumption at the back-end, taking the above factors into account.
Usage Reports can be generated in the CloudControl UI by the Primary Administrator or by a User with the Reports role.
Information on the various reports generated through the CloudControl UI can be accessed via the following links:
- Backup Usage Reports
- Backup Dashboard
- How to Create a Backup Job History Report
- Cloud Backup - Reading Summary Usage and Backup Usage Reports for Cloud Backup Information
- Cloud Backup - Reading Backup and Restore Job Reports
- Error Messages in Reports
- Usage data is retained for 1 year.
- Usage data can be viewed for a date range of up to 1 month, either online or downloaded in CSV format.
- The Summary Usage Report shows the total consumption for the period selected.
- The Detailed Usage Report can be used to drill down into the daily storage consumption of each asset being backed up.
- Storage usage is measured in GB.
- Prior to Cloud Backup version 3.9 (released late 2016), storage usage was metered once a week and the report was run every Monday. Since the R3.9 rollout, metering and reporting are daily.
- The usage reports reflect the actual backup storage consumed at the back-end, taking into account the Storage Policy and the various factors outlined above.
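As an illustration of working with a downloaded Detailed Usage Report, the sketch below rolls per-asset rows up into one total per day. The column names (`date`, `asset`, `storage_gb`) and the sample rows are assumptions made for this example; the actual CSV layout may differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample of a downloaded Detailed Usage Report.
# The column names are assumptions, not the real CSV layout.
SAMPLE_CSV = """date,asset,storage_gb
2017-03-01,web-01,12.5
2017-03-01,db-01,40.0
2017-03-02,web-01,12.7
2017-03-02,db-01,41.3
"""

def daily_totals(csv_text):
    """Return {date: total GB across all assets} from detailed usage rows."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["date"]] += float(row["storage_gb"])
    return dict(totals)

# Print one line per day, oldest first.
for day, gb in sorted(daily_totals(SAMPLE_CSV).items()):
    print(f"{day}: {gb:.1f} GB")
```

The same approach could be keyed on the asset column instead to follow a single asset across days.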
Metering On Back-end Storage
Metering and billing within Cloud Backup are based on the back-end storage usage per customer. The reasoning is that, although client-side compression already allows the customer to save on storage costs, metering and billing on back-end storage also allow Cloud Backup to pass on the benefits of large-scale deduplication. Metering occurs at 11 pm every day and takes a snapshot of the state of deduplication and compression at that time.
Please note: After system backups complete and deduplication occurs, there can be a shift in the system-wide compression rates.
Data protection and redundancy can be achieved with secondary copies, which can be configured in the Storage Policy. Secondary copies are metered and billed and add to the customer's storage requirements.
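To make the effect of secondary copies concrete, here is a minimal sketch that totals the back-end storage of every copy in one metering snapshot. The `CopyUsage` record, its field names, and the figures are hypothetical and are not part of any Cloud Backup interface.

```python
from dataclasses import dataclass

@dataclass
class CopyUsage:
    name: str          # hypothetical label for a Storage Policy copy
    backend_gb: float  # back-end storage the copy occupies at metering time
    secondary: bool = False

def metered_gb(copies):
    """Total back-end storage metered for one customer, all copies included."""
    return sum(c.backend_gb for c in copies)

def secondary_gb(copies):
    """Portion of the metered total that comes from secondary copies."""
    return sum(c.backend_gb for c in copies if c.secondary)

# Illustrative snapshot: one primary copy and one secondary copy.
snapshot = [
    CopyUsage("primary", 120.0),
    CopyUsage("secondary-1", 120.0, secondary=True),
]
print(metered_gb(snapshot), secondary_gb(snapshot))
```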
Deduplication Technical Overview
The following is the general workflow for deduplication:
Generating signatures for data blocks
- A block of data is read from the source and a unique signature for the block of data is generated by using a hash algorithm.
- Data blocks can be compressed (default), encrypted (optional), or both. Data block compression, signature generation, and encryption are performed in that order on the source or destination host.
- The new signature is compared against a database of existing signatures for previously backed up data blocks on the destination storage. The database that contains the signatures is called the Deduplication Database (DDB).
- If the signature exists, the DDB records that an existing data block is used again in the destination storage. The associated MediaAgent writes the index information to the DDB on the destination storage, and the duplicate data block is discarded.
- If the signature does not exist, the new signature is added to the DDB. The associated MediaAgent writes both the index information and the data block to the destination storage.
- Signature comparison is done on a MediaAgent. For improved performance, you can use a locally cached set of signatures on the source host for the comparison. If a signature does not exist in the local cache set, it is sent to the MediaAgent for comparison.
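The signature workflow above can be sketched in a few lines. This is a simplified illustration, not the product's implementation: the fixed block size, the use of SHA-256 as the hash algorithm, and the `DedupStore` class with its in-memory dictionaries are all assumptions, and the optional encryption step is omitted.

```python
import hashlib
import zlib

BLOCK_SIZE = 4  # tiny fixed block size so the example is easy to follow (assumption)

class DedupStore:
    """Hypothetical in-memory stand-in for the destination storage and its DDB."""

    def __init__(self):
        self.ddb = {}     # signature -> reference count (stand-in for the DDB)
        self.blocks = {}  # signature -> stored (compressed) block
        self.index = {}   # object name -> ordered list of signatures

    def backup(self, name, data):
        signatures = []
        for offset in range(0, len(data), BLOCK_SIZE):
            # Compress first, then generate the signature (encryption omitted).
            block = zlib.compress(data[offset:offset + BLOCK_SIZE])
            sig = hashlib.sha256(block).hexdigest()
            if sig in self.ddb:
                self.ddb[sig] += 1        # duplicate: record the reuse, discard the block
            else:
                self.ddb[sig] = 1         # new: add the signature and write the block
                self.blocks[sig] = block
            signatures.append(sig)
        self.index[name] = signatures     # index information is always written

store = DedupStore()
store.backup("a.txt", b"AAAABBBBAAAA")
store.backup("b.txt", b"BBBBCCCC")
print(len(store.blocks), "unique blocks stored for", len(store.index), "objects")
```

Five blocks are backed up across the two objects, but only the distinct ones are written; the repeated blocks only increase a reference count.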
Using MediaAgent roles
- During the deduplication process, two different MediaAgent roles are used. These roles can be hosted by the same MediaAgent or by different MediaAgents:
- Data Mover Role: The MediaAgent has write access to disk libraries where the data blocks are stored.
- Deduplication Database Role: The MediaAgent has access to the DDB that stores the data block signatures.
- An object (file, message, document, and so on) written to the destination storage may contain one or many data blocks. These blocks might be distributed across the destination storage, and their locations are tracked by the MediaAgent index. This index allows the blocks to be reassembled so that the object can be restored or copied to other locations. The DDB is not used during the restore process.
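A restore driven purely by the index can be pictured as follows. The block references (`s1`, `s2`, `s3`), the object names, and the dictionary layout are hypothetical; the point of the sketch is only that reassembly needs the index and the stored blocks, with no signature database lookup.

```python
def restore(index_entry, block_store):
    """Reassemble an object from its ordered list of block references."""
    return b"".join(block_store[ref] for ref in index_entry)

# Hypothetical stored blocks, keyed by a block reference.
block_store = {"s1": b"Hello, ", "s2": b"world", "s3": b"!"}

# Hypothetical index: each object lists the blocks it is made of, in order.
# Two objects can point at the same stored block.
index = {
    "greeting.txt": ["s1", "s2", "s3"],
    "shout.txt": ["s2", "s3", "s3"],
}

for name, refs in index.items():
    print(name, restore(refs, block_store))
```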
Data Deduplication FAQ
- What is deduplication?
Data deduplication (dedupe) is a method of reducing storage by eliminating redundant data, including duplicate blocks of data. See the Deduplication Technical Overview above to understand the process further.
- What are the benefits of deduplication?
Deduplication has a variety of benefits, including:
- By identifying unique data and processing only that data during backup operations, deduplication reduces network traffic.
- Through deduplication, customers enjoy the benefits of sharing our infrastructure. Since we back up each block of data only once, the cost of storage is dramatically reduced. For example, assume 10 customers back up the same Windows operating system. The cost of backing up this OS is shared and split between all of these customers; that is, each customer might be charged for only a fraction of what they actually back up.
- By reducing the quantity of data transferred to the storage media, the media can hold more backup data. If the data is later restored or recovered, the system automatically decompresses the data and restores it to its original state.
- Does Cloud Backup access my live production data to execute the deduplication process?
The short answer is no. Cloud Backup has no information about the actual data being written to disk. It uses signature generation and comparison to "dedupe" uncompressed data.
- Does deduplication impact my ability to restore or recover my backed-up data?
The ability to restore or recover data is not impacted by deduplication operations. Deduplication happens on our back-end systems and is handled internally to provide the customer with the greatest savings possible.
- What compression options can the system support?
Due to the complexity of the data compression processes, the system does not expose the data compression options to users. Dimension Data, with expert advice, ensures that the system is configured with the ideal compression options. At a high level, the following data compression options are configured:
- Software compression which includes options to compress the data in the:
- Hardware compression for libraries with tape media at the individual data path
- Is there a way to predict deduplication ratios?
Unfortunately, no. Deduplication ratios depend on a myriad of complex factors that change over time.
At the time of writing, Cloud Backup averaged around 62% compression globally across all regions through deduplication of items such as operating systems, database server software, and common applications.
- What factors impact deduplication ratios?
Some of the factors are presented below. Please note: this is not an exhaustive list.
- For commonly backed-up data, such as operating systems or applications, deduplication savings will be significantly higher.
- Pre-compressed data (like images and videos) might provide negligible savings. Compressed data often increases in size if it is subjected to compression again.
- Highly transactional or frequently changing data might produce a significant delay in the deduplication process, thereby reducing the storage savings.
- Since our systems are multi-tenanted, deduplication ratios depend on the number of customers sharing the same block of data.
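The point about pre-compressed data can be checked with a small experiment. The sketch below uses Python's `zlib` purely for illustration (the compression algorithm used by the backup system is not stated here), and pseudo-random bytes stand in for already-compressed media, which is an assumption of the example.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# Repetitive text stands in for ordinary, compressible data.
text_like = b"status=OK host=web-01 region=AU\n" * 2000
# Pseudo-random bytes stand in for already-compressed media (assumption).
media_like = bytes(rng.getrandbits(8) for _ in range(64_000))

def compressed_size(data):
    """Size in bytes of the data after one pass of zlib compression."""
    return len(zlib.compress(data))

for label, data in (("text-like", text_like), ("media-like", media_like)):
    print(f"{label}: {len(data)} -> {compressed_size(data)} bytes")
```

The text-like payload shrinks substantially, while the media-like payload does not shrink and picks up a small amount of container overhead.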
- Can deduplication cause fluctuations in my billing?
Although we try to reduce the noise in billing caused by changing deduplication rates, fluctuations are not uncommon. For demonstration purposes, let's assume a simple case where three customers each back up 1 GB of data without any data aging or other operations.
| # | What Happens | Deduplication Savings | Customer A | Customer B | Customer C |
|---|---|---|---|---|---|
| 1 | Customer A backs up a block of data. Pre-dedupe size: 1 GB. | 0% | Is charged for the entire backup size (1 GB). | N/A | N/A |
| 2 | Customer B backs up the same block of data (1 GB). Deduplication happens and the block is shared with A. | 50% | Billing is 50% of the original bill from Step 1. | Is charged for 500 MB instead of 1 GB due to deduplication savings. | N/A |
| 3 | Customer C backs up the same block of data (1 GB). Deduplication happens and the block is shared with A and B. | About 67% | Billing is about 33% of the original bill from Step 1. | Bill is about 67% of the original bill from Step 2. | Is charged for 333 MB instead of 1 GB due to deduplication savings. |
| 4 | Customer C deletes their backup data. Deduplication ratios fall because the block is shared between 2 customers instead of the original 3. | 50% | Billing is 50% of the original bill from Step 1. Compared to Step 3, billing might increase even without Customer A executing any actions. | Bill reverts to the original bill from Step 2. | |
* The data above is for demonstration purposes only. Actual results may vary. The above case exaggerates the impact of deduplication. In reality, a backup job contains many blocks of data, and the blocks are shared among many customers, which reduces the risk of the above case happening.
* Note: Deduplication savings are in addition to file compression savings.
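The four steps of the demonstration can be replayed with a short calculation. It assumes, as the worked example suggests, that the cost of a shared block is split evenly among the customers currently holding it; the function name and the even-split rule are illustrative and do not describe the actual billing engine.

```python
def charged_mb(block_mb, sharers):
    """Per-customer charge when a block is split evenly among its sharers."""
    return block_mb / sharers if sharers else 0.0

BLOCK_MB = 1000  # the 1 GB block from the example, taking 1 GB as 1000 MB

# Each step lists which customers hold the block after the action.
steps = [
    ("A backs up", {"A"}),
    ("B backs up", {"A", "B"}),
    ("C backs up", {"A", "B", "C"}),
    ("C deletes", {"A", "B"}),
]

for label, holders in steps:
    share = charged_mb(BLOCK_MB, len(holders))
    savings = 1 - share / BLOCK_MB
    print(f"{label}: {sorted(holders)} each charged {share:.0f} MB ({savings:.0%} savings)")
```

The last step shows the charge for A and B rising again even though neither of them changed anything, which is the fluctuation described above.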
- How are newly deployed environments different?
Newly deployed environments can show artificially high but unsustainable compression rates (of up to 90%), predominantly because backups are seeded with test VMs. Over time, there will be a shift towards more manageable and realistic numbers.
- My powered-off hosts show growing usage. How is this possible?
For powered-off hosts, metering data can still change significantly. Base compression is applied at a fixed rate, but deduplication is a shared saving on top of the base compression and is constantly in flux. This means that even though a system is powered off, the deduplicated sections of its data are still subject to changing compression rates.
Cloud Backup Plans:
- Cloud Backup Essentials
- Cloud Backup Advanced
- Cloud Backup Enterprise
For further details on the different Cloud Backup plans, see the Plans section of Introduction to Cloud Backup Metering and Billing.