What is the size of your RDS backups and snapshots?

According to the Amazon documentation, many users never pay for automated backups and DB Snapshots

There is no additional charge for backup storage, up to 100% of your total database storage for a region. (Based upon our experience as database administrators, the vast majority of databases require less raw storage for a backup than for the primary dataset, meaning that most customers will never pay for backup storage.)

But what if you are not one a lucky one? How do you know if you can increase the retention period of a RDS instance without ending up paying for extra storage? Or how much do you need to reduce your retention period to use only the (free) quota?

Here it gets tricky: there is no API to find out the total size of your backups or snapshots and there is nothing available in the console either.

As snapshots are incremental, it is really down to the activity of your database and it might be quite hard to predict the storage extra you are going to pay at the end of the month if you have a fixed retention or some business requirements on manual snapshots.

Sometime it might even be better to allocate more storage to the RDS instances in the region – and have access as well to higher IOPS on standard SSD storage – than simply pay for the extra storage of the snapshots.

But how can you figure out the best balance to take advantage of the free storage or forecast your backup storage? There is no direct way to find it out but you can estimate the cost – and so indirectly the size –  by running a billing report.

Billing report and backup usage information

You can get a billing report from the console going to the Billing & Cost Management and:

  • select “Reports”
  • choose “AWS Usage Report”
  • select “Amazon RDS Service”
  • specify time period and  granularity
  • download report in XML or CVS format.
  • open the report and check the “TotalBackupUsage” and “ChargedBackupUsage”

Knowing now the cost and the AWS S3 pricing for your region, you can now determine the storage.

How can I automate the process?

You can generate and have a Hourly Cost Allocation Report delivered automatically to a S3 bucket and process the information and create your alarms and logic accordingly.

For more information see the Monthly Cost Allocation Report page

 

Base performance and EC2 T2 instances

Almost three years ago AWS launched the now very popular T2 instances, EC2 servers with burstable performance. As Jeff Barr wrote in 2014:

Even though the speedometer in my car maxes out at 150 MPH, I rarely drive at that speed (and the top end may be more optimistic than realistic), but it is certainly nice to have the option to do so when the time and the circumstances are right. Most of the time I am using just a fraction of the power that is available to me. Many interesting compute workloads follow a similar pattern, with modest demands for continuous compute power and occasional needs for a lot more.

It took a while for the users to fully understand the benefits of the new class and how to compute and monitor CPU credits but the choice between different instance types was very straightforward.

A bit of history…

Originally there were only 3 instance types (t2.micro, t2.small and t2.medium) and base performance, RAM and CPU Credits were very clear, growing linearly.

Instance     Base    RAM    Credits/hr

t2.micro     10%     1.0     6     
t2.small     20%     2.0     12     
t2.medium    40%     4.0     24

And price too. A t2.medium was effectively equivalent to 2 small instances or 4 micro instances. Both as credits, base rate and price. So far so good.

At the end of 2015, AWS introduced an even smaller instance, the t2.nano but the approach was still the same:

Instance     Base    RAM    Credits/hr

t2.nano      5%     0.5     3

Still same approach, nothing different.

Now large and even bigger!

But AWS extended the T2 class in the large range too, having first a t2.large in June 2015 and t2.xlarge and t2.2xlarge at the end of 2016. A lot more flexibility and a class that can cover many use cases with the chance of vertical scaling but finally the linear grow was broken:

Instance  RAM    Credits/hr Price/hr

t2.large        2        8      $0.094
t2.xlarge       4       16      $0.188
t2.2xlarge      8       32      $0.376

So far so good, the price per hour doubles according to vCPU and Memory. So a t2.2xlarge is equivalent to 4 t2.large. But what about the base performance?

Instance    Base Performance

t2.large      60% (of 200%)
t2.xlarge     90% (of 400%)
t2.2xlarge   135% (of 800%)

A t2.2xlarge is not equivalent to 4 t2.large.

I have a better base rate that I can run forever as average for cCPU running 4 nodes of t2.large (I would be able to average 30% on every vCPU) versus running a single t2.2xlarge (where I have a base performance of less than 17% on every vCPU) for the very same price.

The bigger you go, the lower the base performance by vCPU is.

So what?

vitaly-145502
Even without the loss in term of base performance, you have many reasons to choose more small instances: better HA in multi AZ, easier horizontal scaling, better N+1 metrics

But with T2 large+ instances even the AWS pricing strategy pushes you away from a single instance.

Unless you have an application that definitely benefit from a single larger t2 instance (for example a database server), spread out your load across smaller instances, with the T2 class you have one more reason for that.

The recently announced instance size flexibility for EC2 reserved instances makes it even easier to adapt the instance type even if you have a lot of RI capacity.

S3 Reduced Redundancy Storage is (almost) dead

As a software developer and architect I have spent countless hours discussing the benefits, the costs and the challenges of deprecating an API or a service in a product. AWS has the opportunity and the business model to skip the entire discussion and simply use the pricing element to make a service useless.

Let’s take the Reduced Redundancy Storage option for Amazon S3. It has been around since 2010 and the advantage versus standard storage is (actually, was) the cost. As for the AWS documentation:

It provides a cost-effective, highly available solution for distributing or sharing content that is durably stored elsewhere, or for storing thumbnails, transcoded media, or other processed data that can be easily reproduced. The RRS option stores objects on multiple devices across multiple facilities, providing 400 times the durability of a typical disk drive, but does not replicate objects as many times as standard Amazon S3 storage

Again, the only benefit of Reduced Redundancy Storage is the cost. Once you remove the cost discount, it’s a useless feature.

imageedit_1_2408963963

If you make it more expensive than standard storage you are effectively deprecating it without having to change a single SDK or API signature. And that’s exactly what AWS did lowering the price of standard storage class without changing the one for the Reduced Redundancy Storage option.

These are the prices for US East (N. Virginia) but similar differences apply in other regions:

Amazon S3 Reduced Redundancy Storage
imageedit_4_5408119950

The only real change was then moving the RRS from the main S3 page (where now the options do not include RRS anymore) to a separate one.

Amazon S3

imageedit_7_4172100111

S3 Reduced Redundancy Storage is still there, they did not even increase the price of the service. But it’s a dead feature and you have no reason to use it anymore. An amazing approach to the challenge of deprecation.