What is the size of your RDS backups and snapshots?

According to the Amazon documentation, many users never pay for automated backups and DB snapshots:

There is no additional charge for backup storage, up to 100% of your total database storage for a region. (Based upon our experience as database administrators, the vast majority of databases require less raw storage for a backup than for the primary dataset, meaning that most customers will never pay for backup storage.)

But what if you are not one of the lucky ones? How do you know if you can increase the retention period of an RDS instance without ending up paying for extra storage? Or by how much do you need to reduce your retention period to stay within the free quota?

Here it gets tricky: there is no API to find out the total size of your backups or snapshots and there is nothing available in the console either.

As snapshots are incremental, the size really depends on the activity of your database, and it can be quite hard to predict the extra storage you are going to pay for at the end of the month if you have a fixed retention period or some business requirements on manual snapshots.

Sometimes it might even be better to allocate more storage to the RDS instances in the region – and gain access to higher IOPS on standard SSD storage as well – than to simply pay for the extra snapshot storage.

But how can you figure out the best balance to take advantage of the free storage, or forecast your backup storage? There is no direct way to find out, but you can estimate the cost – and so indirectly the size – by running a billing report.

Billing report and backup usage information

You can get a billing report from the console by going to Billing & Cost Management and following these steps:

  • select “Reports”
  • choose “AWS Usage Report”
  • select “Amazon RDS Service”
  • specify the time period and granularity
  • download the report in XML or CSV format
  • open the report and check the “TotalBackupUsage” and “ChargedBackupUsage”

Knowing the cost and the AWS S3 pricing for your region, you can now work back to the amount of storage.
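As a minimal sketch of that last step, assuming you have read the backup charge (in USD) from the report and looked up the per-GB-month storage price for your region yourself; the function name and the numbers in the usage line are illustrative, not quoted AWS prices:

```python
def charged_backup_gb(charged_cost_usd: float, price_per_gb_month: float) -> float:
    """Estimate the billed backup storage (GB-month) from its monthly cost.

    Both arguments are values you supply: the cost comes from the billing
    report, the price from the pricing page for your region.
    """
    if price_per_gb_month <= 0:
        raise ValueError("price_per_gb_month must be positive")
    if charged_cost_usd < 0:
        raise ValueError("charged_cost_usd cannot be negative")
    return charged_cost_usd / price_per_gb_month


# Hypothetical figures: 5 USD charged at 0.10 USD per GB-month.
estimated_gb = charged_backup_gb(5.0, 0.10)
```

Note that this only covers the storage you are charged for, that is, the part above the free quota.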

How can I automate the process?

You can have an Hourly Cost Allocation Report generated and delivered automatically to an S3 bucket, then process the information and build your alarms and logic accordingly.

For more information, see the Monthly Cost Allocation Report page.
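The processing step could look roughly like the sketch below. It assumes a CSV report with "UsageType" and "UsageValue" columns and usage types that contain the "ChargedBackupUsage" string mentioned above; the exact column names and the sample rows are assumptions, so check them against your own report before relying on this.

```python
import csv
import io


def sum_backup_usage(report_csv: str, marker: str = "ChargedBackupUsage") -> float:
    """Sum the UsageValue of every row whose UsageType contains `marker`.

    Column names ("UsageType", "UsageValue") are assumed, not guaranteed.
    """
    reader = csv.DictReader(io.StringIO(report_csv))
    total = 0.0
    for row in reader:
        usage_type = row.get("UsageType") or ""
        if marker in usage_type:
            total += float(row["UsageValue"])
    return total


# Made-up sample data, only to show the expected shape of the input.
SAMPLE_REPORT = """Service,UsageType,UsageValue
AmazonRDS,EU-RDS:TotalBackupUsage,120.0
AmazonRDS,EU-RDS:ChargedBackupUsage,1.5
AmazonRDS,EU-RDS:ChargedBackupUsage,2.0
"""

charged = sum_backup_usage(SAMPLE_REPORT)
total = sum_backup_usage(SAMPLE_REPORT, marker="TotalBackupUsage")
```

From here you could compare the charged figure against a threshold of your choice and raise an alert when it is exceeded.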

Alexa, what is the capital of Israel?

The status of Jerusalem has been in the news a lot in the last few weeks, since Donald Trump confirmed that the US now recognizes the city as the capital of Israel, followed by the recent UN vote rejecting that recognition.

If maps and location-based services present unexpected challenges in a disputed territory to the developer who targets an international audience, voice services are the new frontier.

While preparing a few examples for my talk at Codemotion Berlin last October, I asked Alexa (in German) the question “Alexa, what is the capital of Israel?”

And on October 3rd, the answer was:

Die Hauptstadt von Israel ist Jerusalem

which translates to a short, direct, but controversial

The capital of Israel is Jerusalem

An answer that might upset a significant number of users of the Amazon service. Apparently, while Mr Trump followed the advice from Alexa, Amazon has rectified the answer in the meantime. Asking the very same question today, you get a longer answer:

Israel hat Jerusalem zu seiner Hauptstadt erklaert, diese wird jedoch nicht von allen Staaten anerkannt

which translates to

Israel has declared Jerusalem to be its capital, but it is not recognized by all states

One might argue that most states actually do not recognize it, but it is definitely a more accurate answer than the initial one, and one that suits a wider audience.

This is easier to address in a voice service than a geolocation decoding challenge, but it is still an example of the problems a software developer has to face when targeting an international audience in a disputed territory.

 

Triggering a failover when running out of credits on db.t2

For the last three years, RDS has offered the option of running a MySQL database on the T2 instance type, currently from db.t2.micro to db.t2.large. These are low-cost standard instances that provide a baseline level of CPU performance, depending on the size, with the ability to burst above the baseline using a credit approach.

You can find more information on the RDS Burst Capable Current Generation (db.t2) here and on CPU credits here.

What happens when you run out of credits?

There are two metrics you should monitor in CloudWatch: CPUCreditUsage and CPUCreditBalance. When your CPUCreditBalance approaches zero, your CPU usage is going to be capped and you will start having issues on your database.

It is usually better to have some alarms in place to prevent that, either checking for a minimum number of credits or spotting a significant drop in a time interval. But what can you do when you hit the bottom and your instance remains at the baseline performance level?
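As an illustration of the first kind of alarm, the sketch below builds the parameters for a CloudWatch alarm on a minimum credit balance. The helper name, the alarm name and the threshold are hypothetical choices, and the actual call (shown in the trailing comment) assumes boto3 and valid AWS credentials.

```python
def credit_balance_alarm_params(db_instance_id: str, min_credits: float) -> dict:
    """Build keyword arguments for CloudWatch put_metric_alarm.

    The alarm fires when the average CPUCreditBalance of the instance
    drops below `min_credits` over a 5-minute period.
    """
    return {
        "AlarmName": f"{db_instance_id}-low-cpu-credits",  # hypothetical naming
        "Namespace": "AWS/RDS",
        "MetricName": "CPUCreditBalance",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Average",
        "Period": 300,
        "EvaluationPeriods": 1,
        "Threshold": float(min_credits),
        "ComparisonOperator": "LessThanThreshold",
    }


# Usage, assuming boto3 is installed and configured:
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **credit_balance_alarm_params("my-db", 20))
```

You would normally also pass AlarmActions (for example an SNS topic) so that someone is actually notified; that is left out here to keep the sketch short.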

How to recover from a zero credit scenario?

The most obvious approach is to increase the instance class (for example from db.t2.medium to db.t2.large) or switch to a different instance type, for example a db.m4 or db.c3 instance. But it might not be the best approach when your end users are suffering: if you are running a Multi-AZ database in production, this is likely not the fastest option to recover, as it first requires a change of instance type on the passive master and then a failover to the new master.

[Screenshot: CPU credit graph in CloudWatch]

You can instead try a simple reboot with failover: the credit balance you see in CloudWatch is based on the currently active host, and your passive master usually still has credits available, as it is less loaded than the active one. As in the screenshot above, you might gain 80 credits without any cost and with a simple DNS change that minimizes the downtime.
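A reboot with failover can also be triggered from code. The sketch below assumes a boto3 RDS client is passed in and that the instance is Multi-AZ; the helper name and the instance identifier are placeholders.

```python
def reboot_with_failover(rds_client, db_instance_id: str):
    """Reboot a Multi-AZ RDS instance and force a failover to the standby.

    `rds_client` is expected to behave like boto3.client("rds").
    """
    return rds_client.reboot_db_instance(
        DBInstanceIdentifier=db_instance_id,
        ForceFailover=True,  # only valid for Multi-AZ deployments
    )


# Usage, assuming boto3 is installed and configured:
#   import boto3
#   reboot_with_failover(boto3.client("rds"), "my-db")
```

Taking the client as an argument keeps the helper easy to try against a stub before pointing it at a production database.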

Do I still need to change the instance class?

Yes. Performing a reboot with failover is simply a way to reduce your recovery time when you have issues related to the capped CPU, and to gain some time. It is not a long-term solution, as you will most likely run out of credits again if you do not change your instance class soon.

To summarize, triggering a failover on a Multi-AZ RDS instance running on T2 is usually a faster way to gain credits than immediately modifying the instance class.