Triggering a failover when running out of credits on db.t2

For about three years, RDS has offered the option of running a MySQL database on the T2 instance type, currently from db.t2.micro to db.t2.large. These are low-cost standard instances that provide a baseline level of CPU performance, which depends on the instance size, with the ability to burst above the baseline using a credit approach.

You can find more information on the RDS Burst Capable Current Generation (db.t2) here and on CPU credits here.

What happens when you run out of credits?

There are two metrics you should monitor in CloudWatch: CPUCreditUsage and CPUCreditBalance. When your CPUCreditBalance approaches zero, your CPU usage is going to be capped at the baseline and you will start having issues on your database.

It is usually better to have some alarms in place to prevent that, either checking for a minimum number of credits or spotting a significant drop in a time interval. But what can you do when you hit the bottom and your instance remains at the baseline performance level?
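As a rough illustration of the first kind of alarm (a minimum number of credits), here is a minimal sketch using boto3. The helper names, the alarm name, the default threshold of 20 credits and the optional SNS topic are my own assumptions for the example, not values recommended by AWS.

```python
def credit_balance_alarm(db_instance_id, min_credits=20.0, sns_topic_arn=None):
    """Build the keyword arguments for CloudWatch put_metric_alarm.

    The threshold and the alarm name are illustrative assumptions.
    """
    params = {
        "AlarmName": f"{db_instance_id}-low-cpu-credit-balance",
        "Namespace": "AWS/RDS",
        "MetricName": "CPUCreditBalance",
        "Dimensions": [
            {"Name": "DBInstanceIdentifier", "Value": db_instance_id}
        ],
        "Statistic": "Minimum",
        "Period": 300,  # evaluate the metric over 5-minute periods
        "EvaluationPeriods": 1,
        "Threshold": float(min_credits),
        "ComparisonOperator": "LessThanThreshold",
    }
    if sns_topic_arn:
        # Optional notification target (hypothetical SNS topic).
        params["AlarmActions"] = [sns_topic_arn]
    return params


def create_alarm(params):
    """Create the alarm; requires boto3 and AWS credentials."""
    import boto3  # imported here so building the parameters needs no SDK

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

Calling `create_alarm(credit_balance_alarm("my-database", 30))` would then raise an alarm as soon as the balance drops below 30 credits; the right threshold depends on how quickly your workload burns credits.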

How to recover from a zero credit scenario?

The most obvious approach is to increase the instance class (for example from db.t2.medium to db.t2.large) or switch to a different instance type, for example a db.m4 or db.c3 instance. But it might not be the best approach when your end users are suffering: if you are running a Multi-AZ database in production, this is likely not the fastest option to recover, as it first requires a change of instance type on the passive master and then a failover to the new master.

[Screenshot: CPU credit balance on CloudWatch]

You can instead try a simple reboot with failover: the credit balance you see on CloudWatch is based on the current active host, and your passive master usually still has credits available, as it is usually less loaded than the active one. As in the screenshot above, you might gain 80 credits without any cost and with a simple DNS change that minimizes the downtime.
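The reboot with failover can be triggered from the console, or scripted. A minimal sketch with boto3 follows; the function name and the Multi-AZ check are my own additions for the example, while `describe_db_instances` and `reboot_db_instance` with `ForceFailover` are the regular RDS API calls.

```python
def reboot_with_failover(rds_client, db_instance_id):
    """Reboot a Multi-AZ RDS instance, forcing a failover to the standby.

    rds_client is expected to be a boto3 RDS client, for example
    boto3.client("rds").
    """
    description = rds_client.describe_db_instances(
        DBInstanceIdentifier=db_instance_id
    )
    instance = description["DBInstances"][0]
    if not instance.get("MultiAZ"):
        # Without a standby there is nothing to fail over to.
        raise ValueError(f"{db_instance_id} is not a Multi-AZ instance")
    return rds_client.reboot_db_instance(
        DBInstanceIdentifier=db_instance_id,
        ForceFailover=True,
    )
```

Passing the client in as an argument keeps the function easy to try against a stub before pointing it at a production database.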

Do I still need to change the instance class?

Yes. Performing a reboot with failover is simply a way to reduce your recovery time when you have issues related to the capped CPUs, and to gain some time. It is not a long-term solution, as you will most likely run out of credits again if you do not change your instance class soon.
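Once the failover has bought you some time, the change of instance class can be scripted as well. This is only a sketch: the function name is hypothetical and the target class is whatever fits your workload.

```python
def change_instance_class(rds_client, db_instance_id, new_class,
                          apply_immediately=True):
    """Ask RDS to move the instance to a different instance class.

    rds_client is expected to be a boto3 RDS client. With
    apply_immediately=False the change waits for the next
    maintenance window instead.
    """
    return rds_client.modify_db_instance(
        DBInstanceIdentifier=db_instance_id,
        DBInstanceClass=new_class,  # for example "db.t2.large"
        ApplyImmediately=apply_immediately,
    )
```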

To summarize, triggering a failover on a Multi-AZ RDS instance running on T2 is usually a faster way to gain credits than immediately modifying the instance class.

Percona Live: Do Not Press That Button

If I had to mention a single technical blog that I always find informative and that I have followed for many years, I would say without a doubt the Database Performance Blog from Percona. That is why I am so keen to attend the Percona Live Open Source Database Conference in Dublin this year and present a lightning talk, “Do Not Press That Button”, on September 26th. You can find more about my short session on RDS here. Looking forward to Dublin!

The moment a project fails

The moment you realize a project you have been developing does not deliver can take different forms. It might even be accidental. For me the reality check was a tall de carrers sign in the streets of Barcelona.

A keen and slow runner, in the last few months I have been developing a tool to crawl and keep up to date running events for RaceBase World. The project was a mix of AWS Lambda, Scrapy and Python, able to collect over 25K races around the world and keep them up to date. Not an easy task.

The goal was simple: sometimes you are lucky enough to plan your holidays around a marathon abroad, possibly one of the largest events around the world. More often you plan your vacations or business trips and then you simply wonder if there is a running event in the area.

I have now been in Barcelona for a few weeks. I have been looking for what I believed was an unlikely road race in summer, and my lovely crawler could not find any. Runner’s World Spain could not find one either.

Still, a sign on the door of the building where I live is telling me that up to 5,000 runners are going to run in Barcelona next Sunday for la Cursa Barça. No better way to prove that my global database is inefficient.

You can of course try to collect millions of races worldwide, something that is very hard to achieve and that will in any case generate too much noise for the end user. But having only a few thousand events globally means covering only the large races (the ones a runner can find without any help from RaceBase World or any other website) and a few random local ones.

And the moment I cannot trust my own project to find a race, I can consider it a failure. But before working on a new idea or a new (local) approach to discover new races, it is time to join la Cursa Barça and forget Python.

From Macedonia to Codemotion

As the BBC recently reported, Matthew Nimetz has spent the last 23 years trying to find a name for the republic of Macedonia that can be accepted both in Skopje and Athens. But a solution for the Macedonia naming dispute has not been agreed yet.


What name should a developer use today when working on location-based services? The user-friendly Macedonia or the formal but longer The Former Yugoslav Republic of Macedonia? How can you make your users in Skopje and Athens both happy?

This is one of the examples I might use at the next Codemotion in Berlin. I do not expect to cover in half an hour all the geopolitical challenges of targeting an international audience, or their workarounds, but I am very excited to present the talk “The (accidental) political developer”.


You can find the abstract here and an introduction to the topic in my previous posts, The (accidental) political software developer and Location-based services and countries on AWS.

See you on October 12 & 13 at Kulturbrauerei!

Run Alexa Run

I have always been a slow but keen runner. And I have always loved joining running events as an excuse to travel and to have one more weekend on the road. What is now called a runcation.

While looking for one more reason to play with Alexa and AWS Lambda, developing a simple skill to find the next race in a country was an obvious choice.
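To give an idea of how small such a skill can be, here is a simplified sketch of a Lambda handler. It is not the code of the actual skill: the `Country` slot name, the in-memory `RACES` list and the spoken sentences are all made up for the example, and a real skill would query the race database instead.

```python
from datetime import date

# Hypothetical sample data standing in for the real race database.
RACES = [
    {"name": "Example 10K", "country": "spain", "date": date(2030, 5, 12)},
    {"name": "Example Marathon", "country": "spain", "date": date(2030, 3, 3)},
    {"name": "Example Trail", "country": "italy", "date": date(2030, 4, 20)},
]


def next_race(country, today, races=RACES):
    """Return the earliest race in the country on or after today, or None."""
    upcoming = [
        race for race in races
        if race["country"] == country.lower() and race["date"] >= today
    ]
    return min(upcoming, key=lambda race: race["date"]) if upcoming else None


def build_response(text):
    """Wrap plain text in the JSON structure an Alexa skill returns."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }


def lambda_handler(event, context, today=None):
    """Lambda entry point; the optional today argument is only for testing."""
    today = today or date.today()
    request = event.get("request", {})
    if request.get("type") != "IntentRequest":
        return build_response("Ask me for the next race in a country.")
    slots = request.get("intent", {}).get("slots", {})
    country = slots.get("Country", {}).get("value")  # assumed slot name
    if not country:
        return build_response("Please tell me a country.")
    race = next_race(country, today)
    if race is None:
        return build_response(f"I could not find any upcoming race in {country}.")
    when = race["date"].strftime("%B %d, %Y")
    return build_response(f"The next race in {country} is {race['name']} on {when}.")
```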

Thanks to Matt, you can read more about the experiment on RaceBase World.

Alexa and RaceBase World

My next challenge?

Rely on Alexa’s results to choose my next race. With over 16,000 events, from 5K to 100 miles, in 168 countries on RaceBase World, a bug in the still very basic Lambda function might be costly: I am getting ready for a less ordinary race and an obscure destination.

Below is a short audio demo, and the code is available on GitHub. The skill is currently available on the UK store only.