Alexa, what is the capital of Israel?

The status of Jerusalem has been in the news a lot in the last few weeks, since Donald Trump confirmed that the US now recognizes the city as the capital of Israel, and since the recent UN vote rejecting that recognition.

If maps and location-based services present unexpected challenges in a disputed territory to the developer who targets an international audience, voice services are the new frontier.

While preparing a few examples for my talk at Codemotion Berlin last October, I asked Alexa (in German) the question “Alexa, what is the capital of Israel?”

And on October 3rd, the answer was

Die Hauptstadt von Israel ist Jerusalem

which translates to the short, direct, but controversial

The capital of Israel is Jerusalem

An answer that might upset a significant number of users of the Amazon service. Apparently, while Mr Trump followed Alexa's advice, Amazon has rectified the answer in the meantime. Asking the very same question today, you get a longer

Israel hat Jerusalem zu seiner Hauptstadt erklaert, diese wird jedoch nicht von allen Staaten anerkannt

which translates to

Israel has declared Jerusalem to be its capital, but it is not recognized by all states

One might argue that most states actually do not recognize it, but it is definitely a more accurate answer than the initial one, and one that suits a wider audience.

This is easier to address in a voice service than in a geolocation decoding scenario, but it is still an example of the problems a software developer has to face when targeting an international audience in a disputed territory.

 

Enabling encryption at rest for a running RDS instance

Since a few months ago, Amazon RDS supports encryption at rest for db.t2.small and db.t2.medium database instances. As AWS points out, to save money without compromising on security, you can also run small production workloads on T2 database instances.

Unless you are running Previous Generation DB Instances or you can only afford to run a db.t2.micro (the only T2 where encryption of the storage is not supported), there is really no justification anymore to skip encryption at rest on AWS.

How to encrypt a new instance

Enabling encryption at rest for a new RDS instance is simply a matter of setting a parameter in the CLI create-db-instance request

[--storage-encrypted | --no-storage-encrypted]

or a check-box in the RDS console. But what about existing instances? As of today, you cannot simply modify the encryption property of a running instance.
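For a new instance, the request could look like the minimal sketch below. The instance class, storage size and the arguments passed by the caller are example values chosen for illustration, not taken from a real setup. If no --kms-key-id is given, RDS uses the default KMS key of the account and region.

```shell
#!/bin/sh
# Sketch: create a new MySQL instance with encrypted storage.
# Class and storage size are arbitrary example values.
create_encrypted_instance() {
    instance_id="$1"       # e.g. test-rds02 (hypothetical name)
    master_user="$2"
    master_password="$3"

    aws rds create-db-instance \
        --db-instance-identifier "$instance_id" \
        --db-instance-class db.t2.small \
        --engine mysql \
        --allocated-storage 20 \
        --master-username "$master_user" \
        --master-user-password "$master_password" \
        --storage-encrypted
}
```

It would be called, for example, as: create_encrypted_instance test-rds02 rdsmaster MyDummyPwd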

Snapshot approach

The simplest way to get an encrypted MySQL instance is to terminate the existing instance with a final snapshot (or to take a snapshot in a read-only scenario). With the encryption option of the RDS snapshot copy, it is possible to convert an unencrypted RDS instance into an encrypted one by simply starting a new instance from the encrypted snapshot copy:

aws rds copy-db-snapshot --source-db-snapshot-identifier SOURCE_SNAPSHOT_ID --target-db-snapshot-identifier TARGET_SNAPSHOT_ID --kms-key-id arn:aws:kms:us-east-1:******:key/016de233-693e-4e9c-87e8-**********

where the kms-key-id is the KMS encryption key.

This approach is very simple, but it still requires significant downtime: you will not be able to write to your RDS instance from the moment you take the first snapshot until the new encrypted instance is available. This can be a matter of minutes or hours, depending on the size of your database.
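The whole snapshot approach can be sketched as a single shell function. This is only an outline under a few assumptions: the snapshot names are made up for the example, the application has already stopped writing to the source instance, and the built-in CLI waiters are good enough (they give up after a fixed number of polls, so a large database may need a retry around them).

```shell
#!/bin/sh
# Sketch of the snapshot approach: snapshot, encrypted copy, restore.
# Writes to the source instance must stop before the first snapshot is taken.
encrypt_via_snapshot() {
    source_id="$1"     # existing unencrypted instance
    target_id="$2"     # new encrypted instance (hypothetical name)
    kms_key_id="$3"    # ARN or alias of the KMS key

    plain_snap="${source_id}-plain"           # hypothetical snapshot names
    encrypted_snap="${source_id}-encrypted"

    # 1. Snapshot the unencrypted instance and wait until it is available.
    aws rds create-db-snapshot \
        --db-instance-identifier "$source_id" \
        --db-snapshot-identifier "$plain_snap" || return 1
    aws rds wait db-snapshot-available \
        --db-snapshot-identifier "$plain_snap" || return 1

    # 2. Copy the snapshot, encrypting the copy with the given KMS key.
    aws rds copy-db-snapshot \
        --source-db-snapshot-identifier "$plain_snap" \
        --target-db-snapshot-identifier "$encrypted_snap" \
        --kms-key-id "$kms_key_id" || return 1
    aws rds wait db-snapshot-available \
        --db-snapshot-identifier "$encrypted_snap" || return 1

    # 3. Start a new instance from the encrypted copy.
    aws rds restore-db-instance-from-db-snapshot \
        --db-instance-identifier "$target_id" \
        --db-snapshot-identifier "$encrypted_snap" || return 1
    aws rds wait db-instance-available \
        --db-instance-identifier "$target_id"
}
```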

No or limited downtime?

There are at least two more options on how to encrypt the storage for an existing RDS instance:

1) use AWS Database Migration Service, aka DMS: source and target will have the same engine and the same schema, but the target will be encrypted. This may be feasible, but DMS is usually not recommended for homogeneous engines.

2) use a native MySQL read replica with a similar approach to the one documented by AWS to move RDS MySQL Databases from EC2 classic to VPC.

Encrypting and promoting a read replica

Let’s see how we can leverage MySQL native replication to convert an unencrypted RDS instance into an encrypted one with reduced downtime. All the tests below were performed on MySQL 5.7.19 (the latest available on RDS MySQL) but should work on any MySQL 5.6+ deployment. Let’s assume the existing instance is called test-rds01 and has a master user rdsmaster.

1. We create an RDS read replica test-rds01-not-encrypted of the existing instance test-rds01.

aws rds create-db-instance-read-replica --db-instance-identifier test-rds01-not-encrypted --source-db-instance-identifier test-rds01

2. Once the read replica is available, we stop the replication using the RDS procedure “CALL mysql.rds_stop_replication;”. Note that, as we do not have the SUPER privilege on an RDS instance, this procedure is the only way to stop the replication.

$ mysql -h test-rds01-not-encrypted.cqztvd8wmlnh.us-east-1.rds.amazonaws.com -P 3306 -u rdsmaster -pMyDummyPwd --default-character-set=utf8 -e "CALL mysql.rds_stop_replication;"
+---------------------------+
| Message |
+---------------------------+
| Slave is down or disabled |
+---------------------------+

3. We can now save the binary log name and position from the RDS replica, which we will need later on, by calling:

$ mysql -h test-rds01-not-encrypted.cqztvd8wmlnh.us-east-1.rds.amazonaws.com -P 3306 -u rdsmaster -pMyDummyPwd --default-character-set=utf8 -e "show slave status \G"
*************************** 1. row ***************************
Slave_IO_State:
(...)
Relay_Master_Log_File: mysql-bin-changelog.275872
(...)
Exec_Master_Log_Pos: 3110315

4. We can now create a snapshot test-rds01-not-encrypted of the RDS replica test-rds01-not-encrypted as the replication is stopped.

$ aws rds create-db-snapshot --db-snapshot-identifier test-rds01-not-encrypted --db-instance-identifier test-rds01-not-encrypted

5. And once the snapshot test-rds01-not-encrypted is available, copy the content to a new encrypted one test-rds01-encrypted using a new KMS key or the region and account specific default one:

$ aws rds copy-db-snapshot --source-db-snapshot-identifier test-rds01-not-encrypted --target-db-snapshot-identifier test-rds01-encrypted --kms-key-id arn:aws:kms:us-east-1:03257******:key/016de233-693e-4e9c-87e8-******

6. Note that our original RDS instance test-rds01 is still running and available to end users; we are simply building up a large Seconds_Behind_Master. Once the copy is completed, we can start a new RDS instance test-rds01-encrypted in the same subnet group as the original RDS instance test-rds01:

$ aws rds restore-db-instance-from-db-snapshot --db-instance-identifier test-rds01-encrypted --db-snapshot-identifier test-rds01-encrypted --db-subnet-group-name test-rds

7. After waiting for the new instance to be available, let’s make sure that the new and original instances share the same security group and that TCP traffic for MySQL is enabled inside the security group itself. Almost there.

8. We can now connect to the new encrypted standalone instance test-rds01-encrypted and set the external master to make it a MySQL replica of the original one.

mysql> CALL mysql.rds_set_external_master (
-> 'test-rds01.cqztvd8wmlnh.us-east-1.rds.amazonaws.com'
-> , 3306
-> ,'rdsmaster'
-> ,'MyDummyPwd'
-> ,'mysql-bin-changelog.275872'
-> ,3110315
-> ,0
-> );
Query OK, 0 rows affected (0.03 sec)

9. And we can finally start the encrypted MySQL replication on test-rds01-encrypted

mysql> CALL mysql.rds_start_replication;
+-------------------------+
| Message |
+-------------------------+
| Slave running normally. |
+-------------------------+
1 row in set (1.01 sec)

10. If all goes well, we can now check the Slave_IO_State calling

mysql> show slave status \G

on the encrypted MySQL instance. And we should see the value of Seconds_Behind_Master going down checking the status once in a while. For example:

Seconds_Behind_Master: 4561

Once the database catches up – Seconds_Behind_Master is down to zero – we have finally a new encrypted test-rds01-encrypted instance in sync with the original not encrypted test-rds01 RDS instance.

11. We can now restart the replication on the unencrypted RDS read replica test-rds01-not-encrypted, which is still stopped, in the very same way. This makes sure that the binary logs on the master finally get purged and do not keep accumulating.

mysql> CALL mysql.rds_start_replication;
+-------------------------+
| Message |
+-------------------------+
| Slave running normally. |
+-------------------------+
1 row in set (1.01 sec)

12. It is time to promote the read replica and have our application switch to the new encrypted test-rds01-encrypted instance. Our downtime starts here, and as a very first step we want to make test-rds01-encrypted a standalone instance by calling the RDS procedure:

CALL mysql.rds_reset_external_master;

13. We can now point our application to the new encrypted test-rds01-encrypted or we can as well rename our RDS instances to minimize the changes. Let’s go with the swapping approach:

aws rds modify-db-instance --db-instance-identifier test-rds01 --new-db-instance-identifier test-rds01-old --apply-immediately

and once the instance is in available state (usually 1-2 minutes) again:

aws rds modify-db-instance --db-instance-identifier test-rds01-encrypted --new-db-instance-identifier test-rds01 --apply-immediately

We are now ready for the final cleanup, starting with the now useless test-rds01-not-encrypted read replica.

14. Before deleting the old unencrypted test-rds01-old, make sure you do not need its backups anymore: after switching instances, your N days of retention on automatic backups are gone. It is usually better to stop (not delete) the old unencrypted test-rds01-old instance until the N days have passed and the new encrypted test-rds01 instance has the same number of automatic snapshots.

15. Done! You can now enjoy your new encrypted RDS instance test-rds01
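The error-prone part of the procedure is copying the binary log coordinates from step 3 into the call of step 8 by hand. As a hedged sketch, a small helper could read the output of “show slave status \G” from standard input and print the matching statement. The function name is made up for this example, and port 3306 without SSL is assumed, as in the steps above. The values are passed to awk with -v, so a password containing backslashes would need extra care.

```shell
#!/bin/sh
# Sketch: turn the output of "show slave status \G" (read from stdin) into
# the matching CALL mysql.rds_set_external_master statement.
build_set_external_master() {
    master_host="$1"
    repl_user="$2"
    repl_password="$3"

    awk -v host="$master_host" -v user="$repl_user" \
        -v pass="$repl_password" -v q="'" '
        # Each line looks like "Name: value"; leading blanks are ignored.
        $1 == "Relay_Master_Log_File:" { file = $2 }
        $1 == "Exec_Master_Log_Pos:"   { pos = $2 }
        END {
            if (file == "" || pos == "") {
                print "binary log coordinates not found" > "/dev/stderr"
                exit 1
            }
            printf "CALL mysql.rds_set_external_master(%s%s%s, 3306, %s%s%s, %s%s%s, %s%s%s, %s, 0);\n", \
                q, host, q, q, user, q, q, pass, q, q, file, q, pos
        }
    '
}
```

The output of the mysql command of step 3 can be piped straight into it, for example: mysql (...) -e "show slave status \G" | build_set_external_master test-rds01.cqztvd8wmlnh.us-east-1.rds.amazonaws.com rdsmaster MyDummyPwd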

In short

Is downtime not important? Create an encrypted snapshot copy and start a new RDS instance from it. Otherwise, use MySQL replication to create the encrypted RDS instance while your instance is running, and swap them when you are ready.

Alexa & Grandma

The sound is not as good as a Bose speaker. I sadly had to remove a few broken valves to make space for a working audio device. But setting up Alexa and testing my Big World skill on Grandma’s radio is priceless. Next step, setting it up inside the radio.


Triggering a failover when running out of credits on db.t2

For the last 3 years, RDS has offered the option of running a MySQL database on the T2 instance type, currently from db.t2.micro to db.t2.large. These are low-cost standard instances that provide a baseline level of CPU performance, depending on the size, with the ability to burst above the baseline using a credit approach.

You can find more information on the RDS Burst Capable Current Generation (db.t2) instances and on CPU credits in the AWS documentation.

What happens when you run out of credits?

There are two metrics you should monitor in CloudWatch: CPUCreditUsage and CPUCreditBalance. When your CPUCreditBalance approaches zero, your CPU usage is going to be capped at the baseline and your database will start having performance issues.

It is usually better to have some alarms in place to prevent that, either checking for a minimum number of credits or spotting a significant drop in a time interval. But what can you do when you hit the bottom and your instance remains at the baseline performance level?
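An alarm on a minimum number of credits could be created along these lines. It is a sketch only: the alarm name, the threshold and the SNS topic are hypothetical values supplied by the caller, and the 5 minute period is just one reasonable choice. CloudWatch publishes the metric as CPUCreditBalance in the AWS/RDS namespace.

```shell
#!/bin/sh
# Sketch: alarm when the average CPU credit balance of an RDS instance
# drops below a minimum number of credits.
create_credit_alarm() {
    instance_id="$1"
    min_credits="$2"      # e.g. 20, an arbitrary example threshold
    sns_topic_arn="$3"    # hypothetical SNS topic receiving the notification

    aws cloudwatch put-metric-alarm \
        --alarm-name "${instance_id}-low-cpu-credits" \
        --namespace AWS/RDS \
        --metric-name CPUCreditBalance \
        --dimensions "Name=DBInstanceIdentifier,Value=${instance_id}" \
        --statistic Average \
        --period 300 \
        --evaluation-periods 1 \
        --threshold "$min_credits" \
        --comparison-operator LessThanThreshold \
        --alarm-actions "$sns_topic_arn"
}
```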

How to recover from a zero credit scenario?

The most obvious approach is to increase the instance class (for example from db.t2.medium to db.t2.large) or switch to a different instance type, for example a db.m4 or db.c3 instance. But it might not be the best approach when your end users are suffering: if you are running a Multi-AZ database in production, this is likely not the fastest option to recover, as it first requires a change of instance type on the passive master and then a failover to the new master.

[Screenshot: CPU credit balance in CloudWatch]

You can instead try a simple reboot with failover: the credit balance you see on CloudWatch is based on the currently active host, and your passive master usually still has credits available, as it is usually less loaded than the active one. As in the screenshot above, you might gain 80 credits without any cost and with a simple DNS change that minimizes the downtime.
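From the CLI, the reboot with failover is a single call. The wrapper below is a minimal sketch with a made-up function name; the --force-failover option is only valid for a Multi-AZ instance.

```shell
#!/bin/sh
# Sketch: reboot a Multi-AZ RDS instance and fail over to the standby host.
reboot_with_failover() {
    instance_id="$1"

    aws rds reboot-db-instance \
        --db-instance-identifier "$instance_id" \
        --force-failover
}
```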

Do I still need to change the instance class?

Yes. Performing a reboot with failover is simply a way to reduce your recovery time when having issues related to the capped CPUs and gain some time. It is not a long term solution as you will most likely run out of credits again if you do not change your instance class soon.

To summarize, triggering a failover on a Multi-AZ RDS running on T2 is usually a faster way to gain credits than immediately modifying the instance class.