I am taking three months off from public speaking to recharge my batteries and work on new content, but I will be back on stage next February at CloudWorld 2021 to talk about cost optimization on AWS, with a special focus on storage and data. This is a new topic that I have never covered before but that I am really passionate about. The title and abstract of my session at DeveloperWeek are below.
Learn how to better manage your costs on AWS, make your bill less scary or your credits last longer
Should you really always run your cluster in multiple availability zones? How can a transition rule to Glacier double your storage costs? I want to monitor and understand my data transfer costs, where should I start? Why are backups eating my database budget? What, one more storage class on S3? Following so-called “best practices” works only when you fully understand the implications, costs included. We will address a few cloud anti-patterns, making your bill smaller and your deployment better.
Looking forward to one more virtual conference in 2021!
Excited, surprised, humbled at being recognised by the AWS Heroes program. I am now an AWS Data Hero, the first one in Germany. Looking forward to this new opportunity and challenge!
A recap of the news articles I wrote for InfoQ in September 2020.
Multi-Cloud: Worst Practice or the Future of Public Cloud?
Corey Quinn, cloud economist at The Duckbill Group, recently argued that multi-cloud is “the worst practice to be avoided by default”. Not everyone agrees.
Google Cloud has recently made MySQL 8.0 available on Cloud SQL, the managed relational database service for MySQL, PostgreSQL, and SQL Server.
AWS Introduces New EBS Volume io2 With Higher Durability and IOPS/GiB
AWS recently introduced a new provisioned IOPS volume type (io2) for high-performance databases and workloads that offers a durability of 99.999% and the ability to provision up to 500 IOPS for every GiB of storage.
Public Beta of Google Cloud API Gateway Now Available
At Google’s recent Cloud Next virtual conference, Google announced the public beta of API Gateway, a fully managed Google Cloud service to create and monitor APIs for serverless workloads.
Is the AWS Free Tier Really Free?
Corey Quinn, cloud economist at The Duckbill Group, argues that the free tier in AWS is broken and AWS should change it. The free models of the main cloud providers differ and might not help beginners in following best practices in cloud deployments.
Using Serverless Backends to Iterate Quickly on Web Apps
In a series of three technical articles, AWS has recently shown the advantage of building serverless backends to iterate quickly on web apps and follow changing product requirements. This development methodology and architecture allow flexibility but increase coupling with cloud vendor services.
Amazon CloudWatch Dashboards Supports Sharing
AWS recently introduced the ability to share Amazon CloudWatch Dashboards with users who do not have access to the AWS account. This feature opens up new use cases for dashboards, including sharing metrics and information on big screens, or embedding real-time information in public pages.
The AWS Serverless LAMP Stack: The Future of PHP or Vendor Lock-in?
In a series of three technical articles, AWS has recently introduced the new “Serverless LAMP stack”. But not everyone in the open-source community believes that the successor of the LAMP stack is proprietary technologies from a single vendor, and alternative approaches have been suggested.
Amazon RDS Proxy is a new fully managed, highly available database proxy for MySQL and PostgreSQL databases running on Amazon RDS and Aurora. The service is tailored to serverless architectures and other applications that open and close database connections at a high rate.
At re:Invent in Las Vegas in December 2019, AWS announced the public preview of RDS Proxy, a fully managed database proxy that sits between your application and RDS. The new service offers to “share established database connections, improving database efficiency and application scalability”.
One of the key features was the ability to increase application availability, significantly reducing failover times on a Multi-AZ RDS instance. Results were indeed impressive.
But a key limitation was that there was no way to change the instance size or class once the proxy had been created. That meant the proxy could not be used to reduce downtime during a vertical scaling of the cluster, and it made the deployment less elastic.
Time for a second look?
Last week AWS finally announced the GA of RDS Proxy and I thought it was a good time to take a second look at the service. Any further improvements in the failover? Can you now change the instance size once the proxy has been created?
Weird defaults?
One of the first, and few, values you have to choose when you set up an Amazon RDS Proxy is the idle client connection timeout. It is already hard to figure out the optimal value in an ideal scenario. But a user interface that suggests a default of 30 minutes next to a label that states “Max: 5 minutes” makes it even more difficult, especially since the drop-down list lets you set any value up to 1 hour.
5 or 30 minutes?
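One way to sidestep the ambiguity in the console is to set the value explicitly. The following is a minimal sketch, assuming the AWS CLI is installed and configured for the account that owns the proxy; the helper name set_proxy_idle_timeout is hypothetical and only wraps the `aws rds modify-db-proxy` command, which takes the timeout in seconds.

```shell
# Hypothetical helper (not part of the AWS CLI): set the idle client
# connection timeout of an existing RDS Proxy to an explicit value.
# Usage: set_proxy_idle_timeout PROXY_NAME MINUTES
set_proxy_idle_timeout() {
  local proxy_name="$1" minutes="$2"
  # modify-db-proxy expects the timeout in seconds, so convert from minutes.
  aws rds modify-db-proxy \
    --db-proxy-name "$proxy_name" \
    --idle-client-timeout "$(( minutes * 60 ))"
}
```

For the proxy used in this post, `set_proxy_idle_timeout test-proxy 30` would request the 30 minutes suggested by the console default.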
Let us play!
Once again I created a test-rds and a test-proxy, and I decided to perform the very same basic tests I did last December. I started two while loops in Bash, relying on the MySQL client, each one asking the database for the current date and time every 2 seconds:
$ while true; do mysql -s -N -h test-proxy.proxy-***.eu-central-1.rds.amazonaws.com -u testuser -e "select now()"; sleep 2; done
$ while true; do mysql -s -N -h test-rds.***.eu-central-1.rds.amazonaws.com -u testuser -e "select now()"; sleep 2; done
The difference between the test-proxy and the test-rds is significant: it takes 132 seconds for the RDS endpoint to recover versus only 20 seconds for the proxy. Amazing difference and even better than what AWS promises in a more reliable and significant test.
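Recovery times like these can be read off the loop output by hand, but a small wrapper makes the measurement repeatable. This is only a sketch of one possible approach, not the script used for the numbers above: measure_outage and PROBE_INTERVAL are illustrative names, and the result is only as precise as the probe interval plus the client's own connection timeout.

```shell
#!/usr/bin/env bash
# Sketch: report how long a probe command keeps failing.
# measure_outage and PROBE_INTERVAL are illustrative names, not AWS tooling.

PROBE_INTERVAL="${PROBE_INTERVAL:-2}"   # seconds between probes

# measure_outage CMD [ARGS...]
# Runs CMD repeatedly until it fails once, then keeps running it until it
# succeeds again, and prints the seconds elapsed between the first failure
# and the first successful probe after it.
measure_outage() {
  local down_since
  # Phase 1: wait for the first failed probe.
  while "$@" >/dev/null 2>&1; do
    sleep "$PROBE_INTERVAL"
  done
  down_since=$(date +%s)
  # Phase 2: wait for the endpoint to answer again.
  until "$@" >/dev/null 2>&1; do
    sleep "$PROBE_INTERVAL"
  done
  echo $(( $(date +%s) - down_since ))
}
```

Started before the failover is triggered, `measure_outage mysql -s -N -h test-proxy.proxy-***.eu-central-1.rds.amazonaws.com -u testuser -e "select now()"` would print the outage in seconds for the proxy endpoint, and the same call with the test-rds host would do so for the direct endpoint.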
But what happens when I trigger a change of the instance type?
While the numbers for the test-rds do not change significantly, the proxy is simply gone. Once the database cluster behind it changes, the proxy endpoint is still available but it no longer connects to the database. Changing the timeout does not help, and there is no simple way to recover.
test-proxy
ERROR 9501 (HY000) at line 1: Timed-out waiting to acquire database connection
ERROR 9501 (HY000) at line 1: Timed-out waiting to acquire database connection
ERROR 9501 (HY000) at line 1: Timed-out waiting to acquire database connection
ERROR 9501 (HY000) at line 1: Timed-out waiting to acquire database connection
ERROR 9501 (HY000) at line 1: Timed-out waiting to acquire database connection
Amazon RDS Proxy is a very interesting service, and it could be an essential component in many deployments where increasing application availability is critical. But I would have expected a few more improvements since the first preview. The lack of support for instance changes still makes it hard to integrate the proxy in many scenarios where RDS is currently used.
Yesterday I had the chance to talk live with Federico Razzoli about cost optimization on the cloud, with the main focus on relational databases. How to save a few dollars running MySQL on AWS? What about RDS? Check out the video, hope you find some useful tips!
Cost optimization on AWS, a live conversation with Federico Razzoli
We all love metrics. We all need numbers. And different stakeholders need different numbers. Numbers that will drive key decisions inside your organization and for your customers. Becoming a data driven organization requires having reliable data in the first place (…)
You can read my post about generating reports and KPIs with throw-away databases on the Funambol Tech Blog: how we decoupled reporting and user activity, leveraging RDS snapshots to generate throw-away copies of our MySQL databases on AWS.
Funambol Tech Blog – Walking the tight rope of cloud development.