From the PowerShell Custom Runtime for Amazon Lambda to MongoDB Atlas Serverless, from SynLapse, a critical Synapse Analytics vulnerability in Azure, to AWS IoT ExpressLink: a recap of my articles for InfoQ in June.
AWS Releases IoT ExpressLink: Cloud-Connectivity Software for Hardware Modules
Cockroach Labs recently released their annual cloud report, which evaluates the performance of AWS, Microsoft Azure and Google Cloud for common OLTP workloads. Unlike in past editions, this year’s report does not indicate a best overall provider, but it concludes that AMD instances outperform Intel ones. ARM instances were not covered in the tests.
MongoDB Atlas Serverless Instances and Data API Now Generally Available
At the recent MongoDB World 2022 conference, MongoDB announced that serverless instances for Atlas and Data API are now generally available. The new managed serverless option introduces a tiered pricing, with automatic discounts on daily usage.
AWS Introduces PowerShell Custom Runtime for Lambda
AWS recently announced a new PowerShell custom runtime for AWS Lambda to run Lambda functions written in PowerShell. With the new runtime developers can write native PowerShell code in Lambda without having to compile it, simplifying deployment and testing.
This is the first article of a two-part series playing with Amazon Rekognition and flags from around the world. Today we will focus on testing the default behavior of Rekognition Image; in the second part we will use Rekognition Custom Labels to build a custom machine learning model and detect national flags.
Object and scene detection is a key feature of Rekognition Image and a powerful technology to search and filter large image collections. It is possible to build applications that search and organize images, assigning labels based on the visual content. According to the documentation:
The DetectLabels API lets you automatically identify thousands of objects, scenes, and concepts and returns a confidence score for each label. DetectLabels uses a default confidence threshold of 50.
Rekognition supports thousands of labels belonging to common categories. Will it recognize a flag? Can we use the service to map and search national flags? The AWS Management Console offers a very good “try demo” option that works for any .jpeg or .png image no larger than 5MB. Let’s give it a try, uploading a photo I took at the Haus der Kulturen der Welt in Berlin last week.
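The same test can be scripted outside the console with the AWS CLI. A minimal sketch, assuming the photo has first been uploaded to an S3 bucket (the bucket and object names below are hypothetical):

```shell
# Detect labels for an image stored in S3 (bucket and key are placeholders).
# --min-confidence mirrors the default threshold of 50 mentioned in the docs.
aws rekognition detect-labels \
  --image '{"S3Object":{"Bucket":"my-flag-images","Name":"berlin-flags.jpg"}}' \
  --min-confidence 50 \
  --query 'Labels[].[Name,Confidence]' \
  --output text
```

The `--query` expression keeps only the label names and confidence scores from the response, which is enough to check whether “Flag” appears and how confident the service is.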
Good news! The label “Flag” is definitely supported by Rekognition Image and the confidence is very high, 99.6%.
As a dataset of regional and national flags, we will use the ones from the open source repository region-flags, a collection of flags for geographic region and subregion codes maintained by Google.
Graphic images of flags are not the only option, or even the main testing scenario: there are also photos with flags. For real pictures where a flag is present, we will rely on Unsplash, a stock photography sharing website with a liberal license.
We might test the entire region-flags dataset automatically later on using the SDK, but for now we can start manually with a few selected countries and the management console.
The Stars and Stripes
The obvious benchmark is the flag of the United States of America. It should be an easy one to detect, given that the US is likely the main market for the product and the source of the initial training dataset.
Amazon Rekognition labels it as a “Flag” with a 99.7% confidence and “American Flag” with a 92.5% confidence (click on the images to see the data from the AWS console).
A good and reliable result. But is it enough to trust the service for labeling flags worldwide? How does Amazon Rekognition perform when we analyze real pictures including the Stars and Stripes?
Let’s add a second photo from Unsplash:
The two photos have a confidence of 76.2% and 79.5% for the “American Flag” label. The value drops to 70.1% if we instead choose a flag in a less common position.
Not ideal, but the confidence is still pretty good: all three images score between 70% and 80%. Can we then set our confidence threshold at 70% and skip further validation of the responses? Time to move to different countries to understand how Rekognition works, what the confidence level represents, and how to avoid side effects.
Amazon, this is not the Stars and Stripes!
Let us cross the Florida Straits and use the flag of Cuba. Some similarities in colors and patterns, but it is a very different flag and country.
The confidence level for the label flag is still very high (99.4%), but surprisingly Rekognition scores 82.5% for the American flag. Is it a glitch or a side effect of similarities in shape and design? **Is the model not trained?**
As an Italian cloud architect, the natural choice for a completely different flag is il Tricolore: different colors, different patterns. But at 87.6%, the confidence for the label American flag is even higher than for Cuba.
What is going on here? Let’s try a real picture, not just a graphic of the national flag.
Rekognition is doing an amazing job detecting both flags in the photo, but it is not able to recognize them as Italian: it labels them as American flags.
The American flag or the Malaysian one?
We could compare many other national flags, but the results would be very similar. Most are correctly labeled as flags, especially the ones with distinct stripes, but they are also labeled as American flags with varying levels of confidence.
No other national flag label is detected: no “Italian Flag”, no “Cuban Flag”. **Amazon Rekognition uses a hierarchical taxonomy of ancestor labels to categorize labels, but apparently the only child of “Flag” is “American Flag”.**
The closer a flag resembles the Stars and Stripes, the higher the confidence level: for the flag of Malaysia (92.4%), the value is very similar to the one for the United States (92.5%). Setting an arbitrarily high confidence level might help reduce failures, but it is not the strongest safety net for flags. We get a surprisingly similar result with a Malaysian flag from Unsplash: an 83.8% confidence for the American flag, even higher than what we saw for the real flags of the United States, which scored between 70% and 80%.
Not a flag
Mapping any flag to one of the United States is only part of the issue. Other images might instead raise questionable labels. For example, the _Bandeira de Portugal_ might be a positive result when your users search for dynamite, weapon, bomb or weaponry, all labels with a high 88.3% confidence level. “Flag” is not among the top labels, as the PNG file is not recognized as a flag.
The Portuguese flag performs better in a real scenario, where it is once again labeled as flag, with a confidence level of 99.9%.
The above results have been consistent while testing Rekognition Image over the last two years, but I wanted to check if something has changed recently. Given the ongoing war in Ukraine and the popularity of the Ukrainian flag in the last four months, I checked how Amazon Rekognition detects it.
The results are very similar to the Portuguese ones. The flag itself is not recognized as such, with labels that are way off, topped by “Home Decor”. Once a real picture is used, it is labeled as a flag with high confidence (above 99%) but again with no match for the country.
Any feedback from Amazon?
The first time I noticed the issue, I opened a ticket with AWS Support and the feedback was straightforward:
I was in touch with the Rekognition engineering team as well as the Rekognition product team and I have relayed your message over to them. They acknowledged that Rekognition is currently not trained to identify the flags.
That is fair, but I would then recommend removing the value “American Flag” for as long as it is the only child of “Flag”: it provides little benefit and mostly false positives in non-US scenarios.
What have we learned so far? Is it worth using Rekognition Image?
Rekognition Image does a good job detecting objects as flags in photos taken in very different conditions. It is not able, however, to tell different flags apart, incorrectly labeling most of them as American flags.
**Accuracy is always relative.** Setting an arbitrary confidence level for the results of Amazon Rekognition is not enough. National flags are not the most important label or training scenario for machine learning, but they are a good example of the challenges that arise when image detection results are not handled properly.
**This is not an AWS problem, it is our problem as developers** integrating a managed service like Rekognition Image into our product or service. You are the public face for your end users: if you integrate image recognition capabilities into your application, you have to manage the risks and challenges yourself.
If you need more reliable data, you need to take the next step in the image detection journey.
It is now time to take the problem into our own hands and train a model to better recognize flags. How can we achieve that with Rekognition Custom Labels? How is Amazon Rekognition going to perform? We will discuss that in a separate article.
Thanks for making it this far! I am always looking for feedback to make it better, so please feel free to reach out to me via LinkedIn or email.
I am a lazy cloud architect with a background in site reliability engineering. That’s why I immediately fell in love with the idea behind CloudWatch Anomaly Detection when it was announced almost three years ago.
What is anomaly detection?
Regardless of the algorithm used to determine the outliers, anomaly detection is the process of discovering values that differ considerably from the majority of the data and should raise suspicions and alarms. The availability of a managed service, based on machine learning, that alerts an SRE if something goes wrong is too good to be ignored. CloudWatch Anomaly Detection is that option, without integrating third-party tools or relying on more complex services like Amazon Lookout for Metrics.
Configuring CloudWatch Anomaly Detection
In a few seconds you can add an alarm that will help monitor even the simplest website, with a pricing that is neither high nor complicated. What can go wrong with Anomaly Detection? Not much, as long as you do not consider it a catch-all alarm replacing every other one you have configured in CloudWatch.
While the expected values represent normal metric behavior, the threshold of Anomaly Detection is based on standard deviation, as the label in the console suggests: “Based on a standard deviation. Higher number means thicker band, lower number means thinner band”.
The only nontrivial step in the setup is choosing the threshold: what is a good number? A small one, with possibly many false alarms? A high one, with the chance of missing some outliers? A bigger challenge is remembering that the algorithm cannot know the constraints of your system or the logic behind your product. Let’s give it a try.
Monitoring coconut orders
Let’s assume you have a successful website where you sell coconuts and you want to monitor the number of completed purchases per minute. You have thousands of orders at peak time, a few hundred during the night, with some daily and weekly patterns. Lucky you, that is a lot of coconuts! How can you monitor the online shop? How do you adapt the alarms for seasonality and trend changes?
Without Anomaly Detection, you should have at least two static alarms in CloudWatch to catch the following cases:
the “Zero Orders” scenario: it likely indicates that something is broken in the shop. A simple static alarm, catching zero values for the shortest sensible period will not raise many false positives.
the “Black Friday” scenario: it is much harder to define a safe upper boundary, but you can, for example, create an alarm at 130% of the maximum value reached in the previous month.
Neither of these static alarms helps if orders fall by half during the day or if the pattern suddenly changes and you lose 30% of your daily orders. You still do not account for seasonality, but these static alarms are better than no monitoring at all.
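As a sketch of the first scenario, the “Zero Orders” alarm can be created with a single AWS CLI call. The namespace, metric, and SNS topic below are illustrative names, not taken from a real setup:

```shell
# Static alarm: trigger when the Orders metric stays at zero
# for 5 consecutive one-minute periods (all names are placeholders).
aws cloudwatch put-metric-alarm \
  --alarm-name "coconut-zero-orders" \
  --namespace "CoconutShop" \
  --metric-name "Orders" \
  --statistic Sum \
  --period 60 \
  --evaluation-periods 5 \
  --threshold 0 \
  --comparison-operator LessThanOrEqualToThreshold \
  --treat-missing-data breaching \
  --alarm-actions "arn:aws:sns:eu-west-1:123456789012:on-call"
```

Treating missing data as breaching matters here: a broken shop may publish no data points at all, not zeros.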
Here comes CloudWatch Anomaly Detection: with a few clicks, you can configure an alarm and be notified when the pattern of the orders changes.
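Under the hood, such an alarm is based on the ANOMALY_DETECTION_BAND metric math expression. A hedged sketch with the CLI, where the namespace and metric name are illustrative and the band width of 2 standard deviations is the console default:

```shell
# Anomaly detection alarm: trigger when Orders leaves the expected band.
# m1 is the raw metric, ad1 the band; 2 is the standard-deviation threshold.
aws cloudwatch put-metric-alarm \
  --alarm-name "coconut-orders-anomaly" \
  --comparison-operator LessThanLowerOrGreaterThanUpperThreshold \
  --evaluation-periods 3 \
  --threshold-metric-id ad1 \
  --metrics '[
    {"Id":"m1","ReturnData":true,"MetricStat":{"Metric":{"Namespace":"CoconutShop","MetricName":"Orders"},"Period":60,"Stat":"Sum"}},
    {"Id":"ad1","Expression":"ANOMALY_DETECTION_BAND(m1, 2)"}
  ]'
```

Note the comparison operator: the alarm fires when the metric goes below the lower or above the upper bound of the band, covering both the “orders dropped” and the “suspicious spike” cases.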
Can you simply configure the smart alarm, discard the static ones and trust the magic of machine learning? Let’s take a step back and look at one of the very first presentations of Anomaly Detection.
The example used to highlight the seasonality and the benefits of the new option shows a range band – regardless of how many standard deviations – with negative values. But the ConsumedWriteCapacityUnits metric cannot be negative. A subpar example?
Going below zero
ConsumedWriteCapacityUnits is not a corner case: most AWS and custom metrics only have positive values. Randomly selecting some metrics in the dashboard:
you cannot have negative orders in the coconut (custom) metric
you cannot have negative IOPS on RDS
you cannot have a negative CPU or ACU for Aurora Serverless
Considering hundreds of metrics, only a few can occasionally go below zero. But the gray band of Anomaly Detection often does.
If you set up a static zero alarm as previously discussed, keep it: an alarm based on Anomaly Detection might not react as quickly as a static one. The ML option can help find outliers, but it is not the fastest way to catch a broken system with no orders.
For example, during the quieter hours, a “zero orders” scenario would not immediately be an outlier.
Ideally there should be a flag in CloudWatch to enforce positive values. But only you know the pattern of your service, and a strength of CloudWatch Anomaly Detection is the simple setup: it just works.
Let’s do a simple test to show the difference between constrained values and an algorithm based on machine learning. Let’s roll a dice.
Rolling a dice
One dice, six faces, and numbers between 1 and 6. No pattern and no outliers. There are no 0s and no 7s; there are no values outside the fixed range when you roll a dice. But Anomaly Detection cannot know that.
How can we test it? Let’s roll a dice in CloudWatch with the AWS CLI and a one line bash script roll-a-dice:
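A minimal sketch of such a script; the namespace and metric name below are my own illustrative choices:

```shell
#!/bin/bash
# roll-a-dice: publish a random value between 1 and 6 as a custom
# CloudWatch metric (namespace and metric name are placeholders)
aws cloudwatch put-metric-data --namespace "Dice" --metric-name "roll" --value $(( RANDOM % 6 + 1 ))
```

The arithmetic expansion maps bash's RANDOM to the 1-6 range of a die.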
Adding the script to the crontab, we can have a new random value in CloudWatch every minute.
* * * * * /home/ubuntu/roll-a-dice
We now set up Anomaly Detection on the custom dice metric, wait a few days, and see what the AWS algorithm thinks of the random pattern. How is it going to apply machine learning algorithms to the dice’s past data and create a model of the expected values?
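While the model can be created from the console with a few clicks, there is an equivalent CLI call; a sketch, assuming the same illustrative namespace and metric name used for the dice:

```shell
# Create an anomaly detection model on the custom dice metric,
# based on the Average statistic (names are placeholders).
aws cloudwatch put-anomaly-detector --namespace "Dice" --metric-name "roll" --stat Average
```

The model starts training on historical data as soon as it is created; the alarm on top of it is a separate step.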
Anomaly Detection is doing a good job given the circumstances but a zero or a seven might not (immediately) trigger an alarm.
Rolling a dice is way too simple and has no predictable pattern, but if your values have hard boundaries, you should have a separate static alarm for them. Relying only on Anomaly Detection is suboptimal. Let’s now challenge CloudWatch and the AWS algorithm with something more complicated: a skyline.
Drawing the NYC skyline
Last year I presented a session at re:Invent, drawing the NYC skyline with Aurora Serverless v2. A SQL script triggered the spikes in the CPU and the Aurora Capacity Unit (ACU) of the serverless database, drawing a basic skyline of New York City in CloudWatch.
Let’s run that SQL script multiple times, for days, for weeks. Is CloudWatch Anomaly Detection going to forecast the NYC skyline?
Reusing the same logic from re:Invent, we can run it on an Aurora Serverless v2 endpoint, adding a 30-minute sleep between executions and looping. This translates to a single bash command:
while true; do mysql -h nyc.cluster-cbnlqpz*****.eu-west-1.rds.amazonaws.com -u nyc < nyc.sql; sleep 1800; done;
Unfortunately, even after a couple of weeks, the range of Anomaly Detection is still not acceptable.
What is the problem here? A key sentence explains how the service works: “Anomaly detection algorithms account for the seasonality and trend changes of metrics. The seasonality changes could be hourly, daily, or weekly”.
Our loop has a fixed period, but it is not hourly, daily or weekly: it is 30 minutes plus the execution time of the SQL script. The data points at 7:47 UTC and 8:47 UTC are unrelated, and the data points at 7:47 UTC on different days have nothing in common. We do not have a standard, supported seasonality.
But is this really the problem? Let’s change the approach slightly and run the SQL script hourly. It is a single line in the crontab:
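Assuming the SQL run is wrapped in a small script (the path below is hypothetical), the crontab entry could look like this:

```shell
# Run the NYC skyline SQL script at the top of every hour
0 * * * * /home/ubuntu/nyc-skyline
```

The key difference from the previous loop is that executions now start at a fixed minute, giving the data a proper hourly period.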
Does the new period work better with Anomaly Detection? Let’s wait a few days and see the new forecasted range.
After a couple of days the overlap is still not perfect and the baseline for the CPU is generous, but there is now a clear pattern. The outliers are not too different from the ones we saw with the coconuts.
If we suddenly change the crontab entry from hourly to every two hours, we notice that Anomaly Detection was indeed forecasting an hourly pattern.
The seasonality of the data is a key element. A periodic pattern is not enough, an hourly, daily or weekly one is required.
What did we learn? Is it worth using CloudWatch Anomaly Detection?
CloudWatch Anomaly Detection is easy to configure, almost free, and a great addition to a monitoring setup. There are very few reasons not to use it.
You should add Anomaly Detection to your existing static alarms in CloudWatch, not simply replace them.
Make sure that your pattern is hourly, daily, or weekly.
Thanks for making it this far! I am always looking for feedback to make it better, so please feel free to reach out to me via LinkedIn or email.
Coconut photo by Tijana Drndarski and dice photo by Riho Kroll. Re:Invent photo by Goran Opacic. All other photos and screenshots by the author. The AWS bill for running these tests was approximately 120 USD, mainly ACU for Aurora Serverless. Thanks AWS for the credits. Thanks to Stefano Nichele for some useful discussions about the benefits and challenges of CloudWatch Anomaly Detection.
From Google Cloud Media CDN to EC2 I4i Instances, from AlloyDB to SageMaker Serverless Inference: a recap of my articles for InfoQ in May.
AWS Releases First Graviton3 Instances
AWS has recently announced the general availability of the C7g instances, the first EC2 instances running Graviton3 processors. Designed for compute-intensive workloads, they provide always-on memory encryption, dedicated caches for every vCPU, and support for pointer authentication.
Amazon Rekognition Introduces Streaming Video Events
AWS recently announced the general availability of Streaming Video Events, a new feature of Amazon Rekognition to provide real-time alerts on live video streams.
Google Cloud Introduces PostgreSQL-Compatible AlloyDB for Enterprise Database Workloads
Google Cloud recently announced AlloyDB for PostgreSQL, a managed PostgreSQL-compatible service targeting enterprise deployments. AlloyDB is a full-featured cloud database supporting atomicity, consistency, isolation and durability (ACID)-compliant transactions.
AWS Introduces Storage-Optimized I4i Instances for IO-Heavy Workloads
AWS recently introduced the EC2 I4i instance type for data-intensive storage and IO-heavy workloads requiring fast access to medium-sized datasets. The new instances can benefit high-performance real-time relational databases, distributed file systems, data warehouses and key-value stores.
Google Cloud Introduces Media CDN for Content Delivery
Google Cloud recently announced the general availability of Media CDN, a content delivery network targeted to media and entertainment companies. The streaming platform supports advertising insertion and AI/ML analytics.
Amazon SageMaker Serverless Inference Now Generally Available
Amazon recently announced that SageMaker Serverless Inference is generally available. Designed for workloads with intermittent or infrequent traffic patterns, the new option provisions and scales compute capacity according to the volume of inference requests the model receives.
Amazon MSK Serverless Now Generally Available
AWS recently announced that Amazon MSK Serverless is now generally available. The serverless option to manage an Apache Kafka cluster removes the need to monitor capacity and automatically balances partitions within a cluster.
More news? A recap of my articles for InfoQ in April.
This autumn I will be back at the Serverless Architecture Con, this time in Berlin, to talk about serverless databases. The title of my session? The Future of Relational Databases on the Cloud.
The major cloud providers offer different options to run a relational database on the cloud. A recent approach is to rely on so-called serverless databases that offer both traditional TCP connections and HTTP API access. In a short journey through databases on the cloud, we will compare different approaches and services, and explore the main benefits and limitations of a serverless RDBMS versus a more traditional managed database.
From Fauna transactional database to infrastructure as SQL on AWS, from RDS and Aurora PostgreSQL vulnerabilities to AWS Firewall Manager: a recap of my articles for InfoQ in April.
Infrastructure as SQL on AWS: IaSQL is Now Open Source and SaaS
IaSQL, the company behind a service that models AWS infrastructure using SQL, has recently announced that IaSQL is available as open source and software as a service.
Amazon EC2 Introduces Automatic Recovery of Instances by Default
Amazon recently announced that EC2 instances will now automatically recover in case they become unreachable due to underlying hardware issues. Automatic recovery migrates the instance to a different hardware while retaining instance ID, private IP addresses, Elastic IP address, and metadata.
RDS and Aurora PostgreSQL Vulnerability Leads to AWS Deprecating Many Minor Versions
A researcher at the security company Lightspin recently explained how she obtained credentials to an internal AWS service using a PostgreSQL extension and exploiting a local file read vulnerability on RDS. AWS confirmed the issue and deprecated dozens of minor versions of Amazon Aurora and RDS for PostgreSQL.
AWS Introduces Lambda Function URLs to Simplify Serverless Deployments
AWS recently announced the general availability of Lambda Function URLs, a feature that lets developers directly configure an HTTPS endpoint and CORS headers for a Lambda function without provisioning other services.
Fauna, the company behind the Fauna transactional database, recently announced the general availability of event streaming, a push-based stream that sends changes at both the document and collection levels to subscribed clients.
More news? A recap of my articles for InfoQ in March.
Today AWS announced that Amazon Aurora Serverless v2 is generally available. Below is a screenshot of my very first test of the new serverless cluster. For more data and my experiments with the preview, take a look at my “Drawing the New York City skyline with Amazon Aurora Serverless v2” session that I presented at re:Invent 2021.