Flagging Flags: Nine Numbers with Amazon Rekognition

I recently published an article where I played with Amazon Rekognition and flags from around the world. A few friends and developers asked for more numbers, either out of curiosity or because they had suggestions or doubts.

Is the “Stars and Stripes” the flag with the highest confidence for the label “American Flag”? Are all the flags labelled as “Flag”? Does the quality of the PNG file affect the label detection? 

Before training a model to better recognize flags with Rekognition Custom Labels, I decided to publish more results and the full dataset. Here are nine numbers and trends for the 255 flags available in the repository. Once more I rely on the images from the open source region-flags, a collection of flags for geographic region and subregion codes maintained by Google. 

128 Flags

The outcome is a coin toss: almost exactly 50% (128 out of 255) of the flags are labelled “Flag”. Only 98 of them have a confidence above 90%, 73 above 98% and 56 above 99%. OK, a flag is not always a flag. When in doubt, toss a coin: it is a cheaper algorithm than an API request.

Flag is a Flag

98 American Flag

As we already noticed, many flags, including the Cuban and Malaysian ones, are labelled as “American Flag”. How many PNG files are decoded as “American Flag”? There are 98 of them, with 27 above 90% and one above 98%. A high confidence level alone is not always a safety net.

Flag US

Only one Stars and Stripes

The flag with the highest confidence for “American Flag” is the one of Peru, not the United States. No kidding, a very high 98.3%. The real “Stars and Stripes” is not even in the top ten for “American Flag”.

Flag Peru

No Syrup, two Maples

Two labels, “Maple” and “Maple Leaf”, matched the Canadian flag, and only the Canadian one. Perfect match. Well done, Rekognition!

Flag Canada

Eleven Outdoor Flags

What do Slovenia, Laos and Kosovo have in common? Their flags are all labelled “Outdoor”, Laos with a staggering 99.47% confidence level. Whether you are looking for rock climbing in Nong Khiaw or kayaking through Si Phan Don, the flag of Laos is apparently the country’s best marketing tool.

Flag Laos

One Lollipop

There is only one lollipop detected. And we cannot even share it. The flag of Dominica, which features a sisserou parrot, the national bird emblem, got the only (incorrect) candy.

Flag Dominica

61 Stars

Almost a quarter of the flags have the “Star Symbol”, 25 with a confidence above 90% and the European Union leading at 97%. Brexit or not, the twelve golden stars on a blue background are an easy catch for Rekognition.

Flag EU

239 Symbols

Almost every flag has “Symbol” as a label (94%), with 75 of them at 99% confidence level. India, Georgia and Peru are all above 99.99%. Whatever Symbol means.

Flag India

19 Animals

From Kiribati to Mexico, from American Samoa to Uganda, Rekognition does a good job finding animals inside flags: of the 19 decoded, only 3 are false positives. While the parent label (Rekognition uses a hierarchical taxonomy of ancestor labels) is good, the species itself is often wrong: a “Penguin” for Uganda, a “Chicken” for Mexico. Whoops.

Flag Uganda
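
Each label in a DetectLabels response carries a Parents list with its ancestor labels, so the animal candidates can be pulled out by filtering on the parent. A rough sketch, assuming the ancestor is named “Animal” and the responses are saved as JSON files; labels_with_parent is a made-up helper name.

```shell
# labels_with_parent PARENT FILE...
# Hypothetical helper: prints "file<TAB>label<TAB>confidence" for every
# label that lists PARENT among its ancestor labels.
labels_with_parent() {
    parent="$1"; shift
    jq -r --arg parent "$parent" \
        '.Labels[]? | select(any(.Parents[]?; .Name == $parent))
         | "\(input_filename)\t\(.Name)\t\(.Confidence)"' "$@"
}
```

With `labels_with_parent "Animal" *.json` every detected species is listed next to its flag, which makes it easy to review the matches by hand and spot the wrong ones.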

Size of PNG is not significant

There is no significant discrepancy between testing the flags at 1000px and at 250px: the confidence levels are slightly higher or lower, but without a clear pattern. This is somewhat expected, as the models are likely trained with images scaled down to a fixed size to reduce the computational load.

Testing All Flags

How can you quickly test all the flags? Amazon Rekognition, the AWS CLI and a while loop are the answer. You upload the dataset to an S3 bucket, and you run a simple command in the AWS CLI:

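# List every object key in the bucket, one per line (jq -r strips the JSON quotes)
aws s3api list-objects --bucket <my-bucket> \
    --query 'Contents[].{Key: Key}' | jq -r '.[].Key' > list-countries.csv
# Call DetectLabels for each flag and save the response in its own JSON file
while read -r flag
do
   aws rekognition detect-labels \
       --image "{\"S3Object\":{\"Bucket\":\"<my-bucket>\",\"Name\":\"$flag\"}}" > "$flag.json"
done < list-countries.csv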

Not elegant, but it just works. Every output file is a JSON document, one for every flag. Here is an example output (Italy) and here is a zip with the output for all the countries.
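
To explore the results in a spreadsheet, the per-flag JSON files can be flattened into a single CSV. This is a sketch rather than the exact process used for the numbers in this article; it assumes jq is installed, and labels_to_csv is a made-up helper name.

```shell
# labels_to_csv FILE...
# Hypothetical helper: prints one CSV row per detected label,
# with the source file, the label name and its confidence.
labels_to_csv() {
    echo 'file,label,confidence'
    jq -r '.Labels[]? | [input_filename, .Name, .Confidence] | @csv' "$@"
}
```

Running `labels_to_csv *.json > all-labels.csv` produces a file that can be sorted and filtered by label or by confidence.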

Conclusions

Maybe there is not one single Stars and Stripes but we have only one Lollipop. 

The decoding of flags on Amazon Rekognition is quite unreliable but, with a few exceptions, the decoding of objects and animals inside the flags is accurate. 

Please don’t take the numbers too seriously. As already acknowledged by AWS Support, Rekognition is currently not trained to identify flags. These numbers are just a warning and a reminder that results from image recognition have to be validated and used carefully.

How can we improve our results and have some confidence in the flag detection process? We will soon play with Rekognition Custom Labels and discuss the results in a separate article.

Thanks for making it this far! I am always looking for feedback to make it better, so please feel free to reach out to me via LinkedIn or email.

Credits

All the screenshots are from the author and PNG files are from the region-flags repository.