Amazon Rekognition, this is not the Stars and Stripes

With the sensitive topic of biased results for face recognition, Amazon Rekognition has been in the news a lot recently. Amazon announced a one-year moratorium on allowing law enforcement to use the facial recognition platform. IBM decided not to offer facial recognition technology anymore. Microsoft took a conservative approach too.

But Amazon Rekognition is not limited to face recognition, and the benefits and risks for image analysis are much broader. Object and scene detection is a powerful and useful technology to search and filter large image collections. But even with objects, there might be side effects.

Image recognition, flags and bias

While learning about the implications of software in disputed territories and partially recognized countries, I encountered a more trivial case for image detection: national flags.

As a dataset, I used the PNG images from two open source repositories, Google Internationalization and country-flags.

The Stars and Stripes

The obvious benchmark is the flag of the United States of America. Amazon Rekognition labels it as a “Flag” with a 99.7% confidence and “American Flag” with a 92.5% confidence. A good and reliable result. But is it enough to trust the service for labelling flags worldwide?
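A result like this comes from the Rekognition DetectLabels operation. The following is a minimal sketch of such a call with boto3, not the exact code used for this post: the function name `label_image`, the optional `client` parameter and the default threshold of 80 are assumptions made for illustration.

```python
from typing import Any, Dict


def label_image(image_bytes: bytes, client: Any = None,
                min_confidence: float = 80.0) -> Dict[str, float]:
    """Return {label name: confidence} for one image using DetectLabels."""
    if client is None:
        # Imported lazily so the function can also be exercised with a stub client.
        import boto3
        client = boto3.client("rekognition")
    response = client.detect_labels(
        Image={"Bytes": image_bytes},   # raw PNG or JPEG bytes
        MinConfidence=min_confidence,   # drop labels below this confidence
    )
    return {label["Name"]: label["Confidence"] for label in response["Labels"]}
```

Calling `label_image(open("us.png", "rb").read())` needs AWS credentials and a configured region; the file name is only a placeholder.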

What about Cuba?

Let us cross the Florida Straits and use the flag of Cuba as a comparison. Some similarities in colors and patterns, but still a very different flag. Not only politically.

The confidence level for the label “Flag” is still very high (99.4%) but, surprisingly, Rekognition is 82.5% confident that this is the American flag. A glitch or a side effect of similarities in shape and design?

Il Tricolore

As an Italian, the natural choice for a different flag is il Tricolore: different colors, different patterns. But the confidence for the label “American Flag” is even higher than for Cuba: 87.6%.

Malaysia

You can compare many other national flags but the results will be very similar. All are correctly labelled as “Flag” but they are also labelled as “American Flag” with various levels of confidence.

No other national flag label is detected, neither an “Italian Flag”, nor a “Cuban Flag”. Amazon Rekognition uses a hierarchical taxonomy of ancestor labels to categorize labels, but apparently the only child of “Flag” is “American Flag”. This likely reflects the main market for the product and the initial dataset for training.

{ "Name": "American Flag", "Confidence": 87.66943359375, "Instances": [], "Parents": [ { "Name": "Flag" }, { "Name": "Symbol" } ] }
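The `Parents` array is where the taxonomy shows up in the response. As a small sketch, the label above can be parsed to list its ancestors and to find which returned labels sit under “Flag”. The helper names `parent_names` and `children_of` are made up for this example.

```python
import json

# The label returned for il Tricolore, as shown above.
sample = json.loads("""
{ "Name": "American Flag", "Confidence": 87.66943359375, "Instances": [],
  "Parents": [ { "Name": "Flag" }, { "Name": "Symbol" } ] }
""")


def parent_names(label):
    """Names of the ancestor labels attached to one label."""
    return [parent["Name"] for parent in label.get("Parents", [])]


def children_of(labels, ancestor):
    """Names of the returned labels that list `ancestor` among their parents."""
    return [label["Name"] for label in labels if ancestor in parent_names(label)]


print(parent_names(sample))          # ancestors of "American Flag"
print(children_of([sample], "Flag"))  # labels found under "Flag"
```

Running `children_of` over the full label list of each flag image is one way to check that no other national flag label ever appears under “Flag”.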

The more closely a flag resembles the Stars and Stripes, the higher the confidence level: the result for the flag of Malaysia (92.4%) is almost identical to the one for the United States (92.5%). A demonstration that setting an arbitrarily high confidence level might help but will not be safe in every scenario.
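To see why a threshold alone does not help, here is a small sketch that applies different cut-offs to the “American Flag” confidence values reported in this post. The function names are hypothetical and the numbers are only the four results quoted above.

```python
# "American Flag" confidence (in percent) reported in this post for each flag.
AMERICAN_FLAG_CONFIDENCE = {
    "United States": 92.5,
    "Malaysia": 92.4,
    "Italy": 87.6,
    "Cuba": 82.5,
}


def accepted_as_american(scores, threshold):
    """Flags whose "American Flag" confidence reaches the threshold."""
    return sorted(name for name, conf in scores.items() if conf >= threshold)


def false_positives(scores, threshold, truth="United States"):
    """Flags accepted as American that are not the Stars and Stripes."""
    return [name for name in accepted_as_american(scores, threshold) if name != truth]


for threshold in (80, 90, 93):
    print(threshold,
          accepted_as_american(AMERICAN_FLAG_CONFIDENCE, threshold),
          false_positives(AMERICAN_FLAG_CONFIDENCE, threshold))
```

At 80 all three other flags pass, at 90 Malaysia still passes, and at 93 even the real Stars and Stripes is rejected. The only cut-off that separates these four images sits inside a 0.1 point gap.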

Any feedback from Amazon?

Last year I raised a ticket to AWS Support and the feedback was straightforward and honest:

I was in touch with the Rekognition engineering team as well as the Rekognition product team and I have relayed your message over to them. They acknowledged that Rekognition is currently not trained to identify the flags.

This is not an Amazon problem; it is your problem as a developer relying on an external service. If you integrate image recognition capabilities in your application, you have to manage the risks and challenges yourself. You cannot bury your head in the sand and hope for the best.
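One possible way to manage that risk, sketched here under my own assumptions and not as a recommendation from AWS, is to drop the labels your application has decided not to trust before they reach users. The set `UNTRUSTED_LABELS` and the function `sanitize_labels` are hypothetical names.

```python
# Assumption: labels this application chooses not to show, because the
# service is not trained to tell national flags apart.
UNTRUSTED_LABELS = {"American Flag"}


def sanitize_labels(labels, untrusted=UNTRUSTED_LABELS):
    """Keep only the labels whose name is not in the untrusted set."""
    return [label for label in labels if label["Name"] not in untrusted]


detected = [
    {"Name": "Flag", "Confidence": 99.4},
    {"Name": "American Flag", "Confidence": 82.5},
]
print(sanitize_labels(detected))  # only the generic "Flag" label remains
```

The generic “Flag” label survives, so search and filtering still work, while the unreliable country guess is never displayed.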

Accuracy is always relative

Setting an artificial confidence level number for the results of Amazon Rekognition is not enough. National flags are not the most important challenge for AI but they are an example of the risks when image detection is not handled properly. And mislabelled flags could even escalate tensions in conflict zones, disputed territories, or partially recognized states.

For further posts and talks on the challenges of geolocation, check saorico.com

Dev Around the Sun

Looking forward to being a speaker next week at Dev Around the Sun, a 24-hour international fundraiser organised by the .NET Foundation for Direct Relief’s Coronavirus Fund.

@DevAroundTheSun

It all starts on May 12 at 2PM CEST; I will be live on May 13 at 7AM CEST. The title of my talk and the abstract are below. Schedule, how to donate and all the details at DevAroundTheSun.org

Hey, where is my country? Make all your end users happy

What looks like a simple choice in a drop-down list can turn into a PR nightmare. Integrating an external mapping service can unintentionally make many of your users unhappy. How can software developers and startups manage location-based services in disputed territories or partially recognized states? How can you make all your users happy? A few tips and tricks for the developer who targets an international audience but wants to rely on location data to control new features.

TestCon Europe 2020

Next October, for the very first time, I will attend and speak at a testing conference, and it will be the biggest software testing conference in Europe: TestCon Europe 2020. I will cover one of my favorite topics, geolocation and geopolitical challenges in the software world: “Hey, Where is My Country? How to Test Your App and Website for Geolocation and Geopolitical Challenges”.

The abstract is below, more on the topic on saorico.com. Looking forward to being back in Vilnius!

Abstract

What looks like a simple choice in a drop-down list can turn into a nightmare. Integrating an external mapping service can unintentionally make many of your users unhappy. How can we test websites and apps in disputed territories or partially recognized states?

Many airlines were recently forced to change the name of Taiwan on their booking systems. Hotel chain websites were banned in mainland China for labeling Tibet as an independent country. Ukrainian users were upset because Crimea was removed from the map of their land. “What is the capital of Israel?” is a question that has triggered different answers from voice virtual assistants. We will go on a virtual tour around the world to see how disputed territories or partially recognized states are handled by online services and discuss how we can test and spot unintended geopolitical issues in our products.


Apple’s Crimea map, two months later

At the end of November, Apple was in the news because of its choice to change its map of Crimea to meet Russian demands.

Handling disputed territories in online services is a wide and challenging topic without a simple solution. Combining a critical area like Crimea with the largest tech company in the world makes news.

Let’s see first how the largest broadcaster in the world, the BBC, covered the topic. On November 29th, the headline was “Apple changes Crimea map to meet Russian demands” stating that

“Apple has complied with Russian demands to show the annexed Crimean peninsula as part of Russian territory on its apps. (…) The BBC tested several iPhones in Moscow and it appears the change affects devices set up to use the Russian edition of Apple’s App Store.”


That Google had implemented a similar approach and Apple had not yet commented on the decision was irrelevant. What was coming was obvious, and it was the headline the following day: “Ukrainians condemn Apple’s Crimea map change”.

“A huge scandal” or not, a reaction from the Ukrainian foreign minister was obvious and expected. So were the positions taken by most European countries and the threats to boycott Apple products.

Apple was still silent, but a comment finally followed the next day and made headlines again: “Apple to take ‘deeper look’ at disputed borders”.


Quoting from the article: “the company follows international and domestic laws and the change, which is only for users in Russia, had been made because of new legislation there”. And: “we review international law as well as relevant US and other domestic laws before making a determination in labelling on our Maps and make changes if required by law.”

Damage limitation? Three headlines in three days is not good for a topic that a tech company would like to sweep under the carpet. There is no obvious way to make every user happy: once the users notice the differences, there is no way back.

How did they solve the problem? How deep was the “deeper look”? 

Almost two months later, let’s test Apple Maps with ZenMate, a VPN client that offers IP addresses around the world, Russia included. Typing Crimea from Kiev or Berlin, the first suggestion is “Crimea, Ukraine”.

Let’s now search Crimea on Apple Maps from an IP address in Moscow.

The first suggestion is “Crimea, Russia”. Similar differences apply to other cities and regional borders in the area.
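This kind of manual check can be turned into a repeatable comparison. The sketch below does not call Apple Maps; it only records the first suggestion observed in this post for each VPN exit location and reports whether the answer depends on where the request comes from. All names in it are hypothetical.

```python
from collections import defaultdict

# First suggestion for "Crimea" observed in this post, keyed by VPN exit location.
OBSERVED = {
    "Kiev": "Crimea, Ukraine",
    "Berlin": "Crimea, Ukraine",
    "Moscow": "Crimea, Russia",
}


def group_by_result(observations):
    """Map each distinct suggestion to the sorted locations that received it."""
    groups = defaultdict(list)
    for location, result in observations.items():
        groups[result].append(location)
    return {result: sorted(locations) for result, locations in groups.items()}


def is_location_dependent(observations):
    """True when at least two locations received different suggestions."""
    return len(set(observations.values())) > 1


print(group_by_result(OBSERVED))
print(is_location_dependent(OBSERVED))
```

Filling `OBSERVED` again on a later date, with the same query and the same exit locations, makes it easy to see whether anything has changed.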

What changed? Nothing.

For Apple, for Google and for many other tech companies, providing different results to different audiences is a lesser evil. And it is the easier way “to make sure (…) customers can enjoy using Maps and other Apple services, everywhere in the world.” For a tech company, the scenario where local authorities force them to comply and change names is not the worst one. They can at least blame the “local legislation”, hoping to avoid too many headlines.

Is this a problem for Crimea only? Not really. To read more about the challenges of geolocation and disputed territories, check saorico.com

Talk at Factory Berlin

How can software developers and startups manage location-based services in disputed territories or partially recognized states? Looking forward to presenting “Hey, where is my country? Software development and territorial disputes” at Factory Berlin.

This event is for members only, but if you are interested, get in touch: I will discuss location-based services in disputed territories or partially recognized states at other events in Berlin and Cologne in the next few weeks.