Google Adopts a New Way to Measure Skin Tones to Make Search Results More Inclusive

Google is partnering with a Harvard professor to promote a new scale for measuring skin tones, in an effort to fix problems of bias and diversity in the company’s products.

The new 10-point skin tone scale could help make AI less biased. The tech giant is working with Ellis Monk, an assistant professor of sociology at Harvard and the inventor of the Monk Skin Tone Scale, or MST.

The MST Scale is designed to replace outdated skin tone scales that are biased towards lighter skin. When tech companies use these older scales to categorize skin color, it can lead to products that perform worse for people with darker skin, says Monk.

“Unless we have an acceptable measure of differences in skin tone, we can’t integrate that into products to make sure they’re more inclusive. So the Monk Skin Tone Scale is a 10-point skin tone scale deliberately designed to be much more representative and inclusive of a broader range of skin tones, especially for people [with] darker skin tones,” says Monk.

Fixing Bias in AI – Fixing Training Data

There are numerous examples of tech products, particularly those using AI, that perform worse with darker skin tones. These include apps designed to detect skin cancer, facial recognition software, and even the machine vision systems used by self-driving cars.

Although there are many ways this sort of bias is built into these systems, one common factor is the use of outdated skin tone scales when collecting training data. The most prevalent skin tone scale is the Fitzpatrick scale, widely used in academia and AI. However, this scale was originally designed in the ’70s to classify how people with paler skin burn or tan in the sun and was only later extended to include darker skin.

This has led to criticism that the Fitzpatrick scale fails to capture a full range of skin tones, which may mean that machine vision software trained on Fitzpatrick data is itself biased towards lighter skin types.

The Fitzpatrick scale comprises six categories, while the MST Scale extends to 10 distinct skin tones. Monk says this number was chosen based on his research to balance diversity and ease of use. Some skin tone scales offer more than a hundred categories, but too much choice can lead to inconsistent results.

“Usually, if you go beyond 10 or 12 points on these scales [and] ask the same individual to pick out the same tones repeatedly, the more you increase that scale, the less people are able to do that,” says Monk. “Cognitively speaking, it becomes hard to accurately and reliably differentiate.” A choice of 10 skin tones is much more manageable.

Building a new skin tone scale is only the first step, and integrating this work into real-world applications is the real challenge. To promote the MST Scale, Google has launched a new website, skintone.google, and says it is committed to explaining the research and best practices for its use in AI. The company says it is also working to apply the MST Scale to its own products. These include its “Real Tone” photo filters, designed to work better with darker skin tones, and its image search results.

Google will let users refine specific search results using skin tones assigned from the MST Scale. Image: Google

Google says it is rolling out a new feature in image search that will allow users to refine searches based on skin tones categorized by the MST Scale. So, for instance, if you search for “eye makeup” or “bridal makeup looks,” you can filter results by skin tone. In the future, the company also intends to use the MST Scale to check the diversity of its results so that if you search for pictures of “cute babies” or “doctors,” you won’t be shown only white faces.
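
To make the filtering idea concrete, here is a minimal sketch in Python. It is not Google’s implementation, which has not been published. It assumes each image result already carries a hypothetical `mst_tone` label from 1 to 10 (numbered here from lightest to darkest), and the `ImageResult` class and `filter_by_tone` function are names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageResult:
    """A hypothetical image search result with an MST label attached."""
    url: str
    mst_tone: int  # assumed label: 1 (lightest) through 10 (darkest)


def filter_by_tone(results, tones):
    """Keep only the results whose MST label is in the requested tones."""
    wanted = set(tones)
    invalid = sorted(t for t in wanted if not 1 <= t <= 10)
    if invalid:
        raise ValueError(f"MST tones must be between 1 and 10, got {invalid}")
    # Preserve the original ranking order of the results.
    return [r for r in results if r.mst_tone in wanted]


results = [
    ImageResult("a.jpg", 2),
    ImageResult("b.jpg", 8),
    ImageResult("c.jpg", 9),
]
darker = filter_by_tone(results, tones=[8, 9, 10])
print([r.url for r in darker])
```

In practice the hard part is assigning the labels reliably in the first place, which this sketch takes as given.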

Skin Tones to Make Search Results More Inclusive

“One of the things we’re doing is taking a set of [image] results, understanding when those results are particularly homogenous across a few tones, and improving the diversity of the results,” Google’s head of product for responsible AI, Tulsee Doshi, said. However, Doshi stressed that these updates were at a “very early” stage of development and hadn’t yet been rolled out across the company’s services.
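
One simple way to picture what “homogenous across a few tones” could mean is to measure how much of a result set falls into its few most common tones. The Python sketch below is an illustration under stated assumptions, not the method Google uses: the function names, the choice of three tones, and the 0.9 threshold are all hypothetical.

```python
from collections import Counter


def tone_concentration(tones, top_k=3):
    """Return the share of results that fall in the top_k most common tones."""
    if not tones:
        return 0.0
    counts = Counter(tones)
    top_total = sum(count for _, count in counts.most_common(top_k))
    return top_total / len(tones)


def is_homogeneous(tones, top_k=3, threshold=0.9):
    """Flag a result set whose tones are concentrated in a few categories.

    The threshold is an arbitrary illustrative value, not a published one.
    """
    return tone_concentration(tones, top_k) >= threshold


skewed = [1, 2, 1, 2, 1, 1]      # MST labels clustered in two light tones
spread = list(range(1, 11))      # one result for each of the 10 tones
print(is_homogeneous(skewed), is_homogeneous(spread))
```

A set flagged this way could then be re-ranked to include more tones, although, as Doshi notes later, what counts as a good mix depends on the user, region, and query.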

This should strike a note of caution, not just for this specific change but also for Google’s approach to fixing bias problems in its products more generally. The company has a patchy history regarding these issues, and the AI industry tends to promise ethical policies and guardrails and then fail on the follow-through.

Take, for example, the notorious Google Photos error in which its search algorithm tagged images of Black people as “gorillas” and “chimpanzees.” This blunder was first noticed in 2015, yet Google confirmed that it has still not fixed the problem and has instead removed these search terms entirely. “While we’ve significantly enhanced our models based on feedback, they still aren’t perfect,” Google Photos spokesperson Michael Marconi said. “To prevent this error and potentially causing additional harm, the search terms remain disabled.”

Introducing these changes can also be culturally and politically tricky, reflecting broader difficulties in integrating this sort of tech into society. For example, in the case of filtering image search results, Doshi notes that “diversity” may look different in different countries. So if Google adjusts image results based on skin tone, it may have to vary these results based on geography.

“What diversity means, for example, when we’re surfacing results in India [or] when we’re surfacing results in different parts of the world, is inherently different,” says Doshi. “It’s hard to necessarily say, ‘oh, this is the exact set of good results we want,’ because that will differ per user, region, and query.”

Introducing a new and more inclusive scale for measuring skin tones is a step forward, but many thornier problems involving AI and bias remain.