Amazon sellers say misleading AI product reviews threaten sales

A driver delivers packages in Boston, Massachusetts, US. File photo

Published Dec 20, 2023


Shopping on Amazon.com Inc. has long entailed scrolling through pages and pages of often redundant customer feedback. In an effort to make the task less onerous, the company in August began using artificial intelligence to convert billions of reviews into brief summaries consisting of a few sentences apiece.

As is often true with generative AI, the results aren't perfect. In some cases, the summaries provide an inaccurate description of a product. In others, they exaggerate negative feedback. This has potential implications not just for customers, but for Amazon merchants who depend on positive reviews to boost sales. Making matters worse, merchants say, the summaries were deployed just as they were headed into the crucial holiday shopping season, giving them one more thing to worry about besides inflation-battered shoppers.

"If you have a handful of negative feedbacks, and the artificial intelligence model summarizes it like it's a consistent theme, that's not fair," said Jon Elder, who runs the Amazon seller consulting firm Black Label Advisor. "It's really hard to capture the monetary impact of this, but sellers are not happy."

Most shoppers can probably tell when the AI has misclassified a product. For example, the home fitness company Teeter sells an inversion table designed to ease back pain. Amazon's AI-generated summary calls it a desk: "Customers like the sturdiness, adjustability and pain relief of the desk."

The technology's tendency to overplay negative sentiment in some reviews is less obvious. The $70 (R1282) Brass Birmingham board game, for instance, boasts a 4.7-star rating based on feedback from more than 500 shoppers. A three-sentence AI summary of reviews ends with: "However, some customers have mixed opinions on ease of use." Only four reviews mention ease of use in a way that could be interpreted as critical. That's fewer than 1% of the overall ratings, yet the negative sentiment accounts for about a third of the AI-generated blurb.

A summary of Penn tennis balls says some reviewers were disappointed with the product's smell. The $4 three-ball sleeve has a 4.7-star rating based on more than 4,300 ratings, but only seven reviews mention an odour. Amazon breaks the summary down into several buttons, including "quality," "value" and "smell." Shoppers would need to click the "smell" button to discover how few reviewers complained about an odour.

A Bloomberg review of dozens of review summaries revealed that the AI isn't consistent when it analyses customer comments and generates its blurbs. Some highlight critical feedback and others don't.

An eight-pack of mason jars had 4.5 stars based on more than 3 000 ratings. The AI-generated summary says: "However, some customers have reported issues with rusting lids."

Shoppers must click deeper into the product to see that just 16 customers complained about rust. Meanwhile, the write-up for a 4.7-star-rated backpack based on 21 315 ratings makes no mention of cheap straps even though more than 900 customers complained about that.
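
The inconsistency is easier to see when the complaint counts quoted above are expressed as a share of each product's ratings. The following is a rough sketch using only the figures reported in this article; counts given as "more than" are treated as the stated number, so the results are estimates. The names `products` and `complaint_share` are illustrative and do not reflect how Amazon's system works.

```python
# Figures as reported in the article. Counts described as "more than N"
# are entered as N, so the resulting shares are approximate.
# Each entry maps a product to (critical reviews, total ratings).
products = {
    "Brass Birmingham board game": (4, 500),
    "Penn tennis balls": (7, 4300),
    "Mason jars (eight-pack)": (16, 3000),
    "Backpack": (900, 21315),
}


def complaint_share(complaints: int, ratings: int) -> float:
    """Return complaints as a percentage of total ratings."""
    return 100.0 * complaints / ratings


shares = {name: complaint_share(c, r) for name, (c, r) in products.items()}

for name, share in shares.items():
    print(f"{name}: {share:.2f}% of ratings")
```

On these numbers, the three complaints that the summaries highlighted each come from fewer than 1% of ratings, while the backpack's strap complaints, which its summary omitted, come from roughly 4%.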

"Our analysis has found that review highlights are helping customers find the products they want and are leading to increased sales for sellers," an Amazon spokesperson said in an emailed statement. "We care a lot about accuracy, and we will continually improve the review highlights experience over time. While we've heard very few concerns about this new feature, sellers are welcome to contact Seller Support if they have a question about a product's review highlights." Amazon said it will continue refining the technology based on feedback from shoppers and merchants.

Product reviews have been a central part of shopping on Amazon for almost three decades, providing word-of-mouth referrals that boost the confidence of those who otherwise might be hesitant to buy something based on photos and marketing descriptions alone. In 2022, 150 million Amazon customers left 1.5 billion reviews and ratings, according to the company.

Since ChatGPT wowed the world last year, companies have been rushing to deploy generative AI, which mines vast quantities of data to generate text or images. Amazon has developed its own version of the technology, and using it to summarize billions of customer reviews was a no-brainer for the company.

"We want to make it even easier for customers to understand the common themes across reviews, and with the recent advancements in generative AI, we believe we have the technical means to address this long-standing customer need," Vaughn Schermerhorn, an Amazon director of software and product, wrote in an August blog post about the new feature.

Amazon's AI-powered review summaries will almost certainly improve over time, but that's of little solace to merchants right now. For years, they say, Amazon has rolled out new tools and technologies with little warning or consultation. Sellers say they lack visibility into how the algorithms work and that Amazon hasn't provided any guidance on how they can address inaccuracies or distortions in summarized reviews that could hurt their businesses.

Lesley Hensell, the co-founder of the Amazon advisory firm Riverbend Consulting, said she's heard from several Amazon sellers panicking that the AI-generated summaries would hurt their holiday season sales. She tells them there are no solutions other than the often futile effort of emailing Amazon's seller support and hoping someone responds in a meaningful way.

One of her clients has a home-and-garden product with 1 400 4- and 5-star ratings, but the review summary said "some find the item is of poor quality." The fear is that buyers will be discouraged by the quick summary and won't read more deeply to see that the critical feedback represents less than 2% of the ratings.

"Long term this could absolutely be a great feature for buyers," she said. "But Amazon needs a process whereby sellers can easily and effectively dispute misleading summaries because technology is never perfect."