It often arrives as you stroll from the kerb to your front door. An e-mail with a question: how many stars do you want to give your Uber driver? Rating systems like the ride-hailing firm’s are essential infrastructure in the world of digital commerce. Just about anything you might seek to buy online comes with a crowdsourced rating, from a subscription to this newspaper to a broken iPhone on eBay to, increasingly, people providing services. But people are not objects. As ratings are applied to workers it is worth considering the consequences, for rater and rated.

User-rating systems were developed in the 1990s. The web held promise as a grand bazaar, where anyone could buy from or sell to anyone else. But e-commerce platforms had to create trust. Buyers and sellers needed to believe that payment would be forthcoming, and that the product would be as described. E-tailers like Amazon and eBay adopted reputation systems, in which sellers and buyers gave feedback about transactions. Reputation scores appended to products, vendors and buyers gave users confidence that they were not about to be scammed.

Such systems then spread to labour markets. Workers for gig-economy firms like Uber and Upwork come with user-provided ratings. Conventional employers are jumping on the bandwagon. A phone call to your bank, or the delivery of a meal ordered online, is now likely to be followed by a notification prompting you to rate the person who has just served you.

Superficially, such ratings also seem intended to build trust. For users of Uber, say, who will be picked up by drivers they do not know, ratings look like a way to reassure them that their ride will not end in abduction. Yet if that was once necessary, it is no longer. Uber is a global firm worth tens of billions of dollars and with millions of repeat customers. Its customers know by now that the app records drivers’ identities and tracks their route. It is Uber’s brand that creates trust; for most riders, waiting for a driver with a rating of 4.8 rather than 4.5 is not worth the trouble.

Rather, ratings increasingly function to make management cheaper by shifting the burden of monitoring workers to users. Though Uber regards its drivers as independent contractors, in many ways they resemble employees. The firm seeks to provide users with a reasonably uniform experience from ride to ride. And because drivers are randomly assigned to customers, it is the platform that cares whether rides lead to repeat business and that therefore bears the cost of poor behaviour by drivers. Ordinarily a firm in such a position would need to invest heavily in monitoring its workers, for instance by hiring staff to carry out quality assurance by taking Uber rides incognito. A rating system, however, reduces the need for monitoring by aligning the firm’s interests with those of workers. (Drivers with low ratings risk having their profile deactivated.)

Outsourcing management like this appeals to cost-conscious firms of all sorts; hence the proliferation of technological nudges to rate one service worker or another. To work as intended, however, ratings must provide an accurate indication of how well workers conform to the behaviour that firms desire. Frequently, they do not. Raters may have no incentive to do their job well. They may ignore the prompt to rate a worker, or automatically assign the highest score. They may adhere to social norms that discourage leaving a poor rating, just as diners often leave the standard tip, however unexceptional the service. Uber’s customers often award drivers five stars rather than feel bad about themselves for damaging a stranger’s work prospects. And even when users are accurate, their ratings may reflect factors beyond a service provider’s control, such as unexpected traffic. Systems that allow users to leave more detailed feedback (as Uber’s has begun to) could address this, but at the cost of soaking up more time, which could mean fewer reviews.

When the quality of a match between a worker and a task is particularly important, the problem of sorting the signal from the noise in rating systems grows. Skilled managers can tell when a worker struggling in one role might thrive in another; rating systems can capture only expressions of customer dissatisfaction. Such difficulties also affect gig-economy platforms. Poor ratings on a job-placement site could reflect an inappropriate pairing between a worker with one set of skills and a firm that needs another, rather than the worker’s failure of effort or ability.

Platforms can reduce the potential for such errors by including more information about tasks and the workers who might tackle them. Yet they may discover to their chagrin that more information also provides users with more opportunities to discriminate. An analysis of Upwork (then known as oDesk), for example, found that employers of Indian descent disproportionately sought Indian nationals for their tasks. True, this particular sort of information could be concealed, and conventional management permits plenty of discrimination. But firms typically have a legal obligation not to discriminate, and to train managers accordingly.

Overrated

Management is underappreciated as a contributor to success. Recent work by Nicholas Bloom, John Van Reenen and Erik Brynjolfsson suggests that good management matters more than the adoption of technology for a company’s performance. Even so, the use of ratings seems sure to grow. They are, as “Left Outside”, a pseudonymous blogger, puts it, a genuine disruptive technology: cheap enough to be adopted widely even if inferior to established practice. Further advances could improve such systems, as is common with disruptive technology. Artificial-intelligence programs may one day know how much people enjoyed a taxi ride better than they do themselves. In the meantime, management risks being left to the wrong sort of stars.

Sources cited in this article

“Designing online marketplaces: trust and reputation mechanisms”, by Michael Luca, Innovation Policy and the Economy, National Bureau of Economic Research, 2016.

“Peer-to-peer markets”, by Liran Einav, Chiara Farronato and Jonathan Levin, Annual Review of Economics, 2016.

“Diasporas and outsourcing: evidence from oDesk and India”, by Ejaz Ghani, William Kerr and Christopher Stanton, Management Science, 2014.

“What drives differences in management?”, by Nicholas Bloom, Erik Brynjolfsson, Lucia Foster, Ron Jarmin, Megha Patnaik, Itay Saporta-Eksten and John Van Reenen, NBER working paper, 2017.