47% of Jobs at High Risk of Automation? 3 Reasons to Not Freak Out About That Study

Are robots actually going to take all of our jobs? We don’t know. But a lot of smart people are convinced it’s going to happen.

For evidence, many of them point to the 2013 study by Michael Osborne and Carl Benedikt Frey called The Future of Employment, which concluded that 47% of US jobs are at high risk of automation in the near future. That statistic has been used in plenty of alarming, credible-looking headlines.

These headlines are misleading. The Future of Employment is a well-thought-out analysis of which jobs might be susceptible to automation from a technical standpoint. But that’s not the same thing as saying that any particular job will be automated, and the study isn’t meant as a prediction of the future. Let’s unpack the authors’ methodology and examine a few features of it that illustrate why.

The methodology: The entire study is based on a federally maintained database of US occupations called the O*NET survey. Updated every few years, this survey asks workers in every occupation to rate how good they have to be at over a hundred different skills, abilities, work activities, and areas of knowledge in order to be successful at their jobs. These numeric “level” scores form the basis of the authors’ comparison of one job’s computerization potential to another’s.

The O*NET survey also asks workers to rate how important each of those same skills, abilities, activities, and knowledge areas is to their job performance, but the authors left this “importance” metric out of their model.
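
To make that distinction concrete, here’s a rough sketch of what a single occupation’s record looks like in this framing. The descriptor names are a few of the O*NET categories the paper discusses, but the numbers are invented purely for illustration; the point is that every descriptor comes with both a “level” rating and an “importance” rating, and the model keeps only the first.

```python
# Hypothetical O*NET-style record for one occupation (numbers invented for
# illustration). Each descriptor carries two ratings: the level of proficiency
# the job demands, and how important the descriptor is to doing the job.
occupation = {
    "Finger Dexterity":      {"level": 3.5, "importance": 4.0},
    "Originality":           {"level": 2.0, "importance": 1.5},
    "Social Perceptiveness": {"level": 3.0, "importance": 3.5},
    "Negotiation":           {"level": 1.0, "importance": 1.0},
}

# The study's feature vector keeps only the "level" column; the "importance"
# column is dropped.
levels_only = [ratings["level"] for ratings in occupation.values()]
print(levels_only)  # [3.5, 2.0, 3.0, 1.0]
```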

The authors picked 70 jobs that were fairly representative of the US workforce. Then, in consultation with machine learning experts, for each occupation they asked a question: “Can the tasks of this job be sufficiently specified, conditional on the availability of big data, to be performed by state of the art computer-controlled equipment?” Occupations for which the answer was “yes” were marked with a 1; no’s were given a 0. They then modeled a probability of computerization for 632 other jobs based on correlations between the O*NET scores and the hand-labelled computerizability marks of the 70 representative jobs they chose.
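
In code, the overall shape of that procedure looks something like the sketch below. This is not the authors’ pipeline (the paper itself fits a Gaussian process classifier to its hand-labelled data); it’s a minimal stand-in using scikit-learn, with invented feature values and labels, just to show the structure: train on the 70 hand-labelled occupations, then score the remaining 632.

```python
# Minimal sketch of the train-on-70, score-the-rest setup. Not the authors'
# code; all feature values and labels here are randomly generated placeholders.
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier

rng = np.random.default_rng(0)

# 70 hand-labelled occupations: each row holds a job's O*NET "level" scores,
# each label is the experts' yes/no (1/0) automatability call for that job.
X_train = rng.uniform(0, 7, size=(70, 9))   # 9 placeholder O*NET descriptors
y_train = rng.integers(0, 2, size=70)       # 1 = automatable, 0 = not

# The 632 remaining occupations, described by the same O*NET level scores.
X_rest = rng.uniform(0, 7, size=(632, 9))

model = GaussianProcessClassifier().fit(X_train, y_train)

# A "probability of computerization" for every unlabelled occupation.
p_computerization = model.predict_proba(X_rest)[:, 1]
print(p_computerization[:5])
```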

If you’re worried that the study’s 47% conclusion means we’ll have 47% unemployment, here are a few features of this methodology that should put you at ease:

The model used availability of “state of the art” technology as its only standard. Occupations were marked as computerizable if they could be replaced using the most cutting-edge technology available – regardless of how expensive, difficult to use, or socially unacceptable that technology might be. To use this study to predict that 47% of jobs will disappear by 2034, you have to assume that McDonald’s will fire all its workers even if the robots that replace them cost a million dollars each and make sexually suggestive remarks to customers. That won’t happen; McDonald’s will only fire its employees if the tech to replace them can offer a pleasant experience to customers for a low investment. Osborne and Frey didn’t attempt to analyze factors other than technical feasibility. This narrows the scope of their analysis. 

The model left out O*NET’s “importance” scores. If you’re thinking about whether a job can be automated, you have to consider the relative importance of tasks. For example, a barista at a fancy cafe might pour cool designs into the foam of my latte – a skill that machines would have a hard time replicating. But in evaluating whether her job can be automated, I’d give less weight to her pouring acumen and more weight to her ability to quickly produce my coffee. Osborne and Frey had to have gone through this kind of thought process for the 70 occupations they hand-labelled to train their probability model. So I can’t understand why they left out the metrics that would have extended this thought process into their objective analysis. Leaving them out undermines their model’s predictive value.
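
For what it’s worth, folding the importance ratings back in wouldn’t be hard. One simple, entirely hypothetical adjustment would be to weight each level score by how important that skill is to the job, so that a hard-to-automate but marginal skill like latte art counts for less than routine speed of service. The paper doesn’t do this; the numbers below are invented just to show the effect.

```python
import numpy as np

# Invented ratings for a single job, in the spirit of the barista example.
# Columns: latte art, speed of drink preparation, taking orders and payment.
level      = np.array([6.0, 4.0, 2.5])   # proficiency the job demands
importance = np.array([1.5, 5.0, 4.5])   # how much the job depends on it

# Using levels alone, the hard-to-automate latte-art skill dominates the
# feature vector; weighting by importance shifts the emphasis onto the
# routine (and far more automatable) parts of the job.
print(level)                      # [6.   4.   2.5 ]
print(level * importance / 5.0)   # [1.8  4.   2.25]
```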

The paper gives us very little information about Osborne and Frey’s process of hand-labelling the 70 occupations, which was the foundation of their probability model. Which machine learning experts did they consult, and about which jobs? How did they decide whether a job was computerizable? Which work activities did they consider to be essential job tasks? Each job is complicated enough to merit its own study of its automation potential. We don’t know whether the authors spent six months hand-labelling their training data or did it in an afternoon. Knowing more about that process would go a long way toward telling us how seriously to take their conclusions.

A quick glance at Osborne and Frey’s list of occupations shows how the issues above may have affected their conclusions. Take bank tellers, to which the model assigned a 98% probability of computerization. Since ATMs have existed for 30 years, that probability makes sense. Yet more bank tellers are employed today than in the 1980s. Even if cheap alternatives to a job exist, that doesn’t necessarily mean the job will disappear. The same goes for umpires and sports referees, to which the model also assigned a 98% probability. Nobody seriously talks about getting rid of them, even though every baseball broadcast checks an umpire’s called strike three against a computer simulation.

Don’t get me wrong: automation does pose a potential threat. The Future of Employment is a fascinating survey of the US employment landscape, and it’s extremely helpful for understanding that threat because it answers the question, “what percentage of US jobs have skill-level requirements that current machine learning experts say can be automated in the next 20 years?” Reading the paper, you really get a sense of which skills, abilities, and areas of knowledge we should cultivate in our workforce to ensure widespread prosperity and social equality. But don’t freak out: the paper doesn’t mean a future of high unemployment is certain. That kind of prediction would require a much wider and deeper analysis.