A Better Cocktail for Hiring Success
If you’ve ever hired more than two people, you’ve probably hired badly.
You’re not alone — one recruiting firm studied 20,000 of their own placements of senior executives and found that within 18 months of hiring, 40% of them had been pushed out, failed, or quit.[1]
The only real shocker is that they admitted it publicly. Bad hiring is the norm: other sources report 18-month failure rates of 35-50%, with very little accountability for it.
It doesn’t have to be this way. Just because everybody else hires in a hit-or-miss fashion doesn’t mean you have to. (Nor must you undertake an expensive, slow, and weighty “Topgrading”-style initiative, unless you’re General Electric.)
The tools to easily drive better hiring practices — much better hiring practices — have been around for decades and can be deployed for a few hundred dollars per finalist.
Seem too good to be true?
Can you really trust that there’s good data, widely ignored? Judge for yourself — it seems to be human nature to ignore improvements until we’re forced to pay attention.
Doctors in the US have had to admit that medical errors kill over 250,000 people a year, and the medical industry has struggled for decades to get doctors to use the data in front of them to drive better care. The airline industry became a model of data-driven safety only after the FAA forced it to.
Another human activity that largely ignores its own readily available data is hiring. There are terrific hiring tools and they’re generally ignored by practitioners.
The gold standard study, Hunter and Hunter’s “Validity and utility of alternative predictors of job performance”[2] was published in 1984, and updated in 1998 by Schmidt and Hunter’s survey of 85 years of research findings.[3] The data was further confirmed by a talk Schmidt gave in 2013, summarizing a further 10 years of research.[4] (Schmidt reported that the granddaddy of selection tools, the basic test of General Mental Aptitude, has gotten even stronger since 1998.)
In short, huge improvements in hiring are possible. One 2014 study found mis-hiring could be cut from over 60% to 10%[5]; others show you can improve the odds of a new hire succeeding in their new role from 45% to over 90%.
What stands out for me is how resistant the hiring community is to data — and what a huge opportunity this presents for the recruiters and hiring managers willing to embrace a methodical, data-driven approach.
Shared Value: Success
Recruiters want a high success rate. So do managers. When they struggle with low success rates, they sometimes blame each other: the manager complains that the recruiter doesn’t send good candidates, and the recruiter complains that the manager is using the wrong selection criteria and getting exactly what they asked for — the wrong thing.
Some recruiters ignore the tools because they honestly feel their intuition is great (and they haven’t run their own 18-month retention numbers for a reality check). Others may feel that their intuition is all they have to differentiate themselves. (A better differentiator would be to offer a deep pool of applicants and a strong, proven, data-driven selection process that firmly guides the hiring manager to success, and uses intuition as a part of the whole.)
Two Rules and One Trick
There are two rules and one trick to dramatically improving hiring success.
The first rule is: stop using intuitive hiring practices that reinforce your current untested beliefs or that elevate your intuition above facts and data. (Your intuition is useful and will get its due during the structured interviews. Save it for then.) Resolve to be methodical and data-driven.
(The more comfortable you are with your current intuitive practices, the more likely you’re using comfort as a crutch, protecting yourself from challenge and growth.)
The second rule is: describe the job in behavioral terms, and de-emphasize low-validity elements like years of experience (validity 0.13) and years of education (validity 0.10). If you don’t have a good behavioral job-description process, create or borrow one. I like the database at O*Net and the PXT Select Performance Profiling approach (see below — and note that I resell PXT Select). Behavioral performance models are very powerful.
The trick is to combine multiple selection tools into a potent “cocktail” of mutually reinforcing ingredients. Just as cancer treatments and antibiotics must often be combined to maximize their effects, so must hiring selection tools.
Full List of Selection Tools
Here’s a table showing how well each tool works, in isolation, to predict job success for an applicant — that is, how accurately it predicts whether candidate A will succeed in job B. (Operational validity: 1.0 is perfect prediction; 0.0 is useless.)
Tools to Predict Job Success
PROCEDURE OR PREDICTOR | OPERATIONAL VALIDITY (R) |
GMA tests[a] | 0.65 |
Employment interviews (unstructured)[d] | 0.60 |
Employment interviews (structured)[c] | 0.58 |
Peer ratings[l] | 0.49 |
Job knowledge[y] | 0.48 |
Integrity tests[b] | 0.46 |
Behavioral consistency method[x] | 0.45 |
Job tryout procedure[w] | 0.44 |
Assessment centers[k] | 0.37 |
Biographical data measures[g] | 0.35 |
GPA[r] | 0.34 |
Work sample tests[t] | 0.33 |
Reference checks[f] | 0.26 |
SJT (knowledge)[j] | 0.26 |
SJT (behavioral tendency)[u] | 0.26 |
Emotional Intelligence (ability)[p] | 0.24 |
Emotional Intelligence (mixed)[q] | 0.24 |
Conscientiousness[e] | 0.22 |
Person-job fit measures[i] | 0.18 |
Job experience[h] | 0.13 |
Person-organization fit measures[s] | 0.13 |
Emotional Stability[v] | 0.12 |
T&E point method[m] | 0.11 |
Years of education[n] | 0.10 |
Interests[o] | 0.10 |
Additional Cocktail Ingredients
Here’s a table showing how much validity each tool adds as a second ingredient in a cocktail based on the best single tool, GMA testing. See the lettered endnotes; data from “The Validity and Utility of Selection Methods in Personnel Psychology: Practical and Theoretical Implications of 95 Years of Research Findings” by Frank Schmidt.
Additional Ingredients for Predicting Job Success
SELECTION PREDICTOR ADDED TO GMA | % GAIN IN VALIDITY |
Integrity tests[b] | 20 |
Employment interviews (structured)[c] | 18 |
Employment interviews (unstructured)[d] | 15 |
Conscientiousness[e] | 8 |
Reference checks[f] | 8 |
Biographical data measures[g] | 6 |
Job experience[h] | 4 |
Person-job fit measures[i] | 3 |
SJT (knowledge)[j] | 3 |
Assessment centers[k] | 2 |
Peer ratings[l] | 2 |
T&E point method[m] | 1 |
Years of education[n] | 1 |
Interests[o] | 1 |
Emotional Intelligence (ability)[p] | 1 |
Emotional Intelligence (mixed)[q] | 1 |
GPA[r] | 1 |
Person-organization fit measures[s] | 1 |
Work sample tests[t] | 0 |
SJT (behavioral tendency)[u] | 0 |
Emotional Stability[v] | 0 |
Job tryout procedure[w] | 0 |
Behavioral consistency method[x] | 0 |
Job knowledge[y] | 0 |
It astonished me that, once you know someone’s GMA score, you learn almost nothing more from even the effortful and obviously good techniques you may have used successfully in the past. Even work samples and job tryouts add zero additional predictive power!
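The percentage gains above can be sanity-checked with the standard formula for the multiple correlation of two predictors. Here is a minimal sketch in Python, using the table’s validities for GMA (.65) and integrity tests (.46) and the GMA-integrity intercorrelation of .046 from note [b]; the paper’s own corrections are more involved, so treat this as an approximation rather than a reproduction of Schmidt’s method:

```python
import math

def combined_validity(r1, r2, r12):
    """Multiple correlation R of a criterion with two predictors, given
    their individual validities r1, r2 and their intercorrelation r12."""
    return math.sqrt((r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2))

r_gma = 0.65          # GMA tests [a]
r_integrity = 0.46    # integrity tests [b]
r_intercorr = 0.046   # GMA x integrity correlation, from note [b]

r_combined = combined_validity(r_gma, r_integrity, r_intercorr)
gain_pct = (r_combined / r_gma - 1) * 100
print(f"combined validity = {r_combined:.2f}, gain over GMA = {gain_pct:.0f}%")
```

Running it gives a combined validity of about .78, roughly a 20% gain over GMA alone, which matches the integrity-test row in the table. Because integrity barely correlates with GMA (.046), nearly all of its validity is new information; high-intercorrelation tools like job knowledge (.747) add almost nothing.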
Start Being Methodical
The most important thing you can do immediately to improve your hiring success is to (a) stop flying by the seat of your pants and embrace a methodical approach; (b) create a behavior-based job description (not a junk drawer of vague tasks or areas of activity); and (c) use at least two of the tools on the selection-tool list, starting with a good GMA test.
“Hiring the best people requires confidence on the part of the people doing the hiring: confidence that they are attracting the best, confidence that they are able to recognize the best, confidence that the best will want to work for them. That sort of confidence only comes from having a consistent, repeatable methodology that lets you repeat successes and learn from mistakes.” Stephen R. Balzac, Ph.D., author, “Organizational Psychology for Managers”
Be certain your GMA test has been proven not to have any disparate racial or gender impacts! (Using a test that does is illegal in the US and much of the rest of the world.) The one I believe in and use with my clients, PXT Select, is described here. (And work with an HR professional to ensure you’re staying well within the laws regarding the use of these tools!)
Read more about the pros and cons of pre-employment testing here.
Test for Team Fit
My longtime friend and former boss David, a serial CTO, recommends a specific approach to testing for team fit. When you’re down to your finalists, arrange a team interview, then sit back and watch the body language of all participants. If a finalist is uncomfortable with your team, you might want to probe for reasons. If your team is uncomfortable with the finalist, you definitely want to probe. Just don’t use comfort itself as a red-line test. (If you allowed the potential discomfort of an all-white team to nix a black candidate, you’d be breaking the law, violating social norms, and harming your team — all at the same time.)
David once had a candidate ignore the team and address all her answers to David — even when the team members asked questions. He hired her, and sure enough she was a lone wolf who played for David’s attention while ignoring the team’s needs.
Create a Job Performance Profile
Describing a job in terms of behaviors and performance is not easy. Fortunately there are tools to help.
I like the library of profiles available through O*Net, and the one provided by Wiley’s PXT Select product. Here’s what I’m working with to help a national trade association as they look to fill a newly created COO role:
O*Net Profile for a General Operations Manager
Performance Model for Operations Manager (partial)
The full performance model includes Thinking Styles and Interests – a backgrounder on this tool can be requested here.
Candidates Scored vs a Performance Model
There are no generically “good” candidates, only good-fit candidates: you must assess the candidate as a fit to your culture and to the unique demands of the role.
For example, a candidate might succeed in a Data Scientist role only by exercising judgment that is deliberative, methodical, and almost entirely factual, avoiding intuition. An Operations Manager, by contrast, must move quickly amid ambiguity and can only succeed by being fast and intuitive. The same candidate could be a great fit for one role and a terrible fit for the other.
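To make the scoring idea concrete, here is a deliberately simplified sketch of matching a candidate against a role’s performance model. The trait names, target ranges, and scores below are invented for illustration; a real tool like PXT Select uses its own validated scales and weighting.

```python
# Hypothetical performance model: trait -> (low, high) target range
# on a 1-10 scale. An Operations Manager must decide fast amid ambiguity.
OPS_MANAGER_MODEL = {
    "decisiveness": (7, 10),
    "deliberation": (1, 5),
    "independence": (4, 8),
}

def fit_score(candidate, model):
    """Fraction of model traits where the candidate falls inside
    the role's target range."""
    hits = sum(low <= candidate[trait] <= high
               for trait, (low, high) in model.items())
    return hits / len(model)

candidate = {"decisiveness": 8, "deliberation": 3, "independence": 6}
print(f"fit = {fit_score(candidate, OPS_MANAGER_MODEL):.0%}")  # → fit = 100%
```

The same candidate scored against a (hypothetical) Data Scientist model, with its high deliberation range, would land outside most ranges, which is exactly the “great fit for one role, terrible fit for the other” effect described above.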
Conclusion
Hiring excellence is yours for the taking.
Use the available data.
Resolve to be methodical in your approach.
Use a reputable and EEOC-compliant GMA test, an integrity test, and structured interviews with behavioral (not hypothetical) questions.
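The essential mechanics of a structured interview can be sketched in a few lines: the same behavioral questions for every candidate, an anchored rating scale for each answer, and mechanical (not holistic) combination of the ratings. The questions and scale below are hypothetical examples, not a validated instrument.

```python
# Hypothetical structured-interview sketch: identical behavioral questions,
# each answer rated 1-5 against pre-written anchors, ratings averaged.
QUESTIONS = [
    "Tell me about a time you missed a deadline. What did you do next?",
    "Describe a conflict with a peer and how you resolved it.",
]

def interview_score(ratings):
    """Mean of per-question ratings; exactly one 1-5 rating per question."""
    assert len(ratings) == len(QUESTIONS)
    assert all(1 <= r <= 5 for r in ratings)
    return sum(ratings) / len(ratings)

print(interview_score([4, 3]))  # → 3.5
```

The point of the structure is comparability: because every candidate faces the same questions and the same anchors, scores can be compared across candidates and tracked against 18-month outcomes later.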
Footnotes
[1] Kevin Kelly, CEO of executive search firm Heidrick & Struggles, regarding the firm’s internal study of 20,000 searches, in an interview with Brooke Masters, “Rise of a headhunter,” Financial Times, March 30, 2009.
[2] Hunter, John E., and Ronda F. Hunter. “Validity and utility of alternative predictors of job performance.” Psychological Bulletin 96.1 (1984): 72.
[3] Schmidt, Frank L., and John E. Hunter. “The validity and utility of selection methods in personnel psychology: Practical and theoretical implications of 85 years of research findings.” Psychological Bulletin 124.2 (1998): 262.
[4] Schmidt, Frank L. “The Validity and Utility of Selection Methods in Personnel Psychology: Practical and Theoretical Implications of 95 Years of Research Findings.” 2013.
[5] Lorence, Michael S. (4 May 2014). The Impact of Systematically Hiring Top Talent: A Study of Topgrading as a Rigorous Employee Selection Bundle (Thesis). Georgia State University.
[a] From Schmidt, Shaffer, and Oh (2008, Table 3). Individual meta-analytic estimates are reported in Table 1 on p. 838. The average of these estimates across eight meta-analytic estimates (.647) is presented in Table 3 on p. 843. We used this average in the current analyses.
[b] From Ones, Viswesvaran, and Schmidt (1993, Table 8). This operational validity is from predictive studies conducted on job applicants, as in Schmidt and Hunter (1998); the same source was used in Schmidt and Hunter (1998), but the operational validity reported in this table was corrected for IRR. The unrestricted observed correlation with GMA is .046 (Ones et al., 1993, Table 3).
[c], [d] From McDaniel, Whetzel, Schmidt, and Maurer (1994, Table 4). This operational validity is from primary studies where overall job performance was measured using research-purpose measures, and thus represents the most unbiased estimates available. The same source was used in Schmidt and Hunter (1998), but the operational validity estimates used in this table were corrected for IRR with the meta-analytic reliability estimates for the interview measure from Conway, Jako, and Goodman (1995). When the predictor reliability estimate from McDaniel et al. (1994) was used in correcting for IRR, the operational validities for structured and unstructured interviews were .53 and .46, and the gains in validity over GMA tests were .088 and .33, respectively. The unrestricted observed correlations with GMA are .305 and .402 for structured and unstructured interviews, respectively (Salgado & Moscoso, 2002, Tables 4 and 3, respectively).
[e], [v] From Schmidt et al. (2008, Table 1). The unrestricted observed correlations with GMA are .069 for Conscientiousness and .159 for Emotional Stability (Judge, Jackson, Shaw, Scott, & Bruce, 2007, Table 3). True score correlations alone were reported in Judge et al. (2007). We attenuated the true score correlations for predictor unreliability in both variables using the psychometric information provided by Timothy A. Judge.
[f] From Hunter and Hunter (1984, Table 9). The same source/validity was used in Schmidt and Hunter (1998). The correlation with GMA is assumed to be zero, as in Schmidt and Hunter (1998).
[g] From Rothstein, Schmidt, Erwin, Owens, and Sparks (1990, Table 5). The same source/validity was used in Schmidt and Hunter (1998). The unrestricted observed correlation with GMA is .761 (Schmidt & Hunter, 1988, p. 283).
[h] From Sturman (2003, Table 1). The unrestricted observed correlation with GMA is .069 (Judge et al., 2007, Table 3).
[i] From Kristof-Brown, Zimmerman, and Johnson (2005, Table 1). The correlation with GMA (in fact, college GPA) is .023 (Cable & Judge, 1996); note that the value is based on one primary study.
[j], [u] From McDaniel, Hartman, Whetzel, & Grubb III (2007, Table 3). The unrestricted observed correlations with GMA are .589 and .364 for SJT (knowledge) and SJT (behavioral tendency), respectively (McDaniel et al., 2007, Table 3).
[k] From Arthur, Day, McNelly, and Edens (2003, Table 3). The correlation with GMA is .710 (Collins, Schmidt, Sanchez Ku, Thomas, McDaniel, & Le, 2003).
[l] From Hunter and Hunter (1984, Table 10). The same source/validity was used in Schmidt and Hunter (1998). Based on Schmidt and Hunter (1998), we used the unrestricted observed correlation with GMA of .594 (.50 without correcting for RR).
[m], [x] From McDaniel, Schmidt, & Hunter (1988). The correlations with GMA are .000 and .682 for the T&E point and behavioral consistency methods, respectively (Schmidt & Hunter, 1998); note that these are assumed values.
[n] From Hunter and Hunter (1984, Table 9). The same source/validity was used in Schmidt and Hunter (1998). The correlation with GMA is zero (Schmidt & Hunter, 1998); note that this is an assumed value.
[o] From Hunter and Hunter (1984, Table 9). The same source/validity was used in Schmidt and Hunter (1998). The correlation with GMA is zero (Schmidt & Hunter, 1998); note that this is an assumed value.
[p], [q] From Van Rooy and Viswesvaran (2004, Table 1 for ability-based measures and Table 2 for mixed-trait measures). The unrestricted observed correlations with GMA are .497 and .245 for ability-based and mixed-trait measures, respectively (Van Rooy, Viswesvaran, & Pluta, 2005, Tables 3 and 4, respectively).
[r] From Roth, BeVier, Switzer, and Schippmann (1996). The operational validity estimates for GPA (combination of college, graduate, and PhD/MD GPAs) and college GPA are the same. The unrestricted observed correlation with GMA is .619 (Robbins, Lauver, Le, Davis, Langley, & Carlstrom, 2004, Table 5).
[s] From Arthur, Bell, Villado, and Doverspike (2006, Table 1). The correlation with GMA (in fact, GPA) is .092 (Cable & Judge, 1996, 1997); we performed a small meta-analysis of these two articles to derive the estimate used in this study.
[t] From Roth, Bobko, and McFarland (2005, Table 1). The unrestricted observed correlation with GMA is .585 (Roth et al., 2005, Table 4).
[w] From Hunter and Hunter (1984, Table 9). The same source/validity was used in Schmidt and Hunter (1998). Based on Schmidt & Hunter (1998), we used the correlation of .663 (.38 without correcting for RR) for this analysis.
[x] From Hunter and Hunter (1984, Table 9). The same source/validity was used in Schmidt and Hunter (1998). The correlation with GMA is zero (Schmidt & Hunter, 1998); note that this is an assumed value.
[y] From Hunter and Hunter (1984, Table 11). The same source/validity was used in Schmidt and Hunter (1998). Based on Schmidt and Hunter (1998), we used the correlation of .747 (.48 without correcting for RR) for this analysis.