Core Thesis
Modern law-enforcement screening is failing at the front end because it remains too dependent on subjective, clinician-mediated judgment and too underdeveloped as an evidence-based risk-screening system. The problem is not that agencies lack access to warning signs. It is that they continue to treat predictive hiring as an exercise in impression, discretion, and late-stage professional opinion rather than a disciplined process built around standardized, behavior-specific, empirically validated indicators of future misconduct. The article’s findings make that failure difficult to excuse. It tracked 6,075 hired officers over a five-year period and found that fifteen of nineteen prehire misbehavior indicators significantly related to later police misconduct, with hazard ratios reaching as high as 14.59. Yet those same prehire indicators had only minimal impact on actual hiring outcomes, meaning agencies often possessed relevant warning information and still hired through it.
That gap between what is knowable and what is operationally used is the real indictment. For decades, police reform has been structured around downstream correction: retraining, discipline, decertification, litigation, civilian oversight, and public scandal management. But a system that waits until after predictable misconduct becomes an excessive-force complaint, a sexual-misconduct allegation, a vehicle misuse incident, a written reprimand, a lawsuit, or an arrest is not functioning as a serious preventive system. It is functioning as a damage-control system. The article shows that prior occupational trouble, prior law-enforcement discipline, histories of temper and violence, domestic-violence indicators, bad credit, and other concrete markers of instability or irresponsibility are not random biographical details. They are measurable signals with differing predictive force. Some are common enough to support population-level screening. Others are rare but severe enough to operate as critical red flags. What the current hiring architecture too often does instead is flatten those distinctions, dilute them through unstructured discretion, or subordinate them to generalized clinical impressions formed too late in the process.
The inadequacy of the present model is therefore structural, not incidental. Clinical evaluations may have a bounded role in determining whether an applicant presents psychopathology or other functional concerns in a post-offer medical setting. But the article demonstrates that this clinical model is a poor substitute for the primary work that police hiring actually requires: identifying which applicants bring empirically demonstrable misconduct risk into the institution before they are armed with authority, discretion, credibility, and the power to alter the lives of others. The article expressly criticizes the field’s lack of national standards, its weak criterion-related validation practices, and its reliance on qualitative evaluation rather than empirically grounded, behavior-specific benchmarks. That critique cuts to the core of the problem. A hiring system designed to predict misconduct should be anchored to what is observable, disaggregated, and validated—not to what feels persuasive in a clinical interview or appears manageable in a discretionary review.
This thought-piece therefore advances a direct claim: current law-enforcement screening tools are inadequate because they privilege subjective professional opinion over standardized predictive evidence. That inadequacy has public consequences. It permits departments to treat later misconduct as an unfortunate surprise when, in many cases, it was a foreseeable institutional choice. It also permits agencies to preserve the optics of rigor while avoiding the discipline of true evidence-based screening. The article points toward a different model—one built on behavior-specific benchmarks, structured decision rules, validated prehire indicators, and earlier screening architecture designed to remove high-risk candidates before the process turns to more expensive and more impressionistic late-stage evaluations. The reform question is not whether police departments should screen more. They already screen. The real question is whether they will stop screening by hunch and begin screening by evidence.
Executive Summary
Police reform debates are usually staged at the point of visible failure. An officer uses excessive force. A civilian alleges sexual misconduct. A patrol car is misused. A body-camera incident goes viral. A municipality pays out another settlement. A chief announces retraining, discipline, closer supervision, or renewed accountability. Then the cycle repeats. What is missing from that ritual is a more uncomfortable question: how much of this misconduct was predictable before the officer was ever hired? The article The Importance of Not Looking the Other Way: Prehire On- and Off-the-Job Misbehavior Predicts Subsequent Police Misconduct supplies a direct answer. A meaningful share of later police misconduct is not merely a posthire cultural development or a sudden moral collapse. It is associated with identifiable prehire warning signs that agencies already collect, often ignore, and inadequately integrate into hiring decisions.
The study examined an archival database of 8,539 police candidates across more than 150 municipal, county, state, and federal agencies in the United States. Of those candidates, 6,075 were hired and tracked for up to five years. The researchers evaluated four clusters of prehire misbehavior: prior occupational trouble and employment instability, prior trouble in law-enforcement jobs, prior temper problems and violence, and prior irresponsible behaviors. They then measured how those prehire signals related to later categories of police misconduct, including interpersonal misconduct and violence, property damage or misuse, conduct and competence concerns, and records of professional misconduct. Their findings were not marginal. Fifteen of nineteen prehire indicators significantly related to later misconduct, and some hazard ratios were very large. In other words, the study does not merely suggest that “bad background facts” matter in the abstract. It identifies concrete categories of prior behavior that carry measurable predictive value for future police misconduct.
Those findings alone would make the article important. But the more consequential point lies in what the agencies did with this information. The article examined whether negative prehire incidents materially reduced hiring prospects. The answer was essentially no. On average, disclosing prior misconduct or instability reduced hiring chances by only about 5%, and none of the individual confidence intervals excluded 1.00. That means departments were not operating in a world of ignorance. They were operating in a world where significant warning information existed, was gathered, and yet had very limited practical effect on whether candidates entered the profession. The article also found that prior law-enforcement experience did not inherently reduce risk and, in a number of categories, slightly increased liability for later misconduct. That finding undermines one of the profession’s most persistent assumptions: that prior badge experience is itself a protective credential.
This thought-piece takes those findings and presses the institutional question the article invites but does not itself fully develop. Why do current law-enforcement screening tools remain so inadequate even when the predictive signals are available? The answer advanced here is that the profession continues to rely too heavily on subjective, clinician-mediated screening logic and too little on objective, behavior-specific, empirically benchmarked decision architecture. Clinical evaluations may have a role, but they are too often treated as the prestige layer of the process—the stage that makes the system look rigorous, careful, and professionally grounded. The article itself explains that clinical psychologists are routinely involved in post-offer evaluations focused on psychopathology, interviews, and written assessments, but also notes that existing guidelines overlook behavioral risk assessment and lack discussion of empirically supported prehire indicators. That is a category error with serious consequences. A system designed to predict misconduct should not rest primarily on generalized impressions about traits, maturity, or judgment when more direct and validated risk indicators are available.
The sections that follow will argue that the existing hiring model mistakes impression for prediction. It does not fail because it asks too few questions. It fails because it asks the wrong kind of decision-maker to do too much of the predictive work, too late in the process, using tools that are too subjective, too variable, and too weakly anchored to criterion-linked evidence. The article shows what a better framework would look like: earlier use of validated prehire indicators, structured and behavior-specific benchmarks, verification through background investigation, and more disciplined use of empirical screening methods rather than broad, discretionary evaluation. That shift matters because police officers do not enter ordinary workplaces. They enter a profession that grants coercive power, discretionary authority, and the ability to inflict harm under color of law. In that setting, preventable hiring error is not merely an HR problem. It is a public-safety problem, a civil-rights problem, and an institutional-accountability problem.
The larger public point is therefore straightforward. Departments frequently portray later misconduct as a matter of isolated failure, inadequate supervision, or posthire drift. But if an agency hires through prior reprimands, warnings for negligence, unstable employment history, histories of temper and violence, domestic-violence indicators, or prior law-enforcement discipline, then later misconduct cannot honestly be framed as wholly unforeseeable. At some point the problem ceases to be the absence of information and becomes the misuse of information. That is where this thought-piece begins. It does not argue that no applicant can change, that every prior act should produce automatic exclusion, or that psychology has no place in police hiring. It argues something more measured and more damaging: a screening system built around subjective opinion rather than validated predictive evidence will continue to miss the misconduct it claims to screen out, and institutions will continue to call the result unfortunate rather than predictable. The article makes clear that the warning signs are there. The failure is that the profession still lacks the discipline to treat them as what they are.
I. Introduction: Police Hiring Still Mistakes Impression for Prediction
Police reform in the United States is usually narrated at the point of visible collapse. The public sees the body-camera clip, the excessive-force allegation, the sexual-misconduct complaint, the unlawful search, the lawsuit, the settlement, the criminal charge, or the delayed disciplinary response. Institutions then answer in a familiar register. There will be another review, another training cycle, another set of supervisory expectations, another claim that the department is committed to accountability. What almost never receives equal attention is the question that logically comes first: who was permitted to enter the institution, with what warning signs, and under what decision rules. The article at the center of this thought-piece begins from that neglected premise. It frames police misconduct not only as a posthire supervision problem, but as a screening and hiring problem that reform debates have largely overlooked. Its opening claim is direct: while most reform efforts focus on officers’ actions after they are hired, rigorous prehire screening is an underused mechanism for reducing later misconduct.
That shift in perspective matters because the profession has grown accustomed to treating misconduct as if it were largely emergent: a function of culture, stress, poor supervision, bad incentives, or moral decline after appointment. The article does not deny that posthire conditions matter. But it rejects the assumption that the prehire stage has little to contribute. Drawing on a dataset of 8,539 candidates across more than 150 agencies, the researchers tracked 6,075 hired officers for up to five years and examined whether specific forms of prior employment trouble and off-duty misbehavior predicted later police misconduct. They found that fifteen of nineteen unique prehire indicators significantly related to later misconduct, with some hazard ratios reaching 14.59. The practical implication is difficult to soften: later police misconduct is not always a surprise generated entirely inside the institution. In many instances, it is associated with signals that existed before the candidate ever put on the uniform.
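To make the scale of a hazard ratio like 14.59 concrete, a small illustration helps. Under a simplified constant-hazard (exponential survival) model, which is an assumption of this sketch and not a claim from the study, a hazard ratio multiplies the instantaneous rate of a first misconduct event, and cumulative five-year risk can be compared directly. The 1% baseline annual hazard below is likewise hypothetical, chosen only to show how sharply a large ratio separates groups:

```python
import math

def cumulative_risk(baseline_annual_hazard: float, hazard_ratio: float, years: float) -> float:
    """P(at least one event within `years`) under a constant-hazard model.

    Exponential survival gives S(t) = exp(-h * t), so the cumulative
    event probability with a hazard ratio HR is 1 - exp(-h * HR * t).
    """
    return 1.0 - math.exp(-baseline_annual_hazard * hazard_ratio * years)

# Hypothetical baseline: 1% annual hazard of a first misconduct event.
baseline = 0.01

no_flag = cumulative_risk(baseline, hazard_ratio=1.0, years=5)
flagged = cumulative_risk(baseline, hazard_ratio=14.59, years=5)

print(f"5-year risk, no prehire flag: {no_flag:.1%}")   # roughly 4.9%
print(f"5-year risk, HR = 14.59:     {flagged:.1%}")    # roughly 51.8%
```

Even under this deliberately simple model, the arithmetic shows why the article treats such indicators as screening-grade evidence rather than background color: a large hazard ratio turns a modest baseline risk into a coin-flip over a five-year career window.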
The article also helps explain why the profession keeps missing what it claims to screen out. It is not because agencies ask no probing questions. On the contrary, the paper notes that law-enforcement agencies and screening psychologists in many jurisdictions already inquire into early life experiences, employment history, prior law-enforcement service, and off-duty conduct such as driving history and domestic violence. The defect is different. According to the article, the criterion-related validity of such practices is rarely examined, evidence-based policy guidance is lacking, and the information that is gathered is often evaluated qualitatively rather than against empirical benchmarks. That means the problem is not simply an information deficit. It is a decision-architecture deficit. Police hiring too often collects risk information without converting it into disciplined, defensible, evidence-based exclusion criteria.
That conclusion becomes even more damaging when the article turns from prediction to actual hiring behavior. After identifying significant prehire indicators of later misconduct, the researchers examined whether agencies materially used that information when deciding whom to hire. They found that negative prehire incidents had only a minimal effect on hiring outcomes. The average reduction in hiring probability was about 5%, and none of the individual confidence intervals excluded 1.00. In other words, departments were not operating in a world where warning signs were unavailable. They were operating in a world where such signs existed, were collected, and yet had little practical force at the front end of the hiring process. A profession cannot persuasively describe later misconduct as wholly unforeseeable when the same profession had access to indicators that meaningfully predicted later risk and still hired through them.
The article’s treatment of police reform more broadly reinforces that point. It surveys the now-familiar menu of reactive interventions: ballot initiatives, decertification reform, debates over qualified immunity, de-escalation training, contract reform, and community-based alternatives. The researchers do not dismiss those efforts. They instead identify their structural limitation. Accountability measures imposed after an officer is already on the force address consequences more than prevention. That distinction is central. A department may improve disciplinary response, expand decertification authority, and still preserve a hiring architecture that repeatedly imports avoidable risk. The article therefore positions evidence-based screening and selection as a vital but neglected strategy for curbing police misconduct. That is not a rhetorical flourish. It is the organizing premise of the study.
The reason this argument carries force is that the article does not rest on vague intuitions about character. It is anchored in a larger body of research on counterproductive work behavior and behavioral consistency. The paper situates itself in the literature showing that misconduct, deviance, and analogous antisocial behaviors often reflect stable individual differences and persist across time and contexts. It invokes the general proposition that past behavior predicts future behavior, but then goes further by insisting that police screening cannot stop at broad abstractions such as “low self-control” or generic “risk.” Agencies need empirically grounded, behavior-specific benchmarks that can distinguish which prior acts matter most, which require further scrutiny, and which carry less predictive significance. Without that level of specificity, departments are left with heuristics, impression, and inconsistent judgment.
That is where the deeper criticism of modern police hiring begins to take shape. The article repeatedly contrasts empirically validated, disaggregated indicators with the looser ways in which departments have historically processed applicant information. It says there is no standard set of misbehavior signals used across agencies or investigators. Worse, the information that is gathered is typically assessed based on agency and background-investigator experience rather than empirical benchmarks. This is not a minor administrative imperfection. It goes to the legitimacy of the screening enterprise itself. A system built around qualitative evaluation will tend to flatten important distinctions, overvalue some concerns, undervalue others, and allow precisely the sort of discretionary dilution that makes later institutional surprise possible. The profession then misdescribes the resulting misconduct as a supervision failure or an accountability failure when part of the problem is antecedent: the wrong people cleared the gate because the gate was not designed around validated prediction.
The article’s findings on lateral hiring sharpen that point even further. One of the most entrenched assumptions in policing is that prior law-enforcement experience should count as a stabilizing credential. Yet the study found the opposite pattern in several important respects. Candidates with prior law-enforcement experience did not inherently present reduced risk; in a number of misconduct categories they showed elevated liability, including greater risk of complaints for unprofessional conduct, sexual harassment, excessive force, inappropriate weapon use, and misuse of official vehicles. The article states plainly that prior law-enforcement experience does not inherently reduce misconduct risk. That finding is especially important because it exposes one of the profession’s habitual shortcuts. Departments eager to hire trained and certified candidates may treat prior experience as evidence of safety, when in fact the article shows it can function as a conduit for recycled risk.
The same dynamic appears in the article’s discussion of the lack of national standards. Although every state has a standards body governing law enforcement, the authority of those bodies varies widely. Some regulate heavily, while others do little more than advise lawmakers. The article also notes substantial variation in how agencies evaluate information such as prior criminal background, drug use, and records of prior misconduct, especially for lateral applicants. It highlights the “muni shuffle,” in which an officer facing misconduct issues can resign, preserve certification, and move to another agency whose hiring process either does not fully review the prior trouble or treats it as administratively unusable. A decentralized system of that kind invites inconsistency by design. When one couples that fragmentation with qualitative review, the result is not merely uneven hiring. It is foreseeable institutional under-screening.
For present purposes, then, the introduction to this thought-piece must do more than restate that police misconduct is harmful. The article already establishes that at length, describing both the social cost of police violence and the broader financial, legitimacy, and public-safety consequences of misconduct. The more urgent point is narrower and more uncomfortable. Police departments often behave as though the true challenge begins after misconduct manifests. The article shows that this is too late and, in many instances, conceptually backward. The warning signs often appear before appointment. The institution already asks about them. The predictive value of many of those signs is measurable. Yet the hiring process continues to rely too heavily on loose judgment and too little on standardized, evidence-based screening. That is why the problem is not simply that some officers later go bad. The problem is that departments continue to mistake information gathering for disciplined prediction.
This thought-piece begins from that institutional failure. It does not argue that every prior act demands automatic exclusion, that no candidate can change, or that all psychological screening is useless. The article does not support those propositions, and neither should a careful analysis. What it does support is more than enough. It supports the conclusion that prehire misconduct signals can predict later police misconduct; that current police screening lacks national consistency and sufficient empirical benchmarking; that agencies often evaluate applicant risk qualitatively rather than systematically; and that they have used predictive prehire information only minimally in actual hiring decisions. On that record, later official surprise begins to look less like a good-faith reaction to unforeseeable misconduct and more like a product of a hiring system that still mistakes impression for prediction.
II. The Category Error: Treating a Predictive Screening Problem as a Clinical Impression Problem
One of the profession’s central mistakes is not that it ignores psychology, but that it asks the wrong form of evaluation to do the wrong kind of work. The article makes clear that law-enforcement applicants in the United States commonly undergo extensive preemployment psychological testing. It also shows that these assessments vary significantly across agencies and jurisdictions, both in content and rigor. That variability matters because the profession has long acted as if the presence of a psychological screen itself proves seriousness. It does not. The article’s critique is subtler and more consequential. It says that criterion-related validity is rarely examined, evidence-based guidance is lacking, and agencies do not have the behavior-specific empirical benchmarks needed to connect applicant information to later misconduct risk in a disciplined way. The result is a category error: a system that should be organized around validated prediction is too often organized around generalized professional judgment.
The footnote on page two is especially revealing because it identifies the standard against which the profession is falling short. The article states that screening psychologists typically rely on subject-matter-expert judgments and content- and construct-related validation, while professional standards, evidence-based HR principles, and legal requirements describe the standards for empirically establishing reliability, validity, and freedom from bias in employment-related assessments. That distinction is not cosmetic. It is the difference between a process that appears thoughtful and one that is demonstrably predictive. Subject-matter judgment may help generate hypotheses. It is not a substitute for criterion-linked validation. In a field as high-stakes as policing, where the costs of hiring error include coercive abuse, litigation, financial settlements, community distrust, and preventable harm, the reliance on generalized judgment rather than stronger predictive architecture is not a technical quibble. It is a structural weakness.
The article’s own framing of the research problem confirms that point. It repeatedly says agencies need empirically grounded, behavior-specific benchmarks. It criticizes prior studies for relying on composite indices such as “criminal history” or “past employment problems,” or on bundled indicators and broad psychological constructs such as low self-control. Those approaches may be informative at a theoretical level, but the article says they do not provide the kind of actionable screening guidance needed for transparent and defensible hiring decisions. Screening, the authors explain, is different from broad selection. It requires identifying concrete exclusion criteria tied to elevated risk. When agencies lack that type of evidence, they fall back on heuristics that may exclude low-risk candidates while missing high-risk ones. That is precisely the logic of an impression-driven system.
What makes this a category error rather than a mere implementation problem is the nature of the task itself. The article is not about diagnosing illness. It is about predicting misconduct. That predictive task requires a close match between the signal and the criterion. The paper invokes classic personnel-psychology reasoning that prior behavior, especially when closely matched to the behavior being predicted, is often the best signal of future performance or misconduct. It therefore organizes its analysis around direct, observable indicators: prior occupational trouble, prior trouble in law-enforcement jobs, prior temper problems and violence, and prior irresponsible behavior. These are not abstract personality narratives. They are behavioral categories tied to specific later outcomes such as excessive-force complaints, sexual misconduct accusations, vehicle misuse, reprimands, lawsuits, and criminal charges. The researchers are explicit that the goal is to move beyond validating general deviance principles and instead provide item-level insights that can guide targeted, defensible screening benchmarks.
That orientation is fundamentally different from a model that privileges broad clinical impression. The article notes that applicants underwent hiring suitability evaluations through a psychological service provider after receiving conditional offers, and that prehire misbehavior data were collected through a standardized background questionnaire followed by a structured follow-up interview. The study’s analytic emphasis, however, is not on the evaluator’s holistic impression of the candidate. It is on the predictive value of discrete behavior signals. The article’s most important practical recommendation is not “trust the seasoned screener.” It is to use empirically grounded indicators with known base rates and known predictive value. Tier 1 signals, such as prior written reprimands or suspensions in law-enforcement work, unfavorable termination conditions, bad credit, and negligence warnings, are recommended for systematic use because of their prevalence and predictive stability. Tier 2 signals, such as domestic-violence citations or prior unjustified use of force, are identified as rare but severe red flags that should not be overlooked. That is not impression management. It is benchmark-driven screening.
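The tiered logic described above can be sketched as a transparent decision rule. The tier membership below mirrors the indicators the article names, but the data structure, function names, outcome labels, and the "two or more Tier 1 signals" threshold are illustrative assumptions for this sketch, not benchmarks drawn from the study:

```python
# Illustrative sketch of a behavior-specific, tiered screen-out rule.
# Tier membership follows the indicators discussed in the article; the
# thresholds and outcome labels are assumptions for illustration only.

TIER_1 = {  # prevalent indicators with stable predictive value
    "prior_written_reprimand",
    "prior_suspension",
    "unfavorable_termination",
    "bad_credit",
    "negligence_warning",
}

TIER_2 = {  # rare but severe red flags
    "domestic_violence_citation",
    "prior_unjustified_use_of_force",
}

def screen(candidate_flags: set[str]) -> str:
    """Map a candidate's verified prehire indicators to a screening outcome."""
    if candidate_flags & TIER_2:
        return "flag: critical indicator, heightened review required"
    tier1_hits = len(candidate_flags & TIER_1)
    if tier1_hits >= 2:  # illustrative threshold, not the article's
        return "flag: multiple Tier 1 indicators, structured review"
    if tier1_hits == 1:
        return "note: single Tier 1 indicator, document and verify"
    return "clear: no validated risk indicators on record"

print(screen({"bad_credit", "prior_suspension"}))
print(screen({"domestic_violence_citation"}))
print(screen(set()))
```

The point of a rule like this is not automation for its own sake. It is that every outcome is traceable to a named, validated indicator and an explicit threshold, which is exactly the auditability that qualitative, experience-based review cannot provide.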
The article’s criticism of current practice becomes even sharper when it discusses how agencies actually process applicant information. It states that there is no standard set of misbehavior signals used by different agencies or background investigators and that the information gathered is typically evaluated qualitatively based on experience rather than empirical benchmarks. That sentence should be read for what it is: an indictment of unstructured discretion. Once a system knows that some signals strongly predict later misconduct, qualitative review becomes less defensible as the primary decision mechanism. It permits the evaluator to blur distinctions between common but informative indicators and rare but critical red flags. It also permits agencies to preserve the aesthetics of individualized assessment while avoiding the discipline that evidence-based screening requires. The article is careful in tone, but its findings point toward a harsher institutional truth: the profession has normalized a form of applicant review that sounds sophisticated while remaining underdeveloped as predictive science.
The article’s treatment of domestic violence, prior law-enforcement misconduct, and financial irresponsibility illustrates why this distinction matters. In each area, the authors do not rely on a generalized story about “good character” or “poor judgment.” They identify specific behavioral markers and ask whether those markers predict later police misconduct. They explain, for example, why domestic violence should not be understood as an isolated private incident but as a marker of violent and abusive behavior over time. They likewise explain why prior trouble in law-enforcement work carries special predictive force because it occurs in a highly commensurate setting: the same profession, similar norms, similar opportunities, similar authority structure. And they examine irresponsible behaviors such as moving violations, DUI, bad credit, and arrears on alimony or child support because those behaviors may signal broader patterns of irresponsibility or instability relevant to police work. In each case, the method is behavior-specific and criterion-linked. It is built to answer a predictive question.
By contrast, a system overly dependent on generalized clinical impression risks treating all concerning information as merely one component of a “whole person” narrative. That may sound humane or nuanced, but it is not what the article says a serious screening program needs. The authors repeatedly return to specificity. They criticize the sparse meta-analytic evidence available to agencies, note that older syntheses focused on only a few narrow predictors such as traffic tickets and arrest history, and explain that broad constructs do not yield the misbehavior-specific risk estimates required for actual screen-out decisions. The practical consequence is straightforward. When the institution relies on broad impressions rather than specific validated predictors, it becomes harder to explain why a candidate with a domestic-violence citation, prior reprimands, unfavorable termination history, or documented unjustified use of force was nevertheless deemed acceptable. The process may have felt careful. But according to the article’s own framework, feeling careful is not the same as being predictively competent.
The article’s findings on hiring behavior make the category error even more visible. If departments had truly built their screening systems around predictive signals, the presence of meaningful prehire indicators should have materially influenced hiring decisions. Yet the researchers found the opposite: only a minimal reduction in hiring probability followed disclosure of prehire misbehavior. At the same time, agencies responded far more decisively to posthire misconduct once it occurred. That asymmetry matters. It suggests that departments do not ignore misconduct as such; they are capable of treating it seriously. What they underuse is the predictive value of prehire conduct. That is exactly what one would expect from a system that treats hiring as an exercise in individualized judgment rather than structured risk management. It waits for misconduct to become institutional fact instead of treating prior behavioral signals as meaningful evidence of future liability.
The correct conclusion, then, is not that psychology has no role in police hiring. The article does not support that proposition. The proper conclusion is that predictive screening cannot responsibly be subordinated to generalized opinion. The article supports a model in which the front end of police hiring is organized around standardized background signals, empirically grounded benchmarks, and behavior-specific screen-out recommendations. In that model, professional judgment still exists, but it is bounded by validated evidence rather than used as a substitute for it. The present system too often reverses that order. It places too much weight on subjective evaluation and too little on what the data say about actual future misconduct. That is the category error this thought-piece seeks to expose. The institution has treated a predictive screening problem as if it were principally a matter of professional impression. The article shows why that approach is inadequate and why the costs of that inadequacy are borne not only by departments, but by the public they police.
III. What Better Prediction Actually Looks Like: Behavior-Specific, Standardized, Criterion-Linked Screening
If police agencies are serious about reducing predictable misconduct, the question is no longer whether they should screen. They already do. The question is whether they will continue using a screening model built around generalized concern and discretionary impression, or whether they will move toward one built around validated prediction. The article makes that divide unmistakable. It does not argue for more vague scrutiny. It argues for something more disciplined: empirically grounded, behavior-specific benchmarks that tell agencies which prehire acts actually forecast later police misconduct, which ones require heightened scrutiny, and which ones carry comparatively little predictive significance. That is a fundamentally different project from the one many agencies still perform. The current system often gathers information and calls the process rigorous. The article calls for a system that knows what the information means and uses it according to demonstrated predictive value.
The importance of that distinction becomes clear in the article’s criticism of earlier research and existing practice. The authors explain that much of the prior literature either focused on posthire misconduct, relied on composite indicators such as “criminal history” or “past employment problems,” or used broad constructs such as low self-control. Those approaches may be useful in academic theory or broad discussions of deviance, but the article says they do not generate the behavior-specific estimates needed for actual screen-out decisions. Screening, as the authors frame it, is not merely about identifying generally concerning people. It is about identifying concrete exclusion criteria tied to elevated risk. That means the best screening system is not the one with the most dramatic interview, the most probing questionnaire, or the most self-assured evaluator. It is the one with the clearest evidence connecting specific prehire signals to later misconduct outcomes.
The article’s own architecture shows what such a system looks like. Rather than collapsing all applicant risk into a single intuition about “suitability,” the researchers disaggregate prehire signals into four major domains: prior occupational trouble and employment instability; prior trouble in law-enforcement jobs; prior temper problems and violence; and prior irresponsible behavior. That structure matters because it translates abstract concern into categories that can actually be investigated, documented, and assessed against later outcomes. It also reflects an important conceptual point: risk in policing is not monolithic. An applicant with a history of repeated job hopping and negligence warnings does not present the same pattern of liability as an applicant with a domestic-violence citation or a confirmed history of unjustified force. The article insists on this differentiation because prediction improves when the signal is closely matched to the conduct the institution claims it wants to prevent.
The article is especially strong on the principle of criterion matching. It invokes a longstanding idea from personnel psychology: prior behavior is often the best signal of future behavior when the predictor resembles the criterion. That is why the paper does not stop at generalized biographies. It examines prior occupational trouble as a predictor of later work misconduct. It examines prior law-enforcement trouble as an especially powerful predictor because of commensurability—the similarity of pressures, norms, tasks, and opportunities between one policing role and another. And it examines off-duty aggression, domestic violence, financial irresponsibility, and other nonwork behaviors not because they are morally unattractive in the abstract, but because they may function as observable signals of future misuse of authority, professional misconduct, or poor judgment once the applicant enters policing. In other words, better prediction is not simply stricter prediction. It is more behaviorally matched prediction.
That method is what makes the article’s approach more objective than current law-enforcement screening practice. The authors are not asking departments to rely on loose narratives about who “seems unstable” or who “presents badly.” They are asking them to move toward documented signals with measurable base rates and measurable downstream effects. The article reports, for example, that prehire indicators were collected in a standardized background questionnaire and then confirmed in a follow-up interview. It further explains that these indicators map onto different investigative domains and data sources, which makes systematic screening possible. That practical point should not be missed. A criterion-linked model is not merely more defensible as a matter of theory. It is also more administrable. It tells agencies where to look, what to ask, what to verify, and how to connect the answer to actual hiring decisions.
The article also rejects the comfortable fiction that all applicant concerns must be processed through a wholly individualized, one-off balancing exercise. Instead, it develops a tiered framework. Tier 1 indicators are those with higher base rates and statistically stable relations across multiple outcomes, making them suitable for systematic use in population-level screening. Tier 2 indicators are rarer, but they carry high-severity implications and may justify exclusion despite wider confidence intervals because of their liability significance. This is exactly what a serious predictive model should do. It should not simply collect facts and turn them over to the instincts of a reviewer. It should sort facts by demonstrated predictive value and require the institution to treat them differently based on evidence. The article’s framework therefore moves police hiring away from generalized concern and toward calibrated, standardized response.
The article’s examples show how that calibration works. Some indicators—such as prior written reprimands or suspensions, unfavorable termination conditions, bad credit, and employer warnings due to negligence—appear often enough and predict broadly enough to merit routine use in hiring decisions. Others—such as domestic-violence citations, arrears on alimony or child support, unjustified use of force, and complaints regarding racially offensive behavior—are less common but are treated by the authors as high-risk red flags because of the severity of their associations. That is how a criterion-linked screening architecture should function. It should allow an agency to say, with discipline, that some facts deserve contextual consideration while others cannot reasonably be treated as background noise. A system that treats all concerning conduct as merely one more factor in a subjective “whole person” narrative is not doing this kind of predictive work. It is dissolving meaningful distinctions back into discretion.
Another strength of the article’s framework is that it is built to be transparent. The authors are explicit that agencies need item-level guidance because broad constructs and bundled measures do not produce defensible screen-out decisions. That matters for both fairness and accountability. A transparent system can identify why a particular signal matters, what the evidence says about its predictive force, and how that signal should influence the decision. By contrast, a heavily impression-driven process tends to conceal its reasoning inside generalized statements about maturity, judgment, fit, or concern. Those formulations may sound careful, but they are hard to audit and easy to manipulate. The article’s approach does the opposite. It seeks to make predictive logic visible. It wants agencies to know not only that an indicator is concerning, but why, how often, and in relation to what later conduct. That is what makes it evidence-based rather than merely professionalized.
The four-domain structure also helps explain why police screening must move beyond narrow criminal-history logic. The article makes clear that the most predictive prehire factors are not confined to formal criminal dispositions. Prior occupational trouble matters. Prior law-enforcement discipline matters. Histories of temper problems matter. Domestic violence matters. Financial irresponsibility matters. Moving violations and patterns of reckless conduct matter. That breadth is important because institutions often default to the mistaken assumption that the core task is to identify only the obviously disqualifying criminal applicant. The article shows that the task is more nuanced and more demanding. Many relevant indicators sit outside the narrow boundaries of criminal conviction, yet still forecast later misconduct with meaningful force. A serious screening system therefore cannot be designed only around formal criminality. It must be designed around validated behavioral signals across multiple domains of life and prior work.
There is also an institutional consequence to adopting this framework. Once agencies accept behavior-specific, criterion-linked screening, they lose the rhetorical shelter of vagueness. They can no longer say merely that hiring is complex, that context matters, or that evaluation requires professional judgment. Those things may be true at the margins, but the article shows that they are not enough. When certain prehire signals repeatedly and measurably predict later misconduct, the institution has a stronger obligation to standardize how those signals are treated. It is one thing for an agency to make a difficult judgment in the absence of evidence. It is another to do so when the evidence already identifies which prior behaviors are especially hazardous. Better prediction therefore requires not just better tools, but a willingness to surrender some of the profession’s attachment to discretion masquerading as nuance.
In practical terms, then, better prediction looks like this: identify prehire signals at the level of concrete behavior; verify them through standardized background processes; sort them according to known predictive value; distinguish between common, stable indicators and rare but serious red flags; and use those distinctions to generate actual screen-out rules rather than generalized “suitability” impressions. That is the model the article advances. It does not promise perfect foresight, and it does not pretend that every future act can be prevented. But it does something more credible. It shows how police hiring can move from broad suspicion and subjective balancing toward a more exact, empirically disciplined form of prevention. If departments genuinely want fewer complaints, fewer lawsuits, fewer misconduct records, and fewer predictable scandals, this is what better prediction looks like.
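The screening sequence just described can be sketched as a simple decision rule. To be clear, the indicator names, tier assignments, and thresholds below are hypothetical simplifications of the article's tiered framework, offered only to show what a standardized, auditable screen-out rule could look like once discretion is bounded by evidence:

```python
# Illustrative sketch of a tiered, behavior-specific screening rule.
# Indicator names and tier assignments are simplified stand-ins for the
# article's framework, not an official implementation of it.

# Tier 1: higher-base-rate indicators suitable for routine,
# population-level use in hiring decisions.
TIER1 = {
    "prior_written_reprimand_or_suspension",
    "unfavorable_termination",
    "bad_credit",
    "employer_warning_for_negligence",
}

# Tier 2: rare but severe red flags that may justify exclusion on their own.
TIER2 = {
    "domestic_violence_citation",
    "unjustified_use_of_force",
    "arrears_alimony_or_child_support",
    "racially_offensive_conduct_complaint",
}

def screen(indicators: set[str]) -> str:
    """Return a structured screening recommendation for one applicant."""
    if indicators & TIER2:
        return "red-flag review: presumptive screen-out"
    tier1_hits = len(indicators & TIER1)
    if tier1_hits >= 2:
        return "elevated risk: structured screen-out review"
    if tier1_hits == 1:
        return "heightened scrutiny: verify and weight against outcomes"
    return "no validated indicators: proceed"

print(screen({"bad_credit", "unfavorable_termination"}))
# -> elevated risk: structured screen-out review
```

The point of the sketch is structural rather than numerical: every recommendation it produces is traceable to named, documented indicators, which is precisely the auditability that a generalized "suitability" impression cannot supply.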
IV. The Findings That Undercut the Current Model
The current hiring model is often defended by implication rather than proof. Departments assume that because they use background questionnaires, interviews, psychological evaluations, and post-offer suitability reviews, they are operating a serious preventive system. The article undercuts that assumption in two ways at once. First, it shows that prehire conduct signals can meaningfully predict later police misconduct. Second, it shows that agencies have not been using that predictive information in a way that materially shapes whom they hire. That combination is devastating because it strips the institution of its preferred defense. The problem is not that future misconduct is inherently unknowable. The problem is that the profession has not built its hiring system around the knowledge it already has.
The scale of the study is itself difficult to dismiss. The researchers examined 8,539 candidates from more than 150 agencies and tracked 6,075 hired officers over a period of up to five years. Using Cox proportional hazard regression, they measured whether specific prehire misbehaviors predicted different forms of later police misconduct. The results were not minor. Hazard ratios ranged from 0.16 to 14.59, and 15 of the 19 unique prehire misbehavior indicators significantly related to later officer misconduct. The median hazard ratio was 1.19 and the mean was 1.45, which the authors describe as indicating that prehire signals are generally associated with a moderate increase in risk, while some indicators show much stronger predictive force than others. That alone undermines the prevailing hiring logic. A system that continues to lean on generalized judgment when item-level predictors of misconduct are available is not using the strongest evidence in the room.
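To see what hazard ratios of this magnitude imply in concrete terms, consider a back-of-envelope calculation under a constant-hazard assumption. The 2% annual baseline rate below is hypothetical and is not drawn from the article; only the hazard ratios 1.45 (the study's mean) and 14.59 (its maximum) come from the reported findings:

```python
import math

# Back-of-envelope reading of a Cox proportional-hazards ratio.
# Under the model, an indicator with hazard ratio HR multiplies the
# baseline hazard: h1(t) = HR * h0(t). With a constant baseline hazard h,
# the cumulative probability of misconduct by time t is 1 - exp(-h * t).
# The 2% annual baseline below is a hypothetical illustration, not a
# figure reported by the article.

BASELINE_ANNUAL_HAZARD = 0.02  # hypothetical baseline misconduct hazard
YEARS = 5.0                    # the study's follow-up window

def cumulative_risk(hazard_ratio: float) -> float:
    """Five-year cumulative misconduct probability for a given HR."""
    h = BASELINE_ANNUAL_HAZARD * hazard_ratio
    return 1.0 - math.exp(-h * YEARS)

# Baseline, the study's mean HR, and its maximum HR.
for hr in (1.0, 1.45, 14.59):
    print(f"HR {hr:5.2f}: {cumulative_risk(hr):.1%} five-year risk")
```

Under these assumed numbers, an indicator at the study's mean hazard ratio raises the five-year risk from roughly 10% to roughly 13%, while an indicator at the maximum ratio pushes it above 75%. The absolute figures depend entirely on the assumed baseline, but the ordering does not, and it is the ordering that a calibrated screening system exploits.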
The article’s heatmap and tiering system sharpen the point. Rather than present one undifferentiated claim that “past behavior matters,” the authors sort the indicators according to prevalence, statistical stability, and applied significance. Tier 1 indicators are suitable for population-level screening because they have adequate counts and stable relations across several outcomes. Tier 2 indicators are rare but severe and carry such strong liability implications that they should not be ignored despite the limits of sparse data. This structure directly undercuts the idea that all warning signs are too context-dependent or too nuanced for standardized use. The article’s answer is the opposite. Some warning signs are common, stable, and broadly predictive. Others are rarer, but serious enough that the institution should not treat them casually. That is not vague concern. That is applied screening guidance.
Consider what the article identifies as Tier 1. Prior written reprimands or suspensions in a law-enforcement job are described as essential for decision making and something that “must be used” in hiring decisions. Unfavorable termination conditions, bad credit, and employer warnings due to negligence are all identified as indicators that should be strongly considered. Other signals, such as moving violations, repeated recent job changes, and a history of temper problems, are treated as meaningful supplemental indicators. These are not speculative categories. The article presents them as the most statistically and practically useful signals emerging from the data. A profession that continues to subordinate those signals to unstructured intuition is not erring on the side of nuance. It is discounting evidence.
The Tier 2 findings are even more uncomfortable because they expose exactly the kind of warning signs departments often prefer to smooth over with “whole person” narratives. A domestic-violence citation, arrears on alimony or child support, prior unjustified use of force, prior demotions, and complaints regarding racially offensive behavior all appear in the article as high-risk indicators that should not be overlooked. Some of these predictors are rare, but the article treats them as operationally significant because of the severity of the outcomes they forecast. In other words, the current model is not undercut merely because some broad patterns exist. It is undercut because the article identifies several concrete signals that no responsible screening regime should treat as casual background facts. The existing system does so anyway.
The item-level findings make that institutional failure hard to defend. Prior employer warnings due to negligence were associated with heightened risk across interpersonal and performance-related domains, including sexual-harassment accusations, written reprimands, conduct mistakes, and citizen complaints for unprofessional conduct. Unfavorable terminations predicted increased risk of inappropriate weapon use, excessive-force complaints, misuse of official vehicles, sexual-harassment accusations, and off-duty misconduct. More than three jobs in two years predicted greater risk of use-of-force complaints, off-duty misconduct, and conduct mistakes. Prior demotions, though less common, were linked to much higher risk of lawsuits and citizen complaints involving force. These are not abstract correlations buried in a supplemental appendix. They are the article’s central practical findings. They say, in effect, that prior instability and prior occupational trouble often show up again in uniform.
The law-enforcement-specific indicators are more damaging still because they undercut one of the profession’s favorite assumptions: that prior badge experience is itself a sign of reliability. The article finds that prior reprimands or suspensions in a law-enforcement job strongly predict later lawsuits, conduct mistakes, property damage, misuse of official vehicles, citizen complaints, and written reprimands. Confirmed prior unjustified use of force is associated with extraordinarily large risk increases for lawsuits, inappropriate weapon use, and use-of-force complaints. Prior complaints regarding racially offensive behavior are associated with a dramatically elevated likelihood of later misconduct-related lawsuits. These findings are especially important because they show that the strongest predictors are not limited to youthful indiscretions or peripheral instability. Prior trouble inside policing itself can be one of the clearest warnings of future risk. A screening system that fails to operationalize that fact is not merely incomplete. It is conceptually backward.

The nonwork indicators tell the same story from another direction. A documented history of temper problems predicted increased risk of frequent conduct mistakes, undesirable off-duty conduct, and intentional property damage. Multiple physical altercations predicted a sharply increased likelihood of citizen complaints involving force. A domestic-violence citation was one of the strongest single predictors in the study, associated with greatly increased risk of criminal charges, sexual-harassment accusations, off-duty misconduct, and conduct mistakes. Bad credit predicted elevated risk across several domains, including reprimands, misuse of official vehicles, conduct mistakes, off-duty misconduct, sexual-harassment accusations, and later detention or criminal charges. Arrears on alimony or child support also forecast serious downstream misconduct. These findings matter because they show that the current model cannot hide behind the notion that off-duty conduct is too remote from police work to be useful. The article says otherwise. Some of those off-duty indicators are not merely relevant. They are among the more useful predictors in the entire screening framework.
The article’s treatment of prior law-enforcement and military experience further weakens the current model. More than half of the officers in the sample had prior law-enforcement experience. If the profession’s conventional assumption were sound, that experience should have reduced risk because those candidates had already been screened and socialized into the job. But the article found no such protective effect. For 11 of 15 misconduct indicators, prior law-enforcement experience did not reduce risk and instead slightly elevated it on average. The pattern was particularly notable for citizen complaints regarding unprofessional conduct, sexual harassment, excessive force, inappropriate weapon use, and misuse of official vehicles. Prior military experience similarly corresponded with elevated risk across all 15 outcomes, with stronger relations in some areas such as accusations of racism and lawsuits for sustained misconduct. These findings are not just interesting side notes. They strike at one of the hiring system’s most entrenched shortcuts: the belief that prior institutional affiliation is itself a form of reassurance. The article does not support that comfort.
Yet the most damaging finding may be the article’s analysis of how agencies actually use prehire information. After establishing that these indicators predict later misconduct, the authors examined whether they affected hiring decisions. The answer was effectively no. Although 15 of 19 indicators trended in the expected direction, the average impact on hiring probability was minimal, reducing hiring chances by only about 5% on average, and none of the individual confidence intervals excluded 1.00. That finding should change the way police misconduct is discussed. It means the institution often had access to predictive information but did not treat it as decisively meaningful at the point where prevention was still possible. The same agencies then became much more decisive after misconduct occurred, when termination risk rose dramatically across nearly every posthire misconduct indicator. That asymmetry is not simply ironic. It reveals the core dysfunction of the current model. Departments are willing to act strongly once damage has materialized, but not when the evidence supports front-end prevention.
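The statistical meaning of "none of the confidence intervals excluded 1.00" can be made concrete with a small sketch. The interval bounds used here are hypothetical illustrations; only the roughly 5% average reduction (a ratio near 0.95) reflects the article's reported finding:

```python
# Why "none of the confidence intervals excluded 1.00" matters.
# A ratio of exactly 1.00 means an indicator has no effect on the
# decision being studied. If the interval around an estimated ratio
# still contains 1.00, the data cannot distinguish the estimated
# effect from no effect at all. The interval bounds below are
# hypothetical; only the ~5% average reduction (ratio near 0.95)
# comes from the article.

def excludes_null(lower: float, upper: float, null: float = 1.0) -> bool:
    """True if the interval rules out the no-effect value."""
    return upper < null or lower > null

# A prehire indicator: point estimate near 0.95 (about a 5% reduction
# in hiring chances), but with a hypothetical interval spanning 1.00.
print(excludes_null(0.82, 1.10))   # -> False: cannot rule out "no effect"

# Contrast: a posthire misconduct indicator with a strong relation to
# termination (hypothetical interval sitting well above 1.00).
print(excludes_null(2.40, 6.80))   # -> True: clearly distinct from 1.00
```

The asymmetry the article documents is exactly this contrast: prehire signals produced intervals of the first kind, while posthire misconduct produced effects of the second kind.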
Taken together, the findings leave the current hiring regime with very little room to claim that it is already doing the work the public assumes it is doing. The article shows that predictive signals exist, that some are strong enough to justify routine use, that some are serious enough to function as red flags, that prior law-enforcement experience does not inherently mitigate risk, and that agencies nonetheless used these signals only weakly in actual hiring decisions. A system built this way cannot honestly present later misconduct as a wholly unforeseeable rupture. It has too much evidence for that defense. The more accurate description is that the profession continues to operate a screening process in which validated warning signs are gathered, underweighted, and then rediscovered later as scandal, complaint, discipline, lawsuit, or termination. That is the factual record the article supplies, and it is the record that undercuts the current model from within.
V. Why Subjective Clinical Opinion Is a Weak Substitute for Empirical Screening
The central weakness in current police hiring is not the mere presence of psychological evaluation. It is the profession’s longstanding tendency to confuse professionalized judgment with validated prediction. The article is careful in tone, but its structure and findings make that distinction unavoidable. It begins by noting that agencies and screening psychologists already ask intrusive and consequential questions about early life experiences, prior employment, prior law-enforcement service, and off-duty conduct such as driving history and domestic violence. But it immediately adds the profession’s real problem: the criterion-related validity of those practices is rarely examined, evidence-based policy guidance is lacking, and applicant information is too often processed without the kind of empirical architecture that would make the screen genuinely predictive. That is the difference between a process that feels serious and one that is actually designed to prevent foreseeable harm.
The article’s footnote on screening psychologists makes the problem even sharper. It states that screening psychologists typically rely on subject-matter-expert judgments and content- and construct-related validation, while professional standards in educational testing, industrial-organizational psychology, and evidence-based HR practice call for empirical demonstration of reliability, validity, and freedom from bias in employment-related assessments. That statement does not denounce psychologists. It does something more consequential. It places current law-enforcement screening inside a framework where expert opinion, standing alone, is not enough. Once a system is deciding who will receive state authority, carry weapons, use force, restrain civilians, generate prosecutions, and shape constitutional encounters, the question cannot simply be whether an evaluator reached a considered opinion. The question is whether the screening mechanism is tied to criteria that actually forecast later misconduct. The article says police hiring has not adequately met that standard.
That is why subjective clinical opinion is such a weak substitute for empirical screening. Clinical evaluation, by its nature, tends to aggregate impressions. It asks whether a candidate appears stable, truthful, controlled, mature, resilient, or otherwise suitable in a generalized sense. Even when careful, that mode of evaluation remains vulnerable to overgeneralization because it converts heterogeneous facts into a single professional impression. The article moves in the opposite direction. It rejects broad constructs and bundled indicators as insufficient for police screening because they do not identify which specific prehire behaviors carry which kinds of risk. It criticizes prior research that relied on composite indices like “criminal history” or “past employment problems,” and it explicitly says broad psychological constructs such as low self-control are of limited utility for implementing transparent prehire screens. Screening, the authors explain, requires concrete exclusion criteria tied to elevated risk. That point is fatal to systems that remain overly dependent on global suitability judgments.
The mismatch becomes even more obvious when the article explains what it actually measured. The study was not built around diffuse personality narratives. It was built around disaggregated behavior signals: prior occupational trouble, prior trouble in law-enforcement jobs, prior temper problems and violence, and prior irresponsible behavior. Each category was then linked to specific downstream outcomes such as sexual misconduct accusations, racially offensive conduct, inappropriate weapon use, excessive-force complaints, vehicle misuse, written reprimands, citizen complaints, lawsuits, and later criminal charges. That is what predictive work looks like in a high-stakes hiring context. It identifies concrete behaviors, measures their base rates, tests their relationship to subsequent misconduct, and then asks what hiring consequences should follow. A process that substitutes generalized clinical opinion for that evidence is not simply taking a different route to the same answer. It is answering a different question.
The article’s emphasis on criterion matching makes this especially clear. It explains that predictors should resemble the criterion as closely as possible and that prior law-enforcement misconduct may be especially informative because of commensurability: similar work demands, similar norms, similar opportunities, and similar pressures. That is why prior unjustified use of force, prior reprimands or suspensions in law-enforcement work, prior complaints regarding offensive conduct, and similar indicators matter so much. They are not just “negative information.” They are highly relevant behavioral samples from the very institutional context in which future misconduct may recur. A clinical evaluation that dilutes those signals into a general narrative about current presentation loses precisely the feature that makes them useful: their close behavioral match to the future risk being assessed.
The same problem appears with off-duty conduct. The article does not treat domestic violence, physical altercations, reckless driving, bad credit, or arrears on alimony or child support as morally embarrassing details to be weighed loosely against a candidate’s overall presentation. It treats them as potential indicators of future aggression, irresponsibility, instability, and counterproductive behavior. The point is not that every prior act determines fate. The point is that some prior acts carry demonstrated predictive value. A clinician may believe a candidate has matured, seems reflective, or presents well in interview. But the article’s evidence shows why that kind of impression cannot carry dispositive weight where validated red flags exist. It is one thing to consider context. It is another to let generalized present-moment judgment wash out documented behavior that empirically relates to later misconduct.
The article also exposes another defect of heavy reliance on subjective opinion: opacity. A behavior-specific, benchmarked screening model can be explained. It can say why a prior reprimand matters, why a domestic-violence citation matters, why bad credit predicts risk across multiple outcomes, why frequent job changes carry weight in combination with other indicators, and why certain signals rise to the level of red flags. A largely impression-driven system cannot do that with the same precision. It tends to speak in the language of overall suitability, concern, maturity, judgment, resilience, candor, or “fit.” Those may be familiar professional terms, but they are difficult to audit. They also make it easier for institutions to rationalize inconsistent outcomes, especially when the evaluator’s conclusion is not tightly constrained by validated decision rules. The article repeatedly pushes against that opacity by insisting on empirically grounded, behavior-specific benchmarks and by criticizing the qualitative evaluation of applicant information based on experience rather than evidence.
The weakness of subjective screening is further exposed by the article’s recommendations. The authors do not respond to their findings by calling for more searching interviews or greater deference to evaluator judgment. They call for mandatory use of high-prevalence indicators that reliably predict future misconduct, greater weight for severe red flags, earlier collection of legally assessable preoffer misbehavior indicators, and structured policies or algorithmic decision making to reduce discretionary latitude. They also note that certain specific misbehaviors—such as domestic violence citations, racially offensive conduct, or unjustified use of force—can be treated as prevalidated single-item behavior checks because they are rare, unambiguous, and strongly predictive. That recommendation is particularly important because it stands directly against the culture of overcontextualized discretion. The article is saying, in substance, that some signals are sufficiently meaningful that they should not be dissolved into broad “whole person” judgment.
It is also telling that the article recommends moving some screening functions earlier in the process. It states that psychological evaluations often occur too late and are the most expensive stage, while validated preoffer misbehavior indicators can be collected earlier at lower cost. Background investigators can then confirm disclosures rather than discovering red flags only after the process has moved into its more expensive and more impressionistic phases. That recommendation is not just about efficiency. It is about the proper allocation of epistemic authority. The article is implicitly saying that police hiring should begin with what can be known through validated behavior signals and structured verification, rather than treating subjective evaluation as the primary engine of predictive judgment. Clinical opinion may still have a role. But it is a bounded role. It is not an acceptable replacement for front-end, evidence-based screening.
What makes the present system especially difficult to defend is that it preserves the optics of rigor without embracing the discipline of prediction. A department can say it used background questionnaires, interviews, psychological testing, and a professional screening provider. That description sounds comprehensive. But the article’s findings show why comprehensiveness is not the same as effectiveness. If the process remains weakly tied to empirically demonstrated risk indicators, then much of that apparent rigor is procedural theater. It may generate a detailed file. It may create the appearance of caution. It may allow later claims that the candidate was carefully reviewed. But if it does not systematically translate known prehire signals into meaningful decision consequences, then it has not done the central work the public assumes it has done.
The broader implication is straightforward. In ordinary employment settings, mistakes in hiring can be costly. In policing, they can be catastrophic. A weak front-end decision process does not merely affect productivity or workplace morale. It can affect bodily integrity, public trust, municipal finances, constitutional enforcement, and the legitimacy of the state itself. That is why subjective clinical opinion is such an inadequate substitute for empirical screening in this domain. It is not inherently illegitimate. It is simply too blunt, too opaque, and too weakly criterion-linked to function as the main safeguard against predictable misconduct. The article’s contribution is to show that the profession no longer has the excuse of uncertainty. It knows more than it operationalizes. It has better signals than it uses. And until it stops confusing evaluator impression with validated prediction, it will continue to hire risk under the cover of professionalism and then rediscover that risk later as scandal.
VI. The Most Damaging Institutional Finding: Agencies Barely Use Predictive Prehire Information
The article’s most destabilizing finding is not merely that prehire misconduct predicts later police misconduct. A system can, at least in theory, discover predictive information and still use it responsibly. What makes this study institutionally damaging is the next question the authors asked: after learning that specific prehire signals forecast later misconduct, do agencies actually use that information in hiring decisions? Their answer was essentially no. The article states that while many negative prehire indicators pointed in the expected direction, the impact on hiring probability was minimal, reducing hiring chances by only about 5% on average, with none of the individual confidence intervals excluding 1.00. That single finding forces a reframing of police misconduct discourse. It means the problem is not primarily that departments lack warning signs. It is that they have warning signs and do not materially act on them.
That distinction matters because institutions often defend themselves through uncertainty. When an officer later becomes the subject of excessive-force complaints, sexual misconduct allegations, misconduct-related lawsuits, or even criminal charges, departments imply that the later conduct was difficult to foresee at the hiring stage. The article substantially weakens that defense. It examined a range of concrete signals—occupational trouble, prior law-enforcement discipline, temper problems, physical altercations, domestic-violence citations, financial irresponsibility, and more—and found that many of them significantly related to later misconduct. Then it found that agencies nonetheless did not use those signals meaningfully in deciding whom to hire. Once those two findings sit together, official surprise becomes much less credible. The institution can no longer plausibly say, “No one could have known.” The record now supports a more uncomfortable formulation: the profession often knew enough to act more carefully and chose not to build a hiring process around that knowledge.
The article itself adopts striking language on this point. It says the findings “indicate that hiring practices fail to use this information meaningfully.” It describes the gap as serious, and it reports that even candidates with severe prehire incidents such as unjustified use of force or domestic violence faced only marginally lower odds of being hired, and in some cases were slightly more likely to be hired. It is difficult to overstate what that means. The disconnect between evidence and action is not happening only at the margins of ambiguous cases. It extends to some of the very signals the article identifies as severe and highly consequential. An institution that can absorb those kinds of indicators without materially changing hiring outcomes is not merely undercalibrated. It is structurally permissive.
This is where the thought-piece must shift from methodology to institutional accountability. The problem is not simply that the hiring model is imperfect or scientifically incomplete. The more serious problem is that its incompleteness appears to operate in one direction. It does not systematically err on the side of caution. It errs on the side of admission. The article’s recommendations confirm that the authors see the issue in similar terms. They urge agencies to implement structured policies that prioritize systematic use of prehire misbehavior data, establish clear evidence-based standards, reduce the hiring chances of candidates with major red flags, and curtail the discretionary latitude of background investigators, chiefs, and hiring managers. They also recommend algorithmic decision making to prevent agencies from perpetuating cycles of misconduct within their ranks. That is not the language of a system suffering only from minor refinement issues. It is the language of a system whose decision makers have too much room to disregard predictive evidence.
The practical result of that latitude is a familiar institutional pattern: the department treats prehire misconduct signals as something to be discussed, but not necessarily something to be acted on. That is why the article’s tiering framework matters so much. It identifies some indicators as mandatory or strongly considered, and others as rare but significant red flags that should not be ignored. Yet the article’s hiring-outcomes finding suggests that departments often behave as though those distinctions carry little force. In effect, the institution gathers better information than it operationalizes. It produces a record of concern without a corresponding record of exclusion. That is not evidence-based screening. It is evidence collection without evidence discipline.
The asymmetry between prehire tolerance and posthire response makes the point even harsher. The article found that agencies were far more responsive to misconduct after hiring than during screening. Most types of posthire misconduct substantially increased the likelihood of involuntary separation, with an average relative risk of 6.23 across misconduct indicators. The strongest predictors of termination included sexual harassment, criminal detention or charges, undesirable off-duty conduct, and on-the-job conduct violations. That means departments are capable of decisive action when misconduct becomes institutional fact. They do not suffer from a general inability to recognize or condemn bad conduct. Their real failure is earlier: they do not treat predictive prehire conduct with comparable seriousness when prevention is still possible.
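To make the asymmetry concrete, consider a back-of-the-envelope comparison. Only the two effect sizes come from the article (a roughly 5% average reduction in hiring chances for applicants with negative prehire indicators, and an average relative risk of 6.23 for involuntary separation after posthire misconduct); the base rates below are invented purely for illustration.

```python
# Hypothetical illustration of the prehire/posthire asymmetry. Only the
# two effect sizes (the ~5% average hiring reduction and the 6.23 average
# relative risk of involuntary separation) come from the article; the
# base rates are assumptions chosen for illustration only.

base_hire_rate = 0.50          # assumed hiring probability, no flags
base_separation_rate = 0.02    # assumed separation rate, no misconduct

# Prehire: a documented misbehavior indicator barely moves the decision.
flagged_hire_rate = base_hire_rate * (1 - 0.05)

# Posthire: documented misconduct moves the decision dramatically.
misconduct_separation_rate = base_separation_rate * 6.23

print(f"Hire rate, no flag:          {base_hire_rate:.3f}")
print(f"Hire rate, flagged:          {flagged_hire_rate:.3f}")
print(f"Separation rate, clean:      {base_separation_rate:.3f}")
print(f"Separation rate, misconduct: {misconduct_separation_rate:.3f}")
```

Under these assumed base rates, a red flag before hiring shifts the outcome by a few percentage points, while the same kind of conduct after hiring multiplies the institutional response several times over. That is the shape of a damage-control model.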
This asymmetry carries enormous rhetorical and institutional significance. It shows that police departments often run a damage-control model rather than a prevention model. They act strongly once the liability has ripened into complaint, discipline, lawsuit, criminal charge, or termination event. But the article shows they are not comparably rigorous when the same kinds of risks are available as prehire indicators. A system designed this way guarantees that some preventable harm will be managed only after the institution has already distributed authority to a risky candidate. It is, in effect, a system more comfortable with remedial action than with exclusionary discipline at the gate. That preference may be bureaucratically convenient, especially in a profession that often struggles with recruitment and staffing. But it is difficult to defend as a public-safety model once the predictive evidence is known.
The article’s discussion of the “muni shuffle” and lateral hiring helps explain how that permissiveness sustains itself. Agencies eager to hire experienced and certified officers may avoid the time and expense of full investigation or may treat prior records as insufficiently useful. An officer facing misconduct trouble in one department can therefore move to another that either does not fully review the prior problem or chooses to discount it. The article frames this as part of a broader problem of inconsistent standards and local discretion. But in light of the hiring-outcomes data, the issue looks even more serious. It is not only that departments lack consistent national guidance. It is that many departments, even when presented with available risk information, appear willing to convert it into little or no meaningful hiring consequence. That permissiveness is what turns fragmented screening into a pipeline for recycled liability.
A likely institutional response is to say that context matters, people change, and hiring decisions cannot be reduced to checklists. Some of that is true. The article does not argue otherwise. But that response is weaker after this study than it was before. The authors are not demanding mindless mechanical exclusion. They are demanding that empirically grounded, behavior-specific signals be given appropriate weight. They recommend structured use of those signals precisely because too much discretion has left agencies underresponsive to meaningful red flags. Context may still matter in edge cases. What the article undercuts is the use of “context” as a broad permission slip to hire through patterns the evidence says are materially related to later misconduct.
This is the section where the thought-piece should state the problem in plain terms. Police departments often present later misconduct as a posthire management problem, but the article shows that much of the more damaging institutional failure lies at the hiring gate. Agencies are not simply surprised by bad actors. They are often underusing predictive evidence that should have made some hires less likely or impossible. Once that becomes the frame, later scandal looks different. The lawsuit, complaint, or criminal charge is no longer just an event of misconduct. It is also a retrospective audit of the hiring decision that let the risk into the institution in the first place.
The article therefore does more than add to the literature on police prediction. It creates a governance problem for departments themselves. If they continue to gather prehire misconduct information, understand that much of it predicts later misconduct, and still allow that information to have little effect on hiring outcomes, they are no longer merely operating an imperfect selection system. They are operating a permissive one. And once an institution is permissive in the face of predictive evidence, its later claims of surprise begin to sound less like candor and more like narrative management. That is the most damaging institutional finding in the article. The profession does not merely lack perfect foresight. It has declined to use the foresight it already possesses.
VII. The Lateral-Hiring Myth: Experience Is Not the Same as Safety
Few ideas in police hiring are more deeply embedded than the belief that prior law-enforcement experience is inherently reassuring. A candidate who has already worn the uniform, carried the badge, completed training, and worked within a police hierarchy is often assumed to be a safer bet than a true newcomer. That assumption has obvious bureaucratic appeal. It promises lower training costs, faster deployment, and the comforting idea that someone else has already done the hard screening work. The article directly challenges that comfort. Its findings on prior law-enforcement and military experience are among the most important in the study because they expose a durable institutional shortcut: the tendency to mistake prior affiliation for reduced risk. According to the article, that shortcut is not supported by the data.
The article examined whether prior law-enforcement employment or military service predicted lower levels of later misconduct. It found no such broad protective effect. More than half the officers in the sample had prior law-enforcement experience, yet for 11 of 15 misconduct indicators that experience did not reduce risk and instead slightly elevated it on average. The article specifically notes increased risk in areas such as citizen complaints regarding unprofessional conduct, sexual harassment, excessive force, inappropriate weapon use, and misuse of official vehicles. The authors state the point plainly: agencies should not rely on past law-enforcement experience as a proxy for suitability, and “relevant experience” should not replace thorough, empirically supported evaluation of each candidate’s misbehavior history and specific risk factors. That is a sharp rebuke to one of the profession’s most common habits.
The finding matters because lateral hiring has long been treated as a kind of institutional shorthand. In a resource-constrained environment, agencies are often tempted to hire officers who arrive already trained, already certified, and already acclimated to the profession’s operational demands. The article itself notes that agencies may be eager to hire experienced, trained, and certified officers because doing so avoids the time and expense of complete background investigations and full training programs. But it also identifies the danger of this shortcut through the “muni shuffle”: an officer facing misconduct trouble in one agency may resign, preserve certification, and seek employment elsewhere, where the prior misconduct is unknown, insufficiently reviewed, or operationally discounted. Once the article’s empirical findings are placed beside that practical reality, the problem comes into focus. Lateral experience is not simply a credential. It can be a vehicle for imported liability.
This is where the concept of commensurability becomes especially useful. The article explains that the predictive strength of past misconduct is likely to increase where prior and future roles share similar tasks, norms, pressures, and opportunities. That is one reason prior trouble in law-enforcement jobs matters so much. If an officer already attracted complaints for excessive force, racially offensive behavior, sexual harassment, or formal discipline inside a policing environment, that history is not some distant or irrelevant biography detail. It is behavior sampled from the same professional ecosystem into which the applicant seeks reentry. In fact, it may be more probative than many other categories of background information precisely because the role context is so closely matched. A laterally hired candidate is not merely experienced. He may also be experienced at reproducing the very conduct the next department hopes to avoid.
The article’s findings on prior law-enforcement misbehavior reinforce this point with unusual force. Prior written reprimands or suspensions in law-enforcement work were among the most significant Tier 1 indicators and were described as essential for decision making and something that must be used in hiring decisions. Prior unjustified use of force was treated as a rare but severe red flag. Complaints regarding racially offensive behavior also emerged as a rare but consequential indicator. These are not incidental markers. They are institution-specific signals with direct relevance to future abuse of authority, public complaints, lawsuits, and professional misconduct. A department that treats prior badge experience as reassuring while discounting these signals is not being practical. It is reversing the hierarchy of relevant information.
The article’s findings on military experience complicate the same myth from another angle. Military service is often culturally associated with discipline, respect for hierarchy, and operational seriousness. Yet the study found elevated risk across all 15 misconduct outcomes for those with prior military backgrounds, with particularly notable relations to accusations of racism and lawsuits for sustained misconduct. The article does not treat this as a blanket condemnation of military service. It instead warns against assuming that institutional experience elsewhere confers lower risk. That warning is analytically important. It means police hiring cannot safely rely on prestige by association, whether the association is prior policing or prior military service. The central question remains the same: what does the candidate’s specific misconduct history show, and how does it relate to later police risk? The article’s answer is that experience alone is not an acceptable substitute for that inquiry.
This matters not only for hiring mechanics but for institutional narrative. Police departments often speak as though experience is itself evidence of vetting. It is not. Experience merely proves prior entry into an institution. It does not prove the absence of misconduct risk, and the article shows it may coexist with increased liability. In bureaucratic terms, this means agencies cannot treat lateral hiring as a lower-scrutiny path. They must treat it, in some respects, as a higher-scrutiny one, especially where the candidate has prior law-enforcement records that are highly commensurate with the future job. That conclusion is not a policy preference imported from outside the study. It is the natural implication of the article’s own findings and recommendations.
The article’s discussion of local variation and absent national standards makes the risk even greater. Because states and agencies differ widely in how they regulate hiring, investigate misconduct, preserve records, and review lateral applicants, prior experience becomes easy to overvalue and difficult to discipline consistently. One agency may treat a prior resignation, unresolved complaint history, or disciplinary pattern as disqualifying; another may treat it as manageable or obscure. The article’s findings suggest this variability is not benign. It creates the conditions under which experienced candidates can be hired precisely because their experience is mistaken for reliability while the substance of their history is treated as secondary. The more decentralized and discretionary the system, the easier it becomes for “experienced” to function as a reputational shield around unresolved risk.
A sophisticated objection would be that prior experience may still be valuable even if it does not guarantee lower misconduct risk. That is true, but it does not rescue the myth the article undermines. The point is not that experienced candidates are always worse. The point is that they are not inherently safer, and agencies should stop treating them as though they are. Once that assumption falls away, the hiring logic must change. Departments can no longer rely on prior police or military experience as a background proxy for character, discipline, or suitability. They must evaluate the candidate’s specific prehire misbehavior history with at least the rigor they would apply to any other applicant. Indeed, because the prior behavior occurred in a highly relevant institutional setting, the scrutiny should often be more exacting, not less.
The deeper institutional problem is that lateral hiring often solves immediate staffing needs while multiplying long-term accountability costs. A department gets a candidate who can be deployed quickly and who appears professionally seasoned. But if the candidate also brings prior disciplinary baggage, unresolved force issues, complaint history, or other validated red flags, the short-term hiring convenience may become a long-term misconduct liability. The article’s findings on lawsuits, complaints, and use-of-force risks make that tradeoff impossible to ignore. What departments have often called efficient recruitment may, in practice, be a way of importing foreseeable risk under the banner of experience.
That is why the lateral-hiring myth deserves its own section in this thought-piece. It is not just another policy quirk. It is a core narrative error inside police personnel practice. The article shows that prior experience does not inherently reduce risk and may increase it in several significant categories. It also shows that prior law-enforcement misconduct is among the most relevant forms of predictive information because of commensurability. Once those points are accepted, the profession must abandon one of its most comfortable assumptions: that a candidate who has already worn a badge arrives with a presumption of safety. The evidence says otherwise. Experience may teach skills. It may also carry forward habits, liabilities, and patterns of misconduct that the next department will inherit unless it screens with far more discipline than police hiring has historically shown.
VIII. The “Muni Shuffle” and the Absence of National Standards
One of the article’s most important structural contributions is that it refuses to treat police hiring failure as merely a local problem of bad judgment by a few agencies. It places the issue inside a broader governance failure: the United States does not operate under a coherent national framework for police hiring standards, and that fragmentation creates the conditions in which known misconduct risk can move laterally, be discounted administratively, or disappear into inconsistent screening practices. The article is explicit on this point. It says that although every state has a standards body governing law enforcement, the authority of those bodies varies immensely. Some impose substantial regulation, investigate misconduct, and set agency-level expectations. Others do far less and function mainly as advisory or recommendatory entities. The consequence is not simply diversity in administrative style. The consequence is a system in which the meaning and use of prehire misconduct information vary across jurisdictions, agencies, and investigators in ways that no serious preventive model should tolerate.
That fragmentation matters because the article’s findings depend on consistency to have real force. If prior law-enforcement discipline, prior unjustified use of force, domestic-violence citations, unfavorable termination conditions, bad credit, negligence warnings, and other misconduct signals have measurable predictive value, then those signals should not be left to radically uneven local treatment. A candidate with a serious disciplinary history should not become acceptable merely because he crosses a county line, applies to a smaller department, or reaches an agency more eager to fill vacancies than to absorb the implications of his record. Yet the article shows that this is effectively what the current decentralized system allows. Without national standards or at least consistent minimum criteria, predictive evidence becomes administratively negotiable. The same fact pattern can be treated as disqualifying in one place, manageable in another, and practically invisible in a third. That is not a screening system. It is a patchwork of institutional appetite.
The article captures this dysfunction in one of its most vivid examples: the “muni shuffle.” It describes the scenario in which an officer facing misconduct trouble in one agency is permitted to resign, the investigation is closed, certification is preserved, and the officer then seeks employment elsewhere. The hiring agency, often attracted by the prospect of obtaining a trained, experienced, and already-certified candidate, may either fail to uncover the prior issues, fail to understand them in a useful way, or decide not to give them meaningful weight. The article presents this not as an anecdotal oddity, but as a known hiring vulnerability in a decentralized law-enforcement environment. The phrase itself matters because it captures the routine nature of the problem. This is not simply misconduct slipping through a rare administrative crack. It is a recurring institutional pattern in which local fragmentation allows prior risk to migrate rather than be resolved.
The significance of that pattern becomes much greater once the article’s predictive findings are kept in view. If prior law-enforcement trouble is among the most probative categories of prehire information because of commensurability—same profession, similar authority structure, similar opportunities for abuse—then a system that allows those records to dissipate or be discounted through lateral movement is not merely disorganized. It is actively undermining one of the strongest forms of predictive evidence available. The article identifies prior written reprimands or suspensions in law-enforcement work as a major Tier 1 indicator and prior unjustified use of force as a rare but severe red flag. That means the very kinds of information most likely to matter can also be the kinds most vulnerable to inconsistent treatment in a fragmented hiring regime. The “muni shuffle” is therefore not just a bureaucratic embarrassment. It is a mechanism by which known predictors of future misconduct are stripped of their preventive force.
The lack of national standards compounds the problem in other ways as well. The article notes substantial variance in how agencies evaluate demographics, education, criminal background, prior drug use, and especially records relating to lateral applicants with prior law-enforcement experience. This means that even where information is technically available, its practical meaning is unstable. One investigator may see a prior complaint pattern as disqualifying. Another may classify it as explainable. One agency may demand corroboration before discounting a red flag. Another may rely on informal assurances or the candidate’s narrative. The article criticizes this kind of qualitative evaluation and stresses that applicant information is too often assessed based on agency and investigator experience rather than empirical benchmarks. Once that occurs inside a decentralized system, inconsistency becomes inevitable. Not only do departments differ in what they know; they differ in what they think known facts should mean.
That institutional variability is especially dangerous in policing because the profession is not dealing with ordinary workplace risk. The article repeatedly emphasizes the breadth and impact of police misconduct: violence, excessive force, sexual misconduct, false arrests, misuse of vehicles, property damage, lawsuits, criminal charges, and profound erosion of public trust. In a sector where the state delegates force, credibility, and discretionary power, fragmented hiring standards do not merely create uneven HR outcomes. They create uneven public exposure to foreseeable abuse. A candidate who might be screened out in one jurisdiction can still obtain armed public authority in another. That means the public consequences of fragmented screening are not geographically contained. They are redistributed according to where the most permissive institutional gate happens to be.
The article also suggests why departments tolerate this fragmented arrangement. Agencies eager to hire experienced personnel can save time and money by avoiding a full background investigation or by relying on prior certification as a shorthand for suitability. This is a familiar bureaucratic temptation. A department under staffing pressure can tell itself that lateral candidates are efficient, already socialized into the work, and less resource-intensive than new recruits. But the article’s empirical findings remove much of the legitimacy from that convenience argument. Prior law-enforcement experience is not shown to be broadly protective. In some categories it corresponds with increased later misconduct risk. That means the practical pressure to hire laterals quickly is operating against the very evidence the article says should matter. When combined with weak national standards, that pressure turns fragmented hiring into a distribution channel for recycled risk.
There is also a deeper jurisprudential and governance problem beneath the article’s institutional account. A decentralized system with no strong national baseline makes it easier for agencies to preserve the rhetoric of professionalism while avoiding the discipline of uniform preventive rules. Every department can claim it screens. Every jurisdiction can point to its own forms, interviews, and evaluators. But absent common standards, those practices are difficult to compare, difficult to audit, and easy to rationalize. The article’s repeated call for behavior-specific, empirically grounded benchmarks should therefore be understood not only as a technical recommendation but as a standardization demand. Prediction loses much of its social value if each local entity can decide, through discretion alone, whether validated red flags will actually matter.
The article’s own recommendations point toward a more disciplined framework. It urges systematic use of prehire misconduct data, clear evidence-based standards, reduction of discretionary latitude among investigators and hiring managers, and more consistent deployment of validated indicators. It even suggests algorithmic decision making and structured policies as ways to sequence and weight predictive information more effectively. Those recommendations should be read as a direct answer to the fragmented environment the article describes. The authors are, in substance, arguing that a profession cannot continue to leave critical screening judgments to highly variable local custom if it expects to reduce predictable misconduct. In other words, the problem is not simply that some agencies do a poor job. The problem is that the architecture permits poor job performance to remain professionally acceptable because there is no sufficiently binding predictive baseline.
This is why the “muni shuffle” belongs at the center of any serious critique of police hiring. It dramatizes the gap between information and accountability. An officer can carry prior misconduct risk from one institution to another because the profession has not imposed a sufficiently disciplined, standardized, and evidence-based screening regime on itself. The article’s findings make that mobility harder to defend than ever. Once we know that certain prehire indicators forecast later lawsuits, force complaints, sexual misconduct accusations, reprimands, and criminal charges, the lateral migration of officers with those histories stops looking like mere administrative oversight. It begins to look like a governance failure built into the design of the system.
The section’s ultimate point is therefore broader than lateral hiring alone. The absence of national standards and the presence of the “muni shuffle” reveal that police misconduct cannot be understood only as the failure of individual officers or even individual departments. It is also the failure of a decentralized profession that continues to treat validated warning signs as locally negotiable. A preventive system worthy of the name would not allow one agency’s unresolved disciplinary history to become another agency’s staffing solution. Yet that is precisely the vulnerability the article identifies. So long as the profession preserves fragmented standards, tolerates inconsistent interpretation of predictive evidence, and allows lateral movement to outrun meaningful screening consequences, it will continue to reproduce the conditions in which foreseeable misconduct is hired, re-hired, and then publicly rediscovered as if no one could have seen it coming.
IX. What a More Objective Model Would Look Like
The article does not merely diagnose the weakness of current screening practices. It points, sometimes explicitly and sometimes by strong implication, toward a different architecture altogether. That architecture is not built around more rhetoric about professionalism, more generalized concern, or more discretionary review. It is built around earlier collection of validated information, structured decision rules, differentiated treatment of signals based on empirical force, verification through background investigation, and a sharper division between behavior-based risk screening and later-stage evaluative functions. In short, the article points toward a more objective model—not because it imagines human judgment can be abolished, but because it insists that judgment be constrained by predictive evidence rather than used as a substitute for it.
The first feature of that model is temporal. The article makes clear that current psychological evaluations often happen too late and tend to be the most expensive stage in the process. By contrast, many of the prehire misconduct indicators with predictive value can be collected earlier, before a conditional offer, and at lower cost. That sequencing matters. A system that waits until the late clinical or suitability phase to begin taking candidate risk seriously has already made two mistakes: it has spent more resources than necessary on candidates who could have been screened out earlier, and it has allowed the later evaluative stage to bear too much of the predictive burden. The article’s solution is to move validated prehire screening upstream. The objective model therefore begins before the prestige layer of the process. It starts with concrete behavior signals, not impressionistic end-stage review.
The second feature is specificity. The article repeatedly rejects broad constructs and bundled categories as insufficient for hiring decisions. It wants item-level guidance. That means the objective model is not built around generic notions of “good character,” “maturity,” “judgment,” or even “low self-control.” It is built around concrete indicators with measurable predictive value: prior reprimands or suspensions, unfavorable termination conditions, bad credit, negligence warnings, domestic-violence citations, prior unjustified use of force, and other specific behaviors that the article shows relate to later forms of police misconduct. The benefit of this approach is not merely statistical. It is institutional. Once the decision-maker is forced to process specific indicators through known risk categories, there is less room to dissolve meaningful warning signs into a generalized narrative of suitability.
The third feature is tiering and calibration. The article’s distinction between Tier 1 and Tier 2 indicators is one of its most practically useful contributions. Some signals are common and stable enough to support population-level screening. Others are rarer but severe enough to justify red-flag treatment despite wider confidence intervals. That is what objectivity looks like in a real hiring environment. It is not a refusal to make distinctions. It is a refusal to make them arbitrarily. A calibrated system does not treat all adverse history the same. It differentiates based on prevalence, severity, and predictive relationship to later misconduct. This is far more defensible than a regime in which every adverse fact becomes merely “something to discuss” in a subjective holistic review. The article’s framework gives agencies a disciplined method of saying which indicators must matter, which strongly should matter, and which should be considered in combination with others.
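The tiering logic described above can be made concrete in a short sketch. The indicator names, tier assignments, and thresholds below are hypothetical placeholders for illustration only; they are not the article's actual item list or validated cut points.

```python
# Illustrative sketch of tiered indicator handling in a screening system.
# Indicator names, tiers, and thresholds are hypothetical examples only.

TIER1 = {"prior_reprimand_or_suspension", "bad_credit", "negligence_warning"}
TIER2 = {"domestic_violence_citation", "prior_unjustified_force"}

def classify_candidate(indicators: set) -> str:
    """Return a screening disposition based on which tiers are triggered."""
    if indicators & TIER2:
        return "red_flag_review"       # rare but severe: mandatory escalation
    hits = len(indicators & TIER1)
    if hits >= 2:
        return "screen_out"            # multiple population-level signals
    if hits == 1:
        return "structured_follow_up"  # single signal: verify and document
    return "advance"

print(classify_candidate({"bad_credit"}))               # structured_follow_up
print(classify_candidate({"prior_unjustified_force"}))  # red_flag_review
```

The design point is the calibration itself: a Tier 2 signal escalates regardless of anything else, while Tier 1 signals accumulate, which is exactly the refusal to treat all adverse history the same.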
The fourth feature is structured verification. The article notes that prehire misconduct data were collected through a standardized background questionnaire and then confirmed through follow-up interviews. It also recommends that background investigators be used to validate disclosures and gather external information on relevant indicators. This is important because objectivity is not achieved merely by writing better standards. It also requires better confirmation. A department cannot responsibly rely only on self-report, nor can it leave the significance of a reported signal to unchecked evaluator preference. Verification helps transform the process from subjective narrative management into documented factual assessment. It also narrows the room for agencies to later claim that they did not know enough to act. In a more objective model, predictive indicators are not simply asked about. They are pursued, documented, and operationalized.
The fifth feature is reduced discretionary latitude. This is one of the article’s most important and underappreciated recommendations. The authors explicitly call for structured policies and algorithmic decision making to guide the use of prehire misconduct information. They do so because, as the study shows, agencies currently leave too much room for background investigators, chiefs, and hiring managers to discount warning signs that the data say are meaningful. Algorithmic or actuarial support is important here not because human decision-makers are irrelevant, but because the institution has already shown that unguided discretion tends to underuse predictive evidence. A more objective model therefore uses decision rules, cutoffs, sequencing logic, or other structured mechanisms to ensure that validated indicators have actual hiring consequences. The point is not to eliminate judgment. The point is to stop letting judgment erase risk information the profession already knows how to interpret.
The sixth feature is a reallocation of professional roles. The article is especially clear that industrial-organizational psychologists are better positioned than clinical evaluators to design and validate selection systems, structure multiple-hurdle processes, set cut scores, and evaluate predictive outcomes. That is a major point because it means a more objective model is not simply a better version of the existing clinical system. It is a different design philosophy. In that philosophy, the central predictive work is done by validated personnel-selection architecture, not by clinician-mediated opinion. Clinical professionals may still have a role in assessing psychopathology or other functional concerns in a bounded post-offer setting. But they are no longer expected to serve as the principal engine of misconduct prediction. That work belongs to the evidence-based front end.
The seventh feature is transparency in reasoning. An objective model should be able to explain why a candidate was screened out or advanced. The article's framework allows that. It can say, for example, that a candidate had a Tier 1 population-screening indicator with broad predictive force, or a Tier 2 red flag with especially severe downstream implications, or a cluster of supplemental indicators that collectively elevated risk. That kind of explanation is not merely useful for internal governance. It is essential for legitimacy. A profession that exercises coercive public authority should not depend on screening logic that cannot be articulated except as “the evaluator had concerns” or “the candidate was deemed suitable overall.” The article’s item-level and tiered methodology gives agencies a way to reason publicly and consistently about applicant risk.
There is, of course, a caution that must be observed. The article supports a more objective, standardized, actuarial, and evidence-based model. It does not, on the provided source alone, establish a fully separate technical or legal regime labeled “forensic testing” as the operative replacement for current screening. That distinction matters for precision. The article’s own language supports validated prehire indicators, structured interviews, multiple-hurdle systems, algorithmic or actuarial sequencing, and greater I-O psychology involvement. Those are the components that can be responsibly defended from this source. The more objective model should therefore be described in those terms.
In practical terms, then, a more objective police hiring system would look something like this: gather validated prehire misconduct indicators early; verify them through standardized investigation; classify them according to empirically demonstrated predictive force; apply structured decision rules that meaningfully reduce discretion; reserve late-stage evaluative review for bounded functions rather than using it as the main predictive mechanism; and document the logic of each hiring outcome in ways that can be audited and defended. That architecture would not make hiring infallible. The article does not promise that. But it would make the process more honest. It would stop pretending that generalized professional review is enough when the evidence already points toward more disciplined methods. And it would move police hiring closer to what the article insists it should be: a preventive system that acts on predictive evidence before predictable misconduct becomes public harm.
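The pipeline just described can be sketched end to end: collect indicators early, verify them through investigation, apply structured rules rather than open discretion, and document the reasoning so each outcome can be audited. Field names, rules, and thresholds below are hypothetical placeholders, not validated criteria.

```python
# Minimal sketch of the screening pipeline described above. All indicator
# names and decision rules are hypothetical illustrations, not real criteria.

from dataclasses import dataclass, field

@dataclass
class ScreeningRecord:
    candidate_id: str
    reported: set                                  # disclosed on questionnaire
    verified: set = field(default_factory=set)     # confirmed by investigators
    audit_log: list = field(default_factory=list)  # documented reasoning

def verify(record, investigator_findings):
    """Merge self-report with independently gathered background findings."""
    record.verified = record.reported | investigator_findings
    record.audit_log.append(f"verified: {sorted(record.verified)}")
    return record

def decide(record, red_flags, screen_out_threshold=2):
    """Apply structured, auditable decision rules instead of open discretion."""
    if record.verified & red_flags:
        record.audit_log.append("disposition: red-flag escalation")
        return "escalate"
    if len(record.verified) >= screen_out_threshold:
        record.audit_log.append("disposition: screened out")
        return "screen_out"
    record.audit_log.append("disposition: advance to bounded clinical review")
    return "advance"

rec = ScreeningRecord("A-101", reported={"bad_credit"})
verify(rec, investigator_findings={"prior_reprimand"})
print(decide(rec, red_flags={"prior_unjustified_force"}))  # screen_out
print(rec.audit_log)
```

Note that the audit log is built into the data structure itself: every disposition leaves a documented trail, which is what allows the outcome to be audited and defended rather than reconstructed after the fact.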
X. Anticipating the Objections
A serious thought-piece cannot simply press the article’s critique without confronting the most likely objections. The strength of the article is that it allows those objections to be met on disciplined terms rather than rhetorical instinct. The predictable responses will sound familiar: people can change; not every bad fact is disqualifying; clinical professionals are trained experts; rigid rules can create unfairness; and hiring cannot be reduced to algorithms or checklists. None of those objections is frivolous. The question is whether they rescue the current screening model once the article’s findings are fully absorbed. They do not. At most, they justify caution in implementation. They do not justify continued dependence on a subjective, weakly standardized, and undervalidated front-end system.
The first objection is human change. People mature. Youthful misconduct may not define the future. An applicant with a troubled history may improve, seek stability, or no longer resemble the person reflected in older records. The article does not deny any of this. It does not claim that misconduct predictors are destiny or that every prior act mandates exclusion. What it says is different and more modest: some prior behaviors significantly elevate later risk, and the profession needs empirically grounded, behavior-specific benchmarks to distinguish which ones matter most. In other words, the article is not advocating fatalism. It is advocating probabilities. The existence of possible human change does not erase the institutional duty to use the best available evidence when delegating police power. A screening system can recognize that people change while still refusing to treat known high-risk signals as administratively trivial.
The second objection is that all hiring remains contextual and cannot be reduced to mechanical rule. That is true in a limited sense, but the article substantially narrows the force of the point. The study’s entire contribution is to move beyond vague context by identifying which indicators carry broad, stable predictive force and which operate as rarer red flags. It therefore does not propose a context-free world. It proposes one in which context is bounded by evidence rather than allowed to overwhelm it. A domestic-violence citation, prior unjustified use of force, or repeated law-enforcement reprimands cannot reasonably be treated as just one more subjective discussion point if the evidence shows those signals relate strongly to later misconduct. Context may matter in understanding the record, but it is no longer credible as a general license to ignore predictive significance.
The third objection is professional authority: clinical evaluators are trained experts and should be trusted to integrate complex information. The article does not dispute the existence of expertise. It disputes the current allocation of tasks. It specifically notes that screening psychologists commonly rely on subject-matter-expert judgment and that current guidelines emphasize consistent and defensible evaluations while overlooking behavioral risk assessment and empirically supported prehire indicators. It then points toward I-O psychologists as the professionals better positioned to design and validate selection systems, structure multiple-hurdle processes, and evaluate outcomes such as job performance and misconduct. The answer to the expertise objection, therefore, is not that expertise is irrelevant. It is that different kinds of expertise belong to different parts of the process. Clinical expertise does not erase the need for actuarial and criterion-linked front-end screening.
The fourth objection is fairness and adverse impact. A department may argue that structured use of prehire misconduct indicators risks excluding candidates too aggressively or producing inequitable outcomes. The article itself recognizes that employment-related assessments must satisfy standards of reliability, validity, and freedom from bias, and it situates its recommendations within professional and legal frameworks that require defensible assessment design. That matters because it shows the article is not advocating indiscriminate exclusion. It is arguing for evidence-based exclusion criteria rather than ad hoc or purely intuitive ones. Fairness is not served by a system that uses predictive evidence weakly and inconsistently. In many respects, a more standardized model is fairer because it makes reasoning visible, comparable, and capable of review. What creates hidden inequity is not disciplined use of validated indicators. It is broad discretion that cannot explain itself in evidence-based terms.
The fifth objection is that algorithms or structured decision rules are cold, rigid, or politically unattractive in a profession already criticized for dehumanizing judgment. The article’s answer is pragmatic. It recommends structured policies and even algorithmic decision making not because human values should be removed from hiring, but because the current discretionary system has shown itself too willing to underuse meaningful warning signs. This is a familiar tension in institutional design. Where discretion repeatedly dilutes important evidence, structure becomes a safeguard rather than a threat. The article does not imagine that all decisions should be made by machine. It imagines that predictive information should not be left wholly at the mercy of variable local judgment. A department can preserve case-specific review while still using structured rules to ensure that validated indicators are not casually dismissed.
A sixth objection is more political than technical: police agencies need personnel, and overly stringent screening could make staffing shortages worse. The article does not directly frame the problem in recruitment terms, but its findings implicitly answer the concern. A staffing model that fills vacancies by lowering the practical significance of predictive red flags is not a neutral administrative tradeoff. It is a transfer of risk from the hiring authority to the public. The short-term convenience of hiring faster, cheaper, or more flexibly must be weighed against the long-term cost of force complaints, sexual misconduct allegations, lawsuits, reprimands, terminations, criminal charges, and broader legitimacy erosion. The article’s data show that some of these risks are forecastable. Once that is established, staffing urgency becomes a weaker justification for hiring through evidence the profession knows is meaningful.
There is also an objection from institutional culture. Agencies may insist that no prehire system can fully anticipate the effects of organizational norms, supervision failures, or departmental climate on future officer behavior. That is true. The article does not claim to solve every posthire variable. It expressly situates prehire screening alongside other reform efforts and does not deny the continuing importance of accountability and culture. But that response cannot rescue the current model either. The existence of posthire influences does not diminish the value of reducing baseline misconduct risk at entry. A department may still need better supervision, training, and discipline. That is not a reason to preserve a weak hiring architecture. It is a reason to stop pretending that downstream reform can substitute for front-end prevention.
The more candid objection—though rarely stated so plainly—is that departments often want discretion because discretion permits them to hire the candidates they prefer despite adverse facts. The article’s findings on actual hiring outcomes strongly suggest this dynamic, even if the authors phrase it more gently. If negative prehire indicators reduce hiring chances by only about 5% on average despite demonstrated predictive value, then discretion is not merely preserving nuance. It is protecting institutional freedom to hire through warning signs. That is why the article recommends reducing discretionary latitude. It recognizes that the issue is not simply lack of information. It is the institutional choice to keep that information from becoming binding.
The answer to all of these objections is therefore the same in principle. The article supports a system that remains cautious, reviewable, and legally defensible, but also more structured, more behavior-specific, more transparent, and more willing to let evidence constrain hiring. None of the common objections justifies continued dependence on a model that mistakes impression for prediction and underuses warning signs the profession has already shown it can identify. Departments may still insist on discretion, context, expertise, and flexibility. But after this article, those concepts cannot honestly be treated as substitutes for predictive discipline. At most they are supplemental considerations inside a process that must finally be rebuilt around what the evidence says.
XI. Conclusion: Stop Hiring by Impression and Start Screening by Evidence
By the time police misconduct becomes visible to the public, the institution has usually already changed the subject. The conversation turns to supervision, training, discipline, body cameras, command accountability, decertification, public trust, or litigation exposure. All of those issues matter. The article does not deny that they matter. But it forces a more basic and more uncomfortable recognition: much of the misconduct that later produces scandal, complaint, reprimand, lawsuit, criminal charge, or termination does not begin at the point of public collapse. It begins at the point of entry, when agencies are asked to decide which applicants should be trusted with the authority, discretion, and force-bearing power of the state. The study’s contribution is not simply that some warning signs exist. It is that those warning signs can be measured, sorted, and linked to later misconduct, and that agencies have nonetheless failed to use them meaningfully in hiring decisions.
That is why the current model is no longer defensible on its own terms. For years, police departments have been able to preserve the appearance of rigorous screening because the process sounds substantial. There are questionnaires, background checks, interviews, psychological evaluations, suitability reviews, and often a language of professionalism surrounding the entire exercise. But the article exposes the difference between procedural complexity and predictive competence. A department may ask many questions and still fail to build its decisions around the answers that matter most. It may create an impressive file and still leave too much room for qualitative drift, local custom, bureaucratic convenience, or discretionary minimization. It may describe its process as careful while relying too heavily on generalized judgment and too little on behavior-specific, empirically grounded risk indicators. The article does not allow those contradictions to remain hidden. It shows that the profession often possesses better predictive information than it operationalizes.
That gap between what is known and what is done is the central institutional indictment. The article found that fifteen of nineteen prehire indicators significantly related to later police misconduct, with hazard ratios ranging as high as 14.59. It identified signals that should be used systematically, others that should be strongly considered, and still others that should function as rare but serious red flags. Yet when the authors examined actual hiring outcomes, they found that these prehire signals had only minimal practical effect on who got hired. This means the profession cannot honestly present later misconduct as merely the product of unforeseeable personal decline, internal culture after appointment, or bad luck in candidate selection. The evidence supports a more damaging conclusion: agencies have often chosen to preserve flexibility at the hiring gate rather than discipline themselves to act on predictive information that could have reduced future harm.
The article also strips away one of the profession’s most persistent myths: that prior law-enforcement experience is inherently reassuring. The study found that prior law-enforcement service did not broadly reduce later misconduct risk and in several categories slightly increased it. That finding matters because it disrupts the bureaucratic logic that has long governed lateral hiring. Departments have often treated prior badge experience as a form of pre-vetting, a sign that someone else has already done the hard work of screening and socialization. The article says otherwise. Prior police experience can correlate with, rather than protect against, future police misconduct, which means it can also be a highly probative warning sign when the applicant brings prior disciplinary history, force issues, or other forms of institution-specific trouble. A profession that still equates experience with safety after these findings is not acting on evidence. It is clinging to convenience.
That same dynamic appears in the article’s critique of fragmented standards and the “muni shuffle.” A profession that lacks coherent national hiring discipline allows predictive red flags to become geographically negotiable. An applicant who should have faced serious exclusion pressure in one department can reappear elsewhere as a manageable risk, an opaque record, or simply a staffing solution. The article shows why this fragmentation is so damaging. It is not merely that agencies differ in culture or paperwork. It is that validated indicators of later misconduct can lose practical force the moment they are filtered through inconsistent local standards, variable investigator judgment, and a decentralized profession still too comfortable with qualitative review. In that setting, hiring becomes less a matter of evidence than of institutional appetite. Some departments will absorb more risk because they are willing to call it nuance. Others will do so because they are under pressure to fill vacancies. The public, meanwhile, inherits the consequences either way.
This is why the article’s deeper lesson is not just methodological. It is constitutional and civic in its implications. Police hiring is not ordinary employment selection. The state is not choosing who will answer phones, stock shelves, or manage a generic workplace. It is choosing who will stop, search, arrest, restrain, intimidate, disbelieve, use force, testify, draft reports, and shape the practical meaning of rights on the street. In that setting, preventable hiring error carries a public cost that extends far beyond internal HR embarrassment. It can produce bodily harm, civil-rights violations, criminal exposure, taxpayer-funded settlements, community distrust, and cumulative damage to the legitimacy of public institutions. The article therefore changes the frame. A weak hiring model is not simply a professional deficiency. It is a public-risk design failure.
The corrective path the article points toward is neither mystical nor unattainable. It is structured, empirical, and administratively legible. Agencies should collect validated prehire indicators earlier, verify them systematically, classify them according to demonstrated predictive significance, reduce discretionary latitude, and stop expecting late-stage professional impression to do the work that front-end evidence-based screening should already have done. The article supports greater use of structured policies, clearer standards, and even algorithmic or actuarial decision support precisely because the current discretionary system has shown itself too willing to underuse warning signs that matter. That does not mean judgment disappears. It means judgment is finally put in its proper place: constrained by evidence rather than allowed to erase it.
None of this requires claiming that every candidate with adverse history is beyond redemption. The article does not support such absolutism, and neither should any serious reform proposal. People can change. Context matters. Institutions still need reviewable and fair procedures. But those truths do not rescue the current model. The problem is not that police hiring lacks compassion or nuance. The problem is that it has too often used the language of nuance to avoid the discipline of prediction. Once evidence shows that certain signals are meaningfully related to later misconduct, “context” cannot remain a blank check for institutional indulgence. At some point a profession must decide whether its front-end process is intended to prevent harm or merely to document concern before proceeding anyway. The article suggests that, too often, the system has chosen the latter.
The practical and moral consequences of that choice are profound. When an officer later becomes the subject of a force complaint, a sexual-misconduct allegation, a misconduct lawsuit, a criminal arrest, or a termination proceeding, the public usually sees only the final event. What the article reveals is the possibility of an earlier failure—one that occurred before the first tour, before the first radio run, before the first stop, and before the first civilian encounter. The later harm may also be a delayed expression of a front-end decision that ignored, softened, or failed to operationalize known warning signs. That is why the article should alter not only how departments hire, but how the public talks about police misconduct. The scandal is not always just what the officer did. Sometimes it is also what the institution already knew, or should have treated as known, before it handed that officer the badge.
So the final point is direct. Police departments should stop screening by impression and start screening by evidence. They should stop mistaking procedural elaboration for predictive rigor. They should stop treating prior law-enforcement experience as a reassuring shorthand when the candidate’s actual history is more probative than the institutional label attached to it. They should stop relying on fragmented local custom to determine whether validated red flags will matter. And they should stop describing later misconduct as a wholly posthire management problem when the evidence shows that meaningful parts of the risk were visible at the hiring gate. The article does not promise a world without misconduct. But it does remove an excuse. The profession can no longer plausibly claim that it has no better way to screen, no clearer signals to use, or no empirical basis for acting more carefully. The warning signs are there. The failure is that police hiring has too often preferred discretion over discipline, impression over prediction, and damage control over prevention.

About the Author
Eric Sanders is the owner and president of The Sanders Firm, P.C., a New York-based law firm concentrating on civil rights and high-stakes litigation. A retired NYPD officer, Eric brings a unique, “inside-the-gate” perspective to the intersection of law enforcement and constitutional accountability.
Over a career spanning more than twenty years, he has counseled thousands of clients in complex matters involving police use of force, sexual harassment, and systemic discrimination. Eric graduated with high honors from Adelphi University before earning his Juris Doctor from St. John’s University School of Law. He is licensed to practice in New York State and the Federal Courts for the Eastern, Northern, and Southern Districts of New York.
A recipient of the NAACP—New York Branch Dr. Benjamin L. Hooks “Keeper of the Flame” Award and the St. John’s University School of Law BLSA Alumni Service Award, Eric is recognized as a leading voice in the fight for evidence-based policing and fiscal accountability in public institutions.
Deep-Dive Audio Supplement
A deep-dive audio supplement titled “Predicting Police Misconduct Before the Badge” would serve as a strategic summary of the empirical shift required in modern law-enforcement hiring, communicating to professional and legal audiences the transition from traditional, impression-based clinical screening to an actuarial, evidence-based framework.
The content typically focuses on three key pillars of institutional reform:
The Shift to Actuarial Prediction: Moving away from qualitative “clinical impressions” and toward a noncompensatory tiered framework. This involves identifying specific, validated prehire indicators—such as prior occupational instability, behavioral red flags, and documented patterns of irresponsibility—that measurably forecast later misconduct.
The Fiscal and Fiduciary Mandate: Highlighting the immense institutional and public costs of failed screening architectures. The audio would summarize the “Utility Analysis” of rigorous screening, illustrating how a disciplined “front-door” policy can lead to significant municipal savings by reducing the probability of misconduct settlements and civil rights litigation.
Institutional Accountability: Rebranding prehire screening from a matter of “bureaucratic convenience” to a non-negotiable fiduciary duty. The supplement emphasizes that when predictive instruments exist, failing to operationalize them constitutes institutional negligence, and that true reform begins by acting on behavioral evidence before the badge is ever distributed.
Such a briefing would be structured with an authoritative, investigative tone, providing senior decision-makers—including city officials, legal auditors, and police commissioners—with a concise synthesis of the data-driven path toward more defensible and effective hiring practices.
