Is Risk Prediction Real Criminal Justice Reform or Junk Science?
by Joe Watson
With its criminal justice system bursting at the seams, one state has decided to take a controversial step to alleviate the pressure.
In 2015, Pennsylvania officials indicated the state would become the first in the nation to mandate the widespread use of risk predictors (also referred to as risk assessments) to determine how much time defendants should spend behind bars.
State judges will use statistical probabilities to gauge whether offenders are likely to commit future crimes – data that until recently was used almost exclusively in the re-entry phase, to determine how best to supervise prisoners upon their release from prison or jail.
Now, thanks to state legislation passed in 2010, those considered unlikely to engage in future criminal behavior will receive shorter sentences or avoid prison time altogether, which would ease overcrowding in state prisons and help reduce the $2 billion Pennsylvania spends on its corrections system each year.
But on the flipside, as critics of so-called “evidence-based sentencing” argue, offenders deemed at greater risk of committing new crimes – based on such factors as their age, gender, criminal record, educational background and even their history of arrests, not just convictions – could spend more time in prison even if they don’t actually commit future offenses.
“It’s a higher stakes decision point in terms of someone’s liberty,” said Ellen Kurtz, who formerly served as director of research at the Adult Probation and Parole Department (APPD) in Philadelphia. “It definitely makes me a little bit more uncomfortable.”
What many like Kurtz fear is that those who could be most adversely affected by evidence-based sentencing are already the most disproportionately represented in America’s prisons and jails: people of color and those living in poverty.
In Allegheny, Philadelphia and Montgomery counties, for example, blacks were more likely than other racial groups to be rated “high risk” on risk assessments, according to 2013-2015 data analyzed by the Reading Eagle. In Berks County, Hispanics were more likely to be rated high risk.
Risk predictors “are described in a way that makes them sound dry and difficult to oppose,” said Sonja Starr, co-director of the Empirical Legal Studies Center at the University of Michigan. “They’re evidence based, they’re scientific, they’re based on regression studies. People don’t like to say they’re against smarter sentencing or more informed sentencing.
“[But] they might not understand,” she continued, “that when you give a risk prediction that’s based on group averages, you are essentially saying that the state should penalize [someone] for being black, male, young, unmarried, or having parents that went to prison.”
Risk prediction has been used in the criminal justice system for nearly 100 years, according to the New York-based Council of State Governments Justice Center (CSGJC), which has studied the evolution of such assessment tools. They have typically been used to evaluate pre-trial detainees when making bond decisions or to estimate the risk of releasing prisoners on parole. [See: PLN, Feb. 2016, p.20].
For decades, risk predictors were little more than psychiatric evaluations based almost entirely on interviews with offenders, and that methodology was widely accepted “prior to the development of structured risk assessment tools in the 1970s,” when risk prediction first became more data driven, according to a 2013 CSGJC report.
The current incarnation of risk prediction “explicitly integrate[s] case planning and risk management” into the process in order to enhance treatment and supervision of offenders, the CSGJC reported.
Across the country, at least 19 standardized risk predictors are used to assess offenders' likelihood of recidivism, along with another 47 tools designed for specific jurisdictions in 37 states.
One of those jurisdictions is Philadelphia, where the APPD has been using risk predictors developed by University of Pennsylvania statistician Richard Berk since 2009.
Berk’s risk predictor incorporates “machine learning,” a statistical discipline he believes is more advanced than most prediction tools because it allows a computer program to sort through a large amount of data and then decide which factors matter and to what extent. His program, Berk argues, is more accurate than other risk predictors because it can distinguish, for example, between violent and nonviolent crimes in an offender’s prior history.
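Berk's actual model is not public, but the core idea – letting the data, rather than intuition, decide which factors matter – can be sketched in a few lines. The toy "one-rule" learner below scores each binary feature by how well it alone separates reoffenders from non-reoffenders; all feature names and records are invented for illustration and are not drawn from any real assessment tool.

```python
# Toy illustration of data-driven factor selection: score each binary
# feature by how often it alone matches the observed outcome, then
# keep the single most predictive one. (Real machine-learning tools
# combine many features; this one-rule sketch shows only the idea.)

TOY_CASES = [
    # (prior_violent, unemployed, urban, reoffended) -- invented data
    (1, 1, 1, 1),
    (1, 0, 1, 1),
    (1, 1, 0, 1),
    (0, 1, 1, 0),
    (0, 0, 1, 0),
    (0, 0, 0, 0),
]
FEATURES = ["prior_violent", "unemployed", "urban"]

def accuracy_of_feature(cases, i):
    """Fraction of cases where feature i alone predicts the outcome."""
    hits = sum(1 for c in cases if c[i] == c[-1])
    return hits / len(cases)

# Pick the feature that best separates the two groups on its own.
best = max(range(len(FEATURES)),
           key=lambda i: accuracy_of_feature(TOY_CASES, i))
print(FEATURES[best])  # prints "prior_violent" for this toy data
```

In this invented sample, prior violent history comes out as the strongest single predictor, consistent with the article's later point that prior record is generally considered the best predictor of recidivism.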
When Philadelphia’s APPD adopted Berk’s program, the department supervised about 50,000 people with just 275 case officers. Cases were randomly assigned to officers regardless of the offender’s profile or criminal history. It was left up to the officers to decide how to supervise offenders at risk of committing more crimes, particularly violent crimes. Community supervision was, at that time, according to Kurtz, “all intuitive, gut based decision making.” And it failed miserably.
But Berk’s method, which sorts offenders into categories of high, medium and low risk, has seemingly reversed those outcomes. Case officers now supervise a set of offenders based exclusively on their level of risk, requiring high-risk offenders to be under more intense supervision and allowing low-risk offenders to check in less often.
Before fully adopting Berk’s system, the APPD tested its potential and found that, with less time devoted to low-risk offenders, case officers were able to oversee more people. Arrests for serious charges decreased, if only slightly. And since fully implementing the risk predictor program, recidivism – violent recidivism in particular – has fallen, making Berk’s once controversial method a success.
Some states have been using risk prediction in other stages of the criminal justice system besides community supervision. Kentucky, for example, uses a risk assessment program developed by the nonprofit Arnold Foundation to determine bail for adult offenders. Though the results have not been overwhelming, the state has released more pre-trial defendants and re-arrested fewer of them.
In Florida, probation officers and judges use a predictor called the Positive Achievement Change Tool (PACT) to help determine everything from supervision to sentencing (on a limited basis) for juvenile offenders. A 2014 state report claims that juveniles who were assessed through PACT before their sentencing were half as likely to reoffend within 12 months as those who were not.
Arizona has been using its own “risk assessment suite” for several years, with probation officers meeting with offenders, analyzing their criminal records and then submitting a pre-sentence report to judges. But like the system employed in Virginia (where that state’s use of risk predictors was challenged by the ACLU as unconstitutional), judges in Arizona often disregard pre-sentence reports, as they are not required to consider their findings.
That’s what makes Pennsylvania’s risk prediction initiative unprecedented. With the exception of a few minor offenses and misdemeanors, evidence-based sentencing will be used in nearly every type of crime in almost every courtroom in the state.
Based on a sneak peek at the methodology, the application of risk prediction in Pennsylvania’s courtrooms will look something like this: a judge will use a chart that places a defendant into low, moderate or high risk categories based on a point total that maxes out at 13.
If the defendant has an extensive prior criminal history, for example (which is considered by experts to be the greatest predictor of recidivism), he or she would be assessed up to 4 points. Being male is worth another point. If the defendant lives in an urban rather than a rural county, that would be yet another point.
Such a hypothetical defendant would already have nearly half the maximum risk score. And that’s before factoring in characteristics that many, including former U.S. Attorney General Eric Holder, feel should not be considered when determining the amount of time an offender spends behind bars.
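The chart described above can be sketched as a simple scoring function. The point weights for prior record, gender and county type come from the article's description; the low/moderate/high cutoffs are assumptions for illustration only, not the Commission's actual thresholds.

```python
MAX_POINTS = 13  # the chart's stated maximum

def risk_score(prior_convictions, is_male, urban_county):
    """Toy points-based risk score mirroring the described chart.
    Cutoffs and the cap on prior-record points are assumptions."""
    points = 0
    # Prior criminal history: up to 4 points (the strongest predictor)
    points += min(prior_convictions, 4)
    # Demographic factors, 1 point each
    points += 1 if is_male else 0
    points += 1 if urban_county else 0
    return points

def risk_category(points):
    """Map a point total to low/moderate/high. Cutoffs are invented."""
    if points <= 4:
        return "low"
    if points <= 8:
        return "moderate"
    return "high"

p = risk_score(prior_convictions=5, is_male=True, urban_county=True)
print(p, round(p / MAX_POINTS * 100))  # 6 points, about 46% of maximum
```

The example defendant from the text (extensive record, male, urban county) lands at 6 of 13 points before any of the more controversial factors, such as arrest history or education level, are even counted.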
“By basing sentencing decisions on static factors and immutable characteristics, like the defendant’s education level, socioeconomic background, or neighborhood,” Holder said in a 2014 speech to the National Association of Criminal Defense Lawyers, “[the use of risk predictors] may exacerbate unwarranted and unjust disparities that are already far too common in our criminal justice system and in our society.”
Another factor that Pennsylvania’s Commission on Sentencing plans to include in its risk predictor program is, perhaps, its most controversial: the defendant’s history of arrests. As public defenders throughout the state have pointed out to the Commission, arrest rates are even more racially skewed than convictions.
Based on the same type of statistical generalizations used in risk prediction, data show that men without a high school education, or who are single or unemployed, are more likely to be black or Hispanic and to live in low-income neighborhoods.
“This is a compounding problem,” said Bradley Bridge, an attorney with the Defender Association of Philadelphia, which has argued that factoring a defendant’s arrest history into sentencing decisions may be unconstitutional.
Mark Bergstrom, the executive director of Pennsylvania’s Commission on Sentencing, noted he is listening to concerns like Bridge’s, as well as those raised by others at public hearings across the state. While he contends that, according to research, prior arrests predict recidivism more accurately than prior convictions do, he knows that using arrests as a factor in the state’s risk predictor will be challenged in court – something the Commission wants to avoid.
And while risk assessments are mandated to be used in nearly every criminal case in the state, Bergstrom wants them to be applicable in a limited way: to identify low-risk defendants who would benefit by avoiding prison altogether and high-risk defendants to be considered for longer sentences or more intense treatment.
“While risk assessment tools and evidence-based research and practices are a vital part of [criminal justice] reform efforts, they do not absolve us of our moral responsibility as a country,” Glenn E. Martin, the founder and director of JustLeadershipUSA, wrote for Truthout.org. “Couching reform in ‘risk’ offers the kind of cowardly political ‘out’ that makes it easy to bury the faces and ignore the stories of people, families and entire communities whose lives have been devastated by prison time.
“What risk assessment tools can’t measure,” Martin continued, “is the power of redemption, a human capacity that belongs to people who have committed all sorts of crimes, including murder. Are we so morally bankrupt that we do not believe that people have the capacity to change? We’ve reduced human beings saddled with criminal convictions to statistical probabilities....”
Further, while risk assessments may be effective in forecasting recidivism, they don’t have the capacity to actually prevent it. Risk predictors help to identify offenders who are more likely to commit future crimes but do nothing to change the circumstances of those who feel desperate enough to engage in criminal behavior in the first place.
“America won’t solve its prison crisis using the same logic used to construct those prisons,” Martin said. “In the name of reform, we have settled on an automated and mechanical solution to a bizarre and inhumane system, weighing each individual life using factors that excuse us of our responsibility to those lives.”
There is some evidence that supports such criticism. In May 2016, ProPublica issued a report claiming that risk prediction calculations made by COMPAS, an assessment tool produced by Michigan-based company NorthPointe, often show a racial bias – handing down harsher scores to blacks than to whites. NorthPointe has disputed ProPublica’s findings as being misleading.
Also, as reported by TCF.org in August 2016, some critics have raised concerns related to public transparency and the proprietary algorithms employed by risk prediction programs. For example, TCF.org reported that a journalist filed public records requests in all fifty states, seeking information related to how “criminal justice risk assessment forms were developed or evaluated.” The reporter received no records from any of the public agencies.
Yet since risk assessments have a direct impact on people’s lives, it is important to understand how they work – and, apparently, many risk prediction systems don’t work very well.
As reported by Washington State’s Spokesman-Review, in January 2017 city and county jails in Spokane began using a risk predictor called the Spokane Assessment for Evaluation of Risk (SAFER) to determine which pre-trial detainees should be released in lieu of bond.
According to the newspaper, SAFER was developed by Washington State University criminal justice professor Zach Hamilton, who had aggregated data from more than 13,000 Spokane County criminal cases.
It was this use of local data – as opposed to generalized national demographic data – that Spokane County Criminal Justice Coordinator Jacqueline Van Wormer claimed was responsible for SAFER’s superior performance. As quoted by the Spokesman-Review, Van Wormer said the SAFER risk assessment tool was accurate in around 70 percent of its predictions, whereas other systems employed by other agencies nationwide are accurate only 55 to 65 percent of the time.
Which means even the “superior” risk prediction program used in Spokane County was inaccurate in about 30 percent of cases – still a significant margin of error.
Sources: “Risk Assessment Instruments Validated and Implemented in Correctional Settings in the United States: An Empirical Guide,” by Council of State Governments Justice Center (2013); www.brookings.edu; www.spokesman.com; www.themarshallproject.org; www.truthout.org; www.vox.com; www.readingeagle.com; www.tcf.org