Rabbi Levi son of Rabbi says…The Holy One said to Moshe “You will make a menorah of pure gold” (Shemot 25:31).
Moshe responded: how will we make it?
God responded: “It will be made of hammered work” (Shemot 25:31).
But Moshe struggled and went down and forgot how to make it.
He went up again and said: My Master, how do we make it? God said: “It will be made of hammered work” (Shemot 25:31).
But Moshe struggled and went down and forgot.
He went back up and said: My Master, I forgot it!
God showed Moshe, and Moshe still struggled. God said to him: “See and create” (Shemot 25:40), and took a menorah of fire and showed him how it was made.
But, it was still a struggle for Moshe!
The Holy One said to Moshe: Go to Betzalel, and he will make it.
Moshe told Betzalel, and he immediately made it. Moshe was amazed and said: How many times did the Holy One show me, and I still struggled to make it! But you, who never saw it, knew how to make it by yourself!
BaMidbar Rabah 15
One of the professional events I most look forward to each spring is the Virtual Workshop on Contemporary Parole, a fantastic two-day online gathering of a rigorous group of people producing exceptional work, which we’ve now held for the third year in a row. The papers are always superb and so are the camaraderie and commentary. I got to present a draft version of my new Sirhan Sirhan paper, as well as hear really terrific work on various aspects of parole: gang validation, racial proxies, young adulthood, and others. I can’t go into too much detail, because these are all works in progress and we’ll probably see polished versions of everything getting published soon enough. But one thing that stood out to me was the uptick in really interesting work utilizing machine learning.
I know next to nothing about machine learning and, like Moshe in the midrash above, I might be too old a dog to learn that particular trick. I mean, in the Sirhan paper, n=1. Thing is, the midrash really resonates with me because I, too, feel a lot like Moshe when I hear someone else talk about a fantastic skill they have and how they put it to good use. It looks like, despite God’s repeated tutorials, Moshe’s goldsmithing skills weren’t up to snuff. Thankfully, there were other Israelites with that particular skillset: Betzalel was a gifted goldsmith who made a spectacular menorah on the first try (this is why Israel’s fantastic art school is named after him). While unable to emulate Betzalel’s feat, Moshe had acquired a basic understanding of the necessary artistry and workmanship, so he could appreciate why Betzalel’s finished product was of such high quality. In other words, I don’t employ machine learning in my own work, but I know enough about it to be amazed when I read a paper that uses it well.
To understand the promise of machine learning, let’s first talk about how we do parole research the old-skool way. A multivariate regression works much like the denouement in an Agatha Christie mystery novel. You know the drill: Poirot gathers all the usual suspects in a room and goes through a litany of their motivations, opportunities, debunked alibis, you name it. He eliminates them one by one until he can point to the culprits. The important point is that Poirot selects who goes into the parlor for that last scene: people get there by invitation, and Christie is careful to craft the scene so that it’s pretty much always a finite and manageable list of people. When I run a regression, I pretty much do the same: I think about the dependent variable (the phenomenon I’m trying to explain) and I try to come up with a list of the independent variables that might explain it. For example, if my dependent variable is a parole grant, I ask myself: Do people who are represented by a private attorney do better than people who are represented by a panel attorney? Do people whose hearings happen in the morning fare better than folks who are heard in the afternoon? If victims and/or prosecutors show up for the hearing, does that make a difference? Does the professional background of the commissioners matter? Do people in some prisons stand a better chance of being granted parole? You can tell that each of these assumptions has a certain logic behind it (you get what you pay for; people are more attentive and in a better mood when they are not tired or hungry; professional background goes into constructing people’s worldviews; some prisons have better rehabilitative offerings than others, which improves one’s case). I put all of these “suspects” in a room (the regression equation), run the numbers, and see which come out significant.
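For readers who like to see the parlor scene in action, here is a minimal sketch of that kind of regression. Everything in it is an assumption made for illustration: the hearings are simulated rather than real, the three "suspects" (`private_attorney`, `morning_hearing`, `victim_present`) and the effect sizes baked into the simulation are invented, and the fitting routine is a bare-bones logistic regression (the usual choice for a yes/no outcome like a grant) rather than anything a statistics package would ship. It also skips significance testing and only reports the estimated coefficients.

```python
import math
import random

# Hypothetical "suspects" invited into the parlor (names are illustrative only).
FEATURES = ["private_attorney", "morning_hearing", "victim_present"]


def sigmoid(z):
    """Turn log-odds into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate_hearings(n, seed=0):
    """Create made-up hearings. The effects below are assumptions, not findings."""
    rng = random.Random(seed)
    rows, outcomes = [], []
    for _ in range(n):
        x = [
            int(rng.random() < 0.3),  # has a private attorney
            int(rng.random() < 0.5),  # heard in the morning
            int(rng.random() < 0.2),  # victim attends the hearing
        ]
        # Assumed "truth": attorneys help, victims hurt, time of day does nothing.
        log_odds = -1.0 + 1.5 * x[0] + 0.0 * x[1] - 1.5 * x[2]
        rows.append(x)
        outcomes.append(int(rng.random() < sigmoid(log_odds)))
    return rows, outcomes


def fit_logistic(rows, outcomes, steps=500, lr=0.5):
    """Fit a logistic regression by plain gradient descent; weights[0] is the intercept."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(steps):
        gradient = [0.0] * len(weights)
        for x, y in zip(rows, outcomes):
            predicted = sigmoid(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))
            error = predicted - y
            gradient[0] += error
            for j, xi in enumerate(x):
                gradient[j + 1] += error * xi
        weights = [w - lr * g / len(rows) for w, g in zip(weights, gradient)]
    return weights


rows, outcomes = simulate_hearings(600)
coefficients = dict(zip(["intercept"] + FEATURES, fit_logistic(rows, outcomes)))
for name, value in coefficients.items():
    print(f"{name:>16}: {value:+.2f}")
```

A positive coefficient means that suspect is associated with better odds of a grant, and a negative one with worse odds. Note that the model can only ever tell us about the suspects we invited; anything we did not think to code for never enters the room.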
One problem with this approach is that regression models rarely offer a complete and exhaustive account of the phenomenon they try to predict. There is even a statistic, R-squared, that measures how much of the variation in the dependent variable is explained by the set of independent variables we coded for. But there could be many factors that play into a parole grant that cannot be adequately captured by the variables we identified. In other words, 21st-century law enforcement doesn’t solve crime by putting twelve people in a parlor; if there is forensic evidence at the scene, it gets analyzed, plonked into giant databases, and could generate hits that are one-in-a-million, not one in twelve.
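As a rough illustration of what that statistic does, R-squared is one minus the ratio of the model's leftover error to the total variation in the outcome. The sketch below uses four made-up numbers purely to show the arithmetic; for a yes/no outcome like a parole grant, analysts usually report a "pseudo" version of the statistic instead, but the intuition is similar.

```python
def r_squared(observed, predicted):
    """Share of the variation in `observed` that the predictions account for."""
    mean_observed = sum(observed) / len(observed)
    # Total variation: how far the outcomes sit from their own average.
    ss_total = sum((y - mean_observed) ** 2 for y in observed)
    # Unexplained variation: how far the predictions miss the outcomes.
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_residual / ss_total


# Toy numbers, invented for this example.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
print(round(r_squared(observed, predicted), 2))  # 0.8: the model explains about 80%
```

Whatever is left over after the R-squared share is the part of the story that the suspects in the parlor cannot account for.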
Enter machine learning. As we’re all now figuring out through our use of ChatGPT, artificial intelligence excels at digesting large amounts of text, identifying repetitive patterns, and throwing those patterns into a model. AI is intertextual in that it can assess the impact of any factor in the database on any other factor. As my colleague Kristen Bell and others explain in this paper, this allows the tool to mine parole transcripts for repeated words to get a sense of factors that would not be salient to us in a traditional regression. Moreover, the capacity of these tools is enormous, so one can feed the machine tens of thousands of cases and get a very powerful sense of what is going on. There are even tools like SuperLearner, which can apply multiple machine learning tools to a dataset, coming up with the best of several models. My colleagues Ryan Copus and Hannah Laqueur do exactly this.
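To give a flavor of what "mining transcripts for repeated words" can mean, here is a deliberately tiny sketch. It is not the method used in any of the papers mentioned above: the six "transcripts" are one-sentence inventions, the stop-word list is a hypothetical stand-in, and the scoring rule (the share of granted hearings that mention a word minus the share of denied hearings that mention it) is just one simple way to surface vocabulary that travels with an outcome.

```python
import re
from collections import Counter

# Hypothetical list of filler words to ignore; real projects use far longer lists.
STOP_WORDS = {"i", "and", "the", "my", "is", "a", "do", "not", "have", "was"}


def document_frequency(transcripts):
    """Count how many transcripts mention each word at least once."""
    counts = Counter()
    for text in transcripts:
        words = set(re.findall(r"[a-z']+", text.lower()))
        counts.update(words - STOP_WORDS)
    return counts


def distinctive_words(granted, denied, top=4):
    """Rank words by how much more often they appear in granted than in denied hearings."""
    df_granted = document_frequency(granted)
    df_denied = document_frequency(denied)
    vocabulary = set(df_granted) | set(df_denied)
    gap = {
        word: df_granted[word] / len(granted) - df_denied[word] / len(denied)
        for word in vocabulary
    }
    # Largest gap first; ties are broken alphabetically so the result is stable.
    return sorted(vocabulary, key=lambda word: (-gap[word], word))[:top]


# Invented one-line "transcripts", for illustration only.
granted = [
    "I take responsibility and I have remorse",
    "My remorse is real and I completed programming",
    "I completed programming and accept responsibility",
]
denied = [
    "I do not remember the crime",
    "The gang was not my choice",
    "I do not have insight yet",
]

print(distinctive_words(granted, denied))
# ['completed', 'programming', 'remorse', 'responsibility']
```

The point of the toy is that nobody had to decide in advance that "remorse" or "programming" belonged in the parlor; the words surfaced from the text itself. The real tools do this across tens of thousands of hearings and with far more sophisticated models.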
Machine learning has many applications in criminal justice, as this excellent NIJ article explains. The critiques that are leveled at machine learning often revolve around its most common criminal justice use: predicting reoffending risk. As explained in this solid blog post, critics worry that any predictive analysis based on historical crime data will reflect (and thus reinforce) existing biases embedded in the criminal justice system, and perpetuate misconceptions and fears through the feedback loop of basing predictions on past decisionmaking. In other words, as my colleague Sandy Mayson argues, the problem is with the nature of prediction itself. You rely on a biased past, you get a biased future.
What researchers like Bell, Copus, Laqueur and others contribute is the potential of turning the use of the predictive tool on itself and using it not to predict the risk of those subject to the system, but rather the factors that impact the decisions that the system itself makes. For example, if private attorneys do a better job than state-funded panel attorneys, wouldn’t we want to know this, and wouldn’t it be important to figure out exactly what it is about their performance that makes the difference in the outcome? Using AI can help identify, for example, terminology used by lawyers, thus giving us a sense of the “flavor” of representation that parole candidates receive.
When done well, this technique has fantastic potential to teach us about the hidden nooks and crannies of the parole hearing machine that we would not be able to flag on our own. You don’t have to be an AI whiz to understand and appreciate machine learning research; you just have to understand what it does and appreciate its strengths and weaknesses.