The four speakers in the workshop’s final panel session were Brad Weiner (University of Colorado Boulder), Michelle Whittingham (University of California, Santa Cruz), Conrad Tucker (Carnegie Mellon University), and Kirsten Martin (University of Notre Dame). Andrew Williams (University of Kansas) moderated a discussion period after the short presentations.
A predictive model, Brad Weiner explained, is an equation carefully built on historical data that estimates the likelihood of a future outcome when given a new set of data. While models can be biased, he argued that predictive models can also point the way to equitable solutions. They can be useful in higher education, he said, as a means to use resources as efficiently as possible and to help rebuild America’s trust in higher education.
Currently, he noted, higher education does not have the capacity to provide customized interventions for each student, so achieving its research, teaching, and public engagement mission depends on fairly and efficiently allocating the resources it does have. In Weiner’s opinion, “if we do not use data or existing research, then we are guessing, and if we are guessing, we are going to be biased toward the status quo.”
To use predictive models effectively and fairly, Weiner offered the following guidelines:
To illustrate how a model can predict retention, Weiner used an artificial dataset to create a regression model that predicted the likelihood of retention based on four variables, including number of peers and receipt of financial aid. Once the model was created, he entered different artificial data, and the model made new predictions and identified three groups of students based on the previously identified variables: those likely to remain in school, those with a medium likelihood of doing so, and those unlikely to be retained.
It is at this point, said Weiner, that the model would transition to informing potential interventions, specifically how to allocate limited resources. For example, based on the prediction that more financial aid improves the likelihood of retention, one option would be to provide some financial support to students identified as least likely to graduate, additional tutoring or other support services to students in the “medium likelihood” category, and other low-cost support to students most likely to graduate. Although he does not advocate it, a second option would be to provide no support to the students identified as least likely to graduate, on the grounds that they are unlikely to graduate anyway; a third option would be to do nothing for the students who are most likely to graduate, on the assumption that they do not need support. Accepting the latter two options would focus all the available resources on the middle group.
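A minimal sketch of the kind of workflow Weiner described, using scikit-learn on synthetic data, might look like the following. All variable names, coefficients, and probability cutoffs here are illustrative assumptions, not Weiner’s actual model.

```python
# Illustrative sketch only: a toy retention model on synthetic data.
# Variable names, coefficients, and cutoffs are assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 1000

# Four artificial predictors, e.g., number of peers and receipt of financial aid.
X = np.column_stack([
    rng.integers(0, 10, n),    # number of peers in program
    rng.integers(0, 2, n),     # received financial aid (0/1)
    rng.normal(3.0, 0.5, n),   # high school GPA
    rng.integers(0, 2, n),     # on-campus housing (0/1)
])
# Synthetic outcome: retention loosely tied to the predictors.
logits = 0.2 * X[:, 0] + 0.8 * X[:, 1] + 1.0 * (X[:, 2] - 3.0) - 0.5
y = rng.random(n) < 1 / (1 + np.exp(-logits))

model = LogisticRegression().fit(X, y)

# Score a new artificial cohort and sort students into three likelihood groups.
X_new = np.column_stack([
    rng.integers(0, 10, n),
    rng.integers(0, 2, n),
    rng.normal(3.0, 0.5, n),
    rng.integers(0, 2, n),
])
p = model.predict_proba(X_new)[:, 1]
group = np.where(p >= 0.7, "likely retained",
        np.where(p >= 0.4, "medium likelihood", "unlikely retained"))

# Weiner's first allocation option, mapped onto the three groups.
intervention = {
    "unlikely retained": "additional financial support",
    "medium likelihood": "tutoring and other support services",
    "likely retained": "other low-cost support",
}
print(group[0], "->", intervention[group[0]])
```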
In closing, Weiner suggested that who succeeds in higher education is a function of both structure—the available resources—and culture—the way an institution deploys those resources. “As educators, we should use data appropriately to allocate our limited resources, which means we can reproduce the past or we can create more opportunities in the future,” he said.
Of the 17,000 undergraduates at the University of California, Santa Cruz (UCSC), said Michelle Whittingham, 31 percent are from underrepresented populations, 35 percent are first-generation students, and 30 percent are Pell Grant recipients. The university also has 2,000 graduate students, has been recognized as both a Hispanic-Serving Institution (HSI) and an Asian American and Native American Pacific Islander-Serving Institution (AANAPISI), and has been cited as one of the top universities in the country for advancing social mobility.16 She noted that UCSC’s admissions and enrollment management office works closely with the academic senate committee on admissions and financial aid, which sets the school’s policies and selection principles, and that the school has exceptionally strong shared governance.
Over the past 15 years, the school’s admission rate has fallen from 83 percent of applicants to 52 percent in the most recent year, and Whittingham and her colleagues continue to work with faculty on new ways to leverage student data and insights in order to disrupt historical and systemic oppression. In particular, the university focuses on students whose backgrounds suggest that the opportunity to study at a major research university will have a strong, positive influence on their future.
Every application, said Whittingham, is reviewed by a reader who has completed a certification process and is scored against the rubric the senate committee has established. Applications are scored in the context of opportunity and take into account traits such as leadership, resilience, and perseverance. She noted that faculty have worked hard to understand what achievement means beyond grades and test scores, given a student’s community of origin. As an example, UCSC may focus on schools that have a higher percentage of English language learners and students in foster care, and it uses both census tract and block data to understand the school environment of each student.
___________________
16 US News and World Report (https://www.usnews.com/best-colleges/rankings/national-universities/social-mobility) and CollegeNET (https://www.socialmobilityindex.org/) rank institutions on social mobility.
Regarding predictive analytics, Whittingham said that all admissions criteria are in a sense based on some sort of predictive analytics. Since 2011, UCSC has produced a first-year predicted GPA for every student, and it uses this information at the margins of its admissions decisions. This predicted GPA is oriented toward inclusion, not exclusion, in that it is used to determine who will get a second read after the initial holistic review process. This provides an opportunity to go back and try to better understand the context of that student’s achievements. The university also uses the predicted GPA to identify the 10 percent of students who could benefit from a check-in call or in-person meeting during the first few weeks of the quarter, to make sure they are doing okay and to ask whether they would like additional support.
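A sketch of how a predicted GPA might be used for inclusion in the two ways Whittingham described. The cutoff, field names, and function names are assumptions for illustration, not UCSC’s actual procedure.

```python
# Illustrative only: a predicted first-year GPA used for inclusion,
# not exclusion. The cutoff and names here are assumptions.
import numpy as np

def needs_second_read(predicted_gpa, gpa_cut=2.5):
    """Flag an application for a second holistic read so the student's
    context can be examined more closely (hypothetical cutoff)."""
    return predicted_gpa < gpa_cut

def checkin_list(predicted_gpas):
    """Return indices of the 10 percent of students with the lowest
    predicted GPA, for an early-quarter check-in call or meeting."""
    gpas = np.asarray(predicted_gpas)
    k = max(1, int(round(0.10 * len(gpas))))
    return np.argsort(gpas)[:k]
```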
The process of human learning, said Conrad Tucker, relies on the senses to acquire data that an individual’s biological predictive model—their mind—uses to classify and make sense of the world. Much of the feedback a person receives, he continued, comes from other humans and the environment. Artificial intelligence (AI) relies on replicating this process: machines with artificial sensors acquire data that a predictive model uses to classify and make sense of the world. The machines and algorithms receive feedback from humans to improve their classifications, but Tucker noted that “one of the challenges is that we have to acknowledge the biases that exist in humans, and how those biases may then translate to machines.”
As an example of how bias works, he showed first a picture of an apple and asked the attendees to classify it, and then a picture of a plantain and asked people to classify it. He pointed out that people in the United States are more likely to say the second fruit is a banana, whereas people in other parts of the world correctly identify it as a plantain. He explained that human cognitive processes take sensory input and use prior experience to make classifications, so bias can occur based on that prior experience (e.g., an individual’s common experience with bananas and limited experience with plantains).
Mapping these ideas to admissions, Tucker discussed “algorithms that are trained on student data,” including candidate profiles or student resumes. “The final output is a classification of whether a student is accepted to or rejected from a program, whether a student is provided financial aid, and whether a student will have a GPA that exceeds a given threshold. The challenge of these AI models,” he continued, referring back to the fruit classification example, is that the development of ground truth labels, or “the way that the machines are actually learning is coming from humans. So we have to ask ourselves, what are the inherent biases in the ways humans make decisions from data? And what are the possible risks of transferring those biases to these artificial representations of decision making?” He also noted the inherent difficulty in interpreting the classifications made by AI systems.
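A compact way to see Tucker’s point about ground-truth labels: if a model’s training labels are past human decisions, the model learns to reproduce those decisions, biases included. The following is a hypothetical sketch on synthetic data; the features and the injected bias are assumptions made for illustration.

```python
# Sketch of label bias: the "ground truth" here is past human decisions,
# so any bias in those decisions becomes the target the model optimizes.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000
test_score = rng.normal(0, 1, n)
group = rng.integers(0, 2, n)  # hypothetical demographic attribute

# Biased historical labels: same score, different admit rates by group.
past_admit = rng.random(n) < 1 / (1 + np.exp(-(test_score - 0.5 * group)))

model = LogisticRegression().fit(np.column_stack([test_score, group]), past_admit)

# The model faithfully reproduces the gap it was taught.
x = np.array([[0.0, 0], [0.0, 1]])  # identical score, different group
print(model.predict_proba(x)[:, 1])  # unequal admit probabilities
```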
The goal of the workshop’s final presentation, said Kirsten Martin, was to examine the potential benefits, costs, and unintended consequences of using artificial intelligence, data science, and machine learning tools in recruiting, admissions, and retention. Doing so, she said, involves answering four questions: what is being measured, what data it takes, how accuracy is measured, and whether the outcome or target could be repurposed.
It is important to remember, said Martin, that the outcome variable, data, features, and dataset are all value laden, and what often happens in the admissions process is that students who look like students from the past are admitted or given aid. She noted that although AI can make decisions faster than humans, “we are aggressively reproducing whatever is in the data,” which means that “we can also aggressively be making mistakes.”
Martin clarified that calling data “objective” actually means “easily quantifiable,” because all data-driven decisions are moral judgments. She also noted that there is a social justice imperative to avoid the use of proxy data variables, especially related to socioeconomic status, because as datasets grow larger in terms of the number of variables they include—not the number of individuals—the chances increase of finding a proxy for success in one area of life that is then incorporated unintentionally in a machine learning model designed to predict success in another area of life. As a result, winners in one sphere of life—growing up in a privileged environment, for example—may receive preference in the allocation of opportunities (e.g., college admissions) because of flaws in the data analytics program.
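One way to see Martin’s proxy concern is that dropping a sensitive variable does not help if another column can stand in for it. The following is a hypothetical sketch; the variables and the strength of the correlation are assumptions for illustration.

```python
# Sketch of a proxy variable: a zip-code income field is never labeled
# "socioeconomic status," but it can reconstruct it. Variables assumed.
import numpy as np

rng = np.random.default_rng(2)
n = 10000
ses = rng.normal(0, 1, n)                  # socioeconomic status (excluded from model)
zip_income = ses + rng.normal(0, 0.3, n)   # an included, seemingly "neutral" variable
print(np.corrcoef(ses, zip_income)[0, 1])  # ~0.96: the model sees SES anyway
```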
When it comes to measuring accuracy and finding where predictive analytics can go wrong, it is necessary to consider whether the analytics predict the outcome variable consistently across different groups, capture the phenomena of interest (e.g., finding all the good students), and are good at predicting those phenomena (e.g., whether accepted applicants turn out to be good students or are retained only with some intervention).
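These three checks map onto familiar metrics computed per group: recall (finding all the good students) and precision (admitted students turning out well), compared across groups for consistency. A minimal sketch, with the data and group labels assumed for illustration:

```python
# Per-group accuracy checks: does the model predict equally well for
# each group? Data and group labels here are assumptions.
import numpy as np
from sklearn.metrics import precision_score, recall_score

def per_group_report(y_true, y_pred, groups):
    for g in np.unique(groups):
        m = groups == g
        print(f"group {g}: "
              f"recall={recall_score(y_true[m], y_pred[m]):.2f}  "       # found all good students?
              f"precision={precision_score(y_true[m], y_pred[m]):.2f}")  # admits turn out well?

y_true = np.array([1, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 0, 1, 1, 1])
groups = np.array(["A", "A", "A", "B", "B", "B"])
per_group_report(y_true, y_pred, groups)
```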
Considering whether an outcome or target can be repurposed, Martin cited use of an analytics system designed to predict retention as part of the admissions process: such a repurposing could be used to deny admission to students predicted to not persist in their studies—which Martin called “repurposing for exclusion” rather than for inclusion, as Whittingham described. Martin cautioned that “the idea of repurposing machine learning for exclusion versus inclusion is not an idle one…. We really need to be careful about how things are labeled and the purpose for which we design them.”
Responding to a question from the audience about the use of AI and algorithms in light of the research Posselt had presented, Weiner began by noting that while artificial intelligence has received a great deal of attention in the media recently, what people do not realize is how far it has to go before it is widely deployed in real-life applications. In fact, he was not aware of any institutions that are seriously considering deploying an advanced AI system that would actually make admissions or financial aid decisions.
He also pointed out that the media focuses disproportionately on 20 to 40 US universities that are not representative of what the entire college landscape is doing. In fact, between 80 and 90 percent of US four-year institutions admit roughly 80 percent of their applicants, which says to him that there is not a huge space in which to deploy AI systems for admissions purposes, and even where there is space, it is not likely to happen in the very near future.
Williams asked the speakers for ideas of how AI tools could be used with social media in a positive way to influence more students to enter engineering, and to comment on the ethics of doing so. Tucker replied that this is a fascinating idea and that large organizations have been studying whether artificial intelligence could be used to predict whether an intervention would produce a certain response, and the answer is that it can. The challenge with social media is that it is largely an echo chamber in which individuals’ belief systems are reinforced by the people around them. He remarked that it would be nice if students started with a belief that STEM education is the future, which artificial intelligence could then reinforce, but the reality in popular culture is that STEM education is not considered “cool” and is not seen as an inclusive path. For that reason, Tucker said, he is not sure AI can be the silver bullet for much larger societal issues.
Martin remarked that machine learning might be useful for identifying and then targeting individuals who might be interested in and persuaded to pursue STEM education, but probably not for persuading groups of people, such as teenagers. In fact, she said, there is danger of a university using this type of microtargeting because it could prompt people to wonder how the institution managed to get so much information on an individual and then use it to manipulate them. That, said Whittingham, would be a good example of an unintended consequence of what would appear to be an altruistic motive.
Weiner noted that what artificial intelligence does well is tackle repetitive tasks that do not pose difficult classification problems. The selection process for admissions is not an easy classification problem, but algorithms that could read transcripts, understand what different course titles mean, and provide some context around the grades associated with those courses would eliminate one of the tedious initial steps for human admissions reviewers. Whittingham agreed that it would be a big stretch to turn the assessment of contextual variables over to any kind of machine learning system any time in the near future.
An audience member asked about suggestions for vetting products offered by educational technology companies. Martin said one problem is that the people responsible for purchasing such a system do not know the right questions to ask or are not aggressive about asking them of these companies. She would advise inviting someone from the computer science department to ask tough, almost hostile questions about how these systems work. Whittingham recommended finding out what other institutions a company pitching these services has worked with and what their experience has been. Weiner said that although many of the products sold by the companies can add value to an institution’s work, he would advise extreme skepticism and short contracts.
When asked if the United States would follow the European Union’s actions in proposing restrictions on the use of AI, with specific mention of university admissions, Tucker replied that Europe is ahead of the United States in thinking about privacy issues and that this country is not anywhere near proposing restrictions on artificial intelligence. One reason is that data analysis is a lucrative area for US technology companies, and policies that limit access to data are not going to gain traction here any time soon. What he would like to see is some mechanism that provides students with information on the factors used to produce a given admissions decision from their data. Martin agreed with that idea, and suggested that institutions look carefully at the output variables that lead to an admissions decision that differs from a reasonably expected outcome. The speakers all agreed that transparency is essential when it comes to how data are used with any predictive model.
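One simple form of the mechanism Tucker described, reporting back which factors drove a decision, is per-feature contributions from a linear model. This is a sketch under the assumption of a fitted logistic model; it is not a technique any speaker attributed to an actual institution, and the names are illustrative.

```python
# Sketch: for a linear/logistic model, coefficient * feature value gives a
# per-factor contribution that could be reported back to an applicant.
# The fitted model and feature names are assumed for illustration.
import numpy as np

def explain_decision(model, x, feature_names):
    """Rank the factors behind one prediction by absolute contribution."""
    contributions = model.coef_[0] * x  # one term per factor
    order = np.argsort(-np.abs(contributions))
    return [(feature_names[i], float(contributions[i])) for i in order]
```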
Martin did suggest that machine learning might be useful for looking at the data for specific groups, such as women or other historically minoritized groups, and those data could be used to refine models so that they might reveal predictive patterns for those groups. In that type of use case, machine learning could increase inclusion, but humans would still be the ultimate deciders. The key is to build models that answer very specific questions and help point the human deciders to the desired decision. Weiner agreed that an appropriate use for predictive models is to optimize resources by being thoughtful and designing systems for specific purposes.
Tucker suggested that predictive modeling could be useful in retention applications, perhaps by predicting who would be at risk of dropping out, although he expressed concern that this would rely on data about student behavior and perhaps mental health status, which raises privacy issues. Predicting that type of behavioral trajectory would incorporate multiple dimensions of data, so the challenge would be getting students and parents to allow sharing of the types of data that could illuminate the difficulties a student might face.
Weiner believes machine learning could be useful as a mirror on campus to help identify places where various pathways are going wrong. For example, the right algorithm could reveal an opportunity to improve a specific curriculum to make it more inclusive.
To close the discussion, Williams asked the panel to provide some words of wisdom for newcomers to the field of artificial intelligence, machine learning, and data science. Whittingham suggested drawing on some of the great books and papers that are available, challenging assumptions about the potential benefits of these systems, and not underplaying potential risks. Tucker agreed with the importance of thinking about potential risks and unintended consequences and educating oneself about what it really means for an algorithm to make predictions or classifications. It is important, he added, to understand the limitations of these systems and not confuse possible correlations with causal patterns.
Weiner acknowledged that colleges and universities are going to have to use data to operate efficiently and effectively as institutions that have the public trust, but said it is imperative to do so transparently and to be concerned about how data might be misused, particularly with predictive models that could further disadvantage marginalized students. He also called for skepticism with education technologies, a caution that Martin endorsed. She reiterated her point about asking hard questions about the conditions under which a particular product works.
After the panel session, the attendees distributed themselves in two virtual breakout rooms to spend 45 minutes discussing the following three questions and then report back to the assembled workshop attendees:
Amy Kramer (Ohio State University) reported that her group thought one of the biggest barriers in using data tools is training, particularly for users to understand the limitations of these tools, what a tool does, and whether it reflects the overall goals of university admissions. The group suggested that some of the time saved using data tools could be invested in improving them and making sure that they are transparent to all users.
The group brainstormed other tools to inform advising, and to help students pick the best colleges for them and understand their odds of admission, whether an application is worth the admissions fee, and what they can do to improve their applications.
In terms of what a “perfect admissions system” might look like, the group discussed a portfolio for each student, with information about the context of the high school, interviews, an annotated high school transcript, and reference letters; an equity lens; tools that could help train recruiters to reduce bias; and transparency. The group also suggested that a perfect admissions system would use input from many different stakeholders in the decision-making process.
Pieri reported that the group considered what happens after admission in terms of supporting students, as well as ways to help faculty understand the culture and experiences of incoming students. They suggested having admissions officers brief the faculty about the characteristics of the incoming class. The group also discussed how AI review of graduate applicants could be made reflective of the students’ cultures, both international and US. Finally, the group discussed faculty hiring and promotion processes, as well as the development of a department culture that brings together students and faculty from different backgrounds and produces well-educated graduates who go out and influence industry, even though many faculty members have no connection with industry beyond consulting.