A year ago, I dropped out of one of the best computer science programs in Canada. I started creating my own data science master’s program…
More colleges and universities are exploring how to better use the trove of data they’re collecting on their students to improve teaching and learning. (Image credit: Chelsea Beck/NPR)
The Chron has a report (with underwriting by Blackboard) on using Big Data in higher ed to foster better student learning.
Via The Atlantic: http://ift.tt/2fasrcd
A college degree may be the golden ticket to a better job, but that incentive alone isn’t enough to stop millions of students from dropping out of school. In fact, just over half of students complete their postsecondary degrees within six years. But a lack of academic preparation is not necessarily the saboteur of their success: More than 40 percent of dropouts left their studies with at least a B average, a recent analysis of 55 colleges showed.
Faced with bleak statistics such as these—in addition to scrutiny over their affordability—colleges are looking in the mirror to examine how they might do more for these students who have the talent to make it but ultimately don’t. At a growing number of schools in states like Maryland and Tennessee, the results of this soul-searching are starting to take shape as a series of digital columns and rows in spreadsheets. This reform practice even has a flashy name: predictive analytics.
Colleges have looked at student data before, but often “it’s data too late,” said Frederick Corey, the vice provost of undergraduate education at Arizona State University, who spoke at a meeting for education reporters last month.
Corey is one of the university’s main drivers of using data to improve the students’ academic experiences. While we may view lists of numbers as the arch expression of a campus’s impersonal attitude toward its students, predictive-analytics evangelists believe that data collected the right way ultimately can personalize a student’s time at a large school in ways that weren’t previously possible. The result is a system of timely suggestions that prompts students to perform the tasks that are shown to improve their chances of completing a course, and ultimately a degree. But while the potential is high, the risks are salient, too. When does a digital nudge turn into a dictum that prevents a student from chasing her dreams? And does that digital profile become a risk to the student’s privacy?
Institutions of higher learning have always gathered copious amounts of information about their students, from how many of them complete certain courses to how accurately a grade in one course predicts their success in harder classes down the line. But until recently, much of that information had been collected merely for accountability purposes—data that are shared, for example, with state and federal agencies that track how successful colleges are at graduating their students or giving them educations that allow them to earn living wages in the workplace. (It’s these data mandates that allow journalists to report on the previous year’s graduating class.)
But what about using these data to look ahead? “I just want to understand, why is the world of education obsessed with autopsy data?” Mark Milliron, the co-founder of education predictive analytics firm Civitas Learning, recalled his partner at work asking after a few months on the job. “It’s always studying data of students who are no longer [enrolled].” And often the data are sliced to show how the average student behaves, painting a picture of the typical student that actually applies to no one. “How many of us know people with 2.3 kids?” Milliron quipped.
And not all data are digital. Small colleges already have a type of predictive analytics built into their systems, explained Milliron. Thanks to small faculty-to-student ratios, professors and administrators are able to make quick judgment calls about their students’ weaknesses or points of trouble (lack of participation in class, fear of making eye contact, the tremors in the voice hiding the embarrassment of being overwhelmed) and act on those observations. But “you’re probably not going to get that same personalized experience” at larger campuses where students are likelier to be the first in their families to vie for a bachelor’s degree and may not know how to navigate both the bureaucracies and expectations of a college education, Milliron said. With predictive analytics, “you’re connecting the dots so you’re not getting lost in the mix.”
Experts say a good predictive-analytics system avoids making recommendations based primarily on a student’s financial or cultural background. In Milliron’s experience, many colleges initially assume it’s enough just to observe that low-income students or those who belong to certain racial groups underperform; colleges would then make assumptions about students from similar backgrounds who enroll and then refer them to mentoring sessions or more time with advisers. “You’re insulting and-or stereotyping that student,” Milliron said. Worse, colleges may feel motivated to either exclude those students from their admissions or lower their standards for issuing degrees.
Rather than focusing exclusively on race or family income, a more precise predictor of success is whether or not a student’s financial aid is adequate to address her financial needs. Students stressing over holes in their finances are at greater risk of leaving college—as Temple University learned when it turned to data to boost its graduation rates. (In fact, hundreds of schools are offering emergency small loans and grants to students who may be at risk of dropping out due to diminished funds.) Another telltale sign that some students may be off track is whether they have enrolled in a key requisite to their major by a certain point early in their college tenure. Sending alerts to them or their advisors can preempt the cascading effects of taking the necessary classes too late.
Maintaining a staff of data analysts who are able to monitor student behavior in real time across multiple variables can be expensive for colleges, but the payoffs can be huge. Corey of Arizona State University said since his school began using predictive-analytics programs nearly a decade ago, it’s seen its graduation rate climb by 20 percent. One tool ASU has relied on is College Scheduler, a product that several hundred postsecondary institutions have used. Before they sign up for classes, students enter personal information into a dashboard program that spits out possible course schedules and takes into consideration their personal and academic obligations, like being a working parent pursuing biology who has to pick up a daughter from daycare. The tool can be valuable because many students may otherwise end up taking courses that don’t count toward their major, wasting their time and financial aid. At ASU, the College Scheduler auto-populates with the courses students have to take, Corey said.
“Students, unless they’re John Nash, Jr., can never do that matching,” said Milliron, who added that College Scheduler has been shown to boost college-completion rates by more than three percent.
Still, all that data requires a high degree of training and security, because universities have “data points on a student encompassing almost every single aspect of that student’s life in a way that no one else does,” said Brenda Leong, a senior counsel and director of operations at the Future of Privacy Forum. Beyond the abuses of power that are potential hazards with the use of predictive analytics, there’s also the difficulty of ensuring a student’s privacy. Leong noted that just by knowing someone’s birth date, gender, and zip code, there’s an 87 percent chance she could determine that person’s identity. Leong said she often hears boosters of big data referring to the growing amounts of student information as “fields of gold.” “That’s the kind of phrase that puts a lot of people off,” she said. “It’s not data, it’s students; it’s real people with real lives.”
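To make Leong’s point concrete, here is a minimal sketch of how one might measure how many records in a dataset are pinned down by birth date, gender and zip code alone. The roster, the field names and the `unique_share` helper are all invented for illustration; nothing here reproduces the analysis behind the 87 percent figure.

```python
from collections import Counter


def unique_share(records, keys=("birth_date", "gender", "zip_code")):
    """Return the fraction of records whose combination of `keys` is unique."""
    if not records:
        return 0.0
    # Count how often each (birth_date, gender, zip_code) combination occurs.
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    # A record is re-identifiable here if nobody else shares its combination.
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)


# Hypothetical roster: the first two students share all three attributes.
students = [
    {"birth_date": "1998-03-14", "gender": "F", "zip_code": "85281"},
    {"birth_date": "1998-03-14", "gender": "F", "zip_code": "85281"},
    {"birth_date": "1997-11-02", "gender": "M", "zip_code": "85281"},
    {"birth_date": "1999-06-30", "gender": "F", "zip_code": "85004"},
]

share = unique_share(students)
print(f"{share:.0%} of records are unique on these three fields")
```

In this toy roster, two of the four students can be singled out from those three fields even though no names appear anywhere in the data.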
How these predictive data are relayed to students matters as much as the data themselves, experts contend. Haughty notes or clinical red flags in students’ inboxes that caution they’re at risk of jeopardizing their academic futures because they’re not attending classes or logging into online portals to turn in homework can chase them away permanently. “This phrase, ‘You’re at risk’ is highly problematic. We never say to a student, ‘you’re at risk,’” Corey said. In recent years some scholars have been developing motivational language for students that strikes the right tone between concern and a kind call to action. The idea is to have colleges adopt these terms so students feel emboldened to improve rather than distraught over their limited success.
Milliron said more encouraging language would say something like, “just so you know, the next milestone is this course. If you pass this course at this level, you’ll triple your likelihood of graduating.”
But even that approach might lead to miscues. Leong warned that if professors are the ones monitoring the student data and firing off such missives, their opinion of the students may be altered. The professor could become overly solicitous or judgmental, undoing the potential benefit of the initial concern. One workaround is to have mentoring officers who are trained to speak sympathetically to struggling students send those notes and track the data.
In other words, predictive analytics is a few Jurassic eggs along in its evolution. Getting it right will take time, Milliron said. “Anybody who says they have this all figured out doesn’t know what they’re talking about.”
This article appears courtesy of the Education Writers Association.
For nearly 30 years, pundits have predicted that education technology would disrupt higher education. Online courses will reduce costs and create unprecedented access to higher education, so the argument goes. Likewise, adaptive learning will improve — or replace — the art of teaching as the right digital content is delivered at the right time to each individual learner.
It’s looking increasingly like none of these are the game-changers we expected. While online learning is commonplace, higher education remains firmly in the crosshairs of critics targeting high tuition, student debt, poor completion rates and unemployed and underemployed graduates — demonstrating a growing skills gap.
But all is not lost. It may be that technology’s transformation of higher education lies not in the transformation of teaching and learning, but the advent of a new digital language that connects higher education and the labor market and, in so doing, exerts profound changes on both.
The historic disconnect between higher education and the needs of the labor market is a data problem. In the past, data translating the discrete skills or competencies that employers need was not easily available or meaningful to faculty who create courses, or the students who take them.
Meanwhile, hiring managers have consistently relied on signals supported by anecdotal evidence, at best — for example, assuming that philosophy majors from Brown made terrific analysts, or that teachers with master’s degrees performed better in the classroom.
Today, technology is changing the relationship between education and the workforce in four distinct ways.
First, competency data is becoming increasingly available. Online psychometric assessments, e-portfolios and micro-credentials are surfacing student competencies beneath the level of the terminal credential (i.e. degree). In addition, many colleges and universities are in the process of migrating to competency-based models, which will allow for the output of transcripts that better describe the competencies of graduates.
Second, there is a clear path for employers to interact with this new data. Applicant Tracking Systems (ATS) are incorporating analytics and will soon begin gathering new competency data as inputs for assembling candidate pools for human hiring managers to evaluate. As such, ATS is transitioning from a backwater of HR technology to Applicant Information Systems that will radically reduce the preponderance of false positives and false negatives in candidate pools, thereby significantly reducing bad hires that cost employers about $15,000 each, on average.
Third, this data is being extracted and parsed into competency statements by algorithms originally developed for purposes other than human capital development (i.e. search, e-commerce). On the other side, the same algorithms are extracting and parsing competency statements from job descriptions, then matching the two.
Of course, regardless of the caliber of student competency data, matching students with jobs only works if employers’ job descriptions accurately capture and describe key competencies. So the fourth major development is the advent of “People Analytics” technologies, allowing employers to track employee performance with a feedback loop to job descriptions. The result is that job descriptions continuously improve, moving from vague and data-poor to precise, data-rich renderings of the profiles of top performers.
Together, these four technological developments will close the gap between higher education and the labor market and usher in a new era in human capital. The resulting “competency marketplaces” will help students understand the jobs and careers that they’re most likely to match and help employers identify students who are on track, or on a trajectory to match in the future.
Competency marketplaces will inform students’ direction through postsecondary education by providing a human capital GPS to help them select which credentials, courses, assessments, projects or virtual internships move them most efficiently and effectively toward target professions or employers.
The core of the competency marketplace is the candidate or student profile. Your profile will include your resume and transcript, along with badges, projects, the results of standardized tests taken over the course of your life (SAT, ACT, GRE, LSAT) or new industry- or employer-specific micro-assessments. Students with more comprehensive profiles (i.e. more competency data) will be given preference by employers via the ATS. Colleges and universities that fail to recognize this may find that their students are at a relative disadvantage in the labor market and, over time, may face enrollment pressure.
The market for competencies will ultimately put unprecedented pressure on colleges and universities to unbundle the degree. As employers move to competency-based hiring, many will determine that degrees are not a priority — or even required for certain jobs. Over the next few years, degrees will become MIA in many job descriptions.
Unbundling doesn’t mean liberal arts will disappear. It may be that liberal arts courses provide high-value competencies that predict career success across many professions. But it does mean that revenue per student will decline, and that colleges and universities will need to work a lot harder and be a lot more creative to capture the lifetime value of student-consumers. No longer will students fork over $200,000 in tuition for a standard four-year bundle. Postsecondary education will become increasingly affordable. Completion rates will rise. Placement will improve. This is how technology will ultimately disrupt higher education.
While this seems like the stuff of science fiction, it is not far off. Millions of new job descriptions are posted online every month. Colleges and universities are issuing millions of micro-credentials, and millions of students are posting work in e-portfolios. Thousands of employers use Applicant Tracking Systems that are transitioning to Applicant Information Systems.
As the new language of competencies disrupts higher education, we will need to be vigilant to protect the central role that our colleges and universities play in civil society and economic development. At the same time, colleges and universities must take no comfort in the fact that prior predictions of technological disruption have proven false. This time really is different.
Nasir Bhanpuri is not a record producer. He’s never been high out of his mind with depraved rock stars in a Four Seasons hot tub; he’s never even been inside a studio. He’s a data scientist at a health care systems company, where he develops models to predict patients’ hospital needs. But on a fall night in 2014, in the back room of Schubas — a small, low-ceilinged venue on Chicago’s North Side — he caught a show by his friends in Bombadil, a quirky North Carolina folk-pop band. In the dressing room after the show, he had a conversation with the band that made him think about music the way he usually thinks about health care: What makes one Bombadil song more popular than another, and what if he could predict that?
Bhanpuri knew the band in its earliest incarnations — as experimental, Bolivian-inspired undergraduates at Duke University who wore outlandish costumes and played novelty songs about death and caterpillars. But a decade later, on this particular night in Chicago, they played mostly love songs — and Bhanpuri wondered why.
They seem to be more popular, Daniel Michalak, the lead singer, told Bhanpuri after the show. “They had this hunch from getting feedback when they were performing, but it hadn’t really been quantified,” Bhanpuri said. He was a guy who could quantify things.
After the show in Chicago and several conversations later, the band agreed to let Bhanpuri build a rudimentary model to try to predict the popularity of Bombadil’s songs. “The worst thing that can happen is we make a bad song, and we do that all the time anyway,” Bhanpuri recalls Michalak telling him during their initial conversations.
Data and predictive systems are being used to try to answer complex questions about policing, basic income and who’s going to win elections, but this is nothing like that. This is data analysis on the smallest scale, a model built for one peculiar little band that wanted to make better music for its fans. Bombadil wasn’t trying to change its whole sound or to manufacture a perfect song. It just wanted a more systematic approach to making music, with the goal of creating slightly better versions of the songs it was already making.
To figure out what about a Bombadil song made it popular, Bhanpuri first had to know what the components of the band’s songs were. So, for every song on Bombadil’s first four albums, Bhanpuri asked the band members to provide information like the amount of drums, the number of song sections and how much each person was singing. He narrowed that list to 20 categories and asked the band to rate the amount of each in every song, on a scale of 0 to 5. (To take the drum example — 0: no drumsticks were needed, 5: broke another drumstick.)
While the band broke down the songs into their piecemeal data parts, Bhanpuri worked on creating a metric for popularity that reflected the nuances of various music platforms. Together, everyone settled on a combination of ratings from Last.fm, Spotify and the Echo Nest (a music data and analysis company that is now part of Spotify). Using four albums’ worth of popularity scores and song characteristics, Bhanpuri developed a model to determine which combination of Bombadil song parts was the most popular: “‘it’s the drums on this song’ or ‘it’s Daniel singing on that song,’” as Bhanpuri put it.
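The article doesn’t say what kind of model Bhanpuri built, so the following is only a rough sketch of the general idea, assuming a plain linear regression: songs rated 0 to 5 on a few traits, a single popularity number per song, and a fit that estimates how much each trait moves the score. The three traits, the ratings and the popularity scores are all invented (the band rated 20 categories), and the scores were constructed to follow an exact linear rule so the fit is easy to check.

```python
def fit_least_squares(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + [float(v) for v in row] for row in rows]  # prepend intercept
    n = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot fit")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n + 1):
                A[r][c] -= factor * A[col][c]
    # Back substitution.
    coef = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(A[i][j] * coef[j] for j in range(i + 1, n))
        coef[i] = (A[i][n] - rest) / A[i][i]
    return coef


def predict(coef, row):
    """Predicted popularity for one song's trait ratings."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


# Hypothetical 0-5 ratings per song: [drums, lead_vocal_share, song_sections].
songs = [[0, 5, 2], [3, 4, 3], [5, 1, 4], [2, 2, 1], [4, 3, 5], [1, 5, 3]]
# Hypothetical composite popularity scores, built as
# 10 + 4*drums - 2*lead_vocal_share + 1*song_sections.
popularity = [2, 17, 32, 15, 25, 7]

coef = fit_least_squares(songs, popularity)
demo_score = predict(coef, [5, 0, 3])  # a drum-heavy demo with no lead vocal
print([round(c, 3) for c in coef], round(demo_score, 3))
```

In this made-up data the fit assigns drums a positive weight and lead-vocal share a negative one, which is the shape of the “more drums, less Daniel” feedback the band describes; real ratings would be noisier and would not fit exactly.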
The first test of the model was on their fifth album, “Hold On.” Bhanpuri used his model to try to predict which of the songs would turn out to be the most popular, and it did well, producing ratings that weren’t too far off from the popularity ratings on music services. “It gave us some confidence that the modeling approach might catch some things that their intuition was overlooking,” Bhanpuri said. A few months later, the band decided to use the model to actually change their music. The band had just finished a demo on a little lullaby called “I Could Make You So Happy” and sent it to Bhanpuri to run through the model. The feedback was a little awkward for the band: Michalak should be singing less and James Phillips, the drummer, more. “I didn’t hold anything against Nasir, but to be an artistic creative person, you have to have a strong ego, so it was a little hard for me to step back,” Michalak said.
Bombadil agreed to make another version of the song. The “data version” — as Bhanpuri called it — included some new elements and a little more drums. Now there were two versions of the same song, and Bhanpuri was confident that the data-driven song would “win” this time.
They sent both versions of the song to about 50 friends and relatives and asked them to rate each on a scale of 1 to 5 (Michalak’s mom participated and gave both songs a 5). This wasn’t a perfect A/B test by any means, but these were the kinds of people Bombadil had wanted to make better music for: their fans and friends. More than 70 percent of the people who were surveyed preferred the data version — and in the end Bombadil did too.
It’s easy to imagine a horrified indie-rock lifer who thinks A/B testing signals the end times of creativity in the music industry. But Phillips disagrees. “It’s easy to write a similar song to one that you’ve already written,” he said. “Ironically, using data challenged us to break patterns and to create something new, and to take songs two or three steps further than we normally would have.” For a song they’re currently working on, feedback from the model led them to include more upbeat parts, different rhythms and vocal sections they’d originally written off because it “didn’t feel like it was in paradigm with the record,” Michalak said.
A/B testing songs based on feedback from a model is still pretty uncommon, said Liv Buli, who is a journalist at the music analytics company Next Big Sound and has worked on similar projects analyzing the audio properties of songs that lead to sales. “I haven’t heard of a lot of bands doing this,” she said. “I think it’s fairly unique simply because it’s still pretty controversial. The mindset is still that music is an aesthetic and creative process and you can’t do it by the numbers.”
This doesn’t mean what Bombadil and Bhanpuri have done is going to shake the music industry to its core. It’s possible there are other artists out there using data to create new songs or change existing ones, but I couldn’t find any. “I’ve definitely never heard of anyone doing this, and I talk to songwriters and musicians all day long,” said John Vanderslice, the founder of the recording studio Tiny Telephone and the producer that Bombadil will be working with on its next album.
Bombadil always knew that a totally data-driven approach to making a song wouldn’t necessarily make for a great album. The type of model that Bhanpuri used looks for an optimal set of parameters — a little more drums, a little more rap, a little less singing — meaning that every song would likely end up sounding the same. When they head to San Francisco to work with Vanderslice this fall, Michalak wants everyone, Bhanpuri included, to get in the studio together for the first time. He isn’t worried about potential tensions between an actual producer and a data-scientist-as-quasi-producer — in fact, he thinks their opposing ideas will lead to better results. And if it comes to it, the band can always A/B test the feedback.