If the service providers are correct, you need their professional services in order to make any real progress with your data.
I have learned otherwise. Every one of us can move the needle in our organizations with advanced analytics, and we do not need to hire anyone, buy anything or contract with anyone to pull it off. Here is my story:
Business Objective: Increase Graduation Rates
When I was CIO at a university, we on the leadership team talked a lot about how to increase the graduation rates of our students. Graduation rates mattered deeply to us. If we admitted a student who did not graduate, no one really came out ahead – we had invested in a student who did not finish, and the student had spent time and money without reaching a career goal. As we discussed changes in curriculum and in our faculty model, I joked that if we only admitted students we knew would graduate, our graduation rate would climb to 100 percent. We all shared a laugh and then went back to talking about curriculum and faculty models.
But I kept thinking about the idea of admitting students more likely to graduate. We knew the background, skills, experience and goals of the students who applied to our programs. And we knew the background, skills, experience and goals of those who graduated. This was a strong data set. Could we use this data to define the profile of the student we expected to succeed and then admit as many people who fit this profile as possible? I thought the idea had merit.
What Would the Data Say?
We already had an admissions profile that we humans had defined. Every potential student was required to complete three admissions tests. These three test results were the dominant factors in our admissions model. But what would all our other data say?
I decided to run an experiment. I would use advanced analytics methods to develop a data-based admissions model. But I did not have a data scientist. I did not have any advanced analytics tools. I did not want to contract with someone who came with a bias – I wanted our data to do the talking. So I talked with some people, did a little research and decided to conduct a data contest. For a few thousand dollars, I posted our desired outcome (an admissions model that predicted the likelihood of an applicant graduating from our program) and our data on a contest website (in my case kaggle.com, but there are others).
Data Analytics Disrupts the Old Admissions Model
Data teams from around the world used advanced methods on our data and the winning team came back with an admissions model that turned our human-defined model on its head. The data said that the most important factor in our human-defined model was, in fact, the sixth most important factor. The second most important human-defined factor was, in reality, the ninth most important factor.
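To make the idea of ranking factors concrete, here is a minimal Python sketch. It is not the winning team's method; it is a much simpler stand-in that ranks each factor by how strongly it correlates with graduation and then sets that ranking beside a human one. The factor names, the toy records, the assumed human ordering and the helper names (`pearson`, `rank_factors`, `compare_rankings`) are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_factors(applicants, graduated):
    """Order factor names from strongest to weakest correlation with graduation."""
    factors = list(applicants[0])
    strength = {
        f: abs(pearson([a[f] for a in applicants], graduated)) for f in factors
    }
    return sorted(factors, key=lambda f: strength[f], reverse=True)


def compare_rankings(human_rank, data_rank):
    """Map each factor to its (human position, data position), counting from 1."""
    return {f: (human_rank.index(f) + 1, data_rank.index(f) + 1) for f in human_rank}


# Hypothetical toy records: one dict per past applicant, 1 = graduated.
applicants = [
    {"admissions_test": 55, "years_experience": 1},
    {"admissions_test": 80, "years_experience": 2},
    {"admissions_test": 60, "years_experience": 6},
    {"admissions_test": 75, "years_experience": 8},
]
graduated = [0, 0, 1, 1]

human_rank = ["admissions_test", "years_experience"]  # assumed human ordering
data_rank = rank_factors(applicants, graduated)
print(compare_rankings(human_rank, data_rank))
```

A real contest entry would use far more records and a proper predictive model, but the output has the same shape as the comparison described above: each factor's position in the human model next to its position in the data.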
I took the results of my experiment to the executive team so that we could discuss the model. There were some questions about the approach, but it was hard to argue with data. Everyone agreed that we should start to use the new model. There were some ripple effects. The new model meant that we would admit fewer students, so we would have to make up for that loss. But with a confidence factor of around 83% (the admissions model was probabilistic, so it was not a perfect predictor of success for any individual student), we knew that our graduation rates would increase over time.
I am happy to report that this all worked as planned. And because our new admissions model was based on data (rather than informed opinion), there was an added benefit: we could give rejected candidates specific feedback on what they needed in their background in order to be admitted. They could then go fill in those gaps and apply again.
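The trade-off described above, fewer admits in exchange for a higher graduation rate, can be sketched with a little arithmetic. This is only an illustration under assumptions: the predicted probabilities and the cutoffs are made up, and `admit` and `expected_graduation_rate` are hypothetical helpers, not part of the actual model.

```python
def admit(predicted, cutoff):
    """Keep applicants whose predicted graduation probability meets the cutoff."""
    return [p for p in predicted if p >= cutoff]


def expected_graduation_rate(admitted):
    """Average predicted probability of the admitted group (0.0 if nobody is admitted)."""
    return sum(admitted) / len(admitted) if admitted else 0.0


# Made-up predicted probabilities for eight applicants.
predicted = [0.95, 0.90, 0.85, 0.80, 0.60, 0.45, 0.30, 0.20]

for cutoff in (0.0, 0.5, 0.8):
    group = admit(predicted, cutoff)
    rate = expected_graduation_rate(group)
    print(f"cutoff {cutoff:.1f}: admit {len(group)}, expected rate {rate:.0%}")
```

Raising the cutoff shrinks the admitted group while raising its expected graduation rate. Because each prediction is a probability, any individual admitted student can still fail to graduate.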
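One way such feedback could be produced is to compare a rejected applicant against the profile of a successful student, factor by factor. The sketch below assumes a profile expressed as a minimum value and an importance weight per factor. That representation, the factor names and the `feedback` helper are hypothetical, and the real model may well have worked differently.

```python
def feedback(applicant, profile):
    """List (factor, shortfall) pairs where the applicant is below the profile,
    most important factor first."""
    gaps = [
        (factor, minimum - applicant.get(factor, 0), weight)
        for factor, (minimum, weight) in profile.items()
        if applicant.get(factor, 0) < minimum
    ]
    gaps.sort(key=lambda gap: gap[2], reverse=True)
    return [(factor, shortfall) for factor, shortfall, _ in gaps]


# Hypothetical profile: factor -> (minimum value, importance weight).
profile = {
    "years_experience": (3, 0.5),
    "prerequisite_courses": (4, 0.3),
    "admissions_test": (70, 0.2),
}
applicant = {"years_experience": 1, "prerequisite_courses": 4, "admissions_test": 65}

for factor, shortfall in feedback(applicant, profile):
    print(f"{factor}: short by {shortfall}")
```

Listing the gaps in order of importance tells a candidate where to start, which is what makes the feedback specific rather than a general rejection.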
In completing this and similar projects, I have learned the following:
- At a low cost, someone can analyze our data. There are contest websites. There are local user-group contests (I have a data set in such a contest as I write). There are local colleges and universities that are begging for real data for student projects.
- At a low cost, we can use massive compute power. For a different analysis, I had data sets composed of thousands of data elements (and the corresponding millions of pieces of data). Advanced analytics methods (like machine learning) consume lots of compute power to sort through complex data sets. There is plenty of cheap, available compute power in the cloud (in the case of my complex data set, I rented a thousand compute nodes for a few hours each month to do my analysis).
- Delivering a profound, insightful analysis might be the best way for IT to move the needle in the organization’s life and to boost the profile of the IT team. This is one way we prove our worth as business and technology leaders.