All Schools Need a Data Scientist

Since I started in education 18 years ago, I've seen wave after wave of reform measures. At that time, No Child Left Behind was gasping for air. Common Core shot to the fore but crashed and burned like an early SpaceX rocket. CC was rebranded and repackaged as Race to the Top almost immediately. Every new reform measure to hit the streets was marketed as "research-based," as if that moniker meant it was sure to work.



Then, a few years ago, we stopped hearing the phrase "research-based" when it was replaced with the words "evidence-based." It was as if someone high up on the education policy-setting food chain suddenly realized that research on reform measures didn't necessarily imply those reform measures worked. But this epiphany didn't make things any better.

So here we are, now in the third decade of the 21st century, with education in America still floundering. According to Michael Seelig in the Stanford Social Innovation Review, 20 years of education reform resulted in growing inequality. This spring, Kimberly Amadeo wrote that education in the US is falling behind the rest of the world. 

Billions of dollars of research in education, and billions more spent to implement reform measures based on that research, and we still haven't found an equation we can solve to make it work.

Why?

Perhaps it's because what works in one place doesn't necessarily work everywhere. Education reform that works may not be a one-size-fits-all algorithm. It may be that School A in Peoria needs a completely different approach than School B in Alameda. It could be that each district, or even each school, has its own one-of-a-kind approach that will produce the best results for its students.

Over the last five or so years this idea might actually have crept in. But the articles linked above, and international education rankings, don't suggest it's been implemented effectively. Teachers everywhere are ordered to make "data-driven decisions" about their curriculum and instruction. Making such decisions takes lots of data, and teachers across the country are piling up data like a county dump piles up household rubbish. Drawers full of data. Disks full of data. Drives full of data.

Data, data everywhere, and still...lackluster results.

Why?

I'm going to hazard a guess and say it's because there are a lot of people collecting data, a few people compiling data, and even fewer actually analyzing that data. Many of those few who are actually analyzing data have probably never studied statistics, data analysis, or data science. Most people don't realize there's much more to data science than making pretty graphs and charts.

Sadly, a lot of people in education think that's pretty much all there is to it--collect some data, make some charts and graphs, then decide to change this or that because of what you see on the charts and graphs. But data science, real data science, starts way before collecting the data, and extends way beyond making charts and graphs.

Data science starts with figuring out what problem you want to solve. Too often in education, people start by collecting data, then come up with questions based on what they see in the data. Though data may reveal more problems to be solved, identifying the initial problem you want to solve should be the first step in the process.

Once you identify the problem you need to solve, you identify what data you need to try to solve it, which of that data is already available and which you need to collect, and how best to gather it. Most educators I know have never been trained in this process. They're just told in a professional development session that they need to make data-based decisions and have data available to justify their decisions.

Next comes cleaning the data. This almost sounds criminal to folks who never studied data science. It sounds like you get rid of data you don't want so it will tell you what you want to hear. But that's not it at all. Often, no matter how good your plan is or how much effort you put into it, data collection efforts yield incomplete or corrupted datasets. Data science has some conventions to handle most of these situations, but these methods are unknown to most lay educators. 
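To make that a little more concrete, here is a minimal sketch of three common cleaning conventions: dropping duplicate entries, treating impossible values as missing, and filling missing values with the median. Everything in it is an assumption for illustration only. The records, the 0-100 score scale, and the `clean_scores` helper are made up, and median imputation is just one of several conventions a data scientist might choose.

```python
from statistics import median

# Hypothetical raw records (invented for illustration): None marks a missing
# score, student "S3" was entered twice, and 940.0 is a likely typo.
raw_records = [
    {"student_id": "S1", "score": 72.0},
    {"student_id": "S2", "score": None},
    {"student_id": "S3", "score": 88.0},
    {"student_id": "S3", "score": 88.0},
    {"student_id": "S4", "score": 940.0},
    {"student_id": "S5", "score": 65.0},
]


def clean_scores(records, low=0.0, high=100.0):
    """Deduplicate by student, blank out-of-range scores, impute with the median."""
    seen = set()
    unique = []
    for rec in records:
        if rec["student_id"] in seen:
            continue  # keep only the first entry for each student
        seen.add(rec["student_id"])
        unique.append(dict(rec))

    # A score outside the assumed valid range is treated as missing, not deleted.
    for rec in unique:
        score = rec["score"]
        if score is not None and not (low <= score <= high):
            rec["score"] = None

    # Fill missing scores with the median of the valid ones, and flag them
    # so later analysis can tell real values from imputed ones.
    valid = [rec["score"] for rec in unique if rec["score"] is not None]
    fill = median(valid)
    for rec in unique:
        rec["imputed"] = rec["score"] is None
        if rec["imputed"]:
            rec["score"] = fill
    return unique


cleaned = clean_scores(raw_records)
for rec in cleaned:
    print(rec)
```

Notice that nothing is thrown away to make the numbers look better. The questionable values are flagged as imputed, so anyone reviewing the analysis can see exactly what was changed and why.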

Once the data are all in a usable format, data exploration begins. This is much more than assigning two or three categories and making a bar chart. This process means cutting and slicing the data, examining relationships between different variables in an effort to find trends and patterns that could lead to greater insight. It's looking for connections between two or more variables that may not appear to be connected at all. 
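As a rough sketch of what that slicing can look like, the example below groups scores by grade level and then checks how strongly absences and scores move together. The records and the helper names `mean_score_by_group` and `pearson_r` are hypothetical, invented for illustration, and a Pearson correlation is only one of many ways to look for a relationship between two variables.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean

# Hypothetical records (invented): (grade level, days absent, test score)
records = [
    ("9", 2, 85.0),
    ("9", 10, 70.0),
    ("10", 4, 80.0),
    ("10", 12, 62.0),
    ("11", 1, 90.0),
    ("11", 8, 74.0),
]


def mean_score_by_group(rows):
    """Slice the data by grade level and average the scores in each slice."""
    groups = defaultdict(list)
    for grade, _absences, score in rows:
        groups[grade].append(score)
    return {grade: mean(scores) for grade, scores in groups.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


by_grade = mean_score_by_group(records)
r = pearson_r([row[1] for row in records], [row[2] for row in records])

print(by_grade)
print(f"correlation between absences and scores: {r:.2f}")
```

In this made-up data the correlation comes out strongly negative, which would prompt the next question rather than settle anything: a correlation alone doesn't say whether absences cause lower scores.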

The next stage of the process is to build models that predict results from the variables you chose to include during the earlier stages. This is probably the phase of the process most foreign to educators, from the classroom to the corner office. Constructing mathematical models that use multiple variables, some quantitative and some qualitative, is not the forte of even most K-12 math teachers.
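For readers who want to see the idea in miniature, here is a sketch of a model that mixes one quantitative variable (days absent) with one qualitative variable (tutoring, yes or no). It is an assumption-laden toy: the data are invented to follow an exact straight-line rule, the names `encode`, `solve`, `fit`, and `predict` are hypothetical, and ordinary least squares is only one of many modeling techniques a data scientist might reach for.

```python
# Hypothetical training data (invented): (days absent, tutoring, test score).
# The scores were constructed to follow score = 90 - 2 * absent + 5 * tutoring.
students = [
    (0, "no", 90.0),
    (2, "yes", 91.0),
    (5, "no", 80.0),
    (3, "yes", 89.0),
    (10, "no", 70.0),
    (6, "yes", 83.0),
]


def encode(days_absent, tutoring):
    """Numeric row: intercept, quantitative value, qualitative value as 0/1."""
    return [1.0, float(days_absent), 1.0 if tutoring == "yes" else 0.0]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - rest) / m[i][i]
    return x


def fit(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, days_absent, tutoring):
    """Predicted score for one student under the fitted coefficients."""
    return sum(c * v for c, v in zip(coefs, encode(days_absent, tutoring)))


coefs = fit([encode(a, t) for a, t, _ in students], [s for _, _, s in students])
print("intercept, per-absence effect, tutoring effect:", coefs)
print("predicted score, 4 absences with tutoring:", predict(coefs, 4, "yes"))
```

Because the toy data follow an exact rule, the fitted coefficients land on that rule almost perfectly. Real school data are far noisier, which is exactly why judging how much to trust a model takes training.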

Now! It's finally time to put those pretty charts and graphs together that show policymakers and other stakeholders what the models reveal. Alongside an executive summary, these can be used to convince teachers and admin to implement changes based on the models' predictions. 

Whew! What a process. Sounds like we're finished, right? We finally reached our destination! You might be thinking, that's a little more involved than what we've always done, but it's still doable with the great educators I have on staff. And I'm sure you do have fantastic, more-than-capable educators. But here's the kicker...

We're not done yet!

That's right! Even when we've done all that, the data science process isn't over. In fact, IT'S NEVER OVER! Once you get to the end, it starts all over again!

What? How can this be?

It's because once you implement changes based on all this great, professional data analysis, you start generating more data. As that data pours in, the data scientist compares the results of the implemented changes with the predicted outcomes. New trends and patterns emerge that reveal more opportunities to improve, or even unanticipated problems that require solving.
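Here is a small sketch of that comparison step, assuming the model's predictions and the newly observed results are both available per school. The school names, the numbers, the 5-point threshold, and the `compare` helper are all hypothetical and chosen only for illustration.

```python
from statistics import mean

# Hypothetical values (invented): what the model predicted for each school
# and what was actually observed after the changes were implemented.
predicted = {"School A": 78.0, "School B": 82.0, "School C": 74.0}
observed = {"School A": 80.0, "School B": 71.0, "School C": 75.0}


def compare(predicted, observed, threshold=5.0):
    """Return per-school gaps, the mean absolute error, and schools to revisit."""
    gaps = {name: observed[name] - predicted[name] for name in predicted}
    mae = mean([abs(gap) for gap in gaps.values()])
    # Any school whose gap exceeds the threshold goes back into the cycle.
    flagged = [name for name, gap in gaps.items() if abs(gap) > threshold]
    return gaps, mae, flagged


gaps, mae, flagged = compare(predicted, observed)
print("gaps (observed minus predicted):", gaps)
print("mean absolute error:", round(mae, 2))
print("schools to investigate:", flagged)
```

A school that lands far from its prediction isn't a failure of the process. It's the starting question for the next round.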

And the solution(s) come from another iteration of the data science cycle. 

That's why every school, or at least every school district, needs a data scientist, or a team of data scientists if it's large enough. Data science done right is a perpetual process, not a one-and-done. Every school or district has its own unique set of students, teachers, administrators, parents, and other stakeholders. Each has its own culture. All these unique populations require unique solutions.

A data scientist or team of data scientists in each school or district could make real progress on reforms to improve their school or district. In Peoria, they could determine what works best for schools in Peoria. In Alameda, they could find solutions to the problems faced by Alameda schools. 

Now is the time when all schools or districts need a data scientist.
