Tuesday, February 23, 2010

It's All About the Data

Yesterday I attended an event at the Brookings Institution on the war in Afghanistan. It was moderated by a news correspondent from ABC and included comments from a senior fellow at the Brookings Institution (who is also one of the authors of Toughing It Out in Afghanistan) and a man we were told is an expert on Pakistan. It was generally pretty interesting, and I think I learned a decent amount about where we're at in that conflict. The author was pretty optimistic about our chances in Afghanistan but worried about Obama's plan to slowly begin drawing down troops in the summer of '11. He said he spoke to many Afghan and Pakistani dignitaries in Dubai last week who warned him that those across the region feel that this is a sign of a lack of commitment on the part of the US. As one might expect, we all left unsure about where Afghanistan will be in the coming years.

It's not the war in Afghanistan that I want to focus on here, however. It's one of the comments that came up during the question and answer session of the event. One of the questions was about metrics and what criteria the military should be measuring in order to get an accurate understanding of whether what they're doing in Afghanistan is working or not. The Brookings fellow talked for a little while about metrics and how they can best be used in a military context, and then he made a comment that stuck with me. He said that we have to be very careful that we don't rely too heavily on data that are easily constructed/gathered. He warned that we often put too much of an emphasis on these data points because we can create them easily, while it is the data that are more difficult and time-consuming to create that are often more useful in helping us determine whether we're meeting our objectives or not.

I paused for a second, did that silent head-nod thing that let everyone around me know that I agreed with him and, of course, instantly related it to education.

The way that we use data to evaluate progress/learning may be one of the most mismanaged aspects of public education. I'd like to start by discussing standardized test scores.

I think one of the reasons so many people have so many problems with standardized tests is that despite a growing number of common standards, there is still a significant lack of agreement about what exactly a school should teach. Is math really more important than art? Is writing really more important than supply and demand? Should schools explicitly teach social skills? Should all students really be held to the same standard? Are 9th, 11th, and 12th graders expected to be learning too?

Standardized testing data are horribly flawed because they don't tell you how much a student enjoys school (which I think MATTERS). The data don't tell you anything about how musically talented a student is or whether a school has cultivated that talent. And the data don't tell you anything worthwhile about the students who weren't tested, which is about 75% of the population of most high schools. And even for the students and subjects the tests do cover, the results can often be misleading.

When you look at all the things a school does, all the things it should be doing, and all the things it could be evaluated for, standardized tests measure maybe 15% of those things (and even less in schools where multiple grades aren't tested), yet they nevertheless provide the public with about 100% of what they know about a school's effectiveness.

Any rational individual looking at this situation from outside the context of education might naturally be inclined to ask herself: WHAT THE HELL? And she'd be right to do so. The way that we collect and manage data with standardized testing, and the importance we place on it, is ridiculous.

But, as with just about everything education does that seems to lack common sense, the reasons we do it are probably primarily political. I can think of three reasons that standardized testing makes sense, none of which are actually helpful to schools/teachers/students:

  1. Standardized tests quantify the learning of large numbers of students and compare them to each other in a way that the public thinks they can easily understand (whether they really do or not is a totally different matter). 90 is better than 50, so school A is doing better than school B. We should try to make school B better.

    This saves the public A LOT of time. It allows them to make quick value judgments about schools, whereas if they had to look through a lot of qualitative data regarding student improvement like portfolios, student/teacher/principal testimonials and the like, they would probably get very frustrated and demand something that allowed them to more easily compare schools. Qualitative data can often make it very difficult to decide which of two things is better.

    "Did school A do better this year than last year? Did it do better than school B? I don't know. There are so many words I have to read! Just give me some numbers; at least I can tell which one is bigger."

  2. They provide elected officials and educational leaders (e.g. school boards, superintendents, principals) with a means of demonstrating progress in a way the public can quickly glance at and be satisfied with. This kind of numerical data is easily manipulated in ways that most of the public has neither the education to understand nor the time to research. You can find a perfect example of this in this post over at GFBrandenburg's blog.

  3. As the senior fellow at the Brookings Institution warned, standardized tests are probably our sole measurement of schools because they're the easiest way to collect data. If a student gets a multiple choice question wrong, there's little question about what to do: you take away a point (okay, a lot of scoring schemes may be a little more complicated, but I think the point still applies - it's easier than assigning some sort of quantitative measurement to a sample of student testimonials about whether their school made them love learning). And although states cough up millions of dollars of taxpayer money to fund for-profit testing agencies, the tests are still probably cheaper than compiling all the different qualitative measurements one might come up with (although I could be wrong here - thoughts?).

But standardized tests aren't the only example of data mismanagement in education. I saw a ton of it in everyday practice at my most recent school. I'd like to offer some anecdotes as concrete examples of where we go wrong with data collection in practice.

Data-driven instruction: it's one of today's education buzzwords. The idea is that you should gather evidence of your students' knowledge and understanding prior to moving on with your instruction. It seems obvious that this should be best practice. The only problem is that very few educators are trained on how to create valid data, and even fewer know how to use data. When you consider the amount of time it takes to create data sets for an entire class (or five if you were to teach either five sections of different students or five subjects to the same students), the reality is that doing responsible data-driven instruction becomes overwhelmingly impractical.

If I wanted to create a chart demonstrating whether my students attained a certain skill as a result of my classroom, I'd first have to create a diagnostic assessment to see if they had it before they entered and then a second assessment to see if they had it after instruction, each of which could take a few hours. Then I'd have to have confidence that my assessment really measured what I was hoping it would measure. Most teachers who've tried this can tell you that, by the time you go to grade the tests, you often figure out that what you were hoping to measure wasn't measured at all (one reason for this might be that a student answered a question incorrectly because they didn't understand your language, not because they didn't attain the skill you were hoping to measure). Additionally, if I wanted a way to compare a student's understanding over time or against other students' abilities, I'd have to come up with a way to quantify the data I received (this is often a major inconvenience because not all data can be quantified in a way that is meaningful). Then I'd have to input the data I created into some sort of chart so I could compare it. At the end of this, I'm questioning the validity of what I've just created and why I just spent 10+ hours on a chart when I could have been creating lesson plans or tutoring students.
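
To make the arithmetic half of that process concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the student labels and scores are made up, and `score_gains` is a hypothetical helper, not anything IMPACT or DCPS provides.

```python
from statistics import mean

# Hypothetical scores (0-100) for the same five students on a diagnostic
# given before instruction and on a second assessment given afterward.
pre_scores = {"A": 40, "B": 55, "C": 70, "D": 35, "E": 60}
post_scores = {"A": 65, "B": 50, "C": 85, "D": 60, "E": 75}


def score_gains(pre, post):
    """Return post-minus-pre change for each student present in both sets."""
    return {student: post[student] - pre[student]
            for student in pre if student in post}


gains = score_gains(pre_scores, post_scores)
average_gain = mean(gains.values())

for student, gain in gains.items():
    print(f"Student {student}: {gain:+d} points")
print(f"Average gain: {average_gain:.1f} points")
```

The subtraction is the easy part. Nothing in this sketch can tell you whether either assessment measured the skill you cared about, which is where the hours actually go.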

Nevertheless, this is the kind of stuff IMPACT expects DC teachers to be doing. That's policy. It sounds good, but it's often impractical. In talking to 20+ teachers at my former school about the evidence they provided to their administrators in the IMPACT evaluations, every single one of them told me that they just made it up to get a good rating. Some even use the same exact chart every time they go in for a conference, year after year.

Even our administrators didn't know how to create or use data. I witnessed multiple situations in which administrators were creating data, pretending that it was valid, and then pretending to analyze it so that we could act on it. The reality was that they didn't have a clue what they were doing but knew they needed to do it because it was a part of their talking points. It's best practice.

I'll give two concrete examples here. One comes from my former content-area department head/administrator and one comes from my grade-level administrator.

My department head was the kind of guy who would find some new educational thing online that would get him super excited and then expect every teacher under him to use it regardless of whether they liked it or not. The problem was that he never had to try it because he wasn't a teacher, and I don't think he ever had any experience teaching in a high school classroom in our subject area. So he found the following questions online or at some workshop:
  1. From whose perspective are we viewing?
  2. What's new and what's old?
  3. How do we know what we know?
  4. How are things, events, and people connected to each other over time?
  5. So what? Why does it matter?

He asked that we have our students answer these questions for every primary source that we gave them. The problem was that even we didn't understand how some of them should be answered. Then he told us how to score them. He gave us a rubric that allowed a student to achieve a 3, 2, or 1 on each question and then asked us to score each other's students' responses. Most of the responses were given a 1 because, like us, the students didn't know what was being asked of them. Our department head got frustrated and told us we needed to raise the scores, so most of us just faked it.

The problem was that our department head didn't know what he was measuring, didn't know how to teach it to the kids, and certainly didn't know how to communicate it to the teachers. So we all just jumped through the hoops of "data-driven instruction" and wasted countless hours of our lives.

(I will quickly say that I believe the questions can be powerful instructional tools if a teacher is WELL trained in how to use them, which we were not. But as measures of assessment, they're sketchy at best, especially when you're attempting to quantify a student's understanding of a particular standard.)

The other example comes from my grade-level administrator, who (unfortunately for her) was responsible for administering DC-BAS and DC-CAS exams to the tenth grade. The DC-BAS is like a practice standardized test that is administered four times a year to measure students' improvement in reading and math. Unfortunately for all of the 10th grade teachers, Discovery Education (the company that creates and scores these exams) did not score the written responses for the DC-BAS assessments. So our administrator occasionally put us to work doing it in our morning meetings. We were given an incredibly bare rubric for assigning a 3, 2, or 1 to each response and no training on what we were looking for. We had counselors grading math tests and world history teachers grading English tests. Because two people had to grade each response and then compare, you'd sometimes see one person give a 1 and another give a 3. They'd stare at each other and say, "Sure, 2 sounds fine." Because none of us had much of an idea as to what we were doing, the end result was that we were making up numbers for the sake of making up numbers. The same scorers wouldn't be scoring the same students the next time around, and you wouldn't be able to do anything actionable with the "results" we were coming up with. But we were at least pretending to do "data-driven" instruction.
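
For what it's worth, the disagreement we were papering over is easy to put a number on. The sketch below is an assumption on my part, not something our school or Discovery Education did: it uses made-up scores from two graders on a 1-3 rubric and two hypothetical helpers, `exact_agreement` and `averaged_scores`, to show how splitting the difference hides the fact that the graders only matched half the time.

```python
# Hypothetical scores two graders gave the same six written responses
# on a 3, 2, 1 rubric. These numbers are invented for illustration.
grader_one = [1, 2, 3, 1, 2, 3]
grader_two = [3, 2, 2, 1, 1, 3]


def exact_agreement(first, second):
    """Return the fraction of responses on which both graders gave the same score."""
    if len(first) != len(second):
        raise ValueError("Both graders must score the same responses.")
    matches = sum(a == b for a, b in zip(first, second))
    return matches / len(first)


def averaged_scores(first, second):
    """Return the 'sure, 2 sounds fine' compromise: the mean of the two scores."""
    return [(a + b) / 2 for a, b in zip(first, second)]


agreement = exact_agreement(grader_one, grader_two)
compromise = averaged_scores(grader_one, grader_two)

print(f"Exact agreement: {agreement:.0%}")
print(f"Compromise scores: {compromise}")
```

In this made-up sample the first response gets a tidy 2.0 even though one grader gave it a 1 and the other a 3, so the recorded score looks more settled than the grading behind it.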

You often hear people say that policy is usually ten years behind the research. Well, I think that practice is probably another ten years behind policy. Yes, we all know that data are important, but few of us know how to create data or use data. Also, most educators don't have the time to do solid data analysis of their students. So while we will use data (Johnny got the answer right when I asked, so he probably understands and I can move on), most of us won't be doing anything near what we're currently pretending to do anytime soon. If this is something that policymakers are going to hold onto as something that is a valuable waste of teachers' time, then they're going to have to start requiring teacher preparation programs to include data creation and analysis courses, which will just add to the never-ending (and always a big joke to anybody who's inside of it) list of tasks teachers are responsible for.

It's not that I'm against using data in education. I'm for it. It's just that if we're going to use it, we've got to start using it more intelligently and not for political purposes. And I think, in the here and now, it's the teacher who needs to decide how s/he is going to responsibly use data. S/he's got to decide what's bull coming down from the top for political purposes and what's something that can actually be used. Otherwise, it's all just a bunch of hoops.


  1. Just FYI: I'm pretty sure those questions are the basis of the IB program, which I've heard your school may be up for adopting since Deal MS has adopted the MS version of it and Wilson HS isn't doing the HS version (as far as I've heard). IB is a rigorous program, even just as a basis upon which to format courses, especially social studies, but it also takes years of training to be a certified IB teacher, during which I assume you are taught to utilize such a format. Without that training, it's almost impossible to utilize such a complicated and deep format for understanding.

  2. Great Post - bottom line - The data-collecting DCPS is doing is worthless from an educational point of view. It's all political and it's mainly bogus.

  3. Can anyone else verify that those are IB questions? I wouldn't be surprised. The school is excellent at spotting best educational practices elsewhere and then attempting to implement them back home without the know-how. It was overwhelmingly clear that our department head didn't understand how to use them, and word in the hallway now is that he's going to drop them after this year.

  4. Another little-known fact about the DC-BAS (benchmark, not end of year) tests: they're not made by the same company that makes the DC-CAS, nor do they have any correlation with the pacing of the curriculum. Consequently, there is all sorts of hand-wringing, weeping, and gnashing of teeth when students go "down" in one strand or another from test to test. For example, at my school, kids went "down" on the data strand of the math test. Why? The previous test had focused on what one traditionally thinks of as data analysis (which we had taught), while the second test focused on probability for the data strand (which we hadn't taught yet). Did our students show growth or not? Who knows? And yet we freak out, administrators demand ANSWERS about how we could let this happen, and worst of all, we label these poor kids who can't pass a test on something they haven't yet learned as "below basic".

    Remind me how this is informative to our instruction, and why DCPS is paying big bucks for this *useless* information.

  5. It's all big business with lots of money to be made!!! Isn't that the American way? Idiocracy here we come!

  6. Great, great post as usual. Just had my IMPACT meeting with my administrator today, and tonight (it's 9:30 pm and I just got home from grad school) I'm about to start putting together a data portfolio for my administrator meeting tomorrow. What bothers me the most is that IMPACT and the school's administrators don't, or won't, say what data it is that they need (I wonder why?). So staff are all wandering around trying to figure out what to show, filling out collaborative sheets (wasting paper) and scratching their heads as teacher after teacher comes out of the office having been told that they don't have the right data. Well, just tell us what the F### you want, then! They won't because, like you said, it's all bogus. They don't want to be held responsible for telling us the wrong thing, so it's better to be vague, and then they are more able to hold it against the teachers at the end of the year. Only my second year, and I'm already so tired of the game playing at DCPS.

  7. I completely agree with you on this data-driven instruction. Before coming to DCPS, I worked in an environment where statistics and the rules of data collection ruled. At my school we are using the Harvard model and have these data binders that the principal checks for IMPACT. The problem is that the principal doesn't really have any idea how to use the data and doesn't use it the way the instructional coach advises. Usually we put everything in the data binder, but the principal just spot-checks it when he is looking for a way to "downgrade" IMPACT scores. On the instructional side, I freely admit to "inventing data". I simply don't have enough time to call the 80+ parents as much as I need to. My planning period is taken up by counselors, teacher meetings, parent meetings, and every little thing that the principal decides we must do. When we finally get a planning period (usually 45 minutes once a week), I am usually too damn tired to be bothered with "analyzing data". I feel it is more constructive to grade the student work that I need to grade and provide feedback to the students. The administrators of my school do not understand their own data process. As a result, we have numerous and redundant data collection methods. We continuously generate the SAME data on a different form. Furthermore, has anyone noticed that the Discovery data report can generate the same info automatically, yet TPTB still want teachers to spend hours generating a handwritten report? Using the DCPS data protocol, the method of analysis is EXTREMELY flawed, especially as it does not have any error analysis built in. We can never determine from this method the standard error of the mean or account for student guessing. The biggest flaw is that, since we are performing intra-group comparisons, the "proficiency" level is adjusted based on the number of kids taking the test who got it right, as opposed to a non-adjustable 70% scale. I rue the day when 50% becomes proficient.

    What the filthy teacher did not address is how data generates more data. At my school, we started with the DCPS data model, and the principal kept having teachers adjust teaching based on this data. Then came the weekly skills tests for weekly assessment, then the reteach items, then the DC-BAS again, so add those standards to the pot and teach. Then came the information from teachers and every other non-admin adult that student behavior was disrupting instruction (this is truly rocket science!!), so now we generate data for the clinician and other personnel. The biggest drawback is that data goes upward, but the interpretation and analysis of the data that drive the new directives downward are SOO off-base 95% of the time. The admins look totally clueless and without one brain cell capable of critical thinking. Bottom line, DCPS data-driven instruction is laughably inept.

    Anon at 8.48 pm: I discovered by accident that the administrators use the teacher data to prepare their reports for MR. I got hold of an admin's binder that used the teachers' data. What was noticeable is that the way the data was reused is, well, "imaginative". I suspect that is why there are never any written reports that accompany the quantitative data.

  8. Since apparently no one in DCPS has any idea what data the powers that be want, or how they want us to use it, can we get Bill Turque on this? Expose this insanity and lack of leadership for what it is and demand real, effective standards for data collection and interpretation?

  9. All those with "Data" stories to tell, please testify:

    DCPS Agency Performance Oversight Hearing on Monday, March 15, starting at 10 a.m. at the Wilson Building. These hearings are opened with public testimony. People sign up in advance, and then testify in the order in which they have done so. If you are willing to share your story at the hearing, please contact Aretha Latta at 202-724-8196, or email her at alatta@dccouncil.us

    Also, please see Turque's piece on this at http://voices.washingtonpost.com/dcschools/2010/02/dcps_lining_up_support_for_cou.html#comments

    DCPS sent out emails to supporters only asking them to testify positively about the reforms

  10. http://www.nytimes.com/2010/02/24/education/24teachers.html?pagewanted=2&hpw

    Firing bad teachers.. not so easy

  11. Anon at 1104: I saw that. Absolutely ridiculous.

  12. I recommend exploring books such as "Weapons of Mass Instruction" (http://www.newsociety.com/bookid/4012) to better understand the context of this discussion as to why and how the use of data in schools has come to be. Also, this is a free resource that expresses similar ideas to Weapons of Mass Instruction:


  13. Anon at 205: Thanks for this. I'm going to check that out for sure.

  14. As a point of clarification, the actress Goldie Hawn didn't graduate from Anacostia, but from Montgomery Blair, class of 1963, in the old building on Wayne Avenue in Silver Spring, MD. Anacostia, the neighborhood, was white and integrated in the 1950s and early 60s. By the mid and late 60s, especially after the riots in 1968, whites were moving out to the suburbs. And I know this how? I was there.

  15. Note to the reader: Anon at 403's comment is related to a discussion that was going on in the comments section of the "Volunteering....But I Wish I Was Teaching" post.