For the last few weeks, across Maine, students in grades 3 and up have been in the midst of standardized testing. Kids don’t like taking tests, teachers don’t like giving them, most parents don’t like their children taking so many, and even school administrators get headaches from scheduling them. It is a perennial period of gloom. This time of year my Facebook feed lights up with posts from teacher friends (and parent friends) about how stressful, inconvenient and time-consuming standardized tests are.
What I’d like to do, in keeping with my pledge to focus on the positive, is talk about what can be done to minimize or even eliminate the negative impact testing can have on the culture of learning in Maine schools. This week we will look at why we test kids, what the intended outcomes are, and what has made testing so problematic. And then we’ll look ahead to a solution.
To start: Why test?
There are two intended outcomes when we test students. One is obtaining formative information, and the other is obtaining summative information.
Note: I prefer to use the word information in place of data when I can get away with it, as data brings to mind numbers, and when we look at test results we are looking at a great deal more than just numbers.
In the case of formative information, educators are looking for evidence to support adjustment of future instruction. Is the child demonstrating mastery and understanding? Is there evidence of learning? Are there areas of concern? Are there visible patterns of learning among groups of students? Are there patterns of failed learning? Ideally, educators ask themselves these questions when they review test results in order to determine who gets taught what in the future.

We can learn a tremendous amount from looking at test results, from reading levels and evidence of fluency to specific areas of strength and difficulty in math. As a math teacher, it is always helpful to see when a student demonstrates apparent strengths in geometry, for example, and weaknesses in algebraic concepts. Broad subjects like “math” and “reading” involve a lot of sub-disciplines, which is why math teachers tend to cringe every time a parent says something like, “oh, I was never very good at math, I guess it runs in the family.” From tests, we are able to see evidence that exposes specific areas of strength and weakness, which usually contradicts the idea that a child is “bad at math” or “not a good reader.” There is almost always something to celebrate and build on in a child’s assessment results. Formative information helps educators make decisions about what learning their students are ready for next. The dreadful and hideous term, data-driven instruction, comes from this. More on that later.
Summative information is conclusive and retrospective. It tells us how a student or group of students did at the end of instruction, period. Report cards are summative reports. Summative information is reviewed by test takers and test givers alike to provide information about the past. A standardized test at the end of the year, for example, whose results will not be available until after the year is over, won’t provide educators with a lot of useful information to adjust instruction for specific students, but might add to existing evidence to determine if a curriculum is working or needs adjustment or replacing. At the district level, teachers and administrators can compare summative test results from school to school, and even with other districts. With this information, they may ask, should we switch curricula? Should we share our successes with other districts? Should we investigate other factors affecting learning in one building or across the district?
This is where the almost equally dreadful and hideous term, data-driven decision making, comes from.
Let’s talk about that briefly. Data, referring to numbers, implies hard evidence. Numbers don’t lie, or so they say. The term, data-driven, has been attractive in the business world because it implies that decisions are made based on cold, hard numbers. If the data shows profits are down, we look at more data that shows who is and who is not buying the product. Then we look at more data that shows why the market has gone cold on the product, and then what can be done to warm the market back up so that profits can rise again. And so forth. The reason I find the term, data-driven, to be so hideous in the field of educating children is that the data, i.e., the numbers, only say so much. They are a piece of evidence that paints only part of the picture. In the case of academic testing, the numbers do not tell the entire story, and they can be unreliable.
That’s right, the tests we give students can be unreliable. If a student scored an 86% on a specific category of a standardized assessment, a number of factors could have contributed to that result. The score could have been impacted by academic factors, but also by environmental conditions, such as the temperature of the classroom, disruptions by classmates, a fire drill or other interruption, lack of breakfast, physical illness, emotional distress, fatigue, or over-stimulation, just to name a few. Did the result demonstrate retention of information over time, or merely a comprehensive test review days or even hours before the assessment? One group of students might have had superb instruction and great retention of knowledge, while another group had less instruction but a thorough, comprehensive review of test content days prior to the assessment, and both groups could yield similar results.
There are many, many factors that can impact academic assessment results in a way that makes it difficult or impossible to reliably measure academic learning. This is why data-driven instruction is an entirely inappropriate term to describe best practices in education. It is also one reason test scores should never be used to evaluate teacher effectiveness.
The better term to use to describe how we use the information derived from tests, both standardized and general, is data-informed. We use the results that come from both formative and summative assessment to inform the decisions we make. While the evidence tests provide may not always be reliable, it can still be valuable in helping us see patterns and find evidence of strengths and weaknesses among students and groups of students. It can also provide valuable information to help us choose which instructional materials serve our students best.
Tests give us valuable information about students, about instruction, about school climate, about curriculum, but they generally do not determine anything on their own. We should rely on tests to help us make decisions, but not to drive our decision making process.
So far we have only addressed the overemphasis on testing, and not over-testing as an issue in itself. Many feel we are testing children too much these days. I hear from students, teachers, administrators and parents that we give too many tests and the tests take too long, and I agree. When you consider district writing prompts, standardized assessments for math, language arts and science, universal screening assessments for math and language arts like NWEA that often happen two to three times each year, and regular summative curriculum assessments in all subjects, we are really throwing a lot of tests at kids. In a single subject, tests might only take up a handful of the 170 or so daily lessons taught throughout the school year, but the frequency of these tests is significant. A language arts student in grades 3 through 8, for example, might take a screener in September, a writing prompt in October, the NWEA again in January, the MEA in March, a writing prompt again in May, and another NWEA also in May. If the student is in high school, add the SAT to the mix. That is about six disruptions to the curriculum for a single subject alone, in addition to any lessons missed due to a snow day, fire drill, assembly or other disruption. Students easily become fatigued and overwhelmed by so many tests, and teachers become frustrated that their instruction is so constantly interrupted and thus impeded.
Getting rid of standardized tests outright is not the answer, as they are tools that give us valuable information. Reducing the number of tests we assign to our students, however, is necessary and long overdue. Despite widespread acknowledgment of the problem, over-testing continues. The standardized tests give us one piece of information, while the district writing prompts give us another. Progress measures like NWEA give us a nice look at patterns of growth throughout the year. There are real benefits to testing.
When I talk to teachers I work with about tests and test data, I am reminded how difficult it is for teachers to utilize the information these tests provide. My experience in the classroom was the same: there is extremely limited time to “unpack” and “dig down” into the data, assuming the data doesn’t arrive too late to use anyway. As a teacher, I always found I learned a great deal more about my students’ learning by watching them learn, in my classroom, than I did from the numbers that came back from a screener or a standardized assessment. “Timmy had trouble classifying two-dimensional shapes today,” is an issue I can immediately begin to address in my classroom. Whereas, “Timmy went down three points in geometry,” is a much vaguer result to work with.
Years ago, as a math coach in a different district from where I work today, I was asked to “show teachers how to dig down into their NWEA data” to gain information about their students’ growth. It was futile. Most knew how to access the data. Unfortunately, after the time devoted to all that digging, they just didn’t gain much valuable insight from it. It was as if we were expected to stumble upon a realization like:
“Timmy dropped three points in geometry. Clearly, Timmy needs more geometry instruction. Off to work I go.”
The data digging was not worth the teachers’ time. As their math coach, I could have done the digging and shown them the results. “Look,” I could have said, “Timmy dropped three points in geometry.” And the teacher might respond,
“That’s not surprising, he was having trouble classifying rectangles just the other day.”
Glad I could be of assistance. Time is a precious commodity for teachers (see every other post in this blog), and I never want to waste it. There is valuable information to be learned from tests, but not more valuable than what teachers should already know about their students. Test data should raise questions, start a conversation, or confirm prior understanding, but it should not drive the decision making process.
So about that solution.
Remember, I am just a guy with a blog, not a politician. I can flip-flop all over the place if I want to. So write to me and tell me what you think, and maybe I’ll evolve some more on this, but here’s what I think: students in each grade level, in each subject (math, reading, writing, science, social studies), should take one and only one school-wide standardized assessment per year. It should be delivered at the end of the year, in order to help inform next year’s teachers about the new students coming into their classes.
That’s it. One standardized test, for multiple purposes.
Current teachers should be encouraged to learn as much as they can about their students’ learning from the formative assessing they do in their classrooms while teaching their lessons. Let their curriculum be interrupted no more than once all year for the purpose of standardized testing. Give students more time to be inspired, engaged… educated.
I could write forever about this topic, but it’s your turn now. What do you think should happen with standardized testing? Maine has now delivered three different standardized tests for math and language arts in three years. Do you think we will ever find the right one? Are there states out there that have tackled this dilemma in another way? Do you live there? Is it spring where you live??? Write to me. Keep the comments coming. Visit the Facebook page, as I will be sharing some relevant links in the coming days and weeks.