Monday, August 17, 2015

Educational Assessment: A Huge Waste of Time and Money?

An educational road trip
Imagine it’s 1980 – no World Wide Web, no cell phones, no GPS. Your child is learning to drive a car. They have to drive from Los Angeles to New York in time to attend an important event that could well influence the course of their future life. How would you help them do it? They’d need a long-range plan, of course – a map with a route marked out on it. But this plan alone wouldn’t get them there – they’d need to actively interpret the directions in the real world – identifying which of the many small streets is the right one to turn on, looking for signs and landmarks to know when to change lanes and prepare to exit the highway, constantly checking to make sure they haven’t taken a wrong turn, and figuring out how to get back on track when they inevitably do. They must, in other words, constantly be assessing the situation – determining where they are on the map, where that puts them in relation to the route, and what to do at each moment to stay on track and on schedule.

This driving scenario is analogous to formal education. In this case, the subject matter (arithmetic, world history, etc.) is the map. The curriculum is the route marked out on the map. The student is the driver.  The assessment is the process of tracking location and progress in relation to the route, destination, and schedule.

What’s missing from this picture?

If you are a parent, this scenario might make you feel uneasy. Would you really be OK having your child learn to drive while also following a complex and unfamiliar route across thousands of miles over a number of days, with important consequences riding on their timely arrival? (Analogously, would you expect that your child would buckle down and successfully learn to read books or master algebra on their own by June, given that they want to be a writer, carpenter, engineer, doctor, or architect when they grow up?) Probably not. If they had to make the trip by car and they had to do the driving, you’d probably want to send someone along with them – a navigator and guide who knows the route well, can coach them on how to drive safely and skillfully, and looks after their well-being during the trip – making sure they leave on time each morning, get plenty of sleep, and don’t get lost or sidetracked visiting roadside attractions along the way.

In the educational analogy, the navigator is the educational guide.  But not a classroom teacher – this navigator is a personal tutor working with one student.

Imagine that we cannot afford to provide a navigator (personal tutor) for each driver, but that we can allocate one navigator for each fleet of twenty-five cars. These cars are all leaving from different starting cities, at different times, moving at different speeds, with drivers who have different levels of driving experience and skill, and different levels of familiarity with their route.  Nonetheless, the fleet navigator is responsible for seeing that all drivers arrive in New York within the same hour.

In the educational analogy, the fleet navigator is the classroom teacher.  The cities the students start in are their prior knowledge of the subject matter (arithmetic, history, and so on), New York represents the destination – the set of learning objectives that the teacher is expected to help all students achieve by a specific calendar date (such as the end of the school year), and the diverse speeds and routes represent the fact that students come to any class with diverse levels of prior knowledge about the subject matter, different capabilities and limitations with respect to learning, different levels of interest in the topic, and so on. And yet the teacher is still expected to get them all to New York within the same hour.

What does any of this have to do with assessment?

I frequently hear people make statements like this:
“I feel that all this effort on assessment stuff is mostly a huge waste of time and money.”

To borrow a line from the film The Princess Bride:
You keep using that word ["assessment"].  I do not think it means what you think it means. 

When people talk about assessment, they typically seem to be thinking of written tests, and may even have in mind one specific “high-stakes” test. And that is indeed one form of assessment. But assessment, in an educational context, simply means gathering data to figure out where a student is on the map, evaluating where that puts them in relation to the route and schedule, and answering specific questions such as what adjustments to make to keep them on track and on time. 

Assessment can be done with the eyes and ears as well as with a paper test or an electronic GPS-like dashboard. The personal navigator sitting in the car with the student-driver, for example, is constantly assessing the situation using her five senses – looking for road signs, watching what the driver is doing, feeling the acceleration and deceleration of the car, comparing the car’s location against the marked route, and so on. Believe it or not, that’s assessment. (More specifically, that’s formative assessment.) Another form of assessment is the determination of whether the trip was a success or failure overall – if the child arrives in New York in time for the event, the trip was a success, and otherwise it was a failure. (This is an example of summative assessment – in this case, we might call it a “high-stakes” assessment because big consequences, for better or worse, ride on the outcome.)

The fleet navigator (classroom teacher) obviously can’t be in the car with any of the drivers – she has to manage all twenty-five cars for the duration of the trip. But this is 1980, remember – before GPS and cell phones.  So the fleet navigator not only can’t see what every driver is doing inside their cars at any given moment, but she also has no way of tracking precisely where any student’s car is at any given time.  She can’t do anything to help the drivers reach their destination without information about their location and progress – she would effectively be flying blind. Classroom teachers face a very similar challenge - they can't directly observe what's going on in students' heads, and they simply can't teach effectively without good information about where each student is and how they are progressing.

What might we do?

One reasonable strategy would be to set up a series of checkpoints along the main routes.  Drivers check in when they arrive at these checkpoints and that way the fleet navigator can update the map with their approximate locations. If someone fails to check in at the expected time, or if they check in from an alternate location because they cannot find the checkpoint, then the fleet navigator can investigate the problem and decide how to take corrective action to get them back on track.

These checkpoints are analogous to formal educational assessments – including (but certainly not limited to) written tests. The location of a student’s car is analogous to their state of understanding of the subject matter – their progress in the class relative to the curriculum (route) and learning objectives (destination). The checkpoints (formal assessments or tests) help the fleet navigator (classroom teacher) to know much more precisely where each driver (student) is. Importantly, these checkpoints provide early warning – if we have to wait for the child to miss the event in New York (or fail to achieve the learning objectives by the end of the year) to find out if they were on track all along, by then it’s way too late to do anything about it.
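
For readers who like to see an idea spelled out, here is a minimal Python sketch of the checkpoint logic. Everything in it is hypothetical and invented for illustration: the car names, the schedule, the one-hour tolerance, and the `flag_off_track` helper. It is one way to picture the early-warning idea, not a description of any real system.

```python
# Hypothetical illustration: all names and numbers are made up.

# Hour at which each car is expected to reach the checkpoint.
expected_hour = {"car_a": 10, "car_b": 12, "car_c": 14}

# Hour at which each car actually checked in (None = never checked in).
actual_hour = {"car_a": 10, "car_b": 15, "car_c": None}


def flag_off_track(expected, actual, tolerance_hours=1):
    """Return the cars that missed the checkpoint or arrived too late."""
    flagged = []
    for car, due in expected.items():
        arrived = actual.get(car)
        if arrived is None or arrived - due > tolerance_hours:
            flagged.append(car)
    return flagged


needs_attention = flag_off_track(expected_hour, actual_hour)
print(needs_attention)
```

In this made-up fleet, the late car and the missing car are flagged while the trip is still under way, which is the whole point: the navigator learns about the problem early enough to act on it.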

The effectiveness of a classroom teacher – like the effectiveness of our fleet navigator – depends critically on the availability of data about individual students.  In addition to the informal assessments teachers are doing constantly using their eyes and ears, formal assessments (including tests) are the checkpoints that provide much of the detailed data about how students are progressing, whether they are on track, and what corrective actions the teacher needs to take.

But why can't teachers just give Friday quizzes and find out all they need to know?

An assessment (quiz, exam, standardized test, etc.) is a measurement instrument - like a ruler, weight scale, or thermometer.  Unlike a ruler, however, which measures things that one can actually see, an assessment is a psychometric ruler - it measures knowledge and skills and other intangible entities of the mind that we can't actually see and that are, in fact, much harder to define than an attribute like length or width. 

Let's ask roughly the same question but in a different domain: "Why do we need to provide engineers and medical doctors with rulers, weight scales, and thermometers to do their work?  Why can't they just create their own to find out all they need to know to do their jobs?"  There are a number of reasons.  Consider calibration, for example. Back in the day people did make and use their own rulers and weights, and they came up with very different measures for the same thing - a major problem if you are paying by the ounce for something, or if you are building a bridge from two ends that should meet in the middle, or if a medical diagnosis depends on the value being measured (body temperature, for instance).

That's not quite the same as the educational scenario, though. Since we can't see the invisible knowledge constructs we are trying to measure in education, we'd have to actually ask "Why can't engineers and medical doctors just create their own measurement instruments while blindfolded and wearing heavy gloves so they can neither see nor feel the thing they are trying to measure?"

Imagine two math teachers in adjacent classrooms each make up their own 10-question math quiz for the same instructional unit.  I've drawn a couple of homemade rulers below to illustrate what that might look like. Obviously, there are major problems with these measurement instruments. Let's consider just a few of the more glaring ones.

Problems with consistency of measurements
Looking at the first ruler, for example, the gap between scores of 1 and 2 is small compared with the gap between scores of 2 and 3. The evenly spaced numbers mask underlying unevenness in student understanding, which can lead to invalid educational conclusions and actions.

Problems with interpreting scores
The second ruler is measuring two different dimensions and adding them together. That would be like adding someone's height in feet to their hair length in inches and reporting the resulting number as a score.  How are we to interpret such a score? As a common educational example: when we include printed word problems in our math quiz, a child who struggles with reading may be unable to complete any of them - not because they don't understand the math but because they can't fluently read the problems.  Their score doesn't reflect their math competency - it's a combined math plus reading score. 
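
A tiny Python sketch may help make the interpretation problem concrete. The two students and their numbers are hypothetical, and the simple addition in `combined_score` is an assumption chosen to mirror the "height plus hair length" ruler above, not a claim about how any real quiz is scored.

```python
# Hypothetical numbers, for illustration only.
students = {
    "student_a": {"math": 9, "reading": 3},
    "student_b": {"math": 3, "reading": 9},
}


def combined_score(skills):
    """Collapse two separate dimensions into a single reported number."""
    return skills["math"] + skills["reading"]


scores = {name: combined_score(skills) for name, skills in students.items()}
print(scores)
```

Both students end up with the same reported score even though their math competency is very different, so the single number cannot tell a teacher which of them needs help with math.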

Problems with comparing performance across students
Now compare the two rulers. How are we to compare the performance of students across the two math classes? For example, imagine a student in each class scores a 4 on their version of the quiz. What can we say about the performance of the two students? They earned the same score, but do they have the same math competency? Certainly not. If you look at the length marked by the 4 on each ruler, the second student evidently demonstrated about twice as much competency as the first. The numbers are not comparable, but they invite interpretation, evaluation, and decision-making as if they meant something specific and comparable. This is a very real problem that colleges face, for example, when looking at student transcripts. Looking at two applicants from different states, both having a high school GPA of 3.3, how are the admissions officers to compare them? They really can't. Love it or hate it, that's one reason the SAT is so widely used: unlike GPA, standardized tests like the SAT provide a common ruler for measuring student competency in specific domains like math and language, so the scores can be compared in meaningful ways across students, classes, and schools.
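
The two homemade rulers can also be written down as a short Python sketch. The tick positions are hypothetical numbers chosen to match the description above; they stand for where each quiz score would fall on a shared "true competency" scale that, in real life, nobody can observe directly.

```python
# Hypothetical tick positions; the list index is the quiz score.
ruler_one = [0.0, 1.0, 1.5, 3.0, 4.0, 4.5]   # unevenly spaced marks
ruler_two = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]  # evenly spaced, but a different unit


def underlying_length(ruler, score):
    """Look up the underlying length that a given score marks on a ruler."""
    return ruler[score]


# Consistency problem: one-point steps on the first ruler are not equal.
steps_one = [b - a for a, b in zip(ruler_one, ruler_one[1:])]

# Comparison problem: the same score marks different lengths on each ruler.
same_score = 4
length_one = underlying_length(ruler_one, same_score)
length_two = underlying_length(ruler_two, same_score)
ratio = length_two / length_one

print(steps_one)
print(length_one, length_two, ratio)
```

With these invented marks, the step from 1 to 2 on the first ruler is a third the size of the step from 2 to 3, and a score of 4 on the second ruler marks twice the length that a 4 marks on the first. A common, calibrated ruler removes both problems.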

So, is investment in educational assessments a huge waste of time and money?

There is certainly room for healthy debate about whether any particular assessment is valid and fair, how assessments should be administered to students, and how the assessment data should be used. But is it really reasonable to ask whether we can do entirely without educational assessment in schools? Or whether we should really care about the quality and validity of assessment data? Only if it doesn’t really matter what students are learning or when they are actually learning it. But if that’s the case then we have to ask ourselves this: why do we bother sending our children to formal schools with highly trained teachers in the first place? If we really don’t care what they are learning or when, wouldn’t it be better to send them to day care or adventure camp five days each week instead?

In fact, assessment is not a huge waste of time and money.  But without high quality assessment in place to inform effective instruction, large parts of the rest of the educational system might well be.

Postscript: A peek at the future of educational assessment

Now fast-forward from 1980. Imagine a world where teachers have the equivalent of GPS in the classroom - that is, continuous, detailed data on student learning plotted in relation to the curriculum goals, delivered in real-time, and actionable at a glance. Yet students never have to take tests. 

It may sound far-fetched, but it already exists. It's called "embedded assessment" and we've built such a system over at Native Brain to demonstrate conclusively that it's not only technically possible but that it can be made to work at scale in typical public school classrooms - today. (See the screenshot below.)

As I've said before in this blog, we have the know-how right now to make mainstream public school education much, much better than it currently is.  The same way that GPS suddenly transformed the way we drive, technology in the classroom can transform the way teachers teach and the way students learn. There is definitely a way. The question is, do we have the will to make it happen?

(Note: As of the date of this posting, the Native Numbers iPad math curriculum and accompanying GPS-like instructional dashboard are available at no cost to parents and teachers.)

Check it out. Send us your thoughts. Share.