More Questions Than Answers on Standardized Testing

As a parent of a rising fifth-grader in Clarke County public schools, I’ve tried to get to the bottom of the role testing plays in public education. What I’ve learned is that the more questions I ask, the more questions I have. I think it’s safe to say that Georgia Milestones is attempting to do too many things and therefore doesn’t do any of them well. I propose that we look at things differently and consider assessment tools currently available and how they might fit into a workable solution.

I understand the need for accountability. As a parent I want to know whether my daughter’s teachers are challenging her in the way I expect and hope they will. As a taxpayer I hope our educators are spending our tax monies effectively. As a member of society I hope our public school system presents opportunities to every student regardless of ability, ethnicity or socioeconomic background.

Testing measures student achievement in a variety of ways. Norm-referenced assessments allow for comparability among pupils, schools, districts and states. Criterion-referenced tests are meant to show whether students are learning the standards, which brought the Common Core controversy into the mix. But we can’t even decide on the standards, much less write tests that accurately gauge students’ mastery of the material. We are currently using the Georgia Standards of Excellence, which replaced Common Core for the 2015–16 academic year, the fourth change in standards in the past decade. And, as the catch-all, we want to measure student growth with an eye toward narrowing the achievement gap, as well as rating teachers’ and principals’ effectiveness.

Our legislators have put our money squarely behind Georgia Milestones to accomplish these objectives (ignoring for now the “need” to assess music, art and physical education teachers). Milestones are state-mandated tests given annually from third through 12th grades. Testing takes place over two weeks, with at least five school days devoted primarily to test administration.

Reaction to the “high stakes” nature of the tests varies from school to school and pupil to pupil, but anecdotal evidence suggests that the stress created by this type of testing environment is probably greatest in middle school, and that such an environment is totally inappropriate for kindergarten through fifth grade. The tests are focused on criteria (standards) but contain a few questions intended to serve the norm-referenced function. This dual purpose has led to confusion among parents whose children had subpar criterion scores but a high percentile ranking, or vice versa.

Unfortunately, there have been a number of problems in simply administering the tests. Results from 2014–15 Milestones were not used in student pass/retention decisions due to myriad issues with McGraw Hill’s testing platform. McGraw Hill was subsequently “released” from its $120 million, five-year contract after just one year. While the 2015–16 tests had fewer problems, third-, fifth- and eighth-grade scores were again not used for pass/retention decisions due to technical issues with the testing platform. Furthermore, End of Course grades did not meet the two-week turnaround the state promised, delaying final grades for some high-school seniors.

Why are we reinventing the wheel? There are a variety of tests currently or formerly used in the Clarke County School District that provide a norm-referenced assessment.

• Scantron Performance Series Assessment is currently available to all CCSD schools; not all opt to use it.

• Iowa Test of Basic Skills. CCSD left this testing platform under No Child Left Behind, though some schools still use it in the gifted program.

• Measures of Academic Progress are used in Oconee County schools.

These tests can be administered online in one day and multiple times in the school year. They are usually given with little fanfare and no “extra” preparation. (Read: less stress and a more realistic picture of where a given child stands in terms of basic knowledge.) The tests provide a snapshot of how each child compares on a norm-referenced scale: we can compare students from Athens to Akron, Ohio, or from Gaines Elementary to Barrow. Proven reliable, each of these tests can be used to measure growth and offers the further advantage of consistency. As with any test, it’s still one test on one day.

Regarding the use of tests to evaluate teachers, the American Statistical Association asserts that over 80 percent of the variation in students’ test scores stems from factors outside the teacher’s control. And even though the tests from the last two years have been dismissed because of widespread system failures, the scores will still be used as baselines for teacher evaluation. Would you enter the teaching profession knowing that these test scores would count for 30 percent of your annual evaluation?

Setting standards and devising assessments we can agree are effective is where we should concentrate our efforts and resources, and that effort should begin with teachers. We deserve an honest discussion of what tests work and how those results can and should be used. We should quit using tests designed for so many purposes that none are accomplished effectively. We need teachers, parents and administrators to speak out and tell us what works and why; what doesn’t and why not. And we must demand that our legislators quit trying to reduce a complex relationship between student and teacher to a single test score, using a test that doesn’t do anything very well.