
Assessing Deeper Learning: The Ohio Performance Assessment Pilot Project



The Alliance for Excellent Education Invites You to Attend a Webinar on

Assessing Deeper Learning: The Ohio Performance Assessment Pilot Project

Mariana Haynes, PhD, Senior Fellow, Alliance for Excellent Education
Stuart Kahl, PhD, Founding Principal, Measured Progress
Lauren Monowar-Jones, PhD, Program Coordinator of Performance Assessment, Office of Curriculum and Assessment, Ohio Department of Education

Please join the Alliance for Excellent Education for a webinar on using curriculum-embedded performance measures to help students learn and demonstrate the deeper learning competencies they need for college and a career. The webinar will focus on the Ohio Performance Assessment Pilot Project (OPAPP), which includes a system of learning and assessment tasks aligned with the Common Core State Standards. Ohio has taken a unique approach to the pilot by including sustained, collaborative professional learning throughout all components of the program, including formative assessment to support student learning, technical training, and writing and scoring of assessment tasks.

The panelists will explore the use of performance tasks to elicit and assess complex thinking and communication skills. They will examine what this means for designing curricula and varied structures for professional learning to provide teachers with the knowledge and skills to help all students attain high-level cognitive and intrapersonal skills. Panelists will also address questions submitted by webinar viewers from across the country.

Register and submit questions for the webinar using the registration form below. After registering, you will receive an email confirmation. Please check your email settings to be sure they are set to receive emails from

Please direct questions concerning the webinar to

If you are unable to watch the webinar live, an archived version will be available at usually one or two days after the event airs.


Additional Resources:

I’m Mariana Haynes, senior fellow for the Alliance for Excellent Education, a non-profit policy and advocacy organization based here in Washington, D.C.  We are delighted to have you join us for today’s webinar on using curriculum-embedded performance assessments to help students learn and demonstrate deeper learning competencies.  These competencies include a deep understanding of content and the ability to use that knowledge to think critically and solve problems.  It also means that students will need the ability to communicate effectively using a variety of media, the ability to collaborate with their peers, a capacity to reflect on one’s learning, and the appropriate mindsets that foster learning.


So we’re going to learn about the Ohio Performance Assessment Pilot Project, or OPAPP, and how it is designed to elicit and assess deeper learning competencies.  Ohio’s taken a unique approach to piloting curriculum-embedded performance assessments.  It includes a system of both learning tasks for formative purposes, and assessment tasks.  These are aligned with the common core state standards and the next generation science standards.  It also integrates sustained collaborative professional learning throughout all components of the program.


Our distinguished guests will discuss the design and the implementation of curriculum-embedded performance assessments that capture important dimensions of student learning.  These formative learning tasks afford powerful feedback to teachers and students, providing information about where students are in their learning and where they need to be relative to specific learning goals.


So, let’s meet our guests.  With us in our studio today is Doctor Stuart Kahl.  He’s founding principal and former CEO of Measured Progress, an educational testing company working with over 20 states on their assessment programs.  Stuart has over 35 years of experience in large-scale assessment.  His current interests include assessment literacy, formative assessment, and curriculum-embedded performance assessment.  In some quarters, he is considered the conscience of the testing industry.  He was recognized by the Association of Test Publishers with the 2010 ATP Award for Professional Contributions and Service to Testing.


Next in our studio is Doctor Lauren Monowar-Jones, the program coordinator of performance assessment with the Office of Curriculum and Assessment at the Ohio Department of Education.  Lauren holds graduate degrees in astronomy and physics, and has worked at the Ohio Department of Education for the past seven years as a science consultant in the offices of both curriculum and assessment.  She is an adjunct instructor at Columbus State Community College, where she teaches online introductory astronomy courses.


If you would like to ask questions of our webinar guests, please do so using the form below this video window, and we will turn to your questions from time to time.  Also, if you’re on Twitter, we encourage you to tweet about this webinar using the deeper learning hashtag that you’ll see in the left corner of the video window.  So let me set the stage for today’s webinar: profound national and global changes have prompted educators to rethink the competencies students need and the assessments to measure them.


The growing concern about how to significantly improve the quality of education for all students is generating healthy debate about the use of meaningful assessments that capture outcomes beyond simple academic content knowledge.  Criticisms of traditional assessments have pointed out that much of testing seems to have little to do with learning, and looking at assessment practices in this country over the last decade makes it all too easy to say that the critics may have a point.  Richard Elmore, noted professor at the Harvard Graduate School of Education, contends that the real accountability system is in the tasks that students are asked to do.  They must know not only what they are expected to do, but also how they are expected to do it, and what knowledge and skills they need in order to learn it.


As a result, students become more self-reliant in planning how to approach a learning task and in gauging their understanding and progress.  This urgent need to change classroom teaching and learning was underscored by the December release of the Program for International Student Assessment, known as PISA.  PISA is a test of reading, mathematics, and science given every three years to 15-year-olds in the United States and in more than 65 countries worldwide.  Since 2009, the proportion of top performers in the United States has declined in reading and math.  US rankings fell from 25th to 26th in math, and from 14th to 17th in reading.  It is important to look beyond the rankings to examine what can be learned from high-performing systems regarding how they teach and assess students’ deeper learning competencies.


First, understanding what students know and can do is essential to effective teaching.  Second, an abundance of literature shows that teachers’ use of high-quality classroom formative assessment, used to discover what a student does and does not understand, produces some of the largest effects on student achievement reported in the educational literature.  Yet the majority of teachers, both novice and veteran, find evaluating and responding to students’ learning one of the most challenging elements of teaching.


So let’s turn to our guests to learn how states can provide teachers with the tools and professional learning they need to strengthen the connection between curriculum-embedded performance assessment, teaching practices, and student learning.  So, first we’re gonna turn to Stuart.  Stuart, Measured Progress has a long history in the world of performance assessment.  Would you walk us through why states are moving towards the use of these kinds of assessments, and how their design can promote higher-order thinking skills?


Stuart Kahl:

Okay, thank you, Mariana.  I appreciate being here, and this is a topic, needless to say, that’s dear to my heart, and I think it’s a very important one, also.  I see my job, by the way, as being an introduction to the person next to me: providing a little bit of history and background in performance assessment and curriculum-embedded performance assessment, because the issues I’ll talk about around that topic are addressed by the program that Lauren has in place, and that’s really important.


Let me start by talking about the call for deeper learning.  You ask why the states are moving in that direction.  Well, it’s easy.  There – there’s all kinds of pressures: there’s political, business, education leaders, there’re the standards – and I don’t have to go through these in great detail – there’s the common core, there’re the science standards, and so on that all call for deeper learning.  And so I think that’s certainly reason number one: there’s a demand for it.  The reason number two: it’s just good practice.  It’s something that is not new.  We’ve been here before, we’ve seen the need for having students have higher-order skills that they don’t have to the degree they need, basically, and so it’s good practice.


I should say we get diverted sometimes.  Different programs lead us away from this kind of learning – the efficiency of testing with the tremendous demand for assessment in multiple grades and subjects for high-stakes accountability led us to move more towards this – the most efficient type of testing, which really doesn’t tap higher-order thinking skills as well.  And so it just seems like it’s a time that we have to address the larger goals of education; and by that I mean, we know that a kid’s ability to add two and three-digit numbers with and without regrouping is not a major goal of education.


Being able to apply skills like that is, and it seems to me we should be dealing with those more directly, both in instruction and assessment.  What do we mean by deeper learning?  It’s not too hard to bring you a definition there.  There are lots of them, but the point is, the NRC, the National Research Council, had a nice way of breaking down 21st-century skills into cognitive, interpersonal, and intrapersonal skills.  Deeper learning really fits in that cognitive skill category.


Though, when we talk about curriculum-embedded performance assessment, clearly interpersonal and intrapersonal skills can play a role when the kids are engaged in the activities that are part of curriculum-embedded performance assessment.  So the definition, you can see, of deeper learning that NRC has come up with stresses the application of this foundational knowledge and these skills to real problems, and that’s the basis of it.  There’re two things that I and others in my office feel are game changers.


And, it’s interesting, I pick these, because these are things that are there at the classroom level.  These are things that – it’s where the action is; it’s where we want to impact things in terms of reform and so on.  We wanna see teachers and students, in many cases, spending their time differently.  And formative assessment is the process – it’s an instructional process.  It has many steps in it, one of which is collecting evidence of student learning, and sometimes I’m concerned that my colleagues in the industry have pirated the term and people associate formative assessment with a test, and that’s not the case; it’s a process.


As a process, that means that if we want improvement in it, that requires professional development, not the purchase of tools.  And if professional development is effective, then it’s ongoing, on the job, and collaborative; and that certainly talks to the needs in terms of teachers and educators.  Curriculum-embedded performance assessment, another one like formative assessment, is something that if it’s done right, it must mean you’re doing a lot of other things right.  My example of formative assessment was you must be doing professional development right.


Well, in the case of performance assessment, I think it’s a similar situation.  It requires an awful lot of other things to be going well in terms of support within the educational environment and so on, for example.  A definition of performance assessment – you can see all kinds of definitions.  The one we’re thinking of in this day and age is more engaged activities that lead to products, or performances, or presentations that are scorable or something that can be evaluated for either formative or summative purposes.  Curriculum-embedded performance assessment, obviously, is something that is not always at the end of something, but rather it’s ongoing – it’s during instruction.


I like to think of the curriculum-embedded performance assessment as instructional units, and instructional units, there are a series of activities – multiple activities that are both learning and evidence-gathering activities.  Some of them can lead to products that might be used for formative purposes; some of them might be used for summative purposes.  And so that’s kind of my view on what curriculum-embedded performance assessment is, is really – these are instructional units.  Where are things happening?  I – there’s a lot of places where good performance assessment is going on; I think mostly at the school and district level.


There are states engaged in this, and more and more are moving towards it; they’re still developmental.  I don’t have to talk about OPAPP, because you’re going to hear a lot about that, but that, to me, is the model; it’s just a wonderful program.  Massachusetts is in the process of developing something that might be similar, but they’re really just beginning.  At the school and district consortia level, there’s the Quality Performance Assessment initiative from the Center for Collaborative Education in Boston, which I think is a collaboration of schools in three states that have truly made a tremendous commitment, not just to dabble in performance assessment, but to make it a major focus throughout the school year.


And the New York Performance Standards Consortium is another; that’s a collaboration of, I believe, 28 schools in New York, and you can go on their websites and see what kinds of things they are up to.  I’d like to spend the last two or three minutes talking about issues in performance assessment in the authentic assessment era, call it the 1990s.  This was the highlight of my career, because I thought there was some wonderful stuff going on.  But at that time, we didn’t do things as well as we would do them now, because we’ve learned a lot, and one of the issues had to do with content quality.


A lot of the programs back then might have been portfolio assessments on a large scale, and a lot of that was left up to the teachers to decide, without a whole lot of training and background, what goes in those portfolios.  And you saw – we saw worksheets, we saw test scores, we saw all kinds of things, and that was a problem.  And the second problem that related to the content was, there wasn’t that concern that there is now about alignment of tests to curriculum to standards, and so on –

So a lot of those activities were fun and games, but not really tackling important concepts and skills, and there just wasn’t the attention to alignment that there is nowadays.  Efficiency, again: back then it was expensive, time-consuming, all of that.  Well, as you’ll see in the next presentation, that’s dealt with, because if it’s part of the instruction, if it’s instructional units, it’s efficient in that way.  And then there were lots of misconceptions, and these probably had a significant impact on the status of performance assessment back around the turn of the century.  I’ll talk about the misconceptions briefly, and then I’ll end there.

Too time-consuming.  They were time-consuming, and that was a problem.  But if it’s curriculum-embedded, these are the lessons people will be using.  They’re not additive, as this next concern mentions: there was concern that they were an additional commitment disconnected from the required curriculum.  Well, that’s addressed, too, with the new curriculum-embedded performance assessment systems.

Less reliable than multiple choice.  I could talk for hours on this with all kinds of explanations, and this is something that drives me crazy.  It’s just not true.  I can get the same reliability with eight to ten constructed-response questions as with a 50-point multiple-choice test.  That’s fact, easily demonstrated, and it’s in the technical manuals of all the state assessments that use both modes of assessment.  So it’s just myth, and yet you hear that concern.  A lot of it has to do with the fact that human scoring is not perfect.  Well, when you guess the answer to a multiple-choice question correctly and get a point, you got a point for not knowing the answer.  There are all kinds of reasons why a multiple-choice item, item for item, is not as reliable as a more extended task that you might get in a non-selected-response format.  So basically that’s just myth.

Human scoring is too subjective.  The same thing.  Human scoring doesn’t have to be perfect.  It’s a question of how many of these things you need to get the reliability you need, and how good a sampling of the domain you have.  The sampling of the domain is another issue, and that’s why many states have mixed models: a performance component as well as a selected-response component.  So basically, I mention these now because we’re at a critical time for this type of assessment, and I think it could go either direction, because these are still issues people are going to bring up.
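Stuart’s reliability claim can be illustrated with the Spearman-Brown prophecy formula from classical test theory, which predicts how reliability changes as a test gets longer. The per-item reliability values below are purely hypothetical (not OPAPP or Measured Progress data), chosen only to show how a handful of extended constructed-response tasks can approach the reliability of a much longer multiple-choice test:

```python
def spearman_brown(reliability, length_factor):
    """Predict the reliability of a test whose length is scaled by length_factor,
    per the Spearman-Brown prophecy formula."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical per-item reliabilities: a single extended constructed-response
# task carries far more score information than a single multiple-choice item.
r_one_cr = 0.45   # assumed reliability of one constructed-response task
r_one_mc = 0.10   # assumed reliability of one multiple-choice item

r_ten_cr = spearman_brown(r_one_cr, 10)    # ten constructed-response tasks
r_fifty_mc = spearman_brown(r_one_mc, 50)  # fifty multiple-choice items

print(round(r_ten_cr, 2), round(r_fifty_mc, 2))  # → 0.89 0.85
```

Under these assumed values, ten constructed-response tasks slightly exceed the predicted reliability of a 50-item multiple-choice test, which is the shape of the comparison Stuart describes.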


Mariana Haynes:

Thank you so much, Stuart.  It clearly seems that all of us, whether practitioners, district and state leaders, or policymakers, have a long way to go in really understanding assessment well, and we need to work on our own assessment literacy to really do justice to the kinds of professional learning and the kinds of tools that students need to achieve these very ambitious standards.  So it’s something for all of us to take to heart: we need to educate ourselves around many of these issues, which do have a certain amount of complexity to them.

So as I mentioned at the top, I’m really delighted, Lauren, that you’re with us today as the head of this very exciting pilot that Stuart has lauded.  Before we start, we have this video that you were so gracious to provide.  It will give us an opportunity to hear from those doing the frontline implementation about the impact of the pilot on their teaching practice, which is fundamental to all of this.  We’re going to roll that now, and then we’ll come back to you, Lauren.


Lauren, thank you.  How has professional learning been integrated in the pilot, and how has it shaped instructional practices?


Lauren Monowar-Jones:

Thanks for having me.  I appreciate being here.  The Ohio Performance Assessment Pilot Project is a pilot that’s come a long way.  We started in 2008 with foundation money, working with Stanford on curriculum-embedded performance assessments.  At that time, we were really just interested in learning whether performance assessment could work in Ohio and how it might pan out.  Toward the end of that funding cycle, about the time Ohio was awarded its Race to the Top grant, we came up with a different model that we’d like to try.  I’ll talk about that in a minute.

Before I get to the model, I’d like to tell you more about the pilot and where we are so far.  One of my colleagues describes the project as a large-scale assessment in miniature.  It has all of the elements of a large-scale assessment, with external review committees, a vendor that writes items, and experts in the department that review them.  Just to give you an idea of the scope of the project: we’ve engaged teachers in grades three through 12.  We’ve written tasks in math, English language arts, science, social studies, and career-tech pathways.  We’ve piloted 140 learning tasks and 148 assessment tasks; I’ll explain the difference between the two in a moment.  We’ve trained 686 teachers to provide instruction using the project model, and trained as many teachers to score these tasks.  We’ve scored 23,438 student responses, always scoring them at least twice.  We’ve developed teacher training materials for in-person sessions and for electronic on-demand sessions.  And we’ve documented the project using an outside contractor, and used a videographer to get the lovely video of the teachers.  Generally, we’ve experimented with all the aspects of a large-scale assessment system.

The task dyad learning system is the model we’re using, and there’s a schematic of it in this slide.  The "dyad" doesn’t stand for anything; I get asked that a lot.  It’s a pair of closely coupled tasks, comprised of an assessment task that’s used to measure performance against the content standards and a learning task that’s used to provide a context for the student to learn the skills that will be assessed.  The learning task is comprised of instruction, often direct instruction; student practice, often using the same software interface as the assessment; and the teaching practice of making observations and giving additional instruction or feedback, to either the collective of students or to individual students if necessary.


The intent is to ensure that all students have had the opportunity to learn that which will be assessed.  The objective of the model is not to catch the student not knowing during the assessment, but to catch the student showing that he or she has the skills implied by the content standard and instructed with the aid of the learning task.  To reiterate: the relationship between the learning task and the assessment task is like the relationship between what you do with a driver’s permit and a driver’s license.

So that analogy is one of my favorite analogies.  You go to the motor vehicle bureau (we call it the BMV in Ohio) and take a test on the rules of the road.  Once you pass the test, showing that you know the rules of the road, you’re given a permit with which you should practice driving.  If you have brave parents, they will go driving with you, and usually you will practice things that will not at all be on the driver’s test.  In Ohio, the driving test is on a closed course; nonetheless, our students don’t practice on a closed course.  They practice on real roads, primarily to prepare them for what the real application of that practice will be.  That analogy is similar to the learning task and assessment task relationship.  In the learning task, students will practice things that may not be on the assessment task but that will make them good drivers, so to speak, and then the assessment task is going to assess something that they have practiced.


So let me give you a little bit more information about the scope of the pilot, which is massive, to say the least.  I’ll talk briefly about each component.  So far, we’ve successfully developed learning and assessment tasks.  Our vendor, Measured Progress, did this at first.  Now we have teachers writing the learning tasks and coaches writing the assessment tasks for our middle school pilot, with a pretty good level of success.  We’ve piloted dyads with teachers in grades 3 through 12 in the four content areas and career-tech pathways.  Teachers who have piloted have also scored their own assessment tasks in a double-blind situation, so they don’t know if they’re scoring their own students’ work or someone else’s students’ work.
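The double-blind, double-scored setup Lauren describes can be sketched in a few lines. This is a hypothetical illustration, not OPAPP’s actual system: response IDs are anonymized, and each response is randomly assigned to two distinct scorers, so a teacher cannot tell whether a given response came from his or her own students.

```python
import random

def assign_double_blind(response_ids, scorer_ids, seed=0):
    """Randomly assign each anonymized response to two distinct scorers,
    so every response is scored twice and no scorer knows whose work it is."""
    rng = random.Random(seed)
    return {rid: rng.sample(scorer_ids, 2) for rid in response_ids}

# Hypothetical anonymized responses and teacher-scorers.
responses = [f"resp-{i}" for i in range(6)]
scorers = ["teacher-A", "teacher-B", "teacher-C", "teacher-D"]

for rid, pair in assign_double_blind(responses, scorers).items():
    print(rid, pair)  # each response maps to two different scorers
```

Sampling without replacement guarantees the two scorers for a response are always distinct, which is what makes the "scored at least twice" claim auditable.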


One of the initial features of the grant-funded project was the use of teachers as readers.  When teachers are taught to score, that experience sharpens a teacher’s understanding of the item, the scoring process, and the content standards all at once.  Moreover, when teachers score students taught by others, a teacher might see something in a student’s response that can be used with his or her own students next session.  One of the early scoring trainings and scorings of student responses took so long that it was projected it might take days of teacher time.  We have worked to refine the assessment tasks and the rubrics to reduce the time to score, and the process now takes about one day to complete for any group of teachers.  We’ve also field-tested dyads that were piloted and revised with the teachers who did the online training for implementation.  Those field-test teachers also scored their own students’ work and the work of other students in a scoring event that took only one day to complete.  We have developed a set of online professional learning modules to prepare teachers to implement the dyads in their classrooms, and we’ve modified these based on user feedback.  We’re in our third trial this month.

And we’ve collected a lot of very interesting data.  We did some experiments.  We collected data that should tell us whether it matters if you teach a learning task or not.  We’ve collected data to tell us whether it matters if you are trained to score in a face-to-face scenario or online.  We’ve collected data to tell us whether teachers are biased for or against their own students when they score.  And we’re also looking at performance of students on dyads compared to their performance on the state assessments.


Finally, we’re scaling up.  We have a model and a vendor who’s going to help us implement the scaled-up version of this model, and our plan is to be in full operational mode by the spring of 2015.


So, lessons that we’ve learned along the way.  Quite a few, actually.


The first lesson is not anything new: face-to-face professional learning is clearly most effective, much more effective than online professional learning.  Online learning can be effective even though it may not be the most effective; it can be effective if it includes group work.  We learned that after we gave the first group of teachers free rein to design their own implementation of the online learning modules, and found that when they did not work together as groups, they ran into technical issues that derailed many teachers from being able to complete the online modules.  Also, the online modules initially did not include many optional sections.  We found it was important in the online professional learning to give learners the flexibility to choose what they need most and to differentiate, kind of what we ask teachers to do all the time.


A second lesson learned, which also is probably not a big surprise, is that technology is a very big challenge.  One of the things we noticed, besides teachers not being comfortable with technology, was that teachers and technology specialists were not communicating very well with one another.  And there is, of course, the limitation that a lot of K-12 institutions are dealing with, which is how to provide access for all students.  It could be that they’re limited to a computer lab, or that they’re insistent on protecting students from being able to sort through information.  Both of those provide barriers, or interesting obstacles, for learning to overcome.


A third lesson learned is the light-bulb moment.  One of the things I noticed (and I haven’t figured out how to replicate this any faster) is that the teachers involved in the project really understand it only after they have scored their students’ responses to the assessment task.  Up until that moment, no matter how many times I try to tell them the answers to the questions, they’re unwilling to hear it.  One of my friends says that one of the failures of teaching is providing answers to questions that have not yet been asked, and I think that’s the problem.  They don’t ask the questions, "What am I supposed to do with this learning task?" or "How does this task relate to the assessment task?", in a way where they’re willing to receive the answer, until they see what happens with the student responses.  What I will say is that when teachers score student work, teaching improves quickly.  The next time they do a learning task, students do significantly better, because the teachers know how to teach it.  They understand what’s important about the learning task right away.


The fourth lesson learned is that students love technology.  Again, not a big surprise.  But another issue is that teachers need some help in learning how to use technology effectively with students.  It’s not just that teachers aren’t using technology; I think it’s that they’re using very simple tools that aren’t very effective in helping students learn.  So we need to help teachers find these wonderful resources and use them effectively.


And my last lesson learned is that curriculum-embedded performance assessment requires best practices to be effective.  For optimal implementation, it’s really important that teachers are fully engaged, really using all of the resources that they have in their toolboxes as teachers to engage students.  If they’re not using all of those resources (formative instructional techniques, feedback, re-engagement), then the implementation will not be effective.  Teachers need training that supports using these practices, and time to be reflective.  We haven’t been giving teachers enough time to think about it.  In this pilot we’ve provided teachers with a lot of that time, and in real life, I think that’s something we need to think about changing.


Some challenges we ran into were around online delivery of the tasks.  That’s a new challenge, something that involves many layers: not just teachers, but technology and district contact information going back and forth.  It’s just difficult; districts are not used to getting these types of pieces of information to the right resources, and that’s coming with all of the new online testing.  I think it’s good that we learned what the challenges were and how much we need to help districts do that.


Another issue is that teachers are reluctant to teach without knowing all the answers, especially in high school.  We’ve struggled with helping teachers figure out how to be comfortable in that uncomfortable situation.  Also, keeping the tasks, goals, and rubrics aligned to one another is a challenge: teachers who are writing the learning tasks now sometimes get great ideas for activities that don’t quite match up with their stated learning goal, so we end up with things that are not quite meshing together.  That’s something I’ve worked on a lot, and have had a lot of fun doing: I’ve developed a lot of interactive activities to try to help teachers overcome some of these pitfalls that they keep running into.


Finally, the model that the state has adopted to continue this work is that we’re going to provide these learning systems for the untested grades in science and social studies between grades three and eight.  For science, Ohio tests in fifth and eighth grade; in the other years we’ll be offering the task dyad learning system.  Our model is that teachers will create the learning task shells at summer institutes, vendors will finish off the learning tasks and create the associated assessment tasks, and then the teachers, again, will be involved in scoring assessment tasks, with moderators to train them.  This is the model we’ve been using in the pilot, and there are a lot of benefits to having teachers involved at this level.  I’ve seen it in other states, and I’d like to bring it to Ohio, so I’m excited about that aspect of it.


>> Thank you very much.


>> You're welcome.


Haynes: >> I just wanted to ask a couple of questions. You made a decision early on about using an online platform, right? Even though there are some of these challenges around using technology. Can you talk about your decision to do that?


Jones: >> Well, I saw that we were moving toward an online assessment system. At the time of the RFP, we hadn't chosen which consortium we were going to go with; we were leaning toward Smarter Balanced, but both consortia were going to be offering an online assessment system. I felt it made sense to try out some of the online delivery pieces, just to work out the level-one and level-two bugs. And we learned so much because of that; I'm really glad we did it. Eventually everyone got on board, but there was a lot of skepticism that we would end up with a vendor who could provide all of that plus an online delivery system. Thank you, Measured Progress.

>> Measured Progress does all of that.


Haynes: >> As we begin our panel discussion, I want to remind viewers that you can ask questions via the box below the video window. So Stuart, kind of go through this again: how can curriculum-embedded performance assessments help teachers tackle two big problems now? One is increasing the rigor of what students are asked to do overall, with all students given these new expectations for what they need to know and do. The other is addressing gaps in students' learning and performance.


Kahl: >> Well, the first part of that is the rigor. That's why we're moving there. We know that the selected-response format has problems addressing some of the skills we're most concerned about, and that's why we're moving to performance assessment: to engage students in bigger types of problems, in activities that involve real-world situations where they have to apply an awful lot of different skills, as opposed to everything in isolation. That's number one. Number two is higher-order skills. That's also why we've moved to performance assessment. So the rigor is kind of there; that's what these assessments are intended to do. The gaps are interesting. I do believe performance assessment can do a lot to deal with learning gaps, for two reasons. Number one, we talked about the curriculum-embedded performance assessments as being units, and we saw from the examples that there are learning tasks and assessment tasks. The learning tasks are in many cases generating evidence of learning during instruction, and that's the evidence-gathering step of the formative assessment process. And –


>> What does that allow you to do? Once you have that evidence, once –


>> Once you have the evidence, the other steps of the process are basically providing feedback (rich, effective, descriptive feedback, not "six right out of eight") and adjusting instruction and learning activities to fill or close those learning gaps. Now, when you talk about gaps in learning and performance across different populations, the research shows clearly that effective formative assessment, which is part of what we're talking about here, has the biggest benefit for students who are disadvantaged or struggling. And I might also add that when you talk about formative assessment, about assessment for learning, a big part of the rationale has to do with motivation to learn. I think there's no question that when kids engage in these activities for formative purposes, where the work may not be graded, their focus shifts. They're motivated to learn, as opposed to getting the grade and being satisfied with 80% correct. To me, that's the other aspect of it.


>> And maybe you can pick up on this same question, and also talk about grading in relation to these formative tasks.


Jones: >> Yeah. For the learning tasks, we deliberately made it impossible for teachers to assign grades. We wanted to make sure they weren't going to inadvertently start using them for some other kind of summative grading system. It was a challenge; the teachers didn't like that at first. But they began to see the results quickly, as you heard in the video we showed at the beginning, and many of them, after trying it, recognized the benefits. That was a very positive thing. Another piece is that the learning tasks in the task dyad learning system include an extensive teacher's guide that gets into formative instructional techniques appropriate for common misconceptions. We use Ohio's model curriculum for the Common Core standards, and our own state standards in science and social studies, to help identify areas that might be complicated to teach and strategies to get around the misconceptions that students often have.


>> Thank you. I want to look at some of the skills and practices teachers need to have as part of this formative process. One of them you mentioned; maybe you can elaborate on it: reengagement. What is that about? What does that mean?


>> Reengagement is a process of reteaching. But the idea is to involve the entire class, even the students who really got it, in the process of reteaching. Studies have shown that type of reengagement can really help all the learners learn. What it does fundamentally for performance assessment is help all of the learners develop a common understanding of what the expectation of performance is. It's difficult for students sometimes to use a rubric or a question to understand what the teacher is expecting of them; sometimes providing examples is helpful. Reengagement means taking actual student work, showing it to the class (anonymously, of course), and deconstructing some of the responses: this response is good, why is it good? What's not good about this response? Every response has good and bad things about it. A great YouTube video about that is "My Favorite No."


>> I'll check that out.


>> Yes.


>> When you were talking about motivation and engagement: those kinds of processes reengage students, which I guess is the point of the terminology. But it also takes students to another place in terms of their learning.


Jones: >> Exactly. Listen, there's another aspect of it, too: formative assessment is a process that has multiple steps involved, and part of it is the use of other students as resources. To me, when we talk about reform, we talk about the way teachers and students spend their time. And time keeps coming up; I keep hearing, "I don't have time for the assessment and still teach." You've got to change that mode of instruction where the teacher is the source of all information presented to the students, and then you test and move on. The idea is that students on their own can do a lot to help each other. In some ways that should free up a teacher's time to be a facilitator, to be moderating things, to be going from group to group, and doing a variety of things in terms of gathering evidence. To me, that's another skill, almost a management skill –

>> Changing the dynamic so that the learners are much more active in accessing their peers as resources and –

>> By the way, making the change is not easy. This is a culture shift, a mindset shift. The first time you try some of this, it may not work so well. The first time you don't grade something for the kids, you might find –

>> It makes you nervous.

>> – that it doesn't work, and you quit. You know, that's not the issue. You want the kids to know: this isn't graded, but I know when it is graded; if I can do this now, I'll be fine then. So that's a critical part of formative assessment, too.


Haynes: >> A challenge is for teachers to feel comfortable when they don't have all the answers, or when they don't know exactly where a task will take them. Can you say more about that?


>> Yeah. I think it's especially difficult in the high school grades, where our teachers are expected to be content experts, so asking a question that you don't know the answer to is hard. A lot of times I try to get teachers to start out by pretending they don't know the answer for a little bit; sometimes that's more comfortable for them. The point is that what students need to learn is that there won't always be somebody with an answer to the question you have to ask. Providing them the opportunity to have an experience where there is no one in the room who has "the answer" is a really important thing.


Haynes: >> We have a question from a viewer about where they can see examples of these tasks.


Jones: >> There's a link on the OPAPP website; go to www.educatio, and at the bottom of that page there is a link to a resource called iLearnOhio, with a username and password that's open to anyone to use. I don't remember them off the top of my head, but you can get in and see example tasks. They're learning tasks only. There's one for high school biology, one for Algebra I, one for high school social studies (I believe a history task), and a 10th-grade-level task.


Haynes: >> A question from Sunny from Denver, who asks: were there any state policies adopted to enable this process, or is there consideration that they're going to have to generate the policy frame for any of this work?


>> Right. The pilot was just a part of the Race to the Top grant; that was what enabled the pilot to happen. Moving forward, the state model we've adopted is part of our non-summative assessment plan. At this point in time, I'm not aware of any policies that have been developed to establish whether teachers will have to use these or not. The fact that we're offering them is unique: not many states are offering their teachers formative assessments that mimic what the summative assessments will look like.

Haynes: >> Some of the questions we're getting are about how they're going to integrate the performance assessments with the accountability system.

Jones: >> For the learning system, those are going to be outside of the summative system. They won't be part of accountability at this point in time, at least not on the federal level. We are offering, though, end-of-year tests. We are a PARCC state, so our English and mathematics exams will be PARCC assessments, which include a performance-based assessment and an end-of-year assessment that will be combined into one score, and we're mimicking those assessments for science and social studies.


>> A lot of overlap.


>> A lot of overlap, yes.


Haynes: >> So there are a couple of questions about costs. Taylor from D.C. and Circy from Iowa asked about the estimated costs of the pilot, and what that might look like going forward as you're trying to scale.


>> We did a cost estimate, and I was surprised that it was not very high per student, mainly because our model is not to have hundreds of these performance assessments; they cost a lot to develop. But per student, and Ohio has about 1.8 million students, the cost is pretty low. Less than a dollar per student; I believe more like 25 cents per student –
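The arithmetic behind that estimate can be sketched directly. This is a back-of-the-envelope check using the two figures quoted in the discussion (roughly 1.8 million students, about 25 cents each); the statewide total is simply their product:

```python
# Back-of-the-envelope cost check using the figures quoted above.
students = 1_800_000          # approximate Ohio enrollment
cost_per_student = 0.25       # dollars: the "more like 25 cents" estimate

total_cost = students * cost_per_student
print(f"${total_cost:,.0f}")  # prints "$450,000"
```

Even doubling the per-student figure to the "less than a dollar" ceiling keeps the statewide total under $2 million, which is what makes the one-dyad-per-year model inexpensive at scale.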


>> Really? There's a perception that this is very expensive, expensive to develop.

Jones: >> It can be if you want to have ten every year; that's more expensive. But if we're looking at one dyad each year for the full grade level, it's not that expensive counted per student.

>> A lot of the work thus far that was funded through Race to the Top is something you'll have going forward, so those development costs will come down, in theory.


>> Yeah. What we've created from the pilot are more like prototypes of what will be used in the non-summative assessment system. How the state chooses to use them moving forward has not yet been indicated to me. We've not released anything that will be used in the summative sense; whether it's used for summative purposes locally, or predictively, I don't know what they want to use it for. We haven't put out any of the assessment tasks; we've only put out learning tasks so far. More will be released by the end of the semester, I'm sure.


Haynes: >> Terrific. Do you have thoughts about this as you work with other states, in terms of the things they wrestle with around all of this?


Kahl: >> I do, but you're not going to be able to force me to name a dollar figure. There are so many variables that putting a price tag on it is very difficult. It is true that there are an awful lot of efficiencies in what we're talking about; what might have been expensive years ago may not be expensive now. The expense of large-scale performance assessment in the past was gathering all the student work and scoring it centrally, that kind of thing. If the teachers are doing the scoring, that's a savings. They've got to be moderated, so there is still a cost of auditing scores and so on, but that's done on a smaller scale, and there are good techniques that we know for doing it. We've also got submissions working electronically. So there's a lot of savings nowadays, and to me it's worth the expenditure, whatever it might be. But there is the development of the tasks: teachers developing them, maintaining that quality. We think they should go through the same reviews that multiple-choice items have gone through for years. Bias and sensitivity, alignment, all kinds of issues like that have to be addressed in those reviews, and that should be done for the performance tasks as well. You've got a bunch of committee work, and vendor work in terms of the logistics of all of this. To me, the development of these things should be ongoing, with teachers submitting, reviewing, revising, and posting, that kind of thing. So for any combination of these factors you could probably put a price tag on each piece and add it up. But I think it's certainly not extraordinary, I don't think.

Jones: >> And I would say that, developmentally, the only thing that makes the learning tasks different from a standard assessment item is that the learning tasks do need to be piloted in front of students.
One of the things we've noticed is that teachers' first attempt at writing a learning task doesn't always result in the types of responses you'd expect. I think anyone who's worked on scoring constructed responses at the state level would understand this: you cannot possibly anticipate every response a student will give, and some of them will just be a big surprise. Then you have to rethink how you framed the question, how you set up the particular task, so that students can respond. So I would say piloting is one extra piece that has to be added in.


>> Okay. And you've mentioned that the active scoring provides teachers with some of the most powerful professional learning in all of this. You mentioned a couple of times that it's almost like an aha moment, by virtue of their going through and scoring work. Is it possible to use technology to increase the reliability, or the ability to practice and score? Are you using technology for those purposes?


Jones: >> Yeah, we are. We've tried both a face-to-face scoring situation and a remote scoring situation, because we felt that the scalability of face-to-face scoring for every teacher in the state was not really realistic. So we tried a distributed scoring model. Measured Progress worked with us to develop training modules for teachers to be trained to score the assessment tasks, and we think it's been very successful. One limitation is that it was a little bit hard to include validity papers and large practice sets. So moving forward, I would say we probably need to increase the scoring materials we're creating so teachers are able to practice. We've seen pretty good reliability in the way teachers are scoring.


Haynes: >> Are you seeing that in terms of reliability of –

Kahl: >> Generally, when it comes to scoring constructed responses or extended constructed responses, most of the companies have systems now that at some level are distributed, and their training is often online. There's constant monitoring of scoring consistency: the rereads, looking at the discrepancies and dealing with them when there are scores that differ by more than one point. There are systems that can be applied to the products that come out of performance assessment, particularly if they're submitted electronically and people can look at them.

Haynes: >> It sounds sophisticated, for all kinds of reasons. Like you said, since the '90s the level of expertise in using technology to refine some of these scoring methodologies has really grown.
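The consistency monitoring described here, flagging rater pairs whose rubric scores differ by more than one point, amounts to a simple exact/adjacent agreement check. A minimal sketch (the score pairs and the 0-4 rubric scale are hypothetical, not from the pilot):

```python
# Hypothetical (rater_1, rater_2) scores on a 0-4 rubric.
pairs = [(3, 3), (2, 3), (4, 2), (1, 1), (0, 2), (3, 4)]

exact = sum(1 for a, b in pairs if a == b)
adjacent = sum(1 for a, b in pairs if abs(a - b) <= 1)
# Discrepancies greater than one point are routed to a third reader.
discrepant = [(a, b) for a, b in pairs if abs(a - b) > 1]

print(f"exact agreement:    {exact / len(pairs):.0%}")
print(f"adjacent agreement: {adjacent / len(pairs):.0%}")
print(f"flagged for review: {discrepant}")
```

On these sample pairs, two of six match exactly, four of six fall within one point, and the two larger discrepancies are the ones a moderation step would resolve.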


>> In the early half of the '90s we were taking live papers and sorting them into a dozen boxes, and you had a dozen tables –

>> I remember those days.

>> I mean –

>> Yes.

>> That's long gone.


Haynes: >> Okay. We have a question from Randy from Columbus, in your neck of the woods, about the school schedule. How does the school schedule accommodate this? Is there time for teachers to collaborate on how they're using or scoring these tasks?


Jones: >> Yeah, that's a good question. With the pilot, we were able to provide funding to the schools so that they could get substitutes and time off for the teachers. But we also had an agreement with every school that participated that they would provide at least one hour of collaborative work time per week for the teachers involved. Most of the schools already had something like that built in; they already had team meetings happening weekly, or twice weekly for a half-hour each. As long as they had an hour a week where they were able to work together, we were pleased with that as a minimum requirement. So I don't think it had a big impact on what was already happening in most of the participating schools. Although our professional development load the first semester was very high for our face-to-face cohorts: they had eight days out of class, which was stressful for a lot of them.


>> And the teachers, how were they recruited into the pilot?


Jones: >> We had an open application process; anyone in the state could apply. For the first two cohorts, it was Race to the Top districts only. For the field-test cohorts, we opened it to all the districts in the entire state. We selected them based on their responses to the questions in the application and their readiness to participate.

>> Did you get a pretty good response?

Jones: >> We did. The first one was competitive: we had about 21 districts apply, and we had to take only ten. So it's –

>> A great response.

>> Yes.


Haynes: >> We have a lot of questions about what's available; folks want these resources. Are the tasks available? Is there anything we can share with viewers in terms of how they may be able to see these? You mentioned one.


>> The four tasks are available on the iLearnOhio website. We are moving toward getting the modules available; right now we're in the last round of piloting. For the month of January they're live and being used by a group of teachers. Perhaps later this semester, by March, we'll have a home for the training modules. It may be the same home that we have right now.

>> Can they go to the OPAPP site?

>> Yes, definitely check the OPAPP site. I'm sure once we get a home for the PD modules, I will put up where they are and how to get access, for anyone who wants to find them.


Haynes: >> How has this impacted the role of leaders in schools? Do they have to be integrally involved? What is their role in making this successful within their school?


Jones: >> In most of the elementary schools that have participated, the leaders, the principals, have attended a lot of the professional development. I think it's very important for the teachers to get that kind of support. In the high schools, the support I've seen has come more from the curriculum director than from anyone else at the district office. I think getting support in terms of being given time, and not having other duties piled on top of them in addition to this, has been important for the teachers in the project.


>> Excuse me. I mentioned the Center for Collaborative Education in Boston and their Quality Performance Assessment program. They have an implementation guide about what it takes to implement their program over a multi-year period. So it's not something you can do overnight, to go from A to B just like that; it's very challenging, and a lot of that has to do with what the roles and responsibilities of the leaders and the teachers and so on are.

Haynes: >> There's a lot out there, isn't there? There's much work that has been going on for many years now that can come to the fore, and with the common standards, it becomes accessible and transferable to other places.


>> I think it's a lot easier sometimes to do something at the school or district level than it is statewide.

>> Yes.

>> That is true.

Haynes: >> So I think we're just about out of time. Any final comments? I applaud you for this marvelous enterprise. I think we all are very eager to hear how this goes, and again, viewers will be looking for resources that may be available. We will post a few things where the archived video will be housed in the next day or so. Again, I really want to thank you, Lauren, and Stuart, very much for spending time with us to share this exciting work. And thanks to our viewers for joining us today. Have a great day. Thank you.
