Teacher Competency Evaluation

Abstract

Teacher Competency Evaluation refers to the systematic procedures that attempt to present a balanced and comprehensive assessment of how teachers perform in their classrooms. Particular areas of interest include how teachers communicate with students from diverse cultural backgrounds and with different learning styles, and how teachers maximize educational relationships with other teachers, administrators, and parents of their students. Teacher Competency Evaluation has grown in importance in the late 20th and early 21st centuries as schools have faced increasing calls for accountability, but controversies abound as to how teacher competency should best be evaluated. In spite of these widespread controversies, there is universal agreement that teachers contribute to student learning more than any other aspect of school, which confirms the importance of discovering effective ways to judge teacher performance.

Keywords Artifact; Behavioral Observation Scale (BOS); Behaviorally-Anchored Rating Scale (BARS); Competency; Critical Incident; Debriefing Interview; Formative Evaluation; Interstate New Teacher Assessment and Support Consortium (INTASC); National Board for Professional Teaching Standards (NBPTS); PRAXIS; Scoring Rubric; Summative Evaluation

Overview

The issue of how to evaluate teacher competency fairly and comprehensively is exceedingly complex for numerous reasons. As Kenneth Peterson observed, good teacher evaluation is difficult to do because there are many short-term costs and only long-term payoffs (Peterson, 2000). At first glance, evaluating teacher competency would appear to entail only three steps:

• Determine the standards that a teacher would have to meet to teach at an acceptable level.

• Determine the range of knowledge that a teacher needs to marshal in order to teach effectively.

• Assess whether a teacher has this body of knowledge and demonstrates how to operationalize it.

The problems with this approach are so numerous that a large body of educational literature solely centers upon the problematic nature of this view of evaluating teacher competency.

Minimal Classroom Competency

For example, the United States is not united when it comes to the requirements for teacher licensing, a fact that has long been known to teachers seeking to practice in a state other than the one in which they received their license. There are literally fifty different state definitions of what constitutes minimal classroom competency for beginning teachers. Competency refers to the ability of a qualified individual to perform an activity, task or job situation (Spector & de la Teja, 2001). Minimal competency qualifies that definition by attaching it to the lowest acceptable expectations for performance determined by the employer. The differences among states are revealed in their requirements for the successful completion of a standardized test to assess a teacher-candidate's competencies in subject matter and pedagogical knowledge. While the majority of states require passing the PRAXIS exam created by Educational Testing Service (ETS), a number of states, including those with large populations of teacher-candidates like California and New York, have created their own standardized tests in lieu of PRAXIS (Stronge, 2006). An alternative voluntary evaluation system for teachers with three or more years of experience has also been established by the National Board for Professional Teaching Standards (NBPTS). In an attempt to create a national consensus regarding teacher evaluation, the Interstate New Teacher Assessment and Support Consortium (INTASC) was created in 1987.

Differences in Evaluator Interpretation

When teacher competency is evaluated after teachers begin working, the definitions concerning their desirable rather than minimal performance level multiply exponentially across regions. Among school districts in the same state, school administrators and faculty assigned evaluative roles will commonly differ in their interpretations of state standards for teachers. Finally, the means commonly used to evaluate teacher competency, namely interviewing teachers and repeatedly observing their classroom performance, are open to a variety of interpretations on the part of evaluators.

Questions pertaining to teacher observation offer a useful starting point for probing the reasons for the wide divergence of professional opinion concerning teacher evaluation. Clearly, teachers need to be observed by assigned evaluators while they work with students in order for a full picture of teacher competency to emerge. One question is: how often? The number of observations of a teacher in action will differ depending on whether a new teacher or an experienced teacher is being evaluated. A novice teacher might be observed more often than an experienced teacher.

Observer Conduct

Although not often mentioned in educational literature, the manner in which the observer conducts himself or herself during class can very much color the outcome. For example, an evaluator may bring a notebook or legal pad in order to take notes. Yet the sight of an observer furiously and constantly writing notes throughout a teacher observation could throw a reasonably confident educator into self-doubt, negatively impacting teaching performance. Some observers have used tape recorders, which might have a similarly unsettling effect. If the observer shows a non-verbal sign of disinterest in classroom proceedings (e.g., glancing at a watch or room clock; staring out a window), such seemingly trivial gestures may be interpreted by the observed teacher as a sign that he or she failed to make the grade during the observation.

Formative & Summative Evaluation

Observations are classified as "formative" or "summative." A formative evaluation is used to identify the strengths and weaknesses of a teacher so that specific suggestions for improved performance can be tried. A summative evaluation is used as information to determine the teacher's employment status (e.g., a pay raise or tenure; or dismissal or probationary status). The issue of how many formative evaluations should ideally be conducted before a summative evaluation is controversial, since it is impossible to establish an objective number of teacher observations as ideal. The timing of formative evaluations during the school year is also worth considering. Scheduling a formative evaluation the week before a major school holiday could be perceived as unfair, since students are universally perceived as less attentive to teachers immediately before a holiday.

Who Does the Observing?

The most common evaluator is a school principal, a choice often criticized by teachers who cite the fact that principals are least in contact with students and their learning needs, and so the least able to assess how teachers are reaching them. Other observers may be peer educators, that is, other teachers. Some teachers also question the utility and fairness of being evaluated by their peers, since prejudicial emotions like envy, judgments based upon subtle sexism or racism, or major differences in philosophy might negatively color evaluations. On the other hand, some principals object to peer observations in schools marked by strong teacher unionism on the grounds that a peer evaluator will look the other way at teacher inadequacies when a job might be on the line because of a summative evaluation (Lieberman, 1998).

Applications

A typical framework for evaluating teacher competency frames each observation with an interview before and after it. Observations can be announced ahead of time, or can be surprise visits by an evaluator to a teacher's classroom, the former being more common. The interview prior to observation will often focus on asking the teacher being evaluated what strengths and weaknesses he or she currently perceives in the classroom. A conscientious evaluator will carefully note these with commentary and re-introduce relevant ones as topics during the interview following classroom observation. Whether the teacher being evaluated is a novice or an experienced educator, it is perceived as vital that the evaluator present herself or himself as a potential coach, or critical friend (Whitford & Jones, 2000), rather than a judge (although the role of coach seems difficult to maintain when a summative evaluation is in process). The evaluator should offer specific suggestions for enhancing teacher performance that are inviting intellectually rather than off-putting psychologically.

Measurement Scales Used in Observation

Two commonly used scales are the Behavioral Observation Scale (BOS) and the Behaviorally-Anchored Rating Scale (BARS). The BOS standards are based upon optimal standards of teaching performance codified in principles (e.g., ask open-ended questions to elicit maximal student participation in classroom discussion). An observed teacher struggling to spark a classroom discussion on an assigned topic should consider how and when to ask an open-ended question. On the other hand, BARS standards are based upon the actual number and appropriateness of demonstrations of educational best practices utilized by a teacher under evaluation.

Regardless of the standards used, it is accepted among educational professionals that no evaluator should directly criticize the professionalism of the teacher under observation. In fact, an evaluator, regardless of how deficient the observed teacher's performance, should affirm to the teacher the difficulty of the profession for everyone in it. That is often the main reason for a debriefing interview after a teacher observation: to help the observed teacher regain an equal footing among his or her peers. This also allows the observed teacher the time and space to try to integrate the teaching suggestions made by the evaluator without the pressure of feeling that the evaluator is intruding on the normal atmosphere of the classroom. Even in the case of a summative interview leading to teacher dismissal, a debriefing interview can be conducted in a manner demonstrating respect for the professional and personal integrity of the educator who cannot, for any number of reasons, be given more employment.

Assessment of Artifacts

Another aspect commonly found in the teacher evaluation process involves the assessment of artifacts. Artifacts refer to the concrete evidence other than that discovered in observations and interviews by assessors. Some researchers have reported that almost 90% of schools use portfolios in some form to make teacher competency evaluations (Wilkerson & Lang, 2007). Often a teacher under competency evaluation offers a portfolio containing representative class assignments, notes from district workshops, and forms of student evaluation. With the increasing presence of the Internet, some teachers present an electronic folio of their artifacts on a Web site.

Not all indicators of teacher competency are as concrete as a folio, or as easy to interpret. Every new teacher has taken college-level courses touching upon educational theory. The fact that such theoretical knowledge is often fresh in the minds of new teachers being evaluated might be at least as much a liability as an asset, and evaluators need to take this into consideration. Educational theories are often based upon ideal scenarios far removed from the actual grit of classroom conditions in which teachers work. A new teacher's idealism might well be supported by the attractiveness of recently learned theories that hold little relevance for the actual classroom environment. In such a case, the evaluator has the unenviable task of helping the evaluated teacher ground idealism in realism appropriate to the actual classroom, all the while hoping that the gist of new teacher idealism can withstand such tailoring. Another question of academic knowledge transfer is how relevant the new teacher should find the research literature about "best practices." As highly valuable as such research literature can be, "best practices" established in certain types of schools might not transfer to another school setting due to socioeconomic and administrative differences between the schools researched and the school in which a teacher finds himself or herself.

The extremely serious nature of evaluating teacher competency carries pressing legal consequences, particularly when an evaluator's overwhelmingly negative summative evaluation leads the evaluated teacher to pursue legal redress. What might seem like a teacher showing propriety, acting professionally in an appropriate manner, in a given instance (e.g., sending a misbehaving student to a principal's office) might turn into a legal issue if the offending student is a member of a racial or religious minority whose members perceive school officials as systematically discriminatory toward their children. The financial and emotional cost to all parties in this event will be high, and the teacher evaluation form will shift from an educational document to a legal document with its meanings contested by attorneys. Curiously, while there is a public perception that aggrieved dismissed teachers usually win their cases against school boards, several studies have clearly shown that school boards have won the majority of such cases (Stronge, 2006).

Viewpoints

Effects of School Environment

Evaluation of teacher competency does not exist in a vacuum. How any teacher performs necessarily encompasses issues that go beyond any individual teacher. While Hollywood films romanticize the always successful struggles of superb teachers working in schools with incompetent leadership, inadequate funding and crumbling infrastructures, teaching in the real world is directly impacted by these realities, which can contribute significantly to inferior teacher performance. Evaluators need to be trained in keeping tabs on educational best practices. They need to synthesize that awareness with an awareness of the social, economic, and political realities of the school. Asking an evaluator to bring all those skills to the evaluation of teacher competency is asking a great deal, and that might explain why the literature concerning how teacher evaluators should be trained and evaluated is not very extensive (Ribas, 2005). When so much controversy prevails regarding what teacher competency is and how it can be best assessed, it seems overwhelming to begin raising questions about how many teacher evaluators have comprehensive training in personnel assessment. Yet some school districts take on that unenviable task while others feel it an unreasonable demand.

Do Evaluations Improve Teaching?

Another area of controversy is to what degree, regardless of the teacher competency evaluation method used, teachers actually gain in expertise from the experience. Some educational researchers believe that self-reflection and peer interaction are far more effective than any system of formal teacher competency evaluation in helping struggling teachers achieve greater professional competence. This belief has great currency in one of the most rapidly changing facets of teaching, technology integration. The optimal use of educational technology is not tested through exams like PRAXIS, and even classroom observations of technology use can be extremely hard to assess, since technology breakdowns can derail the best of teacher-designed lessons. Further, educational technology assessments have a difficult time keeping up with the speed of technological change in the 21st century. It is not uncommon for a teacher under evaluation to have his or her technology utilization assessed by a simple check-off system such as "Yes, technology has been effectively used to reinforce the lesson" or "No, technology has not been effectively used" (Whale, 2006).

Using Scoring Rubrics

Yet another tricky issue involves differing professional opinions as to whether a universally applicable scoring rubric could be practically employed for all teaching evaluations (Flowers & Hancock, 2003). A rubric is a rigid scorecard with which defined teacher behaviors are assigned relative value. Its utility is strongly critiqued by educators who question the absolute value of scoring rubrics for students as well as teachers on similar grounds (Marshall, 2006). Teaching is not an easily quantifiable phenomenon. Supporting this view is the often heard comment individuals make about a teacher in their past whom they believed was incompetent, yet whose value as an educator they later came to appreciate. There is a time-release effect in teaching, with wide differences as to when students successfully learn. Evaluations of teacher competency might offer approximate indicators of the efficacy of a teacher's instructional behaviors, but they might never indicate to what degree a teacher has been successful in catalyzing integrated learning.

Prevailing Educational Philosophies

Finally, how teacher competency evaluation is conducted depends on the core educational philosophy of educators and the prevailing educational objectives where they work. If a teacher believes that the purpose of teaching is primarily to prepare students for the future workforce, then his or her teaching strategies will largely incorporate lectures, memorization, and regular high-stakes testing. That teacher's success could only be assessed in light of that teaching philosophy and the alignment of that philosophy with district and state objectives. Yet what if a teacher using this philosophy fails to reach many of his or her students because the students are more interested in learning how to think for themselves than in their status as future workers?

The Critical Incident

A contested concept among educators is the critical incident, a term a teacher evaluator might use to describe a particularly revealing moment during a teacher observation that indicates a tangle of interrelated teacher behaviors in need of improvement. Such a moment, like a religious epiphany, might not hold great significance if considered within the context of an entire teaching year. Any teacher observer brings a degree of what social scientists call "the Hawthorne effect" (Olson, Verley, Santos, & Salas, 1994). This refers to the fact that experimenters, simply by their presence around the people they are studying, bring about a degree of change in the behavior of the observed. Whether a school principal or a fellow teacher in the evaluator role creates the more significant Hawthorne effect on the observed teacher is a subject worthy of more extensive thought.

Terms & Concepts

Artifact: Any item, usually in the form of text, that is used as supporting evidence in the process of evaluating teacher competency.

Behavioral Observation Scale (BOS): A form of teacher assessment based upon matching observed teacher classroom behavior with a series of printed statements describing specific instances of typical teacher behavior.

Behaviorally-Anchored Rating Scale (BARS): A form of teacher assessment based upon matching observed teacher classroom behavior with specific examples of variously effective teacher behaviors demonstrated by established teachers.

Competency: A contested term that is generally acknowledged to refer to the observable actions of a teacher in light of the standards for teacher behavior defined in that teacher's contract, and reflecting a standard for an educational professional defined by a teacher's school district.

Critical Incident: A moment in a classroom that offers a teacher-evaluator a particularly lucid instance of an issue that needs to be critically explored with the evaluated teacher.

Debriefing Interview: An interview at the close of formal observation(s) that helps the evaluated teacher change his or her behavior in a positive direction, or eases the path for the evaluated teacher to seek alternative employment.

Formative Evaluation: An evaluation conducted with the intention of helping a teacher improve his or her performance for the duration of the school year and beyond.

Interstate New Teacher Assessment and Support Consortium (INTASC): A national organization of state education officials and members of national educational organizations attempting to define a nationwide, uniform policy of teacher licensing and professional development.

National Board for Professional Teaching Standards (NBPTS): An organization offering a national system of voluntary certification for teachers meeting their definition of rigorous standards.

PRAXIS: The name assigned to a number of standardized tests, administered by a national testing service, that evaluate the subject area competency of a teacher along with his or her knowledge of pedagogical practice. Some states specify that a teacher's license can only be issued to a candidate who has scored at a level indicating competency on the PRAXIS scale, while other states substitute other measures.

Scoring Rubric: A grid or other configuration of expectations, most commonly used to assess student performance. There have been sporadic experiments over decades in designing performance rubrics for educators.

Summative Evaluation: A form of evaluation used to determine the career future of an educator by summarizing a series of observations and other materials used in teacher performance evaluation.

Bibliography

Carlo, M. (2012). How to use value-added measures right. Educational Leadership, 70, 38–42. Retrieved December 15, 2013, from EBSCO Online Database Education Research Complete. http://search.ebscohost.com/login.aspx?direct=true&db=ehh&AN=83173916&site=ehost-live

A coherent system of teacher evaluation for quality teaching. (2015). Education Policy Analysis Archives, 23(14–17), 1–22. Retrieved January 15, 2016, from EBSCO Online Database Education Research Complete. http://search.ebscohost.com/login.aspx?direct=true&db=ehh&AN=100997666&site=ehost-live&scope=site

Donaldson, M. L., & Donaldson Jr., G. A. (2012). Strengthening teacher evaluation: What district leaders can do. Educational Leadership, 69, 78–82. Retrieved December 15, 2013, from EBSCO Online Database Education Research Complete. http://search.ebscohost.com/login.aspx?direct=true&db=ehh&AN=74999571&site=ehost-live

Flowers, C. P. & Hancock, D. R. (2003). An interview protocol and scoring rubric for evaluating teacher performance. Assessment in Education: Principles, Policy, & Practice, 10, 161–169. Retrieved September 23, 2007, from EBSCO Online Database Education Research Complete. http://search.ebscohost.com/login.aspx?direct=true&db=ehh&AN=10849124&site=ehost-live

Gill, B., Bruch, J., & Booker, K. (2013). Using alternative growth measures for evaluating teacher performance: What the literature says. REL 2013-002. Washington, DC: Dept. of Education, Inst. of Education Sciences, Natl. Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Mid-Atlantic. Retrieved October 15, 2014, from EBSCO Online Database ERIC.

Goldhaber, D., & Walch, J. (2014). Gains in teacher quality. Education Next, 14, 38–45. Retrieved October 15, 2014 from EBSCO Online Database Education Research Complete. http://search.ebscohost.com/login.aspx?direct=true&db=eue&AN=92675910&site=eds-live

Lieberman, M. (1998). Teachers evaluating teachers: Peer review and the new unionism. Piscataway, NJ: Transaction Publishers.

Olson, R., Verley, J., Santos, L. & Salas, C. (1994). What we teach students about the Hawthorne studies: A review of content within a sample of introductory I-O and OB textbooks. The Industrial-Organizational Psychologist, 41, 23–39. Retrieved September 23, 2007, from http://www.siop.org/tip/backissues/Jan%2004/pdf/413_023to039.pdf

Papay, J. P. (2012). Refocusing the debate: Assessing the purposes and tools of teacher evaluation. Harvard Educational Review, 82, 123–141. Retrieved December 15, 2013, from EBSCO Online Database Education Research Complete. http://search.ebscohost.com/login.aspx?direct=true&db=ehh&AN=73895692&site=ehost-live

Peterson, K. D. (2000). Teacher evaluation: A comprehensive guide to new directions and practices (2nd ed.). Thousand Oaks, CA: Corwin Press.

Ribas, W. B. (2005). Teacher evaluation that works!! (2nd ed.). Westwood, MA: Ribas Publications.

Spector, J. M. & de la Teja, I. (2001). Competencies for online teaching. (Report EDO-IR-2001-09). Washington, DC: Office of Education Research and Improvement. (ERIC Document Reproduction Service No. ED456841). Retrieved September 7, 2007, from EBSCO Online Education Research Database. http://www.eric.ed.gov/ERICDocs/data/ericdocs2sql/content_storage_01/0000019b/80/19/37/f4.pdf

Stronge, J. H. (2006). Evaluating teaching: A guide to current thinking and best practice (2nd ed.). Thousand Oaks, CA: Corwin Press.

Teacher Evaluation Kit: Complete Glossary. (n.d.). Retrieved September 23, 2007, from The Evaluation Center of Western Michigan University. http://www.wmich.edu/evalctr/ess/glossary/glossary.htm

Whale, D. (2006). Technology skills as a criterion in teacher evaluation. Journal of Technology and Teacher Education, 14, 61–74. Retrieved September 23, 2007, from EBSCO Online Database Education Research Complete. http://search.ebscohost.com/login.aspx?direct=true&db=ehh&AN=19554794&site=ehost-live

Whitford, B. L. & Jones, K. (Eds.). (2000). Accountability, assessment, and teacher commitment: Lessons from Kentucky's reform efforts. Albany, NY: State University of New York Press.

Wilkerson, J. R. & Lang, W. (2007). Assessing teacher competency: Five standards-based steps to valid measurement using the CAATS model. Thousand Oaks, CA: Corwin Press.

Wilson, R. M. & Blum, I. (1981). Evaluating teachers' use of inservice training. Educational Leadership, 38, 490. Retrieved September 23, 2007, from EBSCO Online Database Education Research Complete. http://search.ebscohost.com/login.aspx?direct=true&db=ehh&AN=9398196&site=ehost-live

Suggested Reading

Bolyard, C. (2015). Test-based teacher evaluations: accountability vs. responsibility. Philosophical Studies In Education, 46, 74–82. Retrieved January 15, 2016 from EBSCO Online Database Education Research Complete. http://search.ebscohost.com/login.aspx?direct=true&db=ehh&AN=108502959&site=ehost-live&scope=site

Grosz, K. S. (1995). Strategies for evaluation. College Teaching, 43, 76–77. Retrieved September 23, 2007, from EBSCO Online Database Education Research Complete. http://search.ebscohost.com/login.aspx?direct=true&db=ehh&AN=9507250480&site=ehost-live

Hall, K. & Harding, A. (2002). Level descriptions and teacher assessment in England: Towards a community of assessment practice. Educational Research, 44, 1–16. Retrieved September 23, 2007, from EBSCO Online Database Education Research Complete. http://search.ebscohost.com/login.aspx?direct=true&db=ehh&AN=6411049&site=ehost-live

Harlen, W. (2005). Trusting teachers' judgement: Research evidence of the reliability and validity of teachers' assessment used for summative purposes. Research Papers in Education, 20, 245–270. Retrieved September 23, 2007, from EBSCO Online Database Education Research Complete. http://search.ebscohost.com/login.aspx?direct=true&db=ehh&AN=18396368&site=ehost-live

Manley, R. A., & Zinser, R. (2012). A Delphi study to update CTE teacher competencies. Education + Training, 54, 488–503. Retrieved October 15, 2014, from EBSCO Online Database Business Source Complete. http://search.ebscohost.com/login.aspx?direct=true&db=ehh&AN=79915409

Marshall, K. (2005). It's time to rethink teacher supervision and evaluation. Phi Delta Kappan, 86, 727–735. Retrieved September 23, 2007, from EBSCO Online Database Education Research Complete. http://search.ebscohost.com/login.aspx?direct=true&db=ehh&AN=17188277&site=ehost-live

Essay by Norman Weinstein, M.A.T.

Norman Weinstein is a writer and educator who taught at several universities and participated nationally in Writers-in-the-Schools programs and the National Writing Project. He is the author of several collections of poetry and books on Gertrude Stein and jazz. His writing about music, literature, and architecture has appeared in “The Christian Science Monitor” and “Architectural Record.” He contributed a chapter to “Classics in the Classroom: Using Great Literature to Teach Writing,” and has written about educational technology for EDUCAUSE.