
4. A collaborative consultant with skills in:
(b) Evaluation and assessment

Course Notes for Assessment and Evaluation - 1998

Jim Tucker


Sunday, 26 July, 1998

Session 1

"For we dare not make ourselves of the number, or compare ourselves with some that commend themselves: but measuring themselves by themselves, and comparing themselves by themselves, are not wise." 2Cor 10:12 KJV

"How would we know who the smart people are?"  Response made by the boy who was scoring highest on tests when an instrument was trialed that assisted with memorizing facts for regurgitation.

When we measure, what should we measure against?  A standard relative to others, or an absolute standard?

"Be ye therefore perfect, even as your Father which is in heaven is perfect." Matt 5:48

Many of the values driving assessment are for the purposes of self-gratification, for making someone feel valued for what they do, rather than who they are.

Read the "Poisoned Apple".  Using bell curve assessment is wrong.

It is wrong to think that the "norm" is normal.

Standardized tests may be true for a population, but not for an individual.

What is elitism?  Who are the elite?  The privileged class, not those with more wisdom, knowledge, intelligence.

Read Gerald Bracey.  He debunks teaching/learning assessments published in the popular press, based on statistics.

e-valuate : the fundamental assumption is that value is there.

How am I, as a professional, evaluated in my job?
In church work, largely by self-assessment: knowing that I am achieving the goals I set for myself.  Subtle and indirect.

Validity as a measure of my performance, on a scale of 0-5:  4
This is my own perception, and purely subjective.

Session 2

4 Definitions

1. Gathering data to reach a conclusion
   Gathering the data ranges from the highly subjective to intensive, detailed study.
   * Having a child assessed - usually the parent means much more than just collecting the data
   * Getting an account balance on a checking account

2. Making a value judgment (in terms of criteria) on the basis of the data
   Drawing out a value.
   Criteria are established standards, a reflection of the agreed values; agreement is incredibly hard to get because values are hard to define.
   Setting the standards and the criteria controls the evaluation.
   The leader has an important role in getting consensus on the values, standards, and criteria.

3. To set a value
   Example: a real estate appraiser

In this course, we are doing assessment and evaluation only.

Question critically your own values, and the way you make evaluations and appraisals.

Session 3

Judge, Jury, evaluation and assessment

The POINT: the jury, or the judge, makes the final decision

The judge uses the criteria of the law as interpreted by history, precedent, case law.

The jury does not decide on the basis of law, they decide on the basis of presentations made.

This is an evaluation process, by the judge or jury, as the judgment.  The presentation is the assessment.

But the assessments are not unbiased; they are slanted.  Each assessment brings with it its own values.  In learning assessment, tests are selected according to the results we know they will give us, the ones we are looking for.

Never make a decision until you have heard the other side of the story.

2 kinds of evaluation:

Formative - Summative

   - or -

Process - Product

These are the 4 most important words in the evaluation literature, short of the word evaluation.

Evaluation will include 2 aspects.
The process (formative) and the resulting product (summative).
This applies best in situations when you reach a finish point and the product is completed.

In a continuous process, how do you do this?  You take a snapshot at a point in time, and evaluate what you have then as the product.

There is a risk with these words that we use them as labels, use them inappropriately, and hide behind them.

Assignment for tonight
"Program Evaluation" handout

Session 4


Jim's model
Many dissertations don't propose a hypothesis; they only ask a question.  This is OK.
Evaluation = answering an important question.
You define what counts as important, but there must be importance associated with it.
The question has to be one whose answer will bring about changes.


- the genders are equal

- the gender split in management should be the same as in the workforce

- eliminate the causes of imbalance

Question: What are the reasons for the gender balance found in management being different from the workforce as a whole?
Jim's question: Is gender balance an important quality to have in this agency?

Is it fair to bring a bias as an evaluator?  Yes, provided it is not hidden: there is no attempt to ask a trick question, and the bias is known up front.  It is OK to ask loaded questions.

Evaluation is mostly value.
There is no bullet-proof question you can ask to do an evaluation.

Give three evaluative questions and corresponding methodologies for evaluating the Andrews University Leadership Program.

1. How does the program ensure that collaboration and cooperation are fostered among participants?

  1. Is the technology - email, bulletin board, newsgroups, internet meetings, web - integrated in such a way that it motivates collaboration?
  2. Is the rationale and structure of the regional groups designed to encourage collaboration and cooperation, both within the groups and across the groups?
  3. How is the individual nature of the measurement instruments (IDP and portfolio) reconciled with the collaborative nature of the Leadership program?


  1. Determine through a literature survey an accepted definition of the terms collaboration and cooperation.
  2. In the context of the Leadership program, and in particular, the three modalities raised in the question, investigate whether collaboration and cooperation in the program can be demonstrated.
    For all 3 modalities:
        1. Interview Dean and Program Coordinator re actions taken to foster this.
        2. Interview faculty.
        3. Interview/question students in the program.
    For collaboration via technology:
        4. Collect usage data from system log files showing use of technology.
    For collaboration via Regional Groups:
        5. Visit Regional Groups and observe meetings.
  3. For each modality, evaluate its success in fostering collaboration and cooperation.
  4. Make recommendations.

2. What failsafe measures are used to ensure that the selection process for admission into the program chooses participants who are self-motivated?


  1. Determine through a literature survey the tried-and-true tests for measuring self-motivation.  What are the indicators of self-motivation?
  2. Determine what the current system is.
    1. Interview Dean and Program Coordinator re selection criteria.  Request written material documenting internal procedures and public documents.
    2. Interview faculty.
    3. Interview/question students in the program.
    4. Review student CVs and Statements of Purpose.
  3. Evaluate the success of the measures in use.
  4. Make recommendations.

3. In view of the fact that the portfolio demonstration of competencies replaces comprehensive exams, how is the acceptable degree of competency in each area determined?


  1. Research via a literature survey accepted methods for determining whether competency has been achieved
  2. Interviews
    1. Interview Dean and Program Coordinator to determine their criteria and the methods used.
    2. Interview faculty.
    3. Interview students who have had signoff on competencies.
  3. Evaluate the consistency of application.
  4. Make recommendations.

A document was produced in the class summarizing 14 of the ideas written up by the different groups who worked on this in the course.

Monday, 27 July, 1998

Session 1

What is an ideal learning environment?
One that is flexible, loving, accepting, adapts to the material being taught and the individuals participating.
Stimulating, affirming, non-threatening.

Asking this question is a waste of time if we don't plan to get there.  To get there means change.  You need to start with assessment of where we are.  Then make incremental changes to move to where we want to be.  Further assessment to collect data on whether we are moving toward where we want to be.

AERA Conference on assessment, San Diego, April 1998
Mehrens, Michigan State - ask for copy of his paper on the consequences of assessment.

Conclusion: The consequences of assessment are meaningless.  This does not mean assessment is bad, it is just not driving us where we want to go.

Robert Linn, University of Colorado, Boulder.  "Assessment and accountability".  Gave the same conclusion.

Also Douglas Carnine

"What gets measured gets done."  Waterman quoting Haire.
But we measure in education and nothing changes?

Temptation to ask questions for which answers can be found, but which have only limited value.  We prefer to do this rather than ask the really important questions that it is hard to find answers to.

There is a lack of a feeling of accountability.  It doesn't really make a difference that can be measured quickly or easily.  Feedback loop is too long.  This contrasts with neurosurgeons and airline pilots.

Session 2

Complex change occurs only if 5 things all happen (Ambrose, 1987): Vision, Skills, Incentives, Resources, and an Action Plan.

Add Communication between Vision and Skills.

If one or more of these is missing, how do you go about solving that?
The model can be used as a process control mechanism, i.e., all pieces must be present for the project to fly.

More Assessment Terms
Formal -
Informal -

"I would not have seen it if I had not believed it".  You see what you believe.  Eisendeich?

What is a formal assessment test?
Medicine: EKG, urine analysis
Business: financial statements, ratios, ROI
Most formal tests are standardized.

Informal assessments
Personal observation
Surveys, open-ended questionnaires
Anything in which you are not bound by a particular mode of response

        Standardized (multiple intelligences, emotional IQ)
            Norm referenced (most educational assessment tests)
            Criterion referenced

Norm referenced tests assume that there is a norm that you can compare against.

These tests work only for groups; they are invalid for an individual.  They may have marginal value for a classroom, but are more valid for a whole school or school district, for comparing this year with last year.  And even then, they are only valid if the assumption about the norm being normal holds.

So what are the legitimate uses of the tests?  When used for accountability, but not for individuals.
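The group-versus-individual point can be illustrated with a small simulation (a sketch with invented numbers, not data from the course): if each observed score is a true score plus measurement error, one student's observed score can miss badly, while the errors average out across a whole school.

```python
import random

random.seed(1)

ERROR_SD = 15      # assumed measurement error on a single test sitting (illustrative)
TRUE_SCORE = 100   # every student's true score in this toy example

# One student, one sitting: the observed score carries the full measurement error.
one_sitting = TRUE_SCORE + random.gauss(0, ERROR_SD)

# A school of 500 students: individual errors largely cancel,
# so the school mean lands close to the true mean.
school = [TRUE_SCORE + random.gauss(0, ERROR_SD) for _ in range(500)]
school_mean = sum(school) / len(school)

print(f"one student's observed score: {one_sitting:.1f}")
print(f"school mean over 500 students: {school_mean:.1f}")
```

The standard error of the group mean shrinks with the square root of the group size, which is why the same instrument can be defensible for a district comparison and indefensible for a single child.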

School reports showing a student's position on a graph are inappropriate - "child abuse".

Most tests have an appropriate use.  Avoid the inappropriate use of tests.

Session 3

Standardized assessment will never help achieve learning objectives.  It is only for making comparisons with the norm.

We need "appropriate learning".

What does the student know?

What is known?

What can the student do?

What can be done?

How does the student think?

What are the thought processes?

Rate of acquisition: how much new stuff can a student learn in a given period of time?

How do you know you are at the optimum rate of acquisition?  By the amount of retention revealed in tests.  The tests have to be tailored according to starting point.

New things can be learned at a rate of around 7 (7 ± 2) new things at a time, at most, to allow them to be integrated by the brain.  This also applies to learning new physical skills.  Spaced practice: practice basketball, then break for 6 hours, and you can perform better than if you had continued to practice all the way through.

Teach to the specific knowledge base and learning styles of the students.

Session 4 - Variation

How variation can be allowed, and what does it have to do with assessment.

What is the purpose of education?  It used to be to prepare people for adult life.

Spache found that the limit of tolerance in schools is 6 months, i.e., a student is referred for special help when more than 6 months behind.  But take the conservative view that "normal" IQ is 80-120.

For a 6-year-old:
IQ 80 = 4.8-year equivalent
IQ 120 = 7.2-year equivalent

The IQ-80 child is thus 14.4 months behind chronological age, more than twice the range of tolerance.

We are teaching to the middle of normal, and eliminating the extremes, creating behavior problems.
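The arithmetic above can be checked directly using the classic ratio definition, IQ = (mental age / chronological age) × 100 (a quick sketch; the helper function name is mine):

```python
def mental_age(iq: float, chronological_age: float) -> float:
    """Mental-age equivalent under the classic ratio-IQ definition."""
    return iq / 100 * chronological_age

age = 6.0
low = mental_age(80, age)    # 4.8-year equivalent
high = mental_age(120, age)  # 7.2-year equivalent

months_behind = (age - low) * 12  # how far behind grade level the IQ-80 child is

print(f"IQ 80  -> {low:.1f}-year equivalent")
print(f"IQ 120 -> {high:.1f}-year equivalent")
print(f"An IQ-80 six-year-old is {months_behind:.1f} months behind,")
print(f"against a 6-month tolerance.")
```

So even within the conservative "normal" band of 80-120, the low end already sits 14.4 months behind, more than double the 6-month referral threshold.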

We concentrate too much on the process and too little on the outcome?

Curriculum assessment, curriculum-based evaluation.
Criteria-based measures.

Measures for Learning
Note: These are assessments (collections of data) only; there is no value judgment here
1. time on task - are they concentrating?
2. task completion - do they finish the work?
3. task comprehension - do they understand it?  This is the ultimate goal

Assessment = to sit beside (Latin assidere)

Proactive interference
Reactive interference
Coactive interference

Incremental Rehearsal
An instructional strategy which presents information in small increments and allows for adequate rehearsal (repetition) to ensure automaticity.

This is how you learn to play a video game, or use a word processor
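As a sketch of the strategy (the folding-in pattern below, one new item interleaved with a growing run of known items, is my illustrative assumption, not a procedure taken from the notes): the new item is rehearsed repeatedly, each time followed by more already-known material, which builds toward automaticity.

```python
def incremental_rehearsal(new_item, known_items):
    """Interleave one new item with progressively longer runs of known
    items, so the new item is rehearsed several times per drill."""
    sequence = []
    for i in range(1, len(known_items) + 1):
        sequence.append(new_item)          # rehearse the new item again
        sequence.extend(known_items[:i])   # then a growing run of knowns
    return sequence

drill = incremental_rehearsal("7x8", ["2x2", "3x3", "4x4"])
print(drill)
# The new fact "7x8" appears 3 times, each time buried among more known facts.
```

The known items provide easy successes between repetitions of the new item, keeping the ratio of new to known material small, in line with the 7 ± 2 limit noted earlier.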

See the 2nd article in the current issue of JRCE: "Would Jesus Give an F?" by James A. Tucker.
Jesus does not give an F, but we may choose an F.


Find 3+ questions that we could ask in our own workplace, and develop the methodologies we would use to research these questions.

Keep a journal of the time spent developing the questions and the methodology.  The time should add up to about 35 hours.

For another 3 credits, actually do this.  Ask questions that I could study and answer.

I wrote the assignment on three questions from my workplace, but did not do the work for other three credits.


Created: Tuesday, January 04, 2000 05:15 PM
Last Modified: Saturday, May 10, 2003 9:30 PM