Assessing 4th Year Theses

John Shepherd

Summary

This proposal aims to make thesis marking simpler, more consistent and more reliable. The description refers mainly to Thesis B, but the intention would be to extend it to Thesis A as well. Comments are welcome on both the overall approach and specific details.

Background

The current system of awarding marks for Thesis B gives markers two options:

enter a mark for each of five criteria
enter a single mark for the entire thesis

The tendency in recent years has been to enter a single final mark. In neither case is there a requirement to justify the mark(s).

The single mark approach has several problems:

the basis for arriving at a single number for such a large piece of work is unclear (and maybe not even given)
the granularity is too fine; is a thesis that scores 73 "better" than one that scores 69? how do you decide that one thesis should get 82 and another should get 80?
there is the potential for the single mark to be influenced by external factors (e.g. WAM needed for scholarship)

The multiple criteria approach also has its drawbacks:

the existing criteria were developed when CSE was part of EET and not all of them seem relevant to modern CSE theses
within each criterion, the granularity is still too fine; does the thesis deserve 16 or 18 for presentation?

This proposal suggests a system that

uses well-defined and more relevant criteria
employs a reasonable level of granularity
decouples the markers from the final numeric mark

The system is somewhat along the lines of how papers are reviewed for good-quality conferences. Conference reviews use a relatively small set of criteria (originality, relevance to conference, etc.) and a coarse grading (definitely accept, marginal, reject, etc.). Also, reviewers are obliged to justify their "grading" via comments to both the authors and the program committee. Obviously, the criteria for undergraduate theses are quite different to the criteria for refereed conference papers, but I think we can learn something from this process, and the approach has the advantage that it should be familiar to academic staff.

The aim is to improve the simplicity, consistency and reliability of assessment. We define a small set of assessment criteria. Markers award a grade, not a mark, for each criterion, and supply a comment to justify the grade. The final mark is computed by the system by mapping each grade to a mark and computing a weighted sum of the individual criterion marks.
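As a rough illustration of that computation, the sketch below maps each grade to a percentage and takes the weighted sum. It assumes the mid-point mapping and the fixed 20/20/30/30 weights that are suggested later in this proposal, neither of which is settled. The names GRADE_TO_MARK, WEIGHTS and final_mark are hypothetical.

```python
# Illustrative only: the mid-point grade-to-mark mapping (percent)
# suggested later in this proposal.
GRADE_TO_MARK = {"A+": 100, "A": 90, "B": 80, "C": 70, "D": 58, "E": 40, "F": 20}

# Assumed fixed weights (percent of the final mark) for the four criteria.
WEIGHTS = {"Presentation": 20, "Background": 20, "Own Work": 30, "Evaluation": 30}


def final_mark(grades):
    """Return the weighted sum (out of 100) of the marks for one marker's grades."""
    if set(grades) != set(WEIGHTS):
        raise ValueError("exactly one grade is required for each criterion")
    return sum(GRADE_TO_MARK[grades[c]] * WEIGHTS[c] for c in WEIGHTS) / 100


example = {"Presentation": "A", "Background": "B", "Own Work": "A", "Evaluation": "C"}
print(final_mark(example))  # 82.0
```

The marker only ever supplies the letter grades; the numeric mark falls out of the two tables, which is what decouples the markers from the final number.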

What's required to mark a Thesis B Report:

read it (unchanged from previously)
assign 4 grades, write 4 brief comments
optionally, write a general comment on the whole project (would be required if the project was in the running for an award)

The discussion below talks only about the Thesis B report, since this is the most highly "weighted" assessment item in the Thesis universe. (Note: there's another whole discussion waiting to be had about the relative weighting of Thesis Part A and Thesis Part B.)

Criteria for Thesis B Report

1. Presentation	quality of written English; structure of thesis (chapters/sections); logical flow of arguments; effective citation and referencing
2. Background	comprehensive description of problem space; reference to and analysis of other work
3. Own Work	originality of approach to the problem; quality of the final results or system; for a research thesis: original contribution; for a development thesis: quality of software
4. Evaluation	used appropriate analytical instruments; carried out analysis effectively; analysed results appropriately; realistic appraisal of achievements/limitations

The above criteria need to be better defined, and the meaning of each grade below probably needs to be specified with respect to each criterion.

Grades

A+	absolutely top-quality work, best I've seen; publishable in a good conference with little change
A	excellent work, does everything required; results are good, could be published with some re-working
B	good quality work, but with some deficiencies; would need substantially more work to be publishable
C	adequate; the topic could have been treated much better
D	just satisfactory, minimal standard for a CSE thesis
E	not up to standard required of a CSE thesis
F	very much below the standard required of a CSE thesis

Grade-to-Mark Mapping

A+	100% of the mark for that component
A	90% of the mark for that component
B	80% of the mark for that component
C	70% of the mark for that component
D	58% of the mark for that component
E	40% of the mark for that component
F	20% of the mark for that component

Some alternative suggestions for mapping grades to marks:

JAS's: A+ = 99%, A = 88%, B = 77%, C = 66%, D = 55%, E = 40%, F = 20%
Mid-points: A+ = 100%, A = 90%, B = 80%, C = 70%, D = 58%, E = 40%, F = 20%
Maxima: A+ = 100%, A = 92%, B = 84%, C = 74%, D = 64%, E = 49%, F = 20%
Numeric: A+ = 10, A = 9, B = 8, C = 7, D = 6, E = 4, F = 2 (looks less coarse-grained to students)

Criteria Weightings (Thesis B)

The weights of the individual criteria towards the final mark could be determined in several ways:

fixed for all theses: 1 = 20%, 2 = 20%, 3 = 30%, 4 = 30% (preferred model for 08s1)
different for the different thesis types, e.g.

R	1 = 20%, 2 = 20%, 3 = 30%, 4 = 30%
R+D	1 = 20%, 2 = 20%, 3 = 40%, 4 = 20%
D	1 = 20%, 2 = 10%, 3 = 50%, 4 = 20%

Rationale for the above: in a development thesis, we're more interested in the final product (system) than in the literature review ... although this perhaps suggests that the criteria themselves, and not simply the weights, should be different for development theses.

determined up-front as a "contract" between student and supervisor

determined after thesis submission by the student
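To make the weighting options concrete, the sketch below applies the three illustrative per-type weightings to one set of grades, using the mid-point grade-to-mark mapping. All names are hypothetical, and reading R as "research" and D as "development" is an assumption. The check_weights helper is just one possible sanity check for a weighting agreed as a "contract".

```python
# Mid-point grade-to-mark mapping (percent); illustrative values from this proposal.
GRADE_TO_MARK = {"A+": 100, "A": 90, "B": 80, "C": 70, "D": 58, "E": 40, "F": 20}

# Illustrative weights (percent) for criteria 1-4, per thesis type.
# Assumption: R = research thesis, D = development thesis.
TYPE_WEIGHTS = {
    "R": (20, 20, 30, 30),
    "R+D": (20, 20, 40, 20),
    "D": (20, 10, 50, 20),
}


def check_weights(weights):
    """Reject a weighting that does not cover four criteria or sum to 100%."""
    if len(weights) != 4 or any(w < 0 for w in weights) or sum(weights) != 100:
        raise ValueError("weights must be four non-negative values summing to 100")


def final_mark(grades, weights):
    """Weighted sum (out of 100) of the marks for grades on criteria 1-4."""
    check_weights(weights)
    if len(grades) != len(weights):
        raise ValueError("one grade is required per criterion")
    return sum(GRADE_TO_MARK[g] * w for g, w in zip(grades, weights)) / 100


# A thesis with a very strong system but a thin background chapter.
grades = ("B", "C", "A+", "B")
for thesis_type, weights in TYPE_WEIGHTS.items():
    print(thesis_type, final_mark(grades, weights))
# R 84.0
# R+D 86.0
# D 89.0
```

With these grades the same thesis moves by five marks depending on its declared type, which gives a feel for how much is at stake in choosing the weighting model.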

Note: the percentages above are illustrative only. We can debate
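Since the percentages are still open, it may help to see how much the choice of grade-to-mark mapping matters. The sketch below scores one set of grades under the three percentage mappings suggested above, assuming the fixed 20/20/30/30 weights. All names are hypothetical.

```python
GRADES = ("A+", "A", "B", "C", "D", "E", "F")

# The three percentage mappings suggested in this proposal (illustrative values).
MAPPINGS = {
    "JAS": (99, 88, 77, 66, 55, 40, 20),
    "Mid-points": (100, 90, 80, 70, 58, 40, 20),
    "Maxima": (100, 92, 84, 74, 64, 49, 20),
}

# Assumed fixed weights (percent) for criteria 1-4.
WEIGHTS = (20, 20, 30, 30)


def final_mark(grades, marks):
    """Weighted sum (out of 100) of grades on criteria 1-4 under one mapping."""
    table = dict(zip(GRADES, marks))
    if len(grades) != len(WEIGHTS):
        raise ValueError("one grade is required per criterion")
    return sum(table[g] * w for g, w in zip(grades, WEIGHTS)) / 100


grades = ("A", "B", "A", "C")
for name, marks in MAPPINGS.items():
    print(name, final_mark(grades, marks))
# JAS 79.2
# Mid-points 82.0
# Maxima 85.0
```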

John Shepherd, March 2008

Some Extra Preliminary Thoughts...

Criteria for Thesis A Report

1. Presentation	quality of written English; structure of thesis (chapters/sections); logical flow of arguments; effective citation and referencing
2. Background	comprehensive description of problem space; reference to and analysis of other work
3. Proposal	proposed approach to the problem; thoroughness/feasibility of the plan
4. Preliminary Work	results so far (by Week 11)