Most teams recognised the simplicity of the basic task, and produced a correct answer. A common practice that detracted from the readability of many solutions was the use of ASCII codes for the digit characters '0' to '9'. For many students
If ASC(ch$) <= 57 AND ASC(ch$) >= 48 Then ...seems to make more sense than
If (ch$ >= "0") And (ch$ <= "9") Then...In fact, built-in operations in some languages can make the test even more human-friendly:
If IsNumeric(ch$) Then ... ' BASIC
if ch in ['0'..'9'] then ... { Pascal }
if (isdigit(ch)) ... /* C */
The extensions required some input history to be retained, and this was generally well recognised by teams that made a serious attempt at the extra points. The ability for strings to be easily appended in BASIC made solutions to the second extension in this language a little easier than for Pascal or C.
For the first extension, it's necessary to keep track of the number of digit sequences encountered (to generate the correct letter), and to determine when the current character is the start of a digit sequence, to interpolate the letter in brackets.for each input character c if c is special display its replacement else display c
The second extension incorporates an array of strings to retain the original text. Depending on the string processing supported by the language used, the end of the digit sequence may also need to be identified to terminate the string appropriately.
Algorithm 1 (basic task):
An equivalent algorithm replaces c by "_" and then displays c.
for each input character c if c is a digit display an underscore character ("_") else display c
Ends of lines are assumed to be echoed, that is, when the end of an input line is encountered the output line is terminated as well.
The first extension requires the program to keep count of the number of sequences, and to be able to tell when c is the start of a sequence. The new lines are in blue.
Algorithm 2 (first extension):
The best way to determine what the previous character was is to save the value of c before advancing it the the next character. It is set to a non-digit at the start of each line. Alternatively, if complete lines are stored the previous character can be picked out of the line.
set counter n to 0 for each input character c if c is a digit if previous character is not a digit increment n display corresponding letter in brackets display "_" else display c
Displaying the lower-case letter corresponding to a small integer n is possible
for all the common programming notations.
Say n = 1 corresponds to the letter a, 2 to b and so on,
then the mapping is
| Language | Mapping |
|---|---|
| BASIC | CHR$(n-1 + ASC("a")) |
| Pascal | chr(n-1 + ord('a')) |
| C | n-1 + 'a' |
| Java | (char) (n-1 + 'a') |
The second extension requires some thought. As well as keeping track of the number of digit sequences, the original sequences themselves need to be remembered too. The counter n is an obvious index to use into an array of strings, one for each "answer":
Algorithm 3 (second extension):
The trickiest part can be the step append c to the n-th string, since not all languages support string concatenation (joining operations) easily. You should investigate how this is done in your favourite notation.
set counter n to 0 for each input character c if c is a digit if previous character is not a digit increment n display corresponding letter in brackets display "_" append c to the n-th string else display c display "Answers:" for each value i from 1 to n display letter corresponding to i in brackets, then answer i
The problem normally falls into the character-stream type, where the input history only needs to comprise
However, solutions can also take the alternative approach of reading and processing input a line at a time. This makes counting lines very easy, but introduces an extra loop (an outer loop to read a line into a string and an inner one to extract characters from the string). The same algorithms (both correct and incorrect) were evident in both line-based and character-based solutions.
Algorithm 1:
Algorithm 1 has two serious errors:
set counters to zero for each input character c if end of line has been reached add 1 to line counter else if c is a space add 1 to word counter
A common attempt at overcoming the second difficulty is to add one to the word count for each line read. The first problem is fixed by counting spaces only if the previous character wasn't also a space:
Algorithm 2:
Note the introduction of a new variable prev providing some more context. It's usually the previous character to c, but of course must be set specially at the beginning of a line, where there is no previous character.
set counters to zero set prev to space for each input character c if end of line has been reached add 1 to line counter add 1 to word counter set c to space // then set prev below else if c is a space and prev isn't a space add 1 to word counter set prev to c
Algorithm 2 almost works. With "normal" text, whether close-spaced or spread out, it does seem to count words properly. For example, in processing the following input line the word counter is incremented only at the points indicated, where there's a space preceded by a non-space. After adding one for the end of the line, the correct count (3) is obtained.
pack my box ^ ^
Have you spotted the flaw in Algorithm 2? Think about it for a little while before checking.
What's wrong with Algorithm 2?
There are two situations where the wrong answer will be obtained:In the first instance the space following the last word is counted, so we would get 3 rather than 2 for the above example, and then 4 when the end of line is reached:
- when there are spaces at the end of a line; and
- when a line contains no words at all (whether or not it contains spaces).
pack my box ^ ^ ^In the second case the end of line increment is again inappropriate.
Both cases expose the logical error: adding 1 to the word counter at end of line is wrong. It was a bad way to try to account for the last (or first) word on a line.
In general, having to make adjustments of this kind to an algorithm often implies that there is something fundamentally wrong with the original strategy.
Let's reconsider how to identify a word reliably.
Rather than concentrate on spaces following a word, consider the start of a word. Each word is preceded either by a space, or by the beginning of a line. In Algorithm 2 prev is already to set to a space at the beginning of a line, which is exactly what we need for the algorithm to be able to handle the first word in exactly the same way as the other words.
The only changes to Algorithm 2 needed to correct it are to remove the offending increment and to reverse the space/non-space comparison:
Algorithm 3:
How does Algorithm 3 differ from Algorithm 2? As soon as the first character of a word is detected, the word counter is incremented. It's not necessary to wait for the appearance of a (potentially ambiguous) character following a word. If there are no words (even with lots of spaces), nothing happens. For the above example, the increment points are now at the appropriate places:
set counters to zero set prev to space for each input character c if end of line has been reached add 1 to line counter // don't touch word counter set c to space else if c is not a space and prev is a space add 1 to word counter set prev to c
pack my box ^ ^ ^
Besides the minority of submissions that employed the method described in the next section, one solution was particularly notable. It used the built-in date manipulation functions of Visual Basic to do almost the entire conversion. Knowing when not to re-invent the wheel is the mark of a good designer.
Leap-year checking was avoided by many teams, not so much because the test itself is difficult to code, but rather that it doesn't fit at all well with the collection of constants generally required by the main algorithm. A poorly-written program may even need two sets of numbers to handle leap years and non-leap years.
Assume the program has available to it the number of days in each month of the specified year (so the number of days in February can be set to 28 or 29 according to whether or not it's a leap year). Why not get the computer to determine the number of days up to the end of each month, not with a long literal addition but with a loop processing one month at a time? Once this progressive total passes the "target" day, you know which month must be represented.
The process very much like the following measurement problem:
I'm given a staircase and a short ruler. The heights of the steps are slightly different, but I can measure each of them individually with the ruler. If a frog can jump to a certain height h, which step would it land on?
Besides an array to hold the number of days in each month, we need two integer variables:
| Variable | Purpose |
|---|---|
| m | the month currently being checked (like the step number) |
| total | the number of days to the end of month preceding m |
Let d be the day number accepted from input (assumed for the present to be a valid day). Using the notation A[i] to mean the i-th element of the array A,
Algorithm 1 (basic task):
When the loop terminates, m is the correct month number, since we know that the expression
for each month m record the number of days in m in daysin[m] set total to 0 set m to 1 while d > total + daysin[m] add daysin[m] to total add 1 to m
is false (because the loop stopped), henced > total + daysin[m]
but it's also true thatd <= total + daysin[m]
(why?)d > total
From the definition of total, the day of the current month is just d-total.
* * *
Algorithm 1 doesn't check if d is really a valid day, and it doesn't handle leap years. Illegal day values are either 0 or negative (we can check d directly), or beyond 31 December of the specified year. Algorithm 1 will probably terminate with a run-time error if d is too large, when an attempt is made to inspect the non-existent daysin[13]. The algorithm should be modified so that this situation is detected and the loop stops:
Algorithm 2 (basic task with error detection):
Some languages (such as C and Java) provide conditional Boolean expressions. These permit expressions of the form A && B (meaning A and B) where B is not evaluated at all if A is false, since the entire expression must be false regardless of the value of B. A is often specified to "guard" array indexing in B, which may otherwise be out of range. Conditional expressions allow the loop and termination test to be slightly simplified (although this is largely a matter of opinion):
read y and d for each month m record the number of days in m in daysin[m] set total to 0 set m to 1 while m < 12 and d > total + daysin[m] add daysin[m] to total add 1 to m if d <= 0 or d > total + daysin[m] display "Date out of range"
Algorithm 2a (alternative loop formulation):
read yr and d for each month m record the number of days in m in daysin[m] set total to 0 set m to 1 while m <= 12 && d > total + daysin[m] add daysin[m] to total add 1 to m if d <= 0 or m >12 display "Date out of range"
The simple leap year test is whether the year number is divisible by 4. The full test could have two parts, according to whether the year is also a century year (ends in 00):
where a mod b is the remainder when a is divided by b. A more compact form is(year mod 100 = 0 and year mod 400 = 0) or (year mod 100 <> 0 and year mod 4 = 0)
which means "either a quad-century or a quad-year unless also a century", or evenyear mod 400 = 0 or (year mod 4 = 0 and year mod 100 <> 0)
which means "a quad-year and either a quad-century or not a century". Use whichever seems most natural.year mod 4 = 0 and (year mod 400 = 0 or year mod 100 <> 0)
The final algorithm uses the leap year calculation at only one point, to adjust the number of days in February:
Algorithm 3 (extension):
The solution as finally coded should call a procedure or function containing the above algorithm within a reading loop that looks for the 0000/0 terminator.
read yr and d for each month m record the number of days in m in daysin[m] if yr is a leap-year add 1 to daysin[2] set total to 0 set m to 1 while m < 12 and d > total + daysin[m] add daysin[m] to total add 1 to m if d <= 0 or d > total + daysin[m] display "Date out of range" else display d-total display name of m-th month display yr
The extension is really a separate problem, and was completed by checking the current input string against all previous strings saved in an array.
The elimination method is a natural analogue of a pencil-and-paper approach where each letter of the test phrase is crossed off the key phrase. If a match can't be found for any letter, the process can stop and "NO" displayed. If all letters match, the answer is "YES". Implementation consideration and efficiency issues are discussed in the next section.
The frequency method is more suited to computer solution because of the ease with which the number of occurrences of each possible symbol can be recorded (compared to the inconvenience of drawing up a table on paper). Since the order of occurrence of the letters is irrelevant, if, for example, the letter e occurs 2 times in the key phrase but 3 times in the test phrase, the test phrase will fail to match. The problem reduces to
Algorithm 1 (search and elimination):
Although this is a direct method it is far from easy to code neatly. Each time a character is found the inner search has to stop, disrupting the predictable flow of the algorithm. At the end of each loop it is then necessary to determine whether the loop stopped because of the successful match or because no match could be found, introducing further complexity.
read key phrase for each test phrase copy key phrase to ref for each letter c in test phrase for each character r in ref if c = r remove r from ref // or replace by space exit loop if search failed exit loop if search failed display "NO" else display "YES"
For 26-character phrases, the maximum number of comparisons is 26*27/2 = 351. (For longer phrases the fact that there are only 26 different symbols limits the total comparisons to about 13n for strings of length n. The maximum value is 738 for 60-character phrases.)
If the test phrase contains an e we really don't care which e in the key it matches. All that matters is whether there are more es in the test phrase than in the key phrase.
Algorithm 2 counts the number of letters in the key phrase and saves the 26 counts in an array. For each test phrase it applies the same tally operation (in a procedure or function) and then checks whether any of the test counts exceeds the corresponding key letter count.
Algorithm 2 (frequency counts):
read key phrase tally letters in key phrase, store in key_freqs array for each test phrase tally letters in test phrase, store in freqs array for each possible letter c if freqs[c] > key_freqs[c] exit loop if match failed display "NO" else display "YES"
If the maximum number of different phrases (100) occurs, as many as 100*101/2 or 5500 string comparisons will be needed to determine that none of them is a duplicate of its predecessors. This is another example of a quadratic algorithm, this time in the number of inputs. Unfortunately, more sophisticated methods to store the strings (search trees, hash tables) are beyond the scope of this discussion. They are an important part of university computing courses, though.
Most attempts correctly extracted the useful content from the input and assigned them to elements of a 3x3 array. (Some Visual Basic solutions accepted data from 9 entry boxes rather than reading from the test files; this approach is acceptable since the time saved in not having to parse an input file is offset by time to design the form and perform the interactive tests.) The methods used to identify winners included
Teams that attempted the extension (identifying impossible board positions) did so with varying success. Multiple winners can be handled fairly easily, and counting schemes can easily check for an inappropriate number of symbols.
Identifying winners can be done with any of the methods mentioned in the Judges Report. If a counting approach is used, however, the other checks are more easily completed. For example, if we add up the number of 0s and Xs in every row and every column (but not the diagonals), we'll end up counting every symbol twice. Thus if no winner has been found, the total number of symbols will be 2*9 = 18 if every square is filled (a draw), or the game is incomplete otherwise.
Algorithm 1 (valid board assumed):
How the board array elements are stored and inspected will vary with the language, of course. For example, inspecting a row in Pascal could look like:
set totalX and total0 to zero read in board position to board array for each row count Xs and 0s add to totalX and total0 counters if X or 0 count = 3 declare winner for each column count Xs and 0s add to totalX and total0 counters if X or 0 count = 3 declare winner for each diagonal count Xs and 0s if X or 0 count = 3 declare winner if a winner has been found display winner " wins" else if totalX+total0 = 2*3*3 display "Draw" else display "Incomplete"
For the basic task, declaring the winner simply means saving the winning symbol in a character variable. For example, in C we could have
for col := 1 to 3 do begin if board[row, col] = 'X' then countX := countX + 1 else if board[row, col] = '0' then count0 := count0 + 1 end;
For the extension, the previous value of this variable is relevant too. Five different cases can arise when a new result is obtained:
/* result holds the result of the last row/column/diagonal check */ if (result != ' ') /* New winner */ winner = result;
| Current winner | Next result | New winner |
|---|---|---|
| Nobody | any | = result |
| any | Nobody | unchanged |
| X or 0 | same as current | unchanged |
| X or 0 | 0 or X (different) | error |
| error | any | unchanged (still error) |
Why not check for the same winner?
If the same symbol wins in two ways, the game may or may not be legal. It's possible for an "X" to be placed in a corner, say, completing a row and column simultaneously, or a diagonal with a row or column or the other diagonal. On the other hand, two rows or two columns can't be completed in one move.How can we distinnguish the legal and illegal cases?
The second scheme is already required to test for other illegal board positions, so there's no extra effort in using it to account for the double-winner problem.
- we could keep track of whether the current winner completed a row, column or diagonal; or
- we could observe that the double row or double column case requires six symbols, which only leaves three for the opposition. Hence by counting the total number of symbols on each row and column, we will be able detect the difference and report the error instead of declaring the winner.
Algorithm 2 (first extension):
The current winner is updated by replacing it by the new winner (if appropriate) or by a multiple-winner indicator, or leaving it unchanged. Writing the update algorithm out in full:
set totalX and total0 to zero read in board position to board array for each row count Xs and 0s add to totalX and total0 counters if X or 0 count = 3 update winner for each column count Xs and 0s add to totalX and total0 counters if X or 0 count = 3 update winner for each diagonal count Xs and 0s if X or 0 count = 3 update winner if X and 0 counts differ by more than 2 display "Impossible board position" else if multiple winners found display "multiple winners" else if a winner has been found display winner " wins" else if totalX+total0 = 2*3*3 display "Draw" else display "Incomplete"
Update current winner:
There are shorter ways to implement this, but be careful that you don't fold incompatible cases together.
if current is multiple no change to current else if no new winner no change to current else if no current winner replace current by new winner else if current and new winner are the same no change to current else set current to multiple
Teams that attempted the problem usually managed to score most of the available points. The output format was chosen to make it easy to produce with a single loop based on the required number of iterations; some people cheated by relying on the output device to move to a new line after each five iterations instead of coding it as such. Others, looking toward the extension (even if it wasn't completed), loaded up an entire array with values before starting to print. The limitation of attractors being found in the last 16 iterations means that only this many need be retained for checking. The checking algorithm and associated data structure are discussed below.
given a function f (p) and an initial value p0, print the first n terms in the sequence obtained using p' = f (p), ending the line after the first, sixth, ..., 5m+1-th iteration, for m = 0+.
First code the function, then write the main program to read the parameters and set up the iteration. Simple.
For the extension, it's necessary to keep some history of previous iterations. In increasing order of sophistication, the following schemes will work:
Algorithm 1 (basic task):
If you prefer to count from 1 to nit, the remainder test is made against 1 rather than 0. Most languages make it easy to format the floating point number to have 4 decimal places:
read parameters: k, p and nit for i ranging from 0 to nit-1 display p if i mod 5 = 0 then terminate output line calculate next p
| Language version | Formatting statement or expression |
|---|---|
| BASIC (most) | PRINT USING "#.####"; p; |
| Visual Basic | Format(p, "0.0000") |
| C | printf("%6.4f", p); |
| Pascal | write(p:6:4); |
| Java | Double.toString(p).substring(0,8); |
Algorithm 2 (find attractors):
Most of the extra work is performed in manipulating the stack. In a well designed program the stack operations will be separate from the main algorithm. They include:
read parameters: k, p and nit initialise stack for i ranging from 0 to nit-1 display p add p to stack if i mod 5 = 0 then terminate output line calculate next p remove last element x of stack if x occurs n places earlier in stack display n " attractors" and the last n values on the stack else display "Chaos"
2.3 4.5 1.1 7.0 4.2 6.9 1.0 4.9 9.8 6.5 ^The marker indicates the position of the oldest item on the stack, 7.0. Call the corresponding array index next_free. The last one pushed lies to its left (1.1), and it was preceded by 4.5, 2.3, 6.5 (wrapping around the end), and so on. The push algorithm becomes
So the operation push(3.3) results in the
push(item x) :- store x in stack[next_free] set next_free to (next_free + 1) mod stack_size
2.3 4.5 1.1 3.3 4.2 6.9 1.0 4.9 9.8 6.5 ^Getting access to earlier elements (for searching and printing) requires the inverse of the step
set next_free to (next_free + 1) mod stack_sizeto be computed. It's tempting to make this
set next_free to (next_free - 1) mod stack_sizebut mod operators do funny things when their left-hand operation is negative, which occurs when next_free = 0. Luckily, adding one less than the size of the stack is equivalent to subtracting 1 modulo stack_size, giving the function
previous(position pos) :- return (pos + stack_size - 1) mod stack_size