Task 4.  Censor

Available marks: 24

Write a program that censors naughty words in a text file. For the purposes of this task, naughty words are defined as those provided at the beginning of the input.

The censor's output replaces each character of a naughty word with an asterisk ("*") character, so the length of the original word (though not its naughtiness) is preserved.

Definitions and Rules

  1. The first line of the input file gives the number of naughty words (one or more). The naughty words follow, one per line, up to the specified number. The remainder of the file consists of any number of lines of text to be censored.

  2. The output should consist of the censored text (whether or not the censor made any changes). Don't print the naughty word list.

  3. Each naughty word will contain at least one character. The characters may include non-letters.

  4. A naughty word is replaced only if it is bounded on each side by a non-letter or by the start or end of the line. For example, if "cat" is designated a naughty word, the line
    the cat sat on the concatenation of "pole" and "cat" giving "polecat"
    would be replaced by
    the *** sat on the concatenation of "pole" and "***" giving "polecat"

  5. Upper and lower case are distinct, so "polish" and "Polish" are considered different words.

  6. (Advanced rule). Word order is relevant. For each line, replace all occurrences of the first specified naughty word before trying the next naughty word, and so on. This matters if any naughty words contain asterisks, because an earlier replacement can then create a match for a later word. For example, if the naughty words include, in order:
    cat
    [***]
    
    then input of the form
    the word [cat] is just a bit naughty
    
    must produce
    the word ***** is just a bit naughty
    
    rather than
    the word [***] is just a bit naughty
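
Rules 4 to 6 can be sketched in Python roughly as follows. This is only an illustration, not a model answer: the function name censor_line is hypothetical, and the sketch assumes that "letter" means the ASCII letters A-Z and a-z, which the rules above do not spell out.

```python
import re

def censor_line(line, naughty_words):
    """Censor one line of text (hypothetical helper, not part of the task)."""
    # Rule 6: handle the words strictly in the order they were given, so a
    # replacement made for an earlier word is visible to every later word.
    for word in naughty_words:
        # Rule 4: the match must not touch a letter on either side.
        # Assumption: "letter" means ASCII A-Z or a-z only.
        # re.escape makes non-letter characters such as "[" or "*" literal.
        pattern = r"(?<![A-Za-z])" + re.escape(word) + r"(?![A-Za-z])"
        # Rule 5 is covered because matching is case-sensitive by default.
        # The replacement keeps the length of the original word.
        line = re.sub(pattern, "*" * len(word), line)
    return line

if __name__ == "__main__":
    print(censor_line("the word [cat] is just a bit naughty", ["cat", "[***]"]))
```

Swapping the order of the two words in the call above leaves "[***]" in the line, which is the case rule 6 is written to exclude.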
    

Example

Input:
    4
    hello
    sailor
    Jack
    tar
    Hello, sailor.
    I'm Jack Tar.
    That's not "jack tar".
    hello
    sailor
    Jack
    tar
    xhello
    xsailor
    xJack
    xtar
    hellox
    sailorx
    Jackx
    tarx

Output:
    Hello, ******.
    I'm **** Tar.
    That's not "jack ***".
    *****
    ******
    ****
    ***
    xhello
    xsailor
    xJack
    xtar
    hellox
    sailorx
    Jackx
    tarx
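
The input layout described in rule 1 could be separated into its two parts along these lines. This is a minimal sketch under stated assumptions: the function name split_input is hypothetical, and the whole file is assumed to have been read into a single string already.

```python
def split_input(text):
    """Split raw input into (naughty_words, lines_to_censor).

    Hypothetical helper: assumes the layout of rule 1, with the word
    count on the first line and one naughty word per following line.
    """
    lines = text.splitlines()
    count = int(lines[0])          # number of naughty words (one or more)
    words = lines[1:1 + count]     # the naughty words, in the order given
    body = lines[1 + count:]       # everything else is text to be censored
    return words, body

if __name__ == "__main__":
    words, body = split_input("2\nhello\ntar\nhello sailor\nJack tar\n")
    print(words)
    print(body)
```

Only the second part would be censored and printed, so the naughty word list itself never reaches the output, as rule 2 requires.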

Testing

Test files are provided in two batches, A and B, with names of the form testA2.txt, testB1.txt, etc. The A series files do not use asterisks; the B series files do.

Assessment

When you believe you have completed this task, ask the judges to assess it. They will ask you to run the program on one or more sets of data.

Up to 16 marks may be awarded if the program substantially works, except for naughty words containing asterisks.