Week 09 Laboratory Exercises

Objectives

  • Developing Python & Shell skills
  • Exploring simple approaches to scraping data from the web

Preparation

Before the lab you should re-read the relevant lecture slides and their accompanying examples.

Getting Started

Set up for the lab by creating a new directory called lab09 and changing to this directory.
mkdir lab09
cd lab09
There are no provided files for this lab.

Exercise:
What Courses Does UNSW Have this Year - Shell

Write a POSIX-compatible shell script courses.sh which, given a course prefix (e.g. COMP), prints the course codes and names of all UNSW courses with that prefix offered this year on the Kensington Campus.

Courses should be sorted by course number (lowest to highest).

Duplicate course codes should be removed, keeping the course whose name is alphabetically first.
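
The ordering and de-duplication rule above can be sketched in Python, purely to illustrate the logic; the exercise itself still asks for a shell script. The function name dedupe_courses and the course entries are made up for this example. The sketch assumes all codes share one prefix and have four-digit numbers, so sorting by code string matches sorting by course number, and it compares names by Python's default string order, which may differ from a locale-aware sort.

```python
def dedupe_courses(pairs):
    """Return (code, name) pairs sorted by code, with one entry per code.

    When a code appears more than once, the alphabetically first name wins.
    """
    result = []
    seen = set()
    # Sorting tuples orders by code first, then by name, so the first
    # entry met for each code carries its alphabetically first name.
    for code, name in sorted(pairs):
        if code not in seen:
            seen.add(code)
            result.append((code, name))
    return result


# Made-up sample data (hypothetical), not real handbook output.
sample = [
    ("ABCD2000", "Zebra Studies"),
    ("ABCD1000", "Introductory Example"),
    ("ABCD2000", "Applied Examples"),
]

for code, name in dedupe_courses(sample):
    print(code, name)
```

With the sample data this prints ABCD1000 first, then ABCD2000 with the name "Applied Examples".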

./courses.sh VISN
VISN1101 Seeing the World: Perspectives from Vision Science
VISN1111 Geometrical and Physical Optics
VISN1221 Visual Optics
VISN2111 Ocular Anatomy and Physiology
VISN2211 Organisation and Function of the Visual System
VISN3111 Development and Aging of the Visual System
VISN4016 Vision Science Honours
VISN5511 The Visual System, Impairments and Implications
VISN5512 Sensory Processes and Movement
VISN5513 Orientation and Mobility Foundations: Disability, Diversity and Inclusion
VISN5521 Orientation and Mobility Techniques
VISN5522 Vision Rehabilitation
VISN5523 Orientation and Mobility in Practice
VISN5531 Development and Ageing: Implications for Orientation and Mobility
./courses.sh COMP | tail
COMP9491 Applied Artificial Intelligence
COMP9511 Human Computer Interaction
COMP9517 Computer Vision
COMP9727 Recommender Systems
COMP9801 Extended Algorithm Design and Analysis
COMP9814 Extended Artificial Intelligence
COMP9900 Information Technology Project
COMP9991 Research Project A
COMP9992 Research Project B
COMP9993 Research Project C

When you think your program is working, you can use autotest to run some simple automated tests:

2041 autotest shell_courses 

When you are finished working on this exercise, you must submit your work by running give:

give cs2041 lab09_shell_courses courses.sh

before Monday 29 July 12:00 (midday) (2024-07-29 12:00:00) to obtain the marks for this lab exercise.

Exercise:
What Courses Does UNSW Have this Year - Python/subprocess

Write a Python script courses_subprocess.py which, given a course prefix (e.g. COMP), prints the course codes and names of all UNSW courses with that prefix offered this year on the Kensington Campus.

./courses_subprocess.py VISN
VISN1101 Seeing the World: Perspectives from Vision Science
VISN1111 Geometrical and Physical Optics
VISN1221 Visual Optics
VISN2111 Ocular Anatomy and Physiology
VISN2211 Organisation and Function of the Visual System
VISN3111 Development and Aging of the Visual System
VISN4016 Vision Science Honours
VISN5511 The Visual System, Impairments and Implications
VISN5512 Sensory Processes and Movement
VISN5513 Orientation and Mobility Foundations: Disability, Diversity and Inclusion
VISN5521 Orientation and Mobility Techniques
VISN5522 Vision Rehabilitation
VISN5523 Orientation and Mobility in Practice
VISN5531 Development and Ageing: Implications for Orientation and Mobility
./courses_subprocess.py COMP | tail
COMP9491 Applied Artificial Intelligence
COMP9511 Human Computer Interaction
COMP9517 Computer Vision
COMP9727 Recommender Systems
COMP9801 Extended Algorithm Design and Analysis
COMP9814 Extended Artificial Intelligence
COMP9900 Information Technology Project
COMP9991 Research Project A
COMP9992 Research Project B
COMP9993 Research Project C

You should use the subprocess module to download the web page, using the same curl command as in the previous activity.
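
As a minimal sketch of capturing a command's output with the subprocess module: the echo command and its text below are stand-ins chosen so the example runs without network access, and are not the curl command from the previous activity. In your script the argument list would hold that curl command instead.

```python
import subprocess

# "echo" is only a placeholder command (an assumption for this sketch);
# the exercise expects the curl command here instead.
command = ["echo", "ABCD1000 Introductory Example"]

# capture_output=True collects stdout and stderr, text=True decodes them
# to str, and check=True raises CalledProcessError on a non-zero exit status.
completed = subprocess.run(command, capture_output=True, text=True, check=True)

for line in completed.stdout.splitlines():
    print(line)
```

Passing the command as a list avoids having the shell re-interpret any special characters in the arguments.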

When you think your program is working, you can use autotest to run some simple automated tests:

2041 autotest python_courses_subprocess 

When you are finished working on this exercise, you must submit your work by running give:

give cs2041 lab09_python_courses_subprocess courses_subprocess.py

before Monday 29 July 12:00 (midday) (2024-07-29 12:00:00) to obtain the marks for this lab exercise.

Exercise:
What Courses Does UNSW Have this Year - Python/requests

Write a Python script courses_requests.py which, given a course prefix (e.g. COMP), prints the course codes and names of all UNSW courses with that prefix offered this year on the Kensington Campus.

./courses_requests.py VISN
VISN1101 Seeing the World: Perspectives from Vision Science
VISN1111 Geometrical and Physical Optics
VISN1221 Visual Optics
VISN2111 Ocular Anatomy and Physiology
VISN2211 Organisation and Function of the Visual System
VISN3111 Development and Aging of the Visual System
VISN4016 Vision Science Honours
VISN5511 The Visual System, Impairments and Implications
VISN5512 Sensory Processes and Movement
VISN5513 Orientation and Mobility Foundations: Disability, Diversity and Inclusion
VISN5521 Orientation and Mobility Techniques
VISN5522 Vision Rehabilitation
VISN5523 Orientation and Mobility in Practice
VISN5531 Development and Ageing: Implications for Orientation and Mobility
./courses_requests.py COMP | tail
COMP9491 Applied Artificial Intelligence
COMP9511 Human Computer Interaction
COMP9517 Computer Vision
COMP9727 Recommender Systems
COMP9801 Extended Algorithm Design and Analysis
COMP9814 Extended Artificial Intelligence
COMP9900 Information Technology Project
COMP9991 Research Project A
COMP9992 Research Project B
COMP9993 Research Project C

You should use the requests module to download the web page.
You should use the BeautifulSoup and html5lib modules to parse the HTML.

When you think your program is working, you can use autotest to run some simple automated tests:

2041 autotest python_courses_requests 

When you are finished working on this exercise, you must submit your work by running give:

give cs2041 lab09_python_courses_requests courses_requests.py

before Monday 29 July 12:00 (midday) (2024-07-29 12:00:00) to obtain the marks for this lab exercise.

Challenge Exercise:
What Can't Regexes Do?

Write a regular expression which matches a unary number iff it is composite (not prime).

In other words, write a regex that matches a string of n ones iff n is composite.

Here is a test program to assist you in doing this:

#! /usr/bin/env python3

from sys import argv
from re import search
from math import log, floor

assert len(argv) == 4, f"Usage: {argv[0]} <min> <max> <regex>"

min, max, regex = argv[1], argv[2], argv[3]

assert len(regex) <= 80, "regex too large";

padding = floor(log(int(max) + 1, 10)) + 1

for i in range(int(min), int(max) + 1):
    unary = '1' * i
    print(f"{i:{padding}} = {unary} unary -", "composite" if search(regex, unary) else "prime")

Download test_regex_prime.py, or copy it to your CSE account using the following command:

cp -n /import/ravel/A/cs2041/public_html/24T2/activities/regex_prime/test_regex_prime.py test_regex_prime.py

For example, to test the regex ^1{7,10}$ against the integers 2 to 12, you can run:

chmod 755 test_regex_prime.py
./test_regex_prime.py 2 12 '^1{7,10}$'
 2 = 11 unary - prime
 3 = 111 unary - prime
 4 = 1111 unary - prime
 5 = 11111 unary - prime
 6 = 111111 unary - prime
 7 = 1111111 unary - composite
 8 = 11111111 unary - composite
 9 = 111111111 unary - composite
10 = 1111111111 unary - composite
11 = 11111111111 unary - prime
12 = 111111111111 unary - prime

Put your solution in regex_prime.txt, for example:

./test_regex_prime.py 40 50 "$(cat regex_prime.txt)"
40 = 1111111111111111111111111111111111111111 unary - composite
41 = 11111111111111111111111111111111111111111 unary - prime
42 = 111111111111111111111111111111111111111111 unary - composite
43 = 1111111111111111111111111111111111111111111 unary - prime
44 = 11111111111111111111111111111111111111111111 unary - composite
45 = 111111111111111111111111111111111111111111111 unary - composite
46 = 1111111111111111111111111111111111111111111111 unary - composite
47 = 11111111111111111111111111111111111111111111111 unary - prime
48 = 111111111111111111111111111111111111111111111111 unary - composite
49 = 1111111111111111111111111111111111111111111111111 unary - composite
50 = 11111111111111111111111111111111111111111111111111 unary - composite
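
When doing your own testing, it may help to compare a candidate regex against an independent check rather than reading the output by eye. The following is a minimal sketch of that idea; is_composite and mismatches are hypothetical helper names and are not part of the provided test program.

```python
from math import isqrt
from re import search


def is_composite(n):
    """Trial division: True if n has a divisor d with 1 < d < n."""
    return any(n % d == 0 for d in range(2, isqrt(n) + 1))


def mismatches(regex, low, high):
    """List every n in [low, high] where the regex disagrees with is_composite."""
    return [
        n for n in range(low, high + 1)
        if bool(search(regex, "1" * n)) != is_composite(n)
    ]


# The sample regex ^1{7,10}$ is not a solution, so it disagrees on some values.
print(mismatches(r"^1{7,10}$", 2, 12))  # prints [4, 6, 7, 12]
```

An empty list for the range you test means the regex agreed with trial division on every value in that range.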

When you think your program is working, you can use autotest to run some simple automated tests:

2041 autotest regex_prime 

When you are finished working on this exercise, you must submit your work by running give:

give cs2041 lab09_regex_prime regex_prime.txt

before Monday 29 July 12:00 (midday) (2024-07-29 12:00:00) to obtain the marks for this lab exercise.

Submission

When you have finished each exercise, make sure you submit your work by running give.

You can run give multiple times. Only your last submission will be marked.

Don't submit any exercises you haven't attempted.

If you are working at home, you may find it more convenient to upload your work via give's web interface.

Remember you have until Week 10 Monday 12:00:00 (midday) to submit your work.

You cannot obtain marks by e-mailing your code to tutors or lecturers.

You can check the files you have submitted here.

Automarking will be run by the lecturer several days after the submission deadline, using test cases different to those autotest runs for you. (Hint: do your own testing as well as running autotest.)

After automarking is run by the lecturer you can view your results here. The resulting mark will also be available via give's web interface.

Lab Marks

When all components of a lab are automarked, you should be able to view the marks via give's web interface or by running this command on a CSE machine:

2041 classrun -sturec