COMP3311 25T2 Assignment 2
Python, PostgreSQL, psycopg2
Database Systems
Last updated: Tuesday 29th July 5:10pm
Most recent changes are shown in red ... older changes are shown in brown.

[Assignment Spec]  [SQL Schema]  [SQL Data]   [Grades+Rules]   [Examples]   [Testing]   [Submitting]

Aims

This assignment aims to give you practice in

You could complete this assignment with minimal use of SQL
But it is highly recommended that you use python for its intended purpose
Use SQL queries, views, and functions to filter and manipulate the data
Use Python to format and display the data

The goal is to build some useful data access operations on the mymyunsw database.

Summary

Marks: This assignment contributes 15 marks toward your total mark for this course.
Submission: via WebCMS3 or give, submit the files
q1.py, q2.py, q3.py, q4.py, q5.py, helpers.py, helpers.sql
Deadline: Monday 4 August 2025 @ 21:59:59
Late Penalty: 0.2 percent off the raw mark for each hour late, for 5 days
Any submission after 5 days scores 0 marks
This is the UNSW standard late penalty.

How to do this assignment:

And, of course, if you have PostgreSQL installed on your home machine, you can do all of your development there.
But don't forget to test it on vxdb2 before submitting.

Introduction

All Universities require a significant information infrastructure in order to manage their affairs. This typically involves a large commercial DBMS installation. UNSW's student information system sits behind the MyUNSW web site. MyUNSW provides an interface to a PeopleSoft enterprise management system with an underlying Oracle database. This back-end system (Peoplesoft/Oracle) is sometimes called NSS. The specific version of PeopleSoft that we use is called Campus Solutions.

Despite its successes, however, MyUNSW/NSS still has a number of deficiencies, including:

The first point is inconvenient, since it means that the only way for a student to change tute classes is to drop the course and re-enrol into the course, selecting th new tute. If the course is already full, students would be unwilling to drop the course in case someone else grabs their place before they can re-enrol.

The second point prevents MyUNSW/NSS from being used for three important operations that would be extremely helpful to students in managing their enrolment:

Note: The people's data in this database is not real, and does not correspond to any real students or staff at UNSW. It is synthetic data, created to simulate the kinds of information that might be found in a real database.

Doing this Assignment

The following sections describe how to carry out this assignment. Some of the instructions must be followed exactly; others require you to exercise some discretion. The instructions are targetted at people doing the assignment on d.cse. If you plan to work on this assignment at home on your own computer, you'll need to adapt the instructions to local conditions.

If you're doing your assignment on the CSE machines, some commands must be carried out on vxdb02, while others can (and probably should) be done on a CSE machine other than vxdb02. In the examples below, we'll use vxdb02$ to indicate that the comand must be done on vxdb02 and cse$ to indicate that it can be done elsewhere.

Setting Up

In addition to the database dump file, you are also provided a template Python files, and Python and SQL helper files.

The "template files" aim to save you some time in writing Python code. E.g. they do handle the command-line arguments and let you focus on the database interaction.

The helpers.py and helpers.sql files are provided in case you want to define Python functions or PLpgSQL functions that might be useful in several of your scripts.
You are not required to use them (i.e. you can leave them unchanged).

The template files are available in a single ZIP or TAR

or copy them to your CSE account with the following command:

cse$ mkdir -pm ~/COMP3311/ass2 # or any other directory of your choice
cse$ cd ~/COMP3311/ass2
cse$ cp /web/cs3311/current/assignments/ass2/files/* .

The database dump file can be downloaded HERE

or linked to your CSE account with the following command:

cse$ cd ~/COMP3311/ass2
cse$ ln -s /web/cs3311/current/assignments/ass2/database/mymyunsw.dump .

Now you can setup and use your database, eg:

vxdb02$ source /localstorage/$USER/env
vxdb02$ p1
vxdb02$ createdb mymyunsw
vxdb02$ psql mymyunsw -f ~/COMP3311/ass2/mymyunsw.dump
vxdb02$ psql mymyunsw

mymyunsw> SELECT * FROM ...
# ect
mymyunsw> \q

# after writing code in q1.py (if using vscode not on vxdb02)

vxdb02$ python3 ~/COMP3311/ass2/q1.py
# hopefully some output
vxdb02$ p0
vxdb02$ logout

Your Tasks

Q0.  Style Mark (2 marks)

Style mark.

Ugly, inconsistent layout of SQL queries and PLpgSQL functions will be penalised. It's hard to layout Python3 code wrong, given that indentation replaces brackets, but if you manage to make your Python code ugly, that will also be penalised. You should ensure that your Python variable names are understandable and consistent.

Q1.  OrgUnit Type Summary (4 marks)

Write a Python script q1.py that for every faculty in orgunits, reports:

Additional requirements:

Output format:

Faculty                                 #Schools #Staff

You can assume that at least one faculty will exist.

You can assume schools will be below faculties directly (no recursion needed).

Q2.  Longest Increasing Average-Mark Run (5 marks)

Write a Python script q2.py that, for a given subject code, prints the longest strictly increasing sequence of term average marks for that subject.

Q3.  Person Information (6 marks)

In q3.py, write a script that given a zID, provides information about the person that zID belongs to.

If the course title is over 40 characters, truncate the title to its first 40 characters.

If either of the mark or grade is null, print a "-", right-aligned, where mark or grade would normally go.

What to print for uoc and how to use the grades and marks to determine the WAM is given in the Grades + Rules page. The precise format of the output will be available in the Examples page.

Note that there are two UOC totals in this question:

WAM = weighted_mark_sum / total_attempted_uoc If the WAM is not computable due to a denominator of 0, print instead of WAM = wam, the phrase "Can't compute WAM".

* Do the rounding with SQL instead of Python, due to the inconsistency of floats.

Q4.  Subject Filter and Listing (6 marks)

Write a Python script that takes in one argument (a semicolon-separated list of filter conditions) that finds subjects that match the conditions, and prints them.

Output format

Code      Title                                                    UoC    Career

Q5.  Progression Check (7 marks)

Note: Please read the pinned forum post about this question for some previously missing details

Write a Python script q5.py to show a student's progression through their program/stream, and what they still need to do to complete their degree. The script takes three command line parameters:

python3  q5.py  StudentID  [ ProgramCode  StreamCode ]

If no program and stream is given, use the program for the student's most recent enrolment (determined by term starting date , and then largest enrolment id) and grab the first stream that appears in the program's stream requirements.
If a program is provided, but no stream, get the first stream that appears in the program's stream requirements (first stream of the first id-wise program requirement of type "stream").

The script already checks the validity of the command-line arguments.

The progression check should start with a two-line heading.

zID FamilyName, GivenNames
ProgramCode StreamCode ProgramName
Then check if the stream is part of the program's requirement if both stream and program are provided. If not, print
StreamCode is not a stream in ProgramCode

Otherwise, the output should look like this

CourseCode  Term  CourseTitle  Mark  Grade  UOC  NameOfRequirement
Use this f-string to get the formatting right:
f"{CourseCode} {Term} {SubjectTitle:<40s}{Mark:>3} {Grade:>2s}  {UOC:2d}uoc  {NameOfRequirement}"

The order should be initially by term, then by course code within the term. If either of the mark or grade is null, print a "-", right-aligned, where grade or mark would normally go.

You should keep track of which courses and how many UOC in which requirements have been completed. After the line for each of the courses taken, you should display a sequence of lines indicating which core courses have not been completed, and how many UOC from each group of electives remains to be done.

If you consider each requirement as a bucket, then the process of determining which requirement a course satisfies, is a process of determining which bucket a particular course belongs in. If the bucket for the most appropriate requirement is full, the course cannot be allocated to that requirement, and a new requirement must be sought. In the "worst" case, the course will end up in the free electives bucket. If the free electives bucket is full, and if all of the other buckets that the course potentially be allocated to are also full, then the course cannot be allocated to any requirement and does not count toward the degree. Such courses should have 0 UOC against them and have a note "Cannot be allocated".

The strategy for ordering the "to be completed" info

In other words, most specific to least specific.

Within groups (e.g stream Core's), order by Requirements.id. For Core requirements, print remaining UOC and the course codes and names of any not yet completed courses, in the order they appear in the group definition. For all other rule types, print remaining UOC and the name of the group. If a student has completed all UOCs for a rule, then no information on this rule needs to be printed.

If a student has satisfied all rules and enough UOC for the program, you should print

Eligible to graduate

instead of the "to be completed" text.

More details on the precise output format for rules will be available in the Examples page.

This question will not be tested against any programs that have no stream requirements. See the bottom of Grades+Rules to some programs and streams that have a proper set of requirements.

Submission and Testing

You can find the simple test cases in the Examples page.

Note that there is a time-limit of 2 seconds for each script.