COMP3311 25T2 |
Assignment 2 Python, PostgreSQL, psycopg2 |
Database Systems |
[Assignment Spec] [SQL Schema] [SQL Data] [Grades+Rules] [Examples] [Testing] [Submitting]
This assignment aims to give you practice in
You could complete this assignment with minimal use of SQL
But it is highly recommended that you use python for its intended purpose
Use SQL queries, views, and functions to filter and manipulate the data
Use Python to format and display the data
The goal is to build some useful data access operations on the mymyunsw database.
Marks: | This assignment contributes 15 marks toward your total mark for this course. |
Submission: |
via WebCMS3 or give, submit the files q1.py, q2.py, q3.py, q4.py, q5.py, helpers.py, helpers.sql |
Deadline: | Monday 4 August 2025 @ 21:59:59 |
Late Penalty: |
0.2 percent off the raw mark for each hour late, for 5 days
Any submission after 5 days scores 0 marks This is the UNSW standard late penalty. |
And, of course, if you have PostgreSQL installed on your home machine,
you can do all of your development there.
But don't forget to test it on vxdb2 before submitting.
All Universities require a significant information infrastructure in order to manage their affairs. This typically involves a large commercial DBMS installation. UNSW's student information system sits behind the MyUNSW web site. MyUNSW provides an interface to a PeopleSoft enterprise management system with an underlying Oracle database. This back-end system (Peoplesoft/Oracle) is sometimes called NSS. The specific version of PeopleSoft that we use is called Campus Solutions.
Despite its successes, however, MyUNSW/NSS still has a number of deficiencies, including:
The second point prevents MyUNSW/NSS from being used for three important operations that would be extremely helpful to students in managing their enrolment:
suggestedcourses)
The following sections describe how to carry out this assignment.
Some of the instructions must be followed exactly; others require
you to exercise some discretion. The instructions are targetted at
people doing the assignment on d.cse. If you plan to work on this
assignment at home on your own computer, you'll need to adapt the
instructions to local conditions
.
If you're doing your assignment on the CSE machines, some commands must be carried out on vxdb02, while others can (and probably should) be done on a CSE machine other than vxdb02. In the examples below, we'll use vxdb02$ to indicate that the comand must be done on vxdb02 and cse$ to indicate that it can be done elsewhere.
In addition to the database dump file, you are also provided a template Python files, and Python and SQL helper files.
The "template files" aim to save you some time in writing Python code. E.g. they do handle the command-line arguments and let you focus on the database interaction.
The helpers.py and helpers.sql files are provided in case you want to define
Python functions or PLpgSQL functions that might be useful in several of your scripts.
You are not required to use them (i.e. you can leave them unchanged).
The template files are available in a single ZIP or TAR
or copy them to your CSE account with the following command:
cse$ mkdir -pm ~/COMP3311/ass2 # or any other directory of your choice cse$ cd ~/COMP3311/ass2 cse$ cp /web/cs3311/current/assignments/ass2/files/* .
The database dump file can be downloaded HERE
or linked to your CSE account with the following command:
cse$ cd ~/COMP3311/ass2 cse$ ln -s /web/cs3311/current/assignments/ass2/database/mymyunsw.dump .
Now you can setup and use your database, eg:
vxdb02$ source /localstorage/$USER/env vxdb02$ p1 vxdb02$ createdb mymyunsw vxdb02$ psql mymyunsw -f ~/COMP3311/ass2/mymyunsw.dump vxdb02$ psql mymyunsw mymyunsw> SELECT * FROM ... # ect mymyunsw> \q # after writing code in q1.py (if using vscode not on vxdb02) vxdb02$ python3 ~/COMP3311/ass2/q1.py # hopefully some output vxdb02$ p0 vxdb02$ logout
Style mark.
Ugly, inconsistent layout of SQL queries and PLpgSQL functions will be penalised. It's hard to layout Python3 code wrong, given that indentation replaces brackets, but if you manage to make your Python code ugly, that will also be penalised. You should ensure that your Python variable names are understandable and consistent.
Write a Python script q1.py that for every faculty in orgunits, reports:
Additional requirements:
Output format:
Faculty #Schools #Staff
f"{type_name:<40}{num_schools:>8}{num_staff:>7}"
You can assume that at least one faculty will exist.
You can assume schools will be below faculties directly (no recursion needed).
Write a Python script q2.py that, for a given subject code, prints the longest strictly increasing sequence of term average marks for that subject.
Subject <SubjectCode> not found.and exit.
No increasing run found for <SubjectCode>.
<SubjectCode> (SubjectTitle): <Term1>(<Avg1>) -> ... -> <TermK>(<AvgK>)
In q3.py, write a script that given a zID, provides information about the person that zID belongs to.
zID FamilyName, GivenNames (Domestic/International student from Country) ProgramCode1 ProgramName1 (StreamCode1 and StreamCode2 and ... and StreamCodeN)
zID FamilyName, GivenNames is a staff member, and not a student.and exit. Assume this is true from the ER diagram, even without any actual implementation of the disjoint constraint.
f"{CourseCode} {Term} {SubjectTitle:<40s}{Mark:>3} {Grade:>2s} {UOC:2d}uoc"On the last line, print total UOC and WAM (rounded to 3 decimal places)*. Ignore requirements for this question.
Total achieved UOC = total_achieved_uoc, WAM = wamIf the course title is over 40 characters, truncate the title to its first 40 characters.
If either of the mark or grade is null, print a "-", right-aligned, where mark or grade would normally go.
What to print for uoc and how to use the grades and marks to determine the WAM is given in the Grades + Rules page. The precise format of the output will be available in the Examples page.Note that there are two UOC totals in this question:
WAM = weighted_mark_sum / total_attempted_uoc If the WAM is not computable due to a denominator of 0, print instead of WAM = wam, the phrase "Can't compute WAM".
* Do the rounding with SQL instead of Python, due to the inconsistency of floats.
Write a Python script that takes in one argument (a semicolon-separated list of filter conditions) that finds subjects that match the conditions, and prints them.
"uoc:>=6 && <12; career:!('PG'); title:'math' && !(' 1A' || 'Comp'); code:'10' && ('2' || '3' || '4')"This example means:
Output format
Code Title UoC Career
Write a Python script q5.py to show a student's progression through their program/stream, and what they still need to do to complete their degree. The script takes three command line parameters:
python3 q5.py StudentID [ ProgramCode StreamCode ]
If no program and stream is given, use the program for the
student's most recent enrolment (determined by term starting date , and then largest enrolment id) and grab the first stream that appears
in the program's stream requirements.
If a program is provided, but no stream, get the first stream that appears
in the program's stream requirements (first stream of the first id-wise program requirement of type "stream").
The progression check should start with a two-line heading.
zID FamilyName, GivenNames ProgramCode StreamCode ProgramNameThen check if the stream is part of the program's requirement if both stream and program are provided. If not, print
StreamCode is not a stream in ProgramCode
Otherwise, the output should look like this
CourseCode Term CourseTitle Mark Grade UOC NameOfRequirementUse this f-string to get the formatting right:
f"{CourseCode} {Term} {SubjectTitle:<40s}{Mark:>3} {Grade:>2s} {UOC:2d}uoc {NameOfRequirement}"
The order should be initially by term, then by course code within the term. If either of the mark or grade is null, print a "-", right-aligned, where grade or mark would normally go.
You should keep track of which courses and how many UOC in which requirements have been completed. After the line for each of the courses taken, you should display a sequence of lines indicating which core courses have not been completed, and how many UOC from each group of electives remains to be done.
If you consider each requirement as a bucket, then the process of determining which requirement a course satisfies, is a process of determining which bucket a particular course belongs in. If the bucket for the most appropriate requirement is full, the course cannot be allocated to that requirement, and a new requirement must be sought. In the "worst" case, the course will end up in the free electives bucket. If the free electives bucket is full, and if all of the other buckets that the course potentially be allocated to are also full, then the course cannot be allocated to any requirement and does not count toward the degree. Such courses should have 0 UOC against them and have a note "Cannot be allocated".
The strategy for ordering the "to be completed" info
In other words, most specific to least specific.
Within groups (e.g stream Core's), order by Requirements.id. For Core requirements, print remaining UOC and the course codes and names of any not yet completed courses, in the order they appear in the group definition. For all other rule types, print remaining UOC and the name of the group. If a student has completed all UOCs for a rule, then no information on this rule needs to be printed.
If a student has satisfied all rules and enough UOC for the program, you should print
Eligible to graduate
instead of the "to be completed" text.
More details on the precise output format for rules will be available in the Examples page.
This question will not be tested against any programs that have no stream requirements. See the bottom of Grades+Rules to some programs and streams that have a proper set of requirements.
You can find the simple test cases in the Examples page.
Note that there is a time-limit of 2 seconds for each script.