Week 02 Laboratory Exercises

Objectives

  • Understanding use of UNIX pipelines
  • Understanding use of UNIX filters (sed, sort, uniq, cut, tr)

Preparation

Before the lab you should re-read the relevant lecture slides and their accompanying examples.

Getting Started

Set up for the lab by creating a new directory called lab02 and changing to this directory:

mkdir lab02
cd lab02

There are some provided files for this lab which you can fetch with this command:

2041 fetch lab02

If you're not working at CSE, you can download the provided files as a zip file or a tar file.

Exercise:
Sorting UNSW Enrolments

There is a template file named sorting_enrolments_answers.txt which you must use to enter the answers for this exercise.

The autotest scripts depend on the format of sorting_enrolments_answers.txt, so just add your answers; don't otherwise change the file.

The file enrolments.psv contains a list of fake CSE enrolments.

The file enrolments.psv has 9 columns of data (columns are pipe separated):

  1. UNSW Course Code

  2. UNSW zID

  3. Name

  4. UNSW Program

  5. UNSW Plan

  6. WAM

  7. UNSW Session

  8. Birthdate

  9. Sex

Each row of data represents one enrolment.
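
As a reminder of how sort treats delimited fields, here is a minimal sketch on a made-up pipe-separated list. The sample data and the variable names are hypothetical and are not taken from enrolments.psv. The -t option sets the field separator, each -k START,END restricts a key to a single field, and the n and r modifiers request numeric and reversed ordering for that key only.

```shell
# Hypothetical sample data: name|quantity|price
sample='pear|3|0.50
apple|12|0.25
fig|3|1.10
kiwi|7|0.25'

# Sort by quantity (field 2, numeric, descending), break ties by name
# (field 1, ascending), then keep only the first three lines.
top_three=$(printf '%s\n' "$sample" | sort -t'|' -k2,2nr -k1,1 | head -n 3)
printf '%s\n' "$top_three"
```

Because pear and fig both have quantity 3, the second key decides between them, so fig is the third line printed. Replacing head -n 3 with tail -n 1 would pick the last line of the sorted output instead.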

  1. Write the sort and the head or tail commands needed to print the enrolment for the student with the lowest zID.

    If the student with the lowest zID has multiple enrolments, print their enrolment in the course with the highest Course Code.

    As always autotests are available

                2041 autotest sorting_enrolments Q1
            
  2. Write the sort and the head or tail commands needed to print the first 100 enrolments ordered first by Course Code, then by zID.

    As always autotests are available

                2041 autotest sorting_enrolments Q2
            
  3. Write the sort and the head or tail commands needed to print the first 50 enrolments ordered first by Birthdate, then by Course Code, then by zID.

    As always autotests are available

                2041 autotest sorting_enrolments Q3
            
  4. Write the sort and the head or tail commands needed to print the first 25 enrolments ordered first by the decimal part of the WAM in descending order, then by zID in ascending order, then by Course Code also in ascending order.

    As always autotests are available

                2041 autotest sorting_enrolments Q4
            

When you think your program is working, you can use autotest to run some simple automated tests:

2041 autotest sorting_enrolments 

When you are finished working on this exercise, you must submit your work by running give:

give cs2041 lab02_sorting_enrolments sorting_enrolments_answers.txt

before Tuesday 27 February 12:00 (midday) (2024-02-27 12:00:00) to obtain the marks for this lab exercise.

Exercise:
Counting UNSW classes

There is a template file named counting_classes_answers.txt which you must use to enter the answers for this exercise.

The autotest scripts depend on the format of counting_classes_answers.txt, so just add your answers; don't otherwise change the file.

The file classes.tsv contains a list of CSE classes.

The file classes.tsv has 7 columns of data (columns are tab separated):

  1. UNSW course code

  2. UNSW class id

  3. CSE class type

  4. Number of enrolled students

  5. Class enrolment cap

  6. Class time

  7. Class Location

Each row of data represents one class.
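
Most counting questions follow the same shape: extract a column, sort it so equal values are adjacent, then count. The sketch below shows that shape on a made-up tab-separated list. The data and variable names are hypothetical and are not taken from classes.tsv.

```shell
# Hypothetical sample data: animal<TAB>colour
sample=$(printf 'cat\tblack\ndog\tbrown\ncat\twhite\nemu\tgrey\ndog\tblack\ncat\tgrey\n')

# How many distinct animals are there?
# cut -f1 takes the first tab-separated field; uniq only merges adjacent
# duplicates, so the input must be sorted first.
distinct=$(printf '%s\n' "$sample" | cut -f1 | sort | uniq | wc -l)

# Which animal appears most often, and how many times?
# uniq -c prefixes each distinct line with its count;
# sort -rn then puts the largest count first.
most_common=$(printf '%s\n' "$sample" | cut -f1 | sort | uniq -c | sort -rn | head -n 1)

echo "$distinct"
echo "$most_common"
```

The exact padding in front of the count printed by uniq -c and wc -l varies between systems. When two counts tie, the order produced by sort -rn depends on its fallback comparison of the whole line, so a deterministic tie-break needs an explicit second sort key.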

  1. Write a shell pipeline which will print how many classes there are.

    As always autotests are available

                2041 autotest counting_classes Q1
            
  2. Write a shell pipeline which will print how many different courses have classes.

    As always autotests are available

                2041 autotest counting_classes Q2
            
  3. Write a shell pipeline which will print the course with the most classes, and how many classes are in this course.

    If there are multiple courses with the same number of classes, print the course that is alphabetically first.

    As always autotests are available

                2041 autotest counting_classes Q3
            
  4. Write a shell pipeline which will print the two rooms most frequently used by non-LAB CSE classes and how often they are used.

    If there are multiple rooms that are used by the same number of non-LAB CSE classes, print them in alphabetical order.

    As always autotests are available

                2041 autotest counting_classes Q4
            
  5. Write a shell pipeline which will print the most common day in the week and hour in the day for classes to start and how many classes start at that time.

    If there are multiple days and times that are used by the same number of classes, print the day and time that is alphabetically first.

    As always autotests are available

                2041 autotest counting_classes Q5
            
  6. Write a shell pipeline which will print the latest time a class will finish.

    As always autotests are available

                2041 autotest counting_classes Q6
            
  7. Write a shell pipeline which will print a list of the course codes of COMP courses that run 2 or more classes of the same type starting at the same time on the same day.
    (e.g. three tuts starting Monday at 10:00).

    As always autotests are available

                2041 autotest counting_classes Q7
            

When you think your program is working, you can use autotest to run some simple automated tests:

2041 autotest counting_classes 

When you are finished working on this exercise, you must submit your work by running give:

give cs2041 lab02_counting_classes counting_classes_answers.txt

before Tuesday 27 February 12:00 (midday) (2024-02-27 12:00:00) to obtain the marks for this lab exercise.

Exercise:
Editing C Source Files

There is a template file named editing_programs_answers.txt which you must use to enter the answers for this exercise.

The autotest scripts depend on the format of editing_programs_answers.txt, so just add your answers; don't otherwise change the file.

The file program.c contains a C library implementing some simple sorting algorithms.
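
The questions below rely on a handful of sed commands: s for substitution, d for deletion, p (with -n) for printing, and address ranges. Here is a minimal sketch of each on a made-up settings file. The text and variable names are hypothetical and are not taken from program.c.

```shell
# Hypothetical input text
sample='# settings v1
name = demo
# NOTE remove me
debug = on
begin block
  inner line
end block
port = 8080'

# s/old/new/ replaces the first match on each line; add g for every match.
renamed=$(printf '%s\n' "$sample" | sed 's/demo/example/')

# /regex/d deletes every line matching the regex.
without_note=$(printf '%s\n' "$sample" | sed '/^# NOTE/d')

# -n suppresses the default output, so only lines selected by p are printed.
assignments=$(printf '%s\n' "$sample" | sed -n '/ = /p')

# /start/,/end/ addresses a range of lines, inclusive at both ends.
without_block=$(printf '%s\n' "$sample" | sed '/^begin/,/^end/d')

printf '%s\n' "$assignments"
```

The last printf prints only the three lines containing " = ". Each command reads standard input here, but sed accepts a file name as a final argument as well.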

  1. Write a sed command to change all the sort-related functions from V1 to V2.
    This includes all relevant comments.

    As always autotests are available

                2041 autotest editing_programs Q1
            
  2. Write a sed command to remove all single line comments starting with TODO or FIXME.

    As always autotests are available

                2041 autotest editing_programs Q2
            
  3. Write a sed command to print all lines starting with extern.

    As always autotests are available

                2041 autotest editing_programs Q3
            
  4. Write a sed command to replace all include statements using "" with <>.

    As always autotests are available

                2041 autotest editing_programs Q4
            
  5. Write a sed command to remove the main function.

    As always autotests are available

                2041 autotest editing_programs Q5
            

When you think your program is working, you can use autotest to run some simple automated tests:

2041 autotest editing_programs 

When you are finished working on this exercise, you must submit your work by running give:

give cs2041 lab02_editing_programs editing_programs_answers.txt

before Tuesday 27 February 12:00 (midday) (2024-02-27 12:00:00) to obtain the marks for this lab exercise.

Challenge Exercise:
Exploring Regular Expression Extensions

There is a template file named advanced_ab_answers.txt which you must use to enter the answers for this exercise.

The autotest scripts depend on the format of advanced_ab_answers.txt, so just add your answers; don't otherwise change the file.

Use grep -P to test your answers to these questions.

These questions can't be solved using the standard regular expression language described in lectures.
The following commands may provide useful information:

    man 1 grep
    info grep
    man 7 regex
    perldoc perlre

We've provided a set of test cases in input.txt
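
For a feel of what -P adds, here is a small sketch of one Perl-compatible extension, the lookahead assertion, on a made-up word list that is unrelated to input.txt. A lookahead (?=...) checks that a pattern matches at the current position without consuming any characters, which lets a single regex require several independent conditions. This assumes a grep built with PCRE support, as GNU grep usually is; the variable names are hypothetical.

```shell
# Hypothetical word list
words='banana
cherry
grape
orange'

# Lines containing both an "a" and an "e", in either order.
# Each lookahead is tested from the start of the line and consumes nothing.
both=$(printf '%s\n' "$words" | grep -P '^(?=.*a)(?=.*e)')
printf '%s\n' "$both"
```

Only grape and orange are printed. Lookaheads are just one of the extensions documented in perldoc perlre; the question below may call for a different one.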

  1. Write a grep -P command that prints the lines in a file named input.txt containing only the characters A and B such that there are exactly n A's followed by exactly n B's and no other characters.

    Matching:     AAABBB  AB  AABB  AAAAAAAAAABBBBBBBBBB
    Not matching: AAABB  ABBBBB  AAAAAA  AABBAB

    As always autotests are available

                2041 autotest advanced_ab
            

When you think your program is working, you can use autotest to run some simple automated tests:

2041 autotest advanced_ab 

When you are finished working on this exercise, you must submit your work by running give:

give cs2041 lab02_advanced_ab advanced_ab_answers.txt

before Tuesday 27 February 12:00 (midday) (2024-02-27 12:00:00) to obtain the marks for this lab exercise.

Submission

When you have finished each exercise, make sure you submit your work by running give.

You can run give multiple times. Only your last submission will be marked.

Don't submit any exercises you haven't attempted.

If you are working at home, you may find it more convenient to upload your work via give's web interface.

Remember you have until Week 3 Tuesday 12:00:00 (midday) to submit your work.

You cannot obtain marks by e-mailing your code to tutors or lecturers.

You can check the files you have submitted here.

Automarking will be run by the lecturer several days after the submission deadline, using test cases different to those autotest runs for you. (Hint: do your own testing as well as running autotest.)

After automarking is run by the lecturer you can view your results here. The resulting mark will also be available via give's web interface.

Lab Marks

When all components of a lab are automarked you should be able to view the marks via give's web interface or by running this command on a CSE machine:

2041 classrun -sturec