Week 02 Tutorial Questions

Objectives

Consider the following columnar (space-delimited) data file containing (fictional) contact details for various CSE academic staff:
```
        G Heiser       Newtown      9381-1234
        S Jha          Kingsford    9621-1234
        C Sammut       Randwick     9663-1234
        R Buckland     Randwick     9663-9876
        J A Shepherd   Botany       9665-4321
        A Taylor       Glebe        9692-1234
        M Pagnucco     North Ryde   9868-6789
```
This data is fictitious.
Do not ring these phone numbers.

The data is currently sorted in phone number order.
Can we use the sort(1) filter to rearrange the data into telephone-book order (alphabetically by last name)?
If not, how would we need to change the file to achieve this?
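
One way to think about the question above: by default sort(1) splits fields on blanks, so the last name is the second field on most lines but the third on the "J A Shepherd" line, and no single `-k` key picks it out on every line. The sketch below assumes a hypothetical reworked layout (`initials:surname:suburb:phone`, with `:` as the delimiter); the layout and the variable names are illustrative choices, not part of the original file.

```shell
# Hypothetical reworked layout: initials:surname:suburb:phone
contacts='G:Heiser:Newtown:9381-1234
S:Jha:Kingsford:9621-1234
C:Sammut:Randwick:9663-1234
R:Buckland:Randwick:9663-9876
J A:Shepherd:Botany:9665-4321
A:Taylor:Glebe:9692-1234
M:Pagnucco:North Ryde:9868-6789'

# Sort on the surname (field 2), breaking ties on the initials (field 1).
# LC_ALL=C gives plain byte-order comparison, independent of the user's locale.
phone_book=$(printf '%s\n' "$contacts" | LC_ALL=C sort -t: -k2,2 -k1,1)

printf '%s\n' "$phone_book"
```

With an explicit delimiter, embedded spaces in the initials or the suburb no longer shift the field positions.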

Consider this Unix password file
(usually found in /etc/passwd):

        root:ZHolHAHZw8As2:0:0:root:/root:/bin/dash
        jas:iaiSHX49Jvs8.:100:100:John Shepherd:/home/jas:/bin/bash
        andrewt:rX9KwSSPqkLyA:101:101:Andrew Taylor:/home/andrewt:/bin/cat
        postgres::997:997:PostgreSQL Admin:/usr/local/pgsql:/bin/bash
        oracle::999:998:Oracle Admin:/home/oracle:/bin/bash
        cs2041:rX9KwSSPqkLyA:2041:2041:COMP2041 Material:/home/cs2041:/bin/bash
        cs3311:mLRiCIvmtI9O2:3311:3311:COMP3311 Material:/home/cs3311:/bin/zsh
        cs9311:fIVLdSXYoVFaI:9311:9311:COMP9311 Material:/home/cs9311:/bin/bash
        cs9314:nTn.JwDgZE1Hs:9314:9314:COMP9314 Material:/home/cs9314:/bin/fish
        cs9315:sOMXwkqmFbKlA:9315:9315:COMP9315 Material:/home/cs9315:/bin/bash

Provide a command that would produce each of the following results:

1. Display the first three lines of the file.
2. Display lines belonging to class accounts
  (assuming that class accounts have a username that starts with either "cs", "se", "bi" or "en", followed by four digits).
3. Display the username of everyone whose shell is /bin/bash.
4. Create a tab-separated file passwords.txt containing only the username and password of each user.
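
A possible set of commands for the four tasks above, in order, is sketched below. It works on a scratch copy of the sample data rather than the real /etc/passwd; the scratch locations from mktemp and the variable names are assumptions made for illustration, and other tools (awk, sed) would do the job equally well.

```shell
# Scratch copy of the sample data, so the real /etc/passwd is never touched.
passwd_file=$(mktemp)
out_dir=$(mktemp -d)
cat > "$passwd_file" <<'EOF'
root:ZHolHAHZw8As2:0:0:root:/root:/bin/dash
jas:iaiSHX49Jvs8.:100:100:John Shepherd:/home/jas:/bin/bash
andrewt:rX9KwSSPqkLyA:101:101:Andrew Taylor:/home/andrewt:/bin/cat
postgres::997:997:PostgreSQL Admin:/usr/local/pgsql:/bin/bash
oracle::999:998:Oracle Admin:/home/oracle:/bin/bash
cs2041:rX9KwSSPqkLyA:2041:2041:COMP2041 Material:/home/cs2041:/bin/bash
cs3311:mLRiCIvmtI9O2:3311:3311:COMP3311 Material:/home/cs3311:/bin/zsh
cs9311:fIVLdSXYoVFaI:9311:9311:COMP9311 Material:/home/cs9311:/bin/bash
cs9314:nTn.JwDgZE1Hs:9314:9314:COMP9314 Material:/home/cs9314:/bin/fish
cs9315:sOMXwkqmFbKlA:9315:9315:COMP9315 Material:/home/cs9315:/bin/bash
EOF

# First three lines of the file.
first_three=$(head -n 3 "$passwd_file")

# Class accounts: one of the four prefixes, exactly four digits, then the ':'
# that ends the username field.
class_accounts=$(grep -E '^(cs|se|bi|en)[0-9]{4}:' "$passwd_file")

# Usernames whose shell (the last field) is /bin/bash.
bash_users=$(grep ':/bin/bash$' "$passwd_file" | cut -d: -f1)

# Username and password only, with the ':' between them turned into a tab.
cut -d: -f1,2 "$passwd_file" | tr ':' '\t' > "$out_dir/passwords.txt"

printf '%s\n' "$first_three" "$class_accounts" "$bash_users"
cat "$out_dir/passwords.txt"
```

The final `tr` is safe here only because neither of the two remaining fields contains a `:` of its own.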

Consider this fairly standard split-into-words technique.
```
        tr -cs 'a-zA-Z0-9' '\n' < someFile
```
Explain how this command works.
What does each argument do?
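
To experiment with the command, it can help to run it on a short string instead of a file. In the sketch below, `-c` makes tr operate on the complement of the first set (every character that is not a letter or digit), the second set maps each of those characters to a newline, and `-s` squeezes each run of resulting newlines into one. The sample sentence and variable names are made up for illustration.

```shell
sample='Hello, world!  It is 2024.'

# Every run of characters outside a-zA-Z0-9 becomes a single newline,
# leaving one word per line.
words=$(printf '%s\n' "$sample" | tr -cs 'a-zA-Z0-9' '\n')

printf '%s\n' "$words"
```

Dropping `-s` and rerunning shows the empty lines that the squeeze option removes.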

What is the output of each of the following pipelines if the text:

        this is big Big BIG
        but this is not so big

is supplied as the initial input to the pipeline?

                tr -d ' ' | wc -w

                tr -cs '[:alpha:]' '\n' | wc -l

                tr -cs '[:alpha:]' '\n' | tr '[:lower:]' '[:upper:]' | sort | uniq -c
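
After predicting the output by hand, the three pipelines above can be checked with a small script along the following lines. This is only a sketch: the variable names are illustrative, and the column padding that `uniq -c` puts in front of each count differs between implementations.

```shell
text='this is big Big BIG
but this is not so big'

# Pipeline 1: delete the spaces, then count words.
word_count=$(printf '%s\n' "$text" | tr -d ' ' | wc -w)

# Pipeline 2: one alphabetic word per line, then count lines.
line_count=$(printf '%s\n' "$text" | tr -cs '[:alpha:]' '\n' | wc -l)

# Pipeline 3: one word per line, upper-cased, sorted, with duplicates counted.
freq=$(printf '%s\n' "$text" | tr -cs '[:alpha:]' '\n' | tr '[:lower:]' '[:upper:]' | sort | uniq -c)

printf 'pipeline 1: %s\n' "$word_count"
printf 'pipeline 2: %s\n' "$line_count"
printf 'pipeline 3:\n%s\n' "$freq"
```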

Consider a file containing (fake) zIDs and marks in COMP1511:
```
        4279700|61
        4212240|59
        4234024|57
        4286024|50
        4270657|75
        4227010|52
        4299716|84
        4236088|74
        4245033|87
        4222098|46
        4228842|85
        4209182|96
        4276270|61
        4224421|72
        4207416|76
```
and another file containing (fake) zIDs and marks in COMP2041:
```
        4200549|92
        4283960|77
        4203704|48
        4261741|43
        4224421|67
        4223809|75
        4276270|80
        4279700|68
        4233865|61
        4207416|56
        4209669|71
        4209182|70
        4213591|49
        4236221|53
        4201259|91
```
1. Can the files be used as-is with the join command?
  If not, what needs to be changed?
2. Write a join command which prints the marks in COMP1511 and COMP2041 of everyone who did both courses.
3. Write another join command which prints the marks in COMP1511 and COMP2041 of everyone, across both files,
  with -- shown in place of the mark where a student didn't do a particular subject.
4. Write a shell pipeline which prints the marks in COMP1511 and COMP2041 of everyone who did both courses,
  sorted by their COMP1511 mark in ascending order,
  then by their COMP2041 mark in descending order.
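
One possible approach to the join questions above is sketched here. join(1) expects both inputs to be sorted on the join field, so the sketch sorts each file on the zID first. The file names (`comp1511.txt`, `comp2041.txt`, and the `.sorted` copies), the scratch directory, and the variable names are all assumptions made for illustration.

```shell
export LC_ALL=C          # same collation for sort and join
work=$(mktemp -d)        # scratch directory; remove it when finished

cat > "$work/comp1511.txt" <<'EOF'
4279700|61
4212240|59
4234024|57
4286024|50
4270657|75
4227010|52
4299716|84
4236088|74
4245033|87
4222098|46
4228842|85
4209182|96
4276270|61
4224421|72
4207416|76
EOF

cat > "$work/comp2041.txt" <<'EOF'
4200549|92
4283960|77
4203704|48
4261741|43
4224421|67
4223809|75
4276270|80
4279700|68
4233865|61
4207416|56
4209669|71
4209182|70
4213591|49
4236221|53
4201259|91
EOF

# Sort both files on the zID (field 1) before joining.
sort -t'|' -k1,1 "$work/comp1511.txt" > "$work/comp1511.sorted"
sort -t'|' -k1,1 "$work/comp2041.txt" > "$work/comp2041.sorted"

# Students who did both courses: zID|COMP1511 mark|COMP2041 mark.
both=$(join -t'|' "$work/comp1511.sorted" "$work/comp2041.sorted")

# Everyone from either file; -a keeps unpaired lines, -o fixes the output
# columns, and -e supplies the filler for a missing mark.
everyone=$(join -t'|' -a 1 -a 2 -e '--' -o 0,1.2,2.2 \
    "$work/comp1511.sorted" "$work/comp2041.sorted")

# Both courses, by COMP1511 mark ascending, then COMP2041 mark descending.
ranked=$(printf '%s\n' "$both" | sort -t'|' -k2,2n -k3,3nr)

printf '%s\n' "$both" "$everyone" "$ranked"
```

Note that `-e` only takes effect when an explicit `-o` output list is given.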

Consider a file containing tab-separated benchmarking results for 20 programs, in three different benchmarks, all measured in seconds.
```
        program1	08	03	05
        program2	14	03	05
        program3	17	08	10
        program4	15	11	05
        program5	16	10	24
        program6	15	09	17
        program7	15	06	10
        program8	17	10	17
        program9	12	07	08
        program10	09	04	16
        program11	11	03	24
        program12	16	11	20
        program13	16	08	17
        program14	08	07	06
        program15	06	06	05
        program16	12	05	08
        program17	09	05	10
        program18	06	06	06
        program19	14	09	22
        program20	16	04	24
    
```
1. Write a sort command which sorts by the results in the second benchmark, then by the results in the first benchmark.
2. Write a sort command which sorts by the results in the third benchmark, then by the program number.
3. Write a sed command which removes the leading zeroes from the benchmark times.
4. Write a sed command which removes the benchmark results from program2 through program13.
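
A sketch of one way to tackle the four tasks above follows. It rebuilds the data in a scratch file and keeps a literal tab in a variable so the commands do not rely on `\t` being understood by every sed. Two assumptions are worth flagging: task 4 is read as deleting the whole lines for program2 through program13, and all file and variable names are illustrative.

```shell
tab=$(printf '\t')       # literal tab for use in sort and sed arguments
bench=$(mktemp)          # scratch file holding the benchmark data

# The rows are written with spaces here and converted to tabs on the way in.
tr ' ' '\t' > "$bench" <<'EOF'
program1 08 03 05
program2 14 03 05
program3 17 08 10
program4 15 11 05
program5 16 10 24
program6 15 09 17
program7 15 06 10
program8 17 10 17
program9 12 07 08
program10 09 04 16
program11 11 03 24
program12 16 11 20
program13 16 08 17
program14 08 07 06
program15 06 06 05
program16 12 05 08
program17 09 05 10
program18 06 06 06
program19 14 09 22
program20 16 04 24
EOF

# 1. Second benchmark (field 3), then first benchmark (field 2), numerically.
by_second=$(sort -t"$tab" -k3,3n -k2,2n "$bench")

# 2. Third benchmark (field 4), then the program number, which starts at
#    character 8 of field 1 (just after the 7 letters of "program").
by_third=$(sort -t"$tab" -k4,4n -k1.8,1n "$bench")

# 3. Drop a zero that immediately follows a tab, i.e. a leading zero of a time.
no_zeroes=$(sed "s/${tab}0/${tab}/g" "$bench")

# 4. Delete every line from the program2 line through the program13 line.
trimmed=$(sed "/^program2${tab}/,/^program13${tab}/d" "$bench")

printf '%s\n' "$by_second" "$by_third" "$no_zeroes" "$trimmed"
```

Anchoring the zero on the preceding tab keeps names such as program10 and program20 intact.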