Week 09 Tutorial Questions
Objectives
- The assignment specification doesn't fully explain the assignment - what can I do?
- How hard are the subsets?
- What does git init do? How does this differ from grip-init?
- What do git add file and grip-add file do?
- What is the index in grip (and git), and where does it get stored?
- What is a commit in grip (and git), and where does it get stored?
- Apart from the grip-* scripts what else do you need to submit (and give an example)?
- You work on the assignment for a couple of hours tonight. What do you need to do when you are finished?
- Write a Python program, tags.py, which, given the URL of a web page, fetches it by running wget(1) and prints the HTML tags it uses. Each tag should be converted to lower case, and the tags printed in alphabetical order with a count of how often each is used. Don't count closing tags. Make sure you don't print tags within HTML comments.

      ./tags.py https://www.cse.unsw.edu.au
      a 141
      body 1
      br 14
      div 161
      em 3
      footer 1
      form 1
      h2 2
      h4 3
      h5 3
      head 1
      header 1
      hr 3
      html 1
      img 12
      input 5
      li 99
      link 3
      meta 4
      noscript 1
      p 18
      script 14
      small 3
      span 3
      strong 4
      title 1
      ul 25

  Note the counts in the above example will not be current - the CSE pages change almost daily.
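One possible starting point is sketched below; it is not a model answer. It assumes the page is UTF-8 text and uses a simple regular expression rather than a real HTML parser, so a stray `<` followed by a letter inside a script body would be miscounted. The function names `fetch`, `count_tags` and `alphabetical_lines` are illustrative choices, not part of the question.

```python
#!/usr/bin/env python3
"""Sketch of tags.py: count the opening HTML tags used by a web page."""
import re
import subprocess
import sys
from collections import Counter


def fetch(url):
    """Return the page as text, fetched by running wget(1)."""
    # -q: no progress output, -O-: write the document to standard output.
    # Assumption: the page can be decoded as UTF-8 (bad bytes are replaced).
    result = subprocess.run(
        ["wget", "-q", "-O-", url],
        capture_output=True, encoding="utf-8", errors="replace", check=True,
    )
    return result.stdout


def count_tags(html):
    """Count opening tags, lower-cased, ignoring anything inside comments."""
    # Remove comments first; DOTALL lets a comment span several lines.
    html = re.sub(r"<!--.*?-->", "", html, flags=re.DOTALL)
    # "<" followed by a letter starts an opening tag; "</x>" and "<!DOCTYPE"
    # do not match because "/" and "!" are not letters.
    names = re.findall(r"<\s*([a-zA-Z][\w-]*)", html)
    return Counter(name.lower() for name in names)


def alphabetical_lines(counts):
    """Return 'tag count' lines sorted by tag name."""
    return [f"{tag} {counts[tag]}" for tag in sorted(counts)]


def main(url):
    for line in alphabetical_lines(count_tags(fetch(url))):
        print(line)


# Only fetch when a URL was supplied on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Keeping the fetching and the counting in separate functions means the counting can be tried on a small HTML string without any network access.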
- Add an -f option to tags.py which indicates the tags are to be printed in order of frequency.

      ./tags.py -f https://www.cse.unsw.edu.au
      head 1
      noscript 1
      html 1
      form 1
      title 1
      footer 1
      header 1
      body 1
      h2 2
      hr 3
      h4 3
      span 3
      link 3
      small 3
      h5 3
      em 3
      meta 4
      strong 4
      input 5
      img 12
      br 14
      script 14
      p 18
      ul 25
      li 99
      a 141
      div 161
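The option handling and the two orderings might be sketched as follows. This assumes the tag counts have already been collected into a `collections.Counter`; `parse_args` and `ordered_lines` are hypothetical helper names. The sample output above lists equal counts in no particular order, so breaking ties alphabetically here is only a choice made to keep the output deterministic.

```python
import argparse
from collections import Counter


def parse_args(argv):
    """Parse an optional -f flag followed by the URL."""
    parser = argparse.ArgumentParser(
        description="Print the HTML tags a web page uses."
    )
    parser.add_argument(
        "-f", action="store_true", dest="by_frequency",
        help="print tags in order of frequency instead of alphabetically",
    )
    parser.add_argument("url")
    return parser.parse_args(argv)


def ordered_lines(counts, by_frequency=False):
    """Return 'tag count' lines, least frequent first when by_frequency."""
    if by_frequency:
        # Sort by count, then by name (the tie-break is an assumption).
        items = sorted(counts.items(), key=lambda item: (item[1], item[0]))
    else:
        items = sorted(counts.items())
    return [f"{tag} {count}" for tag, count in items]


# Small demonstration using a few counts taken from the example output.
args = parse_args(["-f", "https://www.cse.unsw.edu.au"])
counts = Counter({"div": 161, "a": 141, "li": 99, "head": 1, "body": 1})
print("\n".join(ordered_lines(counts, by_frequency=args.by_frequency)))
```

In a real script the argument list would come from `sys.argv[1:]` rather than the hard-coded list used in the demonstration.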
- Modify tags.py to use the requests and beautifulsoup4 modules.
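A minimal sketch of that variant follows, assuming both third-party packages are installed and that Python's built-in `html.parser` backend is acceptable. The 10-second timeout and the function names are illustrative assumptions. Because Beautiful Soup represents comments as text nodes rather than tags, markup inside a comment is not counted, and the `html.parser` backend already lower-cases tag names.

```python
#!/usr/bin/env python3
"""Sketch of tags.py using requests and beautifulsoup4."""
import sys
from collections import Counter

import requests
from bs4 import BeautifulSoup


def fetch(url):
    """Return the page text, raising an exception on an HTTP error status."""
    response = requests.get(url, timeout=10)  # timeout value is an assumption
    response.raise_for_status()
    return response.text


def count_tags(html):
    """Count every element in the parsed document by tag name."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all(True) matches every tag; comments are not tags, so any
    # markup inside them is skipped.
    return Counter(tag.name for tag in soup.find_all(True))


def main(url):
    counts = count_tags(fetch(url))
    for tag in sorted(counts):
        print(tag, counts[tag])


# Only fetch when a URL was supplied on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Compared with matching tags by regular expression, the parser takes care of comments, case and closing tags, at the cost of two extra dependencies.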
- If you feel like a harder challenge after finishing the challenge activity in the lab this week, have a look at the following websites for some problems to solve using regexp: