Question | Mark
1. Inspections and version control [12 marks] |
2. Perl Test [12 marks] |
3. Perl scripting [12 marks] |
4. Bash scripting [12 marks] |
5. Makefiles, compilation, and version control [12 marks] |
6. Profiling: gprof [12 marks] |
7. Specifications, testing, and life cycles [12 marks] |
8. Regular expressions [12 marks] |
9. Design choices [12 marks] |
Exam Total [96 marks] |
Question 1: Inspections and version control [12]
One of the code inspection issues we discussed dealt with keeping track of discussion points during asynchronous code inspections. If we allow members of the inspection team to discuss the item under inspection through email, messaging, voice mail, etc., then ensuring all members of the team have the same information about the discussions can be a challenge.
One solution is to include all inspection discussion content in our version control process. Your task is to detail a process a team could follow that permits asynchronous inspections, but keeps the "discussion" content under version control. Be sure to discuss the strengths and weaknesses of your proposed process.
Sample solution

Process:
- When posting general observations/discussion points, team members are expected to post them to the shared document INSERT DOCUMENT LOCATION/ID HERE.
- All team members are expected to check that document before contacting any other inspection team members with questions.
- If members of the inspection team discuss the material with each other (by phone, email, chat, etc.), the person who initiated the discussion is responsible for posting the salient points (especially any action items) to the shared document before the end of that working day.
- Each morning, the lead of the inspection team will review the previous day's changes to the shared document, and ensure the latest copy is placed in the version control repository with an appropriate commit message.

Strengths:
- ensures the communication is under version control
- assigns direct responsibility for ensuring the content is updated on the shared document and for checking said document before initiating new discussions
- ensures the document is reviewed daily by the lead

Weaknesses:
- relies on team members to remember to update the shared document, and to remember (take notes on) all the key points from any conversations with other inspection team members
- relies on team members to check the shared document before initiating new discussions
- requires the lead to spend the beginning of each day on inspection administration
Question 2: The Perl Test module [12]
Using Perl's Test module (either Test::Simple or Test::More), write a Perl script, complete with test cases, to test the Perl function whose prototype is shown below.
# inRange(lo, hi, val)
# --------------------
# inRange returns true (1) if lo, hi, and val are all numeric values
# where lo <= val <= hi
# returns false (0) otherwise
sub inRange($$$);

Assume the function is provided in module mathstuff.pm.
Sample solution

#! /usr/bin/perl
use strict;
use warnings;
use mathstuff;

# the collection of test cases, each of form
#    [ lo, hi, val, expected, casename ]
my @Tests;
BEGIN {
    @Tests = (
        # cases that should pass
        [ 1,    3,   2,   1, "valid: 1<=2<=3" ],          # basic test, integers
        [ 0.1, 0.3, 0.2,  1, "valid: 0.1<=0.2<=0.3" ],    # basic test, floats
        [ 1,    1,   1,   1, "valid: 1<=1<=1" ],          # test ok for equality
        [ -3,  -1,  -2,   1, "valid: -3<=-2<=-1" ],       # test negatives

        # cases that should fail
        [ 2,    3,   1,   0, "errcheck: too low, ints" ],        # too low, integers
        [ 1,    2,   3,   0, "errcheck: too high, ints" ],       # too high, integers
        [ 0.2, 0.3, 0.1,  0, "errcheck: too low, floats" ],      # too low, floats
        [ 0.1, 0.2, 0.3,  0, "errcheck: too high, floats" ],     # too high, floats
        [ -2,  -1,  -3,   0, "errcheck: too low, negatives" ],   # test negatives
        [ -3,  -2,  -1,   0, "errcheck: too high, negatives" ],  # test negatives
        [ "x",  3,   2,   0, "errcheck: lo not a number" ],      # lo is non-numeric
        [ 1,   "x",  2,   0, "errcheck: hi not a number" ],      # hi is non-numeric
        [ 1,    3,  "x",  0, "errcheck: val not a number" ],     # val is non-numeric
    );
}

use Test::More tests => scalar @Tests;

foreach my $testCase (@Tests) {
    my $tname = pop @$testCase;
    my $exp   = pop @$testCase;
    my $val   = pop @$testCase;
    my $hi    = pop @$testCase;
    my $lo    = pop @$testCase;
    ok( inRange($lo, $hi, $val) == $exp, $tname );
}
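The exam states that inRange is provided by mathstuff.pm, but that module is not shown. Purely so the test script above can be exercised, a minimal sketch of such a module might look like the following; the use of Scalar::Util's looks_like_number for the numeric check is an assumption, not something the question mandates.

# mathstuff.pm -- minimal sketch so the test script can be run;
# this is NOT the official module, any implementation matching the
# prototype and description would do
package mathstuff;

use strict;
use warnings;
use Scalar::Util qw(looks_like_number);
use Exporter 'import';
our @EXPORT = qw(inRange);

# inRange(lo, hi, val)
# returns 1 if lo, hi, and val are all numeric and lo <= val <= hi,
# returns 0 otherwise
sub inRange($$$) {
    my ($lo, $hi, $val) = @_;
    foreach my $arg ($lo, $hi, $val) {
        return 0 unless defined($arg) && looks_like_number($arg);
    }
    return ($lo <= $val && $val <= $hi) ? 1 : 0;
}

1;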
Question 3: Perl scripting [12]
Below you are given a function that takes a file extension and a directory as arguments. It returns the number of files with that extension that are in the directory. For example, fileCount("pl", "/bin"); would return the total number of .pl files in the /bin directory.
# fileCount(extension, directory)
# -------------------------------
# returns a count of the number of files in the specified directory
# that have the specified file extension
sub fileCount($$) {
    my $extension = $_[0];
    my $directory = $_[1];

    # if the directory cannot be opened then terminate early
    opendir (my $handle, $directory) or
        die "Error: could not open directory $directory\n";

    # read the directory contents, checking each file against the pattern
    my $count = 0;
    my $pattern = "\\.$extension\$";
    while ( readdir $handle ) {
        my $nextfile = $_;
        if ($nextfile =~ $pattern) {
            $count++;
        }
    }
    closedir $handle;

    return $count;
}
Your task is to revise the function so that, instead of taking a single file extension as the first argument, it takes a hash reference whose keys are the file extensions to be processed.
The revised function returns a hash reference in which the value associated with each key is the number of files with that extension.
For example, if the keys in the hash were 'pl' and 'cpp' then the value associated with key 'pl' would be a count of all the .pl files, and the value associated with key 'cpp' would be a count of all the .cpp files.
Sample solution

# fileCount(extensionsHashRef, directory)
# ---------------------------------------
# given a reference to a hash whose keys are a set of file extensions,
# and a directory to examine,
# count the number of files with each extension,
# setting the associated hash value
# returns a reference to a hash whose values are the counts of files
# in the specified directory with each extension
sub fileCount($$) {
    my $extHash = $_[0];
    my %extensions = %$extHash;
    my $directory = $_[1];

    # start every extension's count at zero
    foreach my $extension (keys %extensions) {
        $extensions{$extension} = 0;
    }

    # if the directory cannot be opened then terminate early
    opendir (my $handle, $directory) or
        die "Error: could not open directory $directory\n";

    # read the directory contents, checking each file against each extension
    while ( readdir $handle ) {
        my $nextfile = $_;

        # compare the current file to each of the extensions
        foreach my $extension (keys %extensions) {
            my $pattern = "\\.$extension\$";

            # if the file has the desired extension,
            # increment the count for that extension
            if ($nextfile =~ $pattern) {
                $extensions{$extension}++;
            }
        }
    }
    closedir $handle;

    return \%extensions;
}
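If the revised function were saved in a script, a short driver like the one below could exercise it; the extension set and the directory path here are illustrations only, not part of the question.

# example driver, appended after the fileCount definition above;
# the extensions and the directory path are only illustrations
my %wanted  = ( pl => 0, cpp => 0 );            # extensions to count
my $results = fileCount( \%wanted, "/tmp" );

foreach my $ext (sort keys %$results) {
    print "$ext: $results->{$ext} file(s)\n";
}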
Question 4: Bash scripting [12]
Write a bash script that takes a directory as a parameter
and recursively searches all subdirectories to locate and display the locations
of all git repositories anywhere within that directory tree.
Notes:
(1) You do not have to search hidden directories (names beginning with .)
(2) Testing for the presence of .git will be adequate for deciding if
a directory is a repository.
(3) Remember that in bash you can use the following to process the files
in a directory
for file in DIRECTORYNAME/*
Sample solution

# bash script to recursively search a directory for git repositories,
# which will be recognized by the presence of .git directories

# recursive function to search for repositories
function repoSearch() {
    # check for null directory names
    local dir=$1
    if [ -z "$dir" ] ; then
        return 0

    # see if it actually is a directory
    elif [ -d "$dir" ] ; then

        # if it contains a git repository then display it
        local gitname="$dir/.git"
        if [[ -d $gitname ]] ; then
            echo "git repository: $dir"

        # otherwise see if it contains any subdirectories to search
        else
            for file in "$dir"/*; do
                if [[ -d $file ]] ; then
                    repoSearch "$file"
                fi
            done
        fi
    fi
}

# start directory should be specified as a lone command line argument,
# check the number of arguments then call the recursive search function
if [ $# -ne 1 ] ; then
    echo "Expected use to search for git repositories is:"
    echo "    repoSearch dirname"
else
    repoSearch "$1"
fi
Question 5: Makefiles, compilation, and version control [12]
(1) Modify the makefile shown below to include a 'submit' target. Submit's associated actions are to perform a git add/commit/push sequence for the current directory.
The makefile must include a Message variable, which holds the message to be used in the commit. This allows the user to override the default commit message, which should be "Committing via make submit".
progX: prog.o
	g++ -Wall prog.o -o progX

prog.o: prog.h prog.cpp
	g++ -Wall -c prog.cpp

(2) Discuss any problems/risks you can think of associated with the use of your "make submit".
Sample solution

Message = "Committing via make submit"
Sources = progX prog.cpp prog.h

progX: prog.o
	g++ -Wall prog.o -o progX

prog.o: prog.h prog.cpp
	g++ -Wall -c prog.cpp

submit:
	for file in ${Sources} ; do \
	    git add $$file ; \
	done
	git add -u
	git commit -m ${Message}
	git push
Question 6: Profiling: gprof [12]
Study the gprof output shown below, and explain what the results tell you about the memory and cpu utilization in the program, particularly with regard to the behaviour of the c function (since it takes up most of the cpu utilization) and its relationships with the functions that call it.
gprof -b results

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
100.76      8.83     8.83       26     0.34     0.34   c(long)
  0.00      8.83     0.00        2     0.00     0.00   p(long)
  0.00      8.83     0.00        1     0.00     6.79   f(long)
  0.00      8.83     0.00        1     0.00     1.70   g(long)
  0.00      8.83     0.00        1     0.00     0.34   h(long)

Call graph

granularity: each sample hit covers 2 byte(s) for 0.11% of 8.83 seconds

index % time    self  children    called     name
Sample solution

Memory use:
- gprof does not reveal information about memory use; for that we would need to use a different profiler (e.g. valgrind)

CPU use:
- as noted in the question, the overwhelming majority of cpu use takes place in function c
- of the 26 calls to c, 20 come from function f, 5 from function g, and one from function h
- the cpu use in c attributable to calls coming from f, g, and h is shown below, along with the number of calls, from which we calculate the cpu use per call:

    caller   calls   cpu use   calculated use per call
    f          20      6.79      0.34
    g           5      1.70      0.34
    h           1      0.34      0.34

- from this we can conclude that the per-call running time of c is independent of which function is calling it
Question 7: Specifications, testing, and life cycles [12]
In lectures, the approaches we discussed mostly followed the idea that test cases should be based on the specifications - seeing if the product actually does what we said it would.
Test-based development, on the other hand, is a process in which the specifications are very vague initially, and over time the users give us more and more test cases they want the program to pass. Effectively, we keep re-designing the product to handle the old cases plus the new ones, and they keep giving us more detailed and specialized test cases.
For instance, if they want us to build a function that checks if phone numbers are valid, they might start by giving us a local 250 phone number. Then, when we give them a version that succeeds on that, they give us a long distance number for Canada/US (1-xxx-xxx-xxxx). When we give them a version that handles that, they give us a long distance number for Australia. When it handles that, they give us a local number with an extension. Etc. etc.
Give specific examples of programs and scenarios for which test-based development would be particularly useful and explain why, then give examples of programs and scenarios for which test-based development would be a poor choice and explain why.
Sample solution

Suitable for test-based development:
- small, flexible systems with a co-operative user base happy with a gradual/incremental development approach, e.g.
    - a small home business website
    - a text editor (the increments may be the new desired editing features)
- systems where the users can recognize and give examples of what they need, but have a hard time (or lack the time for) documenting/articulating their needs, e.g.
    - a small company transitioning from a paper-based business model
- systems with straightforward core functionality, but a great many special cases to be dealt with, e.g.
    - a system for calculating sale prices/discounts where the company has a great many negotiated deals
    - an advising system for determining what courses a student has left to complete ;-)

NOT suitable for test-based development:
- complex systems with heavily inter-related data elements or process components, where a poor early design decision will likely have massive effects later in the project, e.g.
    - complex multiplayer games
    - operating systems
- systems that have low tolerance for risk (safety-critical, real-time, high security, etc.), e.g.
    - financial systems
    - flight control systems
Question 8: Regular expressions [12]
Pick one of the two scripting languages we focused on this term (Perl or Bash), identify which one you chose, and provide a suitable regular expression for each of the patterns described below:
(I) any non-empty string that does not contain a newline
(II) an integer with an optional leading +/- sign and no leading zeros
(III) a double-quoted string: a string that begins and ends with a double quote, with no double quotes in between
Sample solution

# perl versions

# Part I
# any non-empty string that does not contain a newline
my $noNewline = '^[^\n]+$';

# Part II
# an optional +/-
# followed by one or more 1-9's
# followed by zero or more 0-9's
my $noLeadZero = '^[+-]?[1-9][0-9]*$';

# Part III
# ^\"    so it begins with a "
# \"$    so it ends with a "
# [^\"]* so inside can be anything except a "
my $quotedStr = '^\"[^\"]*\"$';
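As a quick, informal check (not required by the question), the three patterns could be exercised against a few sample strings; the strings below are illustrations only.

#! /usr/bin/perl
use strict;
use warnings;

# the three patterns from the sample solution above
my $noNewline  = '^[^\n]+$';
my $noLeadZero = '^[+-]?[1-9][0-9]*$';
my $quotedStr  = '^\"[^\"]*\"$';

# informal spot checks; the sample strings are illustrations only
print "'abc' matches noNewline\n"      if "abc"   =~ /$noNewline/;
print "'-42' matches noLeadZero\n"     if "-42"   =~ /$noLeadZero/;
print "'042' rejected by noLeadZero\n" unless "042" =~ /$noLeadZero/;
print "'\"hi\"' matches quotedStr\n"   if '"hi"'  =~ /$quotedStr/;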
Question 9: Design choices [12]
Explain the meaning of the claim below, then make arguments both for and against the claim.

Claim: increasing a component's controllability and observability (to make it more testable) reduces its cohesion and increases its coupling.
Sample solution

Argument for:
- Cohesion measures how tightly integrated the logic is within individual components. If we introduce controllability (which is the ability to set or control the value of internal program data values from "outside" the component) or observability (which is the ability to observe internal component data values from outside), then we are introducing elements that are less tightly integrated with the rest of the component. This links its workings with the outside world and thus lowers its cohesion.
- Coupling measures the amount of interaction between components. To increase controllability or observability from outside a component, we are necessarily increasing its coupling with the external system, since we are adding a new data channel into or out of the component.

Argument against:
- Sweeping generalizations are rarely entirely correct (note the sweeping generalization there ;-). It may be the case that well integrated mechanisms to get/set data values already exist within a program, but did not need to be fully exploited to achieve the basic program functionality. Now, for the purposes of testing, we could introduce greater controllability and/or observability by adding new components, which are themselves nicely cohesive and loosely coupled, and which make use of those get/set mechanisms without significantly impacting either the cohesiveness or coupling of the existing components.