CSCI 265: Fall 2018 Final Exam

Question Mark
1. Bash functions
[10 marks]
 
2. Bash and regular expressions
[10 marks]
 
3. Makefiles
[10 marks]
 
4. Perl functions
[10 marks]
 
5. Automated testing with perl
[10 marks]
 
6. Debugging and debuggers
[10 marks]
 
7. Test case development
[10 marks]
 
8. Life cycle models and specifications
[10 marks]
 
9. Gtk and code inspections
[10 marks]
 
10. Profiling and profilers
[10 marks]
 
Exam Total

[100 marks]

 

Question 1: Bash functions [10]

Write a bash function to explore one or more directories whose names are passed as parameters (as usual, paths are a valid part of directory names). The specific behaviour is as follows:

Sample Solution
# assumes user calls toplevel, passing the directory name(s),
# toplevel first figures out if a -r flag is anywhere in the parameter list
#    and then goes through the directory names calling the explore function
#    (letting explore know whether to do it recursively or not)
function toplevel()
{
   local rec=false  # will set to true if we see a -r in parameter list
   for dir in "$@" ; do
      if [ $idr = "-r" ] ; then
         rec=true
      fi
   done

   for dir in "$@" ; do
       if ! [ $dir = "-r" ] ; then
          explore $dir $rec
       fi
   done
}

# explore expects a directory name and a true/false flag telling it
#    whether or not to explore recursively
function explore()
{
   local dir=$1;
   local rec=$2;
   if [ -d $dir ] && [ -r $dir ] && [ -x $dir ] ; then
      # see if there is a makefile present, and if so
      #     temporarily switch to that directory and run the make command
      local mfile="$dir"/makefile
      if [ -f $mfile ] ; then
         (cd $dir && make all CC=g++)
      else
         # no makefile, so go through any .cpp files and compile them
         for cfile in "$dir"/*.cpp ; do
             # -f test needed to catch case *.cpp when there is no .cpp file present
             if [ -f $cfile ] ; then
                (cd $dir && g++ -c $cfile)
             fi
         done
      fi

      # now do any recursive exploration if rec option in use
      if [ $rec = true ] ; then
         for subdir in "$dir"/* ; do
             if [ -d $subdir ] ; then
                explore $subdir $rec
             fi
         done
      fi
   fi
}

Question 2: Bash and regular expressions [10]

Write a bash script that uses regular expressions and the =~ operator to match each of its command line arguments against the following patterns, indicating which one(s) the argument matches:

  1. floating point numbers in exponential notation (e.g. 1.75e-9)
  2. alphanumeric strings which can also contain underscores, but never multiple underscores in a row
  3. strings beginning with an uppercase alphabetic character and ending with either a digit or a @ symbol.

Sample Solution
#! /bin/bash

pat1='^[+-]?[0-9]([.][0-9]+(e[+-]?[0-9]+)?)?$'
pat2='^([_]?[a-zA-Z0-9]+)*[_]?$'
pat3='^[A-Z].*[09-@]$'

# check each argument against every pattern
# (there might be multiple matches per argument)
for str in "$@" ; do
    if [[ $str =~ $pat1 ]] ; then
       echo "$str matches pattern 1"
    fi
    if [[ $str =~ $pat2 ]] ; then
       echo "$str matches pattern 2"
    fi
    if [[ $str =~ $pat3 ]] ; then
       echo "$str matches pattern 3"
    fi
done

Question 3: Makefiles [10]

The rules and dependencies in the makefile below are valid, but the rules themselves do not appear in the correct order.

(i) Fix the ordering and briefly (1 sentence) explain why the ordering was wrong.

(ii) Suppose that files dataFormat.h and dataFormat.cpp are actually generated by a perl script, dataGen.pl, which needs to be re-run if the file dataDesc.txt has been updated. Update the makefile to include appropriate rules and dependencies to update the dataFormat.h/cpp files when needed, assuming the command to run the perl script is "./dataGen.pl dataDesc.txt".

Formats=dataDesc.txt
ObjCompile=g++ -Wall -Wextra -c
ExeCompile=g++ -Wall
ClientObj=dataFormat.o client.o
ServerObj=dataFormat.o dataServer.o dataProcessor.o

all: clientx dataServerx

dataFormat.o: dataFormat.cpp dataFormat.h
	$(ObjCompile) dataFormat.cpp

client.o: client.h client.cpp dataFormat.h
	$(ObjCompile) client.cpp

dataServer.o: dataServer.h dataServer.cpp dataFormat.h
	$(ObjCompile) dataServer.cpp

dataProcessor.o: dataProcessor.h dataProcessor.cpp dataFormat.h
	$(ObjCompile) dataProcessor.cpp

dataServerx: $(ServerObj)
	$(ExeCompile) $(ServerObj) -o dataServerx

clientx: $(ClientObj)
	$(ExeCompile) $(ClientObj) -o clientx

Sample Solution
Formats=dataDesc.txt
ObjCompile=g++ -Wall -Wextra -c
ExeCompile=g++ -Wall
ClientObj=dataFormat.o client.o
ServerObj=dataFormat.o dataServer.o dataProcessor.o

all: clientx dataServerx

dataServerx: $(ServerObj)
	$(ExeCompile) $(ServerObj) -o dataServerx

clientx: $(ClientObj)
	$(ExeCompile) $(ClientObj) -o clientx

dataFormat.o: dataFormat.cpp dataFormat.h
	$(ObjCompile) dataFormat.cpp

client.o: client.h client.cpp dataFormat.h
	$(ObjCompile) client.cpp

dataServer.o: dataServer.h dataServer.cpp dataFormat.h
	$(ObjCompile) dataServer.cpp

dataProcessor.o: dataProcessor.h dataProcessor.cpp dataFormat.h
	$(ObjCompile) dataProcessor.cpp

dataformat.h: dataGen.pl dataDesc.txt
	./dataGen.pl ./dataDesc.txt

dataformat.cpp: dataGen.pl dataDesc.txt
	./dataGen.pl ./dataDesc.txt

Question 4: Perl functions [10]

Write a perl function called listContent that accepts one or more filenames as parameters.

For each file, the function runs the command "cat -b" on the file, capturing both the output and the exit status of the command, and then:

Sample Solution
sub listContent()
{
   foreach my $fname (@_) {

      # capture the output as one big string
      my $output = `cat -b $fname`;

      # capture the exit status of the cat command
      my $status = #?;

      # process successful runs, otherwise give error message
      if ($status == 0) {

         # split the big string into an array of lines (split on newlines)
         my @lines = split(/\n/, $output);

         # process each line, splitting it into words (split on whitespace)
         foreach my $oneline (@lines) {
            my @words = split(/\s+/, $oneline);
            print "$words[0] $words[1]\n";
         }

      } else {
         # handle the non-zero exit status from the command
         print "command failed on $fname\n";
      }
   }
}

Question 5: Automated testing with perl [10]

Suppose a perl squareRoot function is provided in module myMath.pm, where the function expects a number as a parameter and returns its square root.

It returns the string "Complex root" if the argument was a negative number, and the string "Invalid argument" if it was not passed a single numeric argument.

Write a perl test script for the function using Test::More, and including a small (6-8) set of test cases for representative instances of valid and invalid input.

Sample Solution
#! /usr/bin/perl
use strict;
use warnings;
use Test::More tests => 6;
use myMath qw(squareRoot);

is(squareRoot(4), 2, "normal integer root");
is(squareRoot(0.01), 0.1, "normal fractional root");
is(squareRoot(-1), "Complex root", "negative argument");
is(squareRoot("foo"), "Invalid argument", "non-numeric argument");
is(squareRoot(), "Invalid argument", "too few arguments");
is(squareRoot(1, 2), "Invalid argument", "too many arguments");

Question 6: Debugging and debuggers [10]

Perl scripts can be debugged using a debugger similar to gdb, by running them with the command
perl -d yourscriptname.pl any command line args
The debugger uses many of the same commands as gdb, including (b, c, n, s, p, r, q, w) and a T command that is similar to gdb's backtrace.

A perl script is shown below, along with a run showing the current behaviour the user is unhappy with.

#! /usr/bin/perl

use strict;
use warnings;

# euclid's algorithm to determine greatest common divisor of two positive integers m and n
sub gcd($$) {
   if ($_[1] == 0) { return $_[0]; }
   return &gcd($_[1], $_[0] % $_[1]);
}

# determine if the passed parameter is a perfect square integer
sub isps($) {
   my $r = int(sqrt($_[0]));
   if (($r * $r) == $_[0]) { return 1; }
   return 0;
}

# determine if passed parameter is part of the fibonacci sequence
# (known to be fib if 5n^2 +/- 4 is a perfect square)
sub isfib($) {
   my $n = 5 * $_[0] * $_[0];
   if (isps($n + 4)) { return 1; }
   elsif (isps($n - 4)) { return 1; }
   return 0;
}

# determine least common multiple of all passed parameters
# (treats anything except positive integers as 1, notifies user if
#  it calls euclid's algorithm with successive fibonacci numbers)
sub lcm() {
   my $argc = scalar @_;
   my $intpat = '^[1-9][0-9]*$';
   if ($argc < 1) { return 1; }
   elsif ($argc < 2) {
      if ($_[0] =~ $intpat) { return $_[0]; }
      else { return 1; }
   } elsif ($argc == 2) {
      if (($_[0] =~ $intpat) and ($_[1] =~ $intpat)) {
         if (isfib($_[0]) and isfib($_[1]) and isfib($_[0] + $_[1])) {
            warn("   Note: computing gcd of two successive fib numbers $_[0], $_[1]\n");
         }
         return $_[0] * ($_[1] / gcd($_[0], $_[1]));
      } else { return &lcm($_[0]) * &lcm($_[1]); }
   } else {
      my @left = @_;
      my @right = splice @left, (scalar @_ ) / 2;
      return &lcm( &lcm(@left), &lcm(@right));
   }
}

# run lcm on the list of arguments and display the results
print "computing LCM of @ARGV \n";
my $res = &lcm( @ARGV );
print "lcm is $res\n";

When user ran "lcm.pl 12 20 34 21 11 55 89 9 16" they expected warnings about pairs 34 21 and 55 89, and a result of 83880720 The actual output is shown below
computing LCM of 12 20 34 21 11 55 89 9 16 Note: computing gcd of two successive fib numbers 34, 21 Note: computing gcd of two successive fib numbers 89, 144 lcm is 83880720

Provide a specific sequence of debugger commands you would use as a starting point for exploring/understanding the behaviour of this script, along with a brief explanation of your logic. (You are not expected to actually identify/fix the problem, just show your preliminary investigation approach.)

Sample Solution
There is lots of leeway here, but at the very least the answer
should examine what the user thinks went wrong, and proceed logically
from there.

The user expected warnings about 34,21 (which they got) and
    55 89, which they didn't get -- they got 89,144 instead
The error becomes visible when these messages are printed,
    so it makes sense to examine where the messages come from,
    i.e. the "warn" line from the lcm function.
That line in turn depends on the isfib function, so that too seems
    a logical thing to examine.

Sample idea:
(1) start up the debugger
    perl -d scriptname.pl
(2) set up breakpoints at our spots of interest
    b isfib
    b 41
(3) start the program running, maybe just with the pair of values
    that failed in the example, i.e. 55,89
    r 55 89
(4) when we wind up in isfib, use n to step our way through and see
    if we say yes/no is it a fib, using p to use at the value of $n,
    doing this through both calls to isfib
(5) if we see an error in isfib, probe deeper next run (e.g. maybe
    into isps) otherwise move on to the warn line and see if it runs
(6) if we DON'T see any errors in this test run, i.e. 55,89 gets detected
    when run alone, then something more complicated is happening and we'll
    need to try a larger sequence containing 55,89

For those of you who are curious, the program is actually fine, the user was wrong in their expectations of what should happen!

Question 7: Test case development [10]

The function whose prototype is shown below is supposed to read the nth line of text from the named file and return it as a string, including the end-of-line character(s). (Note that depending on the system/tool of origin for the file, the end-of-line might be marked by any one of the following: "\r\n", "\r", or "\n".)

The function does not produce any output, and if anything goes wrong should simply return an empty string.

string readLine(string fileName, int lineNum);

Your task is to create a specific list of test cases to apply to the function. (You do not need to provide actual code to make the call/evaluate it.)

Sample Solution
I was looking for at least 10 different test cases, covering
some of the following:
  • cases where the line in question ends with \r\n, with \r, with \n
  • cases where the preceeding line(s) end with \r\n, with \r, with \n
  • cases where the line in question is the last line of the file and is incomplete (i.e. doesn't end with any of the return sequences)
  • cases where the file contains only the one line
  • cases where the file is empty
  • cases where the line is the first/last/middle of a multi-line file
  • cases where the line number is negative
  • cases where the line number is greater than the number of lines in the file
  • cases where the file doesn't exist or isn't readable
  • cases where the filename is local, given by a relative path, given by an absolute path

Question 8: Life cycle models and specifications [10]

Suppose you find yourself in the following circumstances:

Recommend a software life cycle model you would follow for this situation, justify your recommendations, and identify any particular concerns you have for your chosen model and this specific situation.

Sample Solution
The justification and discussion is far more important here than the specific
model chosen (they all have good/bad points for this scenario).  What is particularly
important is to relate your discussion to the specific issues mentioned above:
  • a small team of decent developers
    - both are good points for any methodology, but crucial for agile
  • an inexperienced team
    - makes it tougher to follow any model, but especially agile
  • one interested partner, one uninterested
    - might get feedback/input weighted towards the preferences of one partner, so the other might wind up dissatisfied
  • an overworked group of users (the staff)
    - makes it tough to get quality feedback, either to develop requirements or to cycle through agile
  • conflicting ideas about what should happen
    - means either careful development and approval for specifications early (if using waterfall), or a spiral or agile style where we can change it later
  • a poor existing system in place
    - means we do have something to work from/improve on, but if it's really a bad system then maybe we don't WANT to base on that
  • your team did't design the existing system
    - means less familiarity with the good/bad innards of the existing system, which makes basing work on that more difficult

Question 9: Gtk and code inspections [10]

Choosing either C++ or Perl, provide a list of specific code inspection points to be used when inspecting Gtk interface code written in your chosen language. Note the focus for this is strictly on the relevant Gtk/GUI aspects.

Sample Solution

Note that this is inspecting the code, NOT testing the run-time behaviour or appearance.

There are lots of potential gtk specific things to look for, including

 - ensure use of Glib true/false, gtk2/init for perl
 - check conversion to/from the generic gdata type for every such parameter
   sent/received for C++
 - check each interactive element has a handler function for each permissible action
 - check that each such action has a signal connect between it and its handler
 - check which outputs are supposed to be directed to console (as opposed to gui)
   and that they actually are
 - for each widget that returns a value, check the value is actually being
   captured and used appropriately
 - check the "show" order of items to ensure they will give the desired on screen effect
 - check the heirarchy of container widgets and that each contains the intended items

Question 10: Profiling and profilers [10]

The two tables below are extracted from a run of lab 10's graph program (on a graph of 100 nodes, 3000 edges), but omitting the analysis of the std/gnu library functions and omitting the latter half of both tables.

For the _second_ table, give a common-sense explanation of the meaning of each of the four centre columns (%time, self, children, called), and clarify the significance of why some of the names are indented within the "name" column and others are not.

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls  ms/call  ms/call  name    
 43.76      0.07     0.07     3000     0.02     0.02  print_node(int)
 25.01      0.11     0.04        1    40.01    63.35  breadth_first()
 12.50      0.13     0.02        1    20.01    20.01  mst()
  6.25      0.14     0.01    11965     0.00     0.00  add_edge(int, int, float)
  6.25      0.15     0.01     7005     0.00     0.02  processcommand(char)
  6.25      0.16     0.01        1    10.00    33.34  depth_first()

   ... remaining entries omitted for this question ...


Call graph (explanation follows) index % time self children called name [1] 100.0 0.00 0.16 main [1] 0.01 0.15 7005/7005 processcommand(char) [2] 0.00 0.00 7005/7005 getcommand() [18] 0.00 0.00 1/1 printmenu() [152] ----------------------------------------------- 0.01 0.15 7005/7005 main [1] [2] 100.0 0.01 0.15 7005 processcommand(char) [2] 0.04 0.02 1/1 breadth_first() [4] 0.01 0.02 1/1 depth_first() [5] 0.00 0.02 1/1 print_graph() [6] 0.02 0.00 1/1 mst() [7] 0.01 0.00 11965/11965 add_edge(int, int, float) [8] 0.00 0.00 12000/12000 getint() [15] 0.00 0.00 7000/7000 getfloat() [19] ----------------------------------------------- 0.02 0.00 1000/3000 depth_first() [5] 0.02 0.00 1000/3000 breadth_first() [4] 0.02 0.00 1000/3000 print_graph() [6] [3] 43.7 0.07 0.00 3000 print_node(int) [3] ----------------------------------------------- 0.04 0.02 1/1 processcommand(char) [2] [4] 39.6 0.04 0.02 1 breadth_first() [4] 0.02 0.00 1000/3000 print_node(int) [3] ----------------------------------------------- 0.01 0.02 1/1 processcommand(char) [2] [5] 20.8 0.01 0.02 1 depth_first() [5] 0.02 0.00 1000/3000 print_node(int) [3] ----------------------------------------------- 0.00 0.02 1/1 processcommand(char) [2] [6] 14.6 0.00 0.02 1 print_graph() [6] 0.02 0.00 1000/3000 print_node(int) [3] ... remaining entries omitted for this question ...

Sample Solution
The "indented" function name is the one being examined in the current block,
e.g. print_graph is the focus of the block
                0.00    0.02       1/1           processcommand(char) [2]
[6]     14.6    0.00    0.02       1         print_graph() [6]
                0.02    0.00    1000/3000        print_node(int) [3]
and within that block it is called from processcommand and it calls print_node

The "called" column lists how many times the function is called in this
particular block out of the total number of times it is called.  In the [6]
example above, print_node is called 1000 times from print_graph, out of the
3000 times it is called overall.

The "%time" column indicates what percent of the program's total cpu run time
was taken up by that function and the children function it called.

The "self" column indicates the total cpu seconds spent within that function
across all calls to it, but not counting time spent in children functions.

The "children" column indicates the total cpu seconds spent in functions called
from the function being examined on the current line.