CSCI 265: Fall 2019 Final Exam Sample Solutions

Question                                        Mark

 1. Git, version control                        [10 marks]
 2. Recursive exploration in bash               [10 marks]
 3. Specifications and drivers in bash          [10 marks]
 4. C++, gdb and debugging                      [10 marks]
 5. C macros, C++ templates                     [10 marks]
 6. Code inspections, gtk in C++                [10 marks]
 7. Regular expressions in bash                 [10 marks]
 8. Profiling                                   [10 marks]
 9. Test automation with bash                   [10 marks]
10. Project implementation discussion           [10 marks]

Exam Total (best 9)                             [90 marks]

 

Question 1: Git, version control [10]

Suppose we have the following revision history for a makefile in different branches of a repository:

  1. the original was created in the master branch, looking like this:
    CC=g++
    WARN=-Wall -Wextra
    
    .PHONY: all clean
    
    all: bin/proj
    
    bin/proj: obj/data.o obj/proj.o obj/process.o
    	${CC} ${WARN} obj/data.o obj/proj.o obj/process.o -o bin/proj
    
    obj/data.o: src/data.cpp src/data.h
    	${CC} ${WARN} -c src/data.cpp -o obj/data.o
    
    obj/proj.o: src/proj.cpp src/proj.h
    	${CC} ${WARN} -c src/proj.cpp -o obj/proj.o
    
    obj/process.o: src/process.cpp src/process.h
    	${CC} ${WARN} -c src/process.cpp -o obj/process.o
    
    clean: ;
    	rm -f obj/data.o obj/process.o obj/proj.o bin/proj

  2. then a new branch, Alternate, was created, and the makefile was modified to have different content in the two branches (master and Alternate) as follows:

    # revised master makefile
    CC=g++
    WARN=-Wall -Wextra
    
    .PHONY: all clean
    
    all: bin/proj
    
    bin/proj: obj/data.o obj/proj.o obj/process.o
    	${CC} ${WARN} obj/data.o obj/proj.o obj/process.o -o bin/proj
    
    obj/%.o: src/%.cpp src/%.h
    	${CC} ${WARN} -c $< -o $@
    
    clean: ;
    	rm -f obj/data.o obj/process.o obj/proj.o bin/proj

    # revised alternate makefile
    CC=g++
    WARN=-Wall -Wextra
    
    OFILES=obj/dbg.o obj/data.o obj/process.o obj/proj.o
    HFILES=src/dbg.h src/data.h src/process.h src/proj.h
    TARG=bin/proj
    
    .PHONY: all clean
    
    all: ${TARG}
    
    bin/proj: ${OFILES} ${HFILES}
    	${CC} ${WARN} ${OFILES} -o bin/proj
    
    obj/dbg.o: src/dbg.cpp src/dbg.h
    	${CC} ${WARN} -c src/dbg.cpp -o obj/dbg.o
    
    obj/data.o: src/data.cpp src/data.h
    	${CC} ${WARN} -c src/data.cpp -o obj/data.o
    
    obj/proj.o: src/proj.cpp src/proj.h
    	${CC} ${WARN} -c src/proj.cpp -o obj/proj.o
    
    obj/process.o: src/process.cpp src/process.h
    	${CC} ${WARN} -c src/process.cpp -o obj/process.o
    
    clean: ;
    	rm -f ${OFILES}  ${TARG}

    (i) Give the git command(s) needed to merge the Alternate branch into the master.

    (ii) Show (as accurately as you can) what the makefile will look like after git identifies the merge conflict.

    (iii) Revise the conflicted makefile, appropriately blending the two sets of changes.

    Sample solution
    
    (i) git checkout master
        git merge Alternate
    
    (ii) The key here is to note that it will show the content common to both versions,
         plus the content that appears in just the master branch and the content that
         appears in just the Alternate branch, with the two conflicting blocks separated
         by marker lines of <<<<<<<, =======, and >>>>>>>.

         Content appearing in both:

            CC=g++
            WARN=-Wall -Wextra

            .PHONY: all clean

         Content appearing in just the master:

            all: bin/proj

            bin/proj: obj/data.o obj/proj.o obj/process.o
            	${CC} ${WARN} obj/data.o obj/proj.o obj/process.o -o bin/proj

            obj/%.o: src/%.cpp src/%.h
            	${CC} ${WARN} -c $< -o $@

            clean: ;
            	rm -f obj/data.o obj/process.o obj/proj.o bin/proj

         Content appearing in just the Alternate:

            all: ${TARG}

            bin/proj: ${OFILES} ${HFILES}
            	${CC} ${WARN} ${OFILES} -o bin/proj

            obj/dbg.o: src/dbg.cpp src/dbg.h
            	${CC} ${WARN} -c src/dbg.cpp -o obj/dbg.o

            obj/data.o: src/data.cpp src/data.h
            	${CC} ${WARN} -c src/data.cpp -o obj/data.o

            obj/proj.o: src/proj.cpp src/proj.h
            	${CC} ${WARN} -c src/proj.cpp -o obj/proj.o

            obj/process.o: src/process.cpp src/process.h
            	${CC} ${WARN} -c src/process.cpp -o obj/process.o

            clean: ;
            	rm -f ${OFILES}  ${TARG}
    (iii) The key here is to recognize that we need the variables from the Alternate
          branch but the pattern .o rule from master, giving:

          CC=g++
          WARN=-Wall -Wextra

          OFILES=obj/dbg.o obj/data.o obj/process.o obj/proj.o
          HFILES=src/dbg.h src/data.h src/process.h src/proj.h
          TARG=bin/proj

          .PHONY: all clean

          all: ${TARG}

          bin/proj: ${OFILES} ${HFILES}
          	${CC} ${WARN} ${OFILES} -o bin/proj

          obj/%.o: src/%.cpp src/%.h
          	${CC} ${WARN} -c $< -o $@

          clean: ;
          	rm -f ${OFILES}  ${TARG}
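
    Once the conflicted makefile has been edited as in part (iii), the merge is
    completed in the usual way, e.g. (assuming the file is simply named makefile):

        git add makefile
        git commit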

    Question 2: Recursive exploration in bash [10]

    Write a bash function that recursively explores a directory tree as follows: given a single argument (a directory path), the function counts the regular files in that directory and in each of its ancestor directories up to the root, and displays the total count (displaying 0 if it is given the wrong number of arguments or if the argument is not a readable, executable directory).

    Sample solution
    
    function explore()
    {
       if [ $# -ne 1 ] ; then
          echo "0"
          return 0
       fi
    
       # in case we're given a path that revisits a directory multiple times, e.g.
       #    /foo/blah/../blah/../blah/somewhere
       # we'll try to extract a real absolute path from the one we're given
       # use local variables so nothing leaks out of the recursive calls
       local dir count file parent rec
       dir=$(realpath -s "$1")
       if [[ -d "$dir" && -x "$dir" && -r "$dir" ]] ; then
          # count files in this directory
          count=0
          for file in "$dir"/* ; do
              if [ -f "$file" ] ; then
                 (( count++ ))
              fi
          done
    
          # see if we've reached the root (no recursive call necessary, just output the count)
          if [ "$dir" == "/" ] ; then
             # done
             echo "$count"
             return 0
    
          else   # otherwise make a recursive call on parent, capture the echo'd count from that
             parent="$(dirname "$dir")"
             rec=$(explore "$parent")
             rec=${rec%$'\n'}        # removes trailing newline from captured output
             (( count += rec ))      # add counts from this dir and from recursive call
             echo "$count"
             return 0
          fi
       fi
    
       # default error handling case
       echo "0"
       return 0
    }
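
     For example (the path below is purely an illustrative placeholder), once the
     function has been defined in the current shell it could be used as:

        total=$(explore /home/user/projects)
        echo "files found along the path to /: $total"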
    

    Question 3: Specifications and drivers in bash [10]

    Given the specifications for the bash function, factor, below:
    (i) Write a bash script to act as a driver in testing factor (do NOT write factor).
    (ii) Write a suitable set of test data to use with your driver in testing factor.

    # function factor is defined in file factor.sh
    # factor expects a single parameter, which must be an integer greater than 0
    # if passed an incorrect parameter, or an incorrect number of parameters,
    #    then factor displays the value 0 (NOTHING ELSE) and returns 1
    # if passed a valid parameter, N,
    #    then factor displays the prime factors of N, separated by spaces, and returns 0
    # e.g. factor(12) displays 2 2 3
    

    Sample solution
    
    #! /bin/bash
    
    # driver for factor function
    # --------------------------
    # this driver expects two or more parameters:
    #    first is a filename, file contains expected output from factor
    #    second is an int, the expected return value from factor
    #    all remaining parameters are supposed to be passed in the call to factor
    #
    # driver runs factor on the parameter list,
    #    captures the output and return value,
    #    compares both to expected,
    #    and displays a pass/fail result
    #    in the event of a failure it details what aspect(s) failed
    
    # check params
    if [ $# -lt 2 ] ; then
       echo "correct use is $0 filename retval otherargs"
       echo "   where the filename contains expected output"
       echo "   the retval is the expected return value"
       echo "   and the otherargs are to be passed to function factor"
       echo "$0 will then call the function and compare its results to those expected"
       echo "   and display pass/fail results"
       exit 0
    fi
    
    expout=$1
    shift
    
    expret=$1
    shift
    
    if ! [[ -f "$expout" && -r "$expout" ]] ; then
       echo "invalid expected-output file: $expout"
       exit 0
    fi
    
    intpat='^[0-9]+$'
    if ! [[ $expret =~ $intpat ]] ; then
       echo "invalid expected-return value: $expret"
       exit 0
    fi
    
    tmpf=$(mktemp)
    dfile=$(mktemp)

    # factor itself is defined in factor.sh (per the specification), so pull that in
    source ./factor.sh

    factor "$@" > "$tmpf" 2>&1
    retv=$?
    outp=$(cat "$tmpf")
    
    # test for failure by return value
    if [ "$retv" -ne "$expret" ] ; then
       echo "failed: expected return $expret, actual $retv"
    fi

    # test for failure by output differences
    if ! diff "$tmpf" "$expout" > "$dfile" 2>&1 ; then
       echo "failed: output differences"
       cat "$dfile"
    fi

    # test for success (both the output and the return value matched)
    if [ "$retv" -eq "$expret" ] && diff "$tmpf" "$expout" > /dev/null 2>&1 ; then
       echo "passed"
    fi
    
    
    (ii) Each test case below shows the call to the driver (expected-output file,
         expected return value, then the argument(s) to pass through to factor),
         along with what the expected-output file for that case should contain.

         # one:                     exp1 contains    1
         driver exp1 0 1
         # a prime number, 3:       exp3 contains    3
         driver exp3 0 3
         # a composite number, 12:  exp12 contains   2 2 3
         driver exp12 0 12
         # a non number, foo:       expfoo contains  0
         driver expfoo 1 foo
         # zero:                    exp0 contains    0
         driver exp0 1 0
         # a negative integer, -1:  exp-1 contains   0
         driver exp-1 1 -1
         # a non-integer, 5.7:      exp5.7 contains  0
         driver exp5.7 1 5.7
         # multiple args:           multi contains   0
         driver multi 1 1 2 3
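
    For instance, the composite-number case could be set up and run as follows
    (assuming the driver above is saved as an executable script named driver,
    and factor.sh is in the current directory):

        echo "2 2 3" > exp12     # expected output of factor for 12
        ./driver exp12 0 12      # should report "passed"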

    Question 4: C++, gdb and debugging [10]

    Suppose we have started gdb on a program (suitably compiled) containing the following segment of C++ code:
    int f(int m, int n)
    {
       int lim=sqrt(n);
       while (m <= lim) {
          if (!(n%m)) return m;
          m++;
       }
       return n;
    }
    
    void g(int n)
    {
       int m = 2;
       while (n > 1) {
          m = f(m,n);
          n = n / m;
          std::cout << m << std::endl;
       }
       std::cout << n << std::endl;
    }
    
    void h()
    {
       std::cout << "testing 12" << std::endl;
       g(12);
       std::cout << "done" << std::endl;
    }
    As accurately as you are able, show the program/gdb output that would result from the following sequence of commands (assuming that g is called from function h, shown above, and that no other program input/output takes place prior to first encountering the breakpoint).

    For each gdb command, be sure to clearly indicate which line/instruction in the program gdb pauses on.

    gdb q4x 
    break g 
    run 
    next 
    step 
    s 
    n 
    print m 
    p n 
    s 
    s 
    p m 
    p n 
    clear g 
    break f 
    c 
    p m 
    p n 
    

    Sample solution
    
    I did NOT expect you to get the exact output character for character,
    but I did expect you to recognize which instructions gdb would stop on,
    and what actions it would take at each point.
    
    One thing that was very important was to make it clear, at every point,
    which precise source code instruction you were currently operating on.
    
    > gdb q4x 
       ... assorted gdb license, help, initialization info ...
    (gdb) break g 
    Breakpoint 1 at 0xa43: file q4.cpp, line 17. 
    (gdb) run 
    Starting program: PATH/q4x  
    testing 12 
     
    Breakpoint 1, g (n=12) at q4.cpp:17 
    17	   int m = 2; 
    (gdb) next 
    18	   while (n > 1) { 
    (gdb) step 
    19	      m = f(m,n); 
    (gdb) s 
    f (m=2, n=12) at q4.cpp:7 
    7	   int lim=sqrt(n); 
    (gdb) n 
    8	   while (m <= lim) { 
    (gdb) print m 
    $1 = 2 
    (gdb) p n 
    $2 = 12 
    (gdb) s 
    9	      if (!(n%m)) return m; 
    (gdb) s 
    13	} 
    (gdb) p m 
    $3 = 2 
    (gdb) p n 
    $4 = 12 
    (gdb) clear g 
    Deleted breakpoint 1  
    (gdb) break f 
    Breakpoint 2 at 0x5555555549fe: file q4.cpp, line 7. 
    (gdb) c 
    Continuing. 
    2 
     
    Breakpoint 2, f (m=2, n=6) at q4.cpp:7 
    7	   int lim=sqrt(n); 
    (gdb) p m 
    $5 = 2 
    (gdb) p n 
    $6 = 6 
    

    Question 5: C macros, C++ templates [10]

    A segment of C++ code is provided below. Rewrite it, staying as close to the original as possible except for the following:

    1. Replace the C-style macro with an inline function.
    2. Make the class into a templated class.
    3. Provide an explicit instantiation of the class for doubles.
    #define ROOT(x)  (((x)>=0)?(sqrt(x)):(sqrt(fabs(x))))
    
    class Q5 {
       public:
          Q5() { }
          ~Q5() { }
          int look() { return data; }
          void set(int c);
       protected:
          int data;
    };
    
    void Q5::set(int c)
    {
       data = c;
    }

    Sample solution
    
    Note the question above has the corrected content for the question - you were given a
    fair amount of freedom to come up with a valid macro/inline function and
    valid templated class and explicit instantiation.
    
    // part 1
    inline double root(double x)
    {
       if (x >= 0) return sqrt(x);
       else return sqrt(fabs(x));
    }

    // part 2
    template <class T>
    class Q5 {
       public:
          Q5() { }
          ~Q5() { }
          T look() { return data; }
          void set(T c);
       protected:
          T data;
    };

    template <class T>
    void Q5<T>::set(T c)
    {
       data = c;
    }

    // part 3
    template class Q5<double>;
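
    A quick usage check (not part of the required answer; assumes <cmath> and
    <iostream> are included along with the definitions above):

        int main()
        {
           Q5<double> q;
           q.set(root(-9.0));                    // root(-9) == sqrt(fabs(-9)) == 3
           std::cout << q.look() << std::endl;   // prints 3
        }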

    Question 6: Code inspections, gtk in C++ [10]

    Suppose you are tasked with developing an inspection checklist for C++ code that delivers visual elements using GTK. Specifically, your checklist is to be used by an inspector whose perspective is that of tester.

    Provide a list of inspection items that are specific to both gtk and the tester perspective, and justify the choices of those specific items.

    Sample solution
    
    Note: as a tester, I would want the code to contain as many "hooks"
       as possible, to support controllability and observability:
       I want to be able to easily force specific scenarios and observe
       their results as directly as possible, to make my life as a tester
       easier (i.e. making it easier to create specialized suites of
       test files and scripts).
    
    Key points:
     - is it VERY clear:
         - what widgets we have
         - what events can take place with each of them
           (what actions are available to the user, what should then happen)
         - which handler is invoked for each event on each widget
       I would want this very clearly laid out and commented in the code,
       and in a very consistent/predictable fashion
     - related to the above:
         - have any widget/event/handler combinations been missed?
         - does the casting of gdata elements in each handler exactly match
           the casting from the data source?
         - for each event that involves data capture, is the appropriate
           input checking/response being invoked?
     - with respect to the "composition" of widgets, again, is it clearly and
       consistently commented/laid out and correctly handled:
         - are the correct items composed in the correct layout?
         - are the items initially shown/hidden as per the specifications?
         - for events that are supposed to hide/unhide components, is the
           correct code in place to do so?

    Question 7: Regular expressions in bash [10]

    Provide bash regular expressions for each of the following:

    1. the set of non-empty alphanumeric identifiers that do not begin with a digit

    2. the set of fixed point real numbers that have at least one digit before and after the decimal point

    3. the set of strings that end in 1 to 7 uppercase alphabetic characters

    4. the set of strings that contain fewer than 3 digits and do not contain the characters '@' or '#'

    Sample solution
    
    # part 1
    '^[a-zA-Z][a-zA-Z0-9]*$'
    
    # part 2 (note the optional +/-, and that the . needs to be in [])
    '^[+-]?[0-9]+[.][0-9]+$'

    # part 3 (note we really only need to specify what's at the end of the string)
    '[A-Z]{1,7}$'

    # part 4   concept:
    #   possibly stuff that isn't @, #, or digits
    #   then possibly a digit
    #   then possibly more stuff that isn't @, #, or digits
    #   then possibly a digit
    #   then possibly more stuff that isn't @, #, or digits
    '^[^@#0-9]*[0-9]?[^@#0-9]*[0-9]?[^@#0-9]*$'
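
    Any of these can be sanity-checked in bash with the =~ operator, e.g. a quick
    test of the part 2 pattern (the sample strings below are just illustrative):

        pat='^[+-]?[0-9]+[.][0-9]+$'
        for s in 3.14 -0.5 42 .5 1. ; do
           if [[ $s =~ $pat ]] ; then
              echo "$s matches"
           else
              echo "$s does not match"
           fi
        done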

    Question 8: Profiling [10]

    Suppose we have a functional version of a product ready for testing.

    The data sets the product works on involve a large number of files, each of which contains many lines of data.

    The client has not yet converted all of their data to the file format and directory structure that will be used by the final product - they have smaller sets of test data ready, but nothing approaching the scale that will be used in production.

    As part of testing, we wish to begin profiling our product to determine if it meets the requirements for time and memory efficiency.

    Provide a plan/process for carrying out preliminary profiling until realistic data sets are actually available. Discuss the strengths and weaknesses of your plan.

    Sample solution
    
    We should get more specific details on the time/memory constraints,
     - is it cpu time, real time, response time, worst case time, average time, ...?
     - is it peak memory use, average memory use, heap use, stack use, ...?
    
    Then we should identify the tools we will use to capture the relevant data,
       e.g. possibly valgrind with massif to capture memory data
                 and gprof to capture cpu data
    
    Then we need to profile the time and memory use across a variety of sizes
       of data sets, so we can extrapolate from those to predict utilization
       across larger sets.
    
         At the moment, the only "real" data we have are small subsets of the
         client data that has been converted to the needed format.  We can do
         our best to work with the smallest-through-largest of these as a
         starting point for our extrapolations, but there is a good chance
         this won't give us enough data for meaningful extrapolation.
    
         This means we need to try to generate larger pseudo-random sets of
         test data, which means we'll need to interact with the client for ideas
         on the best way to generate those, and come up with a parameterized
         script or program we can run (using the parameters to specify the
         desired "size" of test set).
    
    To carry this out on a meaningful number of test cases and
       plot the resulting data, we'll need a collection of scripts to
        - compile the program for gprof,
          run the program,
          run gprof on the results,
          filter the results and produce the data we can actually use
        - compile the program for valgrind/massif,
          run valgrind,
          run ms_print on the results,
          filter the results and produce the data we can actually use
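
    As a rough sketch only (the script name, make variables, program arguments and
    report names below are all assumptions, not requirements), the gprof/massif
    helpers might look something like:

        #! /bin/bash
        # profile.sh <dataset> <label> : one instrumented run over one data set
        dataset=$1
        label=$2

        # rebuild with gprof instrumentation (assumes the makefile honours CXXFLAGS/LDFLAGS)
        make clean
        make CXXFLAGS="-pg" LDFLAGS="-pg"

        # cpu profile: the instrumented run writes gmon.out, gprof turns it into a report
        bin/proj "$dataset"
        gprof bin/proj gmon.out > "cpu-$label.txt"

        # memory profile: massif snapshot of heap use, rendered with ms_print
        valgrind --tool=massif --massif-out-file="massif-$label.out" bin/proj "$dataset"
        ms_print "massif-$label.out" > "mem-$label.txt"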
    
    Finally, we can probably use a coordination script that
      - runs the test set generation
      - runs the gprof script
      - runs the valgrind script
    
    With all that in place, we can start generating and tabularizing
    our actual data, and use that as the basis for extrapolating the
    actual time and memory utilization.
    
    None of these scripts should be particularly complex, so the time
    required by our team should not be excessive.  This should allow us
    to begin generating and analyzing data long before the client has
    completed their data changeover.
    

    Question 9: Test automation with bash [10]

    Write a bash test script for a program, P, that expects two command line arguments: a filename and an integer specifying the number of lines of text in the file.

    Your bash script takes a command line argument, N, and prompts the user to enter N lines of data, each containing the arguments to be passed for one call to P.

    Your script reads each of the N lines, and for each it also prompts the user to enter the name of a file containing the expected output of the program for that set of arguments.

    If a file of expected output does not exist or is not a readable file then the script prints an error message and skips the test case; otherwise the script runs P on that set of arguments, captures its output, compares it to the expected output, and reports whether the test case passed or failed (displaying the differences when it fails).

    Sample solution
    #! /bin/bash
    
    # here we assume the name of the program under test and the number of tests
    #    to run are both given as command line arguments
    # (in the question these were P and N respectively)
    progname=$1
    numtests=$2
    
    # keep prompting user for
    #    line of data
    #    name of expected output files
    ntest=0
    while [ $ntest -lt $numtests ] ; do
       echo "enter a line of arguments to be passed to $progname"
       read -e args
       echo "enter the name of the expected-output file"
       read expout
    
       if [[ -f "$expout" && -r "$expout" ]] ; then
    
          tmpf=$(mktemp)
          dfile=$(mktemp)
    
          $progname $args > "$tmpf" 2>&1

          # test for failure by output differences
          if diff "$tmpf" "$expout" > "$dfile" 2>&1 ; then
             echo "passed"
          else
             echo "failed"
             cat "$dfile"
          fi

          # clean up this test case's temporary files
          rm -f "$tmpf" "$dfile"
    
       else
          echo "invalid file $expout, skipping this test case"
       fi
    
       # increment test count
       (( ntest++ ))
    done
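
    A sample invocation (the script and program names here are only placeholders)
    might then look like:

        ./q9test.sh ./P 3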
    

    Question 10: Project implementation discussion [10]

    As one of the phase 2 changes for the course project, we added the requirement that grade displays be broken down by category, e.g. showing the total combined marks for the labs, the total combined marks for assignments, etc., as well as the overall total mark and lettergrade.

    Carefully explain your implementation of this specific feature in your project solution, and the specific challenges that arose from the addition of this requirement following phase 1.

    (If you did not complete the implementation of this requirement, then carefully explain your intended/planned solution approach, and the resulting challenges you still had to deal with.)

    Sample solution
    
    This of course depends heavily on the nature of your project solution.
    While marking, the instructor will have a copy of your solution open to
    better understand and evaluate your response.
    
    The sample solution is based on the posted phase 1 sample solution.
    
    The phase 1 sample solution actually decomposed the marks by category already:
     - When loading the marks file, an instance of the category class
       was constructed each time a new category name was encountered.
     - Once the title, category, weights, and maxmarks lines had all been read,
       the build of each category class was completed to hold all the information
       for that category, as well as recording where in the student marks vectors
       each result for that category could be found.
     - When evaluating a student's marks, their vector of marks was passed to a method
       in each category, which then computed the resulting contribution that category
       made to the total mark.  (The same function also returned a true/false result,
       indicating whether the student passed or failed that category, to support the
       use of the bonus rules later.)
     - In phase 1, the contributions for each category were simply added together
       to obtain and then display the overall mark.
    
    To revise this for phase 2, the category code did not need to change at all;
       we simply had to display each category contribution as it was calculated
       (in addition to still displaying the overall total).
    
    The challenge was thus effectively dealt with in phase 1, in determining how
       each category could know where "its marks" were in the student vector.
       This was actually done by having a position vector associated with the
       category, recording which positions in the student marks vector contained
       results relevant to this category.
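
    As a rough illustration only (the class layout and names below are invented for
    this explanation, not taken verbatim from the sample solution's code), the
    category bookkeeping described above might look something like:

        // hypothetical sketch of a category that knows where "its" marks live
        #include <cstddef>
        #include <string>
        #include <vector>

        class Category {
           public:
              // positions records which entries of a student's marks vector belong to this category
              Category(std::string name, std::vector<int> positions,
                       std::vector<double> maxmarks, double weight)
                 : name(name), positions(positions), maxmarks(maxmarks), weight(weight) { }

              // compute this category's contribution to the student's total mark,
              // and report (via passed) whether the student passed the category
              double contribution(const std::vector<double>& studentMarks, bool& passed) const
              {
                 double earned = 0, possible = 0;
                 for (std::size_t i = 0; i < positions.size(); i++) {
                    earned += studentMarks[positions[i]];
                    possible += maxmarks[i];
                 }
                 double frac = (possible > 0) ? (earned / possible) : 0;
                 passed = (frac >= 0.5);        // assumed pass threshold, purely illustrative
                 return frac * weight;          // weighted contribution toward the overall mark
              }

           private:
              std::string name;
              std::vector<int> positions;       // where this category's results sit in the marks vector
              std::vector<double> maxmarks;     // max mark for each of those entries
              double weight;                    // category weight toward the total mark
        };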