Experiments: advanced features
In the 5-minute tutorial, we have seen how to create a simple experiment that sorts an array and calculates the time it takes. We will modify this example in various ways to show the functionalities you can add to an experiment.
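In this section, you will learn to use advanced features of experiments: multiple experiment classes, prerequisites, error reporting, progress indicators, experiments that generate multiple data points, and experiment groups.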
Using multiple experiment classes
Suppose first that we would like to compare multiple sorting algorithms. We already created the GnomeSort experiment, so we could simply create other classes that use a different sorting algorithm, such as Bubble sort or Quick sort.
However, these experiments have a lot in common: apart from the actual sorting, everything else is similar. It is therefore wise, in the object-oriented tradition, to factor out this shared functionality into a common ancestor. Let us call it SortExperiment:
Each specific experiment now only needs to implement the sort method, which performs the actual sorting. Hence our GnomeSort experiment becomes:
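A minimal sketch of how the two classes could fit together is given below. So that it compiles on its own, it extends a small stand-in called ExperimentStub rather than LabPal's real Experiment class; the stub and its setInput, write, read and readInt helpers are assumptions made for illustration, as is the choice of random values between 0 and 999.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical stand-in for LabPal's Experiment class (not the real API).
abstract class ExperimentStub {
    private final Map<String, Object> params = new LinkedHashMap<>();

    void setInput(String name, Object value) { params.put(name, value); }
    void write(String name, Object value) { params.put(name, value); }
    Object read(String name) { return params.get(name); }
    int readInt(String name) { return ((Number) params.get(name)).intValue(); }

    abstract void execute();
}

// Common ancestor: everything except the sorting itself lives here.
abstract class SortExperiment extends ExperimentStub {
    SortExperiment(String algorithm, int size) {
        setInput("Algorithm", algorithm); // the additional input parameter
        setInput("Size", size);
    }

    @Override
    void execute() {
        // Random array of the requested size (value range is an assumption)
        int[] array = new Random().ints(readInt("Size"), 0, 1000).toArray();
        long start = System.nanoTime();
        sort(array);
        write("Duration", (System.nanoTime() - start) / 1_000_000); // milliseconds
    }

    // Each concrete experiment only supplies this method.
    abstract void sort(int[] array);
}

class GnomeSort extends SortExperiment {
    GnomeSort(int size) { super("Gnome Sort", size); }

    @Override
    void sort(int[] array) {
        int i = 0;
        while (i < array.length) {
            if (i == 0 || array[i] >= array[i - 1]) {
                i++; // pair in order: move forward
            } else {
                int tmp = array[i]; // pair out of order: swap and step back
                array[i] = array[i - 1];
                array[i - 1] = tmp;
                i--;
            }
        }
    }
}
```

Adding another algorithm then amounts to writing one more subclass with its own sort method and its own name passed to the parent constructor.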
Note that our experiment now has an additional input parameter, which contains the name of the algorithm used for sorting.
Structured in this way, it is easy to create new classes that would use other sorting algorithms (we won't show them here; look at the code examples). Our lab can then include experiments of various kinds:
We also need to change the table, since now, there are multiple experiments for the same value of parameter Size. Let us add column "Algorithm" to the table:
If you run the experiments of this new lab, the table it produces will now look like this:
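| Algorithm   | Size | Duration |
|-------------|------|----------|
| Gnome Sort  | 100  |          |
|             | 200  |          |
|             | ...  |          |
| Quick Sort  | 100  |          |
|             | 200  |          |
|             | ...  |          |
| Bubble Sort | 100  |          |
|             | 200  |          |
|             | ...  |          |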
The table automatically groups all cells with the same value of "Algorithm". (The way cells are grouped depends on the order in which the names are enumerated when creating it. See Tables.)
Prerequisites
Our new lab has one slight problem: the array is randomly generated by each experiment. This means that for a given size, the experiments do not sort the same array! We could fix this by making the random number generator deterministic (by giving the same seed every time), but a better way would be to generate the arrays of each size in advance, save them to files called array-100.txt, array-200.txt, etc., and simply have the experiments read these files when asked.
Suppose we already have these files. We could change SortExperiment so that it reads the array from the corresponding file:
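The file-reading part could look roughly like the sketch below. It relies only on the standard library; the helper class ArrayFiles, the directory argument, and the file format (one integer per line) are assumptions, since the text does not specify how the arrays are stored.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; the names are illustrative only.
class ArrayFiles {
    // File name convention from the text: array-100.txt, array-200.txt, ...
    static Path fileFor(Path dir, int size) {
        return dir.resolve("array-" + size + ".txt");
    }

    // Assumed format: one integer per line, blank lines ignored.
    static int[] readArray(Path file) {
        try {
            return Files.readAllLines(file).stream()
                    .map(String::trim)
                    .filter(line -> !line.isEmpty())
                    .mapToInt(Integer::parseInt)
                    .toArray();
        } catch (IOException e) {
            // Rethrow unchecked so callers need no throws clause
            throw new UncheckedIOException(e);
        }
    }
}
```

Inside the experiment, the call would then be something like ArrayFiles.readArray(ArrayFiles.fileFor(dir, size)) in place of the random generation.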
Now, our experiment depends on an external resource to run successfully; this is called a prerequisite. Clearly, we do not want to run an experiment if the corresponding file does not exist. It is possible to signal this to our lab by implementing a method of class Experiment called prerequisitesFulfilled. This method returns true by default, indicating that the experiment is ready to run. We can override this behaviour so that it returns false if we can't find the input file:
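As an illustration, the override could be as small as the sketch below. ExperimentStub is a hypothetical stand-in that only models the default behaviour described above (returning true); the constructor taking a directory is likewise an assumption.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for LabPal's Experiment: ready to run by default.
class ExperimentStub {
    boolean prerequisitesFulfilled() {
        return true;
    }
}

class SortExperiment extends ExperimentStub {
    private final Path inputFile;

    SortExperiment(Path dir, int size) {
        this.inputFile = dir.resolve("array-" + size + ".txt");
    }

    @Override
    boolean prerequisitesFulfilled() {
        // Not ready unless the input file is present
        return Files.isRegularFile(inputFile);
    }
}
```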
If you compile and run this lab, you will see that an experiment will have the status "Needing prerequisites", instead of "Ready", if it is missing the file it is looking for.
Generating prerequisites
Instead of creating the file by some external means, it would be even better if our lab could generate the files by itself. A first possibility would be to include code that creates the files at the beginning of its setup method, before the experiments are actually created. However, all the files would then have to be generated, even if you wish to run just a few of the experiments. Moreover, if you change the parameters of your experiments (e.g., using other values for size), you must make sure that your generation code follows suit.
Better still is to make each experiment responsible for creating its input file if it does not exist. This is done by implementing another method called fulfillPrerequisites. The process is as follows: when running an experiment, LabPal first calls prerequisitesFulfilled; if the experiment returns true, it calls execute right away. Otherwise, LabPal calls this experiment's fulfillPrerequisites method, and then executes it.
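The sequence of calls just described can be sketched as follows. The class GeneratedInputExperiment, its run method (which plays the role of LabPal here), the one-integer-per-line file format, and the use of the array size as the random seed are all assumptions made for the sake of a self-contained example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;
import java.util.stream.Collectors;

// Hypothetical experiment that creates its own input file on demand.
class GeneratedInputExperiment {
    final Path inputFile;
    final int size;
    boolean generated = false; // true if this instance created the file
    boolean executed = false;

    GeneratedInputExperiment(Path dir, int size) {
        this.size = size;
        this.inputFile = dir.resolve("array-" + size + ".txt");
    }

    boolean prerequisitesFulfilled() {
        return Files.isRegularFile(inputFile);
    }

    void fulfillPrerequisites() {
        // Fixed seed (assumed: the size) so the file is identical everywhere
        Random random = new Random(size);
        String content = random.ints(size, 0, 1_000_000)
                .mapToObj(i -> Integer.toString(i))
                .collect(Collectors.joining("\n"));
        try {
            Files.write(inputFile, content.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        generated = true;
    }

    void execute() {
        executed = true; // the real experiment would read and sort the array
    }

    // Mirrors the order of calls described in the text; not LabPal's own code.
    static void run(GeneratedInputExperiment e) {
        if (!e.prerequisitesFulfilled()) {
            e.fulfillPrerequisites();
        }
        e.execute();
    }
}
```

If two experiments share the same size, only the first one to run finds the file missing and generates it; the second goes straight to execute.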
This approach presents several advantages:
The dependency of each experiment on some external resource is made explicit.
For a given array size, the corresponding input file is generated only once, by the first instance of the experiment that is executed. For other experiments with the same value of "Size", method prerequisitesFulfilled will return true since the file is already there.
The lab does not need to include the input files; they can be generated on demand. This means that the resulting JAR file is much smaller. Moreover, since the random number generator is instantiated with a specific seed, the generated files will be identical on every machine.
The code for generating the resources is placed near the code that uses them.
It is also possible to clean the environment of temporary files that belong to an experiment, by implementing the cleanPrerequisites method. In our example, this cleanup would involve deleting the corresponding input file if it exists:
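A possible shape for this cleanup is sketched below using only the standard library. The class name CleanableSortExperiment and its constructor are hypothetical; only the method name cleanPrerequisites comes from the text.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical experiment showing only the cleanup step.
class CleanableSortExperiment {
    final Path inputFile;

    CleanableSortExperiment(Path dir, int size) {
        this.inputFile = dir.resolve("array-" + size + ".txt");
    }

    void cleanPrerequisites() {
        try {
            // Does nothing if the file is already gone
            Files.deleteIfExists(inputFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```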
This method can be called from the web console, by selecting an experiment and clicking on the Clean button.
Dealing with errors
An experiment may encounter an error during its execution, due to a problem in the environment (missing resource or program), a numeric or memory overflow, or some other reason. It is possible to signal the abnormal termination of an experiment to the user; the web console then shows a red "X" icon next to the faulty experiment.
Doing so is simple: method execute can be made to throw an exception of type ExperimentException.
If method execute terminates without throwing anything, LabPal assumes that the experiment finished successfully, and will display a green "checkmark" icon.
For example, method execute of SortExperiment could be modified to add an extra check for the presence of the required file, and throw an exception if it is not there:
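The check could be written along these lines. To keep the sketch compilable on its own, it declares a local ExperimentException class standing in for LabPal's; the class CheckedSortExperiment and the wording of the error message are assumptions.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for LabPal's ExperimentException.
class ExperimentException extends Exception {
    ExperimentException(String message) {
        super(message);
    }
}

class CheckedSortExperiment {
    private final Path inputFile;

    CheckedSortExperiment(Path inputFile) {
        this.inputFile = inputFile;
    }

    void execute() throws ExperimentException {
        if (!Files.isRegularFile(inputFile)) {
            // Abnormal termination, with a message the user can act on
            throw new ExperimentException(
                    "Input file " + inputFile.getFileName() + " does not exist");
        }
        // ... read the array, sort it and record the duration here ...
    }
}
```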
In the unlikely event that the experiment is executed without first creating the required file, an exception would be thrown with a descriptive error message. In the web console, clicking on this experiment would show this message.
Setting a progress indicator
Some experiments can take a long time to execute; in such cases, it may be desirable to have an idea of how far the experiment has progressed. To this end, each experiment has its own "progress bar" that it can update as it pleases, simply by calling method setProgression. This method takes a single value between 0 and 1, indicating the level of progression of the experiment (0 meaning not started, and 1 indicating completion).
For example, here is the use of a progress bar in the body of method sort for the ShellSort experiment class:
The interesting bit is the call to setProgression in line 4. Since the value of gap is divided by 2 on every iteration, the loop executes roughly log2(array.length / 2) times. The progress indicator is updated at every turn of the loop to the fraction of all iterations completed so far. In LabPal's web console, this progress indicator is shown as a blue bar beside the experiment that is currently running; refreshing the page will refresh the indicator.
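For readers who want something to experiment with, here is one way a Shell sort could report its progress; it is a sketch and its line numbering does not correspond to the listing discussed above. The progression field and the local setProgression method are stand-ins for what the experiment would inherit from LabPal, and counting the passes up front is just one possible way of computing the fraction.

```java
// Hypothetical, self-contained version of the ShellSort experiment's sort method.
class ShellSort {
    float progression = 0f;

    // Stand-in for the setProgression method described above
    void setProgression(float value) {
        progression = value;
    }

    void sort(int[] array) {
        // Count the passes first so each pass can report a fraction of the total
        int totalPasses = 0;
        for (int gap = array.length / 2; gap > 0; gap /= 2) {
            totalPasses++;
        }
        int pass = 0;
        for (int gap = array.length / 2; gap > 0; gap /= 2) {
            // Gapped insertion sort for the current gap
            for (int i = gap; i < array.length; i++) {
                int value = array[i];
                int j = i;
                while (j >= gap && array[j - gap] > value) {
                    array[j] = array[j - gap];
                    j -= gap;
                }
                array[j] = value;
            }
            pass++;
            setProgression((float) pass / totalPasses);
        }
    }
}
```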
Obviously, it is up to the experiment's designer to come up with a meaningful way of measuring progress. Doing so is optional, especially for experiments that are very short. If no call to setProgression is made during the execution of an experiment, the indicator will stay at 0 until the experiment is finished.
Generate multiple data points
It may be cumbersome to have one experiment for each data point you want to generate. For example, suppose you have a function f that processes numbers from 0 to 1,000, and you'd like to measure the elapsed time after each 100 values processed. One possible way would be to create an experiment like this:
To get the elapsed time after each 100, you could do:
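The sketch below illustrates this first approach without depending on LabPal. The class PrefixTimingExperiment, the placeholder body of f, and the choice of checkpoints 100, 200, ..., 1000 are assumptions; the helper totalCalls is only there to show how much work gets repeated.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical experiment producing a single data point for a given n.
class PrefixTimingExperiment {
    final int n;      // input parameter: how many values to process
    long time = -1;   // output parameter: elapsed milliseconds
    long checksum = 0;

    PrefixTimingExperiment(int n) {
        this.n = n;
    }

    // Placeholder for the function f mentioned in the text
    static long f(int x) {
        return (long) x * x;
    }

    void execute() {
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            checksum += f(i);
        }
        time = (System.nanoTime() - start) / 1_000_000;
    }

    // One experiment per data point; each one restarts from 0
    static List<PrefixTimingExperiment> createAll() {
        List<PrefixTimingExperiment> experiments = new ArrayList<>();
        for (int n = 100; n <= 1000; n += 100) {
            experiments.add(new PrefixTimingExperiment(n));
        }
        return experiments;
    }

    // Total number of calls to f if every experiment is run
    static int totalCalls(List<PrefixTimingExperiment> experiments) {
        int total = 0;
        for (PrefixTimingExperiment e : experiments) {
            total += e.n;
        }
        return total;
    }
}
```

With these ten experiments, f is called 100 + 200 + ... + 1000 = 5,500 times in total, although only 1,000 distinct values are ever processed.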
This is not very efficient. To get the time for 200 values, you need to restart the processing from the beginning, and re-process the first 100; ditto for 300, 400, etc. What we rather want is an experiment that processes the 1,000 values once, but generates multiple data points.
Fortunately, this can be done. Instead of writing a single value, an experiment can write a list of values. A more efficient way of implementing the experiment would hence be:
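A self-contained sketch of such an experiment follows. The class MultiPointExperiment and the placeholder body of f are assumptions, and the two lists are plain fields here, whereas in LabPal they would be declared as the input parameter "n" and the output parameter "Time".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical experiment producing ten data points in a single run.
class MultiPointExperiment {
    final List<Integer> n = new ArrayList<>(); // stands for input parameter "n"
    final List<Long> time = new ArrayList<>(); // stands for output parameter "Time"
    long checksum = 0;

    MultiPointExperiment() {
        for (int i = 0; i < 1000; i += 100) {
            n.add(i);       // 0, 100, ..., 900
            time.add(null); // filled in while the experiment runs
        }
    }

    // Placeholder for the function f mentioned in the text
    static long f(int x) {
        return (long) x * x;
    }

    void execute() {
        long start = System.nanoTime();
        for (int i = 0; i < 1000; i++) {
            if (i % 100 == 0) {
                // i values processed so far: record the partial running time (ms)
                time.set(i / 100, (System.nanoTime() - start) / 1_000_000);
            }
            checksum += f(i);
        }
    }
}
```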
When instantiated, this experiment creates two lists: the first contains the values 0, 100, ..., 900 and is declared as an input parameter called "n". The second is a list of nulls of the same size, declared as an output parameter called "Time".
When running, the experiment processes the numbers one by one; at every step of 100, it computes the elapsed time since the start, and replaces the corresponding null value in list "Time" by this elapsed time. The "Time" list is thus progressively filled with partial running times.
Once it has finished, this experiment has for parameters two lists of 10 elements, called "n" and "Time", like these:
n = [0, 100, 200, ..., 900]
Time = [0, 1234, 2345, ..., 123456]
When encountering such an experiment, an ExperimentTable will expand the values of these lists into as many entries as there are matching elements. In this case, this would yield the table we are looking for:
| n   | Time   |
|-----|--------|
| 0   | 0      |
| 100 | 1234   |
| 200 | 2345   |
| ... | ...    |
| 900 | 123456 |
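The expansion itself is conceptually a pairing of the two lists index by index. The sketch below only illustrates that idea; it is not the code of ExperimentTable, and the class ListExpansion and its expand method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of turning two parallel lists into table rows.
class ListExpansion {
    // Pairs up the i-th elements of both lists, one row per index
    static List<long[]> expand(List<Integer> n, List<Long> time) {
        int rows = Math.min(n.size(), time.size()); // only matching elements
        List<long[]> table = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            table.add(new long[] { n.get(i), time.get(i) });
        }
        return table;
    }
}
```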
Group experiments
Rather than have a long list of experiments, it may be desirable to organize experiments in groups. This is simple: create a Group object and put experiments inside:
As usual, don't forget to add the group to the lab.
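The overall pattern can be sketched as follows. Both classes below are simplified stand-ins written for this example (LabPal's real Group and laboratory classes are richer), experiments are represented by plain strings, and the sizes used are arbitrary.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for LabPal's Group: a named list of experiments.
class Group {
    final String name;
    final List<String> experiments = new ArrayList<>();

    Group(String name) {
        this.name = name;
    }

    Group add(String experiment) {
        experiments.add(experiment);
        return this;
    }
}

// Hypothetical stand-in for a lab, showing the two steps: fill the group, then add it.
class LabStub {
    final List<Group> groups = new ArrayList<>();

    void add(Group group) {
        groups.add(group);
    }

    void setup() {
        Group gnome = new Group("Gnome Sort");
        for (int size = 100; size <= 300; size += 100) {
            gnome.add("Gnome Sort, Size=" + size); // put experiments inside
        }
        add(gnome); // don't forget to add the group to the lab
    }
}
```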
The experiment list will now show a collapsible section heading named "Gnome Sort"; clicking on the triangle to the left will show or hide the set of experiments associated with this group.