Experiments: advanced features

In the 5-minute tutorial, we have seen how to create a simple experiment that sorts an array and calculates the time it takes. We will modify this example in various ways to show the functionalities you can add to an experiment.

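In this section, you will learn to use advanced features of experiments: multiple experiment classes, prerequisites, error handling, progress indicators, multiple data points per experiment, and experiment groups.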

Using multiple experiment classes

Suppose first that we would like to compare multiple sorting algorithms. We already created the GnomeSort experiment, so we could simply create other classes that use a different sorting algorithm, such as Bubble sort or Quick sort.

However, these experiments have a lot in common: apart from the actual sorting, everything else is the same. It is therefore wise, in the object-oriented tradition, to factor out the shared functionality into a common ancestor. Let us call it SortExperiment:

public abstract class SortExperiment extends Experiment {
  public SortExperiment(int n) {
    setInput("Size", n);
  }

  public void execute() {
    // Generate random array of given size
    Random rand = new Random();
    int n = readInt("Size");
    int[] array = new int[n];
    for (int i = 0; i < n; i++)
      array[i] = rand.nextInt();
    // Sort
    long start = System.currentTimeMillis();
    sort(array);
    long end = System.currentTimeMillis();
    write("Time", end - start);
  }

  public abstract void sort(int[] array);
}

Each specific experiment now only needs to implement the sort method, which performs the actual sorting. Hence our GnomeSort experiment becomes:

public class GnomeSort extends SortExperiment {
  public GnomeSort(int n) {
    super(n);
    setInput("Algorithm", "Gnome Sort");
  }

  public void sort(int[] array) {
    int i = 0;
    while (i < array.length) {
      if (i == 0 || array[i-1] <= array[i]) i++;
      else {int tmp = array[i]; array[i] = array[i-1]; array[--i] = tmp;}
    }
  }
}

Note that our experiment now has an additional input parameter, which contains the name of the algorithm used for sorting.

Structured in this way, it is easy to create new classes that would use other sorting algorithms (we won't show them here; look at the code examples). Our lab can then include experiments of various kinds:

...
add(new GnomeSort(size), t);
add(new QuickSort(size), t);
add(new BubbleSort(size), t);
...
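For illustration, the sorting loop of a BubbleSort class could be as short as the one below. This is a standalone sketch: BubbleSortSketch and its static sort method are hypothetical names chosen so that the snippet compiles without LabPal. In the lab, the same loop would presumably become the body of sort in a class extending SortExperiment, whose constructor also sets the "Algorithm" input.

```java
import java.util.Arrays;

// Standalone sketch; in the lab this loop would be the body of
// sort(int[]) in a subclass of SortExperiment (assumption).
class BubbleSortSketch {
  static void sort(int[] array) {
    boolean swapped = true;
    // After each pass the largest remaining element is in its final place,
    // so the unsorted prefix shrinks by one. Stop early if nothing moved.
    for (int end = array.length - 1; end > 0 && swapped; end--) {
      swapped = false;
      for (int i = 0; i < end; i++) {
        if (array[i] > array[i + 1]) {
          int tmp = array[i];
          array[i] = array[i + 1];
          array[i + 1] = tmp;
          swapped = true;
        }
      }
    }
  }

  public static void main(String[] args) {
    int[] data = {5, 2, 9, 1, 5, 6};
    sort(data);
    System.out.println(Arrays.toString(data)); // [1, 2, 5, 5, 6, 9]
  }
}
```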

We also need to change the table, since now, there are multiple experiments for the same value of parameter Size. Let us add column "Algorithm" to the table:

ExperimentTable t = new ExperimentTable("Algorithm", "Size", "Time");

If you run the experiments of this new lab, the table it produces will now look like this:

The table automatically groups all cells with the same value of "Algorithm". (The way cells are grouped depends on the order in which the names are enumerated when creating it. See Tables.)

Prerequisites

Our new lab has one slight problem: the array is randomly generated by each experiment. This means that for a given size, the experiments do not sort the same array! We could fix this by making the random number generator deterministic (by giving the same seed every time), but a better way would be to generate the arrays of each size in advance, save them to files called array-100.txt, array-200.txt, etc., and simply have the experiments read these files when asked.

Suppose we already have these files. We could change SortExperiment so that it reads the array from the corresponding file:

public abstract class SortExperiment extends Experiment {
  public SortExperiment(int n) {
    setInput("Size", n);
  }

  public void execute() {
    // Read array from file
    int n = readInt("Size");
    String filename = "array-" + n + ".txt";
    String[] elements = FileHelper.readToString(new File(filename)).split(",");
    int[] array = new int[n];
    for (int i = 0; i < n; i++)
      array[i] = Integer.parseInt(elements[i].trim());
    // Sort
    long start = System.currentTimeMillis();
    sort(array);
    long end = System.currentTimeMillis();
    write("Time", end - start);
  }

  public abstract void sort(int[] array);
}

Now, our experiment depends on an external resource to run successfully; this is called a prerequisite. Clearly, we do not want to run an experiment if the corresponding file does not exist. It is possible to signal this to our lab by implementing a method of class Experiment called prerequisitesFulfilled. This method returns true by default, indicating that the experiment is ready to run. We can override this behaviour so that it returns false if we can't find the input file:

public boolean prerequisitesFulfilled() {
  int n = readInt("Size");
  String filename = "array-" + n + ".txt";
  return FileHelper.fileExists(filename);
}

If you compile and run this lab, you will see that an experiment will have the status "Needing prerequisites", instead of "Ready", if it is missing the file it is looking for.

Generating prerequisites

Instead of creating the files by some external means, it would be even better if our lab could generate them by itself. A first possibility would be to include code that creates the files at the beginning of its setup method, before the experiments are actually created. However, all the files would then have to be generated, even if you wish to run just a few of the experiments. Moreover, if you change the parameters of your experiments (e.g., using other values for size), you must make sure that your generation code follows suit.

Better still is to make each experiment responsible for creating its input file if it does not exist. This is done by implementing another method called fulfillPrerequisites. The process is as follows: when running an experiment, LabPal first calls prerequisitesFulfilled; if the experiment returns true, it calls execute right away. Otherwise, LabPal calls this experiment's fulfillPrerequisites method, and then executes it.

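public void fulfillPrerequisites() throws ExperimentException {
  Random rand = new Random(0);
  int n = readInt("Size");
  String filename = "array-" + n + ".txt";
  // PrintStream(File) throws the checked java.io.FileNotFoundException;
  // try-with-resources closes the stream even if writing fails
  try (PrintStream ps = new PrintStream(new File(filename))) {
    for (int i = 0; i < n; i++) {
      if (i > 0) ps.print(",");
      ps.print(rand.nextInt());
    }
  } catch (FileNotFoundException e) {
    throw new ExperimentException("Cannot create file " + filename);
  }
}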

This method presents several advantages:

  • The dependency of each experiment on some external resource is made explicit.

  • For a given array size, the corresponding input file is generated only once, by the first instance of the experiment that is executed. For other experiments with the same value of "Size", method prerequisitesFulfilled will return true since the file is already there.

  • The lab does not need to include the input files; they can be generated on demand. This means that the resulting JAR file is much smaller. Moreover, since the random number generator is instantiated with a specific seed, the generated files will be identical on every machine.

  • The code for generating the resources is placed near the code that uses them.
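The check / generate / read cycle described above can also be tried outside of LabPal. The following is a minimal sketch that uses only the JDK; the class ArrayFileSketch and its methods isReady, generate and load are hypothetical names that mirror the roles of prerequisitesFulfilled, fulfillPrerequisites and execute, and do not call LabPal's own methods.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Random;

// Standalone sketch; all names are hypothetical and LabPal is not used.
class ArrayFileSketch {
  // Mirrors the role of prerequisitesFulfilled: is the input file present?
  static boolean isReady(Path file) {
    return Files.exists(file);
  }

  // Mirrors the role of fulfillPrerequisites: write n comma-separated integers.
  // The fixed seed makes the content identical on every run and machine.
  static void generate(Path file, int n) throws IOException {
    Random rand = new Random(0);
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < n; i++) {
      if (i > 0) sb.append(',');
      sb.append(rand.nextInt());
    }
    Files.write(file, sb.toString().getBytes(StandardCharsets.UTF_8));
  }

  // Generates the file only when it is missing, then parses it.
  static int[] load(Path file, int n) throws IOException {
    if (!isReady(file)) {
      generate(file, n);
    }
    String content = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    String[] elements = content.split(",");
    int[] array = new int[n];
    for (int i = 0; i < n; i++) {
      array[i] = Integer.parseInt(elements[i].trim());
    }
    return array;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("arrays");
    Path file = dir.resolve("array-100.txt");
    int[] first = load(file, 100);   // the file is created here
    int[] second = load(file, 100);  // the existing file is reused here
    System.out.println(Arrays.equals(first, second)); // true
    Files.delete(file);              // cleanup of the generated file
    Files.delete(dir);
  }
}
```

The second call to load finds the file already in place, which corresponds to the situation where several experiments share the same value of "Size".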

It is also possible to clean the environment of temporary files that belong to an experiment, by implementing the cleanPrerequisites method. In our example, this cleanup would involve deleting the corresponding input file if it exists:

public void cleanPrerequisites() {
  int n = readInt("Size");
  String filename = "array-" + n + ".txt";
  if (FileHelper.fileExists(filename))
    FileHelper.deleteFile(filename);
}

This method can be called from the web console, by selecting an experiment and clicking on the Clean button.

Dealing with errors

An experiment may encounter an error during its execution, due to a problem in the environment (missing resource or program), a numeric or memory overflow, or some other reason. It is possible to signal the abnormal termination of an experiment to the user; this will be displayed in the web console by showing a red "X" icon next to the faulty experiment.

Doing so is simple: method execute can be made to throw an exception of type ExperimentException.

If method execute terminates without throwing anything, LabPal assumes that the experiment finished successfully, and will display a green "checkmark" icon.

For example, method execute of SortExperiment could be modified to add an extra check for the presence of the required file, and throw an exception if it is not there:

public void execute() throws ExperimentException {
  // Read array from file
  int n = readInt("Size");
  String filename = "array-" + n + ".txt";
  if (!FileHelper.fileExists(filename))
    throw new ExperimentException("File " + filename + " not found");
  String[] elements = FileHelper.readToString(new File(filename)).split(",");
  int[] array = new int[n];
  for (int i = 0; i < n; i++)
    array[i] = Integer.parseInt(elements[i].trim());
  // Sort
  long start = System.currentTimeMillis();
  sort(array);
  long end = System.currentTimeMillis();
  write("Time", end - start);
}

In the unlikely event that the experiment is executed without first creating the required file, an exception would be thrown with a descriptive error message. In the web console, clicking on this experiment would show this message.
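To make the convention concrete, here is a standalone sketch of how a runner could turn the outcome of a task into a status: no exception means success, and an exception means failure with a message to display. StatusSketch, Task and run are hypothetical names, and this is not LabPal's actual implementation.

```java
// Standalone sketch; names are hypothetical and LabPal is not used.
class StatusSketch {
  // Stand-in for an experiment whose execution may fail.
  interface Task {
    void execute() throws Exception;
  }

  // Returns "DONE" if the task completes, or "FAILED: <message>" otherwise.
  static String run(Task task) {
    try {
      task.execute();
      return "DONE"; // nothing was thrown: treated as success
    } catch (Exception e) {
      return "FAILED: " + e.getMessage(); // message kept for the user
    }
  }

  public static void main(String[] args) {
    System.out.println(run(() -> { }));
    System.out.println(run(() -> {
      throw new Exception("File array-100.txt not found");
    }));
  }
}
```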

Setting a progress indicator

Some experiments can take a long time to execute; in such cases, it may be desirable to have an idea of how far along the experiment is. To this end, each experiment has its own "progress bar" that it can update as it pleases. One can do so by simply calling method setProgression. This method takes a single value between 0 and 1, indicating the level of progression of the experiment (0 meaning not started, and 1 indicating completion).

For example, here is the use of a progress bar in the body of method sort for the ShellSort experiment class:

public void sort(int[] array) {
  float num_iterations = 0, max_iterations = Math.max(1, 31 - Integer.numberOfLeadingZeros(array.length)); // floor(log2(length)), at least 1
  for (int gap = array.length / 2; gap > 0; gap /= 2) {
    setProgression(++num_iterations / max_iterations);
    for (int i = gap; i < array.length; i++) {
      int tmp = array[i];
      int j; // declared outside the inner loop so it is still in scope below
      for (j = i; j >= gap && tmp < array[j - gap]; j -= gap) {
        array[j] = array[j - gap];
      }
      array[j] = tmp;
    }
  }
}

The interesting bit is the call to setProgression in line 4. Since the value of gap starts at array.length / 2 and is divided by 2 on every iteration, the loop executes floor(log2(array.length)) times. The progress indicator is updated at every turn of the loop to the fraction of all iterations completed so far. In LabPal's web console, this progress indicator is shown as a blue bar beside the experiment that is currently running; refreshing the page will refresh the indicator.

Obviously, it is up to the experiment's designer to come up with a meaningful way of measuring progress. Doing so is optional, especially for experiments that are very short. If no call to setProgression is made during the execution of an experiment, the indicator will stay at 0 until the experiment is finished.
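As a quick check of the arithmetic behind such an indicator, the sketch below counts how many times a loop of the form "gap = n / 2; gap > 0; gap /= 2" iterates and compares it with floor(log2(n)) computed on integers. GapProgressSketch, countPasses and floorLog2 are hypothetical names; the snippet is standalone and does not use LabPal.

```java
// Standalone sketch; names are hypothetical and LabPal is not used.
class GapProgressSketch {
  // Number of times "for (gap = n / 2; gap > 0; gap /= 2)" iterates.
  static int countPasses(int n) {
    int passes = 0;
    for (int gap = n / 2; gap > 0; gap /= 2) {
      passes++;
    }
    return passes;
  }

  // Closed form: floor(log2(n)) for n >= 1, without floating point.
  static int floorLog2(int n) {
    return 31 - Integer.numberOfLeadingZeros(n);
  }

  public static void main(String[] args) {
    int n = 1000;
    int total = countPasses(n); // 9 passes for n = 1000
    int done = 0;
    for (int gap = n / 2; gap > 0; gap /= 2) {
      done++;
      // Fraction that would be handed to a progress indicator
      System.out.println("gap=" + gap + " progress=" + (float) done / total);
    }
  }
}
```

With this count as the denominator, the fraction reaches exactly 1 on the last pass.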

Generate multiple data points

It may be cumbersome to have one experiment for each data point you want to generate. For example, suppose you have a function f that processes numbers from 0 to 1,000, and you'd like to measure the elapsed time after each 100 values processed. One possible way would be to create an experiment like this:

public class MyExperiment extends Experiment {
  public MyExperiment(int n) {
    setInput("n", n);
  }
  public void execute() {
    long start = System.currentTimeMillis();
    for (int i = 0; i < readInt("n"); i++) {
      f(i);
    }
    long end = System.currentTimeMillis();
    write("Duration", end - start);
  }
}

To get the elapsed time after every 100 values, you could do:

for (int i = 0; i < 1000; i += 100) {
  add(new MyExperiment(i));
}

This is not very efficient. To get the time for 200 values, you need to restart the processing from the beginning, and re-process the first 100; ditto for 300, 400, etc. What we rather want is an experiment that processes the 1,000 values once, but generates multiple data points.

Fortunately, this can be done. Instead of writing a single value, an experiment can write a list of values. A more efficient way of implementing the experiment would hence be:

public class MyExperiment extends Experiment {
  public MyExperiment() {
    JsonList list_x = new JsonList(), list_y = new JsonList();
    for (int i = 0; i < 1000; i += 100) {
      list_x.add(i);
      list_y.add(JsonNull.instance);
    }
    setInput("n", list_x);
    write("Time", list_y);
  }
  public void execute() {
    long start = System.currentTimeMillis();
    // "n" now holds a list, so the loop bound is the fixed total of 1,000
    for (int i = 0; i < 1000; i++) {
      f(i);
      if (i % 100 == 0) {
        long time = System.currentTimeMillis();
        ((JsonList) read("Time")).set(i / 100, time - start);
      }
    }
  }
}

When instantiated, this experiment creates two lists: the first contains the values 0, 100, ... 900 and is declared as an input parameter called "n". The second is a list of nulls of the same size, declared as an output parameter called "Time".

When running, the experiment processes the numbers one by one; at every step of 100, it computes the elapsed time since the start, and replaces the corresponding null value in list "Time" by this elapsed time. The "Time" list is thus progressively filled with partial running times.

Once finished, the experiment has two list parameters of 10 elements each, called "n" and "Time", like these:

  • n = [0, 100, 200, ..., 900]

  • Time = [0, 1234, 2345, ..., 123456]
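The single-pass idea can also be seen in isolation. The sketch below is standalone and does not use LabPal: plain Java lists stand in for the JSON lists, and CheckpointSketch, f and measure are hypothetical names. It processes the values once and records the elapsed time at every checkpoint, so that the two lists end up with matching elements.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch; names are hypothetical and LabPal is not used.
class CheckpointSketch {
  final List<Integer> n = new ArrayList<>();
  final List<Long> time = new ArrayList<>();

  // Placeholder for the function being measured.
  static double f(int i) {
    return Math.sqrt(i);
  }

  // Processes values 0..total-1 once, recording a data point every `step`.
  void measure(int total, int step) {
    long start = System.nanoTime();
    for (int i = 0; i < total; i++) {
      f(i);
      if (i % step == 0) {
        n.add(i);
        time.add(System.nanoTime() - start); // elapsed nanoseconds so far
      }
    }
  }

  public static void main(String[] args) {
    CheckpointSketch s = new CheckpointSketch();
    s.measure(1000, 100);
    // One row per pair of matching elements
    for (int k = 0; k < s.n.size(); k++) {
      System.out.println(s.n.get(k) + "\t" + s.time.get(k));
    }
  }
}
```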

When encountering such an experiment, an ExperimentTable will expand the values of these lists into as many entries as there are matching elements. In this case, this would yield the table we are looking for:

Group experiments

Rather than have a long list of experiments, it may be desirable to organize experiments in groups. This is simple: create a Group object, and put experiments inside it:

Group g = new Group("Gnome Sort");
add(g);
GnomeSort exp = new GnomeSort(10);
g.add(exp);

As usual, don't forget to add the group to the lab.

The experiment list will now show a collapsible section heading named "Gnome Sort"; clicking on the triangle to the left will show or hide the set of experiments associated with this group.
