Test Documentation

$Id: test_documentation.html,v 1.4 2005/04/20 18:31:40 bostic Exp $


Table of Contents

  1. Overview
  2. XQuery Tests
  3. DB XML Tests
  4. Benchmark

Overview

This document summarizes the state of the QA and testing work done by Andy Wood (Parthenon Computing) as of October 2004. The intended audience is developers.

XQuery Tests

The XQuery test suite runs the W3C Use Cases directly through the XQuery engine - there is no DB XML involvement. The eval program is used to drive the tests.

The queries are configured in "xquery/test/w3c_usecases/*.xquery", and the expected results in "xquery/test/w3c_usecases/*.out". The program writes its actual output to "xquery/test/w3c_usecases/actual/*.out", and each actual result is compared with its expected result using a string comparison.
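
By way of illustration, the comparison amounts to something like the following (a minimal Tcl sketch - the procedure name is invented here, and the real logic lives in the test code described below):

# Sketch only - compare one query's actual output with its expected output.
proc compare_result { name } {
    set f [open "xquery/test/w3c_usecases/$name.out" r]
    set expected [read $f]
    close $f
    set f [open "xquery/test/w3c_usecases/actual/$name.out" r]
    set actual [read $f]
    close $f
    if { [string equal $expected $actual] } {
        puts "PASS $name"
    } else {
        puts "FAIL $name"
    }
}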

The test data (queries and expected results) was put together as part of the Stylus XQuery project. Changes to the XQuery and XPath specifications may require changes to the test data.

The test code is mostly in "xquery001.tcl".

The test suite is run interactively as follows (from within the build directory):

% source ../test/test.tcl
% run xquery

The test suite is run in batch mode (e.g. by CruiseControl) by the script "run_tests.tcl".
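
Assuming the script lives alongside "test.tcl" (a hedged example - check the script itself for its actual location and any arguments), a batch run from the build directory looks like:

tclsh ../test/run_tests.tcl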

DB XML Tests

The testing framework used for V1.x has been retained and extended for V2.0. A significant difference is the use of C++ programs for some of the tests. These programs can be executed either from the command line or from within the Tcl framework.

Tcl Code

The framework now handles the container type - whole document storage (WDS) or node level storage (NLS) - as a top-level variable, in much the same way as transacted and non-transacted environments are handled. Various top-level procedures run the test groups using different permutations of these variables. A brief summary is presented here - type help for more information.

A test group identifies all the tests in one unit, i.e. one Tcl source file.

  • run
    Runs a specific test group using WDS, with the option ("n") of using a non-transacted environment. Type run ? for more information.
  • run_nls
    As run, but using NLS. Type run_nls ? for more information.
  • run_all
    Runs test group(s) under a choice of environments and/or container types. Type run_all ? for more information.
  • run_xml
    Runs the entire test suite - all test groups, using both environment types and both container types. Output is redirected to "ALL.OUT".

All of these procedures analyze the results, reporting any failures (lines starting with "FAIL") and any pending items.

Test groups can still be executed directly, e.g. xml008. Default arguments to the procedures are used (no environment, WDS). Note that in this case the results are not analyzed for failures and warnings (this was not an issue for 1.x, where failures were fatal).
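
For example (the group name and flags here follow the descriptions above, but run ?, run_nls ? and run_all ? give the authoritative syntax):

% source ../test/test.tcl
% run xml008          ;# group xml008, WDS, transacted environment
% run xml008 n        ;# group xml008, WDS, non-transacted environment
% run_nls xml008      ;# group xml008, NLS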

Note that "fail cases" often work by catching a DB XML exception, and checking the suitablity of the message. In the absence of an environment, these messages pollute the test output - but they are not failures.

The convert procedure generates query plans for the indexing and query processor tests in the 1.x test suite. Type convert ? for more information.

The test suite is run in batch mode (e.g. by CruiseControl) by the script "run_tests.tcl".

Developer Practices

I often "spot check" tests by commenting out the calls to sub-procedures within a test group. The results summary attempts to detect such comments to prevent the code being committed in this state!

Many of the procedures offer the option of forking - the advantage being that an application crash in one of the test groups does not abort the entire test run. The disadvantage is that we need to find a way of grabbing output from the sub-process - it is currently buffered.

C++ Code

C++ code has been used for a few tests (the program name is given for each):

  • 8.5 - multiple databases (uses dbxml_test_databases)
  • 9.3 - a simple resolver (uses dbxml_test_driver)
  • 11.1.5 - iterative methods on XmlResults (uses dbxml_test_query_processor_api)
  • 11.5 - W3C Use Cases (uses dbxml_test_driver)
  • 14.4 - updates using DOM methods (uses dbxml_test_driver)
  • 17 - input streams (uses dbxml_test_input_streams)

Additionally, the code that generates query plans from the 1.x test data uses the dbxml_test_driver program.

A logging system is used - by default, the log files are written to "test_logs/" in the execution directory. Note that log files are appended to, not overwritten.

The code is configured in the "test/cpp" directory:

  • databaseManagement/
    The main function and test code for 8.5.
  • inputStreams/
    The main function and test code for 17.
  • queryProcessorAPI/
    The main function and test code for 11.1.5.
  • unitTests/
    The code for the dbxml_test_driver program - a basic class hierarchy of test code.
  • util/
    The logging code, and string transcoding code (copied from DB XML code).

Running the Programs Using Tclsh

A utility procedure (run_external_test_program in "xmlutils.tcl") wraps execution of these programs. Output is redirected to files of the form "<program name>.out" and "<program name>.err". Note that the Tcl exec command treats any output to stderr as an error - hence the need to redirect this output. The log files that result from program execution are scanned for errors and warnings.
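
In essence the wrapper does something like this (a simplified sketch with an assumed signature - the real procedure also scans the resulting log files, as noted above):

# Simplified sketch of run_external_test_program (assumed signature)
proc run_external_test_program { program arglist } {
    # Redirect both streams to files - Tcl's exec treats any stderr
    # output as an error, and a non-zero exit status raises an error.
    set status [catch {
        eval exec $program $arglist > ${program}.out 2> ${program}.err
    } msg]
    return $status
}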

The Tcl code usually reads the configuration data (program name, information about which documents to load etc.) from "index.xml" files that live in appropriate test directories, e.g. "test/document_set_14/". On some occasions the test data is hardwired.

Running the Programs Using the Command Line

All programs can be executed from the command line. Under Unix, ensure that both a DB XML environment directory (e.g. "TESTDIR") and a directory for the log files (e.g. "test_logs") exist prior to execution.

A number of common options exist; the main() function in dbxml_test_driver passes unrecognized arguments to sub-classes of UnitTest.
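
Gathering up the flags that appear in this document, a typical invocation has the following shape (this synopsis is assembled from the examples below and is not claimed to be exhaustive):

[prompt]$ ./dbxml_test_driver --id <test id> --datadir <path> --env <path> --logdir <path> \
          [--transacted] [--nls] [--eval eager] [--debug] [--verify]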

I've added notes to SR 11003 and SR 11009 about running the programs from the command line. Some examples are given here.

  1. W3C Use Cases (11.5)

    To run the NS tests using WDS and a non-transacted environment, type:

    [prompt]$ rm -rf TESTDIR/ test_logs/; mkdir TESTDIR/ test_logs/
    [prompt]$ ./dbxml_test_driver --datadir ../test/document_set_11_5/w3c_usecases_NS --env ./TESTDIR --id 11.5.1 --logdir ./test_logs

    This will write to "dbxml_test_driver.11.5.1.log" in "./test_logs". Add "--nls" to use NLS, and "--transacted" for a transacted environment.

    Edit the "index.xml" files in "test/document_set_11_5/w3c_usecases_*/" to reinstate the tests that have been commented out.

  2. DOM Update Tests (14.4)

    To run the document update tests using eager evaluation, a transacted environment, and WDS:

    [prompt]$ ./dbxml_test_driver --env ./TESTDIR/ --datadir ../test/document_set_14/ --id 14.4 --logdir ./test_logs/ --eval eager --transacted --verify

    Use --debug to turn on full DB XML logging, and --nls to use NLS. The --verify flag scans the log file for errors and warnings.

    Adding further tests involves the following steps (a hypothetical "index.xml" fragment is sketched after this list):

    • add a new <base> element to the "index.xml" file in "../test/document_set_14"
    • derive a new class from Functor in "UpdateDocumentTest.cpp", and implement the modify() and check() methods
    • add the name of the test (which matches the <method> element in the "index.xml" file) to the map inside UpdateDocumentTest::prepare()
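
    A hypothetical fragment of such an "index.xml" entry is shown below - the element nesting and content are guessed for illustration; consult the existing files in "test/document_set_14/" for the real structure:

        <base>
          <method>myNewDomUpdateTest</method>
          ...
        </base>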

Benchmark

The XBench site is an excellent reference for this work.

XBench Data

I downloaded XBench and followed the instructions. Under Linux, the TPC-W population generator program needs building. Edit "toxgene/toxgene" and set MIN_HEAP to an appropriate value.

XBench can generate datasets of varying sizes. Using its terminology, "small" is about 10MB, "normal" about 100MB, "large" about 1GB, and "huge" about 10GB. Under Linux, there appears to be a bug in the population generator that prevents generation of the data-centric "large" datasets (I haven't tried anything bigger).

I have hacked the "small" datasets to generate additional "medium" (about 50MB), "tiny" (about 1MB) and "miniscule" (less than 20KB) datasets. This has been done for all four data groups.

Generating the data is a once-only task. The location of the data is specified as an argument to the benchmark programs. I have created four sub-directories of the parent - one for each of the data groups ("DC-MD/" etc.). Each of these sub-directories contains "small/", "medium/" etc.
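
The resulting layout is along these lines (only "DC-MD" is named explicitly in this document - the other group names are assumed, and each group directory contains the same size sub-directories):

~/xbench-data/
    DC-MD/
        miniscule/  tiny/  small/  medium/  normal/ ...
    DC-SD/
    TC-MD/
    TC-SD/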

The XBench download includes the queries that subsume the W3C Use Case functions. The queries have been written in a form suitable for use in DB XML (the files are configured in "test/benchmark/queries*.xml"). Note that some of the queries were malformed - changes have been made, and are documented as comments in the XML files.

Source Code

The benchmark is split into two areas - loading a container, and querying a container. The work is ultimately done by C++ code that is configured in "src/utils/load_container/" and "src/utils/query_runner/" respectively.

The test harness code is configured in "test/benchmark/". Two (nearly) distinct strands of work exist.

A Perl script has been written that does everything - from generating the datasets, to loading and querying the containers. This script, along with a configuration file and some platform specific modifications to the XBench scripts, is configured in "test/benchmark/xbench/". This is the original work done by Gareth.

A standalone Tclsh harness (i.e. nothing to do with DB XML) is configured in "test/benchmark/". The "xbench.tcl" file permits specific executions of the programs (source this file and type help for more information). The "batch_*.tcl" scripts enable batch execution of the programs.
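
For interactive use, from within the build directory:

% source ../test/benchmark/xbench.tcl
% help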

Loading Containers

The C++ load_container program ultimately does the work. The -f (filelist) option needs to be used for the large, multi-document data sets under Linux.

A bulk load is best done using the Tcl batch script. Edit the arrays in "batch_load.tcl" to specify the range of container sizes. As an example (assuming the current directory is "build_unix"), the following creates and loads containers in the environment "./benchmark/", reading the datasets from "~/xbench-data/":

tclsh ../test/benchmark/batch_load.tcl benchmark/ ~/xbench-data/ .

Type tclsh ../test/benchmark/batch_load.tcl ? for further information.

Note: The Tcl code reports the timings to standard output - summarizing these figures is a manual task. Maybe the load_container program could be adapted to write the timings to an XML file, similar to the query runner program?

Querying Containers

The C++ query_runner program ultimately does the work.

Note that eager evaluation is used, and that the containers are unindexed. Indexing strategies could be identified by analyzing the schemas for the datasets (in the XBench download).

The query runner program uses an XML configuration file, which determines the data groups and container sizes that are to be queried. A sample file containing all four data groups and a selection of container sizes has been committed to CVS (in "src/utils/query_runner/"). Edit this file to restrict the scope of the benchmark.

A Tclsh batch script ("batch_query.tcl") can be used to query a set of containers. It calls the "query_container" procedure in "xbench.tcl", which works by creating a config file on-the-fly from the XML query data. I generated the sample config file by saving these intermediate files and then editing them.

Note that I (nearly) always use the query runner program directly, rather than this Tcl script.

The results from the query runner are written to an XML file. I have written a new stylesheet that generates an HTML summary. Some sort of sanity check on the result counts for NLS and WDS should be made.

Here is an example of using the query runner:

[prompt]$ ./query_runner -c ../src/utils/query_runner/config.xml -f results.xml
[prompt]$ xsltproc ../src/utils/query_runner/stylesheet.xsl results.xml > query_runner_results.html

Note that the timings are the sum of the calls to XmlContainer::query() - the actual elapsed time for program execution is considerably longer.