DESCRIPTION

This program combines evidence maps (currently GRASS floating point raster only) using Dempster's Rule of Combination. It is part of a general framework for Dempster-Shafer Theory (DST) modelling with GRASS GIS.

Introduction to the DST

(Note: the following is a very brief introduction to DST. Many details have been left out to keep everything easy to understand. If you want the full mathematical background, refer to one of the many excellent sources on the internet that can be found by using "Dempster Shafer Theory" as a search phrase. The original publication is: Shafer, G. (1976). A Mathematical Theory of Evidence. Princeton University Press, Princeton, New Jersey, USA.)

Consider a standard situation in GIS-based research. A scientist wants to build a spatial model and check its plausibility. He/she has:
  1. written down a number of hypotheses. These hypotheses include all possible outcomes of the model,
  2. created a number of GIS maps that encode variables (evidences) which he/she deems to be of importance for the model,
  3. found a way to quantify the evidence.
For each hypothesis, our scientist would now simply like to combine all supporting evidences and calculate the total degree of belief he/she can have in that hypothesis. These belief values should then be written to GIS maps to form a spatial model.

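The mathematical framework to achieve this is provided by the Dempster-Shafer Theory of Evidence (DST). To clarify things, we need a few definitions:

The set of all hypotheses under consideration is called the frame of discernment (FOD), written H in the following. A basic probability assignment (BPA, also called belief mass) is a value in the range 0 to 1 that quantifies how much of a source's evidence supports exactly one hypothesis or set of hypotheses. For each source of evidence, the BPAs over all subsets of H sum to 1.

BPAs can be assigned to a singleton hypothesis in H as well as to subsets such as {H1,H2}. This means that DST can represent uncertainty as subsets of H. E.g. if two hypotheses H1='a' and H2='b' are supplied, then there will always also exist a set of hypotheses A={H1,H2} to represent the belief that either could be true ('a OR b').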
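
As an illustration of BPAs attached to singletons and to subsets, the following is a minimal Python sketch and not part of the DST toolkit. The function names power_set and is_valid_bpa, the use of frozensets, and the toy mass values are all assumptions made for this example. The empty set is left out here on the assumption that it carries no mass.

```python
from itertools import combinations

def power_set(hypotheses):
    """Return all non-empty subsets of the frame of discernment as frozensets."""
    items = sorted(hypotheses)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]

def is_valid_bpa(bpa, hypotheses, tol=1e-9):
    """Check that masses sit on non-empty subsets, lie in [0, 1] and sum to 1."""
    allowed = set(power_set(hypotheses))
    if any(subset not in allowed for subset in bpa):
        return False
    if any(mass < 0.0 or mass > 1.0 for mass in bpa.values()):
        return False
    return abs(sum(bpa.values()) - 1.0) <= tol

# Frame of discernment with two hypotheses, as in the text.
fod = {"a", "b"}
subsets = power_set(fod)  # {a}, {b} and the uncertainty set {a, b}

# Toy masses (hypothetical values): the share on {a, b} is the uncertainty.
bpa = {
    frozenset({"a"}): 0.5,
    frozenset({"b"}): 0.2,
    frozenset({"a", "b"}): 0.3,
}
valid = is_valid_bpa(bpa, fod)
```

The mass on {a, b} is evidence that cannot be split between 'a' and 'b', which is what distinguishes a BPA from an ordinary probability distribution over the two hypotheses.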

Evidences from a variety of sources can be combined using Dempster's Rule of Combination:

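\[
m(C) = \frac{\sum_{A \cap B = C} m_1(A)\, m_2(B)}{1 - \sum_{A \cap B = \emptyset} m_1(A)\, m_2(B)}, \qquad C \neq \emptyset
\]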

In words: Dempster's Rule computes a measure of agreement between two sources of evidence for various (sets of) hypotheses in the FOD (in the formula above: A,B,C). It focuses on the hypotheses which both sources support. The resulting BPA is calculated by combining the BPAs of the hypotheses from both sources that yield the hypotheses of the combined body. The result is also normalised (denominator) to ensure that it is a valid BPA. From the BPAs, a number of useful DST functions can be calculated. The most important ones are belief and plausibility.
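
The combination described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used by dst.combine: the function name combine, the dictionary representation and the mass values for the two sources are hypothetical.

```python
def combine(m1, m2):
    """Combine two BPAs (dict: frozenset -> mass) with Dempster's Rule.

    Returns the normalised combined BPA and the conflicting mass, i.e. the
    sum of products whose hypothesis sets do not intersect.
    """
    joint = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            c = a & b                      # hypotheses both sources support
            product = mass_a * mass_b
            if c:
                joint[c] = joint.get(c, 0.0) + product
            else:
                conflict += product        # the sources contradict each other
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Normalise by the non-conflicting mass so that the result is a valid BPA.
    combined = {c: mass / (1.0 - conflict) for c, mass in joint.items()}
    return combined, conflict

SITE = frozenset({"SITE"})
NOSITE = frozenset({"NOSITE"})
EITHER = SITE | NOSITE                     # uncertainty set {SITE, NOSITE}

# Toy BPAs for one raster cell from two sources (hypothetical values).
height = {SITE: 0.6, NOSITE: 0.1, EITHER: 0.3}
vegetation = {SITE: 0.5, NOSITE: 0.2, EITHER: 0.3}

combined, conflict = combine(height, vegetation)
```

For these toy values the conflicting mass is about 0.17, and the combined masses again sum to 1 after division by 1 - 0.17.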

The belief function Bel(A) computes the total belief in a hypothesis (set) A. The total belief in A is the BPA mass of A itself plus the BPA mass attached to all the subsets of A (B in the formula below):

\[
Bel(A) = \sum_{B \subseteq A} m(B)
\]

In words: Bel(A) is all the evidence that speaks in favour of A.
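
A minimal Python sketch of the belief function, assuming a BPA stored as a dictionary from frozensets to masses. The function name belief and the toy values are assumptions made for this example.

```python
def belief(bpa, a):
    """Sum the mass of every set B in the BPA that is a subset of A."""
    return sum(mass for b, mass in bpa.items() if b <= a)

# Toy BPA over the hypotheses 'a' and 'b' (hypothetical values).
bpa = {
    frozenset({"a"}): 0.5,
    frozenset({"b"}): 0.2,
    frozenset({"a", "b"}): 0.3,
}

bel_a = belief(bpa, frozenset({"a"}))          # only {a} is a subset of {a}
bel_ab = belief(bpa, frozenset({"a", "b"}))    # every set is a subset of {a, b}
```

The mass on {a, b} does not count towards Bel({a}), because it does not speak specifically in favour of 'a'.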

DST has a very important characteristic which makes it different from probability theory: if Bel(A) < 1, then the remaining evidence 1-Bel(A) does not necessarily refute A (in probability theory we would have P(H2)=1-P(H1) if the FOD was {H1,H2}). Thus, some of the remaining evidence might plausibly be assigned to (sets of) hypotheses that intersect A, for instance subsets of A or sets that include A. This is represented by the plausibility function:

\[
Pl(A) = \sum_{B \cap A \neq \emptyset} m(B)
\]

In words: Pl(A) represents the maximum possible belief in A that could be achieved if there was no uncertainty at all.
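
Plausibility can be sketched in the same way as belief. Again this is only an illustration: the function name plausibility, the dictionary representation and the toy values are assumptions.

```python
def plausibility(bpa, a):
    """Sum the mass of every set B in the BPA that shares a hypothesis with A."""
    return sum(mass for b, mass in bpa.items() if b & a)

# Toy BPA over the hypotheses 'a' and 'b' (hypothetical values).
bpa = {
    frozenset({"a"}): 0.5,
    frozenset({"b"}): 0.2,
    frozenset({"a", "b"}): 0.3,
}

pl_a = plausibility(bpa, frozenset({"a"}))   # mass of {a} plus mass of {a, b}
pl_b = plausibility(bpa, frozenset({"b"}))   # mass of {b} plus mass of {a, b}
```

Here Pl({a}) exceeds the mass on {a} alone by the uncommitted mass on {a, b}, which is the evidence that does not refute 'a'.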

Program Functionality

The dst.combine program can be used to combine evidences from different sources using Dempster's Rule of combination. Currently, all evidences must be in GRASS 5 floating point raster map format. In order to turn any raster map into an evidence map, the BPAs have to be calculated. The way to do this is totally dependent on the model design and it is left up to the user to e.g. create a custom r.mapcalc script to take care of BPA calculations (with one exception: for predictive models, the r.dst.bpa program can be used).

The model layout, i.e. evidences and associated hypotheses, must respect some formal mathematical restrictions. To take the burden of this from the user, all evidences and hypotheses are stored in a special XML file, called the DST knowledge base file, in the current working location. This file and its contents can be managed by using the dst.update command.

From the sources of evidence listed in the DST knowledge base file, one or more of the following DST functions can be calculated (for each hypothesis/set of hypotheses in the knowledge base file) and the result(s) output to a new GRASS raster map (all values will be in the range 0 to 1):

Belief
Bel(A) is the total belief in a hypothesis. This is the most basic DST function.
Plausibility
Pl(A) is the maximum achievable belief given that there is no uncertainty.
Doubt
This is simply defined as 1-Pl(A).
Commonality
Q(A) is the sum of the BPA masses of all (sets of) hypotheses that include A (standard DST definition).
Belief Interval
Another aspect of representing uncertainty, the belief interval measures the difference between current belief and maximum achievable belief. It is defined as Pl(A)-Bel(A). Areas with high values for the belief interval represent "hot spots" where additional/better information would improve model results.
Weight of Conflict
If the weight of conflict is > 0 it indicates that evidences from different sources disagree with each other. A high weight of conflict might indicate a serious flaw in the model design or disagreement of evidences supplied by different people.
Maximum BPA
This gives the maximum belief mass contributed by any of the sources of evidence.
Minimum BPA
This gives the minimum belief mass contributed by any of the sources of evidence.
Maximum BPA source
This identifies, by name, the source of evidence that has contributed the highest belief mass. It is a useful tool in combination with the "maximum BPA" measure.
Minimum BPA source
This identifies, by name, the source of evidence that has contributed the lowest belief mass. It is a useful tool in combination with the "minimum BPA" measure.
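
Several of the measures listed above follow directly from the BPAs. The sketch below shows one way to compute them for a single cell in Python. It is not the dst.combine implementation: all function names and mass values are hypothetical, and commonality uses the standard DST definition (the mass of all sets that include A). The weight of conflict is left out because this document does not give its formula.

```python
def belief(bpa, a):
    """Mass of all sets that are subsets of A."""
    return sum(m for b, m in bpa.items() if b <= a)

def plausibility(bpa, a):
    """Mass of all sets that intersect A."""
    return sum(m for b, m in bpa.items() if b & a)

def doubt(bpa, a):
    """Doubt as defined above: 1 - Pl(A)."""
    return 1.0 - plausibility(bpa, a)

def commonality(bpa, a):
    """Standard DST commonality: mass of all sets that include A."""
    return sum(m for b, m in bpa.items() if b >= a)

def belief_interval(bpa, a):
    """Belief interval as defined above: Pl(A) - Bel(A)."""
    return plausibility(bpa, a) - belief(bpa, a)

def extreme_sources(sources, a):
    """Return (name, mass) of the sources with the highest and lowest mass on A."""
    masses = {name: bpa.get(a, 0.0) for name, bpa in sources.items()}
    max_name = max(masses, key=masses.get)
    min_name = min(masses, key=masses.get)
    return (max_name, masses[max_name]), (min_name, masses[min_name])

SITE = frozenset({"SITE"})
NOSITE = frozenset({"NOSITE"})
EITHER = SITE | NOSITE

# Toy BPAs for one raster cell (hypothetical values).
sources = {
    "height": {SITE: 0.6, NOSITE: 0.1, EITHER: 0.3},
    "vegetation": {SITE: 0.5, NOSITE: 0.2, EITHER: 0.3},
}

height = sources["height"]
interval = belief_interval(height, SITE)
highest, lowest = extreme_sources(sources, SITE)
```

With these toy values the belief interval for SITE equals the mass on the uncertainty set, which is why cells with a wide interval are the ones where better evidence would help most.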

DST Predictive Modelling

If you need a flexible, raster-based spatial predictive modelling framework, DST might be just the right tool for you. In fact, this software has been developed with predictive models in mind, for which the DST approach has a number of benefits. A DST predictive model may be very simple and consist of only two hypotheses, 'site' and 'no site' (plus the combination of the two to represent uncertainty). It can be built like this:
  1. Save your known site locations in a GRASS site list and take a random sample using v.random.sample.
  2. Convert all sources of evidence (coverage maps, buffer objects, height maps etc.) into evidence raster maps using r.categorize and r.dst.bpa (with the random sample).
  3. Register all evidence maps and hypotheses in a DST knowledge base file using dst.update (this will automatically create all additional uncertainty hypotheses sets).
  4. Associate sources of evidence using dst.source.
  5. Combine evidence from the sources in the knowledge base using dst.combine.
  6. Take a look at the output maps and verify results with the full set of known sites (use v.report for convenience).
NEW as of version 1.5: You can now use dst.predict to create predictive models with a single, easy-to-use command!

A protocol of a GRASS command line session might look like this:

# A predictive model of lost Maya towns in the jungle.
# (do not take this seriously ;)

# (point positions of known Maya towns are stored in vector map "locations")
# take a random sample of 50%
v.random.sample input=locations output=sample size=50

# Create a digital elevation raster map from height measurements.
# Turn into an integer map with categories 0-5m, 5-10m, etc.
s.surf.rst input=elevations elev=height
r.categorize input=height output=height_5m mode=width,5

# Import a vector map showing jungle vegetation coverages.
# Interactively label vegetation categories.
v.to.rast in=vegetation output=vegetation
r.support

# Create evidence maps. The idea is that Maya towns are 
# found at certain heights and in areas with specific vegetation.
r.dst.bpa raster=height_5m sites=sample output=bpa.height
r.dst.bpa raster=vegetation sites=sample output=bpa.vegetation

# Create new knowledge base file.
dst.create mayatowns

# Register hypotheses in a knowledge base file.
# Attach evidences to hypotheses.
dst.update mayatowns add=SITE
dst.update mayatowns add=NOSITE
dst.update mayatowns rast=bpa.height.SITE hyp=SITE
dst.update mayatowns rast=bpa.height.NOSITE hyp=NOSITE
dst.update mayatowns rast=bpa.height.SITE_NOSITE hyp=SITE,NOSITE
dst.update mayatowns rast=bpa.vegetation.SITE hyp=SITE
dst.update mayatowns rast=bpa.vegetation.NOSITE hyp=NOSITE
dst.update mayatowns rast=bpa.vegetation.SITE_NOSITE hyp=SITE,NOSITE

# Define sources of evidence.
dst.source mayatowns add=height
dst.source mayatowns add=vegetation
dst.source mayatowns source=height rast=bpa.height.SITE hyp=SITE
dst.source mayatowns source=height rast=bpa.height.NOSITE hyp=NOSITE
dst.source mayatowns source=height rast=bpa.height.SITE_NOSITE hyp=SITE,NOSITE
dst.source mayatowns source=vegetation rast=bpa.vegetation.SITE hyp=SITE
dst.source mayatowns source=vegetation rast=bpa.vegetation.NOSITE hyp=NOSITE
dst.source mayatowns source=vegetation rast=bpa.vegetation.SITE_NOSITE hyp=SITE,NOSITE

# Combine sources of evidence and output belief maps
# for all hypotheses.
dst.combine mayatowns sources=height,vegetation output=dst.mayatown

# Compare with the full set of sites to see how well the
# model does (most of the sites should fall into cells
# with high belief values for "SITE").
v.report map=dst.mayatown.SITE.bel sites=locations
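
The final check in the session, whether most known sites fall into cells with high belief in "SITE", can also be expressed as a simple ratio. The Python sketch below is only an illustration of that idea and does not reproduce what v.report computes. The function name hit_rate, the threshold of 0.5 and the belief values are all hypothetical.

```python
def hit_rate(belief_at_sites, threshold=0.5):
    """Share of known sites whose cell has Bel(SITE) at or above the threshold."""
    if not belief_at_sites:
        raise ValueError("no site values supplied")
    hits = sum(1 for value in belief_at_sites if value >= threshold)
    return hits / len(belief_at_sites)

# Hypothetical Bel(SITE) values read at the positions of eight known sites.
site_beliefs = [0.82, 0.64, 0.35, 0.91, 0.58, 0.12, 0.77, 0.69]

rate = hit_rate(site_beliefs)
```

A rate close to 1 would suggest that the model places known sites in high-belief cells. What counts as "high" depends on the model and has to be chosen by the user.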

Flags

-a
Append log output to existing ASCII file.
-q
Quiet operation: do not display progress on screen.
-n
Turn off normalisation and signal a warning if evidence does not sum to 1. By default, the program will normalise evidences to force all evidences to have the same weight and ensure that the results will be in the range 0 to 1. You can turn this off if you want to check for potential errors in your own BPA assignments.
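
The effect of the -n flag can be pictured with the following Python sketch. How dst.combine normalises internally is not described here, so rescaling by the sum of the masses is an assumption, as are the function name prepare_bpa, the tolerance and the mass values.

```python
import warnings

def prepare_bpa(bpa, normalise=True, tol=1e-6):
    """Return a BPA whose masses sum to 1, or warn if normalisation is off."""
    total = sum(bpa.values())
    if total <= 0.0:
        raise ValueError("masses must sum to a positive value")
    if abs(total - 1.0) <= tol:
        return dict(bpa)                   # already a valid BPA
    if normalise:
        # Assumed behaviour: rescale every mass by the common total.
        return {hyp: mass / total for hyp, mass in bpa.items()}
    warnings.warn(f"evidence sums to {total:.4f}, not 1")
    return dict(bpa)

# Hypothetical masses for one cell that only sum to 0.8.
raw = {"SITE": 0.4, "NOSITE": 0.2, "SITE,NOSITE": 0.2}

normalised = prepare_bpa(raw)                  # default behaviour
unchanged = prepare_bpa(raw, normalise=False)  # like -n: warn, keep values
```

Running with normalisation switched off leaves the faulty masses visible, which is useful when checking a custom BPA calculation.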

Parameters

file=name
Name of DST knowledge base file to use. You can get a listing of DST knowledge base files in your current mapset by invoking dst.list.
sources=name,[name,...]
Specify which sources of evidence in the knowledge base file to combine. The default is all sources.
type=name
Select what type of evidence to combine. This refers to the GIS data format in which the evidence is stored. Currently only 'rast' is a valid choice and also the default.
output=name
Basename of all output maps. This is set to the name of the current location by default and will be suffixed with [hypothesis].[value] (see below).
hypotheses=name,[name,...]
Choose for which hypotheses the DST values (see below) should be calculated. The default is all (including the NULL hypothesis). NOTE: if you want to specify hypothesis sets, you must enclose this parameter in double quotes and all set names in "set brackets"! E.g. hypotheses="a,b,c,{a,c},{a,b,c}". Be sure to use the correct element order when specifying sets.
values=name,[name,...]
Valid choices are: bel,pl,doubt,common,bint,woc. See section on "Program functionality" for a description. The default is to calculate only belief (bel) values.
logfile=name
Specify a valid file name if you want a full log of the calculations.
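
To see which output maps a given combination of parameters would produce, the naming scheme basename.[hypothesis].[value] can be spelled out as below. This Python sketch covers singleton hypotheses only, since this document shows no output map name for hypothesis sets. The function name output_map_names is hypothetical.

```python
def output_map_names(basename, hypotheses, values=("bel",)):
    """List map names of the form basename.hypothesis.value."""
    return [f"{basename}.{hyp}.{val}" for hyp in hypotheses for val in values]

# Basename and hypotheses taken from the example session; 'bel' and 'pl'
# are two of the valid choices for the values parameter.
names = output_map_names("dst.mayatown", ["SITE", "NOSITE"], ["bel", "pl"])
```

The first entry, dst.mayatown.SITE.bel, is the map that the example session passes to v.report.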

Notes

This program was developed as part of the GRASS 5 DST Predictive Modelling Toolkit.
A big "Thank You!" goes to Gavin Powell, 3d Vision and Geometry, Dept of Computer Science, Cardiff University, for the DST core routines used in this program, for his help and advice, and for his permission to publish his code as part of GPL'd software.
A lot of the information in this document was also taken from the work of Mounia Lalmas et al. on information retrieval from structured documents using DST.
The term 'knowledge base file' was first used in the manual of the IDRISI GIS software which also offers some DST functionality.
GIS-based predictive modelling using belief values is also discussed in a paper by Eric J. Lorup and another one by Bo Ejstrud.

SEE ALSO

dst.predict
dst.source
dst.update
r.categorize
r.dst.bpa
r.mapcalc
v.random.sample
v.report

AUTHORS

Benjamin Ducke,
University of Bamberg, Germany
Gavin Powell,
3d Vision and Geometry, Dept of Computer Science,
Cardiff University

Last changed: 2005/02/20