OPeNDAP/GDAL Driver HOWTO James Gallagher 12 December 2003 This document describes how to configure the OPeNDAP/GDAL driver. It documents version 1.0 of the driver and uses version 3.4.1 of DODS/OPeNDAP. 0. Introduction The OPeNDAP/GDAL driver enables GDAL to read from data sources served using the OPeNDAP (aka DODS) software. In the rest of this HOWTO, I'll use the names DODS and OPeNDAP interchangeably, however, there is a difference. DODS (and its sister project NVODS) are oceanographic data systems; OPeNDAP is a not-for-profit corporation formed to develop the discipline-neutral software component used by those (and other) projects. You can find out about the DODS project at its web site: http://www.unidata.ucar.edu/packages/dods/. And you can learn about OPeNDAP at its web site: http://www.opendap.org/. Currently there is considerable overlap between the two and OPeNDAP's software is being distributed at the DODS web site. The web sites contain on-line and printable documentation along with software download pages, news and a list of data sources. In addition, you can search for DODS data sources using the NASA Global Change Master Directory or Google (look for DODS or GDS). * Limitations of the current version of the driver This driver is limited to data stored in Arrays or Grids (aka Rasters). In the future we will support reading vector data using the OGR API. DODS is a read-only data system, thus it is not possible to use this driver to write data to a DODS/OPeNDAP server since that really has no meaning in the context of the Data Access Protocol (DAP). You can read from the this driver and use another driver (say geotiff) to write data to a local file. This code is currently at version 1.0 (version.h holds the current version number). It needs more testing. --------------------------------------------------------------------------- 1. Building GDAL with OPeNDAP support To build the OPeNDAP driver, you will need the DODS software distribution version 3.4.1. Later versions may also work. First build and install DODS. Then run GDAL's configure script and provide it with the pathname to the root of your DODS installation. To do this use the --with-dods-root= option. See 'configure --help' for more information. If you plan on using EPSG numbers for Projection and Geographic coordinate systems, you'll also need to be sure and build the geotiff software. I am fairly certain that these are automatically built for newer versions of GDAL/OGR. --------------------------------------------------------------------------- 2. Enabling OPeNDAP data sources To use the driver, pass the GDALDataset::Open() method a string which names the DODS data source and the raster. To do this you set the pszFilename field of the GDALOpenInfo structure to the string. While most drivers accept a simple filename for this argument, the DODS driver takes a URL to the DODS data source. This URL has some additional information added to it so the driver knows which raster within the data source you would like to access and how different dimensions of that raster are to be interpreted. Note that the DODS/GDAL driver can open only one raster per call to the GdalDataset::Open() method. If you want to open several rasters at once you will need to call the driver several times. * Syntax of the request string To open a DODS data source using the driver you must append the name of the Array or Grid variable to be accessed to the data source's DODS URL. In addition, you must also supply some extra information about which dimensions of the Raster correspond to Latitude and Longitude and, optionally, which are bands. The syntax of the string is: ? The is the URL of the DODS data source. The is the name of a raster (Array or Grid variable) held in the data source and the is used to describe how different dimensions are to be interpreted by the driver. Here's a simple example: http://dods.gso.uri.edu/cgi-bin/nph-dods/DODS/pathfdr/1986/12/k86336175759.hdf?dsp_band_1[lat]lon] There are many data sources in DODS that hold variables with three or four dimensions. In cases where the GDAL driver is used to access these, it must know which dimensions correspond to Latitude and Longitude. It must also know if it should always use one specific 'slice' of a dimension or if a dimension should be interpreted as making up a set of bands. This is the function of the . The full syntax is: ( [ ] )+ Where '-', 'lat' and 'lon' are literal and N is a positive integer. Here's an example for a three-dimensional Array: u[1-16][lat][lon] (This variable can be found in th test data source at http://dodsdev.gso.uri.edu/dods-test/nph-dods/data/nc/fnoc1.nc). The bracket expression says that the first dimension holds sixteen bands (numbers 1 to 16, note that ones-based indexing is used to match GDAL's band numbering scheme. The DAP uses zero-based indexing, so you'll need to take this into account), the second dimension corresponds to Latitude and the third to Longitude. If the variable 'u' had four dimensions, the expression would need to specify which element of the fourth dimension should be used. If a data source array has four or more dimensions, two *must* be labeled 'lat' and 'lon', one must be used to specify a group of bands and the remaining dimensions must use the 'slice' notation to tell the driver which single element of that dimension is to be used. For example, a hypothetical four dimensional array called 'four' would need a bracket expression like: four[1][1-12][lat][lon] or four_prime[3][3-9][lat][lon] In the first example the driver is told to use only first element of dimension one, that the second dimension is to be considered twelve bands and the last tow dimensions are to be taken as latitude and longitude. In the second example the driver is to use the third element of the first dimension and the bands 3 through 9 of the second dimension. When the driver opens a data source with a raster like 'four_prime' int he above example, it will map the raster's bands 3 through 9 to GDAL bands 1 through 6. It bears repeating: The band numbers and 'slice' in the 'bracket expression' use ones-based indexing. * Summary To use the OPeNDAP/GDAL driver to read a variable from a data source, pass the driver a URL with the name of the variable and the bracket expression. Use a question mark to separate the variable name from the data source URL. Here's an example using the sample data set at dodsdev: http://dodsdev.gso.uri.edu/dods-test/nph-dods/data/nc/fnoc1.nc?u[1-16][lat][lon] * Adding geo-referencing information Most data sources available from DODS do not include the sort of geo-referencing information that a GIS application expects or needs. However, you can easily add this information to a data source using the AIS (Ancillary Information Service) of DODS. The AIS is a new feature of DODS introduced in beta form with the 3.4.0 version of the software. For this reason it has not been fully documented in the DODS User Guide or other documentation. Here's a short description of the system and how to configure it. In the subdirectory example_conf.d there are DAS files and an AIS database which configure the driver for the URI/GSO Pathfinder and GSFC MODIS Aqua-Atmosphere MYD08_E3 (Level 3) data sources. * How to configure the AIS for a data source The AIS provides a way to augment the metadata for a data source. The service uses text files which contain the extra metadata along with a (simple) database that matches data sources to the files. The database is encoded in XML and the files are DODS Dataset Attribute Structure (DAS) files. In the following I'll explain how to write one of the DAS files and then how to write the AIS database. * Writing a DAS file for a particular data source or group of data sources A DAS file is a simple text file. It contains a description of the attributes for a data source. For general information about DAS objects and files, see the DODS User's Guide available on-line at the DODS project web site. To add geo-referencing information for a particular raster from a given data source, add attributes which describe the Latitude and Longitude extent and geographic projection for the raster. Here's the DAS file for the Pathfinder data source held at the University of Rhode Island (URI): Attributes { dsp_band_1 { Float64 Northernmost_Latitude 71.1722; Float64 Southernmost_Latitude 4.8278; Float64 Easternmost_Longitude -27.8897; Float64 Westernmost_Longitude -112.11; String ProjectionCS Equirectangular; String GeographicCS WGS84; Norm_Proj_Param { Float64 latitude_of_origin 0.0; Float64 central_meridian 0.0; Float64 false_easting 0.0; Float64 false_northing 0.0; } } } In this example the raster is named 'dsp_band_1' (the data source may be found at: http://dods.gso.uri.edu/cgi-bin/nph-dods/DODS/pathfdr/ 1986/12/k86336175759.hdf). To describe the latitude and longitude bounding box, use the four attributes "Northernmost_Latitude", "Southernmost_Latitude", "Easternmost_Longitude", and "Westernmost_Longitude". To specify the projection information, use the attributes "ProjectionCS" and "GeographicCS". The GeographicCS attribute is used to name the geographic coordinate system. The value may be any string which can be passed to the OGRSpatialReference::SetWellKnownGeogCS method. These include WGS84, WGS73, NAD27, NAD83 and various EPSG numbers using the syntax 'EPSG:n' where 'n' is the EPSG number. The ProjectionCS attribute is used to provide the projection coordinate system. You can find a list of the projection coordinate systems that GDAL supports at: http://remotesensing.org/geotiff/proj_list/. Finally, the attribute container Norm_Proj_Param is used to provide the normalized values of the parameters for the projection named by ProjectionCS. These parameter names can be found at http://remotesensing.org/geotiff/proj_list/. The parameter names should be the OGC WKT names. For a nice tutorial on projections and information about GDAL's support for specific ones, see: http://gdal.velocet.ca/projects/opengis/ogrhtml/osr_tutorial.html. Note that you must be careful about enclosing attribute names in double quotes. In general, the quotes will be included in the value of a String attribute. That means that a value of WSG84 is different from a value of "WGS84" and the latter won't be recognized by GDAL. It's often possible to figure out the bounding box and projection information by examining the data source. You can do this using the OPeNDAP server's HTML interface. This will provide you with a HTML form which lists all the variables and their attributes for a given data source. You can query the data source and get back ASCII data dumps, too. To access the HTML interface fora given data source, append the suffix '.html' to the data source's URL. * Adding a DAS to the AIS database The AIS uses a simple flat-file database to match specific DAS files with data sources. The file is easy to create with a text editor and it's easy to specify that one DAS file should be used with a group of data sources. For example, one DAS file is used with the thousands of images in the URI Pathfinder data set. Here's the text of the AIS data base: Each entry in the database uses one element. Inside each there is a a element which holds information about a data source or group of data sources. Use the "url" attribute to name a specific data source or the "regex" attribute to name a group using a regular expression. Also in an is a element which names the DAS file. Note that the file name is given as the value of the element's "url" attribute. It may be a local file name (as in this example) or it maybe a remote file accessible using http. Important: When you start adding you own DAS files, you should make the filenames absolute paths, not relative. Because the pathnames are relative they work only when running programs in the 'gdal/frmts/dods' directory (I used relative paths because I cannot control where the code will be unpacked). * Using a particular AIS database To get the driver to use a particular database, add the pathname to the data base in the .dodsrc file as the value of the parameter "AIS_DATABASE". In the example_conf.d directory there is a sample .dodsrc file named "dodsrc" which is confgiured to used the ais database in that directory (both the dodsrc and ais_database.xml files are used by the test software). The .dodsrc file can be kept the a users home directory or in a location named by the DODS_CONF environment variable. The dodsrc file canbe used to configure a number of other things about the GDAL driver. The USE_CACHE parameter determines whether the driver will cache responses. Turning caching on may improve performance dramatically. Setting teh DEFLATE parameter to "1" will cause the driver to request that servers compress responses before sending them, also improving performance. * Data sources with many variables Some data sources (e.g., some MODIS level 3) have many variables. It would be tedious to write a DAS file for each variable in a data source when all of the variables use the same projection. In this case the projection information maybe included in a global attribute container named "opendap_org_gdal". Here's an example for MODIS Level 3 data hosted at Goddard Space Flight Center: Attributes { opendap_org_gdal { Float64 Northernmost_Latitude 89.5; Float64 Southernmost_Latitude -89.5; Float64 Easternmost_Longitude 179.5; Float64 Westernmost_Longitude -179.5; String ProjectionCS Equirectangular; String GeographicCS WGS84; Norm_Proj_Param { Float64 latitude_of_origin 0.0; Float64 central_meridian 0.0; Float64 false_easting 0.0; Float64 false_northing 0.0; } } } Note that you can include both the global container and a variable-specific one in the same DAS file. In that case the driver will use the variable specific attributes preferentially over the global ones. This enables one DAS file to include a default set of projection values and then override those for a handful of variables. Also note that since the DAS file can be accessed using HTTP, you can share the files among several sites. --------------------------------------------------------------------------- 3. Testing the driver Unit tests The make target dodsdataset_test builds the unit tests which you can run as "./dodsdataset_test". You'll need both DODS and CPPUNIT installed on your computer for the unit tests to build and run. Be aware that some of the unit tests take a while and may fail if some of the remote servers they use are off line. Test application The make target "using_dods" builds a simple application that uses the driver to read and print from a DODS data source. You'll need DODS installed to build this target (but you don't need CPPUNIT). To run the program pass it a DODS data source URL with the variable name and bracket expression appended as described above in Section 2. Here's an example using one of the DODS test data sets: [jimg@comet dods]$ ./using_dods http://dodsdev.gso.uri.edu/dods-test/ nph-dods/data/nc/fnoc1.nc?u[1-4][lat][lon] This should print the following: Opening the dataset. Band Number: 1 Block = 21x17 Type = Int16 Min = -2352, Max = 3694 -146 -422 -1033 -1668 -2000 -2307 -2216 -1757 -1231 -294 861 1807 2561 followed by the remaining output (which is fairly long since it'll write out data for the four bands) Note that if you want to use the DODS/OPeNDAP HTML form interface of the data source to check on these values, remember that the DAP uses zero-based indexing. Be sure to adjust your band numbers accordingly! You can also try using gdalinfo to open one of the rasters. Here's an example using the pathfinder data (if you run this make sure to set the environment variable DODS_CONF so that it points to the file 'dodsrc' in the subdirectory 'example_conf.d'. ---------------------------------------------------------------------------- $Id$