GML - Geography Markup Language

OGR has limited support for GML reading and writing. Update of existing files is not supported.

Supported GML flavors:

OGR version     Read                                    Write
OGR >= 1.8.0    GML2 and GML3 that can be translated    GML 2.1.2 or GML 3 SF-0
                into simple feature model               (GML 3.1.1 Compliance level SF-0)
OGR < 1.8.0     GML2 and limited GML3                   GML 2.1.2

Parsers

The reading part of the driver only works if OGR is built with Xerces linked in. Starting with OGR 1.7.0, when Xerces is unavailable, read support also works if OGR is built with Expat linked in. XML validation is disabled by default. GML writing is always supported, even without Xerces or Expat.

Note: starting with OGR 1.9.0, if both Xerces and Expat are available at build time, the GML driver will preferentially select the Expat parser at runtime when possible (GML file in a compatible encoding), and fall back to the Xerces parser in other cases. However, the choice of parser can be overridden by setting the GML_PARSER configuration option to EXPAT or XERCES.
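The selection rule above can be sketched in a few lines of C++. This is only an illustration of the described behaviour, not the driver's actual code: the ChooseParser function, the GmlParser enum and the boolean inputs are hypothetical names, and whether an explicit GML_PARSER value is honoured when the encoding is not Expat-compatible is an assumption.

```cpp
#include <string>

// Hypothetical names for illustration only; not part of the OGR API.
enum class GmlParser { Expat, Xerces, None };

// hasExpat / hasXerces: parsers available at build time.
// encodingOkForExpat: the file uses an encoding Expat can read.
// gmlParserOption: value of GML_PARSER ("" when the option is not set).
GmlParser ChooseParser(bool hasExpat, bool hasXerces,
                       bool encodingOkForExpat,
                       const std::string &gmlParserOption)
{
    // An explicit GML_PARSER value overrides the automatic choice
    // (assumption: it is only honoured if that parser was built in).
    if (gmlParserOption == "EXPAT" && hasExpat)
        return GmlParser::Expat;
    if (gmlParserOption == "XERCES" && hasXerces)
        return GmlParser::Xerces;

    // Automatic choice: prefer Expat when the encoding allows it.
    if (hasExpat && encodingOkForExpat)
        return GmlParser::Expat;
    if (hasXerces)
        return GmlParser::Xerces;

    // No usable parser: the file cannot be read.
    return GmlParser::None;
}
```

For instance, ChooseParser(true, true, false, "") yields GmlParser::Xerces, which mirrors the fallback described in the note.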

CRS support

Since OGR 1.8.0, the GML driver has coordinate system support. The coordinate system is only reported when all the geometries of a layer have a srsName attribute whose value is the same for all geometries. For srsName such as "urn:ogc:def:crs:EPSG:", for geographic coordinate systems (as returned by WFS 1.1.0 for example), the axis order should be (latitude, longitude) as required by the standards, but this is unusual and can cause issues with applications unaware of axis order. So by default, the driver will swap the coordinates so that they are in the (longitude, latitude) order and report an SRS without axis order specified. It is possible to get the original (latitude, longitude) order and SRS with axis order by setting the configuration option GML_INVERT_AXIS_ORDER_IF_LAT_LONG to NO.

There are also situations where the srsName is of the form "EPSG:XXXX" (whereas "urn:ogc:def:crs:EPSG::XXXX" would have been more explicit about the intent) and the coordinates in the file are in (latitude, longitude) order. By default, OGR will not consider the EPSG axis order and will report the coordinates in (latitude, longitude) order. However, if you set the configuration option GML_CONSIDER_EPSG_AS_URN to YES, the rules explained in the previous paragraph will be applied.
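The two rules above can be summarised as a small decision function. The following C++ sketch is illustrative only: ShouldSwapAxes and its helpers are hypothetical names, and the caller is assumed to already know whether the coordinate system is geographic. The two boolean parameters stand for GML_INVERT_AXIS_ORDER_IF_LAT_LONG (swapping is the default behaviour) and GML_CONSIDER_EPSG_AS_URN (off by default).

```cpp
#include <string>

// Hypothetical helper: srsName uses the "urn:ogc:def:crs:EPSG:" form.
bool IsEpsgUrn(const std::string &srsName)
{
    return srsName.rfind("urn:ogc:def:crs:EPSG:", 0) == 0;
}

// Hypothetical helper: srsName uses the short "EPSG:XXXX" form.
bool IsShortEpsg(const std::string &srsName)
{
    return srsName.rfind("EPSG:", 0) == 0;
}

// Returns true when (latitude, longitude) coordinates would be swapped
// to (longitude, latitude) on read, following the rules described above.
bool ShouldSwapAxes(const std::string &srsName, bool isGeographic,
                    bool invertIfLatLong = true,
                    bool considerEpsgAsUrn = false)
{
    // No swap for non-geographic systems or when swapping is disabled.
    if (!isGeographic || !invertIfLatLong)
        return false;
    // URN form: swapped by default.
    if (IsEpsgUrn(srsName))
        return true;
    // Short form: only swapped when treated like the URN form.
    return IsShortEpsg(srsName) && considerEpsgAsUrn;
}
```

With the defaults, "urn:ogc:def:crs:EPSG::4326" is swapped while "EPSG:4326" is left untouched.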

Schema

In contrast to most GML readers, the OGR GML reader does not require the presence of an XML Schema definition of the feature classes (file with .xsd extension) to be able to read the GML file. If the .xsd file is absent or OGR is not able to parse it, the driver attempts to automatically discover the feature classes and their associated properties by scanning the file and looking for "known" GML objects in the gml namespace to determine the organization. While this approach is error-prone, it has the advantage of working for GML files even if the associated schema (.xsd) file has been lost.

The first time a GML file is opened, if the associated .xsd is absent or could not be parsed correctly, the file is completely scanned in order to determine the set of feature types, the attributes associated with each and other dataset level information. This information is stored in a .gfs file with the same basename as the target GML file. Subsequent accesses to the same GML file will use the .gfs file to predefine dataset level information, accelerating access. To a limited extent the .gfs file can be manually edited to alter how the GML file will be parsed. Be warned that the .gfs file will be ignored if the associated .gml file has a newer timestamp.

When prescanning the GML file to determine the list of feature types, and fields, the contents of fields are scanned to try and determine the type of the field. In some applications it is easier if all fields are just treated as string fields. This can be accomplished by setting the configuration option GML_FIELDTYPES to the value ALWAYS_STRING.

OGR 1.8.0 adds support for detecting feature attributes in nested GML elements (non-flat attribute hierarchy) that can be found in some GML profiles such as UK Ordnance Survey MasterMap. OGR 1.8.0 also brings support for reading IntegerList, RealList and StringList field types when a GML element has several occurrences.

Since OGR 1.8.0, a specialized GML driver - the NAS driver - is available to read German AAA GML Exchange Format (NAS/ALKIS).

Configuration options can be set via the CPLSetConfigOption() function or as environment variables.

Geometry reading

When reading a feature, the driver will by default only take into account the last recognized GML geometry found (in case there are several) in the XML subtree describing the feature.

Starting with OGR 1.8.0, the user can change the .gfs file to select the appropriate geometry by specifying its path with the <GeometryElementPath> element. See the description of the .gfs syntax below.

OGR 1.8.0 adds support for more GML geometries including TopoCurve, TopoSurface, MultiCurve. The TopoCurve type GML geometry can be interpreted as either of two types of geometries. The Edge elements in it contain curves and their corresponding nodes. By default only the curves, the main geometries, are reported as OGRMultiLineString. To retrieve the nodes, as OGRMultiPoint, the configuration option GML_GET_SECONDARY_GEOM should be set to the value YES. When this is set only the secondary geometries are reported.

gml:xlink resolving

OGR 1.8.0 adds support for gml:xlink resolving. When the resolver finds an element containing the tag xlink:href, it tries to find the corresponding element with the matching gml:id in the same GML file, in another GML file in the file system, or on the web using cURL. Set the configuration option GML_SKIP_RESOLVE_ELEMS to NONE to enable resolution.

By default the resolved file will be saved in the same directory as the original file with the extension ".resolved.gml", if it doesn't exist already. This behaviour can be changed using the configuration option GML_SAVE_RESOLVED_TO. Set it to SAME to overwrite the original file. Set it to a filename ending with .gml to save it to that location. Any other values are ignored. If the resolver cannot write to the file for any reason, it will try to save it to a temporary file generated using CPLGenerateTempFilename("ResolvedGML"); if it cannot, resolution fails.
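The naming rules above can be sketched as follows. ResolvedOutputPath and EndsWith are hypothetical helpers, an empty string stands for "option not set", the comparison with SAME is assumed to be exact, and the default name is assumed to be built by replacing a trailing ".gml" with ".resolved.gml". The temporary file fallback is not modelled.

```cpp
#include <string>

// Hypothetical helper: true if s ends with suffix.
static bool EndsWith(const std::string &s, const std::string &suffix)
{
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: where the resolved document would be written for a
// given GML_SAVE_RESOLVED_TO value (saveTo is empty when the option is unset).
std::string ResolvedOutputPath(const std::string &gmlPath,
                               const std::string &saveTo)
{
    if (saveTo == "SAME")
        return gmlPath;              // overwrite the original file
    if (EndsWith(saveTo, ".gml"))
        return saveTo;               // explicit target file

    // Unset or ignored value: <basename>.resolved.gml next to the original.
    std::string base = gmlPath;
    if (EndsWith(base, ".gml"))
        base.erase(base.size() - 4);
    return base + ".resolved.gml";
}
```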

Note that the resolution algorithm is not optimised for large files. For files with more than a couple of thousand xlink:href tags, the process can take more than a few minutes. Rough progress is reported through CPLDebug() every 256 links. It can be seen by setting the environment variable CPL_DEBUG. The resolution time can be reduced if you know of any elements that won't be needed. Specify a comma separated list of the names of such elements with the configuration option GML_SKIP_RESOLVE_ELEMS. Set it to ALL to skip resolving altogether (default action). Set it to NONE to resolve all the xlinks.
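As an illustration of how the three kinds of GML_SKIP_RESOLVE_ELEMS values differ, the following C++ sketch classifies a value into "skip everything", "resolve everything" or "skip the listed elements". The type and function names are hypothetical, an empty string stands for the unset option, and the HUGE value introduced in OGR 1.9.0 is deliberately not modelled.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not part of the OGR API.
enum class ResolveMode { SkipAll, ResolveAll, SkipListed };

struct ResolvePlan
{
    ResolveMode mode;
    std::vector<std::string> skipped;  // element names left unresolved
};

// value: content of GML_SKIP_RESOLVE_ELEMS ("" when the option is not set).
ResolvePlan ParseSkipResolveElems(const std::string &value)
{
    // ALL is the default action: no xlink is resolved.
    if (value.empty() || value == "ALL")
        return {ResolveMode::SkipAll, {}};
    if (value == "NONE")
        return {ResolveMode::ResolveAll, {}};

    // Otherwise the value is a comma separated list of element names.
    ResolvePlan plan{ResolveMode::SkipListed, {}};
    std::istringstream in(value);
    std::string name;
    while (std::getline(in, name, ','))
    {
        if (!name.empty())
            plan.skipped.push_back(name);
    }
    return plan;
}
```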

Starting with OGR 1.9.0, an alternative resolution method is available. It is activated by setting the configuration option GML_SKIP_RESOLVE_ELEMS to HUGE. In this case every gml:xlink is resolved using a temporary SQLite database in order to identify the corresponding gml:id relations. At the end of this SQL-based process, a resolved file is generated exactly as in the NONE case, but without its limitations. The main advantages of using an external (temporary) DBMS to resolve gml:xlink and gml:id relations are the following:

TopoSurface interpretation rules [polygons and internal holes]

Starting with OGR 1.9.0, the GML driver is able to recognize two different interpretation rules for TopoSurface when a polygon contains an internal hole:

The newest interpretation seems to fully match the GML 3 standard recommendations, so it is now the default interpretation supported by OGR.

NOTE: Using the newest interpretation requires GDAL/OGR to be built against the GEOS library.

Using the GML_FACE_HOLE_NEGATIVE configuration option, you can nevertheless select the interpretation to be applied when parsing GML 3 topologies:

Encoding issues

The Expat library supports reading the following built-in encodings:

When used with the Expat library, OGR 1.8.0 adds support for the Windows-1252 encoding (for previous versions, altering the encoding mentioned in the XML header to ISO-8859-1 might work in some cases).

The content returned by OGR will be encoded in UTF-8, after conversion from the encoding mentioned in the file header.

If the GML file is not encoded in one of the previous encodings and the only parser available is Expat, it will not be parsed by the GML driver. You may convert it into one of the supported encodings with the iconv utility for example and change accordingly the encoding parameter value in the XML header.

When writing a GML file, the driver expects UTF-8 content to be passed in.

Feature id (fid / gml:id)

Starting with OGR 1.8.0, the driver exposes the content of the gml:id attribute as a string field called gml_id, when reading GML WFS documents. When creating a GML3 document, if a field is called gml_id, its content will also be used to write the content of the gml:id attribute of the created feature.

Starting with OGR 1.9.0, the driver autodetects the presence of a fid (GML2) (resp. gml:id (GML3)) attribute at the beginning of the file, and, if found, exposes it by default as a fid (resp. gml_id) field. The autodetection can be overridden by setting the GML_EXPOSE_FID or GML_EXPOSE_GML_ID configuration option to YES or NO.

Starting with OGR 1.9.0, when creating a GML2 document, if a field is called fid, its content will also be used to write the content of the fid attribute of the created feature.

Performance issues with large multi-layer GML files

There is only one GML parser per GML datasource shared among the various layers. By default, the GML driver will restart reading from the beginning of the file, each time a layer is accessed for the first time, which can lead to poor performance with large GML files.

Starting with OGR 1.9.0, the GML_READ_MODE configuration option can be set to SEQUENTIAL_LAYERS if all features belonging to the same layer are written sequentially in the file. The reader will then avoid unnecessary resets when layers are read completely one after the other. To get the best performance, the layers must be read in the order they appear in the file.

If no .xsd and .gfs files are found, the parser will detect the layout of layers when building the .gfs file. If the layers are found to be sequential, a <SequentialLayers>true</SequentialLayers> element will be written in the .gfs file, so that GML_READ_MODE will be automatically initialized to SEQUENTIAL_LAYERS if not explicitly set by the user.

Starting with OGR 1.9.0, the GML_READ_MODE configuration option can be set to INTERLEAVED_LAYERS to be able to read a GML file whose features from different layers are interleaved. In that case, the semantics of GetNextFeature() are slightly altered: a NULL return does not necessarily mean that all features from the current layer have been read. It can also mean that there is still a feature to read, but that it belongs to another layer. The file should then be read with code similar to the following:

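    // Fragment: requires ogrsf_frmts.h; poDS is an already opened datasource.
    // Assumption: the caller knows GML_READ_MODE is INTERLEAVED_LAYERS.
    const bool bInterleaved = true;
    int nLayerCount = poDS->GetLayerCount();
    int bFoundFeature;
    do
    {
        bFoundFeature = FALSE;
        // Sweep all layers; NULL from GetNextFeature() only ends this pass.
        for( int iLayer = 0; iLayer < nLayerCount; iLayer++ )
        {
            OGRLayer   *poLayer = poDS->GetLayer(iLayer);
            OGRFeature *poFeature;
            while((poFeature = poLayer->GetNextFeature()) != NULL)
            {
                bFoundFeature = TRUE;
                poFeature->DumpReadable(stdout, NULL);
                OGRFeature::DestroyFeature(poFeature);
            }
        }
        // In interleaved mode, repeat until a full sweep yields no feature.
    } while (bInterleaved && bFoundFeature);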

Creation Issues

On export, all layers are written to a single GML file, in a single feature collection. Each layer's name is used as the element name for objects from that layer. Geometries are always written as the ogr:geometryProperty element on the feature.

The GML writer supports the following dataset creation options:

VSI Virtual File System API support

(Some features below might require OGR >= 1.9.0)

The driver supports reading and writing to files managed by the VSI Virtual File System API, which include "regular" files, as well as files in the /vsizip/ (read-write), /vsigzip/ (read-write) and /vsicurl/ (read-only) domains.

Writing to /dev/stdout or /vsistdout/ is also supported. Note that in that case, only the content of the GML file will be written to the standard output (and not the .xsd). The <boundedBy> element will not be written. This is also the case when writing to /vsigzip/.

Syntax of .gfs file by example

Let's consider the following test.gml file :
<?xml version="1.0" encoding="UTF-8"?>
<gml:FeatureCollection xmlns:gml="http://www.opengis.net/gml">
  <gml:featureMember>
    <LAYER>
      <attrib1>attrib1_value</attrib1>
      <attrib2container>
        <attrib2>attrib2_value</attrib2>
      </attrib2container>
      <location1container>
        <location1>
            <gml:Point><gml:coordinates>3,50</gml:coordinates></gml:Point>
        </location1>
      </location1container>
      <location2>
        <gml:Point><gml:coordinates>2,49</gml:coordinates></gml:Point>
      </location2>
    </LAYER>
  </gml:featureMember>
</gml:FeatureCollection>
and the following associated .gfs file.
<GMLFeatureClassList>
  <GMLFeatureClass>
    <Name>LAYER</Name>
    <ElementPath>LAYER</ElementPath>
    <GeometryElementPath>location1container|location1</GeometryElementPath>
    <PropertyDefn>
      <Name>attrib1</Name>
      <ElementPath>attrib1</ElementPath>
      <Type>String</Type>
      <Width>13</Width>
    </PropertyDefn>
    <PropertyDefn>
      <Name>attrib2</Name>
      <ElementPath>attrib2container|attrib2</ElementPath>
      <Type>String</Type>
      <Width>13</Width>
    </PropertyDefn>
  </GMLFeatureClass>
</GMLFeatureClassList>
Note the presence of the '|' character in the <ElementPath> and <GeometryElementPath> elements to specify a field or geometry element that is a nested XML element. Nested field elements, as well as the <GeometryElementPath> element, are only supported from OGR 1.8.0. If <GeometryElementPath> is not specified, the GML driver will use the last recognized geometry element.
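To make the role of the '|' separator concrete, here is a small C++ sketch that splits such a path into the successive nested element names. SplitElementPath is a hypothetical helper written for this illustration. It is not the driver's own parsing code.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: splits an <ElementPath> or <GeometryElementPath>
// value on '|' and returns the element names from outermost to innermost.
std::vector<std::string> SplitElementPath(const std::string &path)
{
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (true)
    {
        const std::string::size_type bar = path.find('|', start);
        // When no further '|' is found, substr() takes the rest of the string.
        parts.push_back(path.substr(start, bar - start));
        if (bar == std::string::npos)
            break;
        start = bar + 1;
    }
    return parts;
}
```

Applied to "location1container|location1" from the .gfs file above, this gives the two names location1container and location1, that is, the chain of elements to descend through inside the LAYER feature.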

The output of ogrinfo test.gml -ro -al is:

Layer name: LAYER
Geometry: Unknown (any)
Feature Count: 1
Extent: (3.000000, 50.000000) - (3.000000, 50.000000)
Layer SRS WKT:
(unknown)
Geometry Column = location1container|location1
attrib1: String (13.0)
attrib2: String (13.0)
OGRFeature(LAYER):0
  attrib1 (String) = attrib1_value
  attrib2 (String) = attrib2_value
  POINT (3 50)

Example

The ogr2ogr utility can be used to dump the results of an Oracle query to GML:
ogr2ogr -f GML output.gml OCI:usr/pwd@db my_feature -where "id = 0"

The ogr2ogr utility can be used to dump the results of a PostGIS query to GML:

ogr2ogr -f GML output.gml PG:'host=myserver dbname=warmerda' -sql "SELECT pop_1994 from canada where province_name = 'Alberta'"

See Also

Credits