.. _vector_optimization:

*****************************************************************************
 Vector
*****************************************************************************

:Author: HostGIS
:Revision: $Revision$
:Date: $Date$
:Last Updated: 2008/08/08

.. contents:: Table of Contents
    :depth: 2
    :backlinks: top

Splitting your data
-------------------

If you find yourself making several layers that all use the same dataset but
FILTER it down to a different subset of the records, you can probably do
better.  If the criteria are static, one approach is to pre-split the data:
the *ogr2ogr* utility can select certain features from a datasource and save
them to a new datasource.  You can thus split your dataset into several
smaller ones that are already effectively filtered, and remove the FILTER
statement from each layer (an example command is given at the end of this
document).

Shapefiles
----------

Use :ref:`shptree` to generate a spatial index on your shapefile.  This is
quick and easy ("shptree foo.shp") and generates a .qix file.  MapServer will
automagically detect the index and use it.

.. note::

    :ref:`tileindex` :ref:`shapefiles` can be indexed with :ref:`shptree`.

MapServer also comes with the :ref:`sortshp` utility.  This rewrites a
shapefile, sorting its records according to the values in one of its columns.
If you commonly filter on a specific column, sorting by that column can make
the process slightly more efficient (see the example at the end of this
document).

Although shapefiles are a very fast data format, PostGIS is pretty speedy as
well, especially if you use indexes well and have memory to throw at caching.

PostGIS
-------

The single biggest boost to performance is indexing.  Make sure that there is
a GIST index on the geometry column, and that each record also has an indexed
primary key.  If you loaded your data with shp2pgsql, these statements create
the necessary indexes:

.. code-block:: sql

    ALTER TABLE table ADD PRIMARY KEY (gid);
    CREATE INDEX table_the_geom ON table USING GIST (the_geom);

PostgreSQL also supports reorganizing the data in a table so that it is
physically sorted by an index, which allows PostgreSQL to read the indexed
data much more efficiently.  Use the CLUSTER command with the name of the
spatial index, e.g.

.. code-block:: sql

    CLUSTER table_the_geom ON table;

Then there are numerous optimizations one can perform on the database server
itself, aside from the geospatial component.  The easiest is to increase
*shared_buffers* in the *postgresql.conf* file, which allows PostgreSQL to
use more memory for caching.  More information can be found at the
`PostgreSQL website`_.

Databases in General (PostGIS, Oracle, MySQL)
---------------------------------------------

By default, MapServer opens and closes a new database connection for each
database-driven layer in the mapfile.  If several layers read from the same
database this doesn't make much sense, and with some databases (Oracle in
particular) establishing a connection takes enough time to become
significant.  Try adding this line to your database layers:

.. code-block:: mapfile

    PROCESSING "CLOSE_CONNECTION=DEFER"

This tells MapServer not to close a layer's database connection until the
whole mapfile has been processed, which can shave a few seconds off of map
generation times.

.. #### rST Link Section ####

.. _`PostgreSQL website`: http://www.postgresql.org/
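As a sketch of the pre-splitting described under `Splitting your data`_, the
following *ogr2ogr* command copies only the features matching a WHERE clause
into a new shapefile.  The file and attribute names are placeholders, not
part of the original text.

.. code-block:: bash

    # Copy only the motorway features out of roads.shp into a new,
    # pre-filtered shapefile (file and column names are hypothetical).
    ogr2ogr -where "ROAD_TYPE = 'motorway'" roads_motorways.shp roads.shp

Run one such command per subset and point each layer at its own shapefile,
and the FILTER statements can go away.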
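For the :ref:`sortshp` utility mentioned under `Shapefiles`_, the invocation
is along these lines; the argument order shown is an assumption (check the
utility's own usage message), and the file and column names are again
placeholders.

.. code-block:: bash

    # Write a sorted copy of the roads shapefile, ordered by the ROAD_TYPE
    # column (assumed argument order: input, output, column, direction).
    sortshp roads roads_sorted ROAD_TYPE ascending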
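The *shared_buffers* change mentioned under `PostGIS`_ is made in
*postgresql.conf* and takes effect after PostgreSQL is restarted.  The value
below is purely illustrative; size it to the memory actually available on
your server.

.. code-block:: ini

    # postgresql.conf -- illustrative value only, size to your available RAM
    shared_buffers = 256MB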
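Finally, here is a minimal sketch of CLOSE_CONNECTION=DEFER in context: two
PostGIS layers sharing one deferred connection.  The connection string, table
and column names are placeholders.  Because the CONNECTION strings are
identical and both layers defer closing, MapServer can reuse a single
connection for both.

.. code-block:: mapfile

    LAYER # hypothetical roads layer; connection details are placeholders
      NAME "roads"
      STATUS ON
      TYPE LINE
      CONNECTIONTYPE POSTGIS
      CONNECTION "host=localhost dbname=gisdata user=mapserver"
      DATA "the_geom FROM roads USING UNIQUE gid"
      PROCESSING "CLOSE_CONNECTION=DEFER" # keep the connection open for later layers
      CLASS
        STYLE
          COLOR 128 128 128
        END
      END
    END

    LAYER # identical CONNECTION string, so the deferred connection is reused
      NAME "rivers"
      STATUS ON
      TYPE LINE
      CONNECTIONTYPE POSTGIS
      CONNECTION "host=localhost dbname=gisdata user=mapserver"
      DATA "the_geom FROM rivers USING UNIQUE gid"
      PROCESSING "CLOSE_CONNECTION=DEFER"
      CLASS
        STYLE
          COLOR 0 0 255
        END
      END
    END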