.TH m.clump
.SH NAME
\fIm.clump\fR \- Aggregates point data into clusters of like data using a Voronoi tessellation.
.SH SYNOPSIS
\fBm.clump\fR
.br
\fBm.clump help\fR
.br
.in +3
.ti -3
.na
\fBm.clump\fR [\fB\-rq\fR] \fBinput=\fIname\fR \fBoutput=\fIname\fR [\fBfs=\fIchar\fR] [\fBattributes=\fIfield#\fR[\fI,field#,...\fR]] [\fBbarriers=\fIvectorfile\fR[\fI,vectorfile,...\fR]]
.in
.ad
.SH DESCRIPTION
\fIm.clump\fR clusters points together based on the points' proximity,
point attributes, and the presence of physical constraints (vector barriers)
dividing such clusters.
It first triangulates the points using a Voronoi tessellation
to determine the proximity of points to one another.
Connections among points are maintained where adjacent points have
the same attribute values;
connections are broken where adjacent points have different values
for a given attribute (field).
Connections between adjacent points will also be broken where points fall
on different sides of arcs in user-specified vector maps.
.SH OPTIONS
The user can run the program by specifying input and output file names
and any desired options on the command line, using the form:
.LP
.in +3
.ti -3
.na
\fBm.clump\fR [\fB\-rq\fR] \fBinput=\fIname\fR \fBoutput=\fIname\fR [\fBfs=\fIchar\fR] [\fBattributes=\fIfield#\fR[\fI,field#,...\fR]] [\fBbarriers=\fIvectorfile\fR[\fI,vectorfile,...\fR]]
.in
.ad
.LP
where parameters and flags have the meanings given below.
.LP
\fBFlags:\fR
.IP \fI-r\fR
Only process points in the input file that fall within the user's
current geographic region.
.IP \fI-q\fR
Run quietly (without sending comments on program progress to stdout).
.LP
\fBParameters:\fR
.IP "\fBinput\*=\fIname\fR" 26
Name of an existing file containing, minimally, the easting and northing
of the points to be formed into clusters,
and having the format given in section "INPUT FILE FORMAT".
.IP "\fBoutput\*=\fIname\fR" 26
Name to be assigned to the file that will contain program output.
Output will have the format specified in section "OUTPUT FILE FORMAT".
.IP "\fBfs\*=\fIcharacter\fR" 26
A single character, specifying the field separator used in the \fIinput\fR file
(also used in the \fIoutput\fR file).
If unspecified, the default delimiter is any white space.
.IP "\fBattributes\*=\fIfield#\fR[\fI,field#,...\fR]" 26
One or more attributes to be compared in the input file,
to determine which data points are to be grouped.
This is a list of field numbers (columns) in the input file
which are to be used when forming clumps.
Points that do not have the same values in all of the specified fields
will be placed in distinct clumps.
Fields are numbered starting with 1.
(For example, the x,y coordinates are in fields 1 and 2 respectively.)
.IP "\fBbarriers\*=\fIvectorfile\fR[\fI,vectorfile,...\fR]" 26
One or more vector files to constrain points from joining the same clump.
Points which appear on different sides of any line or area edge
in a user-specified \fIvectorfile\fR
will be placed in distinct clumps in the \fIoutput\fR file.
.SH "INPUT FILE FORMAT"
At a minimum, each line of the input file should have the format:
.LP
.nf
.RS
x,y[,text,text,...]
.RE
.fi
.LP
The input file is required only to contain the easting (x) and
northing (y) values for each point,
unless the user has specified use of the \fIattributes\fR parameter.
The field delimiter (indicated here by a comma) between x and y
and between y and text can be any single character
as specified by the 'fs' parameter.
The default delimiter is white space if 'fs' is not specified.
Additional data fields (columns) may also be present in the input file,
and will be preserved in the output.
.LP
Leading spaces in the input are automatically removed.
Blank lines and lines starting with # are treated as comments and ignored.
.SH "OUTPUT FILE FORMAT"
The output file has the general format:
.LP
.nf
.RS
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
x,y[,text,text,...]
.RE
.fi
The comma here represents the field delimiter and will be the same character
as the delimiter specified to be used in the input file.
.LP
The output format is structured.
Lines with the 'x' at the left margin are the original input points.
Lines with the 'x' indented one space are the points that are 'neighbors'
of the non-indented point.
Empty lines indicate the end of a clump.
.LP
Clumps are groups of points that satisfy either or both of the following:
(1) having the same attribute value(s) in the data field(s)
specified by the user with the \fIattributes\fR parameter;
(2) falling within polygons formed by the vector barriers
specified by the user with the \fIbarriers\fR parameter.
.SH EXAMPLES
In the following example the comma-delimited input file \fItreepecker\fR
is of the form:
.LP
.nf
.RS
# x,y,tree_id,tree_spp,woodpecker_suit,woodpecker_use
432222.22,4651095.23,8074,loblolly pine,high,0
432618.65,4651156.30,8075,loblolly pine,medium,0
432702.67,4651169.82,8076,sugar maple,low,0
432702.63,4651165.72,8077,loblolly pine,high,1
432702.57,4651159.61,8078,loblolly pine,high,1
432702.79,4651173.82,8079,loblolly pine,high,1
432177.53,4651072.01,8080,peach,low,0
432181.50,4650466.25,8081,loblolly pine,high,0
432169.82,4650466.03,8082,loblolly pine,low,0
432235.76,4650467.18,8083,loblolly pine,high,1
432274.53,4650467.81,8084,loblolly pine,medium,1
432216.47,4650225.19,8085,loblolly pine,medium,0
432381.46,4651077.28,8086,loblolly pine,low,1
432640.08,4651005.86,8087,loblolly pine,low,0
432972.11,4651095.98,8088,loblolly pine,high,1
.RE
.fi
where:
.nf
.RS
field 1 = x (easting)
field 2 = y (northing)
field 3 = tree identification number
field 4 = tree species
field 5 = suitability for use as red-cockaded woodpecker habitat
field 6 = current red-cockaded woodpecker
nesting site
.RE
.fi
Assume constraints imposed on clustering include a vector map of roadways
and a vector map of waterways.
.LP
The following command will produce an output file in which all trees
having both the same suitability for use as woodpecker habitat
and the same nesting use status, bounded by roads and waterways,
will appear as clusters of points in the output.
.LP
.RS
\fBm.clump input=treepecker output=treepecker.clumps fs=, attributes=5,6 barriers=roads,waters \fR
.RE
.LP
In this case, program output might look like:
.LP
.nf
.RS
432222.22,4651095.23,8074,loblolly pine,high,0
432618.65,4651156.30,8075,loblolly pine,medium,0
432702.67,4651169.82,8076,sugar maple,low,0
432702.63,4651165.72,8077,loblolly pine,high,1
432972.11,4651095.98,8088,loblolly pine,high,1
432702.57,4651159.61,8078,loblolly pine,high,1
432972.11,4651095.98,8088,loblolly pine,high,1
432702.79,4651173.82,8079,loblolly pine,high,1
432702.63,4651165.72,8077,loblolly pine,high,1
432702.57,4651159.61,8078,loblolly pine,high,1
432702.79,4651173.82,8079,loblolly pine,high,1
432972.11,4651095.98,8088,loblolly pine,high,1
432702.57,4651159.61,8078,loblolly pine,high,1
432972.11,4651095.98,8088,loblolly pine,high,1
432702.63,4651165.72,8077,loblolly pine,high,1
432177.53,4651072.01,8080,peach,low,0
432169.82,4650466.03,8082,loblolly pine,low,0
432169.82,4650466.03,8082,loblolly pine,low,0
432177.53,4651072.01,8080,peach,low,0
432181.50,4650466.25,8081,loblolly pine,high,0
432235.76,4650467.18,8083,loblolly pine,high,1
432274.53,4650467.81,8084,loblolly pine,medium,1
432216.47,4650225.19,8085,loblolly pine,medium,0
432381.46,4651077.28,8086,loblolly pine,low,1
432640.08,4651005.86,8087,loblolly pine,low,0
.RE
.fi
.SH UTILITIES
The user can display program output using GRASS display functions
like \fId.mapgraph\fR and \fId.points\fR.
The following Bourne shell script allows the user to graph
the clustering of points output by \fIm.clump\fR.
.nf
.RS
: ${GISRC?}
file=
label=0
F=
for arg
do
    case "$arg" in
        fs=*) F=-F"`echo $arg|sed s/fs=//`";;
        label=*) eval $arg ;;
        file=*) eval $arg;;
        *) echo "Usage: $0 [fs=c] file=filename [label=#]" >&2
           exit 1 ;;
    esac
done
if [ "$file" = "" ]
then
    echo "Usage: $0 [fs=c] file=filename [label=#]" >&2
    exit 1
fi
# Plot each original (non-indented) point, optionally with a label field.
# ${F:+"$F"} passes the -F option only when fs= was given.
awk ${F:+"$F"} "BEGIN {label=$label}"'
NF == 0 {next}
/^ / {next}
{if (label!=0) print $1,$2,$label
else print $1,$2}
\&' "$file" | d.points size=10
# Draw a line from each original point to each of its indented neighbors.
awk ${F:+"$F"} '
NF == 0 {next}
/^ /{ print "move",east,north; print "draw",$1,$2; next}
{ east=$1; north=$2}
\&' "$file" | d.mapgraph color=red
.RE
.fi
.SH NOTES
If the user specifies neither 'attributes' nor 'barriers' parameters,
the resultant output file will have only one clump
(because there will be no basis for breaking any proximity connections
among points).
.LP
Input lines that \fIm.clump\fR doesn't understand are ignored.
This means that if a line in the \fIinput\fR file is not a comment
but doesn't have (or doesn't appear to have) an x,y coordinate-pair
as its first two fields, the line will be ignored.
The most common cause of ignored lines will be user error
(e.g., the user's failure to specify the input file field separator).
If unrecognized lines in the input file exist,
\fIm.clump\fR will print one message (to stderr)
noting that some unrecognized lines were found.
.SH BUGS
Input lines which are longer than 4095 characters will be silently truncated.
Fields which are longer than 1023 characters will probably cause
\fIm.clump\fR to core dump (at best)
or to produce invalid results (not so great).
.SH "SEE ALSO"
UNIX Manual entries for \fIawk\fR and \fIsed\fR.
.LP
.I d.mapgraph,
.I d.points,
.I g.region,
.I s.geom,
.I v.geom,
and
.I parser
.LP
Example Bourne-shell scripts which process the output from \fIm.clump\fR
can be found with the source code for \fIm.clump\fR:
.IP \fImapgraph.sh\fR:
Tool to graph the connections among data points found by \fIm.clump\fR.
.IP \fIpoints.sh\fR:
Tool to display the centroids of clumps created by \fIm.clump\fR.
.IP \fIarea.sh\fR:
Tool to sum the area of points associated with each clump created by
\fIm.clump\fR, using data stored in a user-specified field
of the input data file.
.SH AUTHOR
Michael Shapiro, U.S. Army Construction Engineering Research Laboratories
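.SH "POST-PROCESSING SKETCH"
The layout described under "OUTPUT FILE FORMAT" (original points at the left margin, neighbors indented one space, an empty line closing each clump) is enough to summarize a clump file with \fIawk\fR alone.
The following is a minimal sketch under that assumption, not one of the scripts distributed with \fIm.clump\fR; the function name clump_sizes is hypothetical and chosen only for illustration.
.nf
```shell
#!/bin/sh
# clump_sizes FILE
# Hypothetical helper: prints "clump_number point_count" for each clump
# found in FILE, assuming the layout described under OUTPUT FILE FORMAT.
clump_sizes() {
    awk '
        # An empty line closes the current clump, if it has any points.
        NF == 0 { if (n > 0) { print ++clump, n; n = 0 }; next }
        # Indented lines list neighbors; they are not new points.
        /^ /    { next }
        # Any other line is an original point in the current clump.
                { n++ }
        # Report a final clump that is not followed by an empty line.
        END     { if (n > 0) print ++clump, n }
    ' "$1"
}
```
.fi
Calling clump_sizes treepecker.clumps would print one line per clump, giving the clump's ordinal position in the file and the number of points it contains.
Because only the first character of each line is inspected, the sketch does not depend on the field separator chosen with 'fs'.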