.TH s.kcv
.SH NAME
\fIs.kcv\fR \- Randomly partition sites into test/train sets.
.br
.I (GRASS Sites Program)
.SH SYNOPSIS
\fBs.kcv\fR
.br
\fBs.kcv help\fR
.br
\fBs.kcv \fR[\fB-dq\fR] \fBk\*=\fIvalue\fR \fBsites\*=\fIname\fR
.SH DESCRIPTION
.I s.kcv
randomly divides a sites list into
.I k
sets of test/train data (for \fBk\fR-fold \fBc\fRross \fBv\fRalidation).
Test partitions are mutually exclusive: each site appears in exactly one
test partition and in
.I k-1
training partitions.
.LP
The program generates a random point using the selected random number
generator and then finds the site closest to it. This site is removed from
the candidate list (so it cannot be selected for any other test set) and
saved in the first test partition file. This is repeated until enough
sites have been selected for the test partition. The number of sites
chosen for each test partition depends on the number of sites available
and the number of partitions requested (the number is kept as consistent
as possible across partitions while ensuring that every site is chosen
for testing). This process of filling a test partition is carried out
.I k
times.
.LP
\fBFlags:\fR
.IP \fB-d\fR 18
Use
\fIdrand48()\fR
(default is
\fIrand()\fR).
.IP \fB-q\fR 18
Quiet. Do not report progress.
.LP
\fBParameters:\fR
.IP \fBk\*=\fIvalue\fR 18
Positive integer giving the number of partitions.
.LP
.IP \fBsites\*=\fIname\fR 18
Name of the sites file to partition; also used as the basename for the
output files.
.LP
Test/train pairs are saved as sites lists using
.I name
as a basename. Test sites are saved in \fIname\fR-test.\fIi\fR
while training sites are saved in \fIname\fR-train.\fIi\fR, where
.I i
ranges from zero to
\fIk\fR.
.SH NOTES
Existing files are silently overwritten.
.LP
An ideal random sites generator will follow a Poisson distribution.
The merits of
.I rand()
versus
.I drand48()
in this respect are unknown. Train/test partitions, though, can only be
as random as the original sites. This program simply divides sites up in
a random manner.
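.LP
As an illustration only, the selection procedure given under DESCRIPTION
can be sketched in Python as follows. This is not the program's source:
the function name kcv_partition, its arguments, the use of plain
Euclidean distance, and the rule that gives the first partitions one
extra site when the count does not divide evenly are all assumptions
made for the sketch.
.nf

```python
import random


def kcv_partition(sites, k, region, seed=None):
    """Return k (test, train) pairs built by nearest-site sampling.

    sites  -- list of (x, y) tuples
    region -- (west, east, south, north) bounds for the random points
    All names here are hypothetical, chosen for this sketch.
    """
    rng = random.Random(seed)
    west, east, south, north = region
    n = len(sites)
    candidates = list(range(n))  # indices of sites not yet used for testing
    pairs = []
    for i in range(k):
        # Assumed sizing rule: sizes differ by at most one and sum to n.
        size = n // k + (1 if i < n % k else 0)
        test_idx = []
        for _ in range(size):
            # Draw a random point inside the region ...
            px = rng.uniform(west, east)
            py = rng.uniform(south, north)
            # ... and take the remaining site closest to it.
            nearest = min(
                candidates,
                key=lambda j: (sites[j][0] - px) ** 2 + (sites[j][1] - py) ** 2,
            )
            candidates.remove(nearest)
            test_idx.append(nearest)
        chosen = set(test_idx)
        test = [sites[j] for j in test_idx]
        # train = every site that is not in this test partition
        train = [sites[j] for j in range(n) if j not in chosen]
        pairs.append((test, train))
    return pairs
```

.fi
Under the sizing rule assumed here, seven sites with
.I k
set to 3 give test partitions of three, two and two sites, and each
training partition holds the remaining sites.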
.LP
Be warned that random number generation occurs over the intervals
defined by the current region.
.LP
This program may not work properly with Lat-long data.
.SH SEE ALSO
.I rand(3),
.I drand48(3),
.I s.rand
and
.I g.region
.SH BUGS
Please send all bug fixes and comments to the author.
.SH AUTHOR
James Darrell McCauley, Purdue University
.if n .br
(mccauley@ecn.purdue.edu)