s.kcv                  GRASS Reference Manual                  s.kcv

NAME
     s.kcv - Randomly partition sites into test/train sets.
     (GRASS Sites Program)

SYNOPSIS
     s.kcv
     s.kcv help
     s.kcv [-dq] k=value sites=name

DESCRIPTION
     s.kcv randomly divides a sites list into k sets of test/train
     data (for k-fold cross validation). Test partitions are mutually
     exclusive. That is, a site will appear in only one test partition
     and in k-1 training partitions.

     The program generates a random point using the selected random
     number generator and then finds the site closest to it. This site
     is removed from the candidate list (meaning that it will not be
     selected for any other test set) and saved in the first test
     partition file. This is repeated until enough points have been
     selected for the test partition. The number of sites chosen for
     each test partition depends upon the number of sites available
     and the number of partitions chosen (this number is made as
     consistent as possible while ensuring that all sites will be
     chosen for testing). This process of filling up a test partition
     is done k times.

     Flags:

     -d   Use drand48() (default is rand()).

     -q   Quiet. Don't report progress.

     Parameters:

     k=value
          Positive integer value indicating the number of partitions.

     sites=name
          Name of the sites file to be partitioned.

     Test/train pairs are saved as sites lists using name as a
     basename. Test sites are saved in name-test.i while training
     sites are saved in name-train.i, where i ranges from zero to k.

NOTES
     Existing files are silently overwritten.

     An ideal random sites generator will follow a Poisson
     distribution. The merits of rand() versus drand48() in this
     respect are unknown. Train/test partitions, though, can only be
     as random as the original sites. This program simply divides
     sites up in a random manner.

     Be warned that random points are generated over the intervals
     defined by the current region.

     This program may not work properly with latitude-longitude data.
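
EXAMPLE
     The partitioning procedure given under DESCRIPTION can be
     illustrated with a short Python sketch. This is not the s.kcv
     source code. The function names, the (west, east, south, north)
     region tuple, and the rule that test partition sizes differ by at
     most one are all assumptions made for illustration; the manual
     only states that the sizes are made as consistent as possible.

```python
import random


def partition_sizes(n_sites, k):
    """Split n_sites into k test-set sizes that differ by at most one.

    Assumption: s.kcv's exact sizing rule is not documented; this is one
    way to keep sizes consistent while covering every site exactly once.
    """
    base, extra = divmod(n_sites, k)
    return [base + 1 if i < extra else base for i in range(k)]


def kcv_partitions(sites, k, region, seed=None):
    """Return k (test, train) pairs of site indices.

    sites  : list of (x, y) coordinates
    region : hypothetical (west, east, south, north) bounds standing in
             for the current GRASS region
    """
    rng = random.Random(seed)
    west, east, south, north = region
    candidates = list(range(len(sites)))  # sites not yet used for testing
    folds = []
    for size in partition_sizes(len(sites), k):
        test = []
        for _ in range(size):
            # Draw a random point inside the region.
            px = rng.uniform(west, east)
            py = rng.uniform(south, north)
            # Find the remaining candidate site closest to that point.
            nearest = min(
                candidates,
                key=lambda i: (sites[i][0] - px) ** 2 + (sites[i][1] - py) ** 2,
            )
            # Remove it so that it cannot enter another test partition.
            candidates.remove(nearest)
            test.append(nearest)
        chosen = set(test)
        # Every site not in this test partition is a training site.
        train = [i for i in range(len(sites)) if i not in chosen]
        folds.append((test, train))
    return folds
```

     Each returned pair corresponds to one name-test.i / name-train.i
     file pair. Because a selected site leaves the candidate list, the
     test partitions are mutually exclusive and every site is tested
     exactly once.
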

SEE ALSO
     rand(3), drand48(3), s.rand, g.region

BUGS
     Please send all bug fixes and comments to the author.

AUTHOR
     James Darrell McCauley, Purdue University
     (mccauley@ecn.purdue.edu)

NOTICE
     This program is part of the contrib section of the GRASS
     distribution. As such, it is externally contributed code that has
     not been examined or tested by the Office of GRASS Integration.

GRASS 5.0beta7                              GRASS Development Team