1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58
|
* Overview
The "TwoDimensionTwoGaussian.dat" file has 2000 measurement
vectors. Each measurement vector consist of two components of
double. The measurement vectors are generated using GNU R
statistical package (http://www.r-project.org).
Half of the measurement vectors are generated using the the
"First Gaussian" class statistics and the other half are
generated using the "Second Gaussian" class statistics in the
following section. First 1000 rows in the file belong to the
first class and the rest belong to the second class.
The data is generated by a multivariate random normal variate
generator (rmvnorm function in mvtnorm pacakge). The exact GNU
R script for genation is in the "GNU R script" section of this
document.
* Class Statistics
1) class statistics for random sample generation
First Gaussian class:
mean
100 100
covariance
800 0
0 800
Second Gaussian class:
mean
200 200
covariance
800 0
0 800
2) class statistics of the generated samples
First Gaussian class:
mean
99.261 100.078
covariance
814.95741 38.40308
38.40308 817.64446
Second Gaussian class:
mean
200.1 201.3
covariance
859.785295 -3.617316
-3.617316 848.991508
* GNU R Script
library(mvtnorm)
mean1 = c(100, 100)
mean2 = c(200, 200)
sigma = matrix(data=c(800, 0, 0, 800), ncol=2, nrow=2, byrow=T)
gauss1 = rmvnorm(1000, mean1, sigma)
gauss2 = rmvnorm(1000, mean2, sigma)
mydata = rbind(gauss1, gauss2)
write.table(mydata, file="TwoDimensionTwoGaussian.dat", quote=F, row.names=F, col.names=F)
|