
IMSL_KOLMOGOROV2

The IMSL_KOLMOGOROV2 function performs a Kolmogorov-Smirnov two-sample test.

The IMSL_KOLMOGOROV2 function computes Kolmogorov-Smirnov two-sample test statistics for testing that two continuous cumulative distribution functions (CDFs) are identical, based upon two random samples. One- or two-sided alternatives are allowed. If n_observations_x = N_ELEMENTS(x) and n_observations_y = N_ELEMENTS(y), then exact p-values are computed for the two-sided test when n_observations_x * n_observations_y is less than 10^4.

Let Fn(x) denote the empirical CDF in the X sample and let Gm(y) denote the empirical CDF in the Y sample, where n = n_observations_x - NMISSINGX and m = n_observations_y - NMISSINGY, and let the corresponding population distribution functions be denoted by F(x) and G(y), respectively. Then, the hypotheses tested by IMSL_KOLMOGOROV2 are as follows:

• H0: F(x) = G(x)     H1: F(x) ≠ G(x)
• H0: F(x) ≥ G(x)     H1: F(x) < G(x)
• H0: F(x) ≤ G(x)     H1: F(x) > G(x)

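The test statistics are given as follows:

$$D_n = \max\left(D_n^{+},\, D_n^{-}\right)$$

$$D_n^{+} = \sup_x \left(F_n(x) - G_m(x)\right)$$

$$D_n^{-} = \sup_x \left(G_m(x) - F_n(x)\right)$$

Asymptotically, the distribution of the statistic

$$Z = D_n \sqrt{\frac{mn}{m+n}}$$

(returned in Result(0)) converges to a distribution given by Smirnov (1939).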
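The statistics described here can be sketched in a few lines. The following is a minimal illustration in Python, not the IMSL implementation. It assumes the usual two-sample definitions (D+ as the largest value of Fn - Gm, D- as the largest value of Gm - Fn, and D as the larger of the two) and Z = D * sqrt(m*n / (m+n)), which agrees with the D and Z values printed in the example on this page. The function name `ks2_statistics` is hypothetical, and missing values are assumed to have been removed beforehand.

```python
import math
from bisect import bisect_right


def ks2_statistics(x, y):
    """Return (D, D_plus, D_minus, Z) for two samples without missing values."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d_plus = d_minus = 0.0
    # The empirical CDFs only change at observed values, so it is enough
    # to compare them at every point of the pooled sample.
    for t in xs + ys:
        fn = bisect_right(xs, t) / n  # Fn(t): fraction of x values <= t
        gm = bisect_right(ys, t) / m  # Gm(t): fraction of y values <= t
        d_plus = max(d_plus, fn - gm)
        d_minus = max(d_minus, gm - fn)
    d = max(d_plus, d_minus)
    # Scaled statistic (assumed form, consistent with the example output).
    z = d * math.sqrt(m * n / (m + n))
    return d, d_plus, d_minus, z


d, d_plus, d_minus, z = ks2_statistics([0.1, 0.4, 0.7], [0.5, 0.6, 0.9, 1.0])
```

In this small usage example every x value below 0.5 precedes all y values, so D+ is 2/3 and D- is 0.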

Exact probabilities for the two-sided test are computed when m * n is less than or equal to 10^4, according to an algorithm given by Kim and Jennrich (1973). When m * n is greater than 10^4, the very good approximations given by Kim and Jennrich are used to obtain the two-sided p-values. The one-sided probability is taken as one half the two-sided probability. This is a very good approximation when the p-value is small (say, less than 0.10) but not for large p-values.
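To illustrate the asymptotic result, the limiting distribution is commonly written as the Kolmogorov series P(Z > z) = 2 * sum over k >= 1 of (-1)^(k-1) * exp(-2 k^2 z^2). The Python sketch below evaluates that series and halves it to get a one-sided value, following the halving rule described above. It is an illustration only: IMSL_KOLMOGOROV2 uses the Kim and Jennrich exact algorithm or their approximations, so its p-values will generally differ from this plain limiting series, especially for small samples. The function name `smirnov_tail` is hypothetical.

```python
import math


def smirnov_tail(z, terms=100):
    """Two-sided limiting tail probability P(Z > z) from the Kolmogorov series."""
    if z <= 0.0:
        return 1.0
    total = 0.0
    for k in range(1, terms + 1):
        # Alternating series; terms decay like exp(-2 k^2 z^2).
        total += (-1) ** (k - 1) * math.exp(-2.0 * k * k * z * z)
    # Clamp to [0, 1] to guard against round-off when z is very small.
    return min(1.0, max(0.0, 2.0 * total))


two_sided = smirnov_tail(1.10227)  # Z taken from the example on this page
one_sided = 0.5 * two_sided        # halving rule described in the text
```

For the example's Z this series gives a two-sided value of roughly 0.18, whereas the routine reports 0.144021 for these sample sizes (m * n = 6000, inside the range where the text says exact probabilities are computed).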

Example

The following example illustrates the IMSL_KOLMOGOROV2 routine with two randomly generated samples from a uniform(0,1) distribution. Since the two theoretical distributions are identical, we would not expect to reject the null hypothesis.

`; Seed the generator so the run is reproducible`
`IMSL_RANDOMOPT, Set = 123457`
`; Two independent uniform(0,1) samples of different sizes`
`x = IMSL_RANDOM(100, /Uniform)`
`y = IMSL_RANDOM(60, /Uniform)`
`stats = IMSL_KOLMOGOROV2(x, y, DIFFERENCES = d, $`
`  NMISSINGX = nmx, NMISSINGY = nmy)`
`; d holds D, D+, D-; stats holds Z and the two p-values`
`PRINT, 'D =', d(0)`
`PRINT, 'D+ =', d(1)`
`PRINT, 'D- =', d(2)`
`PRINT, 'Z =', stats(0)`
`PRINT, 'Prob greater D one sided =', stats(1)`
`PRINT, 'Prob greater D two sided =', stats(2)`
`PRINT, 'Missing X =', nmx`
`PRINT, 'Missing Y =', nmy`
` `
`D =    0.180000`
`D+ =   0.180000`
`D- =   0.0100001`
`Z =    1.10227`
`Prob greater D one sided =    0.0720105`
`Prob greater D two sided =    0.144021`
`Missing X =    0`
`Missing Y =    0`
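As a quick consistency check on this output, assuming the statistic takes the usual form Z = D * sqrt(m*n / (m+n)) with n = 100 and m = 60 (no missing values were reported), the printed values fit together as follows:

$$Z = 0.18\sqrt{\frac{100 \cdot 60}{100 + 60}} = 0.18\sqrt{37.5} \approx 0.18 \times 6.1237 \approx 1.1023$$

$$p_{\text{one-sided}} = \tfrac{1}{2}\, p_{\text{two-sided}} = \tfrac{1}{2}\,(0.144021) \approx 0.0720105$$

Both p-values are well above common significance levels such as 0.05, so the null hypothesis of identical distributions is not rejected, as expected.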

Syntax

Result = IMSL_KOLMOGOROV2(X, Y [, DIFFERENCES=variable] [, /DOUBLE] [, NMISSINGX=variable] [, NMISSINGY=variable])

Return Value

One-dimensional array of length 3 containing Z, p1, and p2: the test statistic Z, the one-sided p-value, and the two-sided p-value, in that order.

Arguments

X

One-dimensional array containing the observations from sample one.

Y

One-dimensional array containing the observations from sample two.

Keywords

DIFFERENCES (optional)

Named variable into which a one-dimensional array containing Dn, Dn+, and Dn- (in that order) is stored.

DOUBLE (optional)

If present and nonzero, then double precision is used.

NMISSINGX (optional)

Named variable into which the number of missing values in the x sample is stored.

NMISSINGY (optional)

Named variable into which the number of missing values in the y sample is stored.

Version History

 6.4 Introduced