City Research Online

Computing the Kolmogorov-Smirnov Distribution when the Underlying cdf is Purely Discrete, Mixed or Continuous

Dimitrova, D. S., Kaishev, V. K. & Tan, S. (2020). Computing the Kolmogorov-Smirnov Distribution when the Underlying cdf is Purely Discrete, Mixed or Continuous. Journal of Statistical Software, 95(10), pp. 1-42. doi: 10.18637/jss.v095.i10

Abstract

The distribution of the Kolmogorov-Smirnov (K-S) test statistic has been widely studied under the assumption that the underlying theoretical cdf, F(x), is continuous. However, many real-life applications require fitting discrete or mixed distributions. Due to inherent difficulties, the distribution of the K-S statistic when F(x) has jump discontinuities has been studied to a much lesser extent, and no exact and efficient computational methods have been proposed in the literature. In this paper, we first provide a fast and accurate method to compute the (complementary) cdf of the K-S statistic when F(x) is discontinuous, and thus obtain exact p values of the K-S test. Our approach is to express the complementary cdf through the rectangle probability for uniform order statistics, and to compute it using the Fast Fourier Transform (FFT). Second, we provide a C++ and an R implementation of the proposed method, which fills the existing gap in statistical software. We also give a useful extension of Schmid’s asymptotic formula for the distribution of the K-S statistic, relaxing his requirement for F(x) to be increasing between jumps and thus allowing for any general mixed or purely discrete F(x). The numerical performance of the proposed FFT-based method, implemented both in C++ and in the R package KSgeneral, is illustrated when F(x) is mixed, purely discrete, and continuous. The performance of the general asymptotic formula is also studied.
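
To make the quantity in the abstract concrete, the following minimal Python sketch computes the two-sided K-S statistic D_n = sup_x |F_n(x) - F(x)| for a purely discrete F(x) and estimates P(D_n >= d) by plain Monte Carlo simulation. This is not the FFT-based method of the paper and does not use the KSgeneral package; simulation is swapped in only as a simple stand-in for the exact computation. The function names (`ks_statistic`, `draw_sample`, `mc_p_value`), the Binomial(2, 0.5) example and the number of replications are all illustrative assumptions, and the sample is assumed to take values only in the support of F.

```python
import random
from bisect import bisect_left, bisect_right


def ks_statistic(sample, support, cdf_values):
    """Two-sided K-S statistic for a purely discrete cdf.

    support is the sorted list of atoms and cdf_values[j] = F(support[j]).
    Assuming every sample value is an atom, F_n and F are both
    right-continuous step functions that jump only at atoms, so the
    supremum of |F_n - F| is attained at one of the atoms.
    """
    xs = sorted(sample)
    n = len(xs)
    return max(
        abs(bisect_right(xs, x) / n - F) for x, F in zip(support, cdf_values)
    )


def draw_sample(n, support, cdf_values, rng):
    """Draw n values from F by inverse-transform sampling."""
    last = len(support) - 1
    return [
        support[min(bisect_left(cdf_values, rng.random()), last)]
        for _ in range(n)
    ]


def mc_p_value(sample, support, cdf_values, reps=5000, seed=0):
    """Monte Carlo estimate of P(D_n >= d_obs) under the null cdf F."""
    rng = random.Random(seed)
    d_obs = ks_statistic(sample, support, cdf_values)
    n = len(sample)
    hits = 0
    for _ in range(reps):
        d_sim = ks_statistic(
            draw_sample(n, support, cdf_values, rng), support, cdf_values
        )
        # Small tolerance guards against floating-point ties.
        if d_sim >= d_obs - 1e-12:
            hits += 1
    return d_obs, hits / reps


# Hypothetical example: F is the Binomial(2, 0.5) cdf on {0, 1, 2}.
support = [0, 1, 2]
cdf_values = [0.25, 0.75, 1.0]
d_obs, p_hat = mc_p_value([0, 0, 1, 2], support, cdf_values)
print(d_obs, p_hat)  # d_obs is 0.25; p_hat is a simulation estimate
```

Because D_n takes only finitely many values when F(x) is discrete, ties between simulated and observed statistics are common, which is one reason the continuous-case distribution of D_n does not give exact p values here. The simulated estimate carries Monte Carlo error, whereas the method described in the abstract computes the complementary cdf exactly.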

Publication Type: Article
Additional Information: Published under a Creative Commons Attribution License (CC-BY).
Publisher Keywords: Kolmogorov-Smirnov test statistic; discontinuous (discrete or mixed) distribution; Fast Fourier Transform; double boundary non-crossing; rectangle probability for uniform order statistics
Departments: Bayes Business School > Actuarial Science & Insurance
Text - Published Version
Available under License Creative Commons: Attribution 3.0.

Text - Accepted Version
Other (KSgeneral C++ code and Replication material) - Supplemental Material
Available under License Software: Creative Commons: GNU GPL 2.0.
