(Convolution) kernel density estimation

A nonparametric method of approximating something from data
by assuming that it’s close to the data distribution
convolved with some kernel.

This is especially popular when the target is a probability density function;
then you are working with a kernel density estimator.
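To make the "data distribution convolved with some kernel" idea concrete, here is a minimal sketch assuming a Gaussian kernel and a hand-picked bandwidth. The function name `gaussian_kde` and all parameter values are mine, chosen for illustration only.

```python
import numpy as np

def gaussian_kde(x_eval, data, bandwidth):
    """Evaluate a Gaussian kernel density estimate at the points x_eval.

    The estimate is the empirical distribution (one Dirac mass per datum)
    convolved with a Gaussian kernel of standard deviation `bandwidth`.
    """
    x_eval = np.asarray(x_eval, dtype=float)
    data = np.asarray(data, dtype=float)
    # Pairwise standardised distances, shape (len(x_eval), len(data)).
    z = (x_eval[:, None] - data[None, :]) / bandwidth
    kernel_values = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    # Averaging over the data is the convolution with the empirical measure.
    return kernel_values.mean(axis=1)

rng = np.random.default_rng(0)
sample = rng.normal(size=200)          # toy data, standard normal
grid = np.linspace(-8.0, 8.0, 2001)    # evaluation grid
density = gaussian_kde(grid, sample, bandwidth=0.4)  # bandwidth picked by hand
```

This costs O(NM) for N data points and M evaluation points, which is fine for a toy but is exactly what the fast Gauss transform literature in the references tries to avoid.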

Bandwidth/kernel selection in density estimation

Bernacchia and Pigolotti (BePi11) have a neat hack:
“self consistency” for simultaneous
kernel and distribution inference,
i.e. simultaneous deconvolution and
bandwidth selection.
The idea is to remove bias using simple spectral methods,
thereby estimating a kernel which, in a certain sense, would generate the data that you just observed.
The results look similar to finite-sample corrections for Gaussian scale parameter estimates, but are not quite Gaussian.
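For orientation, here is why spectral methods are natural in this setting. The notation is mine and not necessarily that of BePi11. Write $$K$$ for the kernel (bandwidth absorbed into it), $$\kappa$$ for its Fourier transform and $$\Delta$$ for the empirical characteristic function of the data $$X_1, \dots, X_N$$. Then the convolution defining the estimate becomes a product:

$$
\hat{f}(x) = \frac{1}{N}\sum_{j=1}^{N} K(x - X_j),
\qquad
\Delta(t) = \frac{1}{N}\sum_{j=1}^{N} e^{itX_j},
\qquad
\hat{\phi}(t) = \int e^{itx}\,\hat{f}(x)\,\mathrm{d}x = \kappa(t)\,\Delta(t).
$$

So choosing a kernel and bandwidth amounts to choosing a frequency-wise multiplier $$\kappa(t)$$ for the empirical characteristic function. As I understand it, the self-consistent method picks that multiplier from the data rather than fixing it in advance.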

Question: could it work with mixture models too?

Mixture models

Where the number of kernels does not grow as fast as the number of data points,
this becomes a mixture model; or, if you’d like, kernel density estimates are a limiting case of mixture model estimates.

They are so clearly similar that I think it best we not make them both
feel awkward by dithering about where the free parameters are.
Anyway, they are filed separately.
BaLi13, ZeMe97 and Geer96 discuss some useful results common to various convex combination estimators.
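The kinship is easy to see in code. In this sketch, which assumes Gaussian components throughout, a kernel density estimate is a mixture with one equal-weight component per datum and a shared scale. A "proper" mixture uses far fewer components with free weights, means and scales. The function name and the hand-picked mixture parameters are illustrative only, and nothing here is fitted.

```python
import numpy as np

def gaussian_mixture_pdf(x_eval, weights, means, scales):
    """Density of a finite Gaussian mixture, evaluated at x_eval."""
    x_eval = np.asarray(x_eval, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    # Allow a single shared scale or one scale per component.
    scales = np.broadcast_to(np.asarray(scales, dtype=float), means.shape)
    z = (x_eval[:, None] - means[None, :]) / scales[None, :]
    components = np.exp(-0.5 * z**2) / (scales[None, :] * np.sqrt(2.0 * np.pi))
    # Convex combination of the component densities.
    return components @ weights

rng = np.random.default_rng(1)
data = rng.normal(size=50)
grid = np.linspace(-6.0, 6.0, 601)
h = 0.5  # shared bandwidth, picked by hand

# KDE: one component per datum, equal weights, common scale h.
kde = gaussian_mixture_pdf(grid, np.full(data.size, 1.0 / data.size), data, h)

# Mixture: two components with free parameters (hand-picked, not fitted).
mixture = gaussian_mixture_pdf(grid, [0.3, 0.7], [-1.0, 0.5], [0.8, 1.0])
```

The only difference between the two calls is where the free parameters live: in the kernel density estimate the means are pinned to the data and only `h` is free.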

Does this work with uncertain point locations?

The fact we can write the kernel density estimate as a convolution of
the kernel with a sum of Dirac deltas immediately suggests
that we could convolve with something else instead, such as Gaussians.
Can we recover well-behaved estimates in that case?
This would be a kind of hierarchical model, possibly a very normal Bayesian one.
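As a naive first pass, assume each observation is reported as a Gaussian with known standard deviation and that the kernel is Gaussian too. Convolving two Gaussians adds their variances, so each point just gets its own, wider kernel. The function name and the simulated noise levels below are made up for illustration. This only shows that the result is still a proper density and says nothing about whether it is a good estimator.

```python
import numpy as np

def uncertain_location_kde(x_eval, locations, location_sds, bandwidth):
    """Gaussian KDE where datum j is itself N(locations[j], location_sds[j]**2).

    Convolving a Gaussian of variance s**2 with a Gaussian kernel of variance
    bandwidth**2 gives a Gaussian of variance s**2 + bandwidth**2.
    """
    x_eval = np.asarray(x_eval, dtype=float)
    locations = np.asarray(locations, dtype=float)
    location_sds = np.asarray(location_sds, dtype=float)
    # Per-point effective kernel width after the extra convolution.
    total_sd = np.sqrt(location_sds**2 + bandwidth**2)
    z = (x_eval[:, None] - locations[None, :]) / total_sd[None, :]
    kernels = np.exp(-0.5 * z**2) / (total_sd[None, :] * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)

rng = np.random.default_rng(2)
locations = rng.normal(size=100)
location_sds = rng.uniform(0.0, 0.5, size=100)  # made-up measurement noise
grid = np.linspace(-8.0, 8.0, 1601)
density = uncertain_location_kde(grid, locations, location_sds, bandwidth=0.3)
```

Note that this smooths more where the locations are less certain, which is the opposite of deconvolution. A hierarchical treatment would presumably want to undo the measurement noise rather than add to it.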

Does this work with asymmetric kernels?

Almost all the kernel estimates I’ve seen require the kernels to be symmetric, because of
Daren Cline’s argument that asymmetric kernels are inadmissible
in a decision-theoretic context in the class of all
(possibly multivariate) densities.
Presumably this implies $$\mathcal{C}^1$$ distributions,
i.e. once-differentiable ones, without atoms.
In particular admissible kernels are those
which have “nonnegative Fourier transforms bounded by 1”,
which implies symmetry about the origin, since a real-valued Fourier transform corresponds to an even kernel.
If we have a constrained class of densities, this might not apply.
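The quoted Fourier condition is easy to check numerically for a given kernel. The sketch below approximates the transform by a Riemann sum and tests three kernels of my own choosing: Gaussian, Epanechnikov and a one-sided exponential. The helper names, grids and tolerance are all assumptions for illustration. It checks only the condition as quoted above, not Cline's full admissibility argument.

```python
import numpy as np

def kernel_fourier_transform(kernel_pdf, t, x):
    """Approximate the Fourier transform of kernel_pdf at frequencies t
    using a Riemann sum over the uniform grid x."""
    dx = x[1] - x[0]
    phases = np.exp(1j * t[:, None] * x[None, :])
    return (phases * kernel_pdf(x)[None, :]).sum(axis=1) * dx

def gaussian(x):
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def epanechnikov(x):
    # Symmetric and compactly supported on [-1, 1].
    return np.where(np.abs(x) <= 1.0, 0.75 * (1.0 - x**2), 0.0)

def exponential(x):
    # Asymmetric: all mass on the positive half-line.
    # abs() keeps the exponent non-positive on the branch that is discarded.
    return np.where(x >= 0.0, np.exp(-np.abs(x)), 0.0)

x = np.linspace(-30.0, 30.0, 6001)
t = np.linspace(-10.0, 10.0, 201)
tol = 1e-3  # loose tolerance to absorb quadrature error

results = {}
for name, kernel in [("gaussian", gaussian),
                     ("epanechnikov", epanechnikov),
                     ("exponential", exponential)]:
    ft = kernel_fourier_transform(kernel, t, x)
    is_real = bool(np.max(np.abs(ft.imag)) < tol)
    nonneg_bounded = bool(is_real
                          and ft.real.min() >= -tol
                          and ft.real.max() <= 1.0 + tol)
    results[name] = (is_real, nonneg_bounded)
    print(f"{name}: real transform={is_real}, in [0, 1]={nonneg_bounded}")
```

The asymmetric exponential kernel has a transform with a nonzero imaginary part, so it fails at the first hurdle. The Epanechnikov kernel is symmetric, yet its transform dips below zero, so symmetry is necessary for the quoted condition but not sufficient.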

BaHu86
Barnes, J., & Hut, P. (1986) A hierarchical O(N log N) force-calculation algorithm. Nature, 324(6096), 446–449.
BaHy01
Bashtannyk, D. M., & Hyndman, R. J. (2001) Bandwidth selection for kernel conditional density estimation. Computational Statistics & Data Analysis, 36(3), 279–298.
BaLi13
Battey, H., & Liu, H. (2013) Smooth projected density estimation. arXiv:1308.3968 [Stat].
BePi11
Bernacchia, A., & Pigolotti, S. (2011) Self-consistent method for density estimation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73(3), 407–422.
BoGK10
Botev, Z. I., Grotowski, J. F., & Kroese, D. P. (2010) Kernel density estimation via diffusion. The Annals of Statistics, 38(5), 2916–2957.
CrMí14
Crisan, D., & Míguez, J. (2014) Particle-kernel estimation of the filter density in state-space models. Bernoulli, 20(4), 1879–1929.
DoHa15
Doosti, H., & Hall, P. (2015) Making a non-parametric density estimator more attractive, and more accurate, by data perturbation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 78(2), 445–462.
ElDD03
Elgammal, A., Duraiswami, R., & Davis, L. S. (2003) Efficient kernel density estimation using the fast gauss transform with applications to color modeling and tracking. Pattern Analysis and Machine Intelligence, IEEE Transactions on, 25(11), 1499–1504.
Elli91
Ellis, S. P. (1991) Density estimation for point processes. Stochastic Processes and Their Applications, 39(2), 345–358.
Geen14
Geenens, G. (2014) Probit Transformation for Kernel Density Estimation on the Unit Interval. Journal of the American Statistical Association, 109(505), 346–358.
Gisb03
Gisbert, F. J. G. (2003) Weighted samples, kernel density estimators and convergence. Empirical Economics, 28(2), 335–351.
GrSt91
Greengard, L., & Strain, J. (1991) The Fast Gauss Transform. SIAM Journal on Scientific and Statistical Computing, 12(1), 79–94.
Hall87
Hall, P. (1987) On Kullback-Leibler Loss and Density Estimation. The Annals of Statistics, 15(4), 1491–1519.
HaPa02
Hall, P., & Park, B. U. (2002) New Methods for Bias Correction at Endpoints and Boundaries. The Annals of Statistics, 30(5), 1460–1479.
KoMi06
Koenker, R., & Mizera, I. (2006) Density estimation by total variation regularization. Advances in Statistical Modeling and Inference, 613–634.
MaSc14
Malec, P., & Schienle, M. (2014) Nonparametric kernel density estimation near the boundary. Computational Statistics & Data Analysis, 72, 57–76.
MaHa10
Marshall, J. C., & Hazelton, M. L. (2010) Boundary kernels for adaptive density estimators on regions with irregular boundaries. Journal of Multivariate Analysis, 101(4), 949–963.
OKCC16
O’Brien, T. A., Kashinath, K., Cavanaugh, N. R., Collins, W. D., & O’Brien, J. P. (2016) A fast and objective multidimensional kernel density estimation method: fastKDE. Computational Statistics & Data Analysis, 101, 148–160.
PaKo12
Panaretos, V. M., & Konis, K. (2012) Nonparametric Construction of Multivariate Kernels. Journal of the American Statistical Association, 107(499), 1085–1095.
PJJK03
Park, B. U., Jeong, S.-O., Jones, M. C., & Kang, K.-H. (2003) Adaptive variable location kernel density estimators with good performance at boundaries. Journal of Nonparametric Statistics, 15(1), 61–75.
Raykar, V. C., & Duraiswami, R. (n.d.) The improved fast Gauss transform with applications to machine learning.
Silv82
Silverman, B. W. (1982) On the Estimation of a Probability Density Function by the Maximum Penalized Likelihood Method. The Annals of Statistics, 10(3), 795–810.
SmLe05
Smith, E., & Lewicki, M. S. (2005) Efficient Coding of Time-Relative Structure Using Spikes. Neural Computation, 17(1), 19–45.
Geer96
van de Geer, S. (1996) Rates of convergence for the maximum likelihood estimator in mixture models. Journal of Nonparametric Statistics, 6(4), 293–310.
WaWa07
Wang, B., & Wang, X. (2007) Bandwidth selection for weighted kernel density estimation. arXiv Preprint arXiv:0709.1616.
WeWu15
Wen, K., & Wu, X. (2015) An Improved Transformation-Based Kernel Estimator of Densities on the Unit Interval. Journal of the American Statistical Association, 110(510), 773–783.
YDGD03
Yang, C., Duraiswami, R., Gumerov, N. A., & Davis, L. (2003) Improved Fast Gauss Transform and Efficient Kernel Density Estimation. In Proceedings of the Ninth IEEE International Conference on Computer Vision - Volume 2 (p. 464–). Washington, DC, USA: IEEE Computer Society.
ZeMe97
Zeevi, A. J., & Meir, R. (1997) Density Estimation Through Convex Combinations of Densities: Approximation and Estimation Bounds. Neural Networks: The Official Journal of the International Neural Network Society, 10(1), 99–109.
ZhKa10
Zhang, S., & Karunamuni, R. J. (2010) Boundary performance of the beta kernel estimators. Journal of Nonparametric Statistics, 22(1), 81–104.
