Measuring different types of Well-being

Well-being is a very broad construct. Any good measure of well-being is likely to be multi-dimensional, consisting of many factors or scales. A recent review grouped well-being measures into four broad categories: Hedonic, Eudaimonic, Quality of Life and Wellness [...]

See original: The Mouse Trap Measuring different types of Well-being

The BioPsychoSocioEnvironmental model

Most of us have heard of the BioPsychoSocial model of mental illness, and also of the stress-diathesis model. Today, as I was contemplating the two and taking a cue from my ABCD model of psychology, I tried combining them and found considerable merit in that approach.
To recap, [...]

See original: The Mouse Trap The BioPsychoSocioEnvironmental model

Why and how should water security be saved?

Drawing on a political-ecology approach, this article proposes replacing the conventional notion of water security with that of hydro-social security, in order to address the main contemporary water challenges, foremost among them adaptation to climate change. The aim is then no longer a commodified societalisation of water risks, but rather a democratisation of hydro-social cycles, so as to consolidate the territorially grounded co-construction of sustainable development trajectories. From this perspective, bringing the management of the scientific and societal uncertainties inherent in ecosystem services into the political arena is an unavoidable methodological prerequisite, one that should lead, via a negotiation process grounded in democratic monetary valuation, to the development of “reassuring” institutionalised compromises.

See original: VertigO - la revue électronique en sciences de l'environnement Pourquoi et comment faut-il sauver la sécurité hydrique ?

Understanding restoration and transaction costs in a payment for ecosystem service water quality market in Oregon, USA

In Oregon, USA, water temperature in many streams and rivers is a regulated condition that is being addressed in some contexts through a payment for ecosystem services (PES) approach. Through this approach, regulated utilities that discharge water into streams and rivers that is warmer than allowable are able to pay for riparian forest restoration and the cooling benefits that its shade brings in exchange for regulatory compliance. Mandatory water quality monitoring and reporting activities conducted by entities that make point source emissions provide the basis upon which an ecosystem service (riparian forest shade) is quantified (in kilocalories) and sold (per thermal credit). The City of Medford’s wastewater treatment facility in southern Oregon, USA, is the first regulated entity in Oregon to adopt a PES approach for regulatory compliance. Through analysis of verified trades and numerous other public documents, transaction costs associated with this program are estimated at 85 p...

See original: VertigO - la revue électronique en sciences de l'environnement Understanding restoration and transaction costs in a payment for ecosystem service water quality market in Oregon, USA

A five-step assessment of river ecosystem services to inform conflictive water-flows management – the Ter River case

River conflicts have been the subject of abundant intellectual production. However, analysis of their relation to the appropriation of instream-flow-related ecosystem services (ES) is missing. Such analysis, undertaken with proper account of stakeholders’ views and interests, is the aim of this paper. As in other Mediterranean contexts, multiple water withdrawals from the Ter River (NE Catalonia, Spain), e.g., for hydropower and inter-basin water supply, disrupt instream flows throughout the entire course of the river. Traditionally, this situation has triggered local and regional disputes, becoming a main issue for water management in the basin district. Our methodology entails a five-step assessment for the study of the ES production related to both diverted and instream flows in the Ter River basin. The steps are: identification, characterization, localization, quantification and valuation. A key aspect of the methodology is the engagement of stakeholders; both key infor...

See original: VertigO - la revue électronique en sciences de l'environnement A five-step assessment of river ecosystem services to inform conflictive water-flows management – the Ter River case

Is ecosystem-services valuation of ordinary rivers sustainable?

Valuation by ecosystem services has developed strongly since the early 2000s and the Millennium Ecosystem Assessment. While initiatives to promote ecological management of watercourses have translated, in western France, into numerous ecological restoration operations, conflicts between environmental managers, elected officials, owners of in-stream structures (weirs, dams) and local residents are multiplying. Faced with this situation, elected officials and managers express a need in the field for tools or methods that would make their choices more legitimate, reduce uncertainties, or even structure the debate locally. Their demand thus meets the ecosystem-services approach promoted by the Water Framework Directive (2000) and by many institutions responsible for water management. Beyond the methodological uncertainties and technical difficulties, this...

See original: VertigO - la revue électronique en sciences de l'environnement L’évaluation par les services écosystémiques des rivières ordinaires est-elle durable ?

Origin and uses of the notion of ecosystem services: insights into its contribution to the management of hydrosystems

The considerable popularity of the expression “ecosystem services” makes careful work delimiting its domains of validity necessary. We should therefore ask what such a notion is meant to do, but also what its actual effects are on the management of natural environments. The purpose of this contribution is to discuss the contributions and limits that the notion of ecosystem services offers for the management of watercourses. Analysis of the genealogy and diffusion of this notion of scientific origin shows how, from a metaphorical compromise, it is on the way to becoming a dispositif (in Foucault’s sense) of governmentality. Saying this necessarily limits the domain of validity of ecosystem services: invented to convince certain actors at certain scales (notably that of international governance), the notion does not necessarily have universal reach, perfectly applicable everywhere. Concretely, the study of the ...

See original: VertigO - la revue électronique en sciences de l'environnement Origine et usages de la notion de services écosystémiques : éclairages sur son apport à la gestion des hydrosystèmes

Two Practical Workshops: Magnetite Nanoparticle Preparation and Nanobiophotonics

For the first time in Egypt: two workshops in one event over two days, 30 and 31 July 2016.
The first workshop covers the practical preparation of magnetite nanoparticles; the second is a practical application of nanobiophotonics in quantum healing and bioresonance using the EMAGO device.
Details of the first day are as follows:
Naqaa Nanotechnology Network is organizing a one-day practical workshop on magnetite nanoparticle preparation on Saturday 30 July, from 10:30 am to 3:30 pm, which will include lectures on different applications of magnetite nanoparticles and hands-on preparation of magnetite nanoparticles.
Important: don't forget to bring your lab coat for the practical part.
Spaces are limited to 15 participants, so attendees are asked to register ahead of time.
The second day covers the practical application of nanobiophotonics and will be presented and demonstrated by Engineer Tarek ElAfandy, former IT manager at AUC.
The fee is 300 EGP for Naqaa members.

Maximum processes

Processes which can be represented as the maximum value of some underlying process.

A mathematically tractable way of getting interesting behaviour from boring variables, even IID ones.
The other mathematically convenient way of handling monotonic processes, apart from branching processes and their affiliated counting processes.
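
As a toy illustration (a minimal sketch; the distribution and sample size here are arbitrary), the running maximum of IID variables is already a non-trivial, nondecreasing process whose increments become rarer over time:

    import numpy as np

    rng = np.random.default_rng(42)
    x = rng.exponential(scale=1.0, size=10_000)   # boring IID underlying variables
    m = np.maximum.accumulate(x)                  # the running-maximum process M_n = max(X_1, ..., X_n)

    # The maximum process is monotone, and new records arrive ever more rarely:
    record_times = np.flatnonzero(np.diff(m) > 0) + 1
    print(f"{len(record_times) + 1} records in {len(x)} steps; last record at n = {record_times[-1] + 1}")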

I had my interest in these rekindled recently by Peter Straka, after first running into them in a lecture by Paul Embrechts in the context of risk management.

My former co-supervisor Sara van de Geer then introduced me to another class of them, where the maximum is not taken over the state space of a scalar random variable but arises in maximal-deviation inequalities for the convergence of empirical distributions. These latter ones are not so tractable, which is why I strategically retreated.

Peter assures me that if I read Ressel I will receive dividends.
Supposedly the time transform is especially rich, and the semigroup structure especially convenient?

Obviously this needs to be made precise, which may happen if it turns out to actually help.

Refs

EmKM97
Embrechts, P., Klüppelberg, C., & Mikosch, T. (1997) Risk Theory. In Modelling Extremal Events (pp. 21–57). Springer Berlin Heidelberg
Laur12
Lauritzen, S. L.(2012) Extremal Families and Systems of Sufficient Statistics. Springer Science & Business Media
McFE05
McNeil, A. J., Frey, R., & Embrechts, P. (2005) Quantitative risk management: concepts, techniques and tools. Princeton: Princeton Univ. Press
Ress91
Ressel, P. (1991) Semigroups in Probability Theory. In H. Heyer (Ed.), Probability Measures on Groups X (pp. 337–363). Springer US
Ress11
Ressel, P. (2011) A revision of Kimberling’s results — With an application to max-infinite divisibility of some Archimedean copulas. Statistics & Probability Letters, 81(2), 207–211. DOI.

See original: The Living Thing / Notebooks Maximum processes

Tensorflow

A C++/Python neural network toolkit by Google.
I am using it for solving general machine-learning problems.

The construction of graphs is more explicit than in Theano, which I find easier to understand, although it means you lose the near-Python syntax that Theano offers.
It also claims to compile to smartphones etc., although that looks buggy at the moment.
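
A minimal sketch of what that explicit graph construction looks like, assuming the graph-and-session API of the TensorFlow 1.x era (the names, shapes and toy loss here are arbitrary):

    import tensorflow as tf

    # Building the graph: these calls only create symbolic nodes, nothing is computed yet.
    x = tf.placeholder(tf.float32, shape=[None, 2], name="x")
    w = tf.Variable(tf.random_normal([2, 1]), name="w")
    y = tf.matmul(x, w)                      # still symbolic
    loss = tf.reduce_mean(tf.square(y))

    # Computation happens only when the graph is executed inside a session.
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(loss, feed_dict={x: [[1.0, 2.0], [3.0, 4.0]]}))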

HOWTOs

See original: The Living Thing / Notebooks Tensorflow

Penalised regression

On regression estimation with penalties on the model.
Practically this means choosing appropriate smoothing to do good model selection, and possibly using some clever optimisation method.
Related to compressed sensing but here we consider sampling complexity,
the effect of measurement noise, and more general penalties than just \(\ell_1\).

See also matrix factorisations,
optimisation,
multiple testing,
concentration inequalities,
sparse flavoured icecream.

To discuss:

LARS, LASSO, Group LASSO, de-biassed LASSO, Elastic net, etc.

In nonparametric statistics we might simultaneously estimate what look like
many, many parameters, which we constrain in some clever fashion
that usually boils down to something we can interpret as a “smoothing”
parameter, controlling how many of the original parameters we still
effectively have to model.

The “regularisation” nomenclature claims descent from Tikhonov (e.g. TiGl65), who wanted to solve ill-conditioned integral and differential equations, so it’s slightly more general.
“Smoothing” seems to be common in the
spline and
kernel estimate communities of
Wahba (Wahb90) and Silverman (Silv82) et al,
who usually actually want to smooth curves.

“Penalisation” has a genealogy unknown to me, but is probably the least abstruse term for common usage.

These are, AFAICT, more or less the same thing.
“Smoothing” is more common in my communities, which is fine,
but we have to remember that “smoothing” an estimator need not imply smooth dynamics in the estimand;
it could be something else being smoothed, such as the variance of the parameter estimates of a rough function.

In every case, you wish to solve an ill-conditioned inverse problem, so you tame it by adding a penalty to solutions one should be reluctant to accept.
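
For concreteness, the usual template (of which ridge regression and the LASSO are the \(\ell_2\)- and \(\ell_1\)-penalised special cases) is

\[
\hat\beta \in \arg\min_{\beta} \|y - X\beta\|_2^2 + \lambda\,\Omega(\beta),
\]

where \(\Omega\) is the penalty (e.g. \(\Omega(\beta)=\|\beta\|_1\) for the LASSO) and \(\lambda \ge 0\) is the smoothing/regularisation parameter doing the taming.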

TODO: make comprehensible

TODO: examples

TODO: discuss connection with model selection

TODO: discuss connection with compressed sensing.

The real classic approach here is spline smoothing of functional data.
More recent approaches are things like sparse regression.

Debiassed LASSO

See GBRD14 and Geer14c.

Implementations

I’m not going to mention LASSO in (generalised) linear regression,
since everything does that these days.
(Oh alright, Jerome Friedman’s glmnet for R is the fastest,
and has a MATLAB version.)

But SPAMS (C++, MATLAB, R, Python) by Mairal himself looks interesting.
It’s an optimisation library for various sparse problems.

Sparse Filtering in Theano.
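
As a minimal sketch of one such “clever optimisation method”, here is proximal gradient descent (ISTA) for the \(\ell_1\)-penalised objective above; the data and penalty level are made up for illustration:

    import numpy as np

    def soft_threshold(z, t):
        """Proximal operator of t * ||.||_1."""
        return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

    def ista(X, y, lam, n_iter=500):
        """Minimise 0.5 * ||y - X b||_2^2 + lam * ||b||_1 by proximal gradient descent."""
        L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the smooth part's gradient
        b = np.zeros(X.shape[1])
        for _ in range(n_iter):
            grad = X.T @ (X @ b - y)
            b = soft_threshold(b - grad / L, lam / L)
        return b

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100, 20))
    beta_true = np.zeros(20)
    beta_true[:3] = [2.0, -1.0, 0.5]            # sparse ground truth
    y = X @ beta_true + 0.1 * rng.standard_normal(100)
    print(np.round(ista(X, y, lam=5.0), 2))     # most coefficients come out exactly zero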

Refs

ABDJ06
Abramovich, F., Benjamini, Y., Donoho, D. L., & Johnstone, I. M.(2006) Adapting to unknown sparsity by controlling the false discovery rate. The Annals of Statistics, 34(2), 584–653. DOI.
AzKS15
Azizyan, M., Krishnamurthy, A., & Singh, A. (2015) Extreme Compressive Sampling for Covariance Estimation. arXiv:1506.00898 [Cs, Math, Stat].
Bach00
Bach, F. (n.d.) Model-Consistent Sparse Estimation through the Bootstrap.
BJMO12
Bach, F., Jenatton, R., Mairal, J., & Obozinski, G. (2012) Optimization with Sparsity-Inducing Penalties. Found. Trends Mach. Learn., 4(1), 1–106. DOI.
BaRo14
Bahmani, S., & Romberg, J. (2014) Lifting for Blind Deconvolution in Random Mask Imaging: Identifiability and Convex Relaxation. arXiv:1501.00046 [Cs, Math, Stat].
BCFS14
Banerjee, A., Chen, S., Fazayeli, F., & Sivakumar, V. (2014) Estimation with Norm Regularization. In Z. Ghahramani, M. Welling, C. Cortes, N. D. Lawrence, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 27 (pp. 1556–1564). Curran Associates, Inc.
Barb15
Barbier, J. (2015) Statistical physics and approximate message-passing algorithms for sparse linear estimation problems in signal processing and coding theory. arXiv:1511.01650 [Cs, Math].
BaSB10
Baron, D., Sarvotham, S., & Baraniuk, R. G.(2010) Bayesian compressive sensing via belief propagation. Signal Processing, IEEE Transactions on, 58(1), 269–280. DOI.
Batt92
Battiti, R. (1992) First-and second-order methods for learning: between steepest descent and Newton’s method. Neural Computation, 4(2), 141–166. DOI.
BaMo12
Bayati, M., & Montanari, A. (2012) The LASSO Risk for Gaussian Matrices. IEEE Transactions on Information Theory, 58(4), 1997–2017. DOI.
BCCZ14
Borgs, C., Chayes, J. T., Cohn, H., & Zhao, Y. (2014) An $L^p$ theory of sparse graph convergence I: limits, sparse random graph models, and power law distributions. arXiv:1401.2906 [Math].
BrPK16
Brunton, S. L., Proctor, J. L., & Kutz, J. N.(2016) Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proceedings of the National Academy of Sciences, 113(15), 3932–3937. DOI.
BüGe11
Bühlmann, P., & Geer, S. van de. (2011) Additive models and many smooth univariate functions. In Statistics for High-Dimensional Data (pp. 77–97). Springer Berlin Heidelberg
BüGe15
Bühlmann, P., & van de Geer, S. (2015) High-dimensional inference in misspecified linear models. arXiv:1503.06426 [Stat], 9(1), 1449–1473. DOI.
CaFe13
Candès, E. J., & Fernandez-Granda, C. (2013) Super-Resolution from Noisy Data. Journal of Fourier Analysis and Applications, 19(6), 1229–1254. DOI.
CaPl10
Candès, E. J., & Plan, Y. (2010) Matrix Completion With Noise. Proceedings of the IEEE, 98(6), 925–936. DOI.
CaRT06
Candès, E. J., Romberg, J. K., & Tao, T. (2006) Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics, 59(8), 1207–1223. DOI.
Carm13
Carmi, A. Y.(2013) Compressive system identification: Sequential methods and entropy bounds. Digital Signal Processing, 23(3), 751–770. DOI.
Carm14
Carmi, A. Y.(2014) Compressive System Identification. In A. Y. Carmi, L. Mihaylova, & S. J. Godsill (Eds.), Compressed Sensing & Sparse Filtering (pp. 281–324). Springer Berlin Heidelberg
CDHB09
Cevher, V., Duarte, M. F., Hegde, C., & Baraniuk, R. (2009) Sparse Signal Recovery Using Markov Random Fields. In Advances in Neural Information Processing Systems (pp. 257–264). Curran Associates, Inc.
CSPW10
Chen, M., Silva, J., Paisley, J., Wang, C., Dunson, D., & Carin, L. (2010) Compressive Sensing on Manifolds Using a Nonparametric Mixture of Factor Analyzers: Algorithm and Performance Bounds. IEEE Transactions on Signal Processing, 58(12), 6140–6155. DOI.
ChWa00
Chen, Y.-C., & Wang, Y.-X. (n.d.) Discussion on “Confidence Intervals and Hypothesis Testing for High-Dimensional Regression”.
DGSS14
Daneshmand, H., Gomez-Rodriguez, M., Song, L., & Schölkopf, B. (2014) Estimating Diffusion Network Structures: Recovery Conditions, Sample Complexity & Soft-thresholding Algorithm. In ICML.
DiFr84
Diaconis, P., & Freedman, D. (1984) Asymptotics of Graphical Projection Pursuit. The Annals of Statistics, 12(3), 793–815.
EHJT04
Efron, B., Hastie, T., Johnstone, I., & Tibshirani, R. (2004) Least angle regression. The Annals of Statistics, 32(2), 407–499. DOI.
EwSc15
Ewald, K., & Schneider, U. (2015) Confidence Sets Based on the Lasso Estimator. arXiv:1507.05315 [Math, Stat].
FlHS13
Flynn, C. J., Hurvich, C. M., & Simonoff, J. S.(2013) Efficiency for Regularization Parameter Selection in Penalized Likelihood Estimation of Misspecified Models. arXiv:1302.2068 [Stat].
GHIK13
Ghazi, B., Hassanieh, H., Indyk, P., Katabi, D., Price, E., & Shi, L. (2013) Sample-optimal average-case sparse Fourier Transform in two dimensions. In 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton) (pp. 1258–1265). DOI.
GiSB14
Giryes, R., Sapiro, G., & Bronstein, A. M.(2014) On the Stability of Deep Networks. arXiv:1412.5896 [Cs, Math, Stat].
GuLi05
Gui, J., & Li, H. (2005) Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data. Bioinformatics, 21(13), 3001–3008. DOI.
HaLB15
Hallac, D., Leskovec, J., & Boyd, S. (2015) Network Lasso: Clustering and Optimization in Large Graphs. arXiv:1507.00280 [Cs, Math, Stat]. DOI.
HaRR15
Hansen, N. R., Reynaud-Bouret, P., & Rivoirard, V. (2015) Lasso and probabilistic inequalities for multivariate point processes. Bernoulli, 21(1), 83–143. DOI.
HaTW15
Hastie, T. J., Tibshirani, Rob, & Wainwright, M. J.(2015) Statistical Learning with Sparsity: The Lasso and Generalizations. Boca Raton: Chapman and Hall/CRC
HaKD13
Hawe, S., Kleinsteuber, M., & Diepold, K. (2013) Analysis operator learning and its application to image reconstruction. IEEE Transactions on Image Processing, 22(6), 2138–2150. DOI.
HeRP14
He, D., Rish, I., & Parida, L. (2014) Transductive HSIC Lasso. In M. Zaki, Z. Obradovic, P. N. Tan, A. Banerjee, C. Kamath, & S. Parthasarathy (Eds.), Proceedings of the 2014 SIAM International Conference on Data Mining (pp. 154–162). Philadelphia, PA: Society for Industrial and Applied Mathematics
HeGe11
Hebiri, M., & Geer, S. van de. (2011) The Smooth-Lasso and other ℓ1+ℓ2-penalized methods. Electronic Journal of Statistics, 5, 1184–1226. DOI.
HeIS15
Hegde, C., Indyk, P., & Schmidt, L. (2015) A nearly-linear time framework for graph-structured sparsity. In Proceedings of the 32nd International Conference on Machine Learning (ICML-15) (pp. 928–937).
HCMF08
Hesterberg, T., Choi, N. H., Meier, L., & Fraley, C. (2008) Least angle and ℓ1 penalized regression: A review. Statistics Surveys, 2, 61–93. DOI.
HoKe70
Hoerl, A. E., & Kennard, R. W.(1970) Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics, 12(1), 55–67. DOI.
HRLV10
Hormati, A., Roy, O., Lu, Y. M., & Vetterli, M. (2010) Distributed Sampling of Signals Linked by Sparse Filtering: Theory and Applications. IEEE Transactions on Signal Processing, 58(3), 1095–1109. DOI.
HuPC15
Hu, T., Pehlevan, C., & Chklovskii, D. B.(2015) A Hebbian/Anti-Hebbian Network for Online Sparse Dictionary Learning Derived from Symmetric Matrix Factorization. arXiv:1503.00690 [Cs, Q-Bio, Stat].
JaMo14
Javanmard, A., & Montanari, A. (2014) Confidence Intervals and Hypothesis Testing for High-dimensional Regression. Journal of Machine Learning Research, 15(1), 2869–2909.
Kabá14
Kabán, A. (2014) New Bounds on Compressive Linear Least Squares Regression. (pp. 448–456). Presented at the Proceedings of the Seventeenth International Conference on Artificial Intelligence and Statistics
LaLZ09
Langford, J., Li, L., & Zhang, T. (2009) Sparse Online Learning via Truncated Gradient. In D. Koller, D. Schuurmans, Y. Bengio, & L. Bottou (Eds.), Advances in Neural Information Processing Systems 21 (pp. 905–912). Curran Associates, Inc.
LSST13
Lee, J. D., Sun, D. L., Sun, Y., & Taylor, J. E.(2013) Exact post-selection inference, with application to the lasso. arXiv:1311.6238 [Math, Stat].
LTTT14
Lockhart, R., Taylor, J., Tibshirani, R. J., & Tibshirani, R. (2014) A significance test for the lasso. The Annals of Statistics, 42(2), 413–468. DOI.
MeGB08
Meier, L., van de Geer, S., & Bühlmann, P. (2008) The group lasso for logistic regression. Group, 70(Part 1), 53–71.
Mont12
Montanari, A. (2012) Graphical models concepts in compressed sensing. Compressed Sensing: Theory and Applications, 394–438.
MüGe15
Müller, P., & van de Geer, S. (2015) Censored linear model in high dimensions: Penalised linear regression on high-dimensional data with left-censored response variable. TEST. DOI.
NaGr12
Nam, S., & Gribonval, R. (2012) Physics-driven structured cosparse modeling for source localization. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 5397–5400). DOI.
NeTr08
Needell, D., & Tropp, J. A.(2008) CoSaMP: Iterative signal recovery from incomplete and inaccurate samples. arXiv:0803.2392 [Cs, Math].
NCBK11
Ngiam, J., Chen, Z., Bhaskar, S. A., Koh, P. W., & Ng, A. Y.(2011) Sparse Filtering. In J. Shawe-Taylor, R. S. Zemel, P. L. Bartlett, F. Pereira, & K. Q. Weinberger (Eds.), Advances in Neural Information Processing Systems 24 (pp. 1125–1133). Curran Associates, Inc.
NiGe13
Nickl, R., & van de Geer, S. (2013) Confidence sets in sparse regression. The Annals of Statistics, 41(6), 2852–2876. DOI.
PeEE10
Peleg, T., Eldar, Y. C., & Elad, M. (2010) Exploiting Statistical Dependencies in Sparse Representations for Signal Recovery. IEEE Transactions on Signal Processing, 60(5), 2286–2303. DOI.
PGKY15
Peng, Z., Gurram, P., Kwon, H., & Yin, W. (2015) Optimal Sparse Kernel Learning for Hyperspectral Anomaly Detection. arXiv:1506.02585 [Cs].
PoHo15
Pouget-Abadie, J., & Horel, T. (2015) Inferring Graphs from Cascades: A Sparse Recovery Framework. In Proceedings of The 32nd International Conference on Machine Learning.
RaRe09
Rahimi, A., & Recht, B. (2009) Weighted Sums of Random Kitchen Sinks: Replacing minimization with randomization in learning. In Advances in neural information processing systems (pp. 1313–1320). Curran Associates, Inc.
RaBr15a
Ravishankar, S., & Bresler, Y. (2015a) Efficient Blind Compressed Sensing Using Sparsifying Transforms with Convergence Guarantees and Application to MRI. arXiv:1501.02923 [Cs, Stat].
RaBr15b
Ravishankar, S., & Bresler, Y. (2015b) Sparsifying Transform Learning With Efficient Optimal Updates and Convergence Guarantees. IEEE Transactions on Signal Processing, 63(9), 2389–2404. DOI.
RiGr14
Rish, I., & Grabarnik, G. (2014) Sparse Signal Recovery with Exponential-Family Noise. In A. Y. Carmi, L. Mihaylova, & S. J. Godsill (Eds.), Compressed Sensing & Sparse Filtering (pp. 77–93). Springer Berlin Heidelberg
RiGr15
Rish, I., & Grabarnik, G. Y.(2015) Sparse modeling: theory, algorithms, and applications. Boca Raton, FL: CRC Press, Taylor & Francis Group
ScBD11
Schelldorfer, J., Bühlmann, P., & De Geer, S. V.(2011) Estimation for High-Dimensional Linear Mixed-Effects Models Using ℓ1-Penalization. Scandinavian Journal of Statistics, 38(2), 197–214. DOI.
Silv82
Silverman, B. W.(1982) On the Estimation of a Probability Density Function by the Maximum Penalized Likelihood Method. The Annals of Statistics, 10(3), 795–810. DOI.
SFHT11
Simon, N., Friedman, J., Hastie, T., & Tibshirani, R. (2011) Regularization Paths for Cox’s Proportional Hazards Model via Coordinate Descent. Journal of Statistical Software, 39(5).
SFJJ15
Smith, V., Forte, S., Jordan, M. I., & Jaggi, M. (2015) L1-Regularized Distributed Optimization: A Communication-Efficient Primal-Dual Framework. arXiv:1512.04011 [Cs].
Stin04
Stine, R. A.(2004) Discussion of “Least angle regression” by Efron et al. The Annals of Statistics, 32(2), 407–499.
SuBC15
Su, W., Bogdan, M., & Candès, E. J.(2015) False Discoveries Occur Early on the Lasso Path. arXiv:1511.01957 [Cs, Math, Stat].
Tadd13
Taddy, M. (2013) One-step estimator paths for concave regularization. arXiv:1308.5623 [Stat].
ThAH15
Thrampoulidis, C., Abbasi, E., & Hassibi, B. (2015) The LASSO with Non-linear Measurements is Equivalent to One With Linear Measurements. arXiv:1506.02181 [Cs, Math, Stat].
Tibs96
Tibshirani, R. (1996) Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288.
Tibs14
Tibshirani, R. J.(2014) A General Framework for Fast Stagewise Algorithms. arXiv:1408.5801 [Stat].
TiGl65
Tikhonov, A. N., & Glasko, V. B.(1965) Use of the regularization method in non-linear problems. USSR Computational Mathematics and Mathematical Physics, 5(3), 93–107. DOI.
TrWr10
Tropp, J. A., & Wright, S. J.(2010) Computational Methods for Sparse Solution of Linear Inverse Problems. Proceedings of the IEEE, 98(6), 948–958. DOI.
Uema15
Uematsu, Y. (2015) Penalized Likelihood Estimation in High-Dimensional Time Series Models and its Application. arXiv:1504.06706 [Math, Stat].
Geer07
van de Geer, S. (2007) The deterministic Lasso.
Geer14a
van de Geer, S. (2014a) Statistical Theory for High-Dimensional Models. arXiv:1409.8557 [Math, Stat].
Geer14b
van de Geer, S. (2014b) Weakly decomposable regularization penalties and structured sparsity. Scandinavian Journal of Statistics, 41(1), 72–86. DOI.
Geer14c
van de Geer, S. (2014c) Worst possible sub-directions in high-dimensional models. In arXiv:1403.7023 [math, stat] (Vol. 131).
GeBZ11
van de Geer, S. A., Bühlmann, P., & Zhou, S. (2011) The adaptive and the thresholded Lasso for potentially misspecified models (and a lower bound for the Lasso). Electronic Journal of Statistics, 5, 688–749. DOI.
GBRD14
van de Geer, S., Bühlmann, P., Ritov, Y. ’acov, & Dezeure, R. (2014) On asymptotically optimal confidence regions and tests for high-dimensional models. The Annals of Statistics, 42(3), 1166–1202. DOI.
VeRo15
Veitch, V., & Roy, D. M.(2015) The Class of Random Graphs Arising from Exchangeable Random Measures. arXiv:1512.03099 [Cs, Math, Stat].
Wahb90
Wahba, G. (1990) Spline Models for Observational Data. SIAM
WWZH15
Wang, D., Wu, P., Zhao, P., & Hoi, S. C. H.(2015) A Framework of Sparse Online Learning and Its Applications. arXiv:1507.07146 [Cs].
WuLa08
Wu, T. T., & Lange, K. (2008) Coordinate descent algorithms for lasso penalized regression. The Annals of Applied Statistics, 2(1), 224–244. DOI.
YNGD12
Yaghoobi, M., Nam, S., Gribonval, R., & Davies, M. E.(2012) Noise aware analysis operator learning for approximately cosparse signals. In 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (pp. 5409–5412). DOI.
YuTo09
Yun, S., & Toh, K.-C. (2009) A coordinate gradient descent method for ℓ 1-regularized convex minimization. Computational Optimization and Applications, 48(2), 273–307. DOI.
ZhZh14
Zhang, C.-H., & Zhang, S. S.(2014) Confidence intervals for low dimensional parameters in high dimensional linear models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 76(1), 217–242. DOI.
ZYJZ15
Zhang, L., Yang, T., Jin, R., & Zhou, Z.-H. (2015) Sparse Learning for Large-scale and High-dimensional Data: A Randomized Convex-concave Optimization Approach. arXiv:1511.03766 [Cs].
ZhTW11
Zhou, T., Tao, D., & Wu, X. (2011) Manifold elastic net: a unified framework for sparse dimension reduction. Data Mining and Knowledge Discovery, 22(3), 340–371.
Zou06
Zou, H. (2006) The Adaptive Lasso and Its Oracle Properties. Journal of the American Statistical Association, 101(476), 1418–1429. DOI.
ZoHa05
Zou, H., & Hastie, T. (2005) Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320. DOI.
ZoHT07
Zou, H., Hastie, T., & Tibshirani, R. (2007) On the “degrees of freedom” of the lasso. The Annals of Statistics, 35(5), 2173–2192. DOI.

See original: The Living Thing / Notebooks Penalised regression

M-estimation

Estimating a quantity by choosing it as the extremiser of some objective function.

Very popular in machine learning, where loss-function-based methods are ubiquitous.
In statistics we see it implicitly in maximum likelihood estimation
and robust estimation.
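
A minimal sketch (the data and the choice of Huber loss are purely illustrative): the Huber M-estimator of location is the value minimising a summed loss, exactly the template above.

    import numpy as np
    from scipy.optimize import minimize_scalar

    def huber(r, delta=1.345):
        """Huber's rho: quadratic near zero, linear in the tails (hence robust to outliers)."""
        a = np.abs(r)
        return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

    # A clean sample contaminated by two gross outliers.
    x = np.concatenate([np.random.default_rng(1).normal(0.0, 1.0, 100), [50.0, 60.0]])

    fit = minimize_scalar(lambda mu: huber(x - mu).sum())
    print(fit.x, x.mean())   # the M-estimate stays near 0; the sample mean gets dragged upward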

Refs

Barn83
Barndorff-Nielsen, O. (1983) On a formula for the distribution of the maximum likelihood estimator. Biometrika, 70(2), 343–365. DOI.
Bühl14
Bühlmann, P. (2014) Robust Statistics. In J. Fan, Y. Ritov, & C. F. J. Wu (Eds.), Selected Works of Peter J. Bickel (pp. 51–98). Springer New York
DoMo13
Donoho, D., & Montanari, A. (2013) High Dimensional Robust M-Estimation: Asymptotic Variance via Approximate Message Passing. arXiv:1310.7320 [Cs, Math, Stat].
Hamp74
Hampel, F. R.(1974) The Influence Curve and its Role in Robust Estimation. Journal of the American Statistical Association, 69(346), 383–393. DOI.
Hube64
Huber, P. J.(1964) Robust Estimation of a Location Parameter. The Annals of Mathematical Statistics, 35(1), 73–101. DOI.
MoPe10
Mondal, D., & Percival, D. B.(2010) M-estimation of wavelet variance. Annals of the Institute of Statistical Mathematics, 64(1), 27–53. DOI.
Ronc00
Ronchetti, E. (2000) Robust Regression Methods and Model Selection. In A. Bab-Hadiashar & D. Suter (Eds.), Data Segmentation and Model Selection for Computer Vision (pp. 31–40). Springer New York
ThCl13
Tharmaratnam, K., & Claeskens, G. (2013) A comparison of robust versions of the AIC based on M-, S- and MM-estimators. Statistics, 47(1), 216–235. DOI.
Geer14
van de Geer, S. (2014) Worst possible sub-directions in high-dimensional models. In arXiv:1403.7023 [math, stat] (Vol. 131).

See original: The Living Thing / Notebooks M-estimation

Statistical learning theory

Given some amount of noisy data, how complex a model can I learn before I start failing to generalise to new data?
If I can answer this question a priori, I can fit a complex model with some messy hyperparameter and choose that hyperparameter without doing boring cross-validation.

Rademacher complexity, Gaussian complexity, Vapnik-Chervonenkis dimension.

Machine learning people always talk about this in terms of classification, which is what VC-dimension gives.

I don’t care about classification problems in general;
moreover, the VC dimension seems to be analytically tractable only for limited classes.
Perhaps I can save time by going straight to Rademacher/Gaussian-style complexity and learn something about regression loss?
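
For reference, the empirical Rademacher complexity of a function class \(\mathcal{F}\) on a sample \(x_1,\dots,x_n\) is

\[
\hat{\mathcal{R}}_n(\mathcal{F}) = \mathbb{E}_{\sigma}\left[\sup_{f \in \mathcal{F}} \frac{1}{n}\sum_{i=1}^{n} \sigma_i f(x_i)\right],
\]

where the \(\sigma_i\) are IID uniform \(\pm 1\) (Rademacher) variables; bounds of the BaMe02 kind control the gap between empirical and expected risk in terms of this quantity, for bounded Lipschitz losses generally, not just classification error.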

Modern results seem to avoid a lot of this by appealing to matrix concentration inequalities.

Percy Liang’s notes: CS229T/STAT231: Statistical Learning Theory (Winter 2014).

See also
function approximation for a different kind of approximation error, and
information criteria for one way to control it post hoc, or model selection for the statisticians’ approach to this problem in general.

Refs

BaMe02
Bartlett, P. L., & Mendelson, S. (2002) Rademacher and Gaussian Complexities: Risk Bounds and Structural Results. Journal of Machine Learning Research, 3(Nov), 463–482.
BEHW89
Blumer, A., Ehrenfeucht, A., Haussler, D., & Warmuth, M. K.(1989) Learnability and the Vapnik-Chervonenkis Dimension. J. ACM, 36(4), 929–965. DOI.
BoBL04
Bousquet, O., Boucheron, S., & Lugosi, G. (2004) Introduction to Statistical Learning Theory. In O. Bousquet, U. von Luxburg, & G. Rätsch (Eds.), Advanced Lectures on Machine Learning (pp. 169–207). Springer Berlin Heidelberg
Dsou04
D’Souza, A. A.(2004) Towards Tractable Parameter-free Statistical Learning. University of Southern California, Los Angeles, CA, USA
GnSa08
Gnecco, G., & Sanguineti, M. (2008) Approximation Error Bounds via Rademacher’s Complexity. Applied Mathematical Sciences, 2(4), 153–176.
KCFH05
Krishnapuram, B., Carin, L., Figueiredo, M. A. T., & Hartemink, A. J.(2005) Sparse Multinomial Logistic Regression: Fast Algorithms and Generalization Bounds. IEEE Trans. Pattern Anal. Mach. Intell., 27(6), 957–968. DOI.
Lian00
Liang, P. (n.d.) CS229T/STAT231: Statistical Learning Theory (Winter 2014).
Nata89
Natarajan, B. K.(1989) On learning sets and functions. Machine Learning, 4(1), 67–97. DOI.
ScSm03
Schölkopf, B., & Smola, A. J.(2003) A Short Introduction to Learning with Kernels. In S. Mendelson & A. J. Smola (Eds.), Advanced Lectures on Machine Learning (pp. 41–64). Springer Berlin Heidelberg
Vapn10
Vapnik, V. (2010) The Nature of Statistical Learning Theory. (Softcover reprint of hardcover 2nd ed. 2000.). Springer
VaCh71
Vapnik, V., & Chervonenkis, A. (1971) On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities. Theory of Probability & Its Applications, 16(2), 264–280. DOI.
VaLC94
Vapnik, V., Levin, E., & Cun, Y. L.(1994) Measuring the VC-Dimension of a Learning Machine. Neural Computation, 6(5), 851–876. DOI.
LuSc08
von Luxburg, U., & Schölkopf, B. (2008) Statistical Learning Theory: Models, Concepts, and Results. arXiv:0810.4752 [Math, Stat].

See original: The Living Thing / Notebooks Statistical learning theory

Learning Gamelan

On online learning of
sparse basis dictionaries
for music:
blind IIR deconvolution with an unusual loss function,
or “shift-invariant sparse coding”.

It seems like this would boil down to something like sparse dictionary
learning, with sparse activations and a dictionary
sparse in LPC components.

There are two ways to do this - time domain, and frequency domain.

For the latter, sparse time-domain activations are non local in Fourier components, but possibly simple to recover.

For the former, one could solve Durbin-Watson equations in the time domain, although we expect that to be unstable.
We could go for sparse simultaneous kernel inference in the time domain, which might be better, or directly infer the Horner-form.
Then we have a lot of simultaneous filter components and tedious inference for them.
Otherwise, we could do it directly in the FFT domain, although this makes MIMO harder, and excludes the potential for non-linearities.
The fact that I am expecting to identify many distinct systems in Fourier space as atoms complicates this slightly.

Thought: can I use HPSS to do this with the purely harmonic components?
And use the percussive components as priors for the activations?
How do you enforce causality for triggering in the FFT-transformed domain?

We have activations and components: the activations form a KxT matrix, and
the K components are the rows of a KxL matrix.
We wish the convolution of one with the other to approximately recover the
original signal under a certain loss function.
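
A toy numpy sketch of that reconstruction (the dimensions are placeholders, and squared error is only a stand-in since the actual loss function is deliberately left open here):

    import numpy as np

    rng = np.random.default_rng(0)
    K, T, L = 4, 1000, 64
    activations = np.where(rng.random((K, T)) < 0.01, rng.random((K, T)), 0.0)  # sparse K x T
    components = rng.standard_normal((K, L))                                     # K atoms of length L

    # Reconstruction: x_hat[t] = sum_k sum_l activations[k, t - l] * components[k, l]
    x_hat = sum(np.convolve(activations[k], components[k]) for k in range(K))[:T]

    def loss(x, x_hat):
        """Stand-in squared-error loss; the 'unusual loss function' is still to be chosen."""
        return np.mean((x - x_hat) ** 2)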

Why gamelan?
It’s tuned percussion, with a non-trivial tuning system, and no pitch bending.

Theory:
TBD

Other questions:
Infer chained biquads? Even restrict them to be bandpass?
Or sparse, high-order filters of some description?

See original: The Living Thing / Notebooks Learning Gamelan