Ergodic Theory

To do


To read

See original: The Living Thing Ergodic Theory


See also: pseudorandomness.

The dark enemy of stationarity. You know, that assumption that statistics
papers start with, if they aren’t starting with independent, identically
distributed data. Many results that, so far as my meagre statistics skills let me
understand, work under i.i.d. assumptions can be made to go through if you know that the
distribution from which you sample is stationary, or your sampling ergodic.
But what if you know otherwise, that your distribution is either not
stationary, or will take so long to sample the space on an ergodic walk that
you may as well not bother waiting?

I’m working on a number of projects right now about evolving and emerging
behaviours - of people, artificial agents, economies, what-have-you. Real and
simulated. Frequently I find myself wanting to know what they are doing over
time, to measure some statistic or other of these systems. However, I know, or
suspect, that, at least where they are doing the interesting things that I
care about, that they are not described by any stationary distribution.
Instead, they are suffused with path-dependence, long-range correlation,
trends, or possessed of ergodic walks that are just too long to be plausibly
computed [on my laptop|within my research CPU allocation|before the heat-death
of the universe].

Sampling from real, time-changing problems is an issue for real data, not
just toy experiments - can anyone furnish me with a large ensemble of earths
to check my global economic simulations against? - so I need to know how to do
it better. (See also: post-normal science)

I would like to know how well we can get by with samples from non-stationary
distributions. Or some techniques for working out how bad my approximations
are. How long-term is the cycle in my system? How path dependent is it? How
inhomogeneous? If I can’t run my simulation until I have a valid sample, what
can I calculate locally that will give me insight into the larger system? Do
I care that much of statistical machinery breaks down, or not? Should I be
relieved that the best I can do in a certain circumstance may be to fall back
upon plain old visual inspection of some graph or other?

What do machine learning people do with this? Innovation theorists? Finance
market modellers? Catchment hydrologists? And Dame Nature, with that
‘evolution’ trick of hers?

Hurst effect. Long-memory processes. Ergodic theorems.
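To make the Hurst-effect talk concrete, here is a crude numpy sketch of the classic rescaled-range (R/S) estimator of the Hurst exponent, the textbook diagnostic for long memory. This is the naive version, biased for short series; the function name and block sizes are my own improvisation, not anyone's reference implementation.

```python
import numpy as np

def hurst_rs(x, min_chunk=8):
    """Estimate the Hurst exponent by rescaled-range (R/S) analysis.

    H near 0.5 suggests no memory; H well above 0.5 suggests long memory."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    sizes, rs_means = [], []
    size = min_chunk
    while size <= n // 2:
        rs = []
        for start in range(0, n - size + 1, size):
            chunk = x[start:start + size]
            z = np.cumsum(chunk - chunk.mean())   # cumulative deviations
            r = z.max() - z.min()                 # their range
            s = chunk.std()
            if s > 0:
                rs.append(r / s)
        if rs:
            sizes.append(size)
            rs_means.append(np.mean(rs))
        size *= 2
    # H is the slope of log(R/S) against log(block size)
    slope, _intercept = np.polyfit(np.log(sizes), np.log(rs_means), 1)
    return slope

rng = np.random.default_rng(0)
white = rng.standard_normal(4096)            # i.i.d. noise: H should sit near 0.5
walk = np.cumsum(rng.standard_normal(4096))  # strongly trending: H near 1
```

A first sanity check is that a trending random walk scores much higher than white noise, though a proper analysis would use one of the less biased estimators from the reading list below.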


  • A Yatchew, W Hardle. 2006. Nonparametric state price density estimation using constrained least squares and the bootstrap. Journal of Econometrics.
  • B LeBaron, W B Arthur, R Palmer. 1999. Time series properties of an artificial stock market. Journal of Economic Dynamics and Control. _. (Online)
  • C E Monteleoni, T Jaakkola. 2003. Online learning of non-stationary sequences.
  • Dinh-Tuan Pham, Jean-François Cardoso. 2001. Blind separation of instantaneous mixtures of nonstationary sources. Signal Processing, IEEE Transactions on. _.
  • D L Stein, C M Newman. 1995. Broken ergodicity and the geometry of rugged landscapes. Phys. Rev. E. _. (Online)
  • Eric M Delmelle, Pierre Goovaerts. 2009. Second-Phase Sampling Designs for Non-Stationary Spatial Variables. Geoderma. _.
  • F J Breidt, N Crato, P De Lima. 1998. The detection and estimation of long memory in stochastic volatility. Journal of Econometrics.
  • F X Diebold, A Inoue. 2001. Long memory and regime switching. Journal of Econometrics.
  • Gavin E Crooks. 2007. Measuring Thermodynamic Length. Phys. Rev. Lett.. _.
  • G Baumann, R F Place, Z Földes-Papp. 2010. Meaningful interpretation of subdiffusive measurements in living cells (crowded environment) by fluorescence fluctuation microscopy. Curr. Pharm. Biotechnol.
  • Girish Nathan, Gemunu Gunaratne. 2005. Set of measures to analyze the dynamics of nonequilibrium structures. Phys. Rev. E. _.
  • H Künsch. 1986. Discrimination between monotonic trends and long-range dependence. Journal of applied Probability.
  • Hong Qian. 2001. Relative entropy: Free energy associated with equilibrium fluctuations and nonequilibrium deviations. Phys. Rev. E. _.
  • H R Künsch. 1989. The jackknife and the bootstrap for general stationary observations. The Annals of Statistics.
  • I Berkes, L Horváth, P Kokoszka, Q M Shao. 2006. On discriminating between long-range dependence and changes in mean. The Annals of Statistics.
  • Jan Beran. 1992. Statistical Methods for Data with Long-Range Dependence. Statistical Science.
  • Jan Beran. 2010. Long-range dependence. Wiley Interdisciplinary Reviews: Computational Statistics. (Online)
  • J Beran. 1994. Statistics for long-memory processes, vol. 61 of Monographs on Statistics and Applied Probability.
  • Jean Opsomer, Yuedong Wang, Yuhong Yang. 2001. Nonparametric Regression with Correlated Errors. Statistical Science. (Online)
  • Jean-René Chazottes. An introduction to fluctuations of observables in chaotic dynamical systems.
  • Jeffrey E Steif. 1997. Consistent estimation of joint distributions for sufficiently mixing random fields. The Annals of Statistics.
  • John C Mauro, Prabhat K Gupta, Roger J Loucks. 2007. Continuously broken ergodicity. J Chem Phys. _.
  • John V Shebalin. 1996. Absolute equilibrium entropy. Journal of Plasma Physics. _.
  • John V Shebalin. 2007. Broken symmetries and magnetic dynamos. Physics of Plasmas. _.
  • J R Chazottes, D Gabrielli. 2005. Large deviations for empirical entropies of g-measures. Nonlinearity. _. (Online)
  • J V Shebalin. 2010. Broken ergodicity in two-dimensional homogeneous magnetohydrodynamic turbulence. Physics of Plasmas. _.
  • J W Fisher III, Alexander T Ihler, Paul A Viola. Learning Informative Statistics: A Nonparametric Approach. Learning Informative Statistics: A Nonparametric Approach.
  • Katalin Marton, Paul C Shields. 1994. Entropy and the consistent estimation of joint distributions. The Annals of Probability.
  • Lee Altenberg. 2004. Open Problems in the Spectral Analysis of Evolutionary Dynamics. Frontiers of Evolutionary Computation. _. (Online)
  • L Giraitis, D Surgailis. 1999. Central limit theorem for the empirical process of a linear sequence with long memory. Journal of statistical planning and inference.
  • L Horváth. 2001. Change-point detection in long-memory processes. Journal of Multivariate Analysis.
  • Marcel Ausloos, Janusz Miskiewicz. 2009. Introducing the q-Theil index. Braz. J. Phys. 39 .
  • Mark Fleischer. Transformations for Accelerating Mcmc Simulations With Broken Ergodicity.
  • Nobusumi Sagara. 2005. Nonparametric maximum-likelihood estimation of probability measures: existence and consistency. Journal of Statistical Planning and Inference. _. (Online)
  • O Rose. 1996. Estimation of the hurst parameter of long-range dependent time series. Research Report.
  • Oscar J Mesa, German Poveda. 1993. The Hurst effect: the scale of fluctuation approach. Wat. Resour. Res.
  • Patrick Alfred Pierce Moran. 1964. On the range of cumulative sums. Annals of the Institute of Statistical Mathematics. _. (Online)
  • P Doukhan, G Oppenheim, M S Taqqu. 2003. Theory and applications of long-range dependence.
  • P M Robinson. 2003. Time series with long memory.
  • Rainer Dahlhaus, Wolfgang Polonik. 2009. Empirical spectral processes for locally stationary time series. Bernoulli. _. (Online)
  • R B Davies, D S Harte. 1987. Tests for Hurst effect. Biometrika.
  • R Dahlhaus. 1996. On the Kullback-Leibler information divergence of locally stationary processes. Stochastic Processes and their Applications. _. (Online)
  • R G Palmer. 1982. Broken ergodicity. Advances in Physics. _. (Online)
  • R M Gray. 2009. Probability, random processes, and ergodic properties.
  • R N Bhattacharya, V K Gupta, E Waymire. 1983. The Hurst effect under trends. Journal of applied probability.
  • S Csörgö, J Mielniczuk. 1999. Random-design regression under long-range dependent errors. Bernoulli. _.
  • Sílvia Gonçalves, Halbert White. 2004. Maximum likelihood and the bootstrap for nonlinear dynamic models. Journal of Econometrics. _. (Online)
  • S N Lahiri. 2003. Resampling methods for dependent data.
  • Stefan Gheorghiu, Marc-Olivier Coppens. 2004. Heterogeneity explains features of “anomalous” thermodynamics and statistics. Proc Natl Acad Sci U S A. _. (Online)
  • T Gneiting. 2000. Power-law correlations, related models for long-range dependence and their simulation. Journal of Applied Probability.
  • Vincent Q Vu, Bin Yu, Robert E Kass. 2009. Information in the nonstationary case. Neural Comput. _.
  • Walter Nadler, Ulrich H E Hansmann. 2007. Generalized ensemble and tempering simulations: A unified view. Phys. Rev. E. _.
  • The traditional invocation of Cosma Shalizi

See original: The Living Thing Non-stationarity

Stream processing

Lazy placeholder for interesting approaches to processing and transforming possibly-infinite streams of data, from signals to parse trees.
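As a placeholder-within-a-placeholder, the cheapest way to work with a possibly-infinite stream in python is a chain of generators. A toy sketch (all names invented) of a lazily evaluated moving average over an endless source:

```python
from collections import deque
from itertools import islice

def counter():
    """An infinite source: 0, 1, 2, ..."""
    n = 0
    while True:
        yield n
        n += 1

def moving_average(stream, width=3):
    """Transform a stream into its running mean over a sliding window, lazily."""
    window = deque(maxlen=width)
    for x in stream:
        window.append(x)
        yield sum(window) / len(window)

# take five smoothed values without ever materialising the infinite stream
smoothed = list(islice(moving_average(counter(), width=3), 5))
```

Nothing is computed until something downstream asks for it, which is the whole trick.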

To read

See original: The Living Thing Stream processing

Complex networks

Why are networks [1] such a fashionable model to research?

[1] shorthand prefixed “complex” to make sure one does not accidentally get sent to an IT operations conference and have to talk to — brrr — engineers, while still sounding like it might be difficult and worthy.

Because there is no shame attached to publishing colour diagrams of networks! Hire a designer and submit your work to a glossy!

OK, I have some more opinions on that point,
but they are not exciting,
and possibly more cynical than necessary because of the deep sense of tedium that much network research unleashes from somewhere in my gut.
But no, despite my damned gut, there are some really nice results in the abstract graph modelling world, and gut-tedium can strike in any field.

To mention:
Spectral methods, which is to say at least "graph Laplacian" ones.
(For that see also matrix factorizations)
"Pagerank" et al.

To not mention:
Abstract, general graph clustering problems, because of that tedium thing.
Really not my bag.
Specific clustering problems, are OK.
Small world/scale-free/Erdős–Rényi models, which are covered to the point of suffocation by those more passionate than I.


  • Belkin, M., & Niyogi, P. (2003). Laplacian Eigenmaps for Dimensionality Reduction and Data Representation. Neural Computation, 15(6), 1373–1396. DOI. Online.

    Useful for some problems of mine.
    Use a graph Laplacian (possibly a contrived or arbitrary one)
    to construct "natural" Euclidean coordinates for your data,
    such that nodes that have much traffic between them in the Laplacian
    representation have a small Euclidean distance
    (The "Urban Traffic Planner Fantasy Transformation")
    Very quickly gives you a similarity measure on really non-Euclidean data.
    Under which metrics is it equivalent to multidimensional scaling?
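A minimal numpy sketch of that recipe, with arbitrary choices (a k-nearest-neighbour graph, the unnormalised Laplacian) standing in for Belkin and Niyogi's actual options:

```python
import numpy as np

def laplacian_eigenmap(X, n_neighbors=5, n_components=2):
    """Euclidean coordinates from the low eigenvectors of a graph Laplacian."""
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    W = np.zeros((n, n))
    for i in range(n):
        nearest = np.argsort(d2[i])[1:n_neighbors + 1]   # skip self at index 0
        W[i, nearest] = 1.0
    W = np.maximum(W, W.T)              # symmetrise the k-NN adjacency
    L = np.diag(W.sum(axis=1)) - W      # unnormalised graph Laplacian
    _vals, vecs = np.linalg.eigh(L)
    return vecs[:, 1:n_components + 1]  # drop the constant eigenvector

# points on a circle: neighbours on the curve should stay close in the embedding
theta = np.linspace(0, 2 * np.pi, 60, endpoint=False)
X = np.c_[np.cos(theta), np.sin(theta)]
Y = laplacian_eigenmap(X, n_neighbors=4)
```

The embedding keeps graph-adjacent points at small Euclidean distance, which is the property I keep wanting.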

See original: The Living Thing Complex networks

WAYS-UNESCO Science Forum in Latin America and the Caribbean

14/11/2014 23:49

Under the Ibero-American Congress on Science, Technology, Innovation and Education

Buenos Aires, Argentina, November 14, 2014, 15:00 to 18:00 h

On the last day of the Ibero-American Congress on Science, Technology, Innovation and Education, which brings together scientists, policymakers, science educators and administrators from Spanish and Portuguese speaking nations in Latin America, the Caribbean, and Europe, the voice of emerging Latin American scientists will be heard in two dynamic sessions featuring innovative projects that showcase science, culture, science advocacy and science diplomacy.

The Youth Science Forum is convened by WAYS (World Association of Young Scientists) and the Regional Office for Science in Latin America and the Caribbean of UNESCO (United Nations Education, Science, and Culture Organization), as the inaugural event of WAYS-LAC (World Association of Young Scientists-Latin America and the Caribbean).

Compliance control mechanisms of the Kyoto Protocol on climate change: between diplomacy and law

It was in the course of her doctoral thesis that Juliette Voïnov Kholer wrote the present work, in which she recounts the successive stages of the international negotiations, as well as the various agreements signed, in response to the problem of climate change.

In the introductory section, the author presents the legal characteristics of the Kyoto Protocol by way of the international conventions (the Stockholm World Conference on the Human Environment of 1972; the United Nations Conference on Environment and Development in Rio de Janeiro in 1992; the World Summit on Sustainable Development of 2002) which brought states to an awareness of the climate problem and of the need to undertake concrete actions to confront it (shared responsibility).

The first part addresses the control mechanism adopted for the implementation of the Kyoto Protocol within the United Nations Framework Convention on Climate Chan...

See original: VertigO - la revue électronique en sciences de l'environnement Les mécanismes de contrôle du respect du Protocole de Kyoto sur les changements climatiques : entre diplomatie et droit

Agricultural colonization in Viêt Nam: a contribution to the study of the construction of a modern state, from the upheaval to the integration of the Central Highlands

The work under review presents the foundations and manifestations of agricultural colonization in Viêt Nam as one element of the deforestation process that has contributed to converting forest territories to other uses.

The expansion of cultivated land at the expense of forests is a phenomenon that has affected, to varying degrees, all the major regions of the world over the course of history. For some decades now, Southeast Asia has been undergoing territorial transformations comparable to the European, US, and Chinese models, or even to the more recent one of Amazonia. These models share similar characteristics: accelerated population growth and, as a corollary, an increase in agricultural production; an intensification of trade and of the use of technical or technological innovations; and, finally, the emergence of a new, transformed society. In the case of Viêt Nam, it is agricultural colonization, one of the fac...

See original: VertigO - la revue électronique en sciences de l'environnement La colonisation agricole au Viêt Nam : contribution à l’étude de la construction d’un État moderne, du bouleversement à l’intégration des Plateaux centraux


Dictionary sparse bases

Orthogonal matching pursuit, wavelets, curvelets, whateverlets.
Analysing your signal in terms of a possibly-overcomplete dictionary of basis functions because… well, because.
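For instance, plain matching pursuit is only a few lines: greedily project the residual onto the best-correlated atom and subtract. A toy numpy sketch, where the overcomplete dictionary is an arbitrary one I made up, not anything canonical:

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms=10):
    """Greedy matching pursuit over unit-norm dictionary columns."""
    residual = signal.astype(float).copy()
    coeffs = np.zeros(dictionary.shape[1])
    for _ in range(n_atoms):
        correlations = dictionary.T @ residual
        best = int(np.argmax(np.abs(correlations)))   # best-matching atom
        coeffs[best] += correlations[best]
        residual -= correlations[best] * dictionary[:, best]
    return coeffs, residual

# an arbitrary overcomplete dictionary: identity atoms plus random ones
rng = np.random.default_rng(1)
D = np.hstack([np.eye(16), rng.standard_normal((16, 16))])
D /= np.linalg.norm(D, axis=0)          # normalise each column
x = 3.0 * D[:, 2] + 0.5 * D[:, 20]      # a 2-sparse signal in this dictionary
coeffs, residual = matching_pursuit(x, D)
```

The orthogonal variant re-fits all selected coefficients each step; this greedy version is enough to see the idea.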

To read

  • scikit version here (also pydbm and )
  • Boyes, G. (2011). Dictionary-Based Analysis/Synthesis and Structured Representations of Musical Audio. McGill University. Online.
  • Davis, G. M., Mallat, S. G., & Zhang, Z. (1994). Adaptive time-frequency decompositions. Optical Engineering, 33(7), 2183–2191. Online.
  • Han, K., & Shin, H. (n.d.). Functional Linear Regression for Functional Response via Sparse Basis Selection. Online.
  • Mairal, J., Bach, F., Ponce, J., & Sapiro, G. (2009). Online Dictionary Learning for Sparse Coding. In Proceedings of the 26th Annual International Conference on Machine Learning (pp. 689–696). New York, NY, USA: ACM. DOI. Online.
  • Mairal, J., Bach, F., Ponce, J., & Sapiro, G. (2010). Online learning for matrix factorization and sparse coding. The Journal of Machine Learning Research, 11, 19–60. Online.
  • Mallat, S. G., & Zhang, Z. (1993). Matching pursuits with time-frequency dictionaries. Signal Processing, IEEE Transactions on, 41(12), 3397–3415. Online.
  • Mallat, S., & Zhang, Z. (1992). Adaptive time-frequency decomposition with matching pursuits. In Time-Frequency and Time-Scale Analysis, 1992., Proceedings of the IEEE-SP International Symposium (pp. 7–10). DOI.
  • Rubinstein, R., Bruckstein, A. M., & Elad, M. (2010). Dictionaries for Sparse Representation Modeling. Proceedings of the IEEE, 98(6), 1045–1057. DOI. Online.
  • Rubinstein, R., Zibulevsky, M., & Elad, M. (2008). Efficient implementation of the K-SVD algorithm using batch orthogonal matching pursuit. CS Technion, 40. Online.

See original: The Living Thing Dictionary sparse bases

User interface mapping

Particularly with regard to music.

  1. I just designed an interesting digital instrument with a bunch of free control parameters.
  2. I have an interface with a different (usually much smaller) number of control parameters.
  3. There is no obvious "best", or immediately intuitive, mapping from one to the other.

How do I plug these into each other in an intelligible, expressive way so as to perform using them?

This question is broad, vague, and comes up all the time.

See also
gesture recognition,
wacky regression.

Ideas I would like to explore:

  • Interpolating between interesting parameters using arbitrary regression. (Wekinator does this using simple neural networks)
  • Constructing basis vectors in some clever way, e.g. sparse basis dictionaries
  • constructing quasi-physical models that explore the space in some smart, intuitive way, e.g. flocking models, Hamiltonian dynamics
  • interpolating some nice presets?
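The first idea, interpolation by regression, can be sketched in a few lines: place each stored preset at a point in controller space and blend them with normalised Gaussian (RBF) weights. Everything here, names and numbers alike, is illustrative, not any real instrument:

```python
import numpy as np

def rbf_mapping(control_points, presets, width=1.0):
    """Map a low-dimensional controller position to many synth parameters
    by blending stored presets with normalised Gaussian weights."""
    control_points = np.asarray(control_points, float)  # (n_presets, n_controls)
    presets = np.asarray(presets, float)                # (n_presets, n_params)

    def interpolate(position):
        d2 = ((control_points - np.asarray(position, float)) ** 2).sum(axis=1)
        w = np.exp(-d2 / (2.0 * width ** 2))
        w /= w.sum()                  # weights form a convex combination
        return w @ presets
    return interpolate

# two imaginary knobs steering ten imaginary synth parameters via three presets
controls = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
presets = np.arange(30.0).reshape(3, 10)
interp = rbf_mapping(controls, presets, width=0.5)
blend = interp([0.5, 0.0])
```

Because the weights are convex, the output never leaves the range spanned by the presets, which is a reassuring property for a performance instrument.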

Useful tools

  • libmapper bundles together UI signals and constructs some extra features for them (integrals, differences)
  • mmExtensions by Martin Marier has the best-designed preset interpolation system I have seen, so that its creator may plug a networked bath sponge into clarinet recordings.

See original: The Living Thing User interface mapping

Scientific computation workflow

Keeping my workflow reproducible and fast.
Knitr, scriptable pipelines, data sharing.

TODO: Briefly gloss some fashionable academic programming languages, once I have actually used something other than python or R in more than a trivial way.

Build tools

Argh! Too damn many, that’s for sure, all offering marginal improvements over make, all with uncertain lifespans.
Therefore, arranged in descending order of an undisclosed combination of my perception of their trendiness, and their being flexible enough to manage scientific tasks, which means:

  1. Sure, sometimes I want to compile some code…
  2. …but usually I want to massage some large chunk of data through a rickety processing pipeline…
  3. …so please do both without getting in my way.
  • doit seems to be the flavour of the minute, and promises modern dependency management and such, for a general audience
  • Ruffus is about setting up your simulation and automation pipelines, especially for science.
  • Sumatra is almost entirely about step 2, and in particular, tracking and reproducing simulation or analysis parameters for sciency types.
  • Fabric - Yesterday’s hottest thing. Optimised for remote deployment, esp of remote services. Seems to be everywhere, or everywhere that you might be deploying to that cloud thing.
  • Paver - had a nice syntax for building stuff, including python extensions, but seems to have been untouched for dangerously long.
  • Scons is a make replacement that is itself damn old, and AFAICT, not actually that much easier to use. Oriented toward compiling stuff.
  • Make. The original, and still the default. For connoisseurs of fragile whitespace handling.
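For the record, the sort of dependency declaration I mean, in doit's idiom: a dodo.py is plain python returning task dictionaries. The file names and helper scripts below are invented for illustration:

```python
# dodo.py -- a sketch in doit's idiom: plain functions returning task dicts.
# The file names and helper scripts here are made up.

def task_clean_data():
    """Massage raw data into a tidy intermediate file."""
    return {
        "file_dep": ["raw/measurements.csv"],
        "targets": ["build/tidy.csv"],
        "actions": ["python clean.py raw/measurements.csv build/tidy.csv"],
    }

def task_summarise():
    """Summary statistics, recomputed only when the tidy file changes."""
    return {
        "file_dep": ["build/tidy.csv"],
        "targets": ["build/summary.json"],
        "actions": ["python summarise.py build/tidy.csv build/summary.json"],
    }
```

Chaining one task's targets into the next task's file_dep is all the pipeline structure I usually need.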

See also
text editors,
citation management,
academic writing workflow.

See original: The Living Thing Scientific computation workflow

Academic blogging workflow

"Improving my academic workflow", or,
"How I learned to stop worrying and love text files".

This blog, and virtually all my notes, are in plain text files on my computer, published online as plain html files.

Here are the tedious technical details of how and why
I publish them online using Pelican. This workflow serves me well enough,
but I’m experimenting with some alternate automation tools such as marked and restview.
As to why I chose Pelican over Jekyll: I am more fluent in Python than Ruby.
As to why I chose Pelican over Nikola: I can’t remember.
When I set this thing up, there was some feature I needed from Nikola that it was missing, I guess?
They seem to have converged to being the same software with different code now anyway.

In the past, I used to use a great many online services to handle my
information; but I’ve been burned too many times by these businesses going
under, or fitting poorly into my workflow changes over time and leaving me no recourse. Waste of time, loss of data.
(So long, evernote, google docs…)
Fuck that.
Online is for backups, for NSA spooks, and for over-eager VC-backed startups to
breathlessly enthuse over, while they monetize your browsing habits.
The only online services I use these days are

  • Zotero, which avoids the above criticisms by being open-source, working great offline, being backed by a presumably relatively benevolent university, and handling only information that I wish to share.
  • github isn’t open-source, but the underlying technology it facilitates, git, is very open-source, so I’m happy to benefit from their sheen of trendiness.

I send both of those projects money in order that they continue to be nice to me and do not send me advertising.

See also
text editors,
citation management,
scientific computation workflow.

To read

See original: The Living Thing Academic blogging workflow

Monte Carlo methods

Solving integrals approximately by guessing cleverly.
Often, but not always, used for approximate statistical inference and especially certain Bayesian techniques.
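The one-line version of "guessing cleverly": average the integrand at uniform random points and scale by the interval length. A toy numpy sketch:

```python
import numpy as np

def mc_integrate(f, a, b, n=100_000, seed=0):
    """Monte Carlo estimate of the integral of f over [a, b]:
    the interval length times the mean of f at uniform random points."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(a, b, n)
    return (b - a) * f(x).mean()

# the integral of sin over [0, pi] is exactly 2
estimate = mc_integrate(np.sin, 0.0, np.pi)
```

The error shrinks like n^(-1/2) regardless of dimension, which is why this dumb trick scales where quadrature rules choke.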

  • Stan is a Markov chain Monte Carlo inference tool made by actual statisticians

See also compressed sensing, particle filters, and probabilistic computing.

See original: The Living Thing Monte Carlo methods

Probabilistic programming

Something like… programming using probabilistic inference instead of trying to be deterministic?
(Where does deterministic approximation fit in here?)

Is this, in fact, any different to "normal" programming done by a Bayesian statistician?
The two best candidates for differences I have seen so far are…

  1. the blithe use of models too deep for any professional statistician to be comfortable with
  2. claiming to be in some way emulating or explaining human cognition

…but I am sure someone out there has some better pitch than that.
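By way of illustration, the smallest probabilistic program I can think of: a generative model plus conditioning by rejection sampling, in plain python. All names are invented; real systems do this with far cleverer inference than rejection.

```python
import random

def flip(p=0.5):
    return random.random() < p

def model():
    """A tiny generative 'program': two independent coin flips."""
    return flip(), flip()

def infer(model, condition, n=50_000):
    """Condition by rejection: run the program, keep runs where `condition` holds,
    and report the frequency of the first flip among the survivors."""
    kept = [run for run in (model() for _ in range(n)) if condition(run)]
    return sum(a for a, _b in kept) / len(kept)

random.seed(0)
# P(first coin is heads | at least one is heads) = 2/3
p_first = infer(model, lambda run: run[0] or run[1])
```

The point is that the program is the model, and "running it backwards" under a condition is the inference.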

See also compressed sensing, Monte Carlo methods, particle filters, and probabilistic computing, which, while strictly speaking the same thing, has a rather different emphasis.

To read

  • the Probabilistic programming wiki
  • Freer, C. E., Roy, D. M., & Tenenbaum, J. B.(2012). Towards common-sense reasoning via conditional simulation: legacies of Turing in Artificial Intelligence. In Turing’s Legacy: Developments from Turing’s Ideas in Logic. Cambridge, United Kingdom: Cambridge University Press. Online.
  • N. D. Goodman and J. B. Tenenbaum. Probabilistic Models of Cognition.
  • Murphy, K. P.(2012). Machine Learning: A Probabilistic Perspective (1 edition.). Cambridge, MA: The MIT Press.

See original: The Living Thing Probabilistic programming

Covariance estimates

Estimating the thing that is always given to you by oracles in homework assignments.

Particular reference to dynamically updating covariance estimates for a possibly-evolving system.
(Which is not quite the Kalman filter problem, since that presumes that the variance of our estimates, which is to say our precision, updates, but that the underlying covariance is static.)
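For my own reference, the simplest dynamic scheme I know: an exponentially-forgetting rank-one update, a discounted cousin of the Welford recursion. The class name and discount factor are my own arbitrary choices, not a standard API:

```python
import numpy as np

class ForgetfulCovariance:
    """Exponentially-forgetting running mean and covariance.

    lam close to 1 forgets slowly; smaller lam tracks a drifting system faster."""
    def __init__(self, dim, lam=0.99):
        self.lam = lam
        self.mean = np.zeros(dim)
        self.cov = np.eye(dim)

    def update(self, x):
        x = np.asarray(x, float)
        delta = x - self.mean                  # innovation against the old mean
        self.mean = self.mean + (1.0 - self.lam) * delta
        # discounted rank-one update; stays symmetric by construction
        self.cov = self.lam * self.cov + (1.0 - self.lam) * np.outer(delta, x - self.mean)
        return self.cov

rng = np.random.default_rng(0)
A = np.array([[2.0, 0.5], [0.5, 1.0]])   # samples will have covariance A @ A.T
est = ForgetfulCovariance(2, lam=0.99)
for _ in range(5000):
    est.update(A @ rng.standard_normal(2))
```

The effective memory is roughly 1/(1 - lam) samples, which is the knob that trades tracking speed against estimator variance.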

To read

  • Basic inference using the Inverse Wishart - by having a very basic "process model" that increases the uncertainty of the covariance estimate as some convenient monotonic function of time, I should be able to get this one.
  • Permutations on the theme at wikipedia
  • John Cook’s version
  • Murphy, K. P.(2007, October 3). Conjugate Bayesian analysis of the Gaussian distribution. Online.
  • Pébay, P. (2008). Formulas for robust, one-pass parallel computation of covariances and arbitrary-order statistical moments. Sandia Report SAND2008-6212, Sandia National Laboratories. Online.
  • Wilkinson, D. J.(1995). Bayes linear covariance matrix adjustment. arXiv:bayes-an/9512001. Online.

See original: The Living Thing Covariance estimates