Breakbeat cuts

Slicing up your percussion line into mad junglist syncopations is a whole world of its own.
Aside from selling a lot of vinyl, it has attracted significant academic interest.

Think of the group-theory angle, as with a Rubik’s cube.
Is it a pure group theoretic problem?
Or are there additional constraints on a breakbeat cut such that it is still considered rhythmic?
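
To make the group-theory question concrete, here is a minimal sketch, assuming a cut is nothing more than a reordering of equal-length slices of the break. The function names and the example cuts are mine, purely for illustration. Cuts that are true permutations compose and invert like Rubik’s cube moves; a cut that repeats a slice (a stutter) has no inverse, so it already falls outside the pure group picture.

```python
def is_permutation(cut):
    """True if the cut uses every slice index exactly once."""
    return sorted(cut) == list(range(len(cut)))


def apply_cut(cut, slices):
    """Reorder slices: position i of the output plays slices[cut[i]]."""
    return [slices[i] for i in cut]


def compose(first, second):
    """Single cut equivalent to applying `first`, then `second`."""
    return tuple(first[i] for i in second)


def inverse(cut):
    """Cut that undoes `cut`; only defined for true permutations."""
    if not is_permutation(cut):
        raise ValueError("a cut that repeats or drops slices has no inverse")
    inv = [0] * len(cut)
    for position, source in enumerate(cut):
        inv[source] = position
    return tuple(inv)


# Hypothetical 8-slice break and two example cuts.
break_slices = ["kick", "hat", "snare", "hat", "kick", "kick", "snare", "hat"]
swap_snares = (0, 1, 6, 3, 4, 5, 2, 7)  # a permutation: invertible
stutter = (0, 0, 0, 0, 4, 5, 6, 7)      # repeats slice 0: not invertible
```

Whether a given permutation still sounds rhythmic is exactly the extra constraint that this sketch does not capture.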

Nick Collins has done a whole lot of work here.

See original: The Living Thing / Notebooks Breakbeat cuts


Earthquakes

A passing interest of mine, caught from Didier Sornette when he was my supervisor.

I’m mostly interested in the self-exciting process model of Ogata and Ozaki et al, but I’ll also accept notes on human tragedy and normal accidents.
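
For concreteness, here is a minimal sketch of simulating a self-exciting process by thinning. I am assuming the model meant is the Hawkes-type process with an exponential kernel, with intensity mu + sum over past events of alpha * beta * exp(-beta * (t - t_i)). The function name and the parameter names (mu, alpha, beta) are my own choices, not taken from any particular paper or library.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon) by thinning.

    mu:    background rate (must be > 0)
    alpha: expected number of direct offspring per event (keep < 1 for stability)
    beta:  decay rate of the exponential kernel
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # summed kernel contributions of past events, at time t
    while True:
        # The intensity only decays until the next event, so its current
        # value is a valid upper bound for proposing a candidate.
        upper = mu + excitation
        wait = rng.expovariate(upper)
        if t + wait >= horizon:
            return events
        t += wait
        excitation *= math.exp(-beta * wait)  # decay up to the candidate time
        intensity = mu + excitation
        # Accept the candidate with probability intensity / upper.
        if rng.random() * upper <= intensity:
            events.append(t)
            excitation += alpha * beta  # each event kicks the intensity up
```

With alpha = 0 this reduces to a plain Poisson process of rate mu; with 0 < alpha < 1 the long-run event rate should approach mu / (1 - alpha).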

Kathryn Schulz at the New Yorker, The Really Big One:

To see the full scale of the devastation when that tsunami recedes, you would
need to be in the international space station. The inundation zone will be
scoured of structures from California to Canada. The earthquake will have
wrought its worst havoc west of the Cascades but caused damage as far away as
Sacramento, California—as distant from the worst-hit areas as Fort Wayne,
Indiana, is from New York. FEMA expects to coördinate search-and-rescue
operations across a hundred thousand square miles and in the waters off four
hundred and fifty-three miles of coastline. As for casualties: the figures I
cited earlier—twenty-seven thousand injured, almost thirteen thousand
dead—are based on the agency’s official planning scenario, which has the
earthquake striking at 9:41 A.M. on February 6th. If, instead, it strikes in
the summer, when the beaches are full, those numbers could be off by a
horrifying margin.

See original: The Living Thing / Notebooks Earthquakes


Samuel is one of the best students at NDA university.


Study the stars and planets and their movements. Become a master of planets and stars, study with excellent professionals and earn an advanced degree.


Data sets

See also musical corpora.

  • Zenodo, for example, documents many published scientific data sets.

  • SESHAT: The Seshat: Global History Databank brings together the most current and comprehensive body of knowledge about human history in one place. Our unique Databank systematically collects what is currently known about the social and political organization of human societies and how civilizations have evolved over time.

  • UCI datasets
    are diverse. Here’s a nice one:

    • Buzz prediction in online social media

      This dataset contains two different social networks: Twitter, a micro-blogging platform with exponential growth and extremely fast dynamics, and Tom’s Hardware, a worldwide forum network focusing on new technology with more conservative dynamics but distinctive features.

  • Leskovec lab

      J. Yang, J. Leskovec. Temporal Variation in Online Media. ACM International Conference on Web Search and Data Mining (WSDM ’11), 2011.
    • Twitter7:

      467 million Twitter posts from 20 million users covering a 7 month period from June 1 2009 to December 31 2009. We estimate this is about 20-30% of all public tweets published on Twitter during the particular time frame.

      As per request from Twitter the data is no longer available.

    • higgs-twitter

      The Higgs dataset has been built after monitoring the spreading processes on Twitter before, during and after the announcement of the discovery of a new particle with the features of the elusive Higgs boson on 4th July 2012. The messages posted in Twitter about this discovery between 1st and 7th July 2012 are considered.

  • Quandl has some databases.

  • CRSP has some too? Perhaps accessible to me via Wharton?

See original: The Living Thing / Notebooks Data sets

About YGS

I am a young, growing scientist from Nigeria, in the western part of the African continent. I created a group called Young Growing Scientists (YGS) to bring people together to study science. When I saw this association I was very happy: it is the kind of thing I like, studying with my fellow scientists and talking about science positively. Please, my co-scientists, let us join hands and make the YGS group good and of a high standard in this association, so that people will know it better, want to join, and make the world better. Isu Okoche Samuel (YGS) says: with science, all things are easy.

Academic publishing

Some practical notes on the connection between reproducibility, academic publishing and… whatever.

  • Zenodo “is an open dependable home for the long-tail of science, enabling researchers to share and preserve any research outputs in any size, any format and from any science.”

    • Research. Shared. — all research outputs from across all fields of science are welcome!
    • Citeable. Discoverable. — uploads gets a Digital Object Identifier (DOI) to make them easily and uniquely citeable…
    • Flexible licensing — because not everything is under Creative Commons.
    • Safe — your research output is stored safely for the future in same cloud infrastructure as research data from CERN’s Large Hadron Collider.

    A major win is the easy, free DOI-linking of data and code for reproducible research.

  • Cameron Neylon (spelling corrections are mine):

    […]another way to look at engaging with peer review is as costly signalling. The purpose of submitting work to peer review is to signal that the underlying content is “honest” in some sense. In the mating dance between researchers and funders […]the peer review process is intended to make the pure signalling of publication and […]harder to fake. Taking Fisher’s view of mutual selection, authors on one side, funders and [institutions] on the other, we can see, at least as analogy, a reason for the [runaway] selection for publishing in prestigious journals[:] A runaway process where the signalling [bears] a tenuous relationship with the underlying qualities being sought, in the same way as the size of the peacock’s tail has a [tenuous] link with its health and fitness.

    (I think I have successfully reconstructed the intended quote through the typographical errors.)

  • Open Conference Systems (OCS)
    “is a free Web publishing tool that will create a complete Web presence for your scholarly conference. OCS will allow you to:

    • create a conference Web site
    • compose and send a call for papers
    • electronically accept paper and abstract submissions
    • allow paper submitters to edit their work
    • post conference proceedings and papers in a searchable format
    • post, if you wish, the original data sets
    • register participants
    • integrate post-conference online discussions”

See original: The Living Thing / Notebooks Academic publishing

Why are cancer cases increasing?

Our changing world and social structure are making us pay; global warming and GMO products are among the factors. If we don't act together as one then we are doomed. I have just invented the first-ever cancer-treatment tablet-making machine and liquid syrup. If this research can be published, we will surely head towards development.


All about science

Point processes

Another current obsession, tentatively placemarked.
I’ve just spent 6 months thinking about nothing else, so I won’t write much here.


See original: The Living Thing / Notebooks Point processes

Parallel computing

Fashion dictates this should be “cloud” computing, although I’m interested in using the same methods without a cloud.

Let’s say, I need to take notes on “easy shared-nothing parallel computing”.

I get lost in all the options for parallel computing on the cheap.
I’m gonna summarise for myself here.

Additional material on this theme is filed under scientific computation workflow and stream processing.

Emphasis for now is on embarrassingly parallel computation, which is what I as a statistician mostly do. Mostly in Python, sometimes in other things.
That is, I run many calculations/simulations with absolutely no shared state and aggregate them in some way at the end.
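
As a baseline before reaching for any cluster tooling, the pattern above can be sketched with nothing but the Python standard library. This is a minimal single-machine sketch. The toy simulation (a Monte Carlo estimate of pi) and the function names are placeholders of mine. Each task builds its own seeded random generator, so no state is shared between workers.

```python
import random
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def simulate(seed, n_draws=10_000):
    """One independent simulation: estimate pi using its own seeded RNG."""
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(n_draws)
        if rng.random() ** 2 + rng.random() ** 2 <= 1.0
    )
    return 4.0 * hits / n_draws


def run_all(seeds, executor_cls=ProcessPoolExecutor, max_workers=None):
    """Map `simulate` over the seeds with no communication between tasks."""
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(simulate, seeds))


def aggregate(estimates):
    """The only step that sees more than one result: a plain average."""
    return sum(estimates) / len(estimates)


# Usage, run as a script so that worker processes can import `simulate`:
#
#     if __name__ == "__main__":
#         print(aggregate(run_all(range(8))))
#
# Passing executor_cls=ThreadPoolExecutor gives the same results in one
# process, which is handy for debugging but will not speed up CPU-bound work.
```

Because the tasks share nothing, the same `simulate` function could in principle be handed to a cluster scheduler unchanged; only `run_all` would need replacing.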

Good scientific Python VM images: (To work out - should I be listing Docker container images instead?)

  1. Continuum Analytics has conda Python images
  2. StarCluster is an academically-targeted AWS-compatible multi-computing library. Their VMs are a little bit dated for my purposes, lacking LLVM 3.5 etc.

But how to use ‘em?

  1. Dato (formerly GraphLab) claims to automate this stuff.
  2. Spark distributes over clusters automagically. It has several cluster modes:

    • Standalone – a simple cluster manager included with Spark that makes it easy to set up a private cluster.

    • Apache Mesos – a general cluster manager that can also run Hadoop MapReduce and service applications.

    • Hadoop YARN – the resource manager in Hadoop 2.

    • Basic EC2 launch scripts make it easy to launch a standalone cluster on Amazon EC2.

    • Interesting application: see Communication-Efficient Distributed Dual Coordinate Ascent

      By leveraging the primal-dual structure of these optimization problems,
      COCOA effectively combines partial results from local computation while
      avoiding conflict with updates simultaneously computed on other machines.
      In each round, COCOA employs steps of an arbitrary dual optimization
      method on the local data on each machine, in parallel. A single update
      vector is then communicated to the master node.

      Uses cunning optimisation stunts to do efficient distribution of
      optimisation problems over various machines.

  3. IBM has a trial cloud offering
    (documentation here)

See original: The Living Thing / Notebooks Parallel computing

Randomised algorithms

Sacrificing precision/certainty for speed, using randomness.

See also the related and/or special cases: compressed sensing, Monte Carlo methods, particle filters, random features, and stochastic gradient descent.

  • BlinkDB: “BlinkDB is a massively parallel, approximate query engine for running interactive SQL queries on large volumes of data. It allows users to trade-off query accuracy for response time, enabling interactive queries over massive data by running queries on data samples and presenting results annotated with meaningful error bars. To achieve this, BlinkDB uses two key ideas: (1) An adaptive optimization framework that builds and maintains a set of multi-dimensional samples from original data over time, and (2) A dynamic sample selection strategy that selects an appropriately sized sample based on a query’s accuracy and/or response time requirements[…]”
  • Probabilistic data structures, e.g.
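
The trade of accuracy for response time in the BlinkDB description can be illustrated in a few lines. This is a minimal sketch of the general idea (answer an aggregate query from a uniform random sample and attach an error bar), not BlinkDB's actual machinery or API. The function name is hypothetical, and the error bar assumes a normal approximation and ignores the finite-population correction.

```python
import math
import random
import statistics


def approximate_mean(values, sample_size, seed=0, z=1.96):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, margin): the sample mean and the half-width of an
    approximate 95% confidence interval (z = 1.96, normal approximation).
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # sampling without replacement
    estimate = statistics.fmean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, z * standard_error
```

The cost of the query now scales with `sample_size` rather than with `len(values)`, and the margin shrinks roughly like one over the square root of the sample size, so quadrupling the sample about halves the error bar.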

See original: The Living Thing / Notebooks Randomised algorithms

Stupid Git Tricks

publishing to GitHub

ghp-import -p _build/html/



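# one-off: graft a branch of another repo into this repo under subdir/
git fetch remote branch
git subtree add --prefix=subdir remote branch


# thereafter: pull upstream changes into subdir/, or push local changes back
git fetch remote branch
git subtree pull --prefix=subdir remote branch
git subtree push --prefix=subdir remote branch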

garbage collecting

In brief, this will purge a lot of stuff from a constipated repo in emergencies:

git reflog expire --expire=now --all
git gc --prune=now

In depth

See original: The Living Thing / Notebooks Stupid Git Tricks

Calendars and contact databases, digital, use thereof

Google Calendar, iCloud etc. exist.
I don’t use them.
Gifting your entire personal schedule
and confidential contact data to third parties of dubious motivation
demonstrates a touching sense of the general benevolence of the world toward you.
However, sadly, this rainbows-n-unicorns worldview entails
a disregard for the personal privacy
and beliefs of those of
your contacts who are not convinced that the apparatus of state and capital is
at their personal disposal.

Jargon to know here: CalDAV and CardDAV
are the de facto standards for syncing your calendar and contact information.
All you need is a server which talks those standards, and
you can use whatever client you’d like.

So! Running your own.

  • OSX (closed source) will install Apple’s open-source
    Calendar Server. (Does contacts too)
  • OSX/Windows/Linux: Radicale seems to be much easier if you don’t
    want to pay for the magic installer.
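
Since CalDAV and CardDAV are extensions of WebDAV, which is itself plain HTTP, "whatever client you’d like" can be as small as a few lines of standard-library Python. Here is a minimal sketch that only builds a PROPFIND request listing the resources in a collection and does not send it. The server URL in the usage comment is a made-up placeholder, and authentication, which any real server will want, is left out.

```python
import urllib.request

# WebDAV PROPFIND body asking for each resource's display name and type.
PROPFIND_BODY = b"""<?xml version="1.0" encoding="utf-8"?>
<d:propfind xmlns:d="DAV:">
  <d:prop>
    <d:displayname/>
    <d:resourcetype/>
  </d:prop>
</d:propfind>"""


def build_propfind(collection_url, depth="1"):
    """Build (but do not send) a PROPFIND request for a DAV collection.

    depth="0" asks about the collection itself, "1" also about its children.
    """
    return urllib.request.Request(
        collection_url,
        data=PROPFIND_BODY,
        method="PROPFIND",
        headers={
            "Depth": depth,
            "Content-Type": "application/xml; charset=utf-8",
        },
    )


# Usage against a hypothetical server (placeholder URL, no credentials):
#
#     req = build_propfind("https://dav.example.org/user/calendars/")
#     with urllib.request.urlopen(req) as response:
#         print(response.read().decode())  # XML multi-status listing
```

In practice a dedicated client library is less painful, but the point stands that nothing here is tied to a particular vendor.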

See original: The Living Thing / Notebooks Calendars and contact databases, digital, use thereof