söndag 18 augusti 2013

Soft Realtime Java

Anyone visiting my home could verify that garbage collection is something best left to some other entity than me. One severe problem with having a computer program looking what your program is doing trying to clean up after it, is that it sometimes gets so confused it has to trigger some divine intervention, stop the time, go through much of the allocated memory and reallocate parts not in use anymore.

This seasonally cleaning can take a while. Usually that's ok, but for more and more applications the world can't wait. One of these are when the computer is busy generating cool synth sounds for a dancefloor in real time.

To solve this issue many programs simply don't use Java, but C++ or something else where you have more control over how your memory is allocated and deallocated. This leads to nicely working software, like the Ableton Live, but also rather much work for the programmers behind it. Ok, Javas somewhat crimped primitives and the JVMs limitied abilities to reach new efficient processor instructions wouldn't make much to help it out either, but anyway.

One new player in the game is IBM's Metronome GC, that can guarantee a maximum pause time for Java GC that is about 2 ms. Of course, someone made a sample-player synth written in java with rather competitive latencies.

onsdag 14 augusti 2013

Garbage Collection visualization


I really like the old disk defragmenter program in Windows. The small blocks where moved all around and I could watch it for hours (well, minutes at least).


The JVM Garbage Collection is something of a black-box which sometimes just makes everything go to halt. To get a better understanding of how it is struggling with my code I wanted something like the disk defragmenter, and, Eureka! Netflix has made their own tool, gc.viz (blogpost), and also logs all the garbage collection closely in their production systems. Interesting indeed!


lördag 10 augusti 2013

Leaderboard, some approaches

I'm looking into the leader board problem. In short, the problem can be expressed as the most efficient way to take the n best out of a large (often semi ordered) set with the size m (the cardinality of the total league).

The most naive version is to keep a list of the n≤m best players seen so far. If we assume no sorting among these n players, the complexity of the algorithm will be n*m, which is often unacceptable, since both m and n can be huge.

Cache the worst one the list
A slightly less naive way is to cache the least good player in the current toplist. This will, at best, give only slightly above m comparisions, given that the best players came first. Beware though, if the set is given in reversed order, the complexity once again soars to n*m.

Keep a binary tree heap
A heap is an often excellent way to select items out of a set. The most general selection, the heap-sort has a well known n log n complexity. The heap could be a candidate because it keeps the players on the top list sorted and could be a good candidate. This is not the case. The binary tree has the sad property that it can be very unbalanced and we are back again at n*m.

Keep a balanced binary tree heap!
Now we're talking! There are several trees that balance themselves in a well-known worst case time available. Actually, there are some specialized heap data structures availiable, a list from Wikipedia on Heap data structures gives the following
  • 2-3 heap
  • Beap 
  • Binary heap
  • Binomial heap
  • Brodal queue
  • d-ary heap
  • Fibonacci heap
  • Leftist heap
  • Pairing heap
  • Skew heap
  • Soft heap
  • Weak heap
  • Leaf heap
  • Radix heap
  • Randomized meldable heap
where the Fibonacci and Brodal heaps seems to have among the best over-all performance (edit: no, the different heaps do of course have different properties depending on the insert/delete proportions). 

I'll give a try to implement the Fibonacci heap. And yes, it's one availiable in clojure already, courtesy of Mary Rose Cook, maryrosecook/fibonacciheap, built with zippers, nicely described a great blogpost. Super elegant.

fredag 9 augusti 2013

Small app for finding BPMs by key-tap

With Giorgio-voice: When I was young, 15-16 I definetly wanted to become a DJ and producer. I left that for my friends in Skillnicker instead. I took the engineering route instead. And here we are.

When you want to mix two songs togheter they needed to be in the same tempo, therefore we marked our records with the number of BPMs a certain song had. We calculated in the beginning by measuring the length of 16 beats with a stop watch and look into a table. After a while there were programmes where one just could tap the space bar and get a decent result. Here is such a program in Clojure.

  The intresting part of the program is the minimal state, it's just the last time we got a key press in the frame. I will add a Kalman-filter to it soon, giving it a state of a hefty two or three variables.

torsdag 11 juli 2013

Core.async for simple user-interaction

Alex Miller gives an excellent example on how to make rock-paper-scissors with core.async which made me want to try it on my mekanikmaskin - a computer-aided learning and examinating tool.

The cool thing about core.async is that it makes sort-of a reactive programming possible. Put something in a channel, and watch the program take care of it. Almost like REPLs waiting inside functions wherever you like.


And yes, it's a terrible mixture between printlns and the async constructs, but the way to create a more advanced interaction is for another day.

lördag 29 juni 2013

Project.clj superpowers with Alembic

Hugo Duncan (Pallet etc) have made it possible to dynamically add new Maven/Leiningen dependencies from the REPL through Alembic.

This opens up from starting a repl with just Clojure and keep adding dependencies all night long.

As a courtesy to you, dear reader, a current project.clj of mine have the following deps so you can try them out for yourself, in the live-running repl.

Some of my common Project.clj dependencies

[org.clojure/clojure "1.5.1"]
[org.clojure/tools.logging "0.2.6"]

[org.clojure/data.csv "0.1.2"]
[org.clojure/data.json "0.2.2"]


The Clojure language (which should already be available in the repl), functions for logging, parsing comma-separated values from exports and parse JSON from REST-apis.

[org.clojure/google-closure-library-third-party "0.0-2029"]
Needed for ClojureScript code compilation. Probably outdated and I should switch to Pedestal instead.

[clj-time "0.5.0"]
A wrapper around Joda-time, the sane way to handle time, date and intervals on the JVM.

[clj-http "0.7.2"]
An http-client based on Apache Commons

[ch.qos.logback/logback-classic "1.0.11"]
The best working logging library IMHO.

[incanter/incanter-core "1.5.1"]
[incanter/incanter-io "1.5.1"]
[incanter/incanter-charts "1.5.1"]

Incanter is a R-like package for Clojure. There are functions for various statistics, matrix calculations, not-super-cool-but-convenient visualization through JFreeChart, readers for various data sets and more.

[enlive "1.1.1"]
HTML Templating library based on transforms.

[domina "1.0.0"]
ClojureScript-library for manipulating the DOM.

[com.datomic/datomic-free "0.8.4020"
 :exclusions [org.slf4j/slf4j-nop          
              org.slf4j/slf4j-log4j12]]

A convenient, scalable, relational database that is built in Clojure keeping many of Clojure's idioms while getting the Durability in Clojures ACI-persistent datastructures. The exclusions is to make sure the logging from Datomic not collides with logback.

[compojure "1.1.5"]
A convenient library to make ring-handlers for specified routes. If you come from PHP-world this is like where you put the "filenames" and/or the rewrite rules, only that it's made right from the beginning. 

[ring "1.1.8"]
The servlet-abstraction library for clojure.

[com.keminglabs/vomnibus "0.3.2"]
Small useful things, like the ColorBrewer colors.

[com.keminglabs/c2 "0.2.2"]
I try to make visualizations with this one instead of D3, but it's less mature and I'm not skilled enough in the toolchain to get as much from it as from D3 right now. I have not given up yet!

[ring-mock "0.1.5"]
Mocking is a way to create a fake instance of something, in this case it allows a ring handler function to be called to verify that the function returns excepted results, all this without starting a web-server, eliminating the possibility to have collisions in port-allocation etc.

[clj-pid "0.1"]
A .pid contains the Process ID (an integer) as a string. This can be used to verify the process is running, and for kill it by a shellscript.

[schejulure "0.1.4"]
A cron-like scheduling library based on the Java concurrent scheduling libraries and some core.logic.

måndag 3 juni 2013

Clj-Liblinear (thanks Kevin)

I had some problems with "typed" ragged arrays when I tried to wrap Lib-linear in Clojure. Kevin Lynagh and Benedikt Waldvogel had already solved them in clj-liblinear, thanks!

Now onto some batching for determining nice parameters for our tests.