Recently I was honored to be named one of Kaggle’s top 10 data scientists. For coverage on the topic see:
- http://www.readwriteweb.com/cloud/2012/04/what-you-can-learn-from-kaggle.php
- http://satellite.tmcnet.com/news/2012/04/17/6262977.htm
Recently I was honored to be named one of Kaggle’s top 10 data scientists. For coverage on the topic see:
Some time ago I wrote a post about Gibbs sampling using Matlab. There I showed that the JIT compiler of Matlab can get you very close to the speed of compiled C code if you know what you are doing, but that it is easy to screw up and get a very slow program as a result. More recently, I came across a new scientific programming language called Julia, which seems to be designed specifically with this kind of JIT compilation in mind. So I put Julia to the test, using the slow version of the Gibbs sampler from my last post:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | function gibbs2(n, thin) x_samp = zeros(n,1) y_samp = zeros(n,1) x=0.0 y=0.0 for i=1:n for j=1:thin x=(y^2+4)*randg(3) y=1/(1+x)+randn()/sqrt(2*x+2) end x_samp[i] = x y_samp[i] = y end return x_samp, y_samp end |
function gibbs2(n, thin)
x_samp = zeros(n,1)
y_samp = zeros(n,1)
x=0.0
y=0.0
for i=1:n
for j=1:thin
x=(y^2+4)*randg(3)
y=1/(1+x)+randn()/sqrt(2*x+2)
end
x_samp[i] = x
y_samp[i] = y
end
return x_samp, y_samp
end1 2 | julia> @elapsed gibbs2(50000,1000) 7.6084020137786865 |
julia> @elapsed gibbs2(50000,1000) 7.6084020137786865
The first thing to notice is that the Julia code looks very similar to the Matlab code from my last post, which is great for people familiar with that language. The second thing is that it is very fast and does not suffer from Matlab’s dumb JIT problems. In fact, this Julia code is even faster than the ‘fast version’ of the Matlab code from my last post. Way to go Julia!