The speed of IDL compared with other languages
Because IDL is array-based, it can deliver very good execution speed compared with other programming languages. The main reason is that array-based computations tend to produce favorable memory access patterns: array elements are stored adjacently in memory and are accessed in sequence. Raw computation has become much faster over time, so it is often no longer the bottleneck for a computational algorithm. Instead, memory access, caching behavior, and cache misses tend to dominate the run time.
I wanted to do a simple, unscientific speed comparison using an algorithm that I had already implemented. I chose the LSD radix sort algorithm because the Wikipedia article includes example code for several programming languages: C, C++, C++14, Python, and Java. See this link. Also see this IDL blog post for reference.
I modified the code for all the languages to sort a 10,000,000-element 32-bit integer array containing random integers between 0 and 9,999,999. The main modifications were to use a radix (base) of 256 and to use a dynamically allocated array that could hold 10,000,000 elements. For the examples that used radix 10 (Java), I also needed to expand the buckets to 256 to hold the histogram for each iteration.
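To make the modifications concrete, here is a minimal Python sketch of an LSD radix sort with radix 256 and a 256-bucket histogram per pass. It is not the code that was benchmarked; the function name `lsd_radix_sort` and its parameters are assumptions made for illustration, and the demo uses a much smaller array than the 10,000,000 elements in the test.

```python
import random


def lsd_radix_sort(values, radix=256, passes=4):
    """Stable LSD radix sort for non-negative integers below radix**passes."""
    src = list(values)
    factor = 1
    for _ in range(passes):
        # Histogram of the current digit: one bucket per possible digit value.
        counts = [0] * radix
        for v in src:
            counts[(v // factor) % radix] += 1
        # An exclusive prefix sum turns the counts into bucket start offsets.
        offsets = [0] * radix
        total = 0
        for d in range(radix):
            offsets[d] = total
            total += counts[d]
        # Scatter in input order so that equal digits keep their order (stable).
        dst = [0] * len(src)
        for v in src:
            d = (v // factor) % radix
            dst[offsets[d]] = v
            offsets[d] += 1
        src = dst
        factor *= radix
    return src


if __name__ == "__main__":
    n = 100_000  # illustrative size, far smaller than the post's 10,000,000
    rng = random.Random(0)
    data = [rng.randrange(n) for _ in range(n)]
    assert lsd_radix_sort(data) == sorted(data)
    print("Sorted correctly")
```

With radix 256, four passes of one byte each cover every 32-bit value, which is why the number of passes is fixed at four in this sketch.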
I ran all the tests on the same 6-core Intel(R) Xeon(R) system. Here are the results:
Python: 16.3 seconds (after optimizing a bit, original was 29.3 seconds)
C++: 14.0 seconds
Java: 9.2 seconds
C++14: 2.2 seconds
IDL: 0.86 seconds
C: 0.72 seconds
In my test, IDL finishes a close second behind the C implementation. Obviously, these code examples are not fully optimized for speed, but they may be representative of how people write code when there is not enough time to spend on optimization. I would also mention that the readability of the IDL code is a significant advantage: the IDL code takes 20 lines, whereas the C code takes 76. This makes it easy to change and improve the IDL code.
This is an example where IDL performs very well. There are obviously other cases where IDL code runs much slower than optimal. I have found that in most cases when my IDL code runs slowly, the cause is too many loops and scalar operations where array-based operations could be used instead.
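The same effect shows up in other interpreted languages. As a rough analogy in Python rather than IDL, the sketch below compares an element-by-element loop with a single built-in call that processes the whole sequence in compiled code. The function names are made up for this illustration, and the timings depend entirely on the machine, so none are quoted here.

```python
import timeit


def total_scalar(values):
    """Sum the elements one at a time in an interpreted loop."""
    total = 0
    for i in range(len(values)):
        total += values[i]
    return total


def total_bulk(values):
    """Sum the elements with one built-in call over the whole sequence."""
    return sum(values)


if __name__ == "__main__":
    values = list(range(200_000))
    # Both approaches must agree before their speed is compared.
    assert total_scalar(values) == total_bulk(values)
    t_loop = timeit.timeit(lambda: total_scalar(values), number=5)
    t_bulk = timeit.timeit(lambda: total_bulk(values), number=5)
    print(f"scalar loop: {t_loop:.3f} s, bulk call: {t_bulk:.3f} s")
```

In IDL the corresponding rewrite is usually to replace a `for` loop over array indices with one expression on the whole array.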
Here is the code listing for the IDL version of the 10,000,000-element integer sort (it also runs 3.8 times faster than IDL's built-in sort for this particular case):
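; Create 10,000,000 random integers in the range 0 to n-1
n = 10000000
data = randomu(seed, n, /long) mod n
sorted = data
radix = 256LL
factor = 1ull
; Four passes of 8 bits each cover the full 32-bit range
for i=0,3 do begin
  ; Extract the current base-256 digit of every element
  rem = sorted/factor
  digit = rem mod radix
  factor = factor*radix
  ; Histogram the digits; ri receives the reverse indices
  h = histogram(digit, min=0, max=radix-1, binsize=1, $
    reverse_indices=ri)
  ; Skip the radix+1 offset entries in ri; the rest lists the
  ; element indices grouped by digit
  sorted = sorted[ri[radix+1:*]]
endfor
; Verify the result against IDL's built-in sort
tmp = data[sort(data)]
if array_equal(tmp, sorted) then print, 'Sorted correctly'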
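The key step in the listing is the indexing with `ri[radix+1:*]`. The Python sketch below emulates the layout of the reverse-indices vector that IDL's histogram returns, to show why that slice reorders the data by digit. It is an illustration, not IDL's implementation: the function name is hypothetical, and it assumes (as the listing relies on) that indices within each bin appear in their original order, which is what keeps each pass stable.

```python
def histogram_with_reverse_indices(digits, nbins):
    """Emulate the reverse-indices layout for integer bins 0..nbins-1."""
    counts = [0] * nbins
    for d in digits:
        counts[d] += 1
    # The first nbins+1 entries are offsets into ri: the indices of the
    # elements in bin b are stored in ri[ri[b]] .. ri[ri[b+1]-1].
    ri = [0] * (nbins + 1 + len(digits))
    ri[0] = nbins + 1
    for b in range(nbins):
        ri[b + 1] = ri[b] + counts[b]
    # The remaining entries are element indices grouped by bin,
    # kept in their original order within each bin.
    cursor = ri[:nbins]  # copy of the start offsets, advanced while filling
    for idx, d in enumerate(digits):
        ri[cursor[d]] = idx
        cursor[d] += 1
    return counts, ri


if __name__ == "__main__":
    values = [513, 2, 258, 1]
    digits = [v % 256 for v in values]  # lowest base-256 digit: [1, 2, 2, 1]
    counts, ri = histogram_with_reverse_indices(digits, 256)
    order = ri[256 + 1:]                # same slice as ri[radix+1:*]
    print([values[i] for i in order])   # [513, 1, 2, 258]
```

After this first pass the values are ordered by their lowest digit only; repeating the step for the higher digits, as the loop in the listing does, completes the sort.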