Reinventing the wheel
Java sorting algorithms are fast. What I am trying to do here seems futile. Why would anyone wants to improve something that is already quite good, right?
You are absolutely right and you should stop reading my blog.
I tried, anyway, and failed miserably at beating Java 7 Timsort Arrays.sort(Object[]). It is not easy. Timsort is a well-tested, well-battled hybrid sort model of a mergesort and insertion sort algorithm. It works by identifying runs that are descending or non-descending. Then, these runs are combined and sorted using either insertion sort or mergesort depending on a minrun parameter.
Naive mergesort vs. Java Tim’s sort
My first attempt at implementing mergesort was only 80-95% as fast as Timsort on movie titles (string), stock values (double) and integers.
This could also mean that my first attempt was too slow. I even tried to reduce the size of the temporary array in half and performed an in-place merge phase. It didn’t help that much.
Now that I have these baseline performance numbers, my next post will aim at improving them using simple techniques like parallelization.
See part 2
comments powered by Disqus