Improving Java Sorting Algorithms: Part 1

Thai Bui bio photo By Thai Bui Comment

Reinventing the wheel

Java sorting algorithms are fast. What I am trying to do here seems futile. Why would anyone wants to improve something that is already quite good, right?

You are absolutely right and you should stop reading my blog.

I tried, anyway, and failed miserably at beating Java 7 Timsort Arrays.sort(Object[]). It is not easy. Timsort is a well-tested, well-battled hybrid sort model of a mergesort and insertion sort algorithm. It works by identifying runs that are descending or non-descending. Then, these runs are combined and sorted using either insertion sort or mergesort depending on a minrun parameter.

Naive mergesort vs. Java Tim’s sort

My first attempt at implementing mergesort was only 80-95% as fast as Timsort on movie titles (string), stock values (double) and integers.

This could also mean that my first attempt was too slow. I even tried to reduce the size of the temporary array in half and performed an in-place merge phase. It didn’t help that much.

Now that I have these baseline performance numbers, my next post will aim at improving them using simple techniques like parallelization.

See part 2

comments powered by Disqus