See part 1

Parallelism != linear sort improvement

Mergesort is an easy algorithm to be parallelized. However, any parallelable tasks will need at least a coordinating task to split up the work and to synchronize the result. There is always a tradeoff between how much is gained because of parallelism and how much is lost because of coordination.

Parallel mergesort vs. Java Tim’s sort vs. Naive mergesort

My mergesort parallelism attempt was about 27-46% faster than a single-threaded Java Timsort. It is about the same in sorting stock values which I do not know why. It could be that the stock values are almost sorted such that there is little to do and the parallelism improvement is minimal.

In the next post, I will examine the performance of the current Java 7 parallel sort algorithm and the newer Java 8 parallel sort algorithm. I expect mine to be slower since I don’t have enough time and expertise to tune the algorithm on a variety of dataset and machine configurations. However, even if the current algorithm is slower, I still can parallel the sort task across machines to beat Java 8 version.

See part 1

Thai Bui - Yet, Another Personal Blog

Technology, big data, machine learning, algorithms and more.

Improving Java Sorting Algorithms: Part 2

Parallelism != linear sort improvement

Parallel mergesort vs. Java Tim’s sort vs. Naive mergesort