Hi, I'm Nikolai Oplachko (magnickolas)

Parallel Sorting on GPU: Part 2

This is the closing sequel to the first part and dedicated to the derivation of an iterative form of the Batcher’s Odd-Even Mergesort algorithm; implementing it with a GPU shader; a little of benchmarks for fun. Theory preparation Reminder from the first post The GPU’s computation model drastically differs from the CPU one. In simplest terms, we have magnitutes more workers (cores) organized in multiple disjoint local groups. The best (but pretty utopian) way of utilizing this model is to move the data to all of them and let them process it independently (or as locally as possible), because communication between workers from different groups is quite expensive and is able to offset the entire potential speedup from using GPU compared to CPU.

Parallel Sorting on GPU: Part 1

This post series is dedicated to the description and implementation of sorting an array with a graphics processing unit (GPU). It’ll be split into two posts. In this part, we’ll first try to derive one of the earliest comparably efficient parallel sorting algorithm from the sorting networks class. What’s remarkable for this class of sorting algorithms, is that before performing the algorithm, we already know the whole sequence of pairs of indices to be compared and swapped (if needed).

The First Post

My blog is taking the beginning right here.

I want to use it to write down some of the insights I gain in exploring different things. Additionally, to describe solutions to some problems I encounter.