NVIDIA Interview Question

How would you accelerate and parallelize a prefix-sum (exclusive scan) computation on the GPU?
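
To make the question concrete: an exclusive scan maps an input such as [3, 1, 7, 0] to [0, 3, 4, 11], so each output is the sum of all elements strictly before it. One common approach is the work-efficient up-sweep/down-sweep scheme usually attributed to Blelloch. The sketch below simulates it sequentially in Python so the data movement is easy to follow. It is an illustration under stated assumptions, not a tuned GPU kernel: the function name `exclusive_scan_blelloch` is made up for this example, and each inner `for` loop stands in for a set of independent operations that one GPU thread each could perform between barriers.

```python
def exclusive_scan_blelloch(values):
    """Exclusive prefix sum via up-sweep / down-sweep (hypothetical helper).

    Sequential simulation: every iteration of an inner loop touches disjoint
    elements, so on a GPU each iteration could be assigned to its own thread,
    with a barrier between consecutive strides.
    """
    n = len(values)
    if n == 0:
        return []

    # Pad to the next power of two with the additive identity (0).
    size = 1
    while size < n:
        size *= 2
    data = list(values) + [0] * (size - n)

    # Up-sweep (reduce): build partial sums up a balanced binary tree.
    stride = 1
    while stride < size:
        for i in range(2 * stride - 1, size, 2 * stride):  # independent per i
            data[i] += data[i - stride]
        stride *= 2

    # Down-sweep: clear the root, then push prefixes back down the tree.
    data[size - 1] = 0
    stride = size // 2
    while stride >= 1:
        for i in range(2 * stride - 1, size, 2 * stride):  # independent per i
            left = data[i - stride]
            data[i - stride] = data[i]   # left child receives the parent's prefix
            data[i] += left              # right child receives prefix + left sum
        stride //= 2

    return data[:n]


if __name__ == "__main__":
    print(exclusive_scan_blelloch([3, 1, 7, 0, 4, 1, 6, 3]))
```

This scheme performs O(n) additions in about 2·log2(n) parallel steps. In an actual GPU implementation it would typically run per thread block in shared memory, with per-block totals scanned in a second pass and added back. In practice a library routine such as `thrust::exclusive_scan` or `cub::DeviceScan::ExclusiveSum` is usually the first thing to reach for.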