Capgemini Interview Question

Difference between coalesce and repartition

Interview Answer

Anonymous

Sep 18, 2024

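Coalesce is used only to reduce the number of partitions. It merges existing partitions and avoids a full shuffle, so very little data moves, but the resulting partitions can be uneven in size. Repartition can either increase or decrease the number of partitions. It performs a full shuffle, so the data ends up distributed roughly equally across the new partitions. This makes it an expensive operation compared to coalesce.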
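
To make the contrast concrete, here is a small pure-Python toy model of the two behaviours. It does not use Spark and is not how Spark implements them. The helper names `coalesce` and `repartition` below are illustrative functions over plain lists, and the round-robin redistribution is a simplifying assumption standing in for the shuffle.

```python
from typing import List

Partition = List[int]


def coalesce(partitions: List[Partition], n: int) -> List[Partition]:
    """Toy model: merge whole neighbouring partitions into n groups.

    No partition is split, so records only move as part of a merge.
    Asking for more partitions than exist leaves the count unchanged.
    """
    if n >= len(partitions):
        return [list(p) for p in partitions]
    merged: List[Partition] = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        # Map contiguous input partitions onto one output partition.
        merged[i * n // len(partitions)].extend(part)
    return merged


def repartition(partitions: List[Partition], n: int) -> List[Partition]:
    """Toy model: full shuffle, every record is redistributed round-robin."""
    out: List[Partition] = [[] for _ in range(n)]
    records = [r for p in partitions for r in p]
    for i, record in enumerate(records):
        out[i % n].append(record)
    return out


# A deliberately skewed input: 4 partitions holding 6, 1, 2 and 1 records.
skewed = [[1, 2, 3, 4, 5, 6], [7], [8, 9], [10]]

coalesced = coalesce(skewed, 2)
shuffled = repartition(skewed, 2)

print([len(p) for p in coalesced])  # sizes after merging: [7, 3]
print([len(p) for p in shuffled])   # sizes after full shuffle: [5, 5]
print(len(coalesce(skewed, 6)))     # coalesce cannot increase the count: 4
print(len(repartition(skewed, 6)))  # repartition can increase it: 6
```

In the toy model the skew survives coalesce but is evened out by repartition, at the cost of touching every record. In PySpark itself the corresponding calls are `df.coalesce(n)` and `df.repartition(n)`.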