AMD Interview Question

1. Improvements on classical attention (in LLMs) that can save compute/arithmetic operations.

Interview Answer

Anonymous

Nov 2, 2024

AFAIK, in the LLM literature, very few papers target saving arithmetic operations, as LLM compute time is mostly memory-bound. I am still very confused about the purpose of asking this bizarre question in a technical round.
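
To make the memory-bound point concrete, here is a rough back-of-the-envelope sketch (not a profiler measurement) of the arithmetic intensity of one decode step of attention with a KV cache. The function names, the model shape, and the accelerator numbers are all hypothetical and chosen only for illustration; the estimate also ignores softmax and the traffic for the query and output vectors.

```python
def decode_attention_intensity(seq_len, head_dim, num_heads, bytes_per_elem=2):
    """Approximate FLOPs per byte for one decode step of attention.

    Assumes a single new query token attending over a cached K and V
    of length seq_len (hypothetical simplified model).
    """
    # q @ K^T costs about 2 * seq_len * head_dim FLOPs per head,
    # and weights @ V costs about the same again.
    flops = 4 * seq_len * head_dim * num_heads
    # The cached K and V tensors each have to be read from memory once.
    bytes_moved = 2 * seq_len * head_dim * num_heads * bytes_per_elem
    return flops / bytes_moved


def is_memory_bound(intensity, peak_flops, mem_bandwidth):
    """Roofline check: memory-bound if intensity is below the ridge point."""
    ridge = peak_flops / mem_bandwidth  # FLOPs per byte needed to saturate compute
    return intensity < ridge


# Hypothetical shape: 4096 cached tokens, 32 heads of dimension 128, fp16 cache.
intensity = decode_attention_intensity(seq_len=4096, head_dim=128, num_heads=32)

# Hypothetical accelerator: 1e15 FLOP/s peak compute, 3e12 B/s memory bandwidth.
bound = is_memory_bound(intensity, peak_flops=1e15, mem_bandwidth=3e12)
print(intensity, bound)
```

Under these assumptions the intensity is about 1 FLOP per byte regardless of sequence length, far below the ridge point of the assumed hardware, so cutting arithmetic alone would not shorten this step; reducing the bytes read from the KV cache would.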