1. Improvements upon the classical attention (in LLMs) that can save compute/arithmetic operations.
Anonymous
AFAIK, very few papers in the LLM literature target saving arithmetic operations, since LLM compute time is mostly memory-bound. I am still very confused about the purpose of asking such a bizarre question in a technical round.
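To make the memory-bound point concrete, here is a rough back-of-the-envelope sketch of the arithmetic intensity (FLOPs per byte moved) of one attention head during a single decode step. The function names, the fp16 element size, and the hardware figures are all assumptions chosen for illustration; they do not describe any specific device or model. The estimate also ignores the softmax and the small query/output vectors.

```python
def decode_attention_intensity(n_ctx, d_head, bytes_per_elem=2):
    """Approximate FLOPs per byte for one head at one decode step.

    Assumes a single query vector attends over n_ctx cached keys/values,
    and that K and V are each read from memory once (fp16 by default).
    """
    # 2*n*d FLOPs for the q.K^T scores, 2*n*d for the weighted sum over V.
    flops = 4 * n_ctx * d_head
    # K and V caches: two (n_ctx x d_head) tensors read once each.
    bytes_moved = 2 * n_ctx * d_head * bytes_per_elem
    return flops / bytes_moved


def is_memory_bound(intensity, peak_flops, mem_bandwidth):
    """Roofline check: below the ridge point, bandwidth is the limit."""
    ridge_point = peak_flops / mem_bandwidth  # FLOPs per byte
    return intensity < ridge_point


# Hypothetical accelerator: 300 TFLOP/s peak, 2 TB/s memory bandwidth.
HYPOTHETICAL_PEAK_FLOPS = 300e12
HYPOTHETICAL_BANDWIDTH = 2e12

intensity = decode_attention_intensity(n_ctx=4096, d_head=128)
memory_bound = is_memory_bound(
    intensity, HYPOTHETICAL_PEAK_FLOPS, HYPOTHETICAL_BANDWIDTH
)
print(f"intensity = {intensity} FLOP/byte, memory-bound: {memory_bound}")
```

Under these assumptions the intensity works out to about 1 FLOP per byte regardless of context length, far below the ridge point of the hypothetical device, which is why shaving arithmetic operations alone tends not to speed up decoding.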