Apple Interview Question

What is a KV cache? How does it help in LLM inference?
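
To make the question concrete, here is a minimal sketch of the idea, assuming a single attention head with causal (autoregressive) decoding and no batching. In a decoder, the key and value vectors for earlier tokens do not change as new tokens are generated, so they can be stored and reused instead of being recomputed at every step. The weight matrices `W_Q`, `W_K`, `W_V`, the toy token vectors, and the helper names are all hypothetical, chosen only for illustration; a real model does this per layer and per head on tensors.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(W, x):
    # W is a list of rows; returns the matrix-vector product W @ x.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def attend(q, keys, values):
    # Scaled dot-product attention of one query over all visible keys/values.
    scale = 1.0 / math.sqrt(len(q))
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]

# Hypothetical toy projection matrices (not taken from any real model).
W_Q = [[0.5, -0.2], [0.1, 0.3]]
W_K = [[0.4, 0.1], [-0.3, 0.2]]
W_V = [[1.0, 0.0], [0.5, 0.5]]

def decode_without_cache(tokens):
    outputs, projections = [], 0
    for t in range(len(tokens)):
        # Recompute K and V for the whole prefix at every step.
        prefix = tokens[: t + 1]
        keys = [matvec(W_K, x) for x in prefix]
        values = [matvec(W_V, x) for x in prefix]
        projections += 2 * len(prefix)
        q = matvec(W_Q, tokens[t])
        outputs.append(attend(q, keys, values))
    return outputs, projections

def decode_with_cache(tokens):
    outputs, projections = [], 0
    k_cache, v_cache = [], []  # the "KV cache"
    for x in tokens:
        # Project only the new token and append to the cache.
        k_cache.append(matvec(W_K, x))
        v_cache.append(matvec(W_V, x))
        projections += 2
        q = matvec(W_Q, x)
        outputs.append(attend(q, k_cache, v_cache))
    return outputs, projections

tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
out_plain, n_plain = decode_without_cache(tokens)
out_cached, n_cached = decode_with_cache(tokens)
```

Under these assumptions both functions produce the same attention outputs, but the uncached version performs a number of K/V projections that grows quadratically with sequence length, while the cached version performs a fixed two per generated token. The trade-off is memory: the cache grows linearly with the number of tokens kept in context.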