Accepted for publication in IJRSI Winter Journal 2026
KV Cache Recycling to Expand Usable Context Capacity in Low-Parameter LLMs
We investigate whether the attention key-value (KV) states computed for one prompt by a small LLM can be reused to accelerate inference on a new, similar prompt, effectively expanding the model's usable context memory through token recycling.
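To make the mechanism concrete, here is a minimal sketch of KV-cache reuse, not the paper's method: it shows the exact-reuse case, where two prompts share a literal prefix, using the Hugging Face transformers API with GPT-2 as a stand-in small model. The model choice and prompt strings are illustrative assumptions; reuse across prompts that are merely similar rather than prefix-identical, the question the paper studies, would need approximations beyond this sketch.

```python
import copy

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption for illustration: GPT-2 as the "small LLM"; any causal LM works.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

# Two prompts that share an identical prefix (the exact-reuse case).
shared_prefix = "Summarize the following support ticket:\n"
tails = [" My order arrived damaged.", " I was charged twice."]

with torch.no_grad():
    # 1) Encode the shared prefix once and keep its KV cache.
    prefix_ids = tok(shared_prefix, return_tensors="pt").input_ids
    out = model(prefix_ids, use_cache=True)
    cache = out.past_key_values  # the recycled KV states

    # 2) For each new prompt, feed only the unseen tail; the prefix
    #    tokens are never re-encoded.
    for tail in tails:
        tail_ids = tok(tail, return_tensors="pt").input_ids
        # Copy the cache, since recent cache objects are mutated in place.
        out_new = model(
            tail_ids,
            past_key_values=copy.deepcopy(cache),
            use_cache=True,
        )
        next_id = out_new.logits[:, -1].argmax(dim=-1)
        print(repr(tail), "-> next token:", tok.decode(next_id))
```

With causal attention, a token's KV states depend only on the tokens before it, so reuse is exact precisely when the cached span is a verbatim prefix of the new prompt; recycling KV states between prompts that are similar but not prefix-aligned is what makes the problem nontrivial.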