Use a lightweight on-device LLM paired with a retrieval-augmented generation (RAG) pipeline over vectorized schema metadata. Retrieving only the schema entries relevant to each input keeps prompts short, so interactive user inputs can be handled efficiently.
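The retrieval step can be sketched as follows. This is a minimal illustration, not a definitive implementation: the schema entries in `SCHEMA_DOCS` are hypothetical, the `embed` function is a toy bag-of-words stand-in for a real embedding model, and the helper names (`retrieve`, `build_prompt`) are assumptions introduced here. The resulting prompt would then be passed to whichever on-device model is in use; that call is left out.

```python
import math
import re
from collections import Counter

# Hypothetical schema metadata; a real system would read this from the database catalog.
SCHEMA_DOCS = {
    "orders": "orders table: order id, customer id, order date, total amount",
    "customers": "customers table: customer id, name, email, signup date",
    "products": "products table: product id, title, price, category",
}


def embed(text):
    """Toy embedding: lowercase word counts (assumed stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Vectorize the schema metadata once, ahead of any user input.
INDEX = {name: embed(doc) for name, doc in SCHEMA_DOCS.items()}


def retrieve(query, k=2):
    """Return the names of the k schema entries most similar to the query."""
    q = embed(query)
    ranked = sorted(INDEX, key=lambda name: cosine(q, INDEX[name]), reverse=True)
    return ranked[:k]


def build_prompt(query, k=2):
    """Assemble a prompt containing only the retrieved schema entries."""
    context = "\n".join(SCHEMA_DOCS[name] for name in retrieve(query, k))
    return f"Schema:\n{context}\n\nQuestion: {query}\nAnswer:"
```

Because the index is built once and each query only triggers a similarity search plus a short prompt, the per-input cost stays small, which is what makes the approach a reasonable fit for a small local model.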