Serve many users or workflows from a single Aspose.LLM for .NET instance: per-user sessions, serialized inference, request routing patterns. ... itself is serialized: the native layer does not run multiple inference ... tokens of history, with a 32-layer model and F16 KV cache, peak ...