KV cache dtype for the V (values) tensor in Aspose.LLM for .NET. It tolerates more aggressive quantization than TypeK; Q8_0 is a safe default when memory is tight. ...

TypeK: stored shape, memory scaling, and options are the same. ...
ContextSize: memory savings scale with context.
FlashAttentionMode...
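
The note that memory savings scale with context can be made concrete with a little arithmetic. The sketch below is not Aspose.LLM code. It assumes the cache uses the common ggml-style block layouts (F16 at 2 bytes per element, Q8_0 at 34 bytes per 32 elements, Q4_0 at 18 bytes per 32 elements), and the helper name `v_cache_bytes` and the model shape are hypothetical, chosen only for illustration.

```python
# Approximate bytes per stored element for each cache dtype.
# Assumption: ggml-style block layouts; Aspose.LLM internals are not confirmed here.
BYTES_PER_ELEMENT = {
    "F16": 2.0,        # 16-bit float
    "Q8_0": 34 / 32,   # 32 int8 values plus one fp16 scale per block
    "Q4_0": 18 / 32,   # 32 4-bit values plus one fp16 scale per block
}


def v_cache_bytes(context_size, n_layers, n_kv_heads, head_dim, dtype):
    """Approximate size of the V half of the KV cache, in bytes."""
    elements = context_size * n_layers * n_kv_heads * head_dim
    return int(elements * BYTES_PER_ELEMENT[dtype])


# Hypothetical model shape, used only to show the scaling.
shape = dict(n_layers=32, n_kv_heads=8, head_dim=128)

for ctx in (4096, 32768):
    f16 = v_cache_bytes(ctx, dtype="F16", **shape)
    q8 = v_cache_bytes(ctx, dtype="Q8_0", **shape)
    saved = f16 - q8
    print(
        f"context={ctx}: F16={f16 / 2**20:.0f} MiB, "
        f"Q8_0={q8 / 2**20:.0f} MiB, saved={saved / 2**20:.0f} MiB"
    )
```

Because every term is linear in the context length, the absolute saving from a smaller V dtype grows in direct proportion to the context: an 8x longer context gives an 8x larger saving under these assumptions.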