vllm.model_executor.layers.attention_layer_base ¶
Base class for attention-like layers.
AttentionLayerBase ¶
Bases: ABC
Base class for attention-like layers (Attention, Mamba, etc.) that support the v1 engine.
This provides a common interface for getting attention backends from different layer types.
Source code in vllm/model_executor/layers/attention_layer_base.py
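For orientation, here is a consumer-side sketch of how code holding a collection of AttentionLayerBase instances might query this common interface. The layers mapping and the helper name are assumptions for illustration, not part of vLLM's engine code.

# Hypothetical sketch: given a mapping of layer names to AttentionLayerBase
# instances, record each layer's attention backend class and collect the
# KV cache spec of every layer that actually needs one.
def summarize_layers(layers, vllm_config):
    backends = {}
    specs = {}
    for name, layer in layers.items():
        backends[name] = layer.get_attn_backend()
        spec = layer.get_kv_cache_spec(vllm_config)
        if spec is not None:  # None means this layer keeps no KV cache
            specs[name] = spec
    return backends, specs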
get_attn_backend abstractmethod ¶
get_attn_backend() -> type[AttentionBackend]
Get the attention backend class for this layer.
get_kv_cache_spec abstractmethod ¶
get_kv_cache_spec(vllm_config: VllmConfig) -> KVCacheSpec | None
Get the KV cache spec for this layer. May be None if the layer does not need KV cache.
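As a rough illustration, the following is a minimal sketch of a layer type implementing this interface. The import paths for AttentionBackend and KVCacheSpec and the placeholder return values are assumptions and may differ across vLLM versions.

from vllm.attention.backends.abstract import AttentionBackend  # import path is an assumption
from vllm.config import VllmConfig
from vllm.model_executor.layers.attention_layer_base import AttentionLayerBase
from vllm.v1.kv_cache_interface import KVCacheSpec  # import path is an assumption


class MyStatelessLayer(AttentionLayerBase):
    """Hypothetical attention-like layer that keeps no KV cache."""

    def get_attn_backend(self) -> type[AttentionBackend]:
        # A real layer returns a concrete backend class; the abstract base
        # class is returned here only as a placeholder.
        return AttentionBackend

    def get_kv_cache_spec(self, vllm_config: VllmConfig) -> KVCacheSpec | None:
        # Returning None signals that this layer does not need a KV cache.
        return None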