vLLM-Omni is required for multimodal generation (text-to-image and omni models).
Container mode works without a local installation.
To use Subprocess mode, install vLLM-Omni in a separate virtual environment:
# Create a venv with Python 3.12 (--seed preinstalls pip into it)
uv venv --python 3.12 --seed ~/.venv-vllm-omni
source ~/.venv-vllm-omni/bin/activate
# Install the vLLM base package (--torch-backend=auto picks a torch build matching the local CUDA driver)
uv pip install vllm==0.14.0 --torch-backend=auto
# Clone and install vLLM-Omni from source (editable)
git clone https://github.com/vllm-project/vllm-omni.git
cd vllm-omni && uv pip install -e .
# For Qwen3-TTS models, also install:
uv pip install onnxruntime
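To confirm the installation before pointing the app at it, run a quick import check from the activated venv. This is a minimal sanity check; the `vllm_omni` import name is an assumption based on the repository name:
# Verify both packages import cleanly
python -c "import vllm; print(vllm.__version__)"
python -c "import vllm_omni"  # assumed import name for the vllm-omni package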
Then set the venv path (~/.venv-vllm-omni) in the Run Mode configuration below.
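Subprocess mode invokes the venv's interpreter directly, so no activation is needed at runtime. As a quick sketch (assuming the venv path above), you can confirm the configured path resolves to a working interpreter:
# Should print Python 3.12.x without activating the venv
~/.venv-vllm-omni/bin/python --version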