vLLM
@vllm_project
NVIDIA published a tutorial for deploying Cosmos Reason 2B on Jetson using vLLM โ covering AGX Thor, AGX Orin, and Orin Super Nano. FP8 quantized VLM with chain-of-thought reasoning, served via `vllm serve` and connected to a real-time webcam UI for interactive vision analysis. Great to see vLLM powering edge inference on Jetson. ๐ Thanks to the @NVIDIARobotics Jetson team! ๐ https://huggingface.co/blog/nvidia/cosmos-on-jetson