MLSys Radar

triton

AMD ROCm Blogs · hardware · 2026-05-22

From Build to Benchmark: ONNX Model Serving with Triton Inference Server on AMD GPUs

Score 30

Triton Inference Server is an open-source platform designed to streamline AI inferencing. It supports the deployment, scaling, and inference of trained models from multiple frameworks, including ONNX Runtime, TensorFlow, PyTorch, and other...

inference serving kernel triton benchmark model-release cloud open-source

Open

High signal Matched: inference, inferencing, serving, triton, benchmark, model, cloud, open-source

vLLM Project · open-source · 2026-03-04

vLLM Triton Attention Backend Deep Dive

Score 14

This article is adapted from a Red Hat hosted vLLM Office Hours session with Burkhard Ringlein from IBM Research, featuring a deep technical walkthrough of the vLLM Triton attention backend....

kernel triton research

Open

High signal Matched: triton, research