Modular · inference-infra · 2026-05-21
Why LLM Inference Needs a New Kind of Router - Part 2
High signal Matched: inference, router
High-performance AI platform company positioning itself from kernel to cloud.
Modular · inference-infra · 2026-05-18
Hippocratic AI partners with Modular to power flexible, high-quality inference for real-time patient conversations
High signal Matched: inference
Modular · inference-infra · 2026-05-12
Inkwell: Why Your Inference Platform Matters As Much As Your Model
High signal Matched: inference, model
Modular · inference-infra · 2026-05-08
Why LLM Inference Needs a New Kind of Router - Part 1
High signal Matched: inference, router
Modular · inference-infra · 2026-04-13
TileTensor Part 1 - Safer, More Efficient GPU Kernels
High signal Matched: gpu
Modular · inference-infra · 2026-04-02
Day Zero Launch: Fastest Performance for Gemma 4 on NVIDIA and AMD
High signal Matched: performance, launch
Modular · inference-infra · 2026-03-30
Software Pipelining for GPU Kernels: Part 1 - The Pipeline Problem
High signal Matched: gpu
Modular · inference-infra · 2026-03-19
Modular 26.2: State-of-the-Art Image Generation and Upgraded AI Coding with Mojo
High signal Matched: generation
Modular · inference-infra · 2026-03-16
Modular at NVIDIA GTC 2026: MAX on Blackwell, Mojo Kernel Porting, and DeepSeek V3 on B200
High signal Matched: kernel, b200, blackwell
Modular · inference-infra · 2026-03-06
Modverse #53: Community Builds, Research Milestones, and a Growing Ecosystem
High signal Matched: research
Modular · inference-infra · 2026-03-05
Structured Mojo Kernels Part 1 - Peak Performance, Half the Code
High signal Matched: performance
Modular · inference-infra · 2026-01-14
How to Beat Unsloth's CUDA Kernel Using Mojo—With Zero GPU Experience
High signal Matched: kernel, cuda, gpu
Modular · inference-infra · 2025-11-20
Modular 25.7: Faster Inference, Safer GPU Programming, and a More Unified Developer Experience
High signal Matched: inference, gpu
Modular · inference-infra · 2025-11-07
"TTS 1 Max" (powered by Modular Platform) Ranked #1 Speech Model on Artificial Analysis
High signal Matched: model
Modular · inference-infra · 2025-10-17
Achieving State-of-the-Art Performance on AMD MI355 — in Just 14 Days
High signal Matched: performance
Modular · inference-infra · 2025-09-19
Matrix Multiplication on Blackwell: Part 4 - Breaking SOTA
High signal Matched: blackwell, sota
Modular · inference-infra · 2025-09-12
Matrix Multiplication on Blackwell: Part 3 - The Optimizations Behind 85% of SOTA Performance
High signal Matched: performance, blackwell, sota
Modular · inference-infra · 2025-09-05
Matrix Multiplication on Blackwell: Part 2 - Using Hardware Features to Optimize Matmul
High signal Matched: matmul, blackwell
Modular · inference-infra · 2025-08-28
Matrix Multiplication on Blackwell: Part 1 - Introduction
High signal Matched: blackwell
Modular · inference-infra · 2025-08-05
Modular Platform 25.5: Introducing Large Scale Batch Inference
High signal Matched: inference, introducing
Modular · inference-infra · 2025-07-31
SF Compute and Modular Partner to Revolutionize AI Inference Economics
High signal Matched: inference
Modular · inference-infra · 2025-06-10
Introducing Mammoth: Enterprise-Scale GenAI Deployments Made Simple
High signal Matched: introducing
Modular · inference-infra · 2025-06-10
Modular + AMD: Unleashing AI performance on AMD GPUs
High signal Matched: performance
Modular · inference-infra · 2025-05-29
Modverse #48: Modular Platform 25.3, MAX AI Kernels, and the Modular GPU Kernel Hackathon
High signal Matched: kernel, gpu
Modular · inference-infra · 2025-05-20
Modular GPU Kernel Hackathon Highlights: Innovation, Community, & Mojo🔥
High signal Matched: kernel, gpu
Modular · inference-infra · 2025-04-17
Modverse #47: MAX 25.2 and an evening of GPU programming at Modular HQ
High signal Matched: gpu
Modular · inference-infra · 2025-03-26
What about Triton and Python eDSLs? (Democratizing AI Compute, Part 7)
High signal Matched: triton
Modular · inference-infra · 2025-03-25
MAX 25.2: Unleash the power of your H200's–without CUDA!
High signal Matched: cuda, h200
Modular · inference-infra · 2025-03-05
What about OpenCL and CUDA C++ alternatives? (Democratizing AI Compute, Part 5)
High signal Matched: cuda
Modular · inference-infra · 2025-02-20
CUDA is the incumbent, but is it any good? (Democratizing AI Compute, Part 4)
High signal Matched: cuda
Modular · inference-infra · 2025-02-18
MAX 25.1 - Introducing MAX Builds
High signal Matched: introducing
Modular · inference-infra · 2025-02-12
How did CUDA succeed? (Democratizing AI Compute, Part 3)
High signal Matched: cuda
Modular · inference-infra · 2025-02-06
Paged Attention & Prefix Caching Now Available in MAX Serve
High signal Matched: serve, paged attention
Modular · inference-infra · 2025-02-05
What exactly is “CUDA”? (Democratizing AI Compute, Part 2)
High signal Matched: cuda
Modular · inference-infra · 2025-01-30
Agentic Building Blocks: Creating AI Agents with MAX Serve and OpenAI Function Calling
High signal Matched: serve, agents, agentic, function calling
Modular · inference-infra · 2024-12-17
Introducing MAX 24.6: A GPU Native Generative AI Platform
High signal Matched: gpu, introducing
Modular · inference-infra · 2024-12-17
MAX GPU: State of the Art Throughput on a New GenAI platform
High signal Matched: throughput, gpu, state of the art
Modular · inference-infra · 2024-12-17
Build a Continuous Chat Interface with Llama 3 and MAX Serve
High signal Matched: serve
Modular · inference-infra · 2024-09-13
MAX 24.5 - With SOTA CPU Performance for Llama 3.1
High signal Matched: performance, sota
Modular · inference-infra · 2024-07-09
Bring your own PyTorch model
High signal Matched: model
Modular · inference-infra · 2024-06-07
MAX 24.4 - Introducing quantization APIs and MAX on macOS
High signal Matched: introducing, quantization
Modular · inference-infra · 2024-05-29
What ownership is really about: a mental model approach
High signal Matched: model
Modular · inference-infra · 2024-05-02
MAX 24.3 - Introducing MAX Engine Extensibility
High signal Matched: introducing
Modular · inference-infra · 2024-04-10
Row-major vs. Column-major Matrices: A Performance Analysis in Mojo and NumPy
High signal Matched: performance
Modular · inference-infra · 2026-05-29
Three trends from MLSys 2026
Watchlist Matched: none
Modular · inference-infra · 2026-05-19
How I built a pure Mojo app (and 10 libraries) with AI agents
Watchlist Matched: agents
Modular · inference-infra · 2026-05-13
Translating to Mojo via AI Agents
Watchlist Matched: agents
Modular · inference-infra · 2026-05-07
Modular 26.3: Mojo 1.0 Beta, MAX Video Gen, and more
Watchlist Matched: none
Modular · inference-infra · 2026-05-04
Modverse #54: AMD AI DevDay, New Modular Offices, and a Community That Keeps Shipping
Watchlist Matched: none
Modular · inference-infra · 2026-04-16
How Frontier Coding Agents Built a Video Diffusion Pipeline on MAX
Watchlist Matched: agents
Modular · inference-infra · 2026-04-10
Modular Opens Edinburgh & San Francisco Offices
Watchlist Matched: none
Modular · inference-infra · 2026-04-03
Structured Mojo Kernels Part 4 - Portability and the Road Ahead
Watchlist Matched: none
Modular · inference-infra · 2026-03-31
Modverse #54: From GTC to Edinburgh, a Community Building Momentum
Watchlist Matched: none
Modular · inference-infra · 2026-03-26
Structured Mojo Kernels Part 3 - Composition in Practice
Watchlist Matched: none
Modular · inference-infra · 2026-03-11
Structured Mojo Kernels Part 2 - The Three Pillars
Watchlist Matched: none
Modular · inference-infra · 2026-02-18
The Claude C Compiler: What It Reveals About the Future of Software
Watchlist Matched: none
Modular · inference-infra · 2026-02-10
BentoML Joins Modular
Watchlist Matched: none
Modular · inference-infra · 2026-02-05
The Five Eras of KVCache
Watchlist Matched: none
Modular · inference-infra · 2026-01-29
Modular 26.1: A Big Step Towards More Programmable and Portable AI Infrastructure
Watchlist Matched: none
Modular · inference-infra · 2025-12-19
🔥 Modular 2025 Year in Review
Watchlist Matched: none
Modular · inference-infra · 2025-12-05
The path to Mojo 1.0
Watchlist Matched: none
Modular · inference-infra · 2025-12-03
Modverse #52: Advancing AI Together — Community Projects & Platform Milestones
Watchlist Matched: none
Modular · inference-infra · 2025-11-06
PyTorch and LLVM in 2025 — Keeping up With AI Innovation
Watchlist Matched: none
Modular · inference-infra · 2025-09-24
Modular Raises $250M to scale AI's Unified Compute Layer
Watchlist Matched: none
Modular · inference-infra · 2025-09-22
Modular 25.6: Unifying the latest GPUs from NVIDIA, AMD, and Apple
Watchlist Matched: none
Modular · inference-infra · 2025-09-19
Modverse #51: Modular x Inworld x Oracle, Modular Meetup Recap and Community Projects
Watchlist Matched: none
Modular · inference-infra · 2025-08-21
Modverse #50: Modular Platform 25.5, Community Meetups, and Mojo's Debut in the Stack Overflow Developer Survey
Watchlist Matched: none
Modular · inference-infra · 2025-07-16
AI Agents for AWS Marketplace
Watchlist Matched: agents
Modular · inference-infra · 2025-07-09
Modverse #49: Modular Platform 25.4, Modular 🤝 AMD, and Modular Hack Weekend
Watchlist Matched: none
Modular · inference-infra · 2025-07-03
Inside Modular Hack Weekend: Top Projects and Community Highlights
Watchlist Matched: none
Modular · inference-infra · 2025-06-20
How is Modular Democratizing AI Compute? (Democratizing AI Compute, Part 11)
Watchlist Matched: none
Modular · inference-infra · 2025-06-18
Modular 25.4: One Container, AMD and NVIDIA GPUs, No Lock-In
Watchlist Matched: none
Modular · inference-infra · 2025-05-27
Exploring Metaprogramming in Mojo
Watchlist Matched: none
Modular · inference-infra · 2025-05-08
Modular’s bet to break out of the Matrix (Democratizing AI Compute, Part 10)
Watchlist Matched: none
Modular · inference-infra · 2025-05-06
Modular Platform 25.3: 450K+ Lines of Open Source Code and pip Packaging
Watchlist Matched: open source
Modular · inference-infra · 2025-04-23
A New, Simpler License for MAX and Mojo
Watchlist Matched: none
Modular · inference-infra · 2025-04-22
Why do HW companies struggle to build AI software? (Democratizing AI Compute, Part 9)
Watchlist Matched: none
Modular · inference-infra · 2025-04-08
What about the MLIR compiler infrastructure? (Democratizing AI Compute, Part 8)
Watchlist Matched: none
Modular · inference-infra · 2025-03-12
What about TVM, XLA, and AI compilers? (Democratizing AI Compute, Part 6)
Watchlist Matched: none
Modular · inference-infra · 2025-02-27
Modverse #46: MAX 25.1, MAX Builds, and Democratizing AI Compute
Watchlist Matched: none
Modular · inference-infra · 2025-01-30
DeepSeek's Impact on AI (Democratizing AI Compute, Part 1)
Watchlist Matched: none
Modular · inference-infra · 2025-01-23
Use MAX with Open WebUI for RAG and Web Search
Watchlist Matched: rag
Modular · inference-infra · 2025-01-21
Hands-on with Mojo 24.6
Watchlist Matched: none
Modular · inference-infra · 2024-12-19
Evaluating Llama Guard with MAX 24.6 and Hugging Face
Watchlist Matched: evaluating
Modular · inference-infra · 2024-10-25
Understanding SIMD: Infinite Complexity of Trivial Problems
Watchlist Matched: none
Modular · inference-infra · 2024-10-10
Community Spotlight: Writing Mojo with Cursor
Watchlist Matched: none
Modular · inference-infra · 2024-10-01
Hands-on with Mojo 24.5
Watchlist Matched: none
Modular · inference-infra · 2024-07-23
Announcing stack-pr: an open source tool for managing stacked PRs on GitHub
Watchlist Matched: open source
Modular · inference-infra · 2024-07-16
Debugging in Mojo🔥
Watchlist Matched: none
Modular · inference-infra · 2024-07-09
Take control of your AI
Watchlist Matched: none
Modular · inference-infra · 2024-07-09
Develop locally, deploy globally
Watchlist Matched: none
Modular · inference-infra · 2024-07-03
A brief guide to the Mojo n-body example
Watchlist Matched: none
Modular · inference-infra · 2024-06-25
What's new in MAX 24.4? MAX on macOS, fast local Llama3, native quantization and GGUF support
Watchlist Matched: quantization, gguf
Modular · inference-infra · 2024-06-17
What’s new in Mojo 24.4? Improved collections, new traits, os module features and core language enhancements
Watchlist Matched: none
Modular · inference-infra · 2024-06-04
Deep dive into ownership in Mojo
Watchlist Matched: none
Modular · inference-infra · 2024-05-20
Fast⚡k-means clustering in Mojo🔥: a guide to porting Python to Mojo🔥 for accelerated k-means clustering
Watchlist Matched: none
Modular · inference-infra · 2024-05-08
Developer Voices: Deep Dive with Chris Lattner on Mojo
Watchlist Matched: none
Modular · inference-infra · 2024-05-02
What’s New in Mojo 24.3: Community Contributions, Pythonic Collections and Core Language Enhancements
Watchlist Matched: none
Modular · inference-infra · 2024-04-02
What’s new in Mojo 24.2: Mojo Nightly, Enhanced Python Interop, OSS stdlib and more
Watchlist Matched: oss
Modular · inference-infra · 2024-03-28
The Next Big Step in Mojo🔥 Open Source
Watchlist Matched: open source