agents

This post walks through how Baz built their Spec Review agent using Amazon Bedrock and Amazon Bedrock AgentCore. We'll cover the architecture decisions, implementation details, and the business outcomes they achieved by leveraging these AW...

cloud agents

Open

High signal Matched: bedrock, agent

NVIDIA Technical Blog · hardware · 2026-06-02

Deploy Self-Evolving Agents for Faster, More Secure Research with a Hermes Agent and NVIDIA NemoClaw

Score 15

AI agents are a powerful tool for synthesizing data to accelerate research, summarize information, and help teams make decisions faster. But combining internal...

research agents

Open

High signal Matched: research, agent, agents

AWS Machine Learning Blog · cloud · 2026-06-02

OpenAI models and Codex on Amazon Bedrock are now generally available

Score 13

GPT-5.5, GPT-5.4, and Codex are now generally available on Amazon Bedrock. Deploy them in production applications and agents today, on Bedrock’s high-performance inference engine.

inference benchmark cloud agents

Open

High signal Matched: inference, performance, bedrock, agents

AWS Machine Learning Blog · cloud · 2026-06-02

Extending MCP support for Amazon Bedrock AgentCore Gateway

Score 11

While deploying Model Context Protocol (MCP) servers in production, enterprises need fine-grained access control across servers, observability into which teams use which tools, security guarantees against data exfiltration, and centralized...

model-release cloud agents

Open

High signal Matched: model, bedrock, mcp

AWS Machine Learning Blog · cloud · 2026-06-02

Secure AI agents with Policy and Lambda interceptors in Amazon Bedrock AgentCore gateway

Score 9

In this post, we use a lakehouse data agent to demonstrate how you can use Policy for deterministic access control and Lambda interceptors for dynamic validation. We then show how to combine Lambda interceptors and Policy to implement a ge...

cloud agents

Open

High signal Matched: bedrock, agent, agents

AWS Machine Learning Blog · cloud · 2026-06-02

Enable safe agentic payments with built-in guardrails using Amazon Bedrock AgentCore payments

Score 9

In this post, we examine several key risks that surface when designing an agentic payment system and show how to address them with the capabilities of AgentCore payments.

cloud agents

Open

High signal Matched: bedrock, agentic

AWS Machine Learning Blog · cloud · 2026-06-02

AgentOps: Operationalize agentic AI at scale with Amazon Bedrock AgentCore

Score 9

When you build agentic AI solutions, you face unique operational challenges. Agents make unpredictable decisions, costs spiral unexpectedly, and debugging non-deterministic failures seems impossible. Agentic AI applications don't just exec...

cloud agents

Open

High signal Matched: bedrock, agents, agentic

vLLM Project · open-source · 2026-06-02

Session-Aware Agentic Routing: Continuity-Aware Model Selection for Long-Horizon LLM Agents

Score 15

Long-horizon LLM agents create a routing problem that single-turn prompt routers were not designed to solve. A router still needs to know which model is best for the current request, but it also...

moe model-release agents

Open

High signal Matched: router, model, agents, agentic

NVIDIA Technical Blog · hardware · 2026-06-01

Run Local AI Agents with Faster Models and Multi-Node Clustering on NVIDIA DGX Spark

Score 13

The rise of autonomous, long-running AI agents has introduced a new class of compute demand, namely tasks that maintain large context windows, spawn concurrent...

distributed agents

Open

High signal Matched: multi-node, agents

Lambda · cloud · 2026-06-01

Unbox one of NVIDIA's first co-packaged optics switches with us. See why we bet on CPO early.

Score 15

When we design large GPU clusters, the network is no longer a background system. It's part of the compute envelope. At the 800G and NVIDIA GB300 NVL72 scale, the back-end fabric accounts for 86% of networking power in a three-layer cluster...

inference serving distributed benchmark hardware model-release rag agents

Open

High signal Matched: generation, token generation, throughput, infiniband, gpu, model, retrieval, agentic

NVIDIA Technical Blog · hardware · 2026-06-01

NVIDIA Vera CPU Sets a New Standard for Agentic Workloads in AI Factories

Score 11

Each wave of AI has created a new scaling law. Pretraining scaled intelligence through larger datasets, more parameters, and massively parallel GPU systems....

hardware training agents

Open

High signal Matched: gpu, pretraining, agentic

AMD ROCm Blogs · hardware · 2026-06-01

Out-of-the-Box ROLL Support on AMD GPUs: Accelerating Reinforcement Learning at Scale

Score 13

Reinforcement learning (RL) is rapidly becoming a foundational technology for Large Language Models (LLMs)—powering key abilities such as reasoning and agentic behaviors. As RL workloads grow more complex and computationally intensive, the...

benchmark hardware agents

Open

High signal Matched: performance, gpu, agentic

AWS Machine Learning Blog · cloud · 2026-05-29

Evaluating Deep Agents using LangSmith on AWS

Score 9

This post combines learnings from LangChain’s work on evaluating deep agents and Anthropic’s guide to demystifying evals for AI agents into a practical guide. In this post, you will learn how to: 1) apply five evaluation patterns for deep...

research cloud evals agents

Open

High signal Matched: evaluation, bedrock, evals, evaluating, agent, agents

AWS Machine Learning Blog · cloud · 2026-05-29

Build a test suite that grows with your agent with dataset management in Amazon Bedrock AgentCore

Score 13

Datasets in AgentCore is in public preview. Agent evaluation is most powerful when you combine fast-moving online signals with stable offline baselines. To understand whether your agent is truly improving over time, you need a fixed benchm...

benchmark research cloud evals agents

Open

High signal Matched: benchmark, evaluation, bedrock, agent

AWS Machine Learning Blog · cloud · 2026-05-29

Claude Opus 4.8 is now available on AWS

Score 11

This post covers Opus 4.8's improvements and practical guidance for AI engineers integrating the model into agentic systems and production inference workloads on Amazon Bedrock.

inference model-release cloud agents

Open

High signal Matched: inference, model, bedrock, agentic

PyTorch Foundation · open-source · 2026-05-28

Up to 580tps! New Speed Record of Qwen3.5-397B-A17B on GPU for Agentic Workloads with TokenSpeed

Score 17

TL;DR: The TokenSpeed inference engine achieved a record-breaking 580 tps running the Qwen3.5-397B-A17B model on GPUs. This extreme performance for agentic workloads is driven by systematic elimination of memory copies,...

inference benchmark hardware model-release agents

Open

High signal Matched: inference, performance, gpu, model, agentic

vLLM Project · open-source · 2026-05-28

Accelerating Laguna XS.2 Inference with vLLM, Speculators, and LLM Compressor

Score 15

As organizations increasingly adopt AI-powered development tools, the need for high-performance agentic models that deliver both accuracy and operational efficiency has become critical. Laguna...

inference benchmark agents

Open

High signal Matched: inference, performance, agentic

Modal · inference-infra · 2026-05-27

Role-Based Access Control for humans and agents

Score 9

Introducing Role-Based Access Control for humans and agents, now available for all users on Teams and Enterprise plans.

model-release agents

Open

High signal Matched: introducing, agents

NVIDIA Technical Blog · hardware · 2026-05-20

Add a Specialized Deep Research Skill to Agent Harnesses

Score 12

Agent harnesses like Claude Code, Codex, and LangChain Deep Agents are excellent orchestrators. They manage sessions, chain tools, execute code, and respond to...

research agents

Open

High signal Matched: research, agent, agents

NVIDIA Technical Blog · hardware · 2026-05-19

NVIDIA-Verified Agent Skills Provide Capability Governance for AI Agents

Score 10

Autonomous AI agents are becoming more capable. Open models, Model Context Protocol (MCP)-connected tools, and portable skills are also making agents easier to...

model-release agents

Open

High signal Matched: model, agent, agents, mcp

NVIDIA Technical Blog · hardware · 2026-05-19

Mastering Agentic Techniques: AI Agent Evaluation

Score 16

Evaluating an AI model and evaluating an AI agent are related—but they answer fundamentally different questions. A model benchmark tests the capability of a...

benchmark model-release research evals agents

Open

High signal Matched: benchmark, model, evaluation, evaluating, agent, agentic

Together AI · inference-infra · 2026-05-19

Benchmarking inference at scale: coding agents

Score 16

Real-world inference benchmarks for coding agents: 31% more TPS than TensorRT-LLM, 2× better TTFT at saturation, and 76% lower cost than Claude Opus 4.6.

inference benchmark evals agents

Open

High signal Matched: inference, ttft, cost, benchmarks, agents

Modal · inference-infra · 2026-05-19

Introducing Claude Managed Agents with Modal Sandboxes

Score 10

No feed summary available yet.

model-release agents

Open

High signal Matched: introducing, agents

NVIDIA Technical Blog · hardware · 2026-05-14

How the NVIDIA Vera Rubin Platform is Solving Agentic AI’s Scale-Up Problem

Score 12

Agentic inference has fundamentally changed the runtime dynamics of inference workloads by introducing non-deterministic trajectories—actions, observations,...

inference model-release agents

Open

High signal Matched: inference, introducing, agentic

LMCache · open-source · 2026-05-13

Benchmarking LMCache for Multi-Turn Agentic Workloads on AMD MI300X

Score 20

A practitioner’s guide to KV-cache tiering on ROCm — what works, what doesn’t, and the regime where it actually matters. Key Summary We benchmarked multi-turn agentic workloads using 739 anonymized Claude Code conversation trac...

kv-cache moe hardware model-release quantization agents

Open

High signal Matched: lmcache, moe, mi300x, rocm, fp8, agentic

Nota AI · korea · 2026-05-11

[NetsPresso® x AI Agents] Easier to Use, Even More Powerful

Score 52

Jaehoon Lee, Technical Content Manager, Nota AI. NetsPresso® now embraces AI agents. An easy-to-use interface sits on top of the validated pipeline that handles everything from model compression to device deployment. When a user...

inference serving kernel speculative-decoding moe benchmark hardware model-release research quantization evals agents api

Open

High signal Matched: inference, endpoint, kernel, verification, moe, benchmark, latency, cost, gpu, release, model, evaluation, quantization, quantized, int4, evaluate, benchmarks, swe-bench, mmlu, agent, agents, api

BAIR · research · 2026-05-08

Adaptive Parallel Reasoning: The Next Paradigm in Efficient Inference Scaling

Score 28

inference serving kv-cache speculative-decoding benchmark model-release research training fine-tuning evals long-context agents frontier-model

Open

High signal Matched: inference, decoding, prefill, generation, serve, throughput, kv cache, verification, performance, latency, cost, model, paper, research, evaluation, training, pretraining, sft, benchmarks, long context, context window, agentic, reasoning model

NVIDIA Technical Blog · hardware · 2026-05-08

Improving Bash Generation in Small Language Models with Grammar-Constrained Decoding

Score 20

Bash is one of the most flexible and powerful interfaces exposed to AI agents. In the right system, a model that emits grep, curl, tar, or a shell pipeline is...

inference model-release agents

Open

High signal Matched: decoding, generation, model, agents

vLLM Project · open-source · 2026-05-06

Serving Agentic Workloads at Scale with vLLM x Mooncake

Score 18

TL;DR: Agentic workloads generate massive shared prefixes that are often recomputed across turns. By integrating Mooncake's distributed KV cache store into vLLM, we achieve 3.8x higher throughput,...

inference serving distributed kv-cache benchmark agents

Open

High signal Matched: serving, throughput, distributed, kv cache, agentic

NVIDIA Technical Blog · hardware · 2026-05-05

How to Build In-Vehicle AI Agents with NVIDIA: From Cloud to Car

Score 12

The automotive cockpit is undergoing a fundamental shift from rule-based interfaces to agentic, multimodal AI systems capable of reasoning, planning, and...

cloud agents

Open

High signal Matched: cloud, agents, agentic

NVIDIA Technical Blog · hardware · 2026-04-30

Automating GPU Kernel Translation with AI Agents: cuTile Python to cuTile.jl

Score 20

NVIDIA CUDA Tile (cuTile) is a tile-based programming model that enables developers to write GPU kernels in terms of tile-level operations—loads, stores, and...

kernel cuda hardware model-release agents

Open

High signal Matched: kernel, cuda, gpu, model, agents

Nota AI · korea · 2026-04-29

[NVIDIA Nemotron Hackathon] Grand Prize Among 20 Teams: Behind Two Sleepless Days

Score 32

Hancheol Park, Ph.D., AI Research Engineer, NetsPresso Tech, Nota AI; Geonmin Kim, Ph.D., AI Research Engineer, NetsPresso Tech, Nota AI; Geonho Lee, Edge AI Engineer Intern, NetsPresso Tech, Nota AI; Jaehoon Lee, Technical Content Manager,...

inference moe benchmark model-release research korea training fine-tuning quantization evals agents

Open

High signal Matched: generation, moe, performance, model, weights, paper, research, evaluation, korea, korean, seoul, naver, training, fine-tuning, quantization, agent, agents, agentic

Together AI · inference-infra · 2026-04-29

DeepSeek-V4 Pro now available on Together AI

Score 10

DeepSeek-V4 Pro is now available on Together AI with 512K context, controllable reasoning modes, and cached-input pricing for long-context reasoning workloads like code agents, document intelligence, and research synthesis.

research long-context agents

Open

High signal Matched: research, long-context, agents

Hugging Face · open-source · 2026-04-29

Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents

Score 10

No feed summary available yet.

model-release long-context agents

Open

High signal Matched: introducing, long-context, agents

NVIDIA Technical Blog · hardware · 2026-04-28

NVIDIA Nemotron 3 Nano Omni Powers Multimodal Agent Reasoning in a Single Efficient Open Model

Score 16

Agentic systems often reason across screens, documents, audio, video, and text within a single perception‑to‑action loop. However, they still rely on...

model-release agents open-source

Open

High signal Matched: model, open model, agent, agentic

Together AI · inference-infra · 2026-04-28

Together AI Brings NVIDIA Nemotron 3 Nano Omni to Developers on Day 0

Score 12

NVIDIA Nemotron 3 Nano Omni is now on Together AI: a single open model that reasons across video, images, audio, and text, built for agentic workloads at scale.

model-release agents open-source

Open

High signal Matched: model, open model, agentic

vLLM Project · open-source · 2026-04-28

Run Highly Efficient Multimodal Agentic AI with NVIDIA Nemotron 3 Nano Omni Using vLLM

Score 10

We are excited to support the newly released NVIDIA Nemotron 3 Nano Omni model on vLLM.

model-release agents

Open

High signal Matched: model, agentic

Sakana AI · model-lab · 2026-04-24

Sakana Fugu: A Multi-Agent Orchestration System as a Foundation Model

Score 9

No feed summary available yet.

model-release agents

Open

High signal Matched: model, agent

NVIDIA Technical Blog · hardware · 2026-04-20

Mitigating Indirect AGENTS.md Injection Attacks in Agentic Environments

Score 10

AI tools are significantly accelerating software development and changing how developers work with code. These tools serve as real-time copilots, automating...

serving agents

Open

High signal Matched: serve, agents, agentic

NVIDIA Technical Blog · hardware · 2026-04-17

Full-Stack Optimizations for Agentic Inference with NVIDIA Dynamo

Score 12

Coding agents are starting to write production code at scale. Stripe’s agents generate 1,300+ PRs per week. Ramp attributes 30% of merged PRs to agents....

inference agents

Open

High signal Matched: inference, agents, agentic

Modal · inference-infra · 2026-04-14

Autoscaling Autoresearch: Give your agents elastic GPUs on Modal

Score 10

Autoresearch automates AI research. Modal automates AI infrastructure.

research agents

Open

High signal Matched: research, agents

NVIDIA Technical Blog · hardware · 2026-04-12

MiniMax M2.7 Advances Scalable Agentic Workflows on NVIDIA Platforms for Complex AI Applications

Score 12

The release of MiniMax M2.7 adds enhancements to the popular MiniMax M2.5 model, built for agentic harnesses,...

model-release agents

Open

High signal Matched: release, model, agentic

SkyPilot · open-source · 2026-04-10

SkyPilot Agent Skill: Let Agents Manage Your GPUs

Score 10

With the SkyPilot Agent Skill, your AI coding agent can launch clusters, run training jobs and manage cloud resources across any infrastructure using natural language.

model-release cloud training agents

Open

High signal Matched: launch, cloud, training, agent, agents

Google Research · big-tech · 2026-04-09

Improving the academic workflow: Introducing two AI agents for better figures and peer review

Score 8

Generative AI

model-release agents

Open

High signal Matched: introducing, agents

SkyPilot · open-source · 2026-04-09

Research-Driven Agents: What Happens When Your Agent Reads Before It Codes

Score 16

Coding agents working from code alone generate shallow hypotheses. Adding a research phase — arxiv papers, competing forks, other backends — produced 5 kernel fusions that made llama.cpp CPU inference 15% faster.

inference kernel research agents

Open

High signal Matched: inference, kernel, arxiv, research, agent, agents

LMCache · open-source · 2026-04-04

LMCache’s New Architecture Boosts MoE Inference Performance by 10×

Score 34

Modern LLM serving workloads are defined by strict latency requirements, high concurrency, and rapidly growing context lengths. Applications such as multi-turn chat, AI agents, and retrieval-augmented generation continuously build on prior...

inference serving kv-cache moe benchmark rag agents

Open

High signal Matched: inference, serving, decoding, generation, throughput, lmcache, moe, performance, latency, ttft, retrieval-augmented generation, retrieval, agents

Together AI · inference-infra · 2026-04-02

Deepgram speech-to-text and voice models now available natively on Together AI

Score 14

Production STT and TTS from Deepgram, available on Together AI Dedicated Model Inference for real-time voice agents.

inference model-release agents

Open

High signal Matched: inference, model, agents

Nota AI · korea · 2026-03-31

The Real Reason TurboQuant Shook the Market: AI Optimization Has Gone Mainstream

Score 46

Jaehoon Lee, Technical Content Manager, Nota AI. In March, a single official announcement from Google Research rocked trillions of won in the market capitalization of U.S. infrastructure and semiconductor stocks. The catalyst:...

inference serving kv-cache benchmark hardware model-release research training fine-tuning quantization agents frontier-model

Open

High signal Matched: inference, serving, generation, throughput, kv cache, benchmark, performance, cost, b200, blackwell, introducing, model, fp8, research, training, fine-tuning, quantization, quantized, agent, agentic, frontier model

Nota AI · korea · 2026-03-23

[GTC 2026 Recap] The Trillion-Dollar Inference Race Begins: How Nota AI Fills the Gap

Score 42

Jaehoon Lee, Technical Content Manager, Nota AI. GTC has evolved far beyond a technology conference, drawing attention from global economies and financial markets alike. This year, CEO Jensen Huang took the stage in his tradema...

inference serving kernel cuda kv-cache benchmark hardware model-release research cloud training long-context agents open-source

Open

High signal Matched: inference, prefill, generation, throughput, cuda, kv cache, performance, latency, cost, gpu, npu, launch, model, research, cloud, training, long-context, context window, agent, agents, agentic, open-source

SkyPilot · open-source · 2026-03-19

Scaling Karpathy's Autoresearch: What Happens When the Agent Gets a GPU Cluster

Score 10

Karpathy's autoresearch runs one experiment at a time. We gave it access to our GPU infra and let it run experiments in parallel.

hardware agents

Open

High signal Matched: gpu, agent

Hugging Face · open-source · 2026-03-17

Holotron-12B - High Throughput Computer Use Agent

Score 10

No feed summary available yet.

serving benchmark agents

Open

High signal Matched: throughput, agent, computer use

NVIDIA Technical Blog · hardware · 2026-03-16

How NVIDIA Dynamo 1.0 Powers Multi-Node Inference at Production Scale

Score 16

Reasoning models are growing rapidly in size and are increasingly being integrated into agentic AI workflows that interact with other models and external tools....

inference distributed agents

Open

High signal Matched: inference, multi-node, agentic

NVIDIA Technical Blog · hardware · 2026-03-16

Introducing NVIDIA BlueField-4-Powered CMX Context Memory Storage Platform for the Next Frontier of AI

Score 12

AI‑native organizations increasingly face scaling challenges as agentic AI workflows drive context windows to millions of tokens and models scale toward...

model-release agents

Open

High signal Matched: introducing, agentic

Together AI · inference-infra · 2026-03-16

Together AI at NVIDIA GTC 2026: Explore our latest innovations across research and products

Score 14

Together AI arrives at NVIDIA GTC 2026 with new launches in inference, agents, voice AI, and open models — plus technical sessions from its research and engineering leaders.

inference research agents

Open

High signal Matched: inference, research, agents

Together AI · inference-infra · 2026-03-12

Build real-time voice agents on Together AI

Score 10

Build real-time voice agents on Together AI with co-located STT, LLM, and TTS infrastructure, native Deepgram and Cartesia support, and end-to-end latency under 500ms.

benchmark agents

Open

High signal Matched: latency, agents

SqueezeBits · korea · 2026-03-11

Reliable & Scalable Synthetic Data for Physical AI (Part 2): Making Cosmos 3.1x Faster for Production

Score 12

Explore why Physical AI deployment needs synthetic data at scale with SqueezeBits' research and discover how to overcome inference bottlenecks to accelerate RoBoost Agent.

inference research agents

Open

High signal Matched: inference, research, agent

Together AI · inference-infra · 2026-03-11

Together AI Brings NVIDIA Nemotron 3 to Developers on Day 0

Score 10

NVIDIA Nemotron 3 Super is now available on Together AI Dedicated Inference, delivering efficient multi-agent reasoning, a 1M-token context window, and production-grade deployment on managed infrastructure.

inference long-context agents

Open

High signal Matched: inference, context window, agent

vLLM Project · open-source · 2026-03-11

Run Highly Efficient and Accurate Multi-Agent AI with NVIDIA Nemotron 3 Super Using vLLM

Score 10

We are excited to support the newly released NVIDIA Nemotron 3 Super model on vLLM.

model-release agents

Open

High signal Matched: model, agent

SkyPilot · open-source · 2026-02-27

Don't Run OpenClaw on Your Main Machine

Score 8

OpenClaw gives an AI agent full access to your system. Here's why you should run it on an isolated cloud VM, and how to set that up.

cloud agents

Open

High signal Matched: cloud, agent

SqueezeBits · korea · 2026-02-25

Reliable & Scalable Synthetic Data for Physical AI (Part 1): Taming NVIDIA Cosmos with RoBoost Agent

Score 10

Scaling Physical AI requires reliable synthetic data. Learn how RoBoost Agent integrates NVIDIA Cosmos to transform world models into trustworthy data engines for robotics and autonomous driving.

agents

Open

High signal Matched: agent

Together AI · inference-infra · 2026-01-26

DSGym: A holistic framework for evaluating and training data science agents

Score 18

Introducing DSGym, a holistic evaluation and training framework for LLM-based data science agents. Features 90+ bioinformatics tasks, 92 Kaggle competitions, and synthetic trajectory generation. Our 4B model achieves state-of-the-art perform...

inference benchmark model-release research training evals agents open-source

Open

High signal Matched: generation, performance, introducing, model, evaluation, training, evaluating, agents, open-source

Together AI · inference-infra · 2026-01-13

Learn how Cursor partnered with Together AI to deliver real-time, low-latency inference at scale

Score 24

Together AI teamed with Cursor to build the real-time inference stack that keeps in-editor agents fast and reliable. They productionized NVIDIA Blackwell (B200/GB200), tuning ARM hosts, kernels, and FP4/TensorRT quantization for low latenc...

inference benchmark hardware model-release quantization agents

Open

High signal Matched: inference, latency, b200, gb200, blackwell, model, quantization, agents

vLLM Project · open-source · 2025-12-15

Run Highly Efficient and Accurate AI Agents with NVIDIA Nemotron 3 Nano on vLLM

Score 10

Jan 28th Update: NVIDIA just released their Nemotron 3 Nano model in NVFP4 precision. This model is supported by vLLM out of the box and it uses a new method called Quantization-Aware Distillation...

model-release quantization agents

Open

High signal Matched: model, quantization, agents

Together AI · inference-infra · 2025-12-03

Together AI and Meta partner to bring PyTorch Reinforcement Learning to the AI Native Cloud

Score 12

Build, train, and deploy advanced AI agents with integrated reinforcement learning on the Together platform.

cloud agents

Open

High signal Matched: cloud, agents

Together AI · inference-infra · 2025-11-04

Announcing the fastest inference for realtime voice AI agents

Score 14

Together AI launches the fastest voice AI stack: streaming Whisper STT, serverless open-source TTS (Orpheus & Kokoro), and Voxtral transcription. Sub-second latency for production voice agents.

inference benchmark agents open-source

Open

High signal Matched: inference, latency, agents, open-source

Hugging Face · open-source · 2025-10-23

Building the Open Agent Ecosystem Together: Introducing OpenEnv

Score 10

No feed summary available yet.

model-release agents

Open

High signal Matched: introducing, agent

Google Research · big-tech · 2025-09-25

Towards better health conversations: Research insights on a “wayfinding” AI agent based on Gemini

Score 8

Generative AI

research agents

Open

High signal Matched: research, agent

Together AI · inference-infra · 2025-08-21

How Together AI Uses AI Agents to Automate Complex Engineering Tasks: Lessons from Developing Efficient LLM Inference Systems

Score 16

Build AI agents for complex, long-running engineering tasks. Learn key patterns from a case study: accelerating LLM inference with speculative decoding.

inference speculative-decoding agents

Open

High signal Matched: inference, decoding, speculative decoding, agents

Hugging Face · open-source · 2025-08-18

MCP for Research: How to Connect AI to Research Tools

Score 10

No feed summary available yet.

research agents

Open

High signal Matched: research, mcp

SkyPilot · open-source · 2025-08-12

Self-host open-source LLM agent sandbox on your own cloud

Score 10

Your AI writes code. Now what? If you’re building AI agents in 2025, you probably wondered that as well. Your LLM generates some Python code that analyzes data, manipulates files, or calls APIs. But where does it run? Most people eit...

cloud agents open-source

Open

High signal Matched: cloud, agent, agents, open-source

Together AI · inference-infra · 2025-07-25

Qwen3-Coder: The Most Capable Agentic Coding Model Now Available on Together AI

Score 12

Unlock agentic coding with Qwen3-Coder on Together AI: 256K context, SWE-bench rivaling Claude Sonnet 4, zero-setup instant deployment.

model-release evals agents

Open

High signal Matched: model, swe-bench, agentic

Together AI · inference-infra · 2025-07-14

Kimi K2: Leading Open-Source Model Now Available on Together AI

Score 16

Run Kimi K2 (1T params) on Together AI—frontier open model for agentic reasoning and coding, serverless deployment, 99.9% SLA, lower cost and instant scaling.

benchmark model-release agents open-source

Open

High signal Matched: cost, model, open model, agentic, open-source

BAIR · research · 2025-07-01

Whole-Body Conditioned Egocentric Video Prediction

Score 10

inference benchmark model-release research training evals agents

Open

High signal Matched: inference, generation, performance, model, paper, arxiv, evaluation, training, evaluate, agent, agents

Hugging Face · open-source · 2025-06-06

ScreenSuite - The most comprehensive evaluation suite for GUI Agents!

Score 10

No feed summary available yet.

research evals agents

Open

High signal Matched: evaluation, agents

AIBrix · open-source · 2025-02-21

Introducing AIBrix: Cost-Effective and Scalable Control Plane for vLLM

Score 26

Open-source large language models (LLMs) such as LLaMA, DeepSeek, Qwen, and Mistral have surged in popularity, offering enterprises greater flexibility, cost savings, and control over their AI deployments. These models have empowered organ...

inference benchmark model-release agents open-source

Open

High signal Matched: inference, generation, latency, cost, introducing, model, agents, open-source

AIBrix · open-source · 2025-02-19

AIBrix v0.2.0 Release: Distributed KV Cache, Orchestration and Heterogeneous GPU Support

Score 42

We’re excited to announce the v0.2.0 release of AIBrix! Building on feedback from v0.1.0 production adoption and user interest, this release introduces several new features to enhance performance and usability. Extend the vLLM Prefix...

inference serving distributed kv-cache benchmark hardware model-release agents

Open

High signal Matched: inference, serving, prefill, throughput, distributed, multi-node, kv cache, prefix cache, performance, cost, gpu, accelerator, release, agent

Hugging Face · open-source · 2025-02-04

DABStep: Data Agent Benchmark for Multi-step Reasoning

Score 10

No feed summary available yet.

benchmark agents

Open

High signal Matched: benchmark, agent

Modular · inference-infra · 2025-01-30

Agentic Building Blocks: Creating AI Agents with MAX Serve and OpenAI Function Calling

Score 10

serving agents

Open

High signal Matched: serve, agents, agentic, function calling

Hugging Face · open-source · 2024-12-31

Introducing smolagents: simple agents that write actions in code.

Score 10

No feed summary available yet.

model-release agents

Open

High signal Matched: introducing, agents

Hugging Face · open-source · 2024-07-01

Our Transformers Code Agent beats the GAIA benchmark 🏅

Score 10

No feed summary available yet.

benchmark agents

Open

High signal Matched: benchmark, agent

Hugging Face · open-source · 2024-05-13

License to Call: Introducing Transformers Agents 2.0

Score 10

No feed summary available yet.

model-release agents

Open

High signal Matched: introducing, agents

Hugging Face · open-source · 2023-07-24

Introducing Agents.js: Give tools to your LLMs using JavaScript

Score 10

No feed summary available yet.

model-release agents

Open

High signal Matched: introducing, agents

Hugging Face · open-source · 2023-02-07

Introducing ⚔️ AI vs. AI ⚔️ a deep reinforcement learning multi-agents competition system

Score 10

No feed summary available yet.

model-release agents

Open

High signal Matched: introducing, agents

Hugging Face · open-source · 2021-12-02

Introducing Snowball Fight ☃️, our first ML-Agents environment

Score 10

No feed summary available yet.

model-release agents

Open

High signal Matched: introducing, agents

Prime Intellect · inference-infra · 2026-06-03

General Agent: A Self-Evolving, Synthetic Agent Environment

Score 6

No feed summary available yet.

agents

Open

Watchlist Matched: agent

Prime Intellect · inference-infra · 2026-06-03

renderers: Token-Level Templating for Agentic RL

Score 6

No feed summary available yet.

agents

Open

Watchlist Matched: agentic

Prime Intellect · inference-infra · 2026-06-03

Releasing Lab: the training platform for self-improving agents

Score 6

No feed summary available yet.

training agents

Open

Watchlist Matched: training, agents

Runpod · cloud · 2026-06-03

Agents: Deploy AI agents that run, react, and scale instantly.

Score 2

No feed summary available yet.

agents

Open

Watchlist Matched: agents

Nebius · cloud · 2026-06-03

Nebius and Tavily: Bringing agentic search into the production AI stack

Score 0

No feed summary available yet.

agents

Open

Watchlist Matched: agentic

Nebius · cloud · 2026-06-03

Nebius and LangChain partner to power production-grade AI agents on open models

Score 0

No feed summary available yet.

agents

Open

Watchlist Matched: agents

FriendliAI · inference-infra · 2026-06-03

Kimi K2.6 Meets FriendliAI: Frontier Agentic AI, Deployed in One Click

Score 6

No feed summary available yet.

agents

Open

Watchlist Matched: agentic

FriendliAI · inference-infra · 2026-06-03

Score 6

No feed summary available yet.

agents

Open

Watchlist Matched: agents

Moonshot AI Kimi · model-lab · 2026-06-03

Coding with Kimi K2: Top 6 Agents & Setup Guides

Score 5

No feed summary available yet.

agents

Open

Watchlist Matched: agents

Moonshot AI Kimi · model-lab · 2026-06-03

Kimi K2: Open Agentic Intelligence

Score 5

No feed summary available yet.

agents

Open

Watchlist Matched: agentic

Mistral AI · model-lab · 2026-06-03

Studio: Build, test, and run AI agents and apps.

Score 5

No feed summary available yet.

agents

Open

Watchlist Matched: agents

Mistral AI · model-lab · 2026-06-03

Vibe: AI agent for long-horizon work.

Score 5

No feed summary available yet.

agents

Open

Watchlist Matched: agent

Mistral AI · model-lab · 2026-06-03

Vibe for code: Coding agents in the terminal, IDE, and background.

Score 5

No feed summary available yet.

agents

Open

Watchlist Matched: agents

Anthropic · model-lab · 2026-06-03

How we contain Claude across products

Score 5

No feed summary available yet.

agents

Open

Watchlist Matched: agents

Anthropic · model-lab · 2026-06-03

Scaling Managed Agents: Decoupling the brain from the hands

Score 5

No feed summary available yet.

agents

Open

Watchlist Matched: agents

Anthropic · model-lab · 2026-06-03

Quantifying infrastructure noise in agentic coding evals

Score 5

No feed summary available yet.

agents

Open

Watchlist Matched: agentic

Stanford CRFM · research · 2026-06-03

BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems

Score 2

No feed summary available yet.

agents

Open

Watchlist Matched: agent

Hugging Face · open-source · 2026-06-03

Adding MCP Tools to Reachy Mini

Score 4

No feed summary available yet.

agents

Open

Watchlist Matched: mcp

Hugging Face · open-source · 2026-06-02

Holo3.1: Fast & Local Computer Use Agents

Score 2

No feed summary available yet.

agents

Open

Watchlist Matched: agents, computer use

NVIDIA Technical Blog · hardware · 2026-06-02

Build Personal AI Agents on Windows PCs with New Tools from Microsoft and NVIDIA

Score 6

AI agents are changing how you interact with your PC. Creators, developers, and AI enthusiasts are already using these agents extensively to assist with...

agents

Open

Watchlist Matched: agents

AWS Machine Learning Blog · cloud · 2026-06-02

Building a secure auth code flow setup using AgentCore Gateway with MCP clients

Score 7

This post demonstrates how to implement Open Authorization (OAuth) Code flow as an inbound authorization mechanism for MCP servers hosted on Amazon Bedrock AgentCore Gateway. By the end of this guide, you will have a production-ready setup...

cloud agents

Open

Watchlist Matched: bedrock, mcp

NVIDIA Technical Blog · hardware · 2026-06-02

Deploy Agentic-Ready AI at the Edge with Memory Efficiency in NVIDIA JetPack 7.2

Score 4

As AI agents move from the digital world to the physical environment, they can readily use NVIDIA Jetson to accelerate real-world deployment with optimized...

agents

Open

Watchlist Matched: agents, agentic

AWS Machine Learning Blog · cloud · 2026-06-02

Amazon Quick integration with time-series databases for market intelligence using MCP

Score 7

In this post, we walk through a practical implementation using KDB-X MCP server integration with Amazon Quick, demonstrating how traders and analysts can ask questions using conversational language and receive actionable insights from data...

benchmark agents

Open

Watchlist Matched: performance, mcp

Hugging Face · open-source · 2026-06-01

Beyond LLMs: Why Scalable Enterprise AI Adoption Depends on Agent Logic

Score 2

No feed summary available yet.

agents

Open

Watchlist Matched: agent

NVIDIA Technical Blog · hardware · 2026-06-01

Advancing AI Infrastructure for Agentic AI with NVIDIA DOCA In-Silicon Security

Score 4

The AI era is driving a new class of infrastructure: AI factories that transform data into intelligence for autonomous AI agents operating at unprecedented...

agents

Open

Watchlist Matched: agents, agentic

Microsoft Research · big-tech · 2026-05-29

Data Formulator 0.7: AI-powered data analytics for enterprise data

Score 5

Data Formulator introduces AI-powered analytics for enterprise data workflows. Data teams can easily bring enterprise data into an AI-ready workspace where users can explore, analyze, and visualize data with AI agents to turn raw data into...

research agents

Open

Watchlist Matched: research, agents

Cloudflare Blog · cloud · 2026-05-28

How we built Cloudflare's data platform and an AI agent on top of it

Score 0

Here’s how we built Town Lake, Cloudflare's unified analytics platform, alongside Skipper, an internal AI agent running on top of it.

agents

Open

Watchlist Matched: agent

LY Corporation Tech Blog · korea · 2026-05-26

ID-JAG The Hard Way: Learning AI agent authorization through failure

Score 0

Hi, I'm Jeongwoo, a security platform engineer at LY Corporation developing and operating Athenz. In ...

agents

Open

Watchlist Matched: agent

Hugging Face · open-source · 2026-05-25

Harness, Scaffold, and the AI Agent Terms Worth Getting Right

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agent

Microsoft Research · big-tech · 2026-05-22

MagenticLite, MagenticBrain, Fara1.5: An agentic experience optimized for small models

Score 6

MagenticLite is an agentic system for small models that works across the browser and local file system in a single workflow. It combines specialized models and orchestration to support efficient agentic performance on everyday tasks. The p...

benchmark research agents

Open

Watchlist Matched: performance, research, agentic

NVIDIA Technical Blog · hardware · 2026-05-21

Automating and Optimizing Financial Signal Discovery with Multi-Agent Systems

Score 3

In quantitative finance, researchers build algorithms to trade assets, derivatives, and other financial instruments. A key part of that work is finding signals:...

agents

Open

Watchlist Matched: agent

NVIDIA Technical Blog · hardware · 2026-05-20

Mastering Agentic Techniques: AI Agent Customization

Score 3

Autonomous AI agents are taking on all types of work for businesses: routing logistics fleets, triaging support tickets, generating code, and orchestrating...

agents

Open

Watchlist Matched: agent, agents, agentic

Modal · inference-infra · 2026-05-20

Scaling reinforcement learning at Applied Compute

Score 1

How Applied Compute trains custom agents with Reinforcement Learning for enterprises like DoorDash, Cognition, and Mercor on Modal.

agents

Open

Watchlist Matched: agents

Cloudflare Blog · cloud · 2026-05-19

Announcing Claude Managed Agents on Cloudflare

Score 0

Cloudflare has integrated with Anthropic's Claude Managed Agents to provide a fast, isolated execution environment for autonomous code delivery. This means builders can scale agent workflows globally while strictly controlling access to pr...

agents

Open

Watchlist Matched: agent, agents

Modular · inference-infra · 2026-05-19

How I built a pure Mojo app (and 10 libraries) with AI agents

Score 1

agents

Open

Watchlist Matched: agents

NVIDIA Technical Blog · hardware · 2026-05-13

Transform Video Into Instantly Searchable, Actionable Intelligence with AI Agents and Skills

Score 3

In today’s data-driven world, organizations increasingly rely on video to capture critical information, yet extracting meaningful, real-time insights from...

agents

Open

Watchlist Matched: agents

Modular · inference-infra · 2026-05-13

Translating to Mojo via AI Agents

Score 1

agents

Open

Watchlist Matched: agents

Microsoft Research · big-tech · 2026-05-12

SocialReasoning-Bench: Measuring whether AI agents act in users’ best interests

Score 4

Using SocialReasoning-Bench, we observed a stable pattern across models: agents execute competently but fail to consistently improve the user’s position, even with explicit instructions to optimize for user interest.

research agents

Open

Watchlist Matched: research, agents

NVIDIA Technical Blog · hardware · 2026-05-08

Streaming Tokens and Tools: Multi-Turn Agentic Harness Support in NVIDIA Dynamo

Score 3

An agentic exchange must preserve a structured interaction: assistant turns interleave reasoning with one or more tool calls, and subsequent user turns return...

agents

Open

Watchlist Matched: agentic

NVIDIA Technical Blog · hardware · 2026-05-05

Building for the Rising Complexity of Agentic Systems with Extreme Co-Design

Score 3

Generative AI’s explosive first chapter was defined by humans sending requests and models responding. The agentic chapter is different. Agents don't...

agents

Open

Watchlist Matched: agents, agentic

NVIDIA Technical Blog · hardware · 2026-05-04

Optimize Supply Chain Decision Systems Using NVIDIA cuOpt Agent Skills

Score 3

Modern supply chains operate under the constant pressures of fluctuating demand, volatile costs, constrained capacity, and interdependent decision-making....

agents

Open

Watchlist Matched: agent

Cloudflare Blog · cloud · 2026-04-30

Agents can now create Cloudflare accounts, buy domains, and deploy

Score 0

Starting today, agents can now be Cloudflare customers. They can create a Cloudflare account, start a paid subscription, register a domain, and get back an API token to deploy code right away. Humans can be in the loop to grant permission,...

agents api

Open

Watchlist Matched: agents, api

Lambda · cloud · 2026-04-30

Creating highly efficient agents: 450M tool-calling tokens distilled for post-training from top open-source models

Score 4

Harnesses: If you've used Claude Code or Codex, you've used a harness. A harness is the infrastructure layer that wraps an AI coding agent and decides how it operates, what it can touch, and how you measure whether it worked. It's how most...

hardware training agents open-source

Open

Watchlist Matched: gpu, training, post-training, agent, agents, open-source

NVIDIA Technical Blog · hardware · 2026-04-29

Powering AI Factories with NVIDIA Enterprise Reference Architectures

Score 3

The next wave of enterprise productivity is being built on AI factories. As organizations deploy agentic AI systems capable of reasoning, automation, and...

agents

Open

Watchlist Matched: agentic

NVIDIA Technical Blog · hardware · 2026-04-28

24/7 Simulation Loops: How Agentic AI Keeps Subsurface Engineering Moving

Score 3

The subsurface industry is at a critical point in its digital evolution. For decades, unlocking reservoir potential has relied on experts performing essential...

agents

Open

Watchlist Matched: agentic

Sakana AI · model-lab · 2026-04-27

Learning to Orchestrate Agents in Natural Language with the Conductor

Score 0

No feed summary available yet.

agents

Open

Watchlist Matched: agents

Hugging Face · open-source · 2026-04-24

DeepSeek-V4: a million-token context that agents can actually use

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agents

NVIDIA Technical Blog · hardware · 2026-04-23

Winning a Kaggle Competition with Generative AI–Assisted Coding

Score 3

In March 2026, three LLM agents generated over 600,000 lines of code, ran 850 experiments, and helped secure a first-place finish in a Kaggle playground...

agents

Open

Watchlist Matched: agents

Google Research · big-tech · 2026-04-22

ReasoningBank: Enabling agents to learn from experience

Score 0

Generative AI

agents

Open

Watchlist Matched: agents

Cloudflare Blog · cloud · 2026-04-20

Building the agentic cloud: everything we launched during Agents Week 2026

Score 6

Agents Week 2026 is a wrap. Let’s take a look at everything we announced, from compute and security to the agent toolbox, platform tools, and the emerging agentic web. Everything we shipped for the agentic cloud.

cloud agents

Open

Watchlist Matched: cloud, agent, agents, agentic

NVIDIA Technical Blog · hardware · 2026-04-17

Build a More Secure, Always-On Local AI Agent with OpenClaw and NVIDIA NemoClaw

Score 3

Agents are evolving from question-and-answer systems into long-running autonomous assistants that read files, call APIs, and drive multi-step workflows....

agents

Open

Watchlist Matched: agent, agents

LY Corporation Tech Blog · korea · 2026-04-17

Why ID-JAG is the future of AI agent security

Score 0

As of 2026, the AI paradigm is steadily shifting from mere chat interfaces to action-centric executi...

agents

Open

Watchlist Matched: agent

NVIDIA Technical Blog · hardware · 2026-04-16

How to Build Vision AI Pipelines Using NVIDIA DeepStream Coding Agents

Score 3

Developing real-time vision AI applications presents a significant challenge for developers, often demanding intricate data pipelines, countless lines of code,...

agents

Open

Watchlist Matched: agents

Modular · inference-infra · 2026-04-16

How Frontier Coding Agents Built a Video Diffusion Pipeline on MAX

Score 1

agents

Open

Watchlist Matched: agents

Hugging Face · open-source · 2026-04-16

Ecom-RLVE: Adaptive Verifiable Environments for E-Commerce Conversational Agents

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agents

Hugging Face · open-source · 2026-04-15

Inside VAKRA: Reasoning, Tool Use, and Failure Modes of Agents

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agents, tool use

Modal · inference-infra · 2026-04-15

Building with Modal and the OpenAI Agents SDK

Score 1

Modal is an official sandbox provider for the OpenAI Agents SDK.

agents api

Open

Watchlist Matched: agents, sdk

Together AI · inference-infra · 2026-04-13

EinsteinArena: Harnessing the collective intelligence of agents in the wild to advance science

Score 3

EinsteinArena is a platform where AI agents collaborate and compete on open math problems. AI agents on EinsteinArena have already set 11 new state-of-the-art results on open math problems — including pushing the kissing number lower bound...

agents

Open

Watchlist Matched: agents

AI2 · research · 2026-04-13

Evaluating agents for scientific discovery

Score 0

Two benchmarks developed at Ai2 – ScienceWorld and DiscoveryWorld – reveal that even incredibly strong AI science agents struggle with problems human scientists solve routinely.

evals agents

Open

Watchlist Matched: evaluating, benchmarks, agents

NVIDIA Technical Blog · hardware · 2026-03-24

Building NVIDIA Nemotron 3 Agents for Reasoning, Multimodal RAG, Voice, and Safety

Score 3

Agentic AI is an ecosystem where specialized models work together to handle planning, reasoning, retrieval, and safety guardrailing. As these systems scale,...

rag agents

Open

Watchlist Matched: rag, retrieval, agents, agentic

Hugging Face · open-source · 2026-03-24

A New Framework for Evaluating Voice Agents (EVA)

Score 1

No feed summary available yet.

evals agents

Open

Watchlist Matched: evaluating, agents

AI2 · research · 2026-03-24

MolmoWeb: An open agent for automating web tasks

Score 6

Introducing MolmoWeb, an open visual web agent that navigates and completes tasks in a browser using screenshots alone, along with MolmoWebMix, the largest public dataset for training web agents.

model-release training agents

Open

Watchlist Matched: introducing, training, agent, agents

AI2 · research · 2026-03-23

Highlights from Ai2 at NVIDIA GTC 2026

Score 0

A recap of Ai2's week at NVIDIA GTC 2026, covering panels on open models, live demos of Olmo Hybrid and Asta AutoDiscovery, and conversations on coding agents, hybrid architectures, and robotics.

agents

Open

Watchlist Matched: agents

NVIDIA Technical Blog · hardware · 2026-03-18

How to Build Deep Agents for Enterprise Search with NVIDIA AI-Q and LangChain

Score 3

While consumer AI offers powerful capabilities, workplace tools often suffer from disjointed data and limited context. Built with LangChain, the NVIDIA AI-Q...

agents

Open

Watchlist Matched: agents

NVIDIA Technical Blog · hardware · 2026-03-17

Building the AI Grid with NVIDIA: Orchestrating Intelligence Everywhere

Score 3

AI-native services are exposing a new bottleneck in AI infrastructure: As millions of users, agents, and devices demand access to intelligence, the challenge is...

agents

Open

Watchlist Matched: agents

NVIDIA Technical Blog · hardware · 2026-03-16

Scaling Autonomous AI Agents and Workloads with NVIDIA DGX Spark

Score 3

Autonomous AI agents are driving the next wave of AI innovation. These agents must often manage long-running tasks that use multiple communication channels and...

agents

Open

Watchlist Matched: agents

NVIDIA Technical Blog · hardware · 2026-03-16

Run Autonomous, Self-Evolving Agents More Safely with NVIDIA OpenShell

Score 3

AI has evolved from assistants following your directions to agents that act independently. Called claws, these agents can take a goal, figure out how to achieve...

agents

Open

Watchlist Matched: agents

NVIDIA Technical Blog · hardware · 2026-03-16

NVIDIA Vera Rubin POD: Seven Chips, Five Rack-Scale Systems, One AI Supercomputer

Score 3

Artificial intelligence is token-driven. Every prompt, reasoning step, and agent interaction generates tokens. Over the past year, token consumption has grown...

agents

Open

Watchlist Matched: agent

Together AI · inference-infra · 2026-02-25

CoderForge-Preview: SOTA open dataset for training efficient coding agents

Score 3

No feed summary available yet.

training agents frontier-model

Open

Watchlist Matched: training, agents, sota

Hugging Face · open-source · 2026-02-19

IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agents

Modal · inference-infra · 2026-02-19

How Ramp built a full context background coding agent on Modal

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agent

Hugging Face · open-source · 2026-02-12

OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments

Score 1

No feed summary available yet.

evals agents

Open

Watchlist Matched: evaluating, agents

Google Research · big-tech · 2026-01-28

Towards a science of scaling agent systems: When and why agent systems work

Score 0

Generative AI

agents

Open

Watchlist Matched: agent

Hugging Face · open-source · 2026-01-27

Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective

Score 1

No feed summary available yet.

training agents open-source

Open

Watchlist Matched: training, agentic, oss

Hugging Face · open-source · 2026-01-21

AssetOpsBench: Bridging the Gap Between AI Agent Benchmarks and Industrial Reality

Score 1

No feed summary available yet.

evals agents

Open

Watchlist Matched: benchmarks, agent

LY Corporation Tech Blog · korea · 2026-01-05

Building a multi-agent pipeline for NL-to-SQL analytics

Score 0

This post is a follow-up to Creating a domain-specific NL-to-SQL MCP server, which introduced our MC...

agents

Open

Watchlist Matched: agent, mcp

Hugging Face · open-source · 2026-01-05

NVIDIA brings agents to life with DGX Spark and Reachy Mini

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agents

SkyPilot · open-source · 2025-12-17

Train an agent to use google search as a tool with RL

Score 1

Train a tool-calling agent with VeRL and use SkyPilot to scale it up with independent RL trainer and env rollout

agents

Open

Watchlist Matched: agent

Hugging Face · open-source · 2025-12-16

CUGA on Hugging Face: Democratizing Configurable AI Agents

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agents

Hugging Face · open-source · 2025-12-04

DeepMath: A lightweight math reasoning Agent with smolagents

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agent

LY Corporation Tech Blog · korea · 2025-11-28

Creating a domain-specific NL-to-SQL MCP server

Score 0

Enterprise data analysis faces a fundamental challenge: the gap between business questio...

agents

Open

Watchlist Matched: mcp

Modal · inference-infra · 2025-11-20

Agents need good developer experience too

Score 1

Turns out, good devex for agents looks a lot like good devex for humans.

agents

Open

Watchlist Matched: agents

Google Research · big-tech · 2025-11-07

DS-STAR: A state-of-the-art versatile data science agent

Score 0

Data Mining & Modeling

agents

Open

Watchlist Matched: agent

Hugging Face · open-source · 2025-10-30

Aligning to What? Rethinking Agent Generalization in MiniMax M2

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agent

Together AI · inference-infra · 2025-10-28

Dynamic AI agent testing for the real world with Collinear Simulations and Together Evals

Score 3

Test AI agents in the real world with Collinear TraitMix and Together Evals: dynamic persona simulations, multi-turn dialogs, and LLM-as-judge scoring.

evals agents

Open

Watchlist Matched: evals, agent, agents

SkyPilot · open-source · 2025-10-14

How to train and scale AI math/coding agents using VeRL on any AI infra

Score 1

Want to train an AI agent with RL that can solve math problems or write code? This tutorial walks you through building your own math and coding agents with step-by-step examples with plenty of screenshots to help you along the way. We use...

training agents

Open

Watchlist Matched: training, post-training, agent, agents

Google Research · big-tech · 2025-09-30

The anatomy of a personal health agent

Score 0

Generative AI

agents

Open

Watchlist Matched: agent

Hugging Face · open-source · 2025-09-29

Accelerating Qwen3-8B Agent on Intel® Core™ Ultra with Depth-Pruned Draft Models

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agent

Hugging Face · open-source · 2025-09-23

Smol2Operator: Post-Training GUI Agents for Computer Use

Score 1

No feed summary available yet.

training agents

Open

Watchlist Matched: training, post-training, agents, computer use

Hugging Face · open-source · 2025-09-22

Gaia2 and ARE: Empowering the community to study agents

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agents

Google Research · big-tech · 2025-09-19

Sensible Agent: A framework for unobtrusive interaction with proactive AR agents

Score 0

Human-Computer Interaction and Visualization

agents

Open

Watchlist Matched: agent, agents

Hugging Face · open-source · 2025-09-10

Jupyter Agents: training LLMs to reason with notebooks

Score 1

No feed summary available yet.

training agents

Open

Watchlist Matched: training, agents

LY Corporation Tech Blog · korea · 2025-08-20

LY Corporation on AI, a recap of Tech-Verse 2025

Score 0

Hello. I'm Sumin Shin, a developer working on services related to LLM agents at LINE AI LAB, LINE Pl...

agents

Open

Watchlist Matched: agents

Replicate · inference-infra · 2025-08-10

Announcing Replicate's remote MCP server

Score 0

Use our MCP to discover, compare, and run models from apps like Claude, Cursor, and VS Code.

agents

Open

Watchlist Matched: mcp

Google Research · big-tech · 2025-08-01

MLE-STAR: A state-of-the-art machine learning engineering agent

Score 0

Machine Intelligence

agents

Open

Watchlist Matched: agent

Hugging Face · open-source · 2025-07-31

Implementing MCP Servers in Python: An AI Shopping Assistant with Gradio

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: mcp

Hugging Face · open-source · 2025-07-17

Five Big Improvements to Gradio MCP Servers

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: mcp

Hugging Face · open-source · 2025-07-17

Back to The Future: Evaluating AI Agents on Predicting Future Events

Score 0

No feed summary available yet.

evals agents

Open

Watchlist Matched: evaluating, agents

Modular · inference-infra · 2025-07-16

AI Agents for AWS Marketplace

Score 1

agents

Open

Watchlist Matched: agents

Hugging Face · open-source · 2025-07-10

ScreenEnv: Deploy your full stack Desktop Agent

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agent

Hugging Face · open-source · 2025-07-10

Building the Hugging Face MCP Server

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: mcp

Hugging Face · open-source · 2025-07-09

Upskill your LLMs With Gradio MCP Servers

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: mcp

Together AI · inference-infra · 2025-07-02

DeepSWE: Training a Fully Open-sourced, State-of-the-Art Coding Agent by Scaling RL

Score 3

No feed summary available yet.

training agents

Open

Watchlist Matched: training, agent

Together AI · inference-infra · 2025-06-12

From Zero to One: Building An Autonomous and Open Data Scientist Agent from Scratch

Score 3

Build a data scientist agent using Together’s open-source models and Code Interpreter—easy to implement, solid benchmarks, and full code on GitHub.

evals agents open-source

Open

Watchlist Matched: benchmarks, agent, open-source

Hugging Face · open-source · 2025-06-03

Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agent

Together AI · inference-infra · 2025-05-28

Mixture-of-Agents Alignment: Harnessing the Collective Intelligence of Open-Source LLMs to Improve Post-Training

Score 3

No feed summary available yet.

training agents open-source

Open

Watchlist Matched: training, post-training, agents, open-source

Hugging Face · open-source · 2025-05-23

Tiny Agents in Python: a MCP-powered agent in ~70 lines of code

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agent, agents, mcp

Hugging Face · open-source · 2025-04-30

How to Build an MCP Server with Gradio

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: mcp

Hugging Face · open-source · 2025-04-25

Tiny Agents: an MCP-powered agent in 50 lines of code

Score 1

No feed summary available yet.

agents

Open

Watchlist Matched: agent, agents, mcp

BAIR · research · 2025-03-25

Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment

Score 6

We deployed 100 reinforcement learning (RL)-controlled cars into rush-hour highway traffic to smooth congestion and reduce fuel consumption for everyone. Our goal is to tackle "stop-and...

serving kernel benchmark model-release research training agents

Open

Watchlist Matched: throughput, kernel, performance, model, paper, training, agent, agents

Hugging Face · open-source · 2025-02-28