cloud

hardware model-release cloud training

High signal Matched: bedrock, api

Lambda · cloud · 2026-06-03

Introducing workspaces for Lambda Cloud

Score 17

Lambda workspaces help teams organize cloud resources, control access, and separate dev, staging, and production in shared GPU environments. A junior researcher kills a production training run. A contractor sees weights they shouldn't. If...

High signal Matched: gpu, introducing, weights, cloud, training

AWS Machine Learning Blog · cloud · 2026-06-03

How Baz improved its AI Agent Code Review accuracy using Amazon Bedrock AgentCore

Score 11

This post walks through how Baz built their Spec Review agent using Amazon Bedrock and Amazon Bedrock AgentCore. We'll cover the architecture decisions, implementation details, and the business outcomes they achieved by leveraging these AW...

High signal Matched: bedrock, agent

AWS Machine Learning Blog · cloud · 2026-06-02

Reference your own AWS Secrets Manager secrets in Amazon Bedrock AgentCore Identity

Score 9

Today, we’re excited to announce the ability to reference a secret in AWS Secrets Manager for AgentCore Identity, so you can reference your own preconfigured secret from Secrets Manager and retain full control over how it is managed. With...

inference benchmark cloud agents

High signal Matched: bedrock

AWS Machine Learning Blog · cloud · 2026-06-02

OpenAI models and Codex on Amazon Bedrock are now generally available

Score 13

GPT-5.5, GPT-5.4, and Codex are now generally available on Amazon Bedrock. Deploy them in production applications and agents today, on Bedrock’s high performance inference engine. 

model-release cloud agents

High signal Matched: inference, performance, bedrock, agents

AWS Machine Learning Blog · cloud · 2026-06-02

Extending MCP support for Amazon Bedrock AgentCore Gateway

Score 11

While deploying Model Context Protocol (MCP) servers in production, enterprises need fine-grained access control across servers, observability into which teams use which tools, security guarantees against data exfiltration, and centralized...

High signal Matched: model, bedrock, mcp

AWS Machine Learning Blog · cloud · 2026-06-02

Secure AI agents with Policy and Lambda interceptors in Amazon Bedrock AgentCore gateway

Score 9

In this post, we use a lakehouse data agent to demonstrate how you can use Policy for deterministic access control and Lambda interceptors for dynamic validation. We then show how to combine Lambda interceptors and Policy to implement a ge...

High signal Matched: bedrock, agent, agents

AWS Machine Learning Blog · cloud · 2026-06-02

Enable safe agentic payments with built-in guardrails using Amazon Bedrock AgentCore payments

Score 9

In this post, we address several key risks that surface when designing an agentic payment system, and how to address them with the capabilities of AgentCore payments.

High signal Matched: bedrock, agentic

AWS Machine Learning Blog · cloud · 2026-06-02

AgentOps: Operationalize agentic AI at scale with Amazon Bedrock AgentCore

Score 9

When you build agentic AI solutions, you face unique operational challenges. Agents make unpredictable decisions, costs spiral unexpectedly, and debugging non-deterministic failures seems impossible. Agentic AI applications don't just exec...

High signal Matched: bedrock, agents, agentic

AWS Machine Learning Blog · cloud · 2026-05-30

Comprehensive observability for Amazon SageMaker AI LLM inference: From GPU utilization to LLM quality

Score 17

This post demonstrates a comprehensive observability solution using Amazon Managed Grafana dashboards that provides a holistic view of both quality and quantity for LLMs served on Amazon SageMaker AI endpoints with inference components.

inference hardware cloud

model-release cloud training

High signal Matched: inference, gpu, sagemaker

AWS Machine Learning Blog · cloud · 2026-05-29

Training Azerbaijani language models on Amazon SageMaker AI

Score 13

Azercell Telecom LLC, Azerbaijan's leading telecommunications provider, wanted to build an Azerbaijani large language model (LLM) on Amazon SageMaker AI for telecom use cases and a customer-facing chatbot. The challenge: adapting foundatio...

High signal Matched: model, sagemaker, training

AWS Machine Learning Blog · cloud · 2026-05-29

Build a custom portal with embedded Amazon SageMaker AI MLflow Apps

Score 11

In this post, you learn how to build a custom portal with embedded SageMaker AI MLflow Apps UI. You walk through the architecture pattern behind a React front end paired with a Flask reverse proxy that handles AWS Signature Version 4 (SigV...

High signal Matched: cloud, sagemaker

AWS Machine Learning Blog · cloud · 2026-05-29

Streamline external access to Amazon SageMaker MLflow using a REST API proxy

Score 11

In this post, we demonstrate how to build a secure Flask-based MLflow proxy service that provides HTTPS access to Amazon SageMaker MLflow without requiring the MLflow SDK. This solution is for organizations undergoing cloud transformation...

cloud api

High signal Matched: cloud, sagemaker, api, sdk

AWS Machine Learning Blog · cloud · 2026-05-29

Evaluating Deep Agents using LangSmith on AWS

Score 9

This post combines learnings from LangChain’s work on evaluating deep agents and Anthropic’s guide to demystifying evals for AI agents into a practical guide. In this post, you will learn how to: 1) apply five evaluation patterns for deep...

research cloud evals agents

High signal Matched: evaluation, bedrock, evals, evaluating, agent, agents

AWS Machine Learning Blog · cloud · 2026-05-29

Build a test suite that grows with your agent with dataset management in Amazon Bedrock AgentCore

Score 13

Datasets in AgentCore is in public preview. Agent evaluation is most powerful when you combine fast-moving online signals with stable offline baselines. To understand whether your agent is truly improving over time, you need a fixed benchm...

benchmark research cloud evals agents

inference model-release cloud agents

High signal Matched: benchmark, evaluation, bedrock, agent

AWS Machine Learning Blog · cloud · 2026-05-29

Claude Opus 4.8 is now available on AWS

Score 11

This post covers Opus 4.8's improvements and practical guidance for AI engineers integrating the model into agentic systems and production inference workloads on Amazon Bedrock.

High signal Matched: inference, model, bedrock, agentic

PyTorch Foundation · open-source · 2026-05-27

Alibaba Cloud Joins the PyTorch Foundation as a Platinum Member

Score 11

The PyTorch Foundation, a community-driven hub for open source AI under the Linux Foundation, is announcing today that Alibaba Cloud has joined as a Platinum member. Alibaba Cloud is a...

cloud open-source

inference distributed hardware model-release cloud quantization evals

High signal Matched: cloud, open source

AMD ROCm Blogs · hardware · 2026-05-25

AI Inference on AMD Ryzen™ AI Max Processor

Score 20

Local large language model (LLM) inference has rapidly evolved, but a persistent limitation remains: model size is constrained by available GPU memory. Discrete GPUs typically offer 8–24 GB of dedicated VRAM, which can limit the size of mo...

inference serving kernel triton benchmark model-release cloud open-source

High signal Matched: inference, multi-gpu, gpu, model, checkpoint, cloud, quantization, evaluate

AMD ROCm Blogs · hardware · 2026-05-22

From Build to Benchmark: ONNX Model Serving with Triton Inference Server on AMD GPUs

Score 30

Triton Inference Server is an open-source platform designed to streamline AI inferencing. It supports the deployment, scaling, and inference of trained models from multiple frameworks, including ONNX Runtime, TensorFlow, PyTorch, and other...

High signal Matched: inference, inferencing, serving, triton, benchmark, model, cloud, open-source

Lambda · cloud · 2026-05-21

Lambda Bare Metal Instances: full hardware control with API-driven operations

Score 8

The unit of AI compute has shifted from single hosts to rack-scale systems that integrate NVIDIA GPUs, CPUs, scale-up networking fabrics, and liquid cooling, such as the NVIDIA GB300 NVL72 and NVIDIA Vera Rubin NVL72. Teams at the frontier...

inference serving benchmark cloud training api

High signal Matched: serving, performance, cloud, training, api

NVIDIA Technical Blog · hardware · 2026-05-21

Building Token‑Metered AI Services on Telco AI Factories

Score 10

Telcos around the world are building sovereign AI factories based on the NVIDIA Cloud Partner (NCP) reference architecture, giving governments, enterprises, and...

High signal Matched: cloud

Modal · inference-infra · 2026-05-21

Modal's Series C: Raising $355M at a $4.65B valuation

Score 8

We've raised $355M at a $4.65B valuation to continue building the production cloud for AI.

High signal Matched: cloud

NVIDIA Technical Blog · hardware · 2026-05-05

How to Build In-Vehicle AI Agents with NVIDIA: From Cloud to Car

Score 12

The automotive cockpit is undergoing a fundamental shift from rule-based interfaces to agentic, multimodal AI systems capable of reasoning, planning, and...

inference kv-cache benchmark hardware model-release cloud

High signal Matched: cloud, agents, agentic

LMCache · open-source · 2026-04-23

LMCache on Amazon SageMaker HyperPod: Accelerating LLM Inference with Managed Tiered KV Cache

Score 30

Overview Large language model (LLM) inference performance depends heavily on how efficiently the system manages key-value (KV) cache — the stored attention states that allow the model to avoid recomputing previous tokens. As context length...

hardware model-release cloud

High signal Matched: inference, kv cache, lmcache, performance, latency, gpu, model, sagemaker

SkyPilot · open-source · 2026-04-22

GPU Compass: Navigate the GPU Frontier Across 20+ Clouds & 2K+ Offerings

Score 18

Introducing GPU Compass: One dashboard to browse, compare pricing, and launch across every GPU cloud.

model-release cloud training agents

High signal Matched: gpu, introducing, launch, cloud

SkyPilot · open-source · 2026-04-10

SkyPilot Agent Skill: Let Agents Manage Your GPUs

Score 10

With the SkyPilot Agent Skill, your AI coding agent can launch clusters, run training jobs and manage cloud resources across any infrastructure using natural language.

High signal Matched: launch, cloud, training, agent, agents

Together AI · inference-infra · 2026-04-07

What is an AI Native Cloud?

Score 12

AI-native companies need infrastructure built for models, not legacy workloads. Learn what defines an AI Native Cloud and why it matters for the next platform shift.

inference model-release cloud

High signal Matched: cloud

LY Corporation Tech Blog · korea · 2026-04-02

Cloud infrastructure transformation at LY Corporation: introducing the architecture of Flava, the next-generation platform integrating two massive cl...

Score 14

Hello. I’m Inoue, and I work on private cloud infrastructure at LY Corporation. What powers LY Corpor...

High signal Matched: generation, introducing, cloud

Nota AI · korea · 2026-03-23

[GTC 2026 Recap] The Trillion-Dollar Inference Race Begins: How Nota AI Fills the Gap

Score 42

Jaehoon Lee, Technical Content Manager, Nota AI. GTC has evolved far beyond a technology conference, drawing attention from global economies and financial markets alike. This year, CEO Jensen Huang took the stage in his tradema...

inference serving kernel cuda kv-cache benchmark hardware model-release research cloud training long-context agents open-source

High signal Matched: inference, prefill, generation, throughput, cuda, kv cache, performance, latency, cost, gpu, npu, launch, model, research, cloud, training, long-context, context window, agent, agents, agentic, open-source

NVIDIA Technical Blog · hardware · 2026-03-23

Building a Zero-Trust Architecture for Confidential AI Factories

Score 10

AI is moving from experimentation to production. However, most of the data that enterprises need exists outside the public cloud. This includes sensitive information like...

inference kernel research cloud

High signal Matched: cloud

Together AI · inference-infra · 2026-03-05

Key research and product announcements at the AI Native Conf

Score 18

At AI Native Conf, Together AI announced breakthroughs across kernels, RL, and inference optimization — including FlashAttention-4, ThunderAgent, and together.compile. Research that ships to production. That's the AI Native Cloud.

High signal Matched: inference, flashattention, research, cloud

SkyPilot · open-source · 2026-02-27

Don't Run OpenClaw on Your Main Machine

Score 8

OpenClaw gives an AI agent full access to your system. Here's why you should run it on an isolated cloud VM, and how to set that up.

serving moe hardware model-release cloud

High signal Matched: cloud, agent

vLLM Project · open-source · 2026-02-26

Efficiently serve dozens of fine-tuned models with vLLM on Amazon SageMaker AI and Amazon Bedrock

Score 30

Organizations and individuals running multiple custom AI models, especially recent Mixture of Experts (MoE) model families, can face the challenge of paying for idle GPU capacity when the...

model-release cloud frontier-model

High signal Matched: serve, moe, mixture of experts, gpu, model, sagemaker, bedrock

Together AI · inference-infra · 2025-12-15

Announcing native availability of NVIDIA Nemotron 3 Nano, NVIDIA’s latest reasoning model

Score 14

Nemotron 3 Nano, NVIDIA’s newest reasoning model, is now available on Together AI, the AI Native Cloud

High signal Matched: model, cloud, reasoning model

SkyPilot · open-source · 2025-12-11

SkyPilot 0.11: Multi-Cloud Pools for Batch Inference, Fast Managed Jobs, Enterprise-Ready at Scale, Programmability

Score 14

Announcing SkyPilot 0.11 with Pools for batch inference, faster managed jobs, and enterprise-scale improvements.

inference cloud

High signal Matched: inference, cloud

Together AI · inference-infra · 2025-12-03

Together AI and Meta partner to bring PyTorch Reinforcement Learning to the AI Native Cloud

Score 12

Build, train, and deploy advanced AI agents with integrated reinforcement learning on the Together platform.

High signal Matched: cloud, agents

Together AI · inference-infra · 2025-12-03

How to run TorchForge reinforcement learning pipelines in the Together AI Native Cloud

Score 12

No feed summary available yet.

High signal Matched: cloud

Hugging Face · open-source · 2025-11-13

Building for an Open Future - our new partnership with Google Cloud

Score 10

No feed summary available yet.

High signal Matched: cloud

Google Research · big-tech · 2025-10-18

Solving virtual machine puzzles: How AI is optimizing cloud computing

Score 8

Algorithms & Theory

High signal Matched: cloud

Hugging Face · open-source · 2025-10-16

Google Cloud C4 Brings a 70% TCO improvement on GPT OSS with Intel and Hugging Face

Score 10

No feed summary available yet.

cloud open-source

High signal Matched: cloud, oss

Modal · inference-infra · 2025-09-16

Inside Modal Notebooks: How we built a cloud GPU notebook that boots in seconds

Score 14

Exploring the internals of our new product, a modern Jupyter notebook built for fast startup and real-time collaboration.

High signal Matched: gpu, cloud

SkyPilot · open-source · 2025-09-04

Scaling AI Infrastructure at Abridge with SkyPilot

Score 8

How we transformed our fragmented multi-cloud AI infrastructure into a unified system with SkyPilot, achieving 10x faster development cycles.

High signal Matched: cloud

SkyPilot · open-source · 2025-08-21

From SLURM to SkyPilot: How Avataar cut costs 11x with multi-cloud AI infrastructure

Score 12

Avataar's enterprise AI content platform cut costs 11x and unlocked GPU capacity by migrating from inflexible SLURM deployment to SkyPilot's multi-cloud infrastructure.

High signal Matched: gpu, cloud

SkyPilot · open-source · 2025-08-12

Self-host open-source LLM agent sandbox on your own cloud

Score 10

Your AI writes code. Now what? If you’re building AI agents in 2025, you probably wondered that as well. Your LLM generates some Python code that analyzes data, manipulates files, or calls APIs. But where does it run? Most people eit...

cloud agents open-source

inference serving benchmark hardware model-release cloud

High signal Matched: cloud, agent, agents, open-source

AIBrix · open-source · 2025-08-05

AIBrix v0.4.0 Release: P/D Disaggregation and Expert Parallelism Support, KVCache v1 Connector, KV Event Synchronization & Multi‑Engine Support

Score 20

AIBrix is a composable, cloud‑native LLM inference infrastructure designed to deliver high performance and low cost at scale. We now present a major update in a new release - v0.4.0. This release tackles key bottlenecks in orchestration an...

High signal Matched: inference, prefill, generation, token generation, throughput, performance, cost, gpu, release, cloud

SkyPilot · open-source · 2025-07-30

Slurm vs K8s for AI Infra: Academic HPC vs Cloud-Native Reality - the non-ideal solutions

Score 12

There are a lot of discussions happening in AI infrastructure right now. On one side, we have researchers who trained on Slurm in grad school, comfortable with sbatch train_model.sh and the predictability of academic HPC clusters. On the o...

distributed benchmark hardware cloud

High signal Matched: model, cloud

SkyPilot · open-source · 2025-07-16

The Evolution of AI Job Orchestration. Part 2: The AI-Native Control Plane & Orchestration that Finally Works for ML

Score 16

This is Part 2 of our series on the evolution of AI Job Orchestration. In Part 1, we explored how Neoclouds are democratizing GPU access but leaving the “last mile” unsolved. Now we’ll discover how AI-native orchestration...

High signal Matched: infiniband, performance, cost, gpu, cloud

SkyPilot · open-source · 2025-07-02

Managing Networks in the Chaotic Cloud and Kubernetes World

Score 12

Configure high-performance networking across different cloud providers and managed infrastructure with SkyPilot's unified network tier abstraction.

inference kv-cache benchmark model-release cloud

High signal Matched: performance, cloud

AIBrix · open-source · 2025-05-22

AIBrix v0.3.0 Release: KVCache Offloading, Prefix Cache, Fairness Routing, and Benchmarking Tools

Score 24

AIBrix is a composable, cloud-native AI infrastructure toolkit designed to power scalable and cost-effective large language model (LLM) inference. As production demands for memory-efficient and latency-aware LLM services continue to grow,...

inference distributed model-release cloud open-source

High signal Matched: inference, prefix cache, latency, cost, release, model, cloud

llm-d · open-source · 2025-05-20

llm-d Press Release

Score 20

Red Hat launches llm-d: Open source distributed AI inference platform backed by NVIDIA, Google Cloud, IBM. Scale generative AI with intelligent routing on Kubernetes.

High signal Matched: inference, distributed, release, cloud, open source

Modal · inference-infra · 2025-05-07

Linear programming for fun and profit

Score 8

How we use an eighty-year-old algorithm to find arbitrages in the cloud market.

benchmark model-release cloud training

High signal Matched: cloud

SkyPilot · open-source · 2025-04-08

High-Performance Model Checkpointing on the Cloud

Score 18

Techniques to speed up checkpointing by 9.6x and how to easily achieve them in SkyPilot

inference model-release cloud api open-source

High signal Matched: performance, model, cloud, checkpointing

Replicate · inference-infra · 2025-03-05

Wan2.1: generate videos with an API

Score 10

Wan2.1 is the most capable open-source video generation model, producing coherent and high-quality outputs. Learn how to run it in the cloud with a single line of code.

High signal Matched: generation, model, cloud, api, open-source

Modal · inference-infra · 2025-02-24

'I paid for the whole GPU, I am going to use the whole GPU': A high-level guide to GPU utilization

Score 12

A guide to maximizing the utilization of GPUs, from cloud allocations to FLOP/s.

High signal Matched: gpu, cloud

Hugging Face · open-source · 2024-12-09

Hugging Face models in Amazon Bedrock

Score 10

No feed summary available yet.

High signal Matched: bedrock

Modal · inference-infra · 2024-11-24

Press release: Modal signs strategic collaboration agreement with AWS to deliver accelerated generative AI solutions

Score 12

Announcing Modal's newest cloud partnership.

inference kv-cache benchmark hardware model-release cloud open-source

High signal Matched: release, cloud

AIBrix · open-source · 2024-11-13

Introducing AIBrix v0.1.0: Building the Future of Scalable, Cost-Effective AI Infrastructure for Large Models

Score 32

In recent years, large language models (LLMs) have revolutionized AI applications, powering solutions in areas like chatbots, automated content generation, and advanced recommendation engines. Services like OpenAI’s have gained significant...

High signal Matched: decoding, prefill, generation, kv cache, performance, cost, gpu, release, introducing, cloud, open-source

SkyPilot · open-source · 2024-11-01

Getting $1M cloud credits for AI startups — and using them wisely

Score 10

For AI teams: How do you efficiently spend $1M+ cloud credits across 3+ clouds?

High signal Matched: cloud

Hugging Face · open-source · 2024-08-19

Deploy Meta Llama 3.1 405B on Google Cloud Vertex AI

Score 10

No feed summary available yet.

inference benchmark model-release research cloud training fine-tuning evals open-source

High signal Matched: cloud

Nota AI · korea · 2024-08-02

Deploying an Efficient Vision-Language Model on Mobile Devices

Score 38

Jaeyeon Kim, Research Engineer, Nota AI; Geonmin Kim, Research Engineer, Nota AI; Hancheol Park, Team Lead of NetsPresso Application, Nota AI. Introduction: Recent large language models (LLMs) have demonstrated unprecedented performance...

High signal Matched: decoding, benchmark, performance, latency, tokens/sec, model, arxiv, research, technical report, evaluation, cloud, training, lora, benchmarks, leaderboard, open-source

Replicate · inference-infra · 2024-07-23

Run Meta Llama 3.1 405B with an API

Score 8

Llama 3.1 405B is the most powerful open-source language model from Meta. Learn how to run it in the cloud with one line of code.

High signal Matched: model, cloud, api, open-source

Hugging Face · open-source · 2024-07-09

Google Cloud TPUs made available to Hugging Face users

Score 10

No feed summary available yet.

High signal Matched: cloud

Replicate · inference-infra · 2024-06-12

Run Stable Diffusion 3 with an API

Score 8

Stable Diffusion 3 is the latest text-to-image model from Stability, with improved image quality, typography, prompt understanding, and resource efficiency. Learn how to run it in the cloud with one line of code.

model-release cloud api

High signal Matched: model, cloud, api

Hugging Face · open-source · 2024-06-07

Introducing the Hugging Face Embedding Container for Amazon SageMaker

Score 14

No feed summary available yet.

High signal Matched: introducing, sagemaker

Hugging Face · open-source · 2024-05-21

From cloud to developers: Hugging Face and Microsoft Deepen Collaboration

Score 10

No feed summary available yet.

High signal Matched: cloud

Modal · inference-infra · 2024-05-20

Why Substack moved their AI and ML pipelines to Modal

Score 8

Learn how Substack sped up their developer iteration cycles by moving ML training and deployment to Modal from AWS SageMaker.

cloud training

High signal Matched: sagemaker, training

Modal · inference-infra · 2024-05-13

Introducing: Region selection

Score 12

You can now specify which cloud region you would like to run your Functions in.

High signal Matched: introducing, cloud

Modal · inference-infra · 2024-05-07

Product updates: Cloud buckets, Okta SSO & more

Score 10

Welcome to another round of Modal Product Updates! Here's what's new this month.

High signal Matched: cloud

Replicate · inference-infra · 2024-04-23

Run Snowflake Arctic with an API

Score 8

Arctic is a new open-source language model from Snowflake. Learn how to run it in the cloud with one line of code.

High signal Matched: model, cloud, api, open-source

Replicate · inference-infra · 2024-04-18

Run Meta Llama 3 with an API

Score 8

Llama 3 is the latest language model from Meta. Learn how to run it in the cloud with one line of code.

model-release cloud api

High signal Matched: model, cloud, api

Hugging Face · open-source · 2024-03-18

Easily Train Models with H100 GPUs on NVIDIA DGX Cloud

Score 14

No feed summary available yet.

inference serving benchmark model-release cloud

High signal Matched: h100, cloud

SkyPilot · open-source · 2024-02-20

Introducing SkyServe: 50% Cheaper AI Serving on Any Cloud with High Availability

Score 20

SkyServe: A simple, cost-efficient, multi-region/cloud library for serving GenAI models.

inference cloud api open-source

High signal Matched: serving, cost, introducing, cloud

Replicate · inference-infra · 2024-01-30

Run Code Llama 70B with an API

Score 8

Code Llama 70B is a powerful open-source code generation model. Learn how to run it in the cloud with one line of code.

High signal Matched: generation, cloud, api, open-source

Modal · inference-infra · 2023-10-10

Press release: Modal Labs announces Series A financing round

Score 14

Modal Labs Announces Series A Financing Round, Securing $16 Million Investment to Launch Cloud-Based Infrastructure Platform, Build Towards End-to-End Enterprise Data Stack

High signal Matched: release, launch, cloud

Replicate · inference-infra · 2023-10-06

How to run Mistral 7B with an API

Score 8

Mistral 7B is an open-source large language model. Learn what it's good at and how to run it in the cloud with one line of code.

High signal Matched: model, cloud, api, open-source

Hugging Face · open-source · 2023-10-03

🧨 Accelerating Stable Diffusion XL Inference with JAX on Cloud TPU v5e

Score 18

No feed summary available yet.

inference hardware cloud

High signal Matched: inference, tpu, cloud

SkyPilot · open-source · 2023-09-27

Scaling AI Robotics on the Cloud

Score 12

Covariant runs AI on the cloud using SkyPilot, delivering models 4x faster and cost-effectively.

High signal Matched: cost, cloud

Hugging Face · open-source · 2023-09-26

Llama 2 on Amazon SageMaker a Benchmark

Score 14

No feed summary available yet.

High signal Matched: benchmark, sagemaker

Hugging Face · open-source · 2023-09-01

Fetch Cuts ML Processing Latency by 50% Using Amazon SageMaker & Hugging Face

Score 14

No feed summary available yet.

High signal Matched: latency, sagemaker

SkyPilot · open-source · 2023-08-02

Finetuning Llama 2 in your own cloud environment, privately

Score 10

An operational guide on finetuning Llama 2, ready for commercial use.

cloud fine-tuning

High signal Matched: cloud, finetuning

Replicate · inference-infra · 2023-07-27

Run Llama 2 with an API

Score 8

Llama 2 is the first open source language model of the same caliber as OpenAI’s models. Learn how to run it in the cloud with one line of code.

High signal Matched: model, cloud, api, open source

SkyPilot · open-source · 2023-06-29

Serving LLM 24x Faster On the Cloud with vLLM and SkyPilot

Score 14

SkyPilot makes the deployment and development of vLLM easy and fast on clouds.

inference serving cloud

inference model-release cloud

High signal Matched: serving, cloud

Hugging Face · open-source · 2023-05-31

Introducing the Hugging Face LLM Inference Container for Amazon SageMaker

Score 18

No feed summary available yet.

High signal Matched: inference, introducing, sagemaker

SkyPilot · open-source · 2023-05-02

Analyzing the Whole Mouse Brain Atlas on the Cloud With SkyPilot [User Post]

Score 12

Experience report from Salk Institute on how biologists use SkyPilot to conduct research on the cloud.

research cloud

High signal Matched: research, cloud

SkyPilot · open-source · 2023-03-20

Run LLaMA LLM chatbots on any cloud with one click

Score 10

Want to host your own LLM Chatbot on any cloud of your choosing?

model-release cloud fine-tuning

High signal Matched: cloud

Replicate · inference-infra · 2023-02-07

Introducing LoRA: A faster way to fine-tune Stable Diffusion

Score 10

It's like DreamBooth, but much faster. And you can run it in the cloud on Replicate.

High signal Matched: introducing, cloud, lora

Replicate · inference-infra · 2022-11-21

Train and deploy a DreamBooth model on Replicate

Score 10

With just a handful of images and a single API call, you can train a model, publish it to Replicate, and run predictions on it in the cloud.

model-release cloud api

benchmark model-release cloud

High signal Matched: model, cloud, api

SkyPilot · open-source · 2022-11-16

SkyPilot: ML and Data Science on any cloud with massive cost savings

Score 16

Introducing SkyPilot.

High signal Matched: cost, introducing, cloud

Hugging Face · open-source · 2022-01-11

Deploy GPT-J 6B for inference using Hugging Face Transformers and Amazon SageMaker

Score 14

No feed summary available yet.

inference cloud

High signal Matched: inference, sagemaker

Hugging Face · open-source · 2021-07-08

Deploy Hugging Face models easily with Amazon SageMaker

Score 10

No feed summary available yet.

distributed cloud training

High signal Matched: sagemaker

Hugging Face · open-source · 2021-04-08

Distributed Training: Train BART/T5 for Summarization using 🤗 Transformers and Amazon SageMaker

Score 14

No feed summary available yet.

High signal Matched: distributed, sagemaker, training, distributed training

Hugging Face · open-source · 2021-03-23

The Partnership: Amazon SageMaker and Hugging Face

Score 10

No feed summary available yet.

High signal Matched: sagemaker

Hugging Face · open-source · 2021-03-18

My Journey to a serverless transformers pipeline on Google Cloud

Score 10

No feed summary available yet.

High signal Matched: cloud

AWS Machine Learning Blog · cloud · 2026-06-02

Building a secure auth code flow setup using AgentCore Gateway with MCP clients

Score 7

This post demonstrates how to implement Open Authorization (OAuth) Code flow as an inbound authorization mechanism for MCP servers hosted on Amazon Bedrock AgentCore Gateway. By the end of this guide, you will have a production-ready setup...

Watchlist Matched: bedrock, mcp

Cloudflare Blog · cloud · 2026-04-28

Shutdowns, power outages, and conflict: a review of Q1 2026 Internet disruptions

Score 0

The first quarter of 2026 saw a surge in Internet disruptions, from nationwide shutdowns in Uganda and Iran to unprecedented drone strikes on cloud infrastructure. We explore the data behind these events using Cloudflare Radar.

Watchlist Matched: cloud

Cloudflare Blog · cloud · 2026-04-20

Building the agentic cloud: everything we launched during Agents Week 2026

Score 6

Agents Week 2026 is a wrap. Let’s take a look at everything we announced, from compute and security to the agent toolbox, platform tools, and the emerging agentic web. Everything we shipped for the agentic cloud.