fine-tuning - MLSys Blogs

AWS Machine Learning Blog · cloud · 2026-06-03

The art and science of hyperparameter optimization on Amazon Nova Forge

Score 11

Fine-tuning for domain-specific tasks means improving performance in one area without degrading the model’s general capabilities, and getting that balance right is harder than it looks. This post walks through how to navigate that balance,...

benchmark model-release training fine-tuning

Open

High signal Matched: performance, model, training, checkpointing, fine-tuning

LMCache · open-source · 2026-05-27

When Open Source Meets Open Source: A Joint Effort Between LMCache and Mooncake

Score 11

A collaboration story about LMCache multiprocess mode + MooncakeStore: from 0 to 1, from functional to optimized. 1. Before We Begin: Recently, the LMCache community and the Mooncake community carried out a series of valuable open-source c...

kv-cache fine-tuning open-source

Open

High signal Matched: lmcache, adapter, open-source, open source

NVIDIA Technical Blog · hardware · 2026-05-12

How to Eliminate Pipeline Friction in AI Model Serving

Score 16

The path from a trained AI model to production should be smooth, but rarely is. Many teams invest weeks fine-tuning models, only to discover that exporting to a...

inference serving model-release fine-tuning

Open

High signal Matched: serving, model, fine-tuning

BAIR · research · 2026-05-08

Adaptive Parallel Reasoning: The Next Paradigm in Efficient Inference Scaling

Score 28

inference serving kv-cache speculative-decoding benchmark model-release research training fine-tuning evals long-context agents frontier-model

Open

High signal Matched: inference, decoding, prefill, generation, serve, throughput, kv cache, verification, performance, latency, cost, model, paper, research, evaluation, training, pretraining, sft, benchmarks, long context, context window, agentic, reasoning model

Nota AI · korea · 2026-04-29

[NVIDIA Nemotron Hackathon] Grand Prize Among 20 Teams: Behind Two Sleepless Days

Score 32

Hancheol Park, Ph.D., AI Research Engineer, NetsPresso Tech, Nota AI; Geonmin Kim, Ph.D., AI Research Engineer, NetsPresso Tech, Nota AI; Geonho Lee, Edge AI Engineer Intern, NetsPresso Tech, Nota AI; Jaehoon Lee, Technical Content Manager,...

inference moe benchmark model-release research korea training fine-tuning quantization evals agents

Open

High signal Matched: generation, moe, performance, model, weights, paper, research, evaluation, korea, korean, seoul, naver, training, fine-tuning, quantization, agent, agents, agentic

Nota AI · korea · 2026-03-31

The Real Reason TurboQuant Shook the Market: AI Optimization Has Gone Mainstream

Score 46

Jaehoon Lee, Technical Content Manager, Nota AI. In March, a single official announcement from Google Research rocked trillions of won in the market capitalization of U.S. infrastructure and semiconductor stocks. The catalyst:...

inference serving kv-cache benchmark hardware model-release research training fine-tuning quantization agents frontier-model

Open

High signal Matched: inference, serving, generation, throughput, kv cache, benchmark, performance, cost, b200, blackwell, introducing, model, fp8, research, training, fine-tuning, quantization, quantized, agent, agentic, frontier model

Together AI · inference-infra · 2026-03-18

Together AI expands fine-tuning service with tool calling, reasoning, and vision support

Score 14

Together AI expands fine-tuning with native support for tool-calling, reasoning, and vision-language models, plus 100B+ model training, up to 6× higher throughput, and job cost and ETA estimates.

serving benchmark model-release training fine-tuning

Open

High signal Matched: throughput, cost, model, training, fine-tuning

AIBrix · open-source · 2026-03-03

AIBrix v0.6.0 Release: Envoy Sidecar, Mixed LLM Workloads Routing, Routing Profiles, LoRA Delivery & New APIs

Score 28

🚀 AIBrix v0.6.0 Release. Today we’re excited to announce AIBrix v0.6.0, a release that expands how you deploy and route inference traffic. Key highlights include: Envoy Sidecar Support – Run Envoy alongside the gateway-plugin without...

inference model-release fine-tuning rag api

Open

High signal Matched: inference, prefill, release, model, lora, rerank, api, openai-compatible

llm-d · open-source · 2026-02-04

llm-d 0.5: Sustaining Performance at Scale

Score 16

llm-d v0.5 introduces hierarchical KV-cache offloading, LoRA-aware scheduling, UCCL networking, and scale-to-zero autoscaling for sustained inference performance at scale.

inference benchmark fine-tuning

Open

High signal Matched: inference, performance, lora

Together AI · inference-infra · 2026-02-02

Fine-tuning open LLM judges to outperform GPT-5.2

Score 14

Fine-tuned open-source LLM judges can outperform GPT-5.2 at evaluating model outputs. Using Direct Preference Optimization on just 5,400 preference pairs, we trained GPT-OSS 120B to beat GPT-5.2 on human preference alignment—at 15x lower c...

inference benchmark model-release fine-tuning evals open-source

Open

High signal Matched: inference, cost, model, fine-tuning, evaluating, open-source, oss

SqueezeBits · korea · 2026-01-07

Intel® Gaudi® Hands-on Workshop | A Recap of the Gaudi Workshop with SqueezeBits x Lablup

Score 12

A recap of the Intel® Gaudi® hands-on workshop co-hosted by SqueezeBits and Lablup. AI model compression, fine-tuning, and vLLM serving on Gaudi® hardware with Backend.AI.

inference serving model-release fine-tuning

Open

High signal Matched: serving, model, fine-tuning

Together AI · inference-infra · 2025-08-19

Transform OpenAI gpt-oss Models into Domain Experts with Together AI Fine-Tuning

Score 10

Customize OpenAI’s gpt-oss-20B/120B with Together AI’s fine-tuning: train, optimize, and instantly deploy domain experts with enterprise reliability and cost efficiency.

benchmark fine-tuning open-source

Open

High signal Matched: cost, fine-tuning, oss

Together AI · inference-infra · 2025-08-15

Fine-Tuning Small Open-Source LLMs to Outperform Large Closed-Source Models by 60% on Specialized Tasks

Score 12

Parsed fine-tuned a 27B open-source model to beat Claude Sonnet 4 by 60% on a real-world healthcare task—while running 10–100x cheaper.

model-release fine-tuning open-source

Open

High signal Matched: model, fine-tuning, open-source

Hugging Face · open-source · 2025-07-23

Fast LoRA inference for Flux with Diffusers and PEFT

Score 10

No feed summary available yet.

inference fine-tuning

Open

High signal Matched: inference, lora

SqueezeBits · korea · 2025-07-21

GraLoRA: Boosting Fine-Tuning Accuracy Without Extra Cost

Score 20

LoRA excels at efficient fine-tuning but suffers at higher ranks due to gradient entanglement. We introduce GraLoRA, which addresses these issues through finer-grained, block-wise updates, significantly enhancing performance and expressivi...

benchmark fine-tuning

Open

High signal Matched: performance, cost, fine-tuning, lora

Nota AI · korea · 2025-07-10

Video Self-Distillation for Single-Image Encoders: Learning Temporal Priors from Unlabeled Video

Score 20

Marcel Simon, Ph.D., ML Researcher, Nota AI GmbH; Tae-Ho Kim, CTO & Co-Founder, Nota AI; Seul-Ki Yeom, Ph.D., Research Lead, Nota AI GmbH. Summary: Proposes a simple next-frame prediction task using unlabeled video to enhance sing...

inference benchmark model-release research training fine-tuning evals

Open

High signal Matched: inference, performance, model, paper, research, training, fine-tuning, benchmarks

BAIR · research · 2025-04-11

Defending against Prompt Injection with Structured Queries (StruQ) and Preference Optimization (SecAlign)

Score 10

Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated applications. However, as LLMs have improved, so have the attacks against them. Prompt injection is listed by OWASP as the #1 threat to LLM-integrated ap...

benchmark model-release research training fine-tuning evals rag api frontier-model

Open

High signal Matched: cost, model, evaluation, training, dpo, fine-tuning, retrieval, api, sota

Nota AI · korea · 2025-02-25

A Study on Detecting LLM-Generated Multilingual Content

Score 18

Hancheol Park, Ph.D., AI Research Engineer, Nota AI; Geonmin Kim, Ph.D., AI Research Engineer, Nota AI; Jaeyeon Kim, AI Research Engineer, Nota AI. Summary: In this study, we propose a method for determining whether given multilingual...

inference benchmark model-release research training fine-tuning

Open

High signal Matched: generation, performance, model, paper, research, training, fine-tuning

SqueezeBits · korea · 2024-12-05

[vLLM vs TensorRT-LLM] #10 Serving Multiple LoRAs at Once

Score 14

This article provides a comparative analysis of multi-LoRA serving capabilities of vLLM and TensorRT-LLM frameworks.

inference serving fine-tuning

Open

High signal Matched: serving, lora

Hugging Face · open-source · 2024-11-04

Argilla 2.4: Easily Build Fine-Tuning and Evaluation Datasets on the Hub — No Code Required

Score 10

No feed summary available yet.

research fine-tuning evals

Open

High signal Matched: evaluation, fine-tuning

Nota AI · korea · 2024-08-02

Deploying an Efficient Vision-Language Model on Mobile Devices

Score 38

Jaeyeon Kim, Research Engineer, Nota AI; Geonmin Kim, Research Engineer, Nota AI; Hancheol Park, Team Lead of NetsPresso Application, Nota AI. Introduction: Recent large language models (LLMs) have demonstrated unprecedented performance...

inference benchmark model-release research cloud training fine-tuning evals open-source

Open

High signal Matched: decoding, benchmark, performance, latency, tokens/sec, model, arxiv, research, technical report, evaluation, cloud, training, lora, benchmarks, leaderboard, open-source

Hugging Face · open-source · 2024-07-25

LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning?

Score 10

No feed summary available yet.

research fine-tuning evals

Open

High signal Matched: evaluation, fine-tuning

Hugging Face · open-source · 2024-07-18

TGI Multi-LoRA: Deploy Once, Serve 30 Models

Score 10

No feed summary available yet.

serving fine-tuning

Open

High signal Matched: serve, lora

Modal · inference-infra · 2024-05-21

Create an infinite icon library by fine-tuning Stable Diffusion

Score 8

How we fine-tuned a Stable Diffusion model on the Heroicons library to generate all the icons we could dream of.

model-release fine-tuning

Open

High signal Matched: model, fine-tuning

Hugging Face · open-source · 2023-12-05

Goodbye cold boot - how we made LoRA Inference 300% faster

Score 10

No feed summary available yet.

inference fine-tuning

Open

High signal Matched: inference, lora

Hugging Face · open-source · 2023-11-07

Comparing the Performance of LLMs: A Deep Dive into Roberta, Llama 2, and Mistral for Disaster Tweets Analysis with Lora

Score 10

No feed summary available yet.

benchmark fine-tuning

Open

High signal Matched: performance, lora

SkyPilot · open-source · 2023-08-02

Finetuning Llama 2 in your own cloud environment, privately

Score 10

An operational guide on finetuning Llama 2, ready for commercial use.

cloud fine-tuning

Open

High signal Matched: cloud, finetuning

Replicate · inference-infra · 2023-03-23

How to use Alpaca-LoRA to fine-tune a model like ChatGPT

Score 8

No feed summary available yet.

model-release fine-tuning

Open

High signal Matched: model, lora

Hugging Face · open-source · 2023-03-09

Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU

Score 10

No feed summary available yet.

hardware training fine-tuning

Open

High signal Matched: gpu, rlhf, fine-tuning

Replicate · inference-infra · 2023-02-07

Introducing LoRA: A faster way to fine-tune Stable Diffusion

Score 10

It's like DreamBooth, but much faster. And you can run it in the cloud on Replicate.

model-release cloud fine-tuning

Open

High signal Matched: introducing, cloud, lora

Hugging Face · open-source · 2021-11-19

Accelerating PyTorch distributed fine-tuning with Intel technologies

Score 10

No feed summary available yet.

distributed fine-tuning

Open

High signal Matched: distributed, fine-tuning

TensorRT-LLM · open-source · 2026-06-03

Generate text with multiple LoRA adapters

Score 6

No feed summary available yet.

fine-tuning

Open

Watchlist Matched: lora

Fireworks AI · inference-infra · 2026-06-03

The Fine-Tuning Bottleneck Isn't the Algorithm

Score 6

No feed summary available yet.

fine-tuning

Open

Watchlist Matched: fine-tuning

Together AI · inference-infra · 2026-04-30

Announcing Together AI and Adaption Partnership

Score 3

Together AI and Adaption partner to bring Together Fine-Tuning natively into Adaptive Data, helping teams optimize datasets, run fine-tuning, evaluate results, and deploy stronger open models.

fine-tuning evals

Open

Watchlist Matched: fine-tuning, evaluate

Hugging Face · open-source · 2026-04-16

Training and Finetuning Multimodal Embedding & Reranker Models with Sentence Transformers

Score 1

No feed summary available yet.

training fine-tuning

Open

Watchlist Matched: training, finetuning

AI2 · research · 2026-03-11

MolmoBot: Training robot manipulation entirely in simulation

Score 6

MolmoBot is an open robotic manipulation model suite trained entirely in simulation—demonstrating zero-shot transfer to real-world robots without any real-world data collection or fine-tuning.

model-release training fine-tuning

Open

Watchlist Matched: model, training, fine-tuning

Hugging Face · open-source · 2025-11-21

20x Faster TRL Fine-tuning with RapidFire AI

Score 1

No feed summary available yet.

fine-tuning

Open

Watchlist Matched: fine-tuning

Together AI · inference-infra · 2025-09-10

Fine-Tuning Platform Upgrades: Larger Models, Longer Contexts, Enhanced Hugging Face Integrations

Score 3

Together AI expands Fine-Tuning Platform: train 100B+ models, extend context lengths, integrate with Hugging Face Hub, and access new DPO options.

training fine-tuning

Open

Watchlist Matched: dpo, fine-tuning

Hugging Face · open-source · 2025-07-01

Training and Finetuning Sparse Embedding Models with Sentence Transformers

Score 1

No feed summary available yet.

training fine-tuning

Open

Watchlist Matched: training, finetuning

Hugging Face · open-source · 2025-06-19

(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware

Score 1

No feed summary available yet.

fine-tuning

Open

Watchlist Matched: fine-tuning, lora

Together AI · inference-infra · 2025-05-29

FLUX.1 Kontext models: Character consistency and precise image editing without fine-tuning

Score 3

No feed summary available yet.

fine-tuning

Open

Watchlist Matched: fine-tuning

Hugging Face · open-source · 2025-04-23

Finetuning olmOCR to be a faithful OCR-Engine

Score 1

No feed summary available yet.

fine-tuning

Open

Watchlist Matched: finetuning

Together AI · inference-infra · 2025-04-17

Together Fine-Tuning Platform, Now With Preference Optimization and Continued Training

Score 3

No feed summary available yet.

training fine-tuning

Open

Watchlist Matched: training, fine-tuning

Replicate · inference-infra · 2025-03-28

Creative roundup: avatars, lightsabers, and LoRA tricks

Score 0

We take a quick look at the latest creative models, experiments, and community projects.

fine-tuning

Open

Watchlist Matched: lora

Hugging Face · open-source · 2025-03-26

Training and Finetuning Reranker Models with Sentence Transformers

Score 1

No feed summary available yet.

training fine-tuning

Open

Watchlist Matched: training, finetuning

Modal · inference-infra · 2024-12-10

What is LLM fine-tuning?

Score 1

An intro to fine-tuning large language models in 2025

fine-tuning

Open

Watchlist Matched: fine-tuning

Hugging Face · open-source · 2024-09-18

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Score 1

No feed summary available yet.

fine-tuning quantization

Open

Watchlist Matched: fine-tuning, quantization

Replicate · inference-infra · 2024-08-15

Fine-tune FLUX.1 with your own images

Score 6

We've added fine-tuning (LoRA) support to FLUX.1 image generation models. You can train FLUX.1 on your own images with one line of code using Replicate's API.

inference fine-tuning api

Open

Watchlist Matched: generation, fine-tuning, lora, api

Hugging Face · open-source · 2024-06-24

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

Score 1

No feed summary available yet.

fine-tuning

Open

Watchlist Matched: fine-tuning

Hugging Face · open-source · 2024-05-28

Training and Finetuning Embedding Models with Sentence Transformers

Score 1

No feed summary available yet.

training fine-tuning

Open

Watchlist Matched: training, finetuning

Hugging Face · open-source · 2024-02-23

Fine-Tuning Gemma Models in Hugging Face

Score 1

No feed summary available yet.

fine-tuning

Open

Watchlist Matched: fine-tuning

Hugging Face · open-source · 2024-01-10

Make LLM Fine-tuning 2x faster with Unsloth and 🤗 TRL

Score 1

No feed summary available yet.

fine-tuning

Open

Watchlist Matched: fine-tuning

Hugging Face · open-source · 2024-01-02

LoRA training scripts of the world, unite!

Score 1

No feed summary available yet.

training fine-tuning

Open

Watchlist Matched: training, lora

Modal · inference-infra · 2023-12-20

How to fine-tune an LLM on Modal

Score 1

An operational guide to fine-tuning an LLM on any dataset in minutes (ft. CodeLlama, Llama 2, Mistral, and more)

fine-tuning

Open

Watchlist Matched: fine-tuning

Replicate · inference-infra · 2023-12-06

Clone your voice using open-source models

Score 0

We’ve added fine-tuning for realistic voice cloning (RVC). You can train RVC on your own dataset from a YouTube video with a few lines of code using Replicate's API.

fine-tuning api open-source

Open

Watchlist Matched: fine-tuning, api, open-source

Replicate · inference-infra · 2023-10-13

Fine-tune MusicGen to generate music in any style

Score 0

We’ve added fine-tuning support to MusicGen. You can train the small, medium and melody models on your own audio files using Replicate.

fine-tuning

Open

Watchlist Matched: fine-tuning

Hugging Face · open-source · 2023-09-13

Fine-tuning Llama 2 70B using PyTorch FSDP

Score 1

No feed summary available yet.

fine-tuning

Open

Watchlist Matched: fine-tuning

Replicate · inference-infra · 2023-08-22

Painting with words: a history of text-to-image AI

Score 6

With the recent release of Stable Diffusion XL fine-tuning on Replicate, and today being the 1-year anniversary of Stable Diffusion, now feels like the perfect opportunity to take a step back and reflect on how text-to-image AI has improve...

model-release fine-tuning

Open

Watchlist Matched: release, fine-tuning

Replicate · inference-infra · 2023-08-08

Fine-tune SDXL with your own images

Score 0

We’ve added fine-tuning (Dreambooth, Textual Inversion and LoRA) support to SDXL 1.0. You can train SDXL on your own images with one line of code using the Replicate API.

fine-tuning api

Open

Watchlist Matched: fine-tuning, lora, api

Hugging Face · open-source · 2023-07-14