AWS Machine Learning Blog · cloud · 2026-06-03
Score 11
Fine-tuning for domain-specific tasks means improving performance in one area without degrading the model’s general capabilities, and getting that balance right is harder than it looks. This post walks through how to navigate that balance,...
High signal Matched: performance, model, training, checkpointing, fine-tuning
LMCache · open-source · 2026-05-27
Score 11
A collaboration story about LMCache multiprocess mode + MooncakeStore: from 0 to 1, from functional to optimized. 1. Before We Begin. Recently, the LMCache community and the Mooncake community carried out a series of valuable open-source c...
High signal Matched: lmcache, adapter, open-source, open source
NVIDIA Technical Blog · hardware · 2026-05-12
Score 16
The path from a trained AI model to production should be smooth, but rarely is. Many teams invest weeks fine-tuning models, only to discover that exporting to a...
High signal Matched: serving, model, fine-tuning
BAIR · research · 2026-05-08
Score 28
No feed summary available yet.
High signal Matched: inference, decoding, prefill, generation, serve, throughput, kv cache, verification, performance, latency, cost, model, paper, research, evaluation, training, pretraining, sft, benchmarks, long context, context window, agentic, reasoning model
Nota AI · korea · 2026-04-29
Score 32
Hancheol Park, Ph.D., AI Research Engineer, NetsPresso Tech, Nota AI; Geonmin Kim, Ph.D., AI Research Engineer, NetsPresso Tech, Nota AI; Geonho Lee, Edge AI Engineer Intern, NetsPresso Tech, Nota AI; Jaehoon Lee, Technical Content Manager,...
High signal Matched: generation, moe, performance, model, weights, paper, research, evaluation, korea, korean, seoul, naver, training, fine-tuning, quantization, agent, agents, agentic
Nota AI · korea · 2026-03-31
Score 46
Jaehoon Lee, Technical Content Manager, Nota AI. In March, a single official announcement from Google Research rocked trillions of won in the market capitalization of U.S. infrastructure and semiconductor stocks. The catalyst:...
High signal Matched: inference, serving, generation, throughput, kv cache, benchmark, performance, cost, b200, blackwell, introducing, model, fp8, research, training, fine-tuning, quantization, quantized, agent, agentic, frontier model
Together AI · inference-infra · 2026-03-18
Score 14
Together AI expands fine-tuning with native support for tool call, reasoning, and vision-language models, plus 100B+ model training, up to 6× higher throughput, and job cost and ETA estimates.
High signal Matched: throughput, cost, model, training, fine-tuning
AIBrix · open-source · 2026-03-03
Score 28
🚀 AIBrix v0.6.0 Release Today we’re excited to announce AIBrix v0.6.0, a release that expands how you deploy and route inference traffic. Key highlights include: Envoy Sidecar Support – Run Envoy alongside the gateway-plugin without...
High signal Matched: inference, prefill, release, model, lora, rerank, api, openai-compatible
llm-d · open-source · 2026-02-04
Score 16
llm-d v0.5 introduces hierarchical KV-cache offloading, LoRA-aware scheduling, UCCL networking, and scale-to-zero autoscaling for sustained inference performance at scale.
High signal Matched: inference, performance, lora
Together AI · inference-infra · 2026-02-02
Score 14
Fine-tuned open-source LLM judges can outperform GPT-5.2 at evaluating model outputs. Using Direct Preference Optimization on just 5,400 preference pairs, we trained GPT-OSS 120B to beat GPT-5.2 on human preference alignment—at 15x lower c...
High signal Matched: inference, cost, model, fine-tuning, evaluating, open-source, oss
SqueezeBits · korea · 2026-01-07
Score 12
A recap of the Intel® Gaudi® hands-on workshop co-hosted by SqueezeBits and Lablup. AI model compression, fine-tuning, and vLLM serving on Gaudi® hardware with Backend.AI.
High signal Matched: serving, model, fine-tuning
Together AI · inference-infra · 2025-08-19
Score 10
Customize OpenAI’s gpt-oss-20B/120B with Together AI’s fine-tuning: train, optimize, and instantly deploy domain experts with enterprise reliability and cost efficiency.
High signal Matched: cost, fine-tuning, oss
Together AI · inference-infra · 2025-08-15
Score 12
Parsed fine-tuned a 27B open-source model to beat Claude Sonnet 4 by 60% on a real-world healthcare task—while running 10–100x cheaper.
High signal Matched: model, fine-tuning, open-source
Hugging Face · open-source · 2025-07-23
Score 10
No feed summary available yet.
High signal Matched: inference, lora
SqueezeBits · korea · 2025-07-21
Score 20
LoRA excels at efficient fine-tuning but suffers at higher ranks due to gradient entanglement. We introduce GraLoRA, which addresses these issues through finer-grained, block-wise updates, significantly enhancing performance and expressivi...
High signal Matched: performance, cost, fine-tuning, lora
Nota AI · korea · 2025-07-10
Score 20
Marcel Simon, Ph.D., ML Researcher, Nota AI GmbH; Tae-Ho Kim, CTO & Co-Founder, Nota AI; Seul-Ki Yeom, Ph.D., Research Lead, Nota AI GmbH. Summary: Proposes a simple next-frame prediction task using unlabeled video to enhance sing...
High signal Matched: inference, performance, model, paper, research, training, fine-tuning, benchmarks
BAIR · research · 2025-04-11
Score 10
Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated applications. However, as LLMs have improved, so have the attacks against them. Prompt injection is listed by OWASP as the #1 threat to LLM-integrated ap...
High signal Matched: cost, model, evaluation, training, dpo, fine-tuning, retrieval, api, sota
Nota AI · korea · 2025-02-25
Score 18
Hancheol Park, Ph.D., AI Research Engineer, Nota AI; Geonmin Kim, Ph.D., AI Research Engineer, Nota AI; Jaeyeon Kim, AI Research Engineer, Nota AI. Summary: In this study, we propose a method for determining whether given multilingual...
High signal Matched: generation, performance, model, paper, research, training, fine-tuning
SqueezeBits · korea · 2024-12-05
Score 14
This article provides a comparative analysis of multi-LoRA serving capabilities of vLLM and TensorRT-LLM frameworks.
High signal Matched: serving, lora
Hugging Face · open-source · 2024-11-04
Score 10
No feed summary available yet.
High signal Matched: evaluation, fine-tuning
Nota AI · korea · 2024-08-02
Score 38
Jaeyeon Kim, Research Engineer, Nota AI; Geonmin Kim, Research Engineer, Nota AI; Hancheol Park, Team Lead of NetsPresso Application, Nota AI. Introduction: Recent large language models (LLMs) have demonstrated unprecedented performance...
High signal Matched: decoding, benchmark, performance, latency, tokens/sec, model, arxiv, research, technical report, evaluation, cloud, training, lora, benchmarks, leaderboard, open-source
Hugging Face · open-source · 2024-07-25
Score 10
No feed summary available yet.
High signal Matched: evaluation, fine-tuning
Hugging Face · open-source · 2024-07-18
Score 10
No feed summary available yet.
High signal Matched: serve, lora
Modal · inference-infra · 2024-05-21
Score 8
How we fine-tuned a Stable Diffusion model on the Heroicons library to generate all the icons we could dream of.
High signal Matched: model, fine-tuning
Hugging Face · open-source · 2023-12-05
Score 10
No feed summary available yet.
High signal Matched: inference, lora
Hugging Face · open-source · 2023-11-07
Score 10
No feed summary available yet.
High signal Matched: performance, lora
SkyPilot · open-source · 2023-08-02
Score 10
An operational guide on finetuning Llama 2, ready for commercial use.
High signal Matched: cloud, finetuning
Replicate · inference-infra · 2023-03-23
Score 8
No feed summary available yet.
High signal Matched: model, lora
Hugging Face · open-source · 2023-03-09
Score 10
No feed summary available yet.
High signal Matched: gpu, rlhf, fine-tuning
Replicate · inference-infra · 2023-02-07
Score 10
It's like DreamBooth, but much faster. And you can run it in the cloud on Replicate.
High signal Matched: introducing, cloud, lora
Hugging Face · open-source · 2021-11-19
Score 10
No feed summary available yet.
High signal Matched: distributed, fine-tuning
TensorRT-LLM · open-source · 2026-06-03
Score 6
No feed summary available yet.
Watchlist Matched: lora
Fireworks AI · inference-infra · 2026-06-03
Score 6
No feed summary available yet.
Watchlist Matched: fine-tuning
Together AI · inference-infra · 2026-04-30
Score 3
Together AI and Adaption partner to bring Together Fine-Tuning natively into Adaptive Data, helping teams optimize datasets, run fine-tuning, evaluate results, and deploy stronger open models.
Watchlist Matched: fine-tuning, evaluate
Hugging Face · open-source · 2026-04-16
Score 1
No feed summary available yet.
Watchlist Matched: training, finetuning
AI2 · research · 2026-03-11
Score 6
MolmoBot is an open robotic manipulation model suite trained entirely in simulation—demonstrating zero-shot transfer to real-world robots without any real-world data collection or fine-tuning.
Watchlist Matched: model, training, fine-tuning
Hugging Face · open-source · 2025-11-21
Score 1
No feed summary available yet.
Watchlist Matched: fine-tuning
Together AI · inference-infra · 2025-09-10
Score 3
Together AI expands Fine-Tuning Platform: train 100B+ models, extend context lengths, integrate with Hugging Face Hub, and access new DPO options.
Watchlist Matched: dpo, fine-tuning
Hugging Face · open-source · 2025-07-01
Score 1
No feed summary available yet.
Watchlist Matched: training, finetuning
Hugging Face · open-source · 2025-06-19
Score 1
No feed summary available yet.
Watchlist Matched: fine-tuning, lora
Together AI · inference-infra · 2025-05-29
Score 3
No feed summary available yet.
Watchlist Matched: fine-tuning
Hugging Face · open-source · 2025-04-23
Score 1
No feed summary available yet.
Watchlist Matched: finetuning
Together AI · inference-infra · 2025-04-17
Score 3
No feed summary available yet.
Watchlist Matched: training, fine-tuning
Replicate · inference-infra · 2025-03-28
Score 0
We take a quick look at the latest creative models, experiments, and community projects.
Watchlist Matched: lora
Hugging Face · open-source · 2025-03-26
Score 1
No feed summary available yet.
Watchlist Matched: training, finetuning
Modal · inference-infra · 2024-12-10
Score 1
An intro to fine-tuning large language models in 2025
Watchlist Matched: fine-tuning
Hugging Face · open-source · 2024-09-18
Score 1
No feed summary available yet.
Watchlist Matched: fine-tuning, quantization
Replicate · inference-infra · 2024-08-15
Score 6
We've added fine-tuning (LoRA) support to FLUX.1 image generation models. You can train FLUX.1 on your own images with one line of code using Replicate's API.
Watchlist Matched: generation, fine-tuning, lora, api
Hugging Face · open-source · 2024-06-24
Score 1
No feed summary available yet.
Watchlist Matched: fine-tuning
Hugging Face · open-source · 2024-05-28
Score 1
No feed summary available yet.
Watchlist Matched: training, finetuning
Hugging Face · open-source · 2024-02-23
Score 1
No feed summary available yet.
Watchlist Matched: fine-tuning
Hugging Face · open-source · 2024-01-10
Score 1
No feed summary available yet.
Watchlist Matched: fine-tuning
Hugging Face · open-source · 2024-01-02
Score 1
No feed summary available yet.
Watchlist Matched: training, lora
Modal · inference-infra · 2023-12-20
Score 1
An operational guide to fine-tuning an LLM on any dataset in minutes (ft. CodeLlama, Llama 2, Mistral, and more)
Watchlist Matched: fine-tuning
Replicate · inference-infra · 2023-12-06
Score 0
We’ve added fine-tuning for realistic voice cloning (RVC). You can train RVC on your own dataset from a YouTube video with a few lines of code using Replicate's API.
Watchlist Matched: fine-tuning, api, open-source
Replicate · inference-infra · 2023-10-13
Score 0
We’ve added fine-tuning support to MusicGen. You can train the small, medium and melody models on your own audio files using Replicate.
Watchlist Matched: fine-tuning
Hugging Face · open-source · 2023-09-13
Score 1
No feed summary available yet.
Watchlist Matched: fine-tuning
Replicate · inference-infra · 2023-08-22
Score 6
With the recent release of Stable Diffusion XL fine-tuning on Replicate, and today being the 1-year anniversary of Stable Diffusion, now feels like the perfect opportunity to take a step back and reflect on how text-to-image AI has improve...
Watchlist Matched: release, fine-tuning
Replicate · inference-infra · 2023-08-08
Score 0
We’ve added fine-tuning (Dreambooth, Textual Inversion and LoRA) support to SDXL 1.0. You can train SDXL on your own images with one line of code using the Replicate API.
Watchlist Matched: fine-tuning, lora, api
Hugging Face · open-source · 2023-07-14
Score 1
No feed summary available yet.
Watchlist Matched: fine-tuning
Hugging Face · open-source · 2023-06-19
Score 1
No feed summary available yet.
Watchlist Matched: adapter
Hugging Face · open-source · 2023-05-24
Score 1
No feed summary available yet.
Watchlist Matched: qlora, quantization
Hugging Face · open-source · 2023-02-10
Score 1
No feed summary available yet.
Watchlist Matched: fine-tuning
Hugging Face · open-source · 2023-01-26
Score 1
No feed summary available yet.
Watchlist Matched: fine-tuning, lora
Hugging Face · open-source · 2021-10-13
Score 1
No feed summary available yet.
Watchlist Matched: fine tuning