Moreh · korea · 2026-06-03
Moreh vLLM Performance Evaluation: DeepSeek V3/R1 671B on AMD Instinct MI300X GPUs
No feed summary available yet.
High signal Matched: performance, mi300x, evaluation
Moreh · korea · 2026-06-03
No feed summary available yet.
High signal Matched: performance, mi300x, evaluation
Moreh · korea · 2026-06-03
No feed summary available yet.
High signal Matched: performance, mi300x, evaluation
Perplexity Research · model-lab · 2026-06-03
No feed summary available yet.
High signal Matched: research
Prime Intellect · inference-infra · 2026-06-03
No feed summary available yet.
High signal Matched: research
OpenAI · model-lab · 2026-06-03
No feed summary available yet.
High signal Matched: research
Stanford CRFM · research · 2026-06-03
No feed summary available yet.
High signal Matched: model, evaluation
Anthropic · model-lab · 2026-06-03
No feed summary available yet.
High signal Matched: research
Cohere · model-lab · 2026-06-03
No feed summary available yet.
High signal Matched: research
LG AI Research · korea · 2026-06-03
No feed summary available yet.
High signal Matched: research
Databricks AI · big-tech · 2026-06-03
No feed summary available yet.
High signal Matched: research
NVIDIA Technical Blog · hardware · 2026-06-02
AI agents are a powerful tool for synthesizing data to accelerate research, summarize information, and help teams make decisions faster. But combining internal...
High signal Matched: research, agent, agents
AWS Machine Learning Blog · cloud · 2026-06-02
In this post, we walk through how to use Amazon Quick Research to integrate biomedical data sources for rare cancer research. The walkthrough uses pediatric sarcoma as the research domain and draws on publicly available datasets from PubMe...
High signal Matched: research
vLLM Project · open-source · 2026-06-01
A technical deep dive on running vLLM on NVIDIA DGX Spark and GB10 systems, covering sm_121 architecture, unified memory behavior, NVFP4 model serving, Nemotron-3-Super configuration, Docker deployment, Prometheus metrics, and local evalua...
High signal Matched: serving, model, evaluation
Nota AI · korea · 2026-05-29
Jaehoon Lee Technical Content Manager, Nota AI When enterprises adopt AI, the most common bottleneck is not model development. It is the deployment stage: getting a finished model to run reliably on the actual target device.T...
High signal Matched: inference, throughput, benchmark, performance, latency, cost, gpu, model, evaluation, quantization, int8, benchmarks, leaderboard
Google Research · big-tech · 2026-05-29
General Science
High signal Matched: research
AWS Machine Learning Blog · cloud · 2026-05-29
This post combines learnings from LangChain’s work on evaluating deep agents and Anthropic’s guide to demystifying evals for AI agents into a practical guide. In this post, you will learn how to: 1) apply five evaluation patterns for deep...
High signal Matched: evaluation, bedrock, evals, evaluating, agent, agents
AWS Machine Learning Blog · cloud · 2026-05-29
Datasets in AgentCore is in public preview. Agent evaluation is most powerful when you combine fast-moving online signals with stable offline baselines. To understand whether your agent is truly improving over time, you need a fixed benchm...
High signal Matched: benchmark, evaluation, bedrock, agent
vLLM Project · open-source · 2026-05-26
The EAGLE series — including EAGLE 1, EAGLE 2, and EAGLE 3 — has become one of the most widely adopted and practically deployed families of speculative decoding algorithms across both research and...
High signal Matched: decoding, speculative decoding, eagle, research
NVIDIA Technical Blog · hardware · 2026-05-20
Agent harnesses like Claude Code, Codex, and LangChain Deep Agents are excellent orchestrators. They manage sessions, chain tools, execute code, and respond to...
High signal Matched: research, agent, agents
Google Research · big-tech · 2026-05-20
General Science
High signal Matched: research
NVIDIA Technical Blog · hardware · 2026-05-19
Evaluating an AI model and evaluating an AI agent are related—but they answer fundamentally different questions. A model benchmark tests the capability of a...
High signal Matched: benchmark, model, evaluation, evaluating, agent, agentic
Microsoft Research · big-tech · 2026-05-16
Our recent paper, “LLMs Corrupt Your Documents When You Delegate”, has generated discussion about the reliability of AI systems in delegated workflows. We appreciate the interest in this work and want to clarify several important points ab...
High signal Matched: paper, research, evaluation
Together AI · inference-infra · 2026-05-15
Together AI partners with Pearl Research Labs to launch a discounted Pearl-powered inference endpoint for Gemma-4-31B-it-pearl, using Proof of Useful Work to turn AI workloads into crypto emissions.
High signal Matched: inference, endpoint, cost, launch, research
Microsoft Research · big-tech · 2026-05-14
mimalloc is an open-source, modern, scalable memory allocator that is a drop-in replacement for malloc and free. It is relatively small (~12K lines), with clear internal data structures, and is easy to build and integrate into other projec...
High signal Matched: performance, research, open-source
Microsoft Research · big-tech · 2026-05-14
Introducing GridSFM, a small foundation model that can predict AC optimal power flow in milliseconds, boosting efficiency and unlocking cost savings. Learn how GridSFM gives grid operators direct visibility into congestion, stability, and...
High signal Matched: cost, introducing, model, research
Microsoft Research · big-tech · 2026-05-12
MatterSim is expanding what AI can do for materials science—from faster large-scale simulations to MatterSim-MT, a new multi-task model for simulating properties beyond potential energy surfaces alone. The post Advancing AI for materials w...
High signal Matched: model, research
Nota AI · korea · 2026-05-11
Jaehoon Lee Technical Content Manager, Nota AI NetsPresso® now embraces AI agents. An easy-to-use interface sits on top of the validated pipeline that handles everything from model compression to device deployment.When a user...
High signal Matched: inference, endpoint, kernel, verification, moe, benchmark, latency, cost, gpu, release, model, evaluation, quantization, quantized, int4, evaluate, benchmarks, swe-bench, mmlu, agent, agents, api
BAIR · research · 2026-05-08
.apr-fig { text-align: center; margin: 1.35em 0; line-height: 1.4; } .apr-fig--wide img { display: inline-block; width: 100%; max-width: 100%; height: auto; vertical-align: middle; } .apr-fig--wide-0-8 { max-width: 80%; margin-left: auto;...
High signal Matched: inference, decoding, prefill, generation, serve, throughput, kv cache, verification, performance, latency, cost, model, paper, research, evaluation, training, pretraining, sft, benchmarks, long context, context window, agentic, reasoning model
Together AI · inference-infra · 2026-05-04
As AI moves from research to production, the challenge for AI-native teams shifts from building models to running them — efficiently, reliably, and at scale.
High signal Matched: inference, research
Google Research · big-tech · 2026-04-30
Data Mining & Modeling
High signal Matched: research
Nota AI · korea · 2026-04-29
Hancheol Park, Ph. D.AI Research Engineer, NetsPresso Tech, Nota AI Geonmin Kim, Ph. D.AI Research Engineer, NetsPresso Tech, Nota AI Geonho LeeEdge AI Engineer Intern, NetsPresso Tech, Nota AI Jaehoon Lee Technical Content Manager,...
High signal Matched: generation, moe, performance, model, weights, paper, research, evaluation, korea, korean, seoul, naver, training, fine-tuning, quantization, agent, agents, agentic
Together AI · inference-infra · 2026-04-29
DeepSeek-V4 Pro is now available on Together AI with 512K context, controllable reasoning modes, and cached-input pricing for long-context reasoning workloads like code agents, document intelligence, and research synthesis.
High signal Matched: research, long-context, agents
NVIDIA Technical Blog · hardware · 2026-04-24
Federated learning (FL) is no longer a research curiosity—it’s a practical response to a hard constraint: the most valuable data is often the least movable....
High signal Matched: research
Nota AI · korea · 2026-04-22
Jaehoon Lee Technical Content Manager, Nota AI Series Notice: NetsPresso® Technical Blog, Part 2In Part 1, we walked through a scenario of deploying Llama 3.2 1B on an edge device to illustrate the NetsPresso® workflow. The f...
High signal Matched: inference, kernel, cuda, matmul, benchmark, performance, latency, cost, npu, model, weights, paper, research, evaluation, furiosa, training, quantization, int8, int4, awq, gptq, sdk, open-source
BAIR · research · 2026-04-20
.grasp-results-table table { font-size: 0.875rem; line-height: 1.35; width: 100%; } .grasp-results-table th, .grasp-results-table td { padding: 0.35rem 0.5rem; } /* Consistent whitespace between major sections (this post is long and hr-hea...
High signal Matched: performance, model, paper, arxiv, evaluation, training
Modal · inference-infra · 2026-04-14
Autoresearch automates AI research. Modal automates AI infrastructure.
High signal Matched: research, agents
SkyPilot · open-source · 2026-04-09
Coding agents working from code alone generate shallow hypotheses. Adding a research phase — arxiv papers, competing forks, other backends — produced 5 kernel fusions that made llama.cpp CPU inference 15% faster.
High signal Matched: inference, kernel, arxiv, research, agent, agents
Nota AI · korea · 2026-04-08
Jaehoon Lee Technical Content Manager, Nota AI AI Model Optimization: Why Models Won't Run on HardwareThe Chip Is Ready, but the Model Won't DeployIf you have ever tried deploying an AI model onto your own chip, the following...
High signal Matched: inference, multi-gpu, kv cache, verification, performance, latency, gpu, model, research, evaluation, quantization, quantized, awq, gptq, evaluate
Together AI · inference-infra · 2026-04-03
New research shows LLMs can optimize database query execution plans—achieving up to 4.78x speedups by correcting the cardinality estimation errors that statistical heuristics miss.
High signal Matched: research
Nota AI · korea · 2026-03-31
Jaehoon Lee Technical Content Manager, Nota AI In March, a single official announcement from Google Research rocked trillions of won in the market capitalization of U.S. infrastructure and semiconductor stocks. The catalyst:...
High signal Matched: inference, serving, generation, throughput, kv cache, benchmark, performance, cost, b200, blackwell, introducing, model, fp8, research, training, fine-tuning, quantization, quantized, agent, agentic, frontier model
Nota AI · korea · 2026-03-23
Jaehoon Lee Technical Content Manager, Nota AI GTC has evolved far beyond a technology conference, drawing attention from global economies and financial markets alike. This year, CEO Jensen Huang took the stage in his tradema...
High signal Matched: inference, prefill, generation, throughput, cuda, kv cache, performance, latency, cost, gpu, npu, launch, model, research, cloud, training, long-context, context window, agent, agents, agentic, open-source
Nota AI · korea · 2026-03-20
NP Product Team, Nota AI The role of Edge AI is rapidly expanding.Offline voice assistants now carry on conversations in our daily lives, vehicles infer routes in real time, and smartphones generate images without a network c...
High signal Matched: inference, kv cache, moe, benchmark, performance, latency, cost, model, research, seoul, quantization
Google Research · big-tech · 2026-03-18
Health & Bioscience
High signal Matched: research
Google Research · big-tech · 2026-03-17
Education Innovation
High signal Matched: research
Together AI · inference-infra · 2026-03-16
Together AI arrives at NVIDIA GTC 2026 with new launches in inference, agents, voice AI, and open models — plus technical sessions from its research and engineering leaders.
High signal Matched: inference, research, agents
Nota AI · korea · 2026-03-13
Hancheol Park, Ph. D. AI Research Engineer, Nota AI Tairen PiaoAI Research Engineer, Nota AI Tae-Ho KimCTO & Co-Founder, Nota AI ✔️ Resource : The official quantized model of Solar-Open-100B, which passed the first round of Sout...
High signal Matched: inference, serving, prefill, generation, throughput, moe, router, benchmark, performance, latency, ttft, tpot, blackwell, release, model, weights, open model, research, evaluation, korea, korean, upstage, training, post-training, quantization, quantized, int4, evaluate, benchmarks, mmlu, long-context
BAIR · research · 2026-03-13
--> Understanding the behavior of complex machine learning systems, particularly Large Language Models (LLMs), is a critical challenge in modern artificial intelligence. Interpretability research aims to make the decision-making process mo...
High signal Matched: inference, serving, decoding, performance, cost, model, research, training, evaluate, mmlu, long-context, rag
SqueezeBits · korea · 2026-03-11
Explore why Physical AI deployment needs synthetic data at scale with Squeezebits' research and discover how to overcome inference bottlenecks to accelerate Roboost Agent.
High signal Matched: inference, research, agent
Modular · inference-infra · 2026-03-06
Modverse #53: Community Builds, Research Milestones, and a Growing Ecosystem
High signal Matched: research
Together AI · inference-infra · 2026-03-05
At AI Native Conf, Together AI announced breakthroughs across kernels, RL, and inference optimization — including FlashAttention-4, ThunderAgent, and together.compile. Research that ships to production. That's the AI Native Cloud.
High signal Matched: inference, flashattention, research, cloud
vLLM Project · open-source · 2026-03-04
This article is adapted from a Red Hat hosted vLLM Office Hours session with Burkhard Ringlein from IBM Research, featuring a deep technical walkthrough of the vLLM Triton attention backend....
High signal Matched: triton, research
Together AI · inference-infra · 2026-03-02
We've refreshed our visual identity — designed with Pentagram to express how Together AI connects open-source innovation, systems research, and builders to unlock new possibilities.
High signal Matched: introducing, research, open-source
AI2 · research · 2026-02-27
The Asta Interaction Dataset (AID) contains real researcher queries revealing how scientists actually use AI-powered research tools, and where their habits diverge from what tool builders expect.
High signal Matched: research
Nota AI · korea · 2026-02-26
Jewon Lee | Wooksu Shin | Seungmin Yang | Ki-Ung Song | Donguk Lim | Jaeyeon Kim | Tae-Ho Kim | Bo-Kyeong KimEdgeFM Team, Nota AI ✔️ Resources for more information: GitHub, ArXiv, Project Page, Demo.✔️ Accepted at ICLR 2026. &...
High signal Matched: inference, generation, verification, benchmark, performance, latency, cost, model, arxiv, evaluation, training, post-training, benchmarks
Modal · inference-infra · 2026-02-25
Learn why researchers at Scaling Intelligence, Hazy Research, and other top labs are choosing Modal.
High signal Matched: research
Together AI · inference-infra · 2026-02-23
State-of-the-art speech models like Whisper and Deepgram score near-human on benchmarks — then fail 39% of the time on street names. New research from Together AI exposes the gap and a fix.
High signal Matched: research, benchmarks
Together AI · inference-infra · 2026-02-06
What do language models generate when you don't tell them what to generate? New research reveals that LLM families have distinct 'knowledge priors'—GPT models default to code and math, Llama favors narratives, DeepSeek generates religious...
High signal Matched: research
Hugging Face · open-source · 2026-01-27
No feed summary available yet.
High signal Matched: evaluation
Together AI · inference-infra · 2026-01-26
Introducing DSGym—a holisti evaluation and training framework for LLM-based data science agents. Features 90+ bioinformatics tasks, 92 Kaggle competitions, and synthetic trajectory generation. Our 4B model achieves state-of-the-art perform...
High signal Matched: generation, performance, introducing, model, evaluation, training, evaluating, agents, open-source
BAIR · research · 2026-01-10
An encoder (optical system) maps objects to noiseless images, which noise corrupts into measurements. Our information estimator uses only these noisy measurements and a noise model to quantify how well measurements distinguish objects. Man...
High signal Matched: performance, model, paper, evaluation, training, evaluate
Nota AI · korea · 2025-12-19
Seungmin YangEdgeFM Lead, Nota AI On this page ▾ SummaryWith the introduction of NVFP4—a new 4-bit floating point data type in NVIDIA’s Blackwell GPU architecture—LLM inference achieves markedly improved efficiency.Blackwell’s NVFP4...
High signal Matched: inference, serving, decoding, prefill, generation, token generation, throughput, kernel, gemm, cutlass, distributed, benchmark, performance, latency, ttft, tpot, tokens/sec, cost, gpu, blackwell, launch, model, weights, fp8, research, training, post-training, quantization, quantized, awq, benchmarks, mmlu, retrieval
Google Research · big-tech · 2025-12-19
Year in Review
High signal Matched: research
Hugging Face · open-source · 2025-12-17
No feed summary available yet.
High signal Matched: evaluation
Together AI · inference-infra · 2025-12-17
Dan Fu, our VP of Kernels, has published a new post challenging the idea that AI is hitting a hardware wall. He argues that we are vastly underutilizing current chips and that better software-hardware co-design will unlock the next order o...
High signal Matched: performance, research
Google Research · big-tech · 2025-12-16
Algorithms & Theory
High signal Matched: paper
Hugging Face · open-source · 2025-11-25
No feed summary available yet.
High signal Matched: research, state of the art
AIBrix · open-source · 2025-11-10
🚀 AIBrix v0.5.0 Release Today, we’re excited to announce AIBrix v0.5.0, a release that pushes AIBrix closer to a batteries-included control plane for modern LLM workloads. This release introduces an OpenAI-compatible Batch API for hi...
High signal Matched: prefill, latency, release, evaluation, api, openai-compatible
Google Research · big-tech · 2025-10-31
Climate & Sustainability
High signal Matched: research
Hugging Face · open-source · 2025-10-01
No feed summary available yet.
High signal Matched: introducing, evaluation, retrieval
Google Research · big-tech · 2025-10-01
Algorithms & Theory
High signal Matched: research
Google Research · big-tech · 2025-09-25
Generative AI
High signal Matched: research, agent
Google Research · big-tech · 2025-09-10
General Science
High signal Matched: research
BAIR · research · 2025-09-01
What exactly does word2vec learn, and how? Answering this question amounts to understanding representation learning in a minimal yet interesting language modeling task. Despite the fact that word2vec is a well-known precursor to modern lan...
High signal Matched: benchmark, performance, model, weights, paper, training
SqueezeBits · korea · 2025-08-20
Efficient AI Study & Meetup recap: SqueezeBits' community study on AI model compression, featuring paper reviews, participant interviews, and networking from the offline meetup.
High signal Matched: model, paper
Hugging Face · open-source · 2025-08-18
No feed summary available yet.
High signal Matched: research, mcp
Google Research · big-tech · 2025-08-07
General Science
High signal Matched: research
Nota AI · korea · 2025-07-10
Marcel Simon, Ph. D.ML Researcher, Nota AI GmbH Tae-Ho KimCTO & Co-Founder, Nota AI Seul-Ki Yeom, Ph. D.Research Lead, Nota AI GmbH SummaryProposes a simple next-frame prediction task using unlabeled video to enhance sing...
High signal Matched: inference, performance, model, paper, research, training, fine-tuning, benchmarks
Hugging Face · open-source · 2025-07-04
No feed summary available yet.
High signal Matched: evaluation, training
BAIR · research · 2025-07-01
.modal { display: none; position: fixed; z-index: 9999; padding-top: 50px; left: 0; top: 0; width: 100%; height: 100%; overflow: auto; background-color: rgba(0,0,0,0.9); } .modal-content { margin: auto; display: block; max-width: 90%; max-...
High signal Matched: inference, generation, performance, model, paper, arxiv, evaluation, training, evaluate, agent, agents
Hugging Face · open-source · 2025-06-06
No feed summary available yet.
High signal Matched: evaluation, agents
Nota AI · korea · 2025-05-08
Jaewoo SongSoftware Engineer, Nota AI SummaryThis study proposes an AI model preprocessing method for improved quantization accuracies on edge AI devices which do not support advanced quantization methods due to their limitat...
High signal Matched: performance, model, weights, research, quantization, int8, int4
Nota AI · korea · 2025-05-07
Jewon Lee | Ki-Ung Song | Seungmin Yang | Donguk Lim | Jaeyeon Kim | Wooksu Shin | Bo-Kyeong Kim | Tae-Ho KimEdgeFM Team, Nota AI Yong Jae Lee, Ph. D.Associate Professor, UW-Madison SummaryOur method, Trimmed-Llama, reduces t...
High signal Matched: inference, generation, kv cache, benchmark, performance, latency, model, weights, research, training, benchmarks, open-source
Modal · inference-infra · 2025-04-18
sync. is a research lab training foundational models to understand and manipulate humans in video. After outgrowing Google Colab, they partnered with Modal for efficient deployment, allowing rapid iteration and scaling to process over 100...
High signal Matched: research, training
BAIR · research · 2025-04-11
Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated applications. However, as LLMs have improved, so have the attacks against them. Prompt injection attack is listed as the #1 threat by OWASP to LLM-integrated ap...
High signal Matched: cost, model, evaluation, training, dpo, fine-tuning, retrieval, api, sota
BAIR · research · 2025-04-08
PLAID is a multimodal generative model that simultaneously generates protein 1D sequence and 3D structure, by learning the latent space of protein folding models. The awarding of the 2024 Nobel Prize to AlphaFold2 marks an important moment...
High signal Matched: inference, generation, cost, model, weights, research, training, retrieval
Nota AI · korea · 2025-04-08
Seul-Ki Yeom, Ph. D. Research Lead, Nota AI GmbH Tae-Ho KimCTO & Co-Founder, Nota AI SummaryDelivers real-time AI performance on edge devices such as smartphones, IoT devices, and embedded systems.Introduces a novel "Reus...
High signal Matched: inference, kernel, benchmark, performance, cost, introducing, model, paper, research, benchmarks
SqueezeBits · korea · 2025-02-27
This article introduces Fits on Chips, an LLMOps toolkit for performance evaluation.
High signal Matched: performance, evaluation
SkyPilot · open-source · 2025-02-26
DeepSeek R1 has shown great reasoning capability when it is firstly released. In this blog post, we detail our learnings in using DeepSeek R1 to build a Retrieval-Augmented Generation (RAG) system, tailored for legal documents. We choose l...
High signal Matched: generation, research, rag, retrieval-augmented generation, retrieval
Nota AI · korea · 2025-02-25
Hancheol Park, Ph. D.AI Research Engineer, Nota AI Geonmin Kim, Ph. D.AI Research Engineer, Nota AI Jaeyeon KimAI Research Engineer, Nota AI SummaryIn this study, we propose a method for determining whether given multilingual...
High signal Matched: generation, performance, model, paper, research, training, fine-tuning
SqueezeBits · korea · 2025-02-17
A brief review of the research paper from our team, published at ICML 2024.
High signal Matched: verification, paper, research
Nota AI · korea · 2025-02-10
Hancheol Park, Ph. D.AI Research Engineer, Nota AI Geonmin Kim, Ph. D.AI Research Engineer, Nota AI SummaryIn this study, we present a method for detecting ambiguous samples in natural language understanding (NLU) tasks using...
High signal Matched: performance, paper, research, evaluation, training, evaluate
SqueezeBits · korea · 2025-01-06
In this blog series, we thoroughly evaluate Intel's AI accelerator, the Gaudi series, focusing on its performance, features, and usability.
High signal Matched: performance, accelerator, evaluation, evaluate
Hugging Face · open-source · 2024-12-10
No feed summary available yet.
High signal Matched: research, open source
Hugging Face · open-source · 2024-12-04
No feed summary available yet.
High signal Matched: benchmark, evaluation, leaderboard
SqueezeBits · korea · 2024-12-03
In this blog series, we thoroughly evaluate Intel's AI accelerator, the Gaudi series, focusing on its performance, features, and usability.
High signal Matched: performance, accelerator, evaluation, evaluate
Hugging Face · open-source · 2024-11-04
No feed summary available yet.
High signal Matched: evaluation, fine-tuning
SqueezeBits · korea · 2024-10-01
This article provides a comparative analysis of vLLM and TensorRT-LLM frameworks for serving LLMs, evaluating their performance based on key metrics like throughput, TTFT, and TPOT to offer insights for practitioners in optimizing LLM depl...
High signal Matched: serving, throughput, performance, ttft, tpot, evaluation, evaluating
Modal · inference-infra · 2024-08-05
Scale up smaller open models with search and evaluation to match frontier capabilities.
High signal Matched: evaluation
Nota AI · korea · 2024-08-02
Jaeyeon KimResearch Engineer, Nota AI Geonmin KimResearch Engineer, Nota AI Hancheol ParkTeam Lead of NetsPresso Application, Nota AI IntroductionRecent large language models (LLMs) have demonstrated unprecedented performance...
High signal Matched: decoding, benchmark, performance, latency, tokens/sec, model, arxiv, research, technical report, evaluation, cloud, training, lora, benchmarks, leaderboard, open-source
Hugging Face · open-source · 2024-07-25
No feed summary available yet.
High signal Matched: evaluation, fine-tuning
Nota AI · korea · 2024-06-13
Jeongho KimResearch Engineer, Nota AI SummaryOnline multi-camera system for efficient individual trackingAccurate ID management with Cluster Self-Refinement (CSR)Improved performance with enhanced pose estimation Intro...
High signal Matched: performance, model, paper, research, evaluation, leaderboard
Hugging Face · open-source · 2024-05-24
No feed summary available yet.
High signal Matched: evaluation
Hugging Face · open-source · 2024-04-16
No feed summary available yet.
High signal Matched: introducing, evaluation, leaderboard
Hugging Face · open-source · 2024-04-04
No feed summary available yet.
High signal Matched: research
Hugging Face · open-source · 2024-02-20
No feed summary available yet.
High signal Matched: introducing, evaluation, korean, leaderboard
SkyPilot · open-source · 2023-05-02
Experience report from Salk Institute on how biologists use SkyPilot to conduct research on the cloud.
High signal Matched: research, cloud
Hugging Face · open-source · 2022-11-17
No feed summary available yet.
High signal Matched: arxiv
Hugging Face · open-source · 2022-08-01
No feed summary available yet.
High signal Matched: research
Hugging Face · open-source · 2022-06-28
No feed summary available yet.
High signal Matched: evaluation
Hugging Face · open-source · 2022-05-19
No feed summary available yet.
High signal Matched: research
Hugging Face · open-source · 2022-03-22
No feed summary available yet.
High signal Matched: research
Microsoft Research · big-tech · 2026-05-29
Data Formulator introduces AI-powered analytics for enterprise data workflows. Data teams can easily bring enterprise data into an AI-ready workspace where users can explore, analyze, and visualize data with AI agents to turn raw data into...
Watchlist Matched: research, agents
Microsoft Research · big-tech · 2026-05-28
Understanding AI as an extension of human intelligence—not a replacement for it—offers a more grounded path for building trustworthy AI systems. The post Extending Human Intelligence Through AI appeared first on Microsoft Research.
Watchlist Matched: research
Microsoft Research · big-tech · 2026-05-22
MagenticLite is an agentic system for small models that works across the browser and local file system in a single workflow. It combines specialized models and orchestration to support efficient agentic performance on everyday tasks. The p...
Watchlist Matched: performance, research, agentic
Microsoft Research · big-tech · 2026-05-21
Vega turns a full credential into a single proof, sharing only what is needed and nothing more, with performance that works in real apps. The post Vega: Zero-knowledge proofs for digital identity in the age of AI appeared first on Microsof...
Watchlist Matched: performance, research
Lambda · cloud · 2026-05-20
HRT turns to Lambda as on-premise infrastructure reaches its ceiling
Watchlist Matched: research
Microsoft Research · big-tech · 2026-05-12
Using SocialReasoning Bench, we observed a stable pattern across models—agents execute competently, but fail to consistently improve the user’s position, even with explicit instructions to optimize for user interest. The post SocialReasoni...
Watchlist Matched: research, agents
Microsoft Research · big-tech · 2026-05-09
Microsoft Research is excited to release an open dataset of approximate transmission topology of the U.S. power grid derived from publicly available data. The ability to study transmission-level power grid behavior is essential for modern...
Watchlist Matched: release, research
AI2 · research · 2026-05-07
Ai2 is bringing NSF OMAI compute online to power a fully open AI research ecosystem, turning national infrastructure investment into reusable models, data, methods, and tools that can accelerate scientific discovery.
Watchlist Matched: research
BAIR · research · 2025-11-01
In this post, I’ll introduce a reinforcement learning (RL) algorithm based on an “alternative” paradigm: divide and conquer. Unlike traditional methods, this algorithm is not based on temporal difference (TD) learning (which has scalabilit...
Watchlist Matched: benchmark, performance, model, paper, training
BAIR · research · 2025-03-25
Training Diffusion Models with Reinforcement Learning We deployed 100 reinforcement learning (RL)-controlled cars into rush-hour highway traffic to smooth congestion and reduce fuel consumption for everyone. Our goal is to tackle "stop-and...
Watchlist Matched: throughput, kernel, performance, model, paper, training, agent, agents