Documentation Index

Fetch the complete documentation index at: https://docs.neurometric.ai/llms.txt

Use this file to discover all available pages before exploring further.

Inference Studio

Neurometric Inference Studio is an internal tool for evaluating AI model cost versus performance. It helps you understand what you're spending and where Small Language Models (SLMs) could replace expensive frontier models.

Key Features

  • Model Comparison: Run the same prompts across multiple models and compare results side-by-side.
  • Quality Scoring: Automated pass/fail quality scoring and cosine similarity comparisons across outputs.
  • Cost-Quality Analysis: Interactive scatter plots to visualize the cost-quality tradeoff for each model.
  • Background Jobs: Asynchronous processing using Graphile Worker for analyzing large datasets and running Ray jobs.
  • Langfuse Integration: Sync traces from your existing Langfuse observability setup directly into the studio.
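The cosine similarity comparison mentioned above works by embedding each model's output and measuring the angle between the resulting vectors: 1 means the outputs point in the same direction, 0 means they are unrelated. A minimal sketch (the function name and plain-array vector type are illustrative, not the studio's actual API):

```typescript
// Cosine similarity between two embedding vectors.
// cos(θ) = (a · b) / (|a| * |b|)
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical directions score 1; orthogonal directions score 0.
console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

In practice the vectors would come from an embedding model applied to each output, and the scores feed the side-by-side comparison view.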

Tech Stack

The Inference Studio is built with a modern stack:
  • Next.js 16 (App Router)
  • React 19
  • Tailwind CSS v4
  • D3.js (for scatter plots)
  • Prisma (ORM)
  • PostgreSQL
  • Graphile Worker (background jobs)
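The cost-quality scatter plots described above boil down to a Pareto frontier: a model is worth considering only if no other model is both cheaper and at least as good. A hedged sketch of that selection logic (the `ModelPoint` shape and field names are assumptions for illustration, not the studio's data model):

```typescript
// A single point on the cost-quality scatter plot.
interface ModelPoint {
  name: string;
  costPerMTok: number; // e.g. dollars per million tokens
  quality: number;     // e.g. average pass rate, 0..1
}

// Keep only Pareto-optimal models: sort by cost ascending, then keep a
// model only if it beats the best quality seen among cheaper models.
function paretoFrontier(models: ModelPoint[]): ModelPoint[] {
  const sorted = [...models].sort((a, b) => a.costPerMTok - b.costPerMTok);
  const frontier: ModelPoint[] = [];
  let bestQuality = -Infinity;
  for (const m of sorted) {
    if (m.quality > bestQuality) {
      frontier.push(m);
      bestQuality = m.quality;
    }
  }
  return frontier;
}

// Example: "mid-model" is dominated (costlier than "slm-small" yet no
// better), so it drops off the frontier.
const frontier = paretoFrontier([
  { name: "frontier-xl", costPerMTok: 15, quality: 0.95 },
  { name: "slm-small", costPerMTok: 0.2, quality: 0.82 },
  { name: "mid-model", costPerMTok: 3, quality: 0.8 },
]);
console.log(frontier.map((m) => m.name)); // ["slm-small", "frontier-xl"]
```

Dominated models like the hypothetical "mid-model" above are exactly the candidates the studio is meant to surface: anything below the frontier is paying more for less.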

Next Steps

Check out the Setup Guide to run the studio locally.