What does MinusX do?

A process to model your entire company's data

Why Most 'Talk to Data' Falls Short

Most 'Talk to Data solutions' fails because they lack the right context. Without clear data models, AI systems are forced to guess at ambiguous business terms, leading to inconsistent answers across your organization. Complex queries become error-prone, and tribal knowledge stays siloed within individual teams rather than being codified into a shared understanding.

Unlike coding, where the cost of wrong code can be reviewed and fixed, the users of data are often not proficient in SQL or data modeling. A system that is 90% accurate is not good enough for data as errors compound. This is why most AI solutions fail to deliver on their promises.

To get accurate AI-powered data insights, you need well-modeled data — a clear semantic layer where business entities, metrics, and relationships are unambiguous. This is where dimensional modeling comes in: properly structured facts and dimensions that represent your business logic consistently.

Building this semantic layer manually takes months of data engineering work. MinusX does it in hours, clarifying and codifying business logic from your team's collective knowledge—extracting it from people's heads and transforming it into an automated system that allows everyone in your organization to get a reliable 'Talk to data' experience.

Steps 1 - 2: Input & Clarification

MinusX first connects to your data warehouse and learns your business logic through an interactive process.

Step 1

Submit Business Questions + SQL

Provide at least 30 business questions with their "Gold SQL" queries. This is the foundation for understanding your business logic.

Business Questions + Gold SQL

Navigate through example business questions and their corresponding "Gold SQL" queries.

Question 1 of 425% Complete

What is our revenue by customer cohort with month-over-month retention?

WITH customer_cohorts AS (
  SELECT
    user_id,
    DATE_TRUNC('month', MIN(created_at)) as cohort_month,
    MIN(created_at) as first_purchase_date
  FROM orders
  WHERE status = 'completed'
  GROUP BY user_id
),
monthly_activity AS (
  SELECT
    o.user_id,
    DATE_TRUNC('month', o.created_at) as activity_month,
    SUM(o.amount) as monthly_revenue,
    COUNT(DISTINCT o.id) as order_count,
    AVG(o.amount) as avg_order_value
  FROM orders o
  WHERE o.status = 'completed'
  GROUP BY o.user_id, DATE_TRUNC('month', o.created_at)
),
cohort_analysis AS (
  SELECT
    cc.cohort_month,
    ma.activity_month,
    EXTRACT(MONTH FROM AGE(ma.activity_month, cc.cohort_month)) as months_since_first,
    COUNT(DISTINCT ma.user_id) as active_customers,
    SUM(ma.monthly_revenue) as cohort_revenue,
    AVG(ma.order_count) as avg_orders_per_customer,
    AVG(ma.avg_order_value) as avg_order_value
  FROM customer_cohorts cc
  JOIN monthly_activity ma ON cc.user_id = ma.user_id
  GROUP BY cc.cohort_month, ma.activity_month
)
SELECT
  cohort_month,
  activity_month,
  months_since_first,
  active_customers,
  cohort_revenue,
  avg_orders_per_customer,
  avg_order_value,
  cohort_revenue::FLOAT / NULLIF(LAG(cohort_revenue)
    OVER (PARTITION BY cohort_month ORDER BY activity_month), 0) - 1
    as mom_revenue_growth
FROM cohort_analysis
ORDER BY cohort_month DESC, activity_month DESC;

💡 Why "Gold SQL"? These 100% correct queries act as ground truth. By providing 30+ accurate questions with SQL, you teach MinusX your exact business definitions, ensuring the AI-generated dimensional model captures your logic precisely.

Step 2

AI Clarification Process

Our model analyzes your SQL to identify ambiguities and engages in an interactive dialogue to clarify business logic.

AI Clarification Process

MinusX asks SQL-focused questions to resolve ambiguities. Interact with the dialog below.

AI

Q1: I notice you calculate 'profit' differently across your queries. To ensure consistency in the dimensional model, please provide the SQL expression for calculating profit:

Or type your custom SQL expression above

🔄 Iterative: This continues for 15-30 questions until MinusX has complete understanding of your business logic, definitions, and SQL patterns.

Steps 3 - 4: ETL Generation & Dimensional Modeling

MinusX then generates a complete ETL pipeline and dimensional model that runs in your warehouse.

Step 3

Automated ETL Generation

SQL files organized as a dependency tree (like dbt/Quary) that transform your source tables into a dimensional model.

SQL Dependency Tree (DAG)

Click nodes to view SQL transformations. Data flows from source → intermediate → final.

Source
Staging
Final
Step 4

Dimensional Modeling

Properly structured facts and dimensions that represent your queries efficiently — the semantic layer for accurate AI.

Star Schema: Facts & Dimensions

The final dimensional model follows star schema design — a central fact table surrounded by dimension tables. Click any table to view its columns.

Fact Table (Metrics)
Dimension Tables (Attributes)

⭐ Star Schema Benefits: Optimized for analytical queries, easy to understand, fast aggregations, and minimal joins. Your 30+ business questions now run 10-100x faster using this model.

Before MinusX

WITH customer_orders AS (
  SELECT
    c.customer_id,
    c.customer_name,
    c.email,
    c.tier,
    o.order_id,
    o.order_date,
    o.status,
    o.subtotal,
    o.discount_amount,
    o.tax_amount
  FROM customers c
  LEFT JOIN orders o ON c.customer_id = o.customer_id
  WHERE c.is_active = true
),
order_details AS (
  SELECT
    co.customer_id,
    co.order_id,
    oi.product_id,
    oi.quantity,
    oi.unit_price,
    p.category,
    p.subcategory,
    (oi.quantity * oi.unit_price) as line_total
  FROM customer_orders co
  LEFT JOIN order_items oi ON co.order_id = oi.order_id
  LEFT JOIN products p ON oi.product_id = p.product_id
  WHERE co.status = 'completed'
),
customer_metrics AS (
  SELECT
    co.customer_id,
    COUNT(DISTINCT co.order_id) as order_count,
    SUM(co.subtotal - co.discount_amount + co.tax_amount) as total_revenue,
    AVG(co.subtotal - co.discount_amount + co.tax_amount) as avg_order_value,
    MAX(co.order_date) as last_order_date
  FROM customer_orders co
  WHERE co.status = 'completed'
    AND co.order_date >= '2024-01-01'
  GROUP BY co.customer_id
)
SELECT
  co.customer_name,
  co.email,
  co.tier,
  cm.order_count,
  cm.total_revenue,
  cm.avg_order_value,
  cm.last_order_date,
  COALESCE(cat_summary.categories, 0) as categories_purchased
FROM customer_orders co
INNER JOIN customer_metrics cm ON co.customer_id = cm.customer_id
LEFT JOIN (
  SELECT
    customer_id,
    COUNT(DISTINCT category) as categories
  FROM order_details
  GROUP BY customer_id
) cat_summary ON co.customer_id = cat_summary.customer_id
WHERE co.customer_id IS NOT NULL
GROUP BY co.customer_name, co.email, co.tier,
         cm.order_count, cm.total_revenue,
         cm.avg_order_value, cm.last_order_date,
         cat_summary.categories
ORDER BY cm.total_revenue DESC

Multiple CTEs, nested subqueries, complex joins - repeated across every similar query

After MinusX

SELECT
  c.customer_name,
  c.email,
  c.tier,
  f.order_count,
  f.total_revenue,
  f.avg_order_value,
  f.last_order_date,
  f.categories_purchased
FROM fact_orders f
JOIN dim_customers c ON f.customer_id = c.customer_id
WHERE f.status = 'completed'
  AND f.order_date >= '2024-01-01'

Simplified joins, pre-computed metrics, consistent logic

The Result: True company-wide data accessibility

With a well-modeled semantic layer in place, along with ground-truth business questions expressed using this layer, MinusX AI agents can now answer any business question accurately — from simple reporting to deep research queries — enabling your entire organization to get reliable insights anytime.

🎯

Accurate Answers

AI understands your business context and delivers consistent, reliable insights

Blazing Fast

Optimized dimensional queries return results in seconds, not minutes

🔒

Your Infrastructure

All data stays in your warehouse. Full control, zero vendor lock-in

🔄

Expandable

Ask new business questions anytime to extend your semantic layer as your needs evolve, or extend the version-controlled SQL files yourself

Backward Compatible

All transformations run in a new schema. Your existing data and queries remain untouched

📚

Legible Queries

Well-modeled data means ALL queries become more readable, maintainable, and easier to debug

See It In Action

Ask any question directly and get instant, accurate answers powered by your dimensional model

Ready to Build Your Semantic Layer?

Let us show you how MinusX can transform your data warehouse into a true talk-to-data experience in hours, not months.