Generate glossary entries for a self-serve data portal

beginnerClaude SonnetData & AnalyticsBusiness Intelligencedata-glossarydocumentationself-servebibusiness-intelligence

Use case

Use this prompt when building or filling a data glossary, internal wiki, or BI tool's metric tooltip. Generates entries that are short, accurate, and disambiguate the most common confusions — the thing that prevents three teams from each citing 'active users' differently.

The prompt

You are a senior BI analyst writing entries for a company data glossary. The audience is anyone who might encounter the term in a dashboard, query, or report — analysts, PMs, sales ops, CS, finance. Entries are short by design.

Inputs:
- Term: {{term}}
- Plain-language intent: {{intent}}
- Source of truth model / metric: {{source_of_truth}}
- Known confusions or competing definitions: {{competing_definitions}}
- Where this term appears: {{appearance_locations}} (dashboards, reports, tools)
- Owner team: {{owner}}

Produce a glossary entry with this structure (compact, scannable):

**{{term}}**

> One-sentence plain-language definition. Suitable for a tooltip.

- **Type:** what kind of thing this is — metric, dimension, segment, lifecycle stage, etc.
- **Unit:** count of X, USD, percentage, etc. (omit if not applicable)
- **Computed as:** plain-language formula. Pseudo-code or a sentence — not full SQL unless required for clarity.
- **Source of truth:** the canonical model or metric definition.
- **Includes / excludes:** the 3–5 most common edge cases, stated plainly.
- **Often confused with:** 1–2 closely related terms and the difference. Link if relevant.
- **Common misuse:** the most frequent way this term is cited incorrectly. State plainly.
- **Owner:** team responsible for the definition.

## Example In Context (optional, 1 sentence)
A short example showing the term used correctly. Skip if the definition is self-evident.

Rules:
- Keep the entry under 120 words.
- Plain language; no jargon without immediate translation.
- The "includes / excludes" section is the single most important — that's where most disagreements come from.
- Be honest about ambiguity. If the term has two valid uses in the company, document both and indicate which is canonical.
- If a term is being deprecated, say so explicitly.

Tone: precise, short, useful. Glossary entries are read in 30 seconds; write for that.

Variables

{{{{term}}}}Replace with your {{term}}
{{{{intent}}}}Replace with your {{intent}}
{{{{source_of_truth}}}}Replace with your {{source of truth}}
{{{{competing_definitions}}}}Replace with your {{competing definitions}}
{{{{appearance_locations}}}}Replace with your {{appearance locations}}
{{{{owner}}}}Replace with your {{owner}}

Example

Input

term: Active Customer
intent: A customer who is actively using the product. Used loosely across dashboards but means different things to different teams.
source_of_truth: dim_customers.is_active flag (computed daily); definition documented in metrics layer
competing_definitions: |
  Product team: customer with at least one session in the last 30 days
  CS team: customer whose subscription is in 'active' status (billing-based)
  Finance: customer who paid an invoice in the last 90 days
  Marketing: customer who hasn't been flagged as churned in our CRM
appearance_locations: |
  "Customer Health Overview" dashboard (Looker)
  "MRR Bridge" report (finance)
  "Account 360" view in CRM
  Multiple Slack reports
owner: Analytics engineering team owns the source of truth flag; product, CS, and finance each own their own derived view.

Output

**Active Customer**

> A customer with an active paid subscription AND at least one product session in the last 30 days.

- **Type:** segment / lifecycle status
- **Computed as:** `dim_customers.is_active = (subscription_status = 'active') AND (last_session_at_utc >= current_date - 30)`
- **Source of truth:** `dim_customers.is_active`. Recomputed daily.
- **Includes:** paying customers who have logged in within the last 30 days; multi-seat accounts where any seat has been active.
- **Excludes:** customers in trial, paused subscriptions, churned customers, and paying customers who haven't logged in for 30+ days (the latter are paying-but-dormant — flagged separately as `is_paying_dormant`).
- **Often confused with:**
- **Paying Customer** — broader; includes dormant payers. Use this when the question is about revenue, not engagement.
- **MAU (Monthly Active User)** — usage-only; ignores subscription status. Includes free / trial users.
- **Common misuse:** citing "active customers" when the question is really about revenue contribution. Use Paying Customer for revenue cuts; Active Customer for engagement and health.
- **Owner:** Analytics engineering owns the canonical flag. Each team consuming the metric must use the canonical version, not a redefined one.

*Example:* "We had 4,210 active customers in March" means paid, active subscriptions with at least one session — not just paid count.

---

*Note for entry maintenance:* the prior product-team and finance-team derived versions are deprecated; their dashboards have been re-pointed to the canonical flag as of April 2024. Any new dashboard citing "active customer" must source from `dim_customers.is_active`.

Tips for best results

1Lead with a one-sentence definition that fits a tooltip. If the user doesn't read past line 1, that line should still be correct.
2The includes/excludes section is the load-bearing part. Most metric disagreements trace back to a single edge case the definition didn't pin down.
3Always document what a term is 'often confused with.' This is where you redirect the misuse before it produces a wrong dashboard.
4Be explicit about deprecated definitions. Silently changing a metric and hoping nobody notices is the surest path to broken trust in the data team.
5Keep entries short. A glossary that's hard to scan is a glossary nobody opens; the wiki version of a 4-page metric spec belongs in the metric spec, not the glossary.

Related prompts

Write a rigorous metric definition spec

advanced

Produce a metric definition spec for a metrics layer (LookML, Cube, dbt Semantic Layer, MetricFlow) that closes the ambiguity that causes downstream disagreement.

Data & Analyticsmetrics-layersemantic-layerdata-modeling

Propose naming conventions for a semantic layer

intermediate

Produce a naming convention spec for tables, columns, metrics, and dimensions in a semantic layer or dbt project — covering prefixes, casing, and the awkward edge cases.

Data & Analyticsnaming-conventionssemantic-layerdbt

Spec a new dashboard with audience and layout

intermediate

Produce a dashboard design spec — audience, decisions, metrics, layout, and what to leave out — before any tile is built.

Data & Analyticsdashboardbidesign-spec

Need help implementing this prompt in your workflow?

Book a call

You are a senior BI analyst writing entries for a company data glossary. The audience is anyone who might encounter the term in a dashboard, query, or report — analysts, PMs, sales ops, CS, finance. Entries are short by design. Inputs: - Term: {{term}} - Plain-language intent: {{intent}} - Source of truth model / metric: {{source_of_truth}} - Known confusions or competing definitions: {{competing_definitions}} - Where this term appears: {{appearance_locations}} (dashboards, reports, tools) - Owner team: {{owner}} Produce a glossary entry with this structure (compact, scannable): **{{term}}** > One-sentence plain-language definition. Suitable for a tooltip. - **Type:** what kind of thing this is — metric, dimension, segment, lifecycle stage, etc. - **Unit:** count of X, USD, percentage, etc. (omit if not applicable) - **Computed as:** plain-language formula. Pseudo-code or a sentence — not full SQL unless required for clarity. - **Source of truth:** the canonical model or metric definition. - **Includes / excludes:** the 3–5 most common edge cases, stated plainly. - **Often confused with:** 1–2 closely related terms and the difference. Link if relevant. - **Common misuse:** the most frequent way this term is cited incorrectly. State plainly. - **Owner:** team responsible for the definition. ## Example In Context (optional, 1 sentence) A short example showing the term used correctly. Skip if the definition is self-evident. Rules: - Keep the entry under 120 words. - Plain language; no jargon without immediate translation. - The "includes / excludes" section is the single most important — that's where most disagreements come from. - Be honest about ambiguity. If the term has two valid uses in the company, document both and indicate which is canonical. - If a term is being deprecated, say so explicitly. Tone: precise, short, useful. Glossary entries are read in 30 seconds; write for that.

Variables

{{{{term}}}}Replace with your {{term}}

{{{{intent}}}}Replace with your {{intent}}

{{{{source_of_truth}}}}Replace with your {{source of truth}}

{{{{competing_definitions}}}}Replace with your {{competing definitions}}

{{{{appearance_locations}}}}Replace with your {{appearance locations}}

{{{{owner}}}}Replace with your {{owner}}

Example

Input

term: Active Customer
intent: A customer who is actively using the product. Used loosely across dashboards but means different things to different teams.
source_of_truth: dim_customers.is_active flag (computed daily); definition documented in metrics layer
competing_definitions: |
  Product team: customer with at least one session in the last 30 days
  CS team: customer whose subscription is in 'active' status (billing-based)
  Finance: customer who paid an invoice in the last 90 days
  Marketing: customer who hasn't been flagged as churned in our CRM
appearance_locations: |
  "Customer Health Overview" dashboard (Looker)
  "MRR Bridge" report (finance)
  "Account 360" view in CRM
  Multiple Slack reports
owner: Analytics engineering team owns the source of truth flag; product, CS, and finance each own their own derived view.

Output

**Active Customer**

> A customer with an active paid subscription AND at least one product session in the last 30 days.

*Example:* "We had 4,210 active customers in March" means paid, active subscriptions with at least one session — not just paid count.

---

Tips for best results

1Lead with a one-sentence definition that fits a tooltip. If the user doesn't read past line 1, that line should still be correct.

2The includes/excludes section is the load-bearing part. Most metric disagreements trace back to a single edge case the definition didn't pin down.

3Always document what a term is 'often confused with.' This is where you redirect the misuse before it produces a wrong dashboard.

4Be explicit about deprecated definitions. Silently changing a metric and hoping nobody notices is the surest path to broken trust in the data team.

5Keep entries short. A glossary that's hard to scan is a glossary nobody opens; the wiki version of a 4-page metric spec belongs in the metric spec, not the glossary.