Introduction
Data analysts waste too many hours rewriting the same queries. After a recent “aha” moment that moved more of my work into SQL, I started collecting small, DB‑agnostic SQL snippets that solve recurring problems fast. They’re practical, use standard SQL where possible, and are focused on results you can use today: paste them into your editor, swap in your table and column names, test on a dev set, and you’re off.
Quick rules before we start
- Replace table and column names with your own.
- Test snippets on a subset or a dev schema first.
- For performance-sensitive queries, add indexes or use LIMIT during development.
The SQL snippets
1. Select the latest row per group (useful for “latest status per user”)
SELECT t.*
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY updated_at DESC) AS rn
FROM events
) t
WHERE rn = 1;
Explanation: This SQL snippet returns the most recent row for each group (here, user_id) using ROW_NUMBER() and partitioning by the group key. When to use: Ideal for state tables (latest order status, last activity per user) where you need the most recent record per entity. Performance note: Window functions can scan the partition; add appropriate indexes on the ORDER BY column (updated_at) and the partition key for large tables. For more information, visit the window functions overview in the Postgres docs: https://www.postgresql.org/docs/current/functions-window.html
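As a quick sanity check, here is a minimal sketch of the pattern against an in-memory SQLite database (SQLite 3.25+ supports ROW_NUMBER(), so the snippet ports directly; the table and data here are made up for illustration):

```python
import sqlite3

# Hypothetical "events" table standing in for your real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (user_id INTEGER, status TEXT, updated_at TEXT);
INSERT INTO events VALUES
  (1, 'pending', '2024-01-01'),
  (1, 'shipped', '2024-01-03'),
  (2, 'pending', '2024-01-02');
""")
rows = conn.execute("""
SELECT t.user_id, t.status
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY updated_at DESC) AS rn
  FROM events
) t
WHERE rn = 1
ORDER BY t.user_id
""").fetchall()
print(rows)  # [(1, 'shipped'), (2, 'pending')]
```

Each user comes back exactly once, with only their newest row surviving the rn = 1 filter.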
2. Aggregate with top N per group (e.g., top 3 products per category)
SELECT *
FROM (
SELECT p.*, ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY sales DESC) AS rn
FROM products p
) x
WHERE rn <= 3;
Explanation: This SQL snippet produces the top N entries per partition using ROW_NUMBER() and a WHERE rn <= N filter. When to use: Use for leaderboards (top-selling products per category), executive summaries, or to show a short list per group without complex joins. Performance note: Sorting per partition can be expensive; filter early if possible (e.g., limit candidate rows before windowing). For a deeper explanation, see example patterns for top-N per group: https://use-the-index-luke.com/sql/window-functions/row-number
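The same pattern can be sketched end to end in SQLite, here as a hypothetical top-2 per category (schema and numbers invented for the demo):

```python
import sqlite3

# Hypothetical "products" table; sales figures are arbitrary demo data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (category_id INTEGER, name TEXT, sales INTEGER);
INSERT INTO products VALUES
  (1, 'a', 50), (1, 'b', 90), (1, 'c', 70),
  (2, 'd', 30), (2, 'e', 60);
""")
top2 = conn.execute("""
SELECT category_id, name
FROM (
  SELECT p.*, ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY sales DESC) AS rn
  FROM products p
) x
WHERE rn <= 2
ORDER BY category_id, rn
""").fetchall()
print(top2)  # [(1, 'b'), (1, 'c'), (2, 'e'), (2, 'd')]
```

Changing the rn <= 2 bound is all it takes to widen or narrow the per-group list.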
3. Rolling 7-day sum (useful for rolling metrics)
SELECT day,
SUM(value) OVER (ORDER BY day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS rolling_7
FROM daily_metrics;
Explanation: Uses a sliding-window SUM() to compute a rolling total over the last 7 days for each day. When to use this SQL snippet: Time-series metrics like rolling active users, rolling revenue, or smoothing noisy daily measures. Performance note: Works well on pre-aggregated daily data; avoid running it on very large, row-level event tables without pre-aggregation. For more information, visit: Intro to SQL window framing: https://www.postgresql.org/docs/current/tutorial-window.html
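To see the frame in action, here is a small sketch with eight consecutive days of value 1 (made-up data), so the rolling sum climbs to 7 and then plateaus once the window is full:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_metrics (day TEXT, value INTEGER)")
# One row per day, value 1, for 2024-01-01 .. 2024-01-08 (demo data).
conn.executemany("INSERT INTO daily_metrics VALUES (?, ?)",
                 [(f"2024-01-0{d}", 1) for d in range(1, 9)])
rolling = conn.execute("""
SELECT day,
       SUM(value) OVER (ORDER BY day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS rolling_7
FROM daily_metrics
ORDER BY day
""").fetchall()
print([r[1] for r in rolling])  # [1, 2, 3, 4, 5, 6, 7, 7]
```

Note that ROWS BETWEEN 6 PRECEDING counts rows, not calendar days, which is why the input must already be one row per day with no gaps.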
4. Simple month-over-month churn rate
WITH monthly_active AS (
SELECT DATE_TRUNC('month', event_date) AS month, user_id
FROM events
GROUP BY 1, 2
)
SELECT prev.month + INTERVAL '1 month' AS month,
       1.0 - COUNT(DISTINCT curr.user_id) * 1.0 / COUNT(DISTINCT prev.user_id) AS churn_rate
FROM monthly_active prev
LEFT JOIN monthly_active curr
  ON curr.user_id = prev.user_id
 AND curr.month = prev.month + INTERVAL '1 month'
GROUP BY prev.month
ORDER BY 1;
Explanation: This SQL snippet builds month-level active user sets and compares current vs previous month to compute a churn rate. When to use: Monthly retention/churn dashboards and executive reports. Performance note: Use GROUP BY on truncated dates and ensure event_date has an index if the base table is large. For further information, visit: Churn basics and cohort analysis overview: https://www.customer.io/blog/churn-rate-formula/
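The core logic (share of last month's users who are not active this month) can be sketched in SQLite with pre-bucketed integer months, since SQLite lacks DATE_TRUNC and INTERVAL; the users and months are invented for the demo:

```python
import sqlite3

# Month 1 has users {1,2,3,4}; month 2 retains only {1,2}, so churn = 0.5.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE monthly_active (month INTEGER, user_id INTEGER);
INSERT INTO monthly_active VALUES
  (1, 1), (1, 2), (1, 3), (1, 4),
  (2, 1), (2, 2);
""")
churn = conn.execute("""
SELECT prev.month + 1 AS month,
       1.0 - COUNT(DISTINCT curr.user_id) * 1.0 / COUNT(DISTINCT prev.user_id) AS churn_rate
FROM monthly_active prev
LEFT JOIN monthly_active curr
  ON curr.user_id = prev.user_id AND curr.month = prev.month + 1
GROUP BY prev.month
ORDER BY 1
""").fetchall()
print(churn)  # [(2, 0.5), (3, 1.0)]
```

The LEFT JOIN keeps every prior-month user; users with no match next month contribute NULL to the retained count, which is exactly what the churn ratio needs. (The last month reports churn of 1.0 only because no following month exists yet.)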
5. Safe dedupe into a table (insert only newest rows)
WITH dedup AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY external_id ORDER BY created_at DESC) AS rn
FROM staging_table
)
INSERT INTO target_table (col1, col2, …)
SELECT col1, col2, …
FROM dedup
WHERE rn = 1;
Explanation: Deduplicates staging data by external_id, keeping the newest row per key, and inserts only the deduped rows into the target table. When to use this SQL snippet: ETL pipeline stages where you land messy data then upsert/insert deduped rows into the main table. Performance note: For large staging tables, consider using a staging index and batching the insert. If your DB supports MERGE/UPSERT, consider that for idempotency. For more information, visit: MERGE/UPSERT patterns: https://www.postgresql.org/docs/current/sql-merge.html
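Here is a runnable sketch of the staging-to-target flow in SQLite (hypothetical tables and payloads; your real column lists replace the demo ones):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staging_table (external_id INTEGER, payload TEXT, created_at TEXT);
CREATE TABLE target_table (external_id INTEGER, payload TEXT);
INSERT INTO staging_table VALUES
  (1, 'old', '2024-01-01'),
  (1, 'new', '2024-01-02'),
  (2, 'only', '2024-01-01');
""")
# Keep only the newest row per external_id, then insert into the target.
conn.execute("""
WITH dedup AS (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY external_id ORDER BY created_at DESC) AS rn
  FROM staging_table
)
INSERT INTO target_table (external_id, payload)
SELECT external_id, payload
FROM dedup
WHERE rn = 1
""")
rows = conn.execute(
    "SELECT external_id, payload FROM target_table ORDER BY external_id").fetchall()
print(rows)  # [(1, 'new'), (2, 'only')]
```

The duplicate external_id 1 lands only once, with its latest payload.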
6. Calculate cohort retention (simple)
WITH installs AS (
SELECT user_id, DATE_TRUNC('week', install_date) AS cohort_week
FROM users
)
SELECT i.cohort_week, DATE_TRUNC('week', e.event_date) AS event_week,
COUNT(DISTINCT e.user_id) AS retained_users
FROM installs i
LEFT JOIN events e ON i.user_id = e.user_id
GROUP BY 1, 2
ORDER BY 1, 2;
Explanation: This SQL snippet joins installs (cohorts) to event data and aggregates retained users by cohort and event week. When to use: Creating a simple cohort retention table for weekly or monthly cohorts. Performance note: Cohort calculations can explode in intermediate rows; pre-aggregate user counts where possible and use date_trunc to normalize. For more information, visit: Cohort analysis primer: https://www.ycombinator.com/library/4A-how-to-do-cohort-analysis
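A compact sketch of the same shape, with weeks pre-bucketed as integers since SQLite has no DATE_TRUNC (users and weeks are demo data):

```python
import sqlite3

# Cohort week 1: users 1 and 2; cohort week 2: user 3 (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE installs (user_id INTEGER, cohort_week INTEGER);
CREATE TABLE events (user_id INTEGER, event_week INTEGER);
INSERT INTO installs VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO events VALUES (1, 1), (1, 2), (2, 1), (3, 2);
""")
retention = conn.execute("""
SELECT i.cohort_week, e.event_week, COUNT(DISTINCT e.user_id) AS retained_users
FROM installs i
LEFT JOIN events e ON i.user_id = e.user_id
GROUP BY 1, 2
ORDER BY 1, 2
""").fetchall()
print(retention)  # [(1, 1, 2), (1, 2, 1), (2, 2, 1)]
```

Reading the output: cohort week 1 had 2 users active in week 1 and 1 still active in week 2; a BI tool can pivot this long table into the familiar retention triangle.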
7. Compare two periods (YoY or WoW percentage change)
WITH sums AS (
SELECT DATE_TRUNC('week', event_date) AS period,
SUM(amount) AS total
FROM payments
GROUP BY 1
)
SELECT s.period,
s.total,
       1.0 * (s.total - LAG(s.total) OVER (ORDER BY s.period))
           / NULLIF(LAG(s.total) OVER (ORDER BY s.period), 0) AS pct_change
FROM sums s
ORDER BY s.period;
Explanation: Aggregates totals by period and uses LAG() to compute period-over-period percent change. When to use this SQL snippet: Reporting growth metrics like weekly revenue change, MAU change, or other rolling business KPIs. Performance note: LAG() requires ordering the periods; ensure period aggregation reduces dataset size before windowing. For more information, visit: Time-series comparison techniques: https://www.red-gate.com/simple-talk/sql/t-sql-programming/comparing-periods-in-sql/
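The LAG() arithmetic can be verified on three pre-aggregated periods in SQLite (integer periods and round totals, invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sums (period INTEGER, total REAL);
INSERT INTO sums VALUES (1, 100.0), (2, 110.0), (3, 99.0);
""")
pct = conn.execute("""
SELECT period,
       1.0 * (total - LAG(total) OVER (ORDER BY period))
           / NULLIF(LAG(total) OVER (ORDER BY period), 0) AS pct_change
FROM sums
ORDER BY period
""").fetchall()
print(pct)  # first period has no predecessor, so pct_change is NULL/None
```

100 to 110 is +10%, 110 to 99 is -10%, and NULLIF guards against a zero baseline instead of raising a division error.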
8. Detect data anomalies (z-score method)
WITH stats AS (
SELECT AVG(value) AS mean, STDDEV_POP(value) AS sd
FROM metric_table
),
flagged AS (
SELECT m.*, (m.value - s.mean) / NULLIF(s.sd, 0) AS z_score
FROM metric_table m CROSS JOIN stats s
)
SELECT * FROM flagged WHERE ABS(z_score) > 3;
Explanation: Computes the mean and population standard deviation, then flags rows with an absolute z-score above a threshold (e.g., 3). When to use: Quick anomaly detection on a single metric before deeper investigation or alerting. Performance note: Sensitive to distribution; the z-score assumes a roughly normal distribution. For non-normal metrics, prefer robust methods (IQR) or time-series anomaly detection tools. For a deeper explanation, visit: Z-score anomaly detection overview: https://en.wikipedia.org/wiki/Standard_score
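A small sketch of the same idea in SQLite, which lacks STDDEV_POP, so the demo derives it as sqrt(E[x²] − E[x]²) with a registered sqrt function; the values and the lowered threshold (1.9 rather than 3, since five points cannot reach z = 3) are purely illustrative:

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.create_function("sqrt", 1, math.sqrt)  # stand-in; SQLite core has no STDDEV_POP
conn.executescript("""
CREATE TABLE metric_table (value REAL);
INSERT INTO metric_table VALUES (10), (11), (9), (10), (100);
""")
outliers = conn.execute("""
WITH stats AS (
  SELECT AVG(value) AS mean,
         sqrt(AVG(value * value) - AVG(value) * AVG(value)) AS sd
  FROM metric_table
)
SELECT m.value, (m.value - s.mean) / NULLIF(s.sd, 0) AS z_score
FROM metric_table m CROSS JOIN stats s
WHERE ABS((m.value - s.mean) / NULLIF(s.sd, 0)) > 1.9
""").fetchall()
print(outliers)  # only the 100 row is flagged
```

In a real database with STDDEV_POP the original snippet applies unchanged; the point of the demo is that only the obvious outlier survives the threshold.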
9. Unpivot (wide → long) using UNION ALL (portable)
SELECT id, 'metric_a' AS metric, metric_a AS value FROM metrics
UNION ALL
SELECT id, 'metric_b', metric_b FROM metrics
UNION ALL
SELECT id, 'metric_c', metric_c FROM metrics;
Explanation: Converts wide tables into long format using UNION ALL, a portable approach that works where native UNPIVOT is not available. When to use: Preparing data for aggregation, time-series pivoting, or feeding BI tools that prefer long format. Performance note: UNION ALL is simple but verbose for many columns — consider using UNNEST or JSON methods for many columns or DB-specific UNPIVOT where available. For more information, visit: Unpivoting techniques: https://mode.com/sql-tutorial/sql-unpivot/
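A two-column sketch of the unpivot in SQLite (hypothetical metrics table) shows one input row becoming one output row per metric:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metrics (id INTEGER, metric_a INTEGER, metric_b INTEGER);
INSERT INTO metrics VALUES (1, 10, 20);
""")
long_rows = conn.execute("""
SELECT id, 'metric_a' AS metric, metric_a AS value FROM metrics
UNION ALL
SELECT id, 'metric_b', metric_b FROM metrics
ORDER BY id, metric
""").fetchall()
print(long_rows)  # [(1, 'metric_a', 10), (1, 'metric_b', 20)]
```

Each additional wide column is one more UNION ALL branch, which is why this stays portable but gets verbose past a handful of columns.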
10. Efficient pagination with keyset (avoid OFFSET for large tables)
SELECT *
FROM events
WHERE (event_time, id) > ('2026-01-01 00:00:00', 12345)
ORDER BY event_time, id
LIMIT 100;
Explanation: Uses the last-seen (event_time, id) tuple in a WHERE clause to page more efficiently than OFFSET, which degrades on large offsets. When to use: APIs and UI pagination that scan large, append-only tables. Performance note: Keyset pagination is much faster for deep pages; you need a stable ordering and must construct the WHERE clause carefully to include tie-breakers (id). For a deeper explanation, visit: Keyset pagination best practices: https://www.citusdata.com/blog/2016/01/12/faster-pagination/
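The row-value comparison also works in SQLite (3.15+), so the two-page flow can be sketched end to end with a tiny hypothetical events table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (event_time TEXT, id INTEGER, payload TEXT);
INSERT INTO events VALUES
  ('2024-01-01', 1, 'a'),
  ('2024-01-01', 2, 'b'),
  ('2024-01-02', 3, 'c');
""")
# Page 1: no cursor yet, just the stable ordering and a LIMIT.
page1 = conn.execute(
    "SELECT event_time, id FROM events ORDER BY event_time, id LIMIT 2").fetchall()
# Page 2: resume strictly after the last-seen (event_time, id) tuple.
last_time, last_id = page1[-1]
page2 = conn.execute("""
SELECT event_time, id FROM events
WHERE (event_time, id) > (?, ?)
ORDER BY event_time, id
LIMIT 2
""", (last_time, last_id)).fetchall()
print(page2)  # [('2024-01-02', 3)]
```

The id tie-breaker matters here: two rows share the same event_time, and without it page 2 could skip or repeat one of them.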
Summary
These ten SQL snippets are small, practical tools you can drop into your workflow to save time on common analyst tasks, from finding the latest row per entity to fast pagination and simple anomaly detection. Start by copying a snippet into a dev schema, adjust names and filters for your data, and add indexes where performance matters. If you build a snippet-pack workflow (local .sql files plus a short README), you'll shave minutes, and often hours, off repetitive work and make your analyses more reproducible.
Want help adapting these snippets to your schema or building your own snippet pack? Contact me and I'll walk through your data model and help you get the most useful queries working in your environment.
