Introduction
Data analysts waste too many hours rewriting the same queries. After a recent “aha” moment that moved more of my work into SQL, I started collecting small, DB‑agnostic SQL snippets that solve recurring problems fast. They’re practical, use standard SQL where possible, and are focused on results you can use today: paste them into your editor, swap in your table and column names, test on a dev set, and you’re off.
Quick rules before we start
- Replace table and column names with your own.
- Test snippets on a subset or a dev schema first.
- For performance-sensitive queries, add indexes or use LIMIT during development.
The SQL snippets
1. Select the latest row per group (useful for “latest status per user”)
SELECT t.*
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY updated_at DESC) AS rn
FROM events
) t
WHERE rn = 1;
Explanation: This SQL snippet returns the most recent row for each group (here, user_id) using ROW_NUMBER() and partitioning by the group key. When to use: Ideal for state tables (latest order status, last activity per user) where you need the most recent record per entity. Performance note: Window functions can scan the partition; add appropriate indexes on the ORDER BY column (updated_at) and the partition key for large tables. For more information, visit the window functions overview in the Postgres docs: https://www.postgresql.org/docs/current/functions-window.html
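As a quick sanity check, here is a minimal sketch of the pattern against an in-memory SQLite database (SQLite 3.25+ supports ROW_NUMBER(), so the snippet ports directly; the table and data here are made up for illustration):

```python
import sqlite3

# Hypothetical "events" table standing in for your real schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (user_id INTEGER, status TEXT, updated_at TEXT);
INSERT INTO events VALUES
  (1, 'pending', '2024-01-01'),
  (1, 'shipped', '2024-01-03'),
  (2, 'pending', '2024-01-02');
""")
rows = conn.execute("""
SELECT t.user_id, t.status
FROM (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY updated_at DESC) AS rn
  FROM events
) t
WHERE rn = 1
ORDER BY t.user_id
""").fetchall()
print(rows)  # [(1, 'shipped'), (2, 'pending')]
```

Each user comes back exactly once, with only their newest row surviving the rn = 1 filter.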
2. Aggregate with top N per group (e.g., top 3 products per category)
SELECT *
FROM (
SELECT p.*, ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY sales DESC) AS rn
FROM products p
) x
WHERE rn <= 3;
Explanation: This SQL snippet produces the top N entries per partition using ROW_NUMBER() and a WHERE rn <= N filter. When to use: Use for leaderboards (top-selling products per category), executive summaries, or to show a short list per group without complex joins. Performance note: Sorting per partition can be expensive; filter early if possible (e.g., limit candidate rows before windowing). For a deeper explanation, see example patterns for top-N per group: https://use-the-index-luke.com/sql/window-functions/row-number
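The same pattern can be sketched end to end in SQLite, here as a hypothetical top-2 per category (schema and numbers invented for the demo):

```python
import sqlite3

# Hypothetical "products" table; sales figures are arbitrary demo data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (category_id INTEGER, name TEXT, sales INTEGER);
INSERT INTO products VALUES
  (1, 'a', 50), (1, 'b', 90), (1, 'c', 70),
  (2, 'd', 30), (2, 'e', 60);
""")
top2 = conn.execute("""
SELECT category_id, name
FROM (
  SELECT p.*, ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY sales DESC) AS rn
  FROM products p
) x
WHERE rn <= 2
ORDER BY category_id, rn
""").fetchall()
print(top2)  # [(1, 'b'), (1, 'c'), (2, 'e'), (2, 'd')]
```

Changing the rn <= 2 bound is all it takes to widen or narrow the per-group list.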
3. Rolling 7-day sum (useful for rolling metrics)
SELECT day,
SUM(value) OVER (ORDER BY day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS rolling_7
FROM daily_metrics;
Explanation: Uses a sliding-window SUM() to compute a rolling total over the last 7 days for each day. When to use this SQL snippet: Time-series metrics like rolling active users, rolling revenue, or smoothing noisy daily measures. Performance note: Works well on pre-aggregated daily data; avoid running it on very large, row-level event tables without pre-aggregation. For more information, visit: Intro to SQL window framing: https://www.postgresql.org/docs/current/tutorial-window.html
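To see the frame in action, here is a small sketch with eight consecutive days of value 1 (made-up data), so the rolling sum climbs to 7 and then plateaus once the window is full:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_metrics (day TEXT, value INTEGER)")
# One row per day, value 1, for 2024-01-01 .. 2024-01-08 (demo data).
conn.executemany("INSERT INTO daily_metrics VALUES (?, ?)",
                 [(f"2024-01-0{d}", 1) for d in range(1, 9)])
rolling = conn.execute("""
SELECT day,
       SUM(value) OVER (ORDER BY day ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS rolling_7
FROM daily_metrics
ORDER BY day
""").fetchall()
print([r[1] for r in rolling])  # [1, 2, 3, 4, 5, 6, 7, 7]
```

Note that ROWS BETWEEN 6 PRECEDING counts rows, not calendar days, which is why the input must already be one row per day with no gaps.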
4. Simple month-over-month churn rate
WITH monthly_active AS (
SELECT DATE_TRUNC('month', event_date) AS month, user_id
FROM events
GROUP BY 1, 2
)
SELECT prev.month + INTERVAL '1 month' AS month,
       1.0 - COUNT(DISTINCT curr.user_id) * 1.0 / COUNT(DISTINCT prev.user_id) AS churn_rate
FROM monthly_active prev
LEFT JOIN monthly_active curr
  ON curr.user_id = prev.user_id
 AND curr.month = prev.month + INTERVAL '1 month'
GROUP BY prev.month
ORDER BY 1;
Explanation: This SQL snippet builds month-level active user sets and compares current vs previous month to compute a churn rate. When to use: Monthly retention/churn dashboards and executive reports. Performance note: Use GROUP BY on truncated dates and ensure event_date has an index if the base table is large. For further information, visit: Churn basics and cohort analysis overview: https://www.customer.io/blog/churn-rate-formula/
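The core logic (share of last month's users who are not active this month) can be sketched in SQLite with pre-bucketed integer months, since SQLite lacks DATE_TRUNC and INTERVAL; the users and months are invented for the demo:

```python
import sqlite3

# Month 1 has users {1,2,3,4}; month 2 retains only {1,2}, so churn = 0.5.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE monthly_active (month INTEGER, user_id INTEGER);
INSERT INTO monthly_active VALUES
  (1, 1), (1, 2), (1, 3), (1, 4),
  (2, 1), (2, 2);
""")
churn = conn.execute("""
SELECT prev.month + 1 AS month,
       1.0 - COUNT(DISTINCT curr.user_id) * 1.0 / COUNT(DISTINCT prev.user_id) AS churn_rate
FROM monthly_active prev
LEFT JOIN monthly_active curr
  ON curr.user_id = prev.user_id AND curr.month = prev.month + 1
GROUP BY prev.month
ORDER BY 1
""").fetchall()
print(churn)  # [(2, 0.5), (3, 1.0)]
```

The LEFT JOIN keeps every prior-month user; users with no match next month contribute NULL to the retained count, which is exactly what the churn ratio needs. (The last month reports churn of 1.0 only because no following month exists yet.)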
5. Safe dedupe into a table (insert only newest rows)
WITH dedup AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY external_id ORDER BY created_at DESC) AS rn
FROM staging_table
)
INSERT INTO target_table (col1, col2, …)
SELECT col1, col2, …
FROM dedup
WHERE rn = 1;
Explanation: Deduplicates staging data by external_id, keeping the newest row per key, and inserts only the deduped rows into the target table. When to use this SQL snippet: ETL pipeline stages where you land messy data then upsert/insert deduped rows into the main table. Performance note: For large staging tables, consider using a staging index and batching the insert. If your DB supports MERGE/UPSERT, consider that for idempotency. For more information, visit: MERGE/UPSERT patterns: https://www.postgresql.org/docs/current/sql-merge.html
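Here is a runnable sketch of the staging-to-target flow in SQLite (hypothetical tables and payloads; your real column lists replace the demo ones):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staging_table (external_id INTEGER, payload TEXT, created_at TEXT);
CREATE TABLE target_table (external_id INTEGER, payload TEXT);
INSERT INTO staging_table VALUES
  (1, 'old', '2024-01-01'),
  (1, 'new', '2024-01-02'),
  (2, 'only', '2024-01-01');
""")
# Keep only the newest row per external_id, then insert into the target.
conn.execute("""
WITH dedup AS (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY external_id ORDER BY created_at DESC) AS rn
  FROM staging_table
)
INSERT INTO target_table (external_id, payload)
SELECT external_id, payload
FROM dedup
WHERE rn = 1
""")
rows = conn.execute(
    "SELECT external_id, payload FROM target_table ORDER BY external_id").fetchall()
print(rows)  # [(1, 'new'), (2, 'only')]
```

The duplicate external_id 1 lands only once, with its latest payload.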
6. Calculate cohort retention (simple)
WITH installs AS (
SELECT user_id, DATE_TRUNC('week', install_date) AS cohort_week
FROM users
)
SELECT i.cohort_week, DATE_TRUNC('week', e.event_date) AS event_week,
COUNT(DISTINCT e.user_id) AS retained_users
FROM installs i
LEFT JOIN events e ON i.user_id = e.user_id
GROUP BY 1, 2
ORDER BY 1, 2;
Explanation: This SQL snippet joins installs (cohorts) to event data and aggregates retained users by cohort and event week. When to use: Creating a simple cohort retention table for weekly or monthly cohorts. Performance note: Cohort calculations can explode in intermediate rows; pre-aggregate user counts where possible and use date_trunc to normalize. For more information, visit: Cohort analysis primer: https://www.ycombinator.com/library/4A-how-to-do-cohort-analysis
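A compact sketch of the same shape, with weeks pre-bucketed as integers since SQLite has no DATE_TRUNC (users and weeks are demo data):

```python
import sqlite3

# Cohort week 1: users 1 and 2; cohort week 2: user 3 (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE installs (user_id INTEGER, cohort_week INTEGER);
CREATE TABLE events (user_id INTEGER, event_week INTEGER);
INSERT INTO installs VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO events VALUES (1, 1), (1, 2), (2, 1), (3, 2);
""")
retention = conn.execute("""
SELECT i.cohort_week, e.event_week, COUNT(DISTINCT e.user_id) AS retained_users
FROM installs i
LEFT JOIN events e ON i.user_id = e.user_id
GROUP BY 1, 2
ORDER BY 1, 2
""").fetchall()
print(retention)  # [(1, 1, 2), (1, 2, 1), (2, 2, 1)]
```

Reading the output: cohort week 1 had 2 users active in week 1 and 1 still active in week 2; a BI tool can pivot this long table into the familiar retention triangle.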
7. Compare two periods (YoY or WoW percentage change)
WITH sums AS (
SELECT DATE_TRUNC('week', event_date) AS period,
SUM(amount) AS total
FROM payments
GROUP BY 1
)
SELECT s.period,
s.total,
       1.0 * (s.total - LAG(s.total) OVER (ORDER BY s.period))
           / NULLIF(LAG(s.total) OVER (ORDER BY s.period), 0) AS pct_change
FROM sums s
ORDER BY s.period;
Explanation: Aggregates totals by period and uses LAG() to compute period-over-period percent change. When to use this SQL snippet: Reporting growth metrics like weekly revenue change, MAU change, or other rolling business KPIs. Performance note: LAG() requires ordering the periods; ensure period aggregation reduces dataset size before windowing. For more information, visit: Time-series comparison techniques: https://www.red-gate.com/simple-talk/sql/t-sql-programming/comparing-periods-in-sql/
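The LAG() arithmetic can be verified on three pre-aggregated periods in SQLite (integer periods and round totals, invented for the demo):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sums (period INTEGER, total REAL);
INSERT INTO sums VALUES (1, 100.0), (2, 110.0), (3, 99.0);
""")
pct = conn.execute("""
SELECT period,
       1.0 * (total - LAG(total) OVER (ORDER BY period))
           / NULLIF(LAG(total) OVER (ORDER BY period), 0) AS pct_change
FROM sums
ORDER BY period
""").fetchall()
print(pct)  # first period has no predecessor, so pct_change is NULL/None
```

100 to 110 is +10%, 110 to 99 is -10%, and NULLIF guards against a zero baseline instead of raising a division error.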
8. Detect data anomalies (z-score method)
WITH stats AS (
SELECT AVG(value) AS mean, STDDEV_POP(value) AS sd
FROM metric_table
),
flagged AS (
SELECT m.*, (m.value - s.mean) / NULLIF(s.sd, 0) AS z_score
FROM metric_table m CROSS JOIN stats s
)
SELECT * FROM flagged WHERE ABS(z_score) > 3;
Explanation: Computes the mean and population standard deviation, then flags rows with an absolute z-score above a threshold (e.g., 3). When to use: Quick anomaly detection on a single metric before deeper investigation or alerting. Performance note: Sensitive to distribution; the z-score assumes a roughly normal distribution. For non-normal metrics, prefer robust methods (IQR) or time-series anomaly detection tools. For a deeper explanation, visit: Z-score anomaly detection overview: https://en.wikipedia.org/wiki/Standard_score
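A small sketch of the same idea in SQLite, which lacks STDDEV_POP, so the demo derives it as sqrt(E[x²] − E[x]²) with a registered sqrt function; the values and the lowered threshold (1.9 rather than 3, since five points cannot reach z = 3) are purely illustrative:

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
conn.create_function("sqrt", 1, math.sqrt)  # stand-in; SQLite core has no STDDEV_POP
conn.executescript("""
CREATE TABLE metric_table (value REAL);
INSERT INTO metric_table VALUES (10), (11), (9), (10), (100);
""")
outliers = conn.execute("""
WITH stats AS (
  SELECT AVG(value) AS mean,
         sqrt(AVG(value * value) - AVG(value) * AVG(value)) AS sd
  FROM metric_table
)
SELECT m.value, (m.value - s.mean) / NULLIF(s.sd, 0) AS z_score
FROM metric_table m CROSS JOIN stats s
WHERE ABS((m.value - s.mean) / NULLIF(s.sd, 0)) > 1.9
""").fetchall()
print(outliers)  # only the 100 row is flagged
```

In a real database with STDDEV_POP the original snippet applies unchanged; the point of the demo is that only the obvious outlier survives the threshold.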
9. Unpivot (wide → long) using UNION ALL (portable)
SELECT id, 'metric_a' AS metric, metric_a AS value FROM metrics
UNION ALL
SELECT id, 'metric_b', metric_b FROM metrics
UNION ALL
SELECT id, 'metric_c', metric_c FROM metrics;
Explanation: Converts wide tables into long format using UNION ALL, a portable approach that works where native UNPIVOT is not available. When to use: Preparing data for aggregation, time-series pivoting, or feeding BI tools that prefer long format. Performance note: UNION ALL is simple but verbose for many columns — consider using UNNEST or JSON methods for many columns or DB-specific UNPIVOT where available. For more information, visit: Unpivoting techniques: https://mode.com/sql-tutorial/sql-unpivot/
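A two-column sketch of the unpivot in SQLite (hypothetical metrics table) shows one input row becoming one output row per metric:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE metrics (id INTEGER, metric_a INTEGER, metric_b INTEGER);
INSERT INTO metrics VALUES (1, 10, 20);
""")
long_rows = conn.execute("""
SELECT id, 'metric_a' AS metric, metric_a AS value FROM metrics
UNION ALL
SELECT id, 'metric_b', metric_b FROM metrics
ORDER BY id, metric
""").fetchall()
print(long_rows)  # [(1, 'metric_a', 10), (1, 'metric_b', 20)]
```

Each additional wide column is one more UNION ALL branch, which is why this stays portable but gets verbose past a handful of columns.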
10. Efficient pagination with keyset (avoid OFFSET for large tables)
SELECT *
FROM events
WHERE (event_time, id) > ('2026-01-01 00:00:00', 12345)
ORDER BY event_time, id
LIMIT 100;
Explanation: Uses the last-seen (event_time, id) tuple in a WHERE clause to page more efficiently than OFFSET, which degrades on large offsets. When to use: APIs and UI pagination that scan large, append-only tables. Performance note: Keyset pagination is much faster for deep pages; you need a stable ordering and must construct the WHERE clause carefully to include tie-breakers (id). For a deeper explanation, visit: Keyset pagination best practices: https://www.citusdata.com/blog/2016/01/12/faster-pagination/
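The row-value comparison also works in SQLite (3.15+), so the two-page flow can be sketched end to end with a tiny hypothetical events table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (event_time TEXT, id INTEGER, payload TEXT);
INSERT INTO events VALUES
  ('2024-01-01', 1, 'a'),
  ('2024-01-01', 2, 'b'),
  ('2024-01-02', 3, 'c');
""")
# Page 1: no cursor yet, just the stable ordering and a LIMIT.
page1 = conn.execute(
    "SELECT event_time, id FROM events ORDER BY event_time, id LIMIT 2").fetchall()
# Page 2: resume strictly after the last-seen (event_time, id) tuple.
last_time, last_id = page1[-1]
page2 = conn.execute("""
SELECT event_time, id FROM events
WHERE (event_time, id) > (?, ?)
ORDER BY event_time, id
LIMIT 2
""", (last_time, last_id)).fetchall()
print(page2)  # [('2024-01-02', 3)]
```

The id tie-breaker matters here: two rows share the same event_time, and without it page 2 could skip or repeat one of them.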
Summary
These ten SQL snippets are small, practical tools you can drop into your workflow to save time on common analyst tasks, from finding the latest row per entity to fast pagination and simple anomaly detection. Start by copying a snippet into a dev schema, adjust names and filters for your data, and add indexes where performance matters. If you build a snippet-pack workflow (local .sql files plus a short README), you'll shave minutes, and often hours, off repetitive work and make your analyses more reproducible.
Want help adapting these snippets to your schema or building your own snippet pack? Contact me and I'll walk through your data model and help you get the most useful queries working in your environment.
