Data & Databases · 10 min read · ~25 min study · intermediate
Time Series Databases in Finance
When relational is not enough — why finance reaches for purpose-built TSDBs for tick storage and monitoring.
Time Series Databases in Finance: When Relational Is Not Enough
Why financial firms use specialized time series databases for market data, tick storage, and monitoring — and when you should consider one.
The Problem With Relational Databases and Market Data
Relational databases are excellent general-purpose tools. But throw a few billion timestamped price ticks at PostgreSQL and you will start to feel the pain. Not because Postgres is bad — it is genuinely excellent — but because the access patterns for time series data are fundamentally different from transactional data.
With market data, you almost always query by time range: "give me AAPL prices between 9:30 and 10:00 on January 15th." You rarely update old data. You insert in chronological order. And the volumes can be staggering — a single liquid equity can generate thousands of ticks per second during market hours.
Time series databases (TSDBs) are engineered specifically for these patterns.
What Makes a TSDB Different
Time-Based Partitioning
Data is automatically chunked by time period. Query a one-hour window and the database only touches the relevant chunk, ignoring terabytes of irrelevant historical data. In a relational database, you would need to set up partitioning manually, and it is rarely as well optimized for time-range queries.
Columnar Compression
Time series data is highly compressible. Adjacent timestamps differ by small increments. Prices change by small amounts tick to tick. TSDBs exploit this with specialized compression algorithms that can achieve 10-20x compression ratios compared to row-based relational storage.
Built-In Downsampling
Converting tick data to 1-minute, 5-minute, or hourly candles is a core operation in finance. TSDBs have native functions for this, rather than requiring complex SQL aggregations or application-level processing.
Retention Policies
Automatically age out old data. Keep raw ticks for 30 days, 1-minute aggregates for a year, daily aggregates forever. TSDBs handle this automatically.
The Major Players
| Database | Best For | SQL Support | Notes |
|---|---|---|---|
| TimescaleDB | Teams already on PostgreSQL | Full SQL | Extension on Postgres — familiar tools and ecosystem |
| QuestDB | High-performance financial data | SQL-compatible | Designed for finance, very fast ingestion |
| InfluxDB | Monitoring and operational metrics | Flux/SQL | Popular, mature ecosystem |
| kdb+/q | Ultra-low-latency trading | q (custom) | Industry standard at banks and HFTs, steep learning curve |
| ClickHouse | Large-scale analytics | SQL | Columnar, extremely fast for aggregations |
If you are already invested in the PostgreSQL ecosystem, TimescaleDB is the path of least resistance — you keep standard SQL, your existing tooling, and your team's knowledge, while gaining time series optimizations.
For greenfield projects where performance is critical, QuestDB is worth evaluating. It was designed from the ground up for financial data and benchmarks extremely well for both ingestion and query speed.
A Practical Example
Here is what working with TimescaleDB looks like. Notice it is just SQL with some extra functions:
-- Create a hypertable (time-partitioned table)
CREATE TABLE ticks (
time TIMESTAMPTZ NOT NULL,
symbol TEXT NOT NULL,
price DOUBLE PRECISION,
volume INTEGER
);
SELECT create_hypertable('ticks', 'time');
-- Add a composite index for symbol + time queries
CREATE INDEX idx_ticks_symbol_time ON ticks (symbol, time DESC);
-- Query: OHLCV candles from raw ticks
SELECT
time_bucket('1 minute', time) AS candle_time,
symbol,
first(price, time) AS open,
max(price) AS high,
min(price) AS low,
last(price, time) AS close,
sum(volume) AS volume
FROM ticks
WHERE symbol = 'AAPL'
AND time >= NOW - INTERVAL '1 day'
GROUP BY candle_time, symbol
ORDER BY candle_time;
The time_bucket, first, and last functions are TSDB-specific and make time series aggregations trivial. Compare this to the raw SQL gymnastics you would need in a standard relational database.
Continuous Aggregates
Pre-compute common aggregations that update automatically:
-- Materialised view that automatically stays up to date
CREATE MATERIALIZED VIEW ohlcv_1min
WITH (timescaledb.continuous) AS
SELECT
time_bucket('1 minute', time) AS bucket,
symbol,
first(price, time) AS open,
max(price) AS high,
min(price) AS low,
last(price, time) AS close,
sum(volume) AS volume
FROM ticks
GROUP BY bucket, symbol;
When You Actually Need One
Not every project needs a TSDB. If you are working with daily data for a few hundred instruments, a well-designed relational database handles that perfectly well. TSDBs earn their keep when:
- You are storing tick-level or sub-second data
- Ingestion rates exceed thousands of rows per second
- You need fast time-range queries over billions of rows
- Downsampling and time-based aggregations are core workflows
- Storage costs matter (compression helps significantly)
For connecting your TSDB to the rest of your infrastructure, understanding API design and data pipeline architecture becomes important — the database is just one component of a larger system. And in trading infrastructure, where latency matters, the choice of database can directly impact how fast your systems react to market events.
Want to go deeper on Time Series Databases in Finance: When Relational Is Not Enough?
This article covers the essentials, but there's a lot more to learn. Inside , you'll find hands-on coding exercises, interactive quizzes, and structured lessons that take you from fundamentals to production-ready skills — across 50+ courses in technology, finance, and mathematics.
Free to get started · No credit card required
Keep Reading
[Data & Databases
Database Design for Trading Systems
How to structure databases for trading platforms — normalization, schema design, indexing strategies, and the tradeoffs that matter in financial systems.](/quant-knowledge/data/database-design-for-trading-systems)[Data & Databases
SQL for Financial Data: A Developer's Starting Point
Learn the SQL fundamentals that matter for finance — querying trade data, aggregating positions, joining reference data, and understanding relational databases.](/quant-knowledge/data/sql-for-financial-data)[Big Data
Big Data Pipelines in Finance
How financial firms process massive datasets — batch and streaming architectures, ETL patterns, data lakes, and the tools that power modern data infrastructure.](/quant-knowledge/big-data/big-data-pipelines-in-finance)[Networking
Network Speeds and Latency in Financial Systems
Why latency matters in trading, how to measure it, where the bottlenecks are, and what firms do to minimize it — from co-location to kernel bypass.](/quant-knowledge/networking/network-speeds-and-latency-in-financial-systems)
What You Will Learn
- Explain the problem with relational databases and market data.
- Build what makes a TSDB different.
- Calibrate the major players.
- Compute a practical example.
- Design when you actually need one.
Prerequisites
- Basic SQL — see Basic SQL.
- Python data tooling — see Python data tooling.
- Comfort reading code and basic statistical notation.
- Curiosity about how the topic shows up in a US trading firm.
Mental Model
Financial data is heavy, append-only, and queried at every horizon — milliseconds for risk, hours for research, years for compliance. The right database is the one that aligns its access pattern with the dominant query, not the one that scales the highest in a vendor pitch. For Time Series Databases in Finance, frame the topic as the piece that when relational is not enough — why finance reaches for purpose-built TSDBs for tick storage and monitoring — and ask what would break if you removed it from the workflow.
Why This Matters in US Markets
US trading firms run on KDB+, ClickHouse, Snowflake, Postgres, and Parquet on S3. Tick stores at Citadel and Jane Street hold petabytes of CME and US equities data going back decades; risk groups at GS and JPM run intraday VaR off Snowflake or Vertica. Knowing how to model and query this data is the line between research and ops.
In US markets, Time Series Databases in Finance tends to surface during onboarding, code review, and the first incident a junior quant gets pulled into. Questions on this material recur in interviews at Citadel, Two Sigma, Jane Street, HRT, Jump, DRW, IMC, Optiver, and the major bulge-bracket banks.
Common Mistakes
- Not partitioning by (symbol, date) and watching every query do a full scan.
- Treating timestamps without timezones — US markets are America/New_York, not UTC, on the wire.
- Joining on price columns subject to floating-point drift instead of integer ticks.
- Treating Time Series Databases in Finance as a one-off topic rather than the foundation it becomes once you ship code.
- Skipping the US-market context — copying European or Asian conventions and getting bitten by US tick sizes, settlement, or regulator expectations.
- Optimizing for elegance instead of auditability; trading regulators care about reproducibility, not cleverness.
- Confusing model output with reality — the tape is the source of truth, the model is a hypothesis.
Practice Questions
- Design a SQL query that returns the rolling 20-day VWAP for SPY by trade date.
- Why do most US tick stores partition by date and symbol rather than purely by date?
- Compare row-oriented Postgres and column-oriented ClickHouse for storing 5 years of NBBO quotes.
- Define an idempotent UPSERT in Postgres for a daily P&L table keyed by (account, trade_date).
- What is the difference between a CTE and a subquery from a query-planner perspective?
Answers and Explanations
- Use a window function:
SELECT trade_date, SUM(price*volume) OVER w / SUM(volume) OVER w AS vwap_20 FROM trades WHERE symbol='SPY' WINDOW w AS (ORDER BY trade_date ROWS 19 PRECEDING). The window function avoids self-joins and runs in a single pass. - Because the dominant query is
WHERE symbol = ? AND trade_date BETWEEN ? AND ?; double partitioning prunes most files at the planner stage. Pure date partitioning forces every per-symbol query to scan every file in the date range. - Postgres handles the workload but is 10-100× slower on aggregations because every row must be read even when only
bidandaskare projected. ClickHouse projects only the needed columns and applies block-level compression, so a sum-of-spread over a year completes in seconds rather than minutes. INSERT INTO pnl(account, trade_date, gross, fees, net) VALUES (...) ON CONFLICT (account, trade_date) DO UPDATE SET gross=EXCLUDED.gross, fees=EXCLUDED.fees, net=EXCLUDED.net;— running it twice produces the same final state.- A CTE is sometimes materialized (as in older Postgres) and sometimes inlined (modern Postgres, Snowflake, BigQuery), while a correlated subquery is typically inlined and may execute once per outer row. The semantic effect is the same; the performance profile is not.
Glossary
- OLTP — Online Transaction Processing; row-oriented, optimized for many small writes (order books, ledgers).
- OLAP — Online Analytical Processing; column-oriented, optimized for big aggregate queries (P&L, risk).
- Index — an auxiliary data structure (B-tree, hash, bitmap) that speeds up lookups at the cost of write throughput.
- Normalization — splitting data across tables to remove redundancy; trades disk for joins.
- Denormalization — collapsing tables for read speed; standard for time-series tick stores.
- CTE — Common Table Expression; a named subquery defined with
WITH, often used for readability and recursive logic. - Window function — SQL function that computes over a sliding subset (rolling means, lagged returns).
- Parquet — a columnar file format with compression and predicate pushdown; the de-facto research format on S3.
Further Study Path
- SQL for Financial Data — Querying trade data, aggregating positions, joining reference data — the SQL fundamentals that matter for finance.
- Advanced SQL for Financial Systems — CTEs, window functions, query optimization — the SQL patterns used in real trading platforms.
- Database Design for Trading Systems — Schema design, normalization, indexing, and the tradeoffs that matter when latency and correctness collide.
- Python for Quant Finance: Fundamentals — Variables, functions, data structures, classes, and error handling — the core Python every quant role expects.
- Advanced Python for Financial Applications — Decorators, generators, and context managers — the patterns that separate beginner Python from production quant code.
Key Learning Outcomes
- Explain the problem with relational databases and market data.
- Apply what makes a TSDB different.
- Recognize the major players.
- Describe a practical example.
- Walk through when you actually need one.
- Identify database as it applies to time series databases in finance.
- Articulate time-series as it applies to time series databases in finance.
- Trace tick-data as it applies to time series databases in finance.
- Map how time series databases in finance surfaces at Citadel, Two Sigma, Jane Street, or HRT.
- Pinpoint the US regulatory framing — SEC, CFTC, FINRA — relevant to time series databases in finance.
- Explain a single-paragraph elevator pitch for time series databases in finance suitable for an interviewer.
- Apply one common production failure mode of the techniques in time series databases in finance.
- Recognize when time series databases in finance is the wrong tool and what to use instead.
- Describe how time series databases in finance interacts with the order management and risk gates in a US trading stack.
- Walk through a back-of-the-envelope sanity check that proves your implementation of time series databases in finance is roughly right.
- Identify which US firms publicly hire against the skills covered in time series databases in finance.
- Articulate a follow-up topic from this knowledge base that deepens time series databases in finance.
- Trace how time series databases in finance would appear on a phone screen or onsite interview at a US quant shop.
- Map the day-one mistake a junior would make on time series databases in finance and the senior's fix.
- Pinpoint how to defend a design choice involving time series databases in finance in a code review.