Topic Overview
Time-series databases optimize for workloads where data points are indexed by timestamp and
queries typically aggregate or filter by time ranges. These systems handle write-heavy
ingestion patterns (potentially millions of points per second) with compression algorithms
that exploit temporal locality. Unlike general-purpose databases, time-series databases
implement time-based partitioning, automatic downsampling, and retention policies that
delete old data. InfluxDB takes a NoSQL approach, ingesting data through a text-based line
protocol optimized for high-throughput writes,
while TimescaleDB extends PostgreSQL with hypertables that automatically partition by time.
The choice between SQL and NoSQL approaches affects query expressiveness, tooling ecosystem,
and operational complexity. Students must evaluate when specialized time-series databases
provide necessary performance versus when general-purpose databases with time-series extensions
are sufficient.
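As a rough illustration of the downsampling and retention ideas described above, the following minimal sketch averages raw points into fixed-width time buckets and drops points older than a cutoff. The function names (`downsample`, `apply_retention`) and the in-memory list of `(timestamp, value)` tuples are assumptions made for illustration; real time-series databases perform these steps inside their storage engines.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, bucket_seconds):
    """Average (timestamp, value) points into fixed-width time buckets."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        bucket_start = ts - ts % bucket_seconds
        buckets[bucket_start].append(value)
    return [(start, mean(vals)) for start, vals in sorted(buckets.items())]


def apply_retention(points, now, max_age_seconds):
    """Keep only points that are no older than max_age_seconds."""
    cutoff = now - max_age_seconds
    return [(ts, value) for ts, value in points if ts >= cutoff]


raw = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0)]
per_minute = downsample(raw, bucket_seconds=60)
recent = apply_retention(raw, now=100, max_age_seconds=50)
```

Downsampling trades detail for space (four points become two here), while retention discards data outright; the two are often combined so that old data survives only at coarse resolution.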
Student Presentation Assignments
Student 1: Time-Series Data Fundamentals
Required Coverage:
- Must formally define time-series data, specifying characteristics (timestamp ordering, high cardinality, append-only writes) and distinguishing from other temporal data
- Must compare write-heavy vs read-heavy workloads, analyzing typical access patterns and how they affect database design
- Must explain retention policies and downsampling, specifying how they manage data volume and analyzing trade-offs in data loss vs storage costs
- Must justify time-series database selection for at least two applications (e.g., IoT, monitoring), explaining why general-purpose databases are inadequate
- Must analyze why general-purpose databases struggle with time-series data, specifying indexing, partitioning, and compression challenges
- Must explain time-series data modeling (tags, fields, measurements), analyzing design rationale and comparing with relational normalization approaches, demonstrating why normalization breaks down for time-series workloads
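To make the tags/fields/measurements model and the notion of cardinality concrete, here is a small sketch under simplifying assumptions. It treats a series as the combination of a measurement name and a tag set, and counts distinct series in a batch of points. The dictionary layout and the helper names `series_key` and `series_cardinality` are hypothetical, not any database's API.

```python
def series_key(point):
    """Identify a series by its measurement and its sorted tag set.

    Fields and the timestamp are not part of the key: they are the
    values recorded within a series, not its identity.
    """
    return (point["measurement"], tuple(sorted(point["tags"].items())))


def series_cardinality(points):
    """Count the distinct series present in a batch of points."""
    return len({series_key(p) for p in points})


points = [
    {"measurement": "cpu", "tags": {"host": "a", "region": "eu"},
     "fields": {"usage": 0.41}, "timestamp": 1},
    {"measurement": "cpu", "tags": {"region": "eu", "host": "a"},
     "fields": {"usage": 0.43}, "timestamp": 2},
    {"measurement": "cpu", "tags": {"host": "b", "region": "eu"},
     "fields": {"usage": 0.90}, "timestamp": 2},
]
cardinality = series_cardinality(points)
```

The first two points belong to the same series even though their tags are listed in a different order. Putting an unbounded value such as a request ID into a tag would create a new series per point, which is one way cardinality grows out of control.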
Student 2: InfluxDB (NoSQL Approach)
Required Coverage:
- Must explain line protocol and data ingestion, specifying format syntax and analyzing performance characteristics
- Must explain storage engine and compression, analyzing why time-series data compresses well (temporal locality, value similarity) and how this affects storage design, not merely stating compression ratios
- Must explain Flux query language as a query model, analyzing how it differs from SQL in expressing time-series operations and identifying limitations, not providing a syntax tutorial
- Must explain retention policies and automatic data management, specifying how they work and analyzing trade-offs
- Must analyze scaling and clustering challenges, identifying specific limitations and comparing solutions (InfluxDB Enterprise vs open source)
- Must explain InfluxDB architecture components and data flow, specifying how writes and queries are processed
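For orientation on the line protocol format, the sketch below serializes one point as `measurement,tag_set field_set timestamp`. The helper `to_line_protocol` is a hypothetical illustration, not part of any InfluxDB client library, and its escaping rules are simplified; the official line protocol reference is the authority on edge cases.

```python
def _escape(value, chars):
    """Backslash-escape each character in `chars` that occurs in `value`."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Serialize one point as 'measurement,tags fields timestamp'."""
    head = _escape(measurement, ", ")
    # Tags are written in sorted key order.
    for key in sorted(tags):
        head += "," + _escape(key, ",= ") + "=" + _escape(tags[key], ",= ")

    parts = []
    for key, value in fields.items():
        if isinstance(value, bool):      # check bool before int
            text = "true" if value else "false"
        elif isinstance(value, int):     # integers carry an 'i' suffix
            text = f"{value}i"
        elif isinstance(value, float):
            text = repr(value)
        else:                            # strings are double-quoted
            escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
            text = '"' + escaped + '"'
        parts.append(_escape(key, ",= ") + "=" + text)

    return f"{head} {','.join(parts)} {timestamp_ns}"


line = to_line_protocol(
    "cpu",
    {"region": "us-west", "host": "server01"},
    {"usage": 0.64, "cores": 4},
    1700000000000000000,
)
```

Because each point is a single self-describing text line with no schema negotiation, writers can batch many lines into one request, which is part of why the format suits high-throughput ingestion.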
Student 3: TimescaleDB (SQL-Based Approach)
Required Coverage:
- Must explain hypertables and chunking, specifying automatic partitioning strategy and how it differs from manual partitioning
- Must evaluate SQL compatibility advantages, analyzing how existing tools and ecosystem integration differ from NoSQL approaches
- Must compare indexing strategies for time-series (B-tree vs specialized indexes), analyzing performance trade-offs
- Must explain continuous aggregates and pre-computed rollups, specifying how they improve query performance and storage trade-offs
- Must compare TimescaleDB with InfluxDB, analyzing SQL vs NoSQL trade-offs in query expressiveness, performance, and operational complexity
- Must evaluate PostgreSQL extension model, specifying benefits (ecosystem compatibility) and limitations (PostgreSQL constraints)
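The idea behind time-based chunking can be sketched conceptually as follows. This is not TimescaleDB's implementation: it assumes chunks of a fixed interval aligned to the Unix epoch, and the names `chunk_range` and `chunks_overlapping` are hypothetical. It shows how a timestamp maps to exactly one chunk and how a time-range query only needs to touch the chunks it overlaps.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_range(ts, interval):
    """Return the [start, end) bounds of the chunk containing `ts`."""
    index = (ts - EPOCH) // interval  # timedelta // timedelta yields an int
    start = EPOCH + index * interval
    return start, start + interval


def chunks_overlapping(query_start, query_end, interval):
    """List the start times of all chunks a [start, end) query must scan."""
    start, _ = chunk_range(query_start, interval)
    starts = []
    while start < query_end:
        starts.append(start)
        start += interval
    return starts


day = timedelta(days=1)
ts = datetime(2024, 1, 10, 13, 0, tzinfo=timezone.utc)
bounds = chunk_range(ts, day)
scanned = chunks_overlapping(
    ts, datetime(2024, 1, 12, 1, 0, tzinfo=timezone.utc), day
)
```

With manual partitioning, an operator creates and names each partition ahead of time; with automatic chunking, the mapping above is applied on insert, and dropping expired data amounts to dropping whole chunks.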
Student 4: Choosing the Right Time-Series Database
Required Coverage:
- Must evaluate SQL vs NoSQL trade-offs, specifying when to choose each approach based on query patterns and tooling requirements
- Must analyze performance benchmarks, comparing ingestion rates and query performance with quantitative data where available
- Must evaluate cost considerations, quantifying storage, compute, and operational overhead for different deployment scenarios
- Must explain integration with dashboards (e.g., Grafana) as a system design constraint, analyzing how database choice affects visualization integration and operational workflows, not listing dashboard features
- Must evaluate at least one production case study, analyzing real-world deployment challenges and solutions
- Must identify scenarios where time-series databases are not necessary, justifying when general-purpose databases suffice
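When quantifying storage cost, a back-of-the-envelope estimate is a reasonable starting point. The sketch below is one such calculation; every input (ingestion rate, bytes per point, retention period, compression ratio) is a hypothetical placeholder to be replaced with measured values, and the function name `estimate_storage_bytes` is an assumption for illustration.

```python
SECONDS_PER_DAY = 86_400


def estimate_storage_bytes(points_per_second, bytes_per_point,
                           retention_days, compression_ratio):
    """Estimate on-disk size for data kept for `retention_days`.

    compression_ratio is raw size divided by compressed size,
    so a value of 10 means the data shrinks to one tenth.
    """
    raw_bytes = (points_per_second * bytes_per_point
                 * retention_days * SECONDS_PER_DAY)
    return raw_bytes / compression_ratio


# Hypothetical workload: 1,000 points/s, 16 bytes each, 30-day retention.
estimate = estimate_storage_bytes(1_000, 16, 30, compression_ratio=10)
```

The estimate scales linearly with retention and inversely with compression ratio, which is why retention policy and compression effectiveness tend to dominate storage cost comparisons.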
Presentation Requirements
All presentations must be 17–20 minutes in duration and include the following components:
- Problem Context: What problem this technology solves and why traditional databases struggle
- Core Concepts: Clear explanation with correct technical terminology
- System Details: How it works in practice with concrete examples
- Trade-offs: Strengths, limitations, and when it is appropriate vs not appropriate
- Real-World Perspective: At least one realistic application scenario and production considerations
Note: Presentations that only summarize definitions, list features, or copy diagrams without
interpretation will receive low marks. Each presentation must demonstrate analytical reasoning through
comparisons, trade-off analysis, and justification of design decisions. Reading slides verbatim or
presenting material that merely restates documentation will be penalized.
Report Requirement: In addition to the presentation, each student must submit an individual PDF report.
See Seminar Report Requirements for format, content, and submission details.
Evaluation Criteria
| Criterion | Weight | Description |
|---|---|---|
| Technical Correctness | 30% | Accuracy of technical content, correct use of terminology, absence of errors |
| Depth of Understanding | 25% | Goes beyond surface-level definitions, demonstrates system-level comprehension |
| Clarity and Structure | 20% | Logical flow, clear explanations, appropriate use of examples and visuals |
| Use of Examples and Trade-offs | 15% | Concrete examples, discussion of limitations, comparison with alternatives |
| Slide Quality and Time Management | 10% | Professional formatting, appropriate pacing, stays within time limit |
Recommended References
Books:
- Kleppmann, Martin. Designing Data-Intensive Applications. O'Reilly Media, 2017. (Note: Time-series databases are not directly covered; Chapter 3 on Storage and Retrieval provides relevant background)
Documentation:
Academic / Technical:
- Time-series database survey papers and benchmarks
- TimescaleDB research papers on hypertables and compression