Introduction
HeatWave does not simply copy InnoDB tables into memory. It restructures them using columnar storage optimized for analytical workloads. This article explains exactly how that process works.
1. Columnar Transformation
HeatWave converts each InnoDB row-based table into columnar segments:
- Each column is stored separately
- Data is aligned for vectorized processing
- Compression algorithms minimize the memory footprint
This drastically improves:
- Scan speeds
- Filter performance
- Join processing
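To see why this helps, here is a minimal Python sketch (not HeatWave code) contrasting a row-oriented layout with a column-oriented one. The table contents and column names are invented for illustration; the point is that a filter on one column only has to read that column's array.

```python
# Row-oriented layout: each record keeps all of its columns together,
# so a scan touches every field of every row.
rows = [
    {"order_id": 1, "country": "USA", "amount": 120.0},
    {"order_id": 2, "country": "Canada", "amount": 80.0},
    {"order_id": 3, "country": "USA", "amount": 45.5},
]

# Column-oriented layout: each column is a contiguous array of its own.
columns = {
    "order_id": [1, 2, 3],
    "country": ["USA", "Canada", "USA"],
    "amount": [120.0, 80.0, 45.5],
}

# A filter on one column scans only that column's array; the other
# columns are never read, and the tight loop over a flat array is
# easy for the CPU (or a vectorizing engine) to process in batches.
matches = [i for i, c in enumerate(columns["country"]) if c == "USA"]
print(matches)                                   # [0, 2]
print([columns["amount"][i] for i in matches])   # [120.0, 45.5]
```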
2. Encoding & Compression Techniques
Dictionary Encoding
Used for string columns, especially those with relatively few distinct values: each distinct string is stored once in a dictionary, and the column itself stores compact integer IDs.
Example:
| Original Value | Encoded Value |
|---|---|
| "USA" | 1 |
| "Canada" | 2 |
Bit-Packing
Numbers are stored using the smallest number of bits possible.
Example: A column containing values 1–10 needs only 4 bits per value.
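A rough illustration of the size arithmetic, packing two 4-bit values per byte in Python; the real engine packs into machine words with vector-friendly alignment, so treat this only as a way to see the math:

```python
def pack_4bit(values):
    """Pack integers in the range 0..15 two per byte, 4 bits each."""
    assert all(0 <= v <= 15 for v in values)
    packed = bytearray()
    for i in range(0, len(values), 2):
        high = values[i] << 4
        low = values[i + 1] if i + 1 < len(values) else 0
        packed.append(high | low)
    return bytes(packed)

values = [1, 7, 10, 3, 9, 2]       # every value fits in 4 bits
packed = pack_4bit(values)
print(len(values), "values packed into", len(packed), "bytes")   # 6 values packed into 3 bytes
# Stored as ordinary 8-byte integers, the same six values would take 48 bytes.
```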
Run-Length Encoding (RLE)
Runs of repeated values, which are common in sorted or low-cardinality columns, are stored as a value plus a repeat count instead of being written out one by one.
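A minimal run-length encoder, just to make the idea concrete; the pair-of-tuples representation is illustrative, not HeatWave's internal format:

```python
def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs

# A sorted column produces long runs, so it compresses very well:
print(run_length_encode(["Canada"] * 4 + ["USA"] * 6))
# [('Canada', 4), ('USA', 6)]  -> 2 pairs instead of 10 strings
```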
Adaptive Compression
HeatWave selects the compression scheme for each column based on its data distribution, so differently shaped columns in the same table can end up with different encodings.
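HeatWave's actual heuristics are not public, so the sketch below is a hypothetical chooser that picks between the techniques above from simple statistics (run lengths, integer range, distinct-value count), only to illustrate the kind of decision involved:

```python
def choose_encoding(values):
    """Hypothetical per-column encoding choice driven by simple statistics."""
    if not values:
        return "plain"
    runs = sum(1 for i, v in enumerate(values) if i == 0 or values[i - 1] != v)
    if runs <= len(values) // 4:
        return "run-length"            # long runs of repeated values
    if all(isinstance(v, int) and v >= 0 for v in values):
        if max(values).bit_length() <= 16:
            return "bit-packing"       # small integer range
    if len(set(values)) <= len(values) // 10:
        return "dictionary"            # few distinct (string) values
    return "plain"                     # nothing stands out; store as-is

print(choose_encoding(["USA"] * 50 + ["Canada"] * 50))     # run-length
print(choose_encoding(list(range(1, 11)) * 10))            # bit-packing
print(choose_encoding([f"user-{i}" for i in range(100)]))  # plain
```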
3. HeatWave Partitioning
Data is divided into fixed-size segments:
- Each segment contains a slice of the table's columns
- Segments are stored redundantly across multiple worker nodes
- Placement is balanced for parallel processing
Partition sizes are optimized for:
- Memory locality
- CPU vectorized operations
- Cache efficiency
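A toy Python sketch of the distribution idea: a column is cut into fixed-size segments that are assigned round-robin to worker nodes, so a scan can run on all nodes at once. The segment size and placement policy here are made-up placeholders, not HeatWave's real values.

```python
SEGMENT_SIZE = 4   # hypothetical rows per segment; real segment sizes are far larger

def segment_column(values, segment_size=SEGMENT_SIZE):
    """Cut a column into fixed-size segments."""
    return [values[i:i + segment_size] for i in range(0, len(values), segment_size)]

def assign_to_workers(segments, num_workers):
    """Round-robin placement so every worker can scan its segments in parallel."""
    placement = {worker: [] for worker in range(num_workers)}
    for i, segment in enumerate(segments):
        placement[i % num_workers].append(segment)
    return placement

column = list(range(1, 17))              # 16 values -> 4 segments
placement = assign_to_workers(segment_column(column), num_workers=2)
print(placement)
# {0: [[1, 2, 3, 4], [9, 10, 11, 12]], 1: [[5, 6, 7, 8], [13, 14, 15, 16]]}
```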
4. Memory Footprint Advantages
In many benchmarks:
- InnoDB table (original): 120 GB
- HeatWave in-memory (compressed): 30–40 GB
- Improvement: 3–4× memory reduction
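The ratio is simply the original size divided by the in-memory size; a quick check of the figures above:

```python
original_gb = 120
for compressed_gb in (30, 40):
    ratio = original_gb / compressed_gb
    print(f"{original_gb} GB / {compressed_gb} GB = {ratio:.1f}x less memory")
# 120 GB / 30 GB = 4.0x less memory
# 120 GB / 40 GB = 3.0x less memory
```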
Conclusion
HeatWave’s internal storage layer is what drives its fast analytical performance. Understanding how encoding works helps you choose schema designs and column types that compress well and scan quickly.