Storage Spilling
Condition
A query in RUNNING status writes too much temporary data to disk. Spilling is usually caused by very large sorts (DISTINCT, ROW_NUMBER() window function, etc.).
How to fix
Consider adding more filters and processing a smaller amount of data in one go.
Consider adding intermediate steps with explicit pre-aggregation before the sort, so that the sort operates on fewer rows.
Example
Specific arguments
min_local_spilling_gb (int) - how much data must be spilled to local disk before the condition matches, in gigabytes (a minimum of at least 10 GB is recommended to prevent false positives)
min_remote_spilling_gb (int) - how much data must be spilled to remote disk before the condition matches, in gigabytes (the recommended value is 1 GB, since we do not want to see remote spilling at all)
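As a rough illustration of how the two thresholds might interact, the sketch below compares spilled byte counts against both arguments. It is not the tool's actual implementation: the names `SpillingThresholds`, `spilling_matches`, and `BYTES_PER_GB` are hypothetical, and two behaviours are assumed rather than documented here, namely that a gigabyte is counted as 1024^3 bytes and that exceeding either threshold is enough for the condition to match.

```python
from dataclasses import dataclass

# Assumption: gigabytes are binary (1024^3 bytes); the real tool may use 10^9.
BYTES_PER_GB = 1024 ** 3


@dataclass
class SpillingThresholds:
    """Hypothetical container for the two arguments described above."""
    min_local_spilling_gb: int = 10   # recommended minimum from the text
    min_remote_spilling_gb: int = 1   # recommended value from the text


def spilling_matches(local_bytes: int, remote_bytes: int,
                     thresholds: SpillingThresholds) -> bool:
    """Return True if either spilling threshold is reached.

    Assumption: the thresholds are combined with OR, so reaching
    either one is sufficient for the condition to match.
    """
    local_gb = local_bytes / BYTES_PER_GB
    remote_gb = remote_bytes / BYTES_PER_GB
    return (local_gb >= thresholds.min_local_spilling_gb
            or remote_gb >= thresholds.min_remote_spilling_gb)


if __name__ == "__main__":
    t = SpillingThresholds()
    # 5 GB spilled locally, nothing remotely: below both thresholds.
    print(spilling_matches(5 * BYTES_PER_GB, 0, t))
    # 2 GB spilled remotely: reaches the remote threshold.
    print(spilling_matches(0, 2 * BYTES_PER_GB, t))
```

With the default values, a query that spills a few gigabytes locally is ignored, while any meaningful amount of remote spilling is flagged, which mirrors the reasoning behind the recommended values.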