Looking for DuckDB query optimization

Following the discussions about RAM being locked by DuckDB, it seems worth checking whether we can optimize the heaviest queries.

Pagination is one of the options to consider.

Below is an example of one of my heaviest energy-monitoring pages: each chart contains 8 series (it can be more elsewhere), and everything is broken down per phase (L1/L2/L3) plus the total.

Time-based visualizations up to 3 months are smooth. But the yearly view is slower, around 10s.

Initial query when displaying the dashboard with charts for the last hour:


Start of loading when switching to year on the 1st chart

End of loading

Query details

Temporal detail of the query

Following the 4.66.3 update, I confirm that the yearly view now takes « only » about 1 s more, and no difference is noticeable for the rest:

@pierre-gilles, feel free to tell me if you want more details

No need for pagination: we do sampling, and we return at most 300 rows per chart :slight_smile:

I’ll take a look with this information, thanks for the detailed post!


Can you run tests on your database?

I’d like to see the result of this query:

SELECT
    TIME_BUCKET(INTERVAL ? MINUTES, created_at) AS created_at,
    AVG(value) AS value,
    MAX(value) AS max_value,
    MIN(value) AS min_value,
    SUM(value) AS sum_value,
    COUNT(value) AS count_value
FROM
    t_device_feature_state
WHERE device_feature_id = ?
AND created_at > ?
GROUP BY 1
ORDER BY created_at;

You need to replace the ? placeholders:

  • Replace the first with 1752
  • Replace the second with the device_feature_id that has the most history in your database
  • Replace the third with the date one year ago, e.g. '2024-12-14T10:04:11.478Z'
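If it's easier, the same test can also be run directly in the DuckDB CLI with a prepared statement instead of editing the query by hand. This is only a hedged sketch: bucket_test is an illustrative name, the device_feature_id is a placeholder, and to_minutes() stands in for the INTERVAL ? MINUTES placeholder of the query above; .timer on makes the CLI print the execution time of each statement.

-- Enable the CLI timer so each statement prints its execution time
.timer on

-- Prepare the aggregation once, with positional parameters
PREPARE bucket_test AS
SELECT
    TIME_BUCKET(to_minutes(CAST($1 AS INTEGER)), created_at) AS created_at,
    AVG(value) AS value,
    MAX(value) AS max_value,
    MIN(value) AS min_value,
    SUM(value) AS sum_value,
    COUNT(value) AS count_value
FROM
    t_device_feature_state
WHERE device_feature_id = $2
AND created_at > CAST($3 AS TIMESTAMP)
GROUP BY 1
ORDER BY created_at;

-- Run it with the three values described above (placeholder id shown here)
EXECUTE bucket_test(1752, 'your-device-feature-id', '2024-12-14 10:04:11.478');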

If you want to compare, here’s the current query:

  WITH intervals AS (
        SELECT
            created_at,
            value,
            NTILE(300) OVER (ORDER BY created_at) AS interval
        FROM
            t_device_feature_state
        WHERE device_feature_id = ?
        AND created_at > ?
    )
    SELECT
        MIN(created_at) AS created_at,
        AVG(value) AS value,
        MAX(value) AS max_value,
        MIN(value) AS min_value,
        SUM(value) AS sum_value,
        COUNT(value) AS count_value
    FROM
        intervals
    GROUP BY
        interval
    ORDER BY
        created_at;

One question, why 1752? That makes buckets of 29h12??

Otherwise, here are the results:

  1. Aggregation by time bucket - TimeBucket Query
SELECT
    TIME_BUCKET(INTERVAL 1752 MINUTES, created_at) AS created_at,
    AVG(value) AS value,
    MAX(value) AS max_value,
    MIN(value) AS min_value,
    SUM(value) AS sum_value,
    COUNT(value) AS count_value
FROM
    t_device_feature_state
WHERE
    device_feature_id = 'e9079170-4654-4fad-b726-23a36c793905'
    AND created_at > TIMESTAMP '2024-12-14 10:04:11.478'
GROUP BY 1
ORDER BY created_at;

Result (screenshot):

  • 301 rows returned
  • Execution time ≈ 0.486 s
  • Fixed time buckets (~29h)
  2. Quantile partitioning - NTILE(300) Query
WITH intervals AS (
    SELECT
        created_at,
        value,
        NTILE(300) OVER (ORDER BY created_at) AS interval
    FROM
        t_device_feature_state
    WHERE
        device_feature_id = 'e9079170-4654-4fad-b726-23a36c793905'
        AND created_at > TIMESTAMP '2024-12-14 10:04:11.478'
)
SELECT
    MIN(created_at) AS created_at,
    AVG(value) AS value,
    MAX(value) AS max_value,
    MIN(value) AS min_value,
    SUM(value) AS sum_value,
    COUNT(value) AS count_value
FROM
    intervals
GROUP BY
    interval
ORDER BY
    created_at;

Result (screenshot):

  • 300 rows returned
  • Execution time ≈ 0.370 s
  • Intervals balanced by number of points

Exactly! It’s 525,600 minutes in a year divided by 300 buckets, i.e. 1752 minutes ≈ 29 h 12 min :smiley:
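For what it’s worth, here is a minimal sketch (hypothetical, not the current Gladys query) of how that bucket width could be derived from the requested range instead of being hard-coded, using DuckDB’s date_diff() and to_minutes(); the range bounds below are just examples and the feature id is the one from your test:

WITH params AS (
    -- Example bounds: one year; any range would yield ~300 buckets
    SELECT
        TIMESTAMP '2024-12-14 10:04:11.478' AS range_start,
        TIMESTAMP '2025-12-14 10:04:11.478' AS range_end
)
SELECT
    TIME_BUCKET(
        -- 525,600 minutes / 300 = 1752-minute buckets for a one-year range
        to_minutes(CAST(date_diff('minute', range_start, range_end) / 300 AS INTEGER)),
        created_at
    ) AS created_at,
    AVG(value) AS value,
    COUNT(value) AS count_value
FROM t_device_feature_state, params
WHERE device_feature_id = 'e9079170-4654-4fad-b726-23a36c793905'
AND created_at > range_start
GROUP BY 1
ORDER BY created_at;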

However, the performance isn’t better, and both are actually pretty good performance-wise


Also, in your graph, you have 10 features, so you have to multiply by 10, which explains your performance.

The solution might not be there, but maybe in a single common query for the 10 features


Well, I just ran some tests, and from that I would say the problem is most likely not the DuckDB queries (I believe we had already run this kind of test directly in dev in Gladys ^^). It must simply be the rendering, or rather the construction of the series:

  1. Aggregation by time bucket - TimeBucket Query
SELECT
    device_feature_id,
    TIME_BUCKET(INTERVAL 1752 MINUTES, created_at) AS created_at,
    AVG(value) AS value,
    MAX(value) AS max_value,
    MIN(value) AS min_value,
    SUM(value) AS sum_value,
    COUNT(value) AS count_value
FROM t_device_feature_state
WHERE device_feature_id IN (
	'eaa982b7-a3c8-4b4d-b2d0-5a5cf5245938',
    'e9079170-4654-4fad-b726-23a36c793905',
    'c59fca37-54f3-418b-b27c-857e08e2a0fb',
    'da79df66-dcff-438e-859c-39b641fd15e9',
    '5eba10ac-4c3a-479c-aba7-a1e74a00d914',
    '4b5e90f0-a86b-405b-8ca1-953705ef79a3',
    'c92f8fa3-d841-4692-bd3a-e5addb760a73',
    'b5445f10-7aa9-41bf-8cae-d2f30498652f',
    '08e90478-4bb2-4a08-986f-3d6f212c95f1',
    '85311f8c-334f-4b7e-8294-2b841c0b8fe8'
)
AND created_at > TIMESTAMP '2024-12-14 10:04:11.478'
GROUP BY
    device_feature_id,
    created_at
ORDER BY
    device_feature_id,
    created_at;

(result screenshot)

  2. Partitioning by quantiles - NTILE(300) Query
WITH intervals AS (
    SELECT
        device_feature_id,
        created_at,
        value,
        NTILE(300) OVER (
            PARTITION BY device_feature_id
            ORDER BY created_at
        ) AS interval
    FROM t_device_feature_state
    WHERE device_feature_id IN (
		'eaa982b7-a3c8-4b4d-b2d0-5a5cf5245938',
	    'e9079170-4654-4fad-b726-23a36c793905',
	    'c59fca37-54f3-418b-b27c-857e08e2a0fb',
	    'da79df66-dcff-438e-859c-39b641fd15e9',
	    '5eba10ac-4c3a-479c-aba7-a1e74a00d914',
	    '4b5e90f0-a86b-405b-8ca1-953705ef79a3',
	    'c92f8fa3-d841-4692-bd3a-e5addb760a73',
	    'b5445f10-7aa9-41bf-8cae-d2f30498652f',
	    '08e90478-4bb2-4a08-986f-3d6f212c95f1',
	    '85311f8c-334f-4b7e-8294-2b841c0b8fe8'
    )
    AND created_at > TIMESTAMP '2024-12-14 10:04:11.478'
)
SELECT
    device_feature_id,
    MIN(created_at) AS created_at,
    AVG(value) AS value,
    MAX(value) AS max_value,
    MIN(value) AS min_value,
    SUM(value) AS sum_value,
    COUNT(value) AS count_value
FROM intervals
GROUP BY
    device_feature_id,
    interval
ORDER BY
    device_feature_id,
    created_at;

(result screenshot)
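As a side note (a hedged suggestion, not something from the thread), a quick sanity check can confirm that each feature ends up with at most 300 buckets after the NTILE step, for example:

WITH intervals AS (
    SELECT
        device_feature_id,
        NTILE(300) OVER (
            PARTITION BY device_feature_id
            ORDER BY created_at
        ) AS interval
    FROM t_device_feature_state
    WHERE device_feature_id IN (
        'eaa982b7-a3c8-4b4d-b2d0-5a5cf5245938',
        'e9079170-4654-4fad-b726-23a36c793905'
        -- ... same list of 10 ids as above
    )
    AND created_at > TIMESTAMP '2024-12-14 10:04:11.478'
)
SELECT
    device_feature_id,
    COUNT(DISTINCT interval) AS buckets,  -- should be <= 300 per feature
    COUNT(*) AS states                    -- raw states feeding those buckets
FROM intervals
GROUP BY device_feature_id
ORDER BY device_feature_id;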

So I re-ran a performance test on the dashboard:
(screenshot)

ChatGPT Study ^^ :

:magnifying_glass_tilted_left: Direct reading of your Performance recording

:stopwatch: Total duration

  • Total ≈ 13,190 ms (13.2 s)

:bar_chart: Breakdown (left panel)

  • Rendering (Affichage): 2,330 ms
  • Paint / Layout (Rendu): 791 ms
  • Script (JS): 563 ms
  • Network loading (Chargement réseau): 27 ms
  • System / unassigned (Système / non attribué): ~5,200 ms
  • Total “useful” visible: ~4 s
  • Big gap “unassigned / idle / waiting”: ~9 s

:backhand_index_pointing_right: Immediate conclusion:
:cross_mark: it’s neither the network,
:cross_mark: nor DuckDB,
:cross_mark: nor pure JavaScript,
:backhand_index_pointing_right: it’s mostly waiting time before the useful rendering starts.


:magnifying_glass_tilted_right: Focus on the Network part (key for Gladys)

In your table on the right:

  • Request: aggregated_states (10.5.0.227)
  • Transferred size: 213 KB
  • Execution time: ~260 ms

:backhand_index_pointing_right: That confirms what we suspected:

  • the backend query is fast
  • the payload is reasonable
  • < 300 ms, so not the problem



:brain: So where do the ~9 seconds go?

According to the flame chart and timeline:

:one: “Waiting / unassigned” phase (~9 s)

It’s typically:

  • waiting for a complete React state
  • waiting for all series to be ready
  • synchronization before mounting the chart
  • or logic like:
    • “waiting for all promises”
    • “waiting for X to be computed before rendering”

:backhand_index_pointing_right: Classic anti-pattern on complex dashboards.


:two: “Rendering + Layout” phase (~3 s)

Here, we are clearly seeing:

  • many points
  • many series
  • probably:
    • recalculation of scales
    • recalculation of legends
    • tooltips
    • min/max
    • animations

:backhand_index_pointing_right: This is consistent with an annual multi-curve graph.

:receipt: Conclusion (which you can already state)

You can already say, with evidence:

Chrome Performance measurements show that:

  • the backend query is < 300 ms
  • the payload is ~213 KB
  • pure JS < 600 ms

The ~13 s experienced are mainly due to a waiting phase + front-end chart rendering.
Optimization should therefore be done mostly on the React / chart rendering pipeline, not on DuckDB.

:compass: Next concrete steps (if you want to go further)

Without even seeing the code, the obvious levers are:

  • render the chart progressively (series one by one)
  • limit the number of points before rendering (hard cap)
  • disable animations/transitions on long views
  • virtualize / pure canvas if not already the case

I don’t agree with ChatGPT: we can clearly see in your screenshots that it’s the backend that takes 11 seconds to respond:

Which is not surprising, you showed me that one request took 400-500ms, so 500 ms * 10 = 5 seconds at best :slight_smile:

Exactly, you prove my point: for now, we’re not making one request with 10 IDs, we’re making 10 separate requests!

OK, indeed, I understand.
Here is the result for the 10 separate requests:
(screenshot)

That actually proves the point, since you have to take into account that when you display your dashboard, you don’t have just this widget, but potentially 5 other widgets, so we’re more like 50 requests :slight_smile:

Yes yes, I agree with what you’re saying!!

Mmmmh I don’t agree with that, because the yearly display is sent after the first load (hourly display on page refresh). When I select the yearly graph, I only activate that one. We can also clearly see the other « live » requests being added afterwards and queued.

But I’m OK with saying that currently we’re more in the range of 3 to 5 s of waiting on DuckDB requests.
