Live Coding: Discovering DuckDB on Thursday, June 20 at 10 AM!

Hi everyone!

I’ll be testing this version in the coming days on my production (PROD) at home.

If the tests are successful, I’d like to know whether a deployment during the last week of August is feasible, or whether many of you will still be on vacation.

As this is a major update, I want to be careful and avoid causing issues for users on vacation who won’t be home to monitor the update :slight_smile:

Quick poll:

  • OK starting August 26
  • OK starting August 29
  • OK starting September 2
0 voters

Thanks everyone for your replies!

I just installed the DuckDB test image on my prod :partying_face:

My prod has been running since February with more than thirty Zigbee devices + one camera.

I currently have 996k sensor values in the database:

The migration took 1 minute 20 seconds on my setup:

2024-08-06T21:10:19 DuckDB: Migrating data from SQLite
...
2024-08-06T21:11:47 DuckDB: Finished migrating DuckDB.

(As a reminder, the migration is non-blocking so during that time my instance was running without issue)
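For those wondering what a migration like this boils down to: the states are copied over in batches, so the rest of the app can keep working in between. A minimal sketch in Python (schematic table and column names, not the actual Gladys code):

# Minimal sketch of a batched SQLite -> DuckDB copy (schematic table and column
# names, not the actual Gladys code). Copying batch by batch is what lets the
# rest of the application keep running during the migration.
import sqlite3
import duckdb

BATCH_SIZE = 10_000

src = sqlite3.connect("gladys-production.db")
dst = duckdb.connect("gladys-production.duckdb")

# Schematic target table; assumes created_at comes out of SQLite as ISO-8601
# text, which DuckDB can cast to TIMESTAMP on insert.
dst.execute("""
    CREATE TABLE IF NOT EXISTS t_device_feature_state (
        device_feature_id VARCHAR,
        value DOUBLE,
        created_at TIMESTAMP
    )
""")

cur = src.execute(
    "SELECT device_feature_id, value, created_at FROM t_device_feature_state"
)
while True:
    rows = cur.fetchmany(BATCH_SIZE)
    if not rows:
        break
    # One batch at a time: short transactions, so other work can interleave.
    dst.executemany("INSERT INTO t_device_feature_state VALUES (?, ?, ?)", rows)

src.close()
dst.close()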

Before / after:

SQLite database: 905 MB
DuckDB database: 18 MB

That’s impressive!

What’s even more impressive is the responsiveness of the charts.

I can easily display months of temperature history; it’s very smooth, and the values look much cleaner because two issues are solved by this tech:

  • No more weird rounding outside the 24h view on the charts
  • All the data is "live" now, so the last hour’s data is visible on all charts!
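For the curious, long-range charts like this are typically served by aggregating on the fly in the database rather than sending every raw point; a simplified example of such a query (schematic table and column names, not necessarily the exact query used):

# Simplified example of an on-the-fly aggregation for a long-range chart
# (schematic table and column names, not necessarily the exact query used).
import duckdb

con = duckdb.connect("gladys-production.duckdb", read_only=True)
rows = con.execute("""
    SELECT date_trunc('hour', created_at) AS bucket,
           avg(value)                     AS avg_value
    FROM t_device_feature_state
    WHERE device_feature_id = ?
      AND created_at >= now() - INTERVAL 3 MONTH
    GROUP BY bucket
    ORDER BY bucket
""", ["living-room-temperature"]).fetchall()
con.close()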

Cleaning up old values

I clicked on "purge SQLite states":

The purge took 12 minutes on my setup.

Database cleanup

I ran a database cleanup, which was almost instantaneous for me.

Then I restarted Gladys.
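For those wondering what the cleanup amounts to on the SQLite side: a step like this typically boils down to a VACUUM, which rewrites the file without the pages freed by the purge. A minimal sketch under that assumption (not necessarily the exact implementation):

# Sketch of a "database cleanup" on the SQLite side: checkpoint the WAL, then
# rewrite the file so the space freed by the purge is actually reclaimed.
# This assumes the cleanup boils down to a VACUUM; the real code may differ.
import sqlite3

con = sqlite3.connect("gladys-production.db", isolation_level=None)  # autocommit
con.execute("PRAGMA wal_checkpoint(TRUNCATE);")  # fold the -wal file back into the DB
con.execute("VACUUM;")  # rewrite the file without the dead pages
con.close()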

The result

1008K Aug  6 19:34 gladys-production.db
32K Aug  6 19:34 gladys-production.db-shm
278K Aug  6 19:34 gladys-production.db-wal
3.1M Aug  6 19:11 gladys-production.duckdb
15M Aug  6 19:34 gladys-production.duckdb.wal

Total = 19 MB

With a previous size of 905 MB, that’s -97.9% :star_struck:

I’ll keep monitoring my instance over the next few days and see how it goes :slight_smile:

8 Likes

That’s so great, well done for all that work!!

3 Likes

Uh…
Is that possible?! :innocent:

Migration start 9:53
End 11:47

Edit: However, I’m not sure this is very effective:


I don’t know how that’s possible. I think I refreshed the page and the request was resent…
Otherwise, it’s moving. Slowly, but it’s moving (4% after an hour)!

Unbelievable! :smiley:

It’s

I was talking about having two processes running in parallel…
After 10:00, I’m at 20% :zany_face:
12:00 – 57%.
I’ll do a full

1 Like

When I see the difference in Gladys Plus backups before/after :sweat_smile:

(the backups are compressed)

1 Like

This way there will be plenty of space on your servers for new Gladys Plus users’ backups 👌

2 Likes

Here’s a more detailed look at how the migration went.
Overall, no problem.
First of all, I was surprised that the migration started as soon as the container was started. In itself, that doesn’t cause a problem.

However, the migration immediately started at 23%.
In the tasks, you can clearly see the progress.


It’s slow, but it’s moving forward.
Meanwhile, Gladys is slowed but not unusable.

1GB of RAM used.


The result is crazy: a 65 MB DB instead of 11.4 GB!


However, the next step, the purge… was painful!
Several very long hours during which Gladys was, on my setup, close to paralysis…
This morning, I see that

And it doesn’t move anymore.
Then I receive a restart notification for Gladys (maybe watchtower?). And in the tasks:

(There’s a typo: ‹ a échoué › would be more correct)
However:

So I start a database cleanup which, as expected, blocks Gladys for about an hour.
But after checking, I find that the old DB is not empty:

So I restarted the purge and DB cleanup steps and there you go (oh yes, and a manual container restart to clear the db-wal).

And since we were talking about backups, a screenshot of mine:

Obviously, it’s very small, maybe even small enough that compressing it isn’t necessary anymore (is that still needed?)
Total time… Nearly 30h…
We agree that the purge and cleanup steps are not essential but they allowed me to solve my storage problem :innocent:

1 Like

Yes, the migration is mandatory: the old SQLite code is no longer present, and as long as the migration hasn’t finished, Gladys doesn’t operate fully (the graphs stay empty in the meantime).

The migration is not linear; the percentage represents the percentage of features migrated. Some features have many states, others fewer.

Trying to make an even more precise progress bar would be counterproductive — it would take time to calculate how many states each feature has, and that would slow the migration.
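Roughly, the percentage behaves like this (a simplified, self-contained sketch, not the actual code):

# Simplified sketch of why the migration percentage isn't linear: progress is
# counted per feature, not per state, and features can have wildly different
# numbers of states. Not the actual Gladys code.
import time

# Hypothetical features with very different state counts.
features = {"living-room-temperature": 500_000, "door-sensor": 1_200, "camera": 300}

def migrate_states(name: str, state_count: int) -> None:
    # Stand-in for the real per-feature migration; duration grows with the state count.
    time.sleep(state_count / 1_000_000)

for i, (name, count) in enumerate(features.items(), start=1):
    migrate_states(name, count)
    print(f"progress: {round(100 * i / len(features))}% (after {name}, {count} states)")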

Yes, starting the task twice must have doubled the load :sweat_smile: I don’t know how you managed that ^^ Same for the percentage that got stuck: the two tasks had each done part of the work, so partway through the job was already finished and neither could progress any further. In itself, it’s not very serious ^^ You did well to restart the task after the reboot; it finished the job.

Yes, I deployed 2 updates to the DuckDB image, Wednesday evening and Thursday evening, to fix various issues I had noticed on my prod. Watchtower must have run afterwards.

It’s an old translation, not new to this release, but I’ll correct it ^^

Now, the backup file exported by DuckDB is in .parquet format, and this format is heavier than DuckDB’s storage format, which is very compressed. So there will be less of a difference between the Gladys Plus backup and the local files; the file on Gladys Plus will even be a bit heavier than the local one. But when you look at the file sizes… :smiley:
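For reference, exporting a DuckDB table to Parquet is a one-liner; something like this (schematic table name):

# Schematic example of exporting a DuckDB table to a Parquet file for backup.
# The table name is an assumption; Parquet is less compressed than DuckDB's own
# storage format, which is why the exported backup ends up a bit bigger.
import duckdb

con = duckdb.connect("gladys-production.duckdb")
con.execute("COPY t_device_feature_state TO 'backup.parquet' (FORMAT PARQUET)")
con.close()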

Thanks for your very detailed feedback!

Despite the little hiccup of the purge task being launched twice, the migration is clearly a success for you.

With 13 million states, 11 GB of SQLite DB, and all that on a setup that’s not among the most powerful (Raspberry Pi 4 + SSD over USB I think), it’s really neat!

If you ever want to keep this setup as PROD (so you don’t have to do the migration a second time), you can. You will just need to switch back to the image tagged v4 as soon as I publish that to PROD (end of the month).

A big thank you for testing :pray:

3 Likes

Me neither, but when I restarted the task after Gladys rebooted, it got duplicated again, and in both cases the two launches were about fifteen minutes apart. The second time, I’m sure I didn’t click again or refresh the page…

OK, I’ll add code to prevent duplicate launches.
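The idea would be something like a simple in-flight guard, sketched below in Python (the real implementation will differ):

# Sketch of a duplicate-launch guard: a task key can only run once at a time,
# and a second request is refused while the first one is in flight.
# Not the actual Gladys code.
import threading
import time

_running: set[str] = set()
_lock = threading.Lock()

def start_task(key: str, job) -> bool:
    # Start `job` in the background unless a task with the same key is already running.
    with _lock:
        if key in _running:
            return False  # duplicate launch refused
        _running.add(key)

    def _run():
        try:
            job()
        finally:
            with _lock:
                _running.discard(key)

    threading.Thread(target=_run, daemon=True).start()
    return True

# Only the first call actually starts the task; the second is refused while it runs.
print(start_task("purge-sqlite-states", lambda: time.sleep(1)))  # True
print(start_task("purge-sqlite-states", lambda: time.sleep(1)))  # False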

1 Like

Hello !!

Back from vacation, I’m jumping in as well:


Image launched at 10:35 - quite a significant slowdown, but Gladys is still available (same as @GBoulvin)

Progress preview unavailable on the system page during the first 5 minutes:


Then:

EDIT1: 9% after 10 minutes, not bad ^^

2 Likes

So cool! Can’t wait to see how it looks at your place :grin:

1 Like

After 1 hour it seems to be struggling: page refreshes are getting slower and slower (about 1 minute each), and the % isn’t moving forward (stuck at 31% for 15 minutes ^^), but the state count keeps increasing ^^ So no worries ^^


On the resource side, it doesn’t change much:

EDIT: After 2h10:


1 Like

Migration completed in 3h30 (approximately ^^):


Result:

As already said above and agreed by everyone: Impressive ^^

Well… now for the part I’ve been dreading for a while: the purge! ^^

For me, Gladys has been unavailable for 5 minutes (start of the purge)

EDIT: 400k states purged in 15 minutes… we’ll let it keep working ^^ because, extrapolating linearly: ~26,700 states per minute => ~3,000 minutes => ~50h ^^

Gladys is available again.
1% after 20 minutes

1 Like

For 80 million states that’s clean! :smiley:

Can’t wait to see the final result once the DB is fully purged!

2 Likes

We shouldn’t be far from that for

I’m not surprised; the purge is being done very gradually so the instance can continue to operate (continue to receive sensor values, run scenes)

Everything that touches the SQLite DB remains slowed down, of course, but it’s not entirely blocked either!
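Roughly, the idea is to delete in small batches with a pause in between, so other queries can slip through; a simplified sketch (schematic table name, not the actual code):

# Sketch of a gradual purge: delete old SQLite states in small batches, with a
# pause between batches so other queries (new sensor values, scenes) can still
# get through. Schematic table name, not the actual Gladys code.
import sqlite3
import time

BATCH_SIZE = 1_000

con = sqlite3.connect("gladys-production.db", isolation_level=None)  # autocommit
while True:
    cur = con.execute(
        "DELETE FROM t_device_feature_state "
        "WHERE rowid IN (SELECT rowid FROM t_device_feature_state LIMIT ?)",
        (BATCH_SIZE,),
    )
    if cur.rowcount == 0:
        break
    time.sleep(0.1)  # leave room for other queries between batches
con.close()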

Keep us posted

2 Likes

Small question @pierre-gilles,
Is it normal to no longer have a graph/history after migration and during the purge? Because for my part I currently have no data except the last value.

So is the purge mandatory before using the DuckDB database, or was DuckDB supposed to take over right after the migration?

Since the message no longer appears at the top of the page once the migration is finished and there is no indication to restart, I have doubts ^^

EDIT: Well, I have my answer :sweat_smile: once the SQLite states dropped to 0, everything came back ^^ So the purge is mandatory before using DuckDB. However, the purge in the tasks is still in progress (69%):


I shouldn’t touch anything for now (no DB cleanup)?

Result of deleting the SQLite states: 23h for 80 million states (2024-08-15 12:55:28.297 +00:00 => 2024-08-16 11:56:34.779 +00:00) … Waiting for the end of the final purge.