For your information, I'm working on this topic today. I made good progress this morning and opened a PR.
My approach:
From now on, when you save a device where several features have the "Yes, keep states" box unchecked, Gladys launches background jobs that clean up the past states, in a somewhat smarter way than before.
Each job counts in the DB how many states and aggregated states exist for each feature, then deletes them in small batches of 1,000 to avoid overloading the DB. Between batches, the job waits 100 ms to give Gladys some breathing room.
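To make the idea concrete, here is a minimal Python sketch of a count-then-delete-in-batches loop. It is not the code from the PR: the `device_state` table, its columns, and the `purge_states_in_batches` helper are hypothetical names chosen for illustration, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3
import time


def purge_states_in_batches(conn, feature_id, batch_size=1000, pause_seconds=0.1):
    """Delete all states of one feature in small batches; return the batch count."""
    # Count first, so the number of batches is known before deleting anything.
    (total,) = conn.execute(
        "SELECT COUNT(*) FROM device_state WHERE feature_id = ?", (feature_id,)
    ).fetchone()
    batches = -(-total // batch_size)  # ceiling division

    for _ in range(batches):
        # Delete at most batch_size rows per statement (hypothetical schema).
        conn.execute(
            "DELETE FROM device_state WHERE id IN ("
            "SELECT id FROM device_state WHERE feature_id = ? LIMIT ?)",
            (feature_id, batch_size),
        )
        conn.commit()
        time.sleep(pause_seconds)  # pause so other queries can run in between
    return batches


# Small demo: 2,500 states to purge, 10 states of another feature to keep.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE device_state (id INTEGER PRIMARY KEY, feature_id TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO device_state (feature_id, value) VALUES (?, ?)",
    [("temperature", float(i)) for i in range(2500)]
    + [("humidity", float(i)) for i in range(10)],
)
conn.commit()

batches_run = purge_states_in_batches(conn, "temperature", pause_seconds=0)
remaining = conn.execute("SELECT COUNT(*) FROM device_state").fetchone()[0]
```

Committing after each batch keeps every transaction short, which is what lets other reads and writes slip in during the pauses.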
To clean 5 million states, you therefore need 5,000,000 / 1,000 = 5,000 batches.
5,000 * 0.1 s = 500 seconds, or about 8.3 minutes of waiting between batches at minimum. If we assume each batch delete itself takes about 100 ms, that adds another 500 seconds, for roughly 16 to 17 minutes in total to clean 5 million states in the background, in a way that is gentle and non-blocking for Gladys.
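The estimate above can be reproduced with a short calculation. This is only a back-of-the-envelope sketch: the `estimate_cleanup_seconds` helper is a hypothetical name, and the 100 ms per delete is the assumption from this post, not a measured value.

```python
def estimate_cleanup_seconds(total_states, batch_size=1000,
                             pause_seconds=0.1, delete_seconds=0.1):
    """Rough duration of a batched cleanup, in seconds."""
    batches = -(-total_states // batch_size)  # ceiling division
    # Each batch costs one delete (assumed duration) plus one pause.
    return batches * (pause_seconds + delete_seconds)


# Pauses only: the minimum waiting time for 5 million states.
pause_only = estimate_cleanup_seconds(5_000_000, delete_seconds=0.0)
# Pauses plus an assumed 100 ms per delete.
total = estimate_cleanup_seconds(5_000_000)
total_minutes = total / 60
```

With these inputs, the pauses alone come to about 500 seconds and the total to about 1,000 seconds. Real timings will depend on how long each delete actually takes on a given machine.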
On the Gladys side, any background job can be monitored live in the "Background tasks" tab: