Unable to reconnect to Gladys Plus after internet outage

Hi @pierre-gilles,

I had issues with my Free connection today.
Gladys remained accessible locally, so no problem there. However, each time I tried to reconnect to Gladys Plus (on my phone or PC), I got this message:

But in the dropdown I don’t see any available users.
To resolve the issue I have to:

  • connect locally
  • disconnect from Gladys Plus
  • reconnect to Gladys Plus
  • as a result I get a new key
  • I reauthorize my user
  • I access Gladys Plus again and this time my user appears in the list

This worries me a bit: if I’m on vacation or otherwise away from home, I won’t be able to reconnect.

Hi @Nagromdark :slight_smile:

Interesting! Normally Gladys Plus reconnects on its own, but you may have hit a particular case that prevented reconnection.

By any chance, would you still have access to the Gladys logs from when it happened? (docker logs gladys) That would help me reproduce it!

My bad, I created a duplicate:

https://community

Another disconnection on my end without a power outage :frowning:

It may have nothing to do with it, but when I encountered a similar problem, it was because I had left (by mistake) a Gladys test image running on my NAS.
So I had 2 instances of Gladys trying to connect at the same time.

Maybe you’re in that situation?

I did indeed make changes related to Gladys Plus in 4.56, so it’s possible there’s an issue.

I’m investigating!

On the server side, I notice that some clients reconnect to Gladys very frequently, which creates server load problems, so there seems to be a real issue :sweat_smile: (or someone is DoS-ing the server, but I doubt it, since it coincides with the release of 4.56)

CPU load of the Gladys Plus load balancer:

And in the server logs, I see endless lines of websocket reconnections arriving at an aggressive rate!

@Nagromdark @spenceur Do you happen to have any Gladys logs you could give me? :slight_smile:

OK, there’s a real problem with reconnection: I restarted the Gladys Plus infrastructure, and it disconnected my personal instance, which can no longer reconnect! I think I’ll just roll back the Gladys Plus changes I made while I look for the source of the bug!

PR Revert:

Gladys Assistant 4.56.1 is being built:

In the meantime, I think I’ve identified the bug.

In Gladys Assistant 4.56, I introduced new authentication logic for the WebSockets that allows a faster connection, which is ideal for instant access to the dashboard on mobile.

The problem? If the instance loses the connection, it tries to reconnect with the same access_token used during the first connection. Except that this access_token has expired in the meantime and is not renewed. I’m using a new logic present in the socket.io library and I didn’t understand its behavior on disconnection.

Result: the Gladys Plus backend rejects the connection (expired JWT), and the instance enters an infinite reconnection loop.
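To make the failure mode concrete, here is a minimal Python simulation. It is not the actual Gladys or socket.io code: `FakeBackend`, `TOKEN_TTL` and the clock values are all hypothetical. The sketch only shows that a client which keeps reusing the access_token from its first connection is rejected on every attempt once that token has expired.

```python
from dataclasses import dataclass

TOKEN_TTL = 60  # hypothetical token lifetime, in simulated seconds


@dataclass
class Token:
    issued_at: int

    def is_expired(self, now: int) -> bool:
        return now - self.issued_at >= TOKEN_TTL


class FakeBackend:
    """Stand-in for a server that rejects expired tokens (hypothetical)."""

    def issue_token(self, now: int) -> Token:
        return Token(issued_at=now)

    def accept_connection(self, token: Token, now: int) -> bool:
        return not token.is_expired(now)


def reconnect_with_cached_token(backend, token, start, attempts):
    """Retry with the same token every time, as in the behaviour described above."""
    results = []
    for i in range(attempts):
        now = start + i  # one attempt per simulated second, never refreshing the token
        results.append(backend.accept_connection(token, now))
    return results


backend = FakeBackend()
token = backend.issue_token(now=0)

# The connection drops at t = 120 s, after the token has already expired.
outcomes = reconnect_with_cached_token(backend, token, start=120, attempts=5)
print(outcomes)  # every attempt is rejected
```

Because nothing in the retry loop ever asks for a new token, no number of attempts can succeed, which is what turns a brief connection loss into an endless loop.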

It’s a good lesson, and it points to a few avenues for improvement:

  1. Renew the access_token in case of connection loss to resume with a valid token.

  2. Add a delay before reconnecting, to avoid overloading the server in case of an infinite loop.

  3. Strengthen unit tests to better cover connection loss scenarios and prevent this bug from recurring.
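The first two points in the list above can be sketched together. This is an illustrative Python sketch under stated assumptions, not the fix shipped in Gladys: `connect` and `refresh_token` are hypothetical callables supplied by the caller, and the backoff parameters (`base`, `cap`, `jitter`) are arbitrary example values.

```python
import random


def backoff_delay(attempt, base=1.0, cap=60.0, jitter=0.0):
    """Exponential backoff: base * 2**attempt, capped, plus optional random jitter."""
    delay = min(cap, base * (2 ** attempt))
    return delay + random.uniform(0, jitter)


def reconnect(connect, refresh_token, max_attempts=5, sleep=lambda seconds: None):
    """Fetch a fresh token before every attempt and wait between failures.

    `connect(token)` returns True on success; `refresh_token()` returns a new
    token. Both are hypothetical and stand in for the real client calls.
    `sleep` defaults to a no-op so the sketch runs instantly; pass `time.sleep`
    to actually wait.
    """
    waited = []
    for attempt in range(max_attempts):
        token = refresh_token()  # never reuse a possibly expired token
        if connect(token):
            return True, waited
        delay = backoff_delay(attempt)
        waited.append(delay)
        sleep(delay)
    return False, waited


# Usage: the network comes back on the third attempt.
attempts_seen = []


def fake_connect(token):
    attempts_seen.append(token)
    return len(attempts_seen) >= 3


counter = iter(range(100))
ok, waited = reconnect(fake_connect, refresh_token=lambda: f"token-{next(counter)}")
print(ok, waited, attempts_seen)
```

Each attempt uses a different token, and the growing delay keeps a fleet of disconnected clients from hammering the server at the same time. In a real deployment a non-zero `jitter` would likely help spread reconnections further.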

Sorry for the inconvenience!

I’ll keep you informed as soon as version 4.56.1 is available :slight_smile:

Gladys Assistant 4.56.1 is available and does indeed fix the bug :white_check_mark:

To update quickly, I recommend the following command:

(With sudo in front if needed in your setup)

```bash
docker run --rm \
    -v /var/run/docker.sock:/var/run/docker.sock \
    containrrr/watchtower \
    --run-once
```

Thanks for the quick fix!

I was away from home and had the same issue on my end… I thought my home automation server running Gladys had crashed. I just got back and found that everything was working fine locally.

Well, update in progress :wink:

Thanks @pierre-gilles for your responsiveness!

Can you also add alerting for the load on the load balancer?

I already have quite a bit of monitoring; that’s what tipped me off that there was a widespread issue :smiley:

As more instances were upgraded to Gladys Assistant 4.56, it became more and more frequent for one of them to temporarily lose its connection and then enter an aggressive infinite reconnection loop.

I was receiving more and more emails, and I realized there was a problem!

Hello @pierre-gilles,
Sorry for not replying, I wasn’t available this weekend. I got disconnected from Gladys Plus several times over the weekend.
I just updated to 4.56.1 and the reconnection to Gladys Plus happened automatically.
Thanks a lot for the update :wink:

Hello,
I also had two occurrences of Gladys Plus not reconnecting: on Saturday, following an update to my router, and on Sunday for no apparent reason.
Thanks @pierre-gilles for addressing it quickly and over a weekend. At this time the current version is still 4.56.0.
For reasons other than the resolved issue, what are the options to access Gladys Plus again when you’re not on site to restart the mini PC?

Thanks @Jluc :slight_smile:

[quote="Jluc, post:18, topic:9458"]
For reasons other than the resolved issue, what are the solutions to access Gladys Plus again when you’re
[/quote]

Free also offers the option to configure a VPN. Could this information be expanded in the training?