Add Bluetooth to Gladys 4

I will raise a ticket on the library’s GitHub, but it’s the weekend and family time, so I won’t have much availability.

Oh man, I just saw your message!

Did it crash immediately without any action on your part? What hardware are you using exactly?

I’m also running Gladys on an Ubuntu server, and I haven’t had the issue. There is an error on startup, but it doesn’t crash Gladys.

We need to find a way to harden the core code so that this kind of issue doesn’t take down the entire Gladys instance. I already have a to-do item pending before the RC: add a listener on the unhandledRejection event to catch promises that reject without a handler. Here, though, I have the impression that it’s a system-level crash that can’t be caught in Node. Am I wrong, @AlexTrovato?
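
For reference, the listener mentioned above could look roughly like this. It is only a sketch: `installRejectionGuard` and the `logger` argument are made-up names, not existing Gladys code. It also cannot help if the process is brought down outside JavaScript, which is the suspicion raised here.

```javascript
// Minimal sketch: log unhandled promise rejections instead of letting them
// go unnoticed or terminate the process.
// `installRejectionGuard` and `logger` are hypothetical names.
function installRejectionGuard(logger = console) {
  const handler = (reason) => {
    // `reason` is whatever the promise was rejected with (often an Error).
    const message = reason instanceof Error ? reason.stack : String(reason);
    logger.error(`Unhandled promise rejection: ${message}`);
  };
  process.on('unhandledRejection', handler);
  // Returned so the caller can remove the listener again if needed.
  return handler;
}
```

Calling `installRejectionGuard()` once at startup would be enough; the handler only covers rejected promises, not synchronous exceptions or native crashes.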

@pierre-gilles indeed I feel like we’re in a case that can’t be caught. I’ve opened a ticket on noble’s GitHub to see if anyone knows how to handle this.

Otherwise, I was thinking of implementing a mechanism in Gladys: a parameter that records whether a service started successfully. If it did not, Gladys would “ignore” that service on the next restart, even if that means creating a page to “force” the service to start again. And if it crashes again, the same process repeats (the service parameter is set to indicate that loading failed…).

Great idea! This would save us a lot of trouble, because right now, if a novice user encounters the same bug as @VonOx, they wouldn’t be able to fix it without getting their hands dirty (Watchtower only updates running containers, not those that restart continuously. In other words, if a version of Gladys crashes, everyone’s installation is bricked…).

Another option I was thinking of (in addition to your idea) is to run the entire service part in a Node.js worker. We would then have a main worker that handles the library/API part, and a secondary worker for the services. However, this is a major undertaking, so it’s more of a “future” idea.

I’m not against a separate worker, but at that point we might even need one worker per service.
Either way it’s real work. I think I’ll look into the first option ASAP, though I can’t give you a timeline.
If you pick up the subject, let me know so we don’t duplicate work, even though I think you have plenty of other tasks.

None. Watchtower performed the update and Gladys crashed.

The host is nothing exotic, it’s an Ubuntu 20.04 Xeon x64 server, no Bluetooth. I only have Docker on it + Plex.

I noticed it when I wanted to control my Hue, because Gladys only showed me an empty dashboard (by the way, there is no indication when the instance is not up).

I haven’t tried recreating the container yet, I’ll try that tomorrow.

I created this issue

https://github.com/GladysAssistant/Gladys/issues/900

I tried a fresh container (new db), same crash

@AlexTrovato My idea for the worker wasn’t to directly address this problem, as in that case, as you say, you would need a worker per service, which isn’t realistic (RAM usage would constantly increase depending on the number of services used).

The idea behind the worker was simply to separate the core from the services so that if the service part receives too many events, blocks the event loop, or crashes, at least the user still has access to the UI and can manage their Gladys instance.
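
To illustrate the isolation idea, here is a small sketch. It uses a separate child process, which is my assumption and not necessarily what “worker” means above, and `runIsolated` is a made-up helper. The point is only that the parent survives and can observe how the child ended.

```javascript
const { spawnSync } = require('child_process');

// Run a snippet in a separate Node.js process and report how it ended.
// `runIsolated` is a hypothetical helper for illustration only.
function runIsolated(script) {
  const result = spawnSync(process.execPath, ['-e', script]);
  return {
    crashed: result.status !== 0,
    status: result.status, // exit code, or null if killed by a signal
    signal: result.signal, // signal name if killed by one, otherwise null
  };
}
```

A real implementation would more likely keep a long-lived child (for example via `child_process.fork`) and exchange messages with it; `spawnSync` is used here only to keep the example short.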

That said, it’s a longer-term project and it doesn’t solve the problem here. I agree that your idea 1 is good.

Ok, how do you see this in terms of specs? Because at first I was thinking of doing something like this:

  1. Have a boolean attribute in the “t_service” table in the DB, “has_crashed_last_boot”
  2. On each service startup, set the boolean to true in the DB (so if Gladys crashes during service startup, then on the next restart Gladys will see the service with “has_crashed_last_boot” = true)
  3. Once the service has started successfully, update t_service with has_crashed_last_boot = false.
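
The three steps above can be sketched as follows. This is only an illustration: a Map stands in for the t_service table, the start function is synchronous for brevity (a real service start would be asynchronous), and `startServiceSafely` is a made-up name.

```javascript
// Sketch only: `table` is a Map standing in for the t_service table, and
// `startServiceSafely` is a hypothetical helper, not Gladys code.
function startServiceSafely(name, start, table) {
  const row = table.get(name) || { has_crashed_last_boot: false };
  if (row.has_crashed_last_boot) {
    // The previous boot never reached step 3 for this service: skip it.
    return 'skipped';
  }
  // Step 2: assume the worst before starting the service.
  table.set(name, { has_crashed_last_boot: true });
  start(); // if the process dies here, the flag stays true
  // Step 3: start() returned, so clear the flag.
  table.set(name, { has_crashed_last_boot: false });
  return 'started';
}
```

Once a service is flagged, it stays skipped until something resets the flag.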

The problem is that the crash doesn’t necessarily occur at service startup! For example, in the case of @VonOx, the crash occurred after:

2020-10-16T22:22:24+0200 <info> index.js:13 (Object.start) Starting usb service
2020-10-16T22:22:24+0200 <info> index.js:16 (Object.start) Starting zwave service
2020-10-16T22:22:24+0200 <info> index.js:15 (Object.start) Starting Bluetooth service
2020-10-16T22:22:24+0200 <info> index.js:19 (Object.start) Starting telegram service
2020-10-16T22:22:24+0200 <info> index.js:20 (Object.start) Starting Open Weather service

/src/server/services/bluetooth/node_modules/@abandonware/noble/lib/hci-socket/hci.js:100
    this._deviceId = this._socket.bindRaw(deviceId);

So this idea doesn’t work.. :confused: Do you have any other ideas?

Oh actually you’ve already made a PR! I just saw, sorry ^^

I’ll check out your PR!

Ok, I’ve done some tests, and in the end this is a good first step @AlexTrovato! It’s true that it won’t catch everything, but at least for hardware “incompatibility” issues, it does the job.

I just added a comment about a migration you removed; I’m not sure we want that in this PR, but otherwise, I’m okay with merging.

However, we’ll need to find a way to reactivate the integrations from the UI, because otherwise they stay “bricked” for life.

For info, I tried to launch Gladys outside Docker, same error

Edit: Problem solved with a plugged-in dongle

On the host

sudo rfkill unblock all
sudo hciconfig hci0 up

With this, Gladys starts (there’s something wrong but at least it runs)

<info>index.js:20 (Object.start) Starting Open Weather service
<info>index.js:16 (Object.start) Starting zwave service
<info>index.js:15 (Object.start) Starting Bluetooth service
<info>index.js:13 (Object.start) Starting usb service
<info>index.js:19 (Object.start) Starting telegram service
<error>bluetooth.connectDevices.js:37 () Could not start scanning, state is unknown (not poweredOn)
<error>bluetooth.connectDevices.js:37 () Could not start scanning, state is unknown (not poweredOn)
<error>bluetooth.connectDevices.js:37 () Could not start scanning, state is unknown (not poweredOn)
<info>connect.js:38 (MqttClient.<anonymous>) Connected to MQTT server mqtt://127.0.0.1:1883
<info>subscribe.js:12 (MqttHandler.subscribe) Subscribing to MQTT topic stat/+/+
<info>subscribe.js:12 (MqttHandler.subscribe) Subscribing to MQTT topic tele/+/+
<info>subscribe.js:12 (MqttHandler.subscribe) Subscribing to MQTT topic gladys/master/#
<info>index.js:63 (Server.<anonymous>) Server listening on port 80

Then, during a scan...

<error>bluetooth.connectDevices.js:37 () Bluetooth: peripheral node_id not found
<error>bluetooth.connectDevices.js:37 () Bluetooth: peripheral 001788286ece not found
<error>bluetooth.connectDevices.js:37 () Bluetooth: peripheral bridge not found
<error>bluetooth.connectDevices.js:37 () Bluetooth: Peripheral undefined not connectable
<error>bluetooth.connectDevices.js:37 () Bluetooth: peripheral 001788286ece not found

In the host’s CLI, it’s OK

The issue is resolved for you, but this crash should not have happened. For me, it remains a critical issue.

@AlexTrovato’s solution is good for preventing unexpected crashes on some new services, but this bug is still there; plugging in a dongle is not a solution.

@AlexTrovato Do you know what could resolve this bug? Is it 100% on Noble’s side, or is it something we could fix on ours? The fix might simply be to not launch the Bluetooth service in some cases, as long as that doesn’t require restarting Gladys: a reboot can take several minutes on a Pi, so that’s not a solution.

Just in terms of the roadmap, I’ve finished all the remaining tasks before the RC, and for me, I only have documentation/site/marketing todos left for the RC launch.

So this whole week, I’ll be working on the site/the documentation/contacting bloggers, and then as soon as everything is ready, I’ll launch Gladys v4.

Stability is quite important since we’re going to have an influx of new users in the coming weeks :smiley:

I never said otherwise :confused: I should have written “I resolved my issue”. The GitHub issues remain open.

As for the issue, I created a ticket on noble. I think that by playing with the noble config we might be able to avoid the problem, but the code is quite complex and very specific to the host architecture.
I am waiting for answers on the noble GitHub.

I understand!

And do you know if there is a way, outside of Noble, to detect the presence or absence of Bluetooth on the host, and to disable the service if Bluetooth is not present?
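
One possibility on Linux, as a rough sketch: the kernel exposes Bluetooth controllers as hciN entries under /sys/class/bluetooth, so the service could check that directory before loading Noble. `hasBluetoothAdapter` is a made-up helper, and I’m assuming /sys is readable from inside the container. An adapter that is present but blocked or down would still be listed.

```javascript
const fs = require('fs');

// Sketch: on Linux, the kernel lists Bluetooth controllers as hciN entries
// under /sys/class/bluetooth. `hasBluetoothAdapter` is a hypothetical helper.
function hasBluetoothAdapter(sysDir = '/sys/class/bluetooth') {
  try {
    // Match controller entries such as hci0 or hci1 only.
    return fs.readdirSync(sysDir).some((entry) => /^hci\d+$/.test(entry));
  } catch (e) {
    // Directory missing or unreadable: treat it as "no adapter available".
    return false;
  }
}
```

The Bluetooth service could then refuse to start when this returns false, without touching the rest of Gladys.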

@VonOx I found the “holy grail” PR for us on the Watchtower repo, which will solve even the most serious crashes!

Oh yes, and it’s very recent too. I’m going to give it a thumbs up so that it gets merged quickly ^^

I conducted 5 Bluetooth device searches:

  • The JBL headset appears once per search, each time with a different address:
    bluetooth:5fcf91887a7e bluetooth:7592c20fbffa bluetooth:72274a02287e bluetooth:4534f2918e7b bluetooth:70d66dcd52a0
  • My Automower robot mower appears with its name.
  • 5 other unnamed devices.

My phone is not detected: its Bluetooth address is not in the list and, conversely, it does not detect Gladys.
Neither is my PC.
Nor is my Nut Mini key fob.

Great job!

On my side, a Bluetooth search finds these devices by name:
[TV] Samsung Q80 Series (75)
Nut

It also detected 4 additional devices, but I don’t know what they correspond to (probably a phone, a connected speaker…)

Which use cases are already planned for the service, and which are still to be planned?
For example, I don’t think I can test for the presence of the Nut key fob.