Z-Wave and Google Home issue

Hello!
For some time now I have been running into a problem, particularly with Z-Wave devices.

When I ask Google by voice to turn something off, it answers "OK, turning it off" but nothing happens. However, if I do it through the app (Google or Gladys), it works.

Sometimes my light turns off, but it immediately turns back on and then off again, anyway ^^'.

While checking, I found this in my logs, but I have never looked right after an ‹ incident ›.

2022-02-24T19:18:17+0100 <warn> gateway.forwardDeviceStateToGoogleHome.js:41 (sendCurrentState) Gladys Gateway: Unable to forward google home reportState
2022-02-24T19:18:17+0100 <warn> gateway.forwardDeviceStateToGoogleHome.js:42 (sendCurrentState) Error: getaddrinfo EAI_AGAIN api.gladysgateway.com
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
  errno: -3001,
  code: 'EAI_AGAIN',
  syscall: 'getaddrinfo',
  hostname: 'api.gladysgateway.com',
  config: {
    url: 'https://api.gladysgateway.com/google/report_state',
    method: 'post',
    data: '{"devices":{"states":{"mqtt-zwave-chambre-enfant-plafonnier":{"online":true,"on":false}}}',
    headers: {
      Accept: 'application/json, text/plain, */*',
      'Content-Type': 'application/json;charset=utf-8',
      authorization: 'iYi05OTFkLTM3OTczMmZkYjdlZSIsImlhdCI6MTY0NTcyNTk5MCwiZXhwIjoxNjQ1NzI5NTkwLCJhdWQiOiJp.hddKDhIiF4znPRHKBU8-E84bhHyGBKy142ScP_vJjm0',
      'User-Agent': 'axios/0.21.1',
      'Content-Length': 90
    },

On the DNS side, though, no request is being blocked :thinking:

This kind of error is fairly clear: it’s a DNS issue.

Interestingly, there are dozens of DNS resolution requests in a row that ignore the TTL I set: there are sometimes 2 requests within the same second!

I conclude that Gladys asks for the Gateway server’s address again with each request. Not great!

Looking into it, this appears to be a known issue:

While working on a big node eCOM backend that had a lot of traffic, from time to time we found getaddrinfo EAI_AGAIN error in our logs, quick googling explains that this means that our DNS server can’t currently serve our request.

(See this article: https://medium.com/@amirilovic/how-to-fix-node-dns-issues-5d4ec2e12e95)
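Since EAI_AGAIN signals a temporary name-resolution failure, one common mitigation is to retry the call after a short delay. The following is only a minimal sketch, not how Gladys handles it: `withDnsRetry`, its options, and the delay values are hypothetical names and numbers chosen for illustration.

```javascript
// Hypothetical helper: retries a task only when it fails with EAI_AGAIN.
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withDnsRetry(task, options = {}) {
  const { retries = 3, baseDelayMs = 200, sleep = defaultSleep } = options;
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await task();
    } catch (err) {
      const isTransientDnsError = Boolean(err) && err.code === 'EAI_AGAIN';
      // Any other error, or too many attempts: give up and rethrow.
      if (!isTransientDnsError || attempt >= retries) {
        throw err;
      }
      // Exponential backoff: 200 ms, 400 ms, 800 ms with the defaults.
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

Such a wrapper could surround the HTTP call that reports the state, for example `withDnsRetry(() => axios.post(url, payload))`, since the error in the logs above carries `code: 'EAI_AGAIN'`. It only hides short DNS hiccups and does not reduce the number of DNS queries.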

And when investigating further, I came across this:

EDIT: This behavior is by design, see: Z-Wave and Google Home issue - #6 by pierre-gilles

In short, the Docker Node.js Alpine image that we use as the base for Gladys (see the Dockerfile) apparently does not include a package that handles DNS caching, so each request performs its own DNS resolution: a bit heavy ^^

I’ll keep investigating to find the right package to add on the Docker side.
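To illustrate what DNS caching would change, here is a minimal sketch of an in-process cache that honors the record's TTL. It is not the fix applied in Gladys: `createDnsCache` and `lookupCached` are hypothetical names, the resolver and clock are injectable only to keep the example testable, and concurrent lookups for the same host are not deduplicated.

```javascript
const dns = require('dns');

// Default resolver: with { ttl: true }, dns.promises.resolve4 returns
// an array of { address, ttl } records, where ttl is in seconds.
const defaultResolver = (hostname) => dns.promises.resolve4(hostname, { ttl: true });

// Hypothetical helper: returns a lookup function backed by a TTL-aware cache.
function createDnsCache(resolver = defaultResolver, now = Date.now) {
  const cache = new Map(); // hostname -> { address, expiresAt }

  return async function lookupCached(hostname) {
    const hit = cache.get(hostname);
    if (hit && hit.expiresAt > now()) {
      return hit.address; // still fresh: no DNS query is sent
    }
    const records = await resolver(hostname);
    if (records.length === 0) {
      throw new Error(`No A record found for ${hostname}`);
    }
    // Keep the first record until its TTL has elapsed.
    const { address, ttl } = records[0];
    cache.set(hostname, { address, expiresAt: now() + ttl * 1000 });
    return address;
  };
}
```

With a TTL of 300 seconds, repeated calls to `lookupCached('api.gladysgateway.com')` would trigger at most one DNS query every five minutes instead of one per request. Node's HTTP requests accept a custom `lookup` option, so a cache of this kind could in principle be plugged in there; that wiring is left out here.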

cc @VonOx this will interest you :slight_smile:

@spenceur Thanks for the feedback! We’ll fix that!

Is the DNS issue I’m experiencing caused by the bug in the Node image, or by AdGuard blocking requests at certain times?

Thanks for your response

Hard to say, it could come from several points:

  • Locally, the local DNS resolver service tripping over its own feet because there are too many requests in a short time
  • AdGuard having trouble keeping up with the load since, as we can see, AdGuard is contacted for every request

Just to be sure it’s coming from us and not your installation, what specific DNS configuration did you set up on your Pi?

If I’m not mistaken, I just added this variable to the resolve.conf

static domain_name_servers
It was a while ago ^^’

I came across comments like this:

So I understand better why the TTL is not respected (both in the Docker Alpine image and on the Debian-based Raspberry Pi OS): it’s not a bug, it’s by design on most Linux systems.

@spenceur Could you temporarily stop using your AdGuard for Gladys, replace it with a popular DNS such as Cloudflare DNS (1.1.1.1), and see whether you still get these errors?

  • If you continue to see these errors, it means your AdGuard is not at fault, and we may be too aggressive on the Gladys side in how often we call the Gladys Gateway for the Google Home reportState.
  • If you no longer see errors, it means your AdGuard sometimes has trouble responding.

PS: If you change DNS, make sure to check that the new DNS is being used.

I’ll take care of it as soon as possible and keep you informed

PS: I still haven’t had the time :smiley:

I made the change as I mentioned here:

And I did check that I was going through Quad9:

Now I just have to see whether I run into the same issues over the medium term.

Great! Keep us updated, I hope this will solve the problem :slight_smile:

Well, in the end I can answer: even with Quad9, I get an error popping up:

2022-03-22T20:58:10+0100 <warn> gateway.forwardDeviceStateToGoogleHome.js:42 (sendCurrentState) Error: getaddrinfo EAI_AGAIN api.gladysgateway.com
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
  errno: -3001,
  code: 'EAI_AGAIN',
  syscall: 'getaddrinfo',
  hostname: 'api.gladysgateway.com',
  config: {
    url: 'https://api.gladysgateway.com/google/report_state',
    method: 'post',
    data: '{"devices":{"states":{"fgd212-dimmer-2-2":{"online":true,"brightness":99}}}}',

Run docker inspect gladys. I’m almost sure the container has a different DNS.

Here’s what I have with a grep on DNS

docker inspect gladys | grep -iF dns
            "Dns": [],
            "DnsOptions": [],
            "DnsSearch": [],

Still having issues this morning:

2022-03-23T10:22:24+0100 <warn> gateway.forwardDeviceStateToGoogleHome.js:41 (sendCurrentState) Gladys Gateway: Unable to forward google home reportState
2022-03-23T10:22:24+0100 <warn> gateway.forwardDeviceStateToGoogleHome.js:42 (sendCurrentState) Error: getaddrinfo EAI_AGAIN api.gladysgateway.com
    at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
  errno: -3001,
  code: 'EAI_AGAIN',
  syscall: 'getaddrinfo',
  hostname: 'api.gladysgateway.com',
  config: {
    url: 'https://api.gladysgateway.com/google/report_state',
    method: 'post',
    data: '{"devices":{"states":{"fgd212-dimmer-2-2":{"online":true,"brightness":99}}}',

No change despite changing DNS

It’s strange, I also have Quad9 by default.

Do you know how to connect interactively to the container? The idea would be to retrieve the contents of /etc/resolv.conf

Edit:

The command => docker exec -it gladys /bin/ash -c "cat /etc/resolv.conf"

Example output:

vonox@odin in  ~  1 ❯ docker exec -it gladys /bin/ash -c "cat /etc/resolv.conf"
search lan
nameserver 192.168.1.1
nameserver 9.9.9.9
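To compare the resolvers of two installations without reading the file by eye, the `nameserver` lines can be extracted programmatically. This is a minimal sketch: `parseNameservers` is a hypothetical helper, and the sample text is copied from the example output above.

```javascript
// Hypothetical helper: extracts the nameserver addresses from resolv.conf text.
function parseNameservers(resolvConfText) {
  return resolvConfText
    .split(/\r?\n/)
    .map((line) => line.trim())
    // Skip blank lines and comments ("#" or ";" at the start of a line).
    .filter((line) => line !== '' && !line.startsWith('#') && !line.startsWith(';'))
    .map((line) => line.split(/\s+/))
    .filter(([keyword, address]) => keyword === 'nameserver' && address !== undefined)
    .map(([, address]) => address);
}

// Sample input copied from the example output above.
const sample = ['search lan', 'nameserver 192.168.1.1', 'nameserver 9.9.9.9'].join('\n');
console.log(parseNameservers(sample));
```

Inside the container, the same function could be fed the real file with `fs.readFileSync('/etc/resolv.conf', 'utf8')`.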

Yes, I will take care of it this afternoon

Thank you for leaving my messages as they are :smiley:

Here is the output (thanks)

docker exec -it gladys /bin/ash -c "cat /etc/resolv.conf"
# Generated by resolvconf
nameserver 9.9.9.9

I just checked your ‹ dig ›

You’re not getting a DNS response :confused:

On the Quad9 side the domain is ok

On my side:

vonox@odin in  ~ ❯ dig api.gladysgateway.com

<<>> DiG 9.16.1-Ubuntu <<>> api.gladysgateway.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59709
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;api.gladysgateway.com.         IN      A

;; ANSWER SECTION:
api.gladysgateway.com.  300     IN      A       142.93.160.146

;; Query time: 15 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Wed Mar 23 14:00:46 CET 2022
;; MSG SIZE  rcvd: 66

Your problem is really weird (I have a similar Quad9/Pi-hole setup)

Testing your route

pi@raspberrypi:~ $ dig api.gladysgateway.com

<<>> DiG 9.11.5-P4-5.1+deb10u5-Raspbian <<>> api.gladysgateway.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55078
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;api.gladysgateway.com.         IN      A

;; ANSWER SECTION:
api.gladysgateway.com.  119     IN      A       142.93.160.146

;; Query time: 50 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Wed Mar 23 14:04:12 CET 2022
;; MSG SIZE  rcvd: 66

No issues either ^^

I’m not 100% convinced that we’re starting with the right assumption.

Damn, I was on the API


<<>> DiG 9.16.1-Ubuntu <<>> plus.gladysassistant.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26101
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;plus.gladysassistant.com.      IN      A

;; ANSWER SECTION:
plus.gladysassistant.com. 300   IN      A       188.114.97.3
plus.gladysassistant.com. 300   IN      A       188.114.96.3

;; Query time: 27 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Wed Mar 23 14:06:50 CET 2022
;; MSG SIZE  rcvd: 85