Hello!
For some time now I’ve been running into a problem, particularly with Z-Wave devices.
When I ask Google by voice to turn something off, it answers “OK, turning it off” but nothing happens. However, if I do it through the app (Google or Gladys), it works.
Sometimes my light turns off, comes right back on, then turns off again, anyway ^^'.
While checking, I found this in my logs, but I have never looked right after an ‹ incident ›.
2022-02-24T19:18:17+0100 <warn> gateway.forwardDeviceStateToGoogleHome.js:41 (sendCurrentState) Gladys Gateway: Unable to forward google home reportState
2022-02-24T19:18:17+0100 <warn> gateway.forwardDeviceStateToGoogleHome.js:42 (sendCurrentState) Error: getaddrinfo EAI_AGAIN api.gladysgateway.com
at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
errno: -3001,
code: 'EAI_AGAIN',
syscall: 'getaddrinfo',
hostname: 'api.gladysgateway.com',
config: {
url: 'https://api.gladysgateway.com/google/report_state',
method: 'post',
data: '{"devices":{"states":{"mqtt-zwave-chambre-enfant-plafonnier":{"online":true,"on":false}}}',
headers: {
Accept: 'application/json, text/plain, */*',
'Content-Type': 'application/json;charset=utf-8',
authorization: 'iYi05OTFkLTM3OTczMmZkYjdlZSIsImlhdCI6MTY0NTcyNTk5MCwiZXhwIjoxNjQ1NzI5NTkwLCJhdWQiOiJp.hddKDhIiF4znPRHKBU8-E84bhHyGBKy142ScP_vJjm0',
'User-Agent': 'axios/0.21.1',
'Content-Length': 90
},
On the DNS side, though, no request is being blocked.
This kind of error is fairly clear: it’s a DNS issue.
What is interesting is that there are dozens of DNS resolution requests in a row, with no regard for the TTL I set; sometimes there are 2 requests in the same second!
I conclude that Gladys asks for the Gateway server address again on every request. Not great!
When investigating, this seems to be a known issue on the internet:
While working on a big node eCOM backend that had a lot of traffic, from time to time we found getaddrinfo EAI_AGAIN error in our logs, quick googling explains that this means that our DNS server can’t currently serve our request.
(See this article: https://medium.com/@amirilovic/how-to-fix-node-dns-issues-5d4ec2e12e95 )
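Since EAI_AGAIN signals a temporary name-resolution failure, one common mitigation is to retry the call after a short delay. Below is a minimal sketch of that idea, not what Gladys actually does: `retryOnDnsFailure` and its `retries` and `delayMs` options are hypothetical names, and the only thing taken from the logs above is that the error object carries `code: 'EAI_AGAIN'`.

```javascript
// Hypothetical helper: retry an async operation only when it fails with
// EAI_AGAIN (temporary DNS failure). Any other error is rethrown at once.
async function retryOnDnsFailure(operation, { retries = 3, delayMs = 200 } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await operation();
    } catch (err) {
      // Give up on non-DNS errors, or once `retries` extra attempts are used.
      if (err.code !== 'EAI_AGAIN' || attempt >= retries) throw err;
      // Exponential backoff: delayMs, 2 * delayMs, 4 * delayMs, ...
      await new Promise((resolve) => setTimeout(resolve, delayMs * 2 ** attempt));
    }
  }
}
```

It would be used by wrapping the failing call, for example `retryOnDnsFailure(() => sendReport())`, where `sendReport` stands in for whatever function performs the HTTP request.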
And when investigating further, I came across this:
EDIT: This behavior is by design, see “Problème zwave et google home”, post #6 by pierre-gilles
In short, the Docker Node.js Alpine image that we use as a base for Gladys (cf. Dockerfile) apparently does not include a package that handles DNS caching, so each request does its own DNS resolution: a bit heavy ^^
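To illustrate what a DNS cache would change, here is a rough application-level sketch in Node.js. It is an assumption for illustration only, not the package-level fix discussed here: `DnsCache` and `cachedResolve` are hypothetical names. It relies on `dns.promises.resolve4` with `{ ttl: true }`, which returns the TTL alongside each address, so a second lookup within the TTL never reaches the DNS server.

```javascript
const dns = require('dns');

// Hypothetical in-memory cache: hostname -> { address, expiresAt }.
class DnsCache {
  constructor() {
    this.entries = new Map();
  }

  // Return the cached address, or undefined if it is missing or expired.
  get(hostname, now = Date.now()) {
    const entry = this.entries.get(hostname);
    if (!entry || entry.expiresAt <= now) {
      this.entries.delete(hostname);
      return undefined;
    }
    return entry.address;
  }

  // Store an address for ttlSeconds (the TTL announced by the DNS server).
  set(hostname, address, ttlSeconds, now = Date.now()) {
    this.entries.set(hostname, { address, expiresAt: now + ttlSeconds * 1000 });
  }
}

// Resolve through the cache. `resolve4` is injectable so it can be stubbed;
// by default it queries the configured DNS servers and asks for the TTL.
async function cachedResolve(
  cache,
  hostname,
  resolve4 = (name, options) => dns.promises.resolve4(name, options)
) {
  const cached = cache.get(hostname);
  if (cached) return cached;
  const [first] = await resolve4(hostname, { ttl: true });
  cache.set(hostname, first.address, first.ttl);
  return first.address;
}
```

With the 300-second TTL seen for api.gladysgateway.com, such a cache would issue at most one query every five minutes instead of one per request.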
I continue to investigate the issue to find the right package to add on the Docker side.
cc @VonOx this will interest you
@spenceur Thanks for the feedback! We’ll fix that!
Is the DNS issue I’m experiencing caused by the problem with the Node image, or by AdGuard blocking at certain times?
Thanks for your response
Hard to say, it could come from several points:
Locally, the DNS resolver service tripping over itself because there are too many requests in a short time
AdGuard struggling to keep up with the load; as we can see, AdGuard is contacted for each request
Just to be sure it’s coming from us and not your installation, what specific DNS configuration did you do on your Pi?
If I’m not mistaken, I just added this variable to resolv.conf
static domain_name_servers
It was a while ago ^^’
I came across comments like this:
So I understand better why the TTL is not respected (both in the Docker Alpine image and on the Debian-based Raspberry Pi OS): it’s not a bug, it’s by design on most Linux systems.
@spenceur Could you temporarily stop using your AdGuard for Gladys, replace it with a popular DNS such as Cloudflare DNS (1.1.1.1), and see whether you still get these errors?
If you still see these errors, it will mean that your AdGuard is not at fault. We may be too aggressive on the Gladys side in how often we call the Gladys Gateway for the Google Home reportState.
If you no longer see errors, it means your AdGuard sometimes has trouble responding.
PS: If you change DNS, make sure to check that the new DNS is being used.
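One possible way to run that check from Node.js itself is `dns.getServers()`, which returns the resolver addresses Node currently has configured. This is only a sketch: the helper names `configuredDnsServers` and `usesDnsServer` are hypothetical, and the list reflects what Node's own resolver reads (typically from /etc/resolv.conf), so it is worth cross-checking against that file inside the container.

```javascript
const dns = require('dns');

// Return the resolver addresses Node.js currently has configured.
function configuredDnsServers() {
  return dns.getServers();
}

// Check whether an expected resolver is in the list
// (1.1.1.1 is the Cloudflare DNS suggested above).
function usesDnsServer(expected, servers = configuredDnsServers()) {
  return servers.includes(expected);
}

console.log(configuredDnsServers());
console.log(usesDnsServer('1.1.1.1'));
```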
I’ll take care of it as soon as possible and keep you informed
PS: I still haven’t had the time
I made the change, as I mentioned here:
Hello, sorry for taking a while to reply.
=> I still have DNS issues. I hadn’t yet taken the time to switch to Quad9 (for the test); I’ve just done it and I still have the issue on the G+ side.
=> Yes, I consistently have the issue on G+; locally everything is OK, so nothing urgent.
It’s a bit complicated, but maybe Friday if nothing changes along the way ^^’
And I did check that I was going through quad9:
Just have to see if I encounter the same issues in the medium term
Great! Keep us updated, I hope this will solve the problem
Well, in the end I can already answer: even with Quad9, I get an error popping up:
2022-03-22T20:58:10+0100 <warn> gateway.forwardDeviceStateToGoogleHome.js:42 (sendCurrentState) Error: getaddrinfo EAI_AGAIN api.gladysgateway.com
at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
errno: -3001,
code: 'EAI_AGAIN',
syscall: 'getaddrinfo',
hostname: 'api.gladysgateway.com',
config: {
url: 'https://api.gladysgateway.com/google/report_state',
method: 'post',
data: '{"devices":{"states":{"fgd212-dimmer-2-2":{"online":true,"brightness":99}}}}',
VonOx, March 22, 2022, 8:36pm (#11)
Run docker inspect gladys; I’m almost sure the container uses a different DNS.
Here’s what I have with a grep on DNS
docker inspect gladys | grep -iF dns
"Dns": [],
"DnsOptions": [],
"DnsSearch": [],
Still having issues this morning:
2022-03-23T10:22:24+0100 <warn> gateway.forwardDeviceStateToGoogleHome.js:41 (sendCurrentState) Gladys Gateway: Unable to forward google home reportState
2022-03-23T10:22:24+0100 <warn> gateway.forwardDeviceStateToGoogleHome.js:42 (sendCurrentState) Error: getaddrinfo EAI_AGAIN api.gladysgateway.com
at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
errno: -3001,
code: 'EAI_AGAIN',
syscall: 'getaddrinfo',
hostname: 'api.gladysgateway.com',
config: {
url: 'https://api.gladysgateway.com/google/report_state',
method: 'post',
data: '{"devices":{"states":{"fgd212-dimmer-2-2":{"online":true,"brightness":99}}}',
No change despite changing DNS
VonOx, March 23, 2022, 10:44am (#14)
It’s strange, I also have quad9 by default.
Do you know how to connect interactively to the container? The idea would be to retrieve the contents of /etc/resolv.conf
Edit:
The command => docker exec -it gladys /bin/ash -c "cat /etc/resolv.conf"
Example output:
vonox@odin in ~ 1 ❯ docker exec -it gladys /bin/ash -c "cat /etc/resolv.conf"
search lan
nameserver 192.168.1.1
nameserver 9.9.9.9
Yes, I will take care of it this afternoon
Thank you for leaving my messages as they are
Here is the feedback (thanks)
docker exec -it gladys /bin/ash -c "cat /etc/resolv.conf"
# Generated by resolvconf
nameserver 9.9.9.9
VonOx, March 23, 2022, 1:01pm (#17)
I just checked your ‹ dig ›
You’re not getting a DNS response
On the Quad9 side the domain is ok
On my side:
vonox@odin in ~ ❯ dig api.gladysgateway.com
<<>> DiG 9.16.1-Ubuntu <<>> api.gladysgateway.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 59709
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;api.gladysgateway.com. IN A
;; ANSWER SECTION:
api.gladysgateway.com. 300 IN A 142.93.160.146
;; Query time: 15 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Wed Mar 23 14:00:46 CET 2022
;; MSG SIZE rcvd: 66
Your problem is really weird (I have a similar setup, Quad9/Pi-hole)
Testing your route
pi@raspberrypi:~ $ dig api.gladysgateway.com
<<>> DiG 9.11.5-P4-5.1+deb10u5-Raspbian <<>> api.gladysgateway.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55078
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;api.gladysgateway.com. IN A
;; ANSWER SECTION:
api.gladysgateway.com. 119 IN A 142.93.160.146
;; Query time: 50 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Wed Mar 23 14:04:12 CET 2022
;; MSG SIZE rcvd: 66
No issues either ^^
I’m not 100% convinced that we’re starting with the right assumption.
VonOx, March 23, 2022, 1:07pm (#20)
Damn, I was on the API
<<>> DiG 9.16.1-Ubuntu <<>> plus.gladysassistant.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26101
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;plus.gladysassistant.com. IN A
;; ANSWER SECTION:
plus.gladysassistant.com. 300 IN A 188.114.97.3
plus.gladysassistant.com. 300 IN A 188.114.96.3
;; Query time: 27 msec
;; SERVER: 9.9.9.9#53(9.9.9.9)
;; WHEN: Wed Mar 23 14:06:50 CET 2022
;; MSG SIZE rcvd: 85