https://github.com/influxdata/influxdb/issues/9122
Found out that my subqueries were bogging down my DB, so I adjusted them as recommended in the issue report linked above.
There were also two widgets with hardcoded, incorrect datasources.
I only changed the gateway variable to:
SHOW TAG VALUES FROM "gateways" WITH KEY = "gateway_name" WHERE "host" =~ /^$Host$/
Changed the GROUP BY on the Gateway RTT and Gateway Loss charts to gateway_name.
Added gateway_name to the Gateway Summary, and added interface to the same list so both gateway_name and interface can be seen.
telegraf --test --config /usr/local/etc/telegraf.conf
before:
> gateways,host=fw.anson.lan,interface=igb0 defaultgw=1,delay=0,gwdescr="Interface WAN_DHCP Gateway",loss=100,monitor="192.168.0.1",source="192.168.0.30",status="down",stddev=0,substatus="highloss" 1620242074000000000
> gateways,host=fw.anson.lan,interface=igb0 defaultgw=0,delay=0,gwdescr="Interface WAN_DHCP6 Gateway",loss=0,monitor="",source="",status="",stddev=0,substatus="N/A" 1620242074000000000
The changes to the telegraf output can be seen below:
telegraf --test --config /usr/local/etc/telegraf.conf
after:
> gateways,gateway_name=WAN_DHCP,host=fw.anson.lan,interface=igb0 defaultgw=1,delay=0,gwdescr="Interface WAN_DHCP Gateway",loss=100,monitor="192.168.0.1",source="192.168.0.30",status="down",stddev=0,substatus="highloss" 1620242589000000000
> gateways,gateway_name=WAN_DHCP6,host=fw.anson.lan,interface=igb0 defaultgw=0,delay=0,gwdescr="Interface WAN_DHCP6 Gateway",loss=0,monitor="",source="",status="",stddev=0,substatus="N/A" 1620242589000000000
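A side note on why the extra tag is useful, as far as I understand InfluxDB: points that share the same measurement, tag set, and timestamp are treated as the same point, so their fields get merged instead of being stored separately. In the "before" output both gateways sit on igb0 with identical tags and timestamps, while in the "after" output gateway_name tells them apart. Here is a minimal Python sketch of that check. The `series_key` helper is hypothetical, the field sets are shortened from the output above, and the parsing is simplified (it assumes no escaped spaces or commas in the measurement or tags).

```python
def series_key(line: str) -> tuple[str, str]:
    """Return (measurement + sorted tag set, timestamp) for one line-protocol point."""
    line = line.lstrip("> ").strip()
    head = line.split(" ", 1)[0]          # measurement and tags come before the first space
    timestamp = line.rsplit(" ", 1)[1]    # timestamp is the last space-separated token
    measurement, *tags = head.split(",")
    # Tag order is not significant, so sort before comparing
    return (",".join([measurement] + sorted(tags)), timestamp)


# Shortened copies of the telegraf --test output above
before = [
    '> gateways,host=fw.anson.lan,interface=igb0 loss=100,gwdescr="Interface WAN_DHCP Gateway" 1620242074000000000',
    '> gateways,host=fw.anson.lan,interface=igb0 loss=0,gwdescr="Interface WAN_DHCP6 Gateway" 1620242074000000000',
]
after = [
    '> gateways,gateway_name=WAN_DHCP,host=fw.anson.lan,interface=igb0 loss=100,gwdescr="Interface WAN_DHCP Gateway" 1620242589000000000',
    '> gateways,gateway_name=WAN_DHCP6,host=fw.anson.lan,interface=igb0 loss=0,gwdescr="Interface WAN_DHCP6 Gateway" 1620242589000000000',
]

print(len({series_key(l) for l in before}))  # 1: both points collide on the same key
print(len({series_key(l) for l in after}))   # 2: gateway_name separates them
```

If that reading is right, grouping by gateway_name is also the only way to chart the two gateways separately when they share an interface.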
There's definitely potential to put way too much data in the database. If I had the time, I would look into retention policies and data rollup, but I do this for fun and don't have the time to spare to read up on InfluxDB. I'll add defaultgw and gwdesc back in when I get the time, since they're specific to the gateway. The ifdesc belongs on the main interface table and would be redundant here. Sometimes I get too caught up in what I can collect and monitor.
Thanks