Discussion:
[bind-users] Slow reply under heavy load (on a specific NIC ip)
Ict Security
2018-06-04 16:04:16 UTC
Dear guys,

thank you for answering.
We are using CentOS 7.2 on x64 architecture.
We use the generic e1000 network driver; the virtual machine runs under VMware 5.5.

We use netfilter on the firewall machine, which is a separate machine, and we
raised the "somaxconn" parameter.
We do not see, right now, any warning about the conntrack table being full.

We are also trying to resolve from the SAME machine where BIND 9.x
runs, to avoid firewall/NAT problems during testing.
Even when queried locally, the NIC's primary IP address answers with delays under heavy load.
If I switch to querying an alias IP address on the same NIC, everything is fast.
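
The comparison described above (primary IP versus alias IP on the same NIC) could be made repeatable with a small probe. The following is only a sketch, not something taken from the thread: it hand-builds a minimal DNS A query and times one UDP round trip per address. The function names, the test name example.com, and passing the addresses on the command line are illustrative assumptions, and the reply is not validated beyond being received.

```python
import socket
import struct
import sys
import time


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query for an A record (recursion desired)."""
    # Header: ID, flags (RD=1), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def time_query(server_ip, name, timeout=2.0):
    """Return the round-trip time in seconds, or None if no reply arrives."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        sock.sendto(build_query(name), (server_ip, 53))
        try:
            sock.recvfrom(4096)
        except socket.timeout:
            return None
        return time.monotonic() - start


if __name__ == "__main__":
    # Pass the primary and alias addresses as arguments (hypothetical usage).
    for ip in sys.argv[1:]:
        rtt = time_query(ip, "example.com")
        print(ip, "timeout" if rtt is None else f"{rtt * 1000:.1f} ms")
```

Running this in a loop against both addresses during a load peak would show whether the delay really follows the IP address rather than the host.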

Thank you!!
F
Can you please provide some specifics about your setup that is experiencing
the problem?
HW - Sparc, PPC, Intel x86/x64, ARM ?
OS - what OS is the problem occurring on?
specific BIND version?
anything about the NIC in question, possibly to include mfg && model number,
if relevant?
Thanks
_______________________________________________
Please visit https://lists.isc.org/mailman/listinfo/bind-users to unsubscribe from this list

bind-users mailing list
bind-***@lists.isc.org
https://lists.isc.org/mailman/listinfo/bind-users
Ict Security
2018-06-05 15:28:42 UTC
Dear guys,

thank you for answering.
We are using CentOS 7.2 on x64 architecture.
We use the generic e1000 network driver; the virtual machine runs under VMware 5.5.

We use netfilter on the firewall machine, which is a separate machine, and we
raised the "somaxconn" parameter.
We do not see, right now, any warning about the conntrack table being full.

We are also trying to resolve from the SAME machine where BIND 9.x
runs, to avoid firewall/NAT problems during testing.
Even when queried locally, the NIC's primary IP address answers with delays under heavy load.
If I switch to querying an alias IP address on the same NIC, everything is fast.

Thank you!!
F
Warren Kumari
2018-06-04 15:28:49 UTC
Hi guys,
we are running a BIND 9.x server, and everything generally goes fine.
During moments of particularly heavy load, with some hundreds of concurrent
queries coming in, BIND sometimes stops answering for a few seconds or
answers with significant delays.
But when I query the same server/same BIND instance on a NIC alias IP
during congestion on the main IP, everything is fast!
This sounds suspiciously like conntrack (or some other state / connection
tracking) becoming full.

Depending on OS / version, does:
conntrack -L
or
cat /proc/net/ip_conntrack
produce a lot of output?
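
As a complement to eyeballing the table dump, the fill level can be read directly. This is a sketch under an assumption: that the kernel uses the nf_conntrack module and exposes its counters under /proc/sys/net/netfilter/ (older kernels use ip_conntrack names instead). The function name conntrack_usage is made up for illustration.

```python
from pathlib import Path

# Assumed locations for kernels using nf_conntrack; adjust for older kernels.
COUNT_PATH = "/proc/sys/net/netfilter/nf_conntrack_count"
MAX_PATH = "/proc/sys/net/netfilter/nf_conntrack_max"


def conntrack_usage(count_path=COUNT_PATH, max_path=MAX_PATH):
    """Return (entries, limit, fill_ratio), or None if the counters are unavailable."""
    try:
        entries = int(Path(count_path).read_text().strip())
        limit = int(Path(max_path).read_text().strip())
    except (FileNotFoundError, PermissionError):
        return None
    ratio = entries / limit if limit else 0.0
    return entries, limit, ratio


if __name__ == "__main__":
    usage = conntrack_usage()
    if usage is None:
        print("conntrack counters not found (module not loaded?)")
    else:
        entries, limit, ratio = usage
        print(f"conntrack: {entries}/{limit} entries ({ratio:.0%} full)")
```

A ratio close to 1 during the slow periods would support the conntrack theory; a low ratio would point elsewhere.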

Are you running iptables / ipfw / <similar>? Behind a firewall? Are there
any interesting messages in [/var/log/messages|/var/log/syslog|<wherever
you have BIND logging>]?

W
--
I don't think the execution is relevant when it was obviously a bad idea in
the first place.
This is like putting rabid weasels in your pants, and later expressing
regret at having chosen those particular rabid weasels and that pair of
pants.
---maf
Ict Security
2018-06-04 15:19:02 UTC
Hi guys,

we are running a BIND 9.x server, and everything generally goes fine.
During moments of particularly heavy load, with some hundreds of concurrent
queries coming in, BIND sometimes stops answering for a few seconds or
answers with significant delays.

But when I query the same server/same BIND instance on a NIC alias IP
during congestion on the main IP, everything is fast!

I changed some tunings in:
max-connections in /proc
txqueue in network
ipv4_ports

and that mitigated the problem somewhat,
but it is not completely solved.

Do you think BIND could have some per-IP limit on a NIC?
Any ideas?

Really thank you!
Francesco
Klaus Darilion
2018-06-15 22:14:45 UTC
Hi guys,
we are running a BIND 9.x server, and everything generally goes fine.
During moments of particularly heavy load, with some hundreds of concurrent
queries coming in, BIND sometimes stops answering for a few seconds or
answers with significant delays.
But when I query the same server/same BIND instance on a NIC alias IP
during congestion on the main IP, everything is fast!
Just a shot in the dark: I guess BIND opens multiple sockets, one for
each IP address. Every socket has its own receive buffer. Under heavy
load, the receive buffer fills up and you get packet loss. So your
client sends retransmissions, and sometimes a packet gets through and
you receive an answer (with high delay).

When you query the other IP, the socket is idle, hence no full buffers,
and your query gets received and processed immediately.

Checking the state of the receive buffer can be done with netstat, and
packet loss due to a full receive buffer can be seen in the "dropped
packets" of the respective interface.
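
Besides netstat, Linux also lists the receive queue and a drop counter per UDP socket in /proc/net/udp, which makes it possible to compare the socket on the primary IP with the one on the alias IP. The following is a rough sketch, not a tested diagnostic: the function name is hypothetical, only IPv4 sockets are covered, and the address decoding assumes a little-endian host such as x86.

```python
import socket
import struct


def parse_proc_net_udp(text, port=53):
    """Return a list of (ip, port, rx_queue_bytes, drops) for sockets bound to `port`.

    `text` is the content of /proc/net/udp (IPv4 sockets only).
    """
    rows = []
    for line in text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 13:
            continue
        ip_hex, port_hex = fields[1].split(":")
        if int(port_hex, 16) != port:
            continue
        # The address is printed as one hex word; "<I" assumes little-endian.
        ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
        rx_queue = int(fields[4].split(":")[1], 16)  # field is tx_queue:rx_queue
        drops = int(fields[12])  # last column: datagrams dropped on this socket
        rows.append((ip, port, rx_queue, drops))
    return rows


if __name__ == "__main__":
    try:
        with open("/proc/net/udp") as f:
            content = f.read()
    except OSError:
        content = ""
    for ip, port, rx_queue, drops in parse_proc_net_udp(content):
        print(f"{ip}:{port} rx_queue={rx_queue} drops={drops}")
```

If the theory holds, the socket on the primary IP should show a large rx_queue and a growing drops value during load peaks, while the alias socket stays near zero.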

regards
Klaus