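On an Ubuntu 14.04 machine running HAProxy, right after a service haproxy reload, HAProxy suddenly reported all the servers behind it as down.
After some digging, I noticed that ping does not work reliably: sometimes it succeeds, and a few seconds later we get the error ping: sendmsg: Operation not permitted.
It also cannot resolve subdomain.domain.com.
iptables -L shows no rules, and iptables --flush did not help.
Any ideas?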
- root@some-test:~# ping 107.1.1.1
- PING 107.1.1.1 (107.1.1.1) 56(84) bytes of data.
- 64 bytes from 107.1.1.1: icmp_seq=1 ttl=63 time=0.425 ms
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- 64 bytes from 107.1.1.1: icmp_seq=6 ttl=63 time=0.390 ms
- 64 bytes from 107.1.1.1: icmp_seq=7 ttl=63 time=0.533 ms
- 64 bytes from 107.1.1.1: icmp_seq=8 ttl=63 time=0.357 ms
- 64 bytes from 107.1.1.1: icmp_seq=9 ttl=63 time=0.343 ms
- 64 bytes from 107.1.1.1: icmp_seq=10 ttl=63 time=0.380 ms
- 64 bytes from 107.1.1.1: icmp_seq=11 ttl=63 time=0.398 ms
- 64 bytes from 107.1.1.1: icmp_seq=12 ttl=63 time=0.423 ms
- 64 bytes from 107.1.1.1: icmp_seq=13 ttl=63 time=0.293 ms
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- 64 bytes from 107.1.1.1: icmp_seq=16 ttl=63 time=0.371 ms
- 64 bytes from 107.1.1.1: icmp_seq=17 ttl=63 time=0.374 ms
- 64 bytes from 107.1.1.1: icmp_seq=18 ttl=63 time=0.305 ms
- 64 bytes from 107.1.1.1: icmp_seq=19 ttl=63 time=0.259 ms
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- 64 bytes from 107.1.1.1: icmp_seq=24 ttl=63 time=0.370 ms
- 64 bytes from 107.1.1.1: icmp_seq=25 ttl=63 time=0.316 ms
- 64 bytes from 107.1.1.1: icmp_seq=26 ttl=63 time=0.412 ms
- 64 bytes from 107.1.1.1: icmp_seq=27 ttl=63 time=0.512 ms
- 64 bytes from 107.1.1.1: icmp_seq=28 ttl=63 time=0.375 ms
- 64 bytes from 107.1.1.1: icmp_seq=29 ttl=63 time=0.352 ms
- 64 bytes from 107.1.1.1: icmp_seq=30 ttl=63 time=0.331 ms
- 64 bytes from 107.1.1.1: icmp_seq=31 ttl=63 time=0.290 ms
- 64 bytes from 107.1.1.1: icmp_seq=32 ttl=63 time=0.353 ms
- 64 bytes from 107.1.1.1: icmp_seq=33 ttl=63 time=0.378 ms
- 64 bytes from 107.1.1.1: icmp_seq=34 ttl=63 time=0.523 ms
- 64 bytes from 107.1.1.1: icmp_seq=35 ttl=63 time=0.351 ms
- 64 bytes from 107.1.1.1: icmp_seq=36 ttl=63 time=0.302 ms
- 64 bytes from 107.1.1.1: icmp_seq=37 ttl=63 time=0.496 ms
- 64 bytes from 107.1.1.1: icmp_seq=38 ttl=63 time=0.377 ms
- 64 bytes from 107.1.1.1: icmp_seq=39 ttl=63 time=0.357 ms
- 64 bytes from 107.1.1.1: icmp_seq=40 ttl=63 time=0.396 ms
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- ping: sendmsg: Operation not permitted
- 64 bytes from 107.1.1.1: icmp_seq=52 ttl=63 time=0.372 ms
- 64 bytes from 107.1.1.1: icmp_seq=53 ttl=63 time=0.412 ms
- 64 bytes from 107.1.1.1: icmp_seq=54 ttl=63 time=0.321 ms
- 64 bytes from 107.1.1.1: icmp_seq=55 ttl=63 time=0.366 ms
- 64 bytes from 107.1.1.1: icmp_seq=56 ttl=63 time=0.379 ms
- 64 bytes from 107.1.1.1: icmp_seq=57 ttl=63 time=0.395 ms
- 64 bytes from 107.1.1.1: icmp_seq=58 ttl=63 time=0.488 ms
- 64 bytes from 107.1.1.1: icmp_seq=59 ttl=63 time=0.513 ms
- 64 bytes from 107.1.1.1: icmp_seq=60 ttl=63 time=0.435 ms
- ^C
- --- 107.1.1.1 ping statistics ---
- 60 packets transmitted, 39 received, 35% packet loss, time 59008ms
- rtt min/avg/max/mdev = 0.259/0.385/0.533/0.067 ms
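Solution
I think the problem is that the conntrack table has reached its maximum number of connections, so new connections cannot be established until old entries expire. You can probably see something like this in dmesg: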
- [1824447.285257] nf_conntrack: table full, dropping packet.
- [1824447.522502] nf_conntrack: table full, dropping packet.
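To confirm this quickly, you can count the matching kernel messages. The following is a minimal sketch, not a definitive check: the function name count_conntrack_drops is made up for illustration, and it assumes the kernel message text matches the lines shown above.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example): count the
# "table full" messages in kernel log text read from standard input.
count_conntrack_drops() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so "|| true" keeps the function's status at 0.
    grep -c 'nf_conntrack: table full' || true
}

# Usage (reading dmesg may require root on some systems):
#   dmesg | count_conntrack_drops
```

A non-zero count supports the theory that packets are being dropped because the table is full.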
You can check the current conntrack maximum:
- undefine@uml:~$ sudo sysctl net.nf_conntrack_max
- net.nf_conntrack_max = 65536
and the current conntrack count:
- undefine@uml:~$ sysctl net.netfilter.nf_conntrack_count
- net.netfilter.nf_conntrack_count = 157
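The two numbers are easier to judge as a ratio. Here is a small sketch that turns them into a usage percentage; the helper name conntrack_usage_pct is hypothetical, and the commented usage assumes the nf_conntrack module is loaded so that the sysctl keys exist.

```shell
#!/bin/sh
# Hypothetical helper: print conntrack table usage as a whole percentage.
# Arguments: current entry count, table maximum.
conntrack_usage_pct() {
    count=$1
    max=$2
    if [ "$max" -le 0 ]; then
        echo "max must be positive" >&2
        return 1
    fi
    # Integer arithmetic, rounded down.
    echo $(( count * 100 / max ))
}

# Usage on a live system (assumes these sysctl keys are present):
#   conntrack_usage_pct "$(sysctl -n net.netfilter.nf_conntrack_count)" \
#                       "$(sysctl -n net.netfilter.nf_conntrack_max)"
```

A value that sits close to 100 while the ping errors occur would point to a full table.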
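You can list the current connections with conntrack -L (a tool from the conntrack package). It is worth looking through them and checking their types; some of them may be unnecessary.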
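With thousands of entries, a per-type summary is easier to read than the raw list. The sketch below assumes the usual conntrack -L line layout, where the first field is the protocol name and, for TCP, the fourth field is the connection state. The function name summarize_conntrack is invented for this example.

```shell
#!/bin/sh
# Hypothetical helper: summarize conntrack entries read from standard input
# by protocol, and by state for TCP. Prints "count key", largest first.
summarize_conntrack() {
    awk '
        NF == 0 { next }                      # skip blank lines
        {
            # Field 1 is the protocol name; for TCP, field 4 is the state.
            key = ($1 == "tcp") ? ($1 " " $4) : $1
            n[key]++
        }
        END { for (k in n) print n[k], k }
    ' | sort -rn
}

# Usage (conntrack needs root):
#   sudo conntrack -L 2>/dev/null | summarize_conntrack
```

A large number of entries in a single state, for example TIME_WAIT, is a hint about which connections are filling the table.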
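You have three options:
> Don't use conntrack at all (the simple option: don't use the nat table and unload the nf_conntrack module)
> Disable conntrack for outgoing connections (use -j NOTRACK in the raw table for the problematic connections)
> Increase the maximum number of connections:
- undefine@uml:~$ sudo sysctl net.nf_conntrack_max=512000
- net.nf_conntrack_max = 512000
Or put net.nf_conntrack_max = 512000 in /etc/sysctl.conf and then run sysctl -p to reload it.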
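For the second option, the rules could look roughly like the output of the sketch below. It only prints the iptables commands so you can review them before running anything. The function name print_notrack_rules, the address 192.0.2.10 and the port 8080 are placeholders, not values from the question.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) raw-table rules that exempt traffic
# to one backend address and port from connection tracking.
# Arguments: backend IP address, backend TCP port (both placeholders).
print_notrack_rules() {
    backend_ip=$1
    backend_port=$2
    # Packets going out from this host to the backend.
    echo "iptables -t raw -A OUTPUT -p tcp -d $backend_ip --dport $backend_port -j NOTRACK"
    # Replies coming back from the backend.
    echo "iptables -t raw -A PREROUTING -p tcp -s $backend_ip --sport $backend_port -j NOTRACK"
}

# Usage: review the printed commands, then run them as root if they fit.
#   print_notrack_rules 192.0.2.10 8080
```

Both directions are covered because replies to untracked packets would otherwise still create conntrack entries. Keep in mind that untracked packets cannot be NATed and will not match rules that rely on connection state, so this only fits traffic that needs neither.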