On Sat, Jun 3, 2023 at 12:56 PM Alex <mysqlstudent(a)gmail.com> wrote:
On Sat, Jun 3, 2023 at 11:58 AM Tim via users <users(a)lists.fedoraproject.org>
wrote:
>
> On Sat, 2023-06-03 at 09:46 -0400, Alex wrote:
> > I have an E3-1240 fedora37 postfix system using SSDs connected to a
> > cable modem that's having problems with dropped packets. There is one
> > other fedora37 server (E5-1650) directly connected to the cable modem
> > that is not having the same problem, although it's just routing
> > packets, not really doing much processing of data. The server with
> > the problem is using libreswan to create a VPN between itself and
> > an i7-7700K with fedora37 managed at OVH, thinking it would be more
> > resilient than the cable connection itself. The problem also happens
> > without the VPN, but perhaps not to the same degree.
>
> Asking the obvious questions:
>
> Have you swapped the ports around on the cable modem between the two
> computers, to see if the cable modem has a bad port?
>
> Have you swapped cables, to see if you have a bad cable?
Yes, great questions, and I realized that after I posted.
The server has two ethernet ports. I've tried connecting the server's other port to a
completely different port on the cable modem, using a completely different ethernet
cable, and the problem persists.
To add to the complexity, I have a SuperMicro E5-1650 2U that I just brought out
of retirement and booted from a USB stick running sysrescue, and it has the same
"problem" without even doing anything. I don't understand how these two
servers differ from the other server on the same cable modem, which doesn't
experience any of these issues.
As I posted in my follow-up, the problem doesn't happen when tcpdump is running (now
on both servers).
tcpdump likely ignores the checksum because it is usually offloaded
and calculated by the NIC. Since the NIC performs the calculation,
tcpdump only sees the memory with the header fields and checksum = 0.
The OS sets it to 0 because the NIC will calculate it.
I think you can disable TCP checksum offloads, and have the OS perform
the calculation. See
https://serverfault.com/questions/421995/disable-tcp-offloading-completel...
.
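To make the checksum point concrete, here is a minimal Python sketch of the standard Internet checksum (RFC 1071) that the OS or the NIC has to compute for each packet, plus a helper that only builds the ethtool command line for turning checksum offload off. The function names and the interface name are placeholders of mine, not anything from the thread, and the ethtool command itself needs root to run.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """One's-complement sum over 16-bit big-endian words (RFC 1071)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a trailing zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    # Fold any carries above bit 15 back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ethtool_disable_checksum_offload(iface: str) -> list[str]:
    """Build (but do not run) the command that disables rx/tx checksum offload.

    `iface` is a placeholder; use the real interface name on the server.
    """
    return ["ethtool", "-K", iface, "rx", "off", "tx", "off"]
```

With offload disabled, the kernel fills in the checksum itself, so a capture taken on the sending host should show the final value rather than whatever was in memory before the NIC touched the packet. Passing the list from ethtool_disable_checksum_offload("eth0") to subprocess.run() would apply the change, assuming eth0 is the right interface.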
Maybe it is a software problem? iptables/ebtables? ethtool settings?
I'm using shorewall on the server without the problem, but the problem persists even
with disabling all firewall rules.
The "real" problem I'm trying to address is why I'm seeing so many DNS
timeouts:
Jun 3 10:07:34 mail03 postfix/postscreen[413190]: warning: dnsblog reply timeout 10s for bb.barracudacentral.org
Jun 3 10:07:34 mail03 postfix/postscreen[413190]: warning: dnsblog reply timeout 10s for list.dnswl.org
Jun 3 10:07:29 mail03 postfix/dnsblog[413191]: warning: dnsblog_query: lookup error for DNS query 109.178.102.199.bl.mailspike.net: Host or domain name not found. Name service error for name=109.178.102.199.bl.mailspike.net type=A: Host not found, try again
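One way to check whether these timeouts come from the resolver rather than from postfix is to repeat the same lookup by hand and time it. The sketch below is only an illustration: the helper names are made up, and the client address 199.102.178.109 is inferred by reversing the octets in the logged query name.

```python
import ipaddress
import socket
import time


def dnsbl_query_name(client_ip: str, zone: str) -> str:
    """Build the name a DNSBL lookup asks for: reversed IPv4 octets + zone."""
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def time_lookup(name: str) -> tuple[float, str]:
    """Resolve `name` with the system resolver and report how long it took.

    A "not found" error is normal for an address that is not listed;
    what matters here is how long the answer takes to come back.
    """
    start = time.monotonic()
    try:
        result = socket.gethostbyname(name)
    except OSError as exc:  # covers socket.gaierror and socket.herror
        result = f"error: {exc}"
    return time.monotonic() - start, result
```

Calling time_lookup(dnsbl_query_name("199.102.178.109", "bl.mailspike.net")) a few dozen times, with and without tcpdump running, should show whether slow or failed answers line up with the dropped packets.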
You might also be interested in
https://unix.stackexchange.com/questions/205141/what-exactly-is-an-ifconf...
.
When I read your first post, I thought you might need to increase the
number of receive buffers or the size of the receive buffers. Change
them from 256 KB to 512 KB or 1024 KB. Also see
https://www.cyberciti.biz/faq/linux-tcp-tuning/ .
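Before changing anything, it may help to see what the receive buffer limits are currently set to. This is a rough Python sketch that reads the values from /proc/sys. The helper name is mine, and I am assuming the socket buffer sysctls are the ones meant here, not the NIC ring sizes.

```python
from pathlib import Path


def read_sysctl(name: str, root: str = "/proc/sys") -> list[int]:
    """Read a numeric sysctl such as net.core.rmem_max from /proc/sys.

    Returns a list because some keys (e.g. net.ipv4.tcp_rmem) hold
    several values: min, default and max.
    """
    path = Path(root, *name.split("."))
    return [int(field) for field in path.read_text().split()]
```

For example, read_sysctl("net.core.rmem_default"), read_sysctl("net.core.rmem_max") and read_sysctl("net.ipv4.tcp_rmem") give the values in bytes, which can be compared against the 512 KB or 1024 KB suggested above.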
Jeff