1

Topic: backup mx - spooled volume mail: connection refused

- iRedMail version (check /etc/iredmail-release): 0.9.9 MARIADB edition
- Deployed with iRedMail Easy or the downloadable installer? Downloadable installer
- Linux/BSD distribution name and version: CentOS 7
- Store mail accounts in which backend (LDAP/MySQL/PGSQL):  MySQL
- Web server (Apache or Nginx): Nginx
- Manage mail accounts with iRedAdmin-Pro? No

I have an iRedMail server configured as a backup MX for our domain. Before deployment I tested spooling and delivery from the backup MX to the primary server, and everything worked. Over the weekend we had an outage. Afterwards, some of the spooled mail was delivered from the backup MX, but delivery of the rest then stopped. I wonder whether the volume of mail in the queue is the cause: my tests involved only ~5 messages, while the outage left ~400 messages spooled. Here's what I've found:

Configuration:

Primary server: Site A (WAN IP 1/network 1)
Backup MX server: Site B (WAN IP 1/network 2)

I've tested flushing one mail from the queue on the Backup MX server.

Sep  6 11:30:09 [Site B mailserver] clamd[8799]: SelfCheck: Database status OK.
Sep  6 11:40:00 [Site B mailserver] postfix/qmgr[10591]: 46Q0Xm39dmzYt8h2: from=<[sender]@[Site B mailserver].[domain]>, size=980, nrcpt=1 (queue active)


The hardware firewall at Site A shows that it allowed the connection from the server at Site B to the server at Site A.
Site A firewall hardware log for WAN IP entries:
2019-09-06 11:28:24 Allow [Site B IP] [Site A IP] smtp/tcp 48612 25 Allowed
2019-09-06 11:40:00 Allow [Site B IP] [Site A IP] smtp/tcp 49612 25 Allowed 

The mail server at Site B reports that the connection was refused:

Sep  6 11:40:00 [Site B mailserver] postfix/qmgr[10591]: 46Q0Xm39dmzYt8h2: from=<[sender]@[Site B mailserver].[domain]>, size=980, nrcpt=1 (queue active)
Sep  6 11:40:00 [Site B mailserver] postfix/smtp[11459]: connect to [Site A IP][[Site A IP]]:25: Connection refused
Sep  6 11:40:00 [Site B mailserver] postfix/smtp[11459]: 46Q0Xm39dmzYt8h2: to=<[recipient]@[domain]>, relay=none, delay=3704, delays=3704/0.02/0.05/0, dsn=4.4.1, status=deferred (connect to [Site A IP][[Site A IP]]:25: Connection refused)

The mail server at Site A is receiving emails from other external domains without issue, but there are no log entries for traffic from Site B on Site A's mail server to indicate a delivery rejection at the destination.

I attempted to flush all mail within the Backup MX's queue and found the below entry in the mail log.


Sep  6 11:55:35 [Site B mailserver] postfix/error[12099]: 46NJm55VZMzYt9tg: to=<[recipient]@[domain]>, relay=none, delay=242001, delays=242001/0.67/0/0.04, dsn=4.4.1, status=deferred (delivery temporarily suspended: connect to [Site A IP][[Site A IP]]:25: Connection refused)

However, Site A's firewall shows the connection as allowed:

2019-09-06 11:55:34 Allow [Site B IP] [Site A IP] smtp/tcp 50958 25 Allowed
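
To see how many queued messages share the same deferral reason, maillog entries like the ones above can be tallied with a short script. This is a minimal Python sketch, not part of iRedMail or Postfix; the function name `summarize_deferrals` and the regular expression are assumptions written against the log format shown in this post.

```python
import re
from collections import Counter

# Matches the delivery-status part of a Postfix smtp/error log line.
# Written against the log format quoted in this thread (an assumption).
STATUS_RE = re.compile(
    r"postfix/(?P<daemon>\w+)\[\d+\]: (?P<qid>[0-9A-Za-z]+): "
    r"to=<(?P<rcpt>[^>]*)>, .*?delay=(?P<delay>[\d.]+), .*?"
    r"dsn=(?P<dsn>[\d.]+), status=(?P<status>\w+) \((?P<reason>.*)\)\s*$"
)


def summarize_deferrals(lines):
    """Count deferred deliveries per (dsn, reason) and return the longest delay."""
    reasons = Counter()
    max_delay = 0.0
    for line in lines:
        match = STATUS_RE.search(line)
        if not match or match.group("status") != "deferred":
            continue  # skip qmgr lines, clamd lines, and successful deliveries
        reasons[(match.group("dsn"), match.group("reason"))] += 1
        max_delay = max(max_delay, float(match.group("delay")))
    return reasons, max_delay
```

For example, `summarize_deferrals(open("/var/log/maillog"))` would read the live log; that path is the usual CentOS 7 location and is an assumption here. Entries from `postfix/error` that read "delivery temporarily suspended" mean Postfix skipped the attempt because an earlier connection to that destination had just failed, so only the `postfix/smtp` entries represent real connection attempts.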

Are there any settings that I can configure to address the volume of mail attempting to deliver? Is it possible that connectivity is failing because too many emails are trying to deliver at once? I added the below entries to main.cf but that didn't help.


smtpd_timeout=1200s
smtp_pix_workaround_delay_time = 300s
smtp_pix_workaround_threshold_time = 86400s

2

Re: backup mx - spooled volume mail: connection refused

Did you check the primary server's iptables rules to see whether the backup MX's IP address is being rejected?

Also, have you tried telnet/nc to port 25 from the backup MX to the primary server?

3

Re: backup mx - spooled volume mail: connection refused

chaz wrote:

Did you check the primary server's iptables to see if the backup mx ip address is not being rejected.

I checked the firewall log and nothing is showing up for this server or port (or anything else being blocked/dropped).

Also have you tried doing telnet/nc on port 25 from the backup mx to the primary server?

The connection is refused; however, the firewall shows port 25 traffic allowed at the exact time of my telnet attempt. So maybe something on the primary server is rejecting the attempts, but I can't find it anywhere in the logs.
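
A refused connection and a dropped connection point to different causes: a refusal means something answered the SYN with a reset or an ICMP reject, while a silent drop shows up as a timeout. A small probe run from the backup MX makes the distinction explicit. This is a sketch in Python; `probe_tcp` is a hypothetical helper name, and the host passed to it would be the primary server's address.

```python
import socket


def probe_tcp(host, port, timeout=5.0):
    """Classify a TCP connect attempt as 'open', 'refused', 'timeout', or 'error'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"       # three-way handshake completed
    except ConnectionRefusedError:
        return "refused"        # a reset or reject came back
    except socket.timeout:
        return "timeout"        # nothing came back: packets silently dropped
    except OSError:
        return "error"          # e.g. no route to host, name resolution failure
```

Because a refusal requires an active reply, a result of "refused" suggests that a host or device on the path is rejecting the connection rather than ignoring it. Note that a capture filtered with `port 25` would not show an ICMP reject, so a wider filter such as `host [Site B address]` may reveal a reply that the narrower filter hides.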

4

Re: backup mx - spooled volume mail: connection refused

Traffic captured with tcpdump on the primary server during the telnet attempt:

tcpdump -i any port 25
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
13:45:57.927771 IP [Site B address] > [Site A server].smtp: Flags [S], seq 1201253025, win 29200, options [mss 1380,sackOK,TS val 9700073 ecr 0,nop,wscale 7], length 0
^C
1 packet captured
1 packet received by filter
0 packets dropped by kernel

5

Re: backup mx - spooled volume mail: connection refused

From the tcpdump output, it looks like the SYN packet to the SMTP port reaches the primary server. Does the Postfix log show the incoming connection?

6

Re: backup mx - spooled volume mail: connection refused

No, it doesn't, and that's what's strange. I also have a second server on the same network as the backup MX, and it can send email to the primary server without issue. My first guess is that the backup MX crossed some rate or threshold while sending the high volume of email and was "shut down" by the primary server, but I don't know where to look to confirm that. The backup MX had a bit over 400 emails in the queue, then delivery stopped, and now it has 384. Does Postfix have a way to show which servers it is currently throttling or delaying, and for how long?

7

Re: backup mx - spooled volume mail: connection refused

Check the iptables firewall rules: is your backup MX blocked because it triggered a Fail2ban filter?

8

Re: backup mx - spooled volume mail: connection refused

Firewall rules look ok...I will see what I can find via Fail2ban. I haven't used it before, so any info that you may have regarding how to see if it's been blocked will be helpful.
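
Fail2ban records each ban and unban in its log (by default /var/log/fail2ban.log). `fail2ban-client status` lists the active jails, and `fail2ban-client status <jail>` lists the IPs currently banned in one jail. For checking the log history offline, the sketch below replays Ban/Unban entries for one address. It is a Python sketch under assumptions: the helper name `currently_banned` is hypothetical, and the regular expression assumes the usual `[jail] Ban <ip>` log wording.

```python
import re

# Assumes Fail2ban's usual log wording, e.g. "... NOTICE  [postfix] Ban 203.0.113.10".
BAN_RE = re.compile(
    r"\[(?P<jail>[^\]]+)\]\s+(?:Restore\s+)?(?P<action>Ban|Unban)\s+(?P<ip>\S+)"
)


def currently_banned(lines, ip):
    """Return the set of jails whose most recent logged action for `ip` is a Ban."""
    banned = set()
    for line in lines:
        match = BAN_RE.search(line)
        if not match or match.group("ip") != ip:
            continue
        if match.group("action") == "Ban":
            banned.add(match.group("jail"))
        else:
            banned.discard(match.group("jail"))
    return banned
```

A ban that has already expired would not show in the live status but would still appear in the log history, which may matter if the block was temporary. As far as I know, Fail2ban's stock iptables actions reject with an ICMP port-unreachable unless the blocktype was changed, and a client reports that as "Connection refused".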

9

Re: backup mx - spooled volume mail: connection refused

So here's what I've found: I manually moved all of the messages to a temp folder, copied ~10 back to the Postfix maildrop folder, changed permissions, and restarted Postfix. Those messages were delivered almost immediately. Can you confirm whether Postfix has any built-in throttling options? I know that SMTP has a session limit; could that be the issue? If so, how do I modify it?
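
Postfix does have throttling controls. On the sending side, `smtp_destination_concurrency_limit` and `smtp_destination_rate_delay` cap parallel deliveries and pace deliveries per destination; on the receiving side, `smtpd_client_connection_count_limit` and `smtpd_client_connection_rate_limit` limit what one client may open. When the receiving limits trip, smtpd normally answers with a 421 reply and logs a warning, rather than refusing the TCP connection. For releasing a large queue gradually without moving files by hand, one option is to hold the deferred mail with `postsuper -h ALL deferred` and then release it in batches. The Python sketch below does that; the function names, batch size, and pause are assumptions, and it would need to run as root on the backup MX.

```python
import re
import subprocess
import time

# A queue entry in "postqueue -p" output starts at column 0 with the queue ID,
# optionally followed by "*" (active) or "!" (on hold), then the message size.
QID_RE = re.compile(r"^([0-9A-Za-z]+)[*!]?\s+\d+\s", re.MULTILINE)


def parse_queue_ids(postqueue_output):
    """Extract queue IDs from the text printed by `postqueue -p`."""
    return QID_RE.findall(postqueue_output)


def batches(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def release_in_batches(queue_ids, size=10, pause=60):
    """Release held messages `size` at a time, sleeping `pause` seconds after each batch."""
    for batch in batches(queue_ids, size):
        # "postsuper -H -" reads the queue IDs to release from standard input.
        subprocess.run(["postsuper", "-H", "-"],
                       input="\n".join(batch) + "\n", text=True, check=True)
        for qid in batch:
            # Ask the queue manager to attempt delivery of this message now.
            subprocess.run(["postqueue", "-i", qid], check=True)
        time.sleep(pause)
```

A typical run would feed `parse_queue_ids` the output of `postqueue -p` and pass the result to `release_in_batches`. If small batches deliver but the full queue does not, that points at something reacting to connection volume rather than at the messages themselves.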

10

Re: backup mx - spooled volume mail: connection refused

I don't think Postfix is the issue. I suggest checking the log files and extracting the error entries for troubleshooting.

11

Re: backup mx - spooled volume mail: connection refused

I've continued to troubleshoot but can't find any errors. Fail2ban and iptables show no errors, and none appear in the firewall logs or the maillog. The only way I was able to resolve this was to move all queued mail to a separate folder and manually requeue it in small batches. I thought there might be a bottleneck on the server, so I monitored performance as the mail was released, but found nothing there either. The requeued mail was delivered pretty much instantly. I'm not sure what the issue was, but I'll move on for now.