Topic: Connection Timeout errors shortly after installing Let's Encrypt Certs
==== REQUIRED BASIC INFO OF YOUR IREDMAIL SERVER ====
- iRedMail version (check /etc/iredmail-release): 1.6.2
- Deployed with iRedMail Easy or the downloadable installer? Downloadable
- Linux/BSD distribution name and version: Ubuntu 20.04
- Store mail accounts in which backend (LDAP/MySQL/PGSQL): MariaDB
- Web server (Apache or Nginx): Nginx
- Manage mail accounts with iRedAdmin-Pro? No
- [IMPORTANT] Related original log or error message is required if you're experiencing an issue.
====
LOGS
====
Following are excerpts from the logs I knew to check. The only thing I can make of them is that there seems to be a bad cert/key combination (or is that the DKIM key?)
------
# cat /var/log/mail.log | grep error | more
Jan 21 04:08:52 casa postfix/smtps/smtpd[3086]: SSL_accept error from unknown[152.231.17.195]: lost connection
-----
cat /var/log/nginx/ | grep error | more
2023/01/21 03:48:59 [error] 757#757: *1489 open() "/var/www/html/media/wp-includes/wlwmanifest.xml" failed (2: No such file or directory), client: 159.223.45.106, server: _, request: "GET //media/wp-includes/wlwmanifest.xml HTTP/1.1", host: "emailaddress.one"
2023/01/21 03:48:59 [error] 757#757: *1489 open() "/var/www/html/wp2/wp-includes/wlwmanifest.xml" failed (2: No such file or directory), client: 159.223.45.106, server: _, request: "GET //wp2/wp-includes/wlwmanifest.xml HTTP/1.1", host: "emailaddress.one"
2023/01/21 03:48:59 [error] 757#757: *1489 open() "/var/www/html/site/wp-includes/wlwmanifest.xml" failed (2: No such file or directory), client: 159.223.45.106, server: _, request: "GET //site/wp-includes/wlwmanifest.xml HTTP/1.1", host: "emailaddress.one"
2023/01/21 03:49:00 [error] 757#757: *1489 open() "/var/www/html/cms/wp-includes/wlwmanifest.xml" failed (2: No such file or directory), client: 159.223.45.106, server: _, request: "GET //cms/wp-includes/wlwmanifest.xml HTTP/1.1", host: "emailaddress.one"
2023/01/21 03:49:00 [error] 757#757: *1489 open() "/var/www/html/sito/wp-includes/wlwmanifest.xml" failed (2: No such file or directory), client: 159.223.45.106, server: _, request: "GET //sito/wp-includes/wlwmanifest.xml HTTP/1.1", host: "emailaddress.one"
2023/01/21 04:37:12 [error] 711#711: *860 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 194.38.20.161, server: _, request: "GET /joobi/inc/openflashchart/php-ofc-library/ofc_upload_image.php HTTP/1.1", upstream: "fastcgi://127.0.0.1:9999", host: "ninesongmedia.com"
2023/01/21 04:51:49 [error] 717#717: *155 open() "/var/www/html/actuator/health" failed (2: No such file or directory), client: 198.199.95.27, server: _, request: "GET /actuator/health HTTP/1.1", host: "170.187.142.226"
-----
cat /var/log/dovecot/*.log |grep error|more
Jan 21 01:23:58 casa dovecot: pop3-login: Disconnected (no auth attempts in 0 secs): user=<>, rip=24.239.78.182, lip=170.187.142.226, TLS handshaking: SSL_accept() failed: error:14094412:SSL routines:ssl3_read_bytes:sslv3 alert bad certificate: SSL alert number 42, session=<J4M2BbzyGQUY7062>
-----
cat /var/log/syslog | grep error | more
Jan 21 04:08:52 casa postfix/smtps/smtpd[3086]: SSL_accept error from unknown[152.231.17.195]: lost connection
Jan 21 04:44:41 casa systemd[1]: Condition check resulted in Process error reports when automatic reporting is enabled (file watch) being skipped.
Jan 21 04:44:41 casa kernel: [ 16.902720] EXT4-fs (sda): re-mounted. Opts: errors=remount-ro
====
END OF LOGS
====
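Since "grep error" mostly turns up scanner noise in the nginx log, a narrower filter for TLS-related lines might be easier to read. This is only a sketch: the function name tls_errors is made up for illustration, and the pattern covers just the phrases that appear in the excerpts above, so other TLS failures could be missed.

```shell
#!/bin/sh
# Hypothetical helper: print only log lines that mention a TLS problem.
# The pattern is taken from the excerpts above (SSL_accept, TLS handshaking,
# "bad certificate"); it is not an exhaustive list of TLS error strings.
tls_errors() {
    grep -Ei 'SSL_accept|TLS handshak|certificate' "$@"
}

# Example (run as root, paths as used earlier in this post):
#   tls_errors /var/log/mail.log /var/log/dovecot/*.log /var/log/syslog
```

With more than one file given, grep prefixes each hit with the file name, which shows at a glance whether Postfix, Dovecot, or both are complaining.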
The visible problem: after a basic install, before installing Let's Encrypt certificates, one can access the admin and webmail pages (after approving the exception for the self-signed SSL certs), add/edit domains, add/edit users, and even send email through the web interface to outside domains that aren't as persnickety as Google, Microsoft, and so on. Then, following a successful certbot install and Let's Encrypt certificate generation, upon backing up the self-signed certs,
# mv /etc/ssl/certs/iRedMail.crt{,.bak}
# mv /etc/ssl/private/iRedMail.key{,.bak}
then creating the symbolic links,
# ln -s /etc/letsencrypt/live/mx.mydomain.tld/fullchain.pem /etc/ssl/certs/iRedMail.crt
# ln -s /etc/letsencrypt/live/mx.mydomain.tld/privkey.pem /etc/ssl/private/iRedMail.key
the /mail and /iredadmin pages load with the SSL lock showing (suggesting that the SSL certs are making the browser happy), and one can still log in a few times, or for a few minutes. THEN all the pages start returning "Connection Timed Out" errors.
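To rule out the "bad cert/key combination" guess, the two symlinked files can be compared by their public keys. The following is a minimal sketch, assuming openssl is installed; the function name cert_matches_key is hypothetical and not part of iRedMail or certbot.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the certificate ($1) and the
# private key ($2) carry the same public key.
cert_matches_key() {
    # Public key embedded in the certificate.
    cert_pub=$(openssl x509 -in "$1" -noout -pubkey 2>/dev/null) || return 2
    # Public key derived from the private key.
    key_pub=$(openssl pkey -in "$2" -pubout 2>/dev/null) || return 2
    [ -n "$cert_pub" ] && [ "$cert_pub" = "$key_pub" ]
}

# Example (run as root, since the key file is not world-readable):
#   cert_matches_key /etc/ssl/certs/iRedMail.crt /etc/ssl/private/iRedMail.key \
#       && echo "cert and key match" || echo "MISMATCH or unreadable file"
```

A match here would only show that the pair is consistent; it says nothing about whether each service has actually loaded the new files.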
I wondered if, given the inconsistent behavior, perhaps a service was stopping, so I ran this:
# systemctl status
State: running
Jobs: 0 queued
Failed: 0 units
Since: Sat 2023-01-21 04:44:38 UTC; 2h 22min ago
CGroup: /
├─user.slice
│ └─user-0.slice
│ ├─session-4.scope
│ │ ├─ 754 /bin/login -p --
│ │ ├─ 2640 -bash
│ │ └─10887 systemctl status
│ └─user@0.service …
│ └─init.scope
│ ├─2632 /lib/systemd/systemd --user
│ └─2635 (sd-pam)
├─init.scope
│ └─1 /sbin/init
└─system.slice
├─fail2ban.service
│ └─700 /usr/bin/python3 /usr/bin/fail2ban-server -xf start
├─haveged.service
│ └─504 /usr/sbin/haveged --Foreground --verbose=1 -w 1024
├─clamav-daemon.service
│ └─706 /usr/sbin/clamd --foreground=true
├─systemd-networkd.service
│ └─526 /lib/systemd/systemd-networkd
├─amavis.service
│ ├─10741 /usr/sbin/amavisd-new (master)
│ ├─10745 /usr/sbin/amavisd-new (virgin child)
│ ├─10746 /usr/sbin/amavisd-new (virgin child)
│ ├─10747 /usr/sbin/amavisd-new (virgin child)
│ ├─10748 /usr/sbin/amavisd-new (virgin child)
│ ├─10749 /usr/sbin/amavisd-new (virgin child)
│ ├─10750 /usr/sbin/amavisd-new (virgin child)
│ ├─10751 /usr/sbin/amavisd-new (virgin child)
│ └─10752 /usr/sbin/amavisd-new (virgin child)
├─systemd-udevd.service
│ └─439 /lib/systemd/systemd-udevd
├─cron.service
│ └─671 /usr/sbin/cron -f
├─nginx.service
│ ├─716 nginx: master process /usr/sbin/nginx -g daemon on; master_process on;
│ └─717 nginx: worker process
├─mariadb.service
│ └─846 /usr/sbin/mysqld
├─polkit.service
│ └─687 /usr/lib/policykit-1/polkitd --no-debug
├─networkd-dispatcher.service
│ └─684 /usr/bin/python3 /usr/bin/networkd-dispatcher --run-startup-triggers
├─ModemManager.service
│ └─737 /usr/sbin/ModemManager
├─systemd-journald.service
│ └─394 /lib/systemd/systemd-journald
├─atd.service
│ └─696 /usr/sbin/atd -f
├─unattended-upgrades.service
│ └─743 /usr/bin/python3 /usr/share/unattended-upgrades/unattended-upgrade-shutdown --wait-for-signal
├─ssh.service
│ └─841 sshd: /usr/sbin/sshd -D [listener] 0 of 10-100 startups
├─snapd.service
│ └─689 /usr/lib/snapd/snapd
├─clamav-freshclam.service
│ └─809 /usr/bin/freshclam -d --foreground=true
├─rsyslog.service
│ └─688 /usr/sbin/rsyslogd -n -iNONE
├─netdata.service
│ ├─2001 /opt/netdata/bin/srv/netdata -P /opt/netdata/var/run/netdata/netdata.pid -D
│ ├─2030 /opt/netdata/bin/srv/netdata --special-spawn-server
│ ├─2202 /opt/netdata/usr/libexec/netdata/plugins.d/go.d.plugin 3
│ ├─2204 /opt/netdata/usr/libexec/netdata/plugins.d/ebpf.plugin 3
│ ├─2210 /opt/netdata/usr/libexec/netdata/plugins.d/apps.plugin 3
│ ├─2216 /usr/bin/python3 /opt/netdata/usr/libexec/netdata/plugins.d/python.d.plugin 3
│ └─9128 bash /opt/netdata/usr/libexec/netdata/plugins.d/tc-qos-helper.sh 3
├─iredadmin.service
│ ├─718 /usr/bin/uwsgi --ini /opt/www/iredadmin/rc_scripts/uwsgi/debian.ini --pidfile /var/run/iredadmin/iredadmin.pid
│ ├─960 /usr/bin/uwsgi --ini /opt/www/iredadmin/rc_scripts/uwsgi/debian.ini --pidfile /var/run/iredadmin/iredadmin.pid
│ ├─961 /usr/bin/uwsgi --ini /opt/www/iredadmin/rc_scripts/uwsgi/debian.ini --pidfile /var/run/iredadmin/iredadmin.pid
│ ├─962 /usr/bin/uwsgi --ini /opt/www/iredadmin/rc_scripts/uwsgi/debian.ini --pidfile /var/run/iredadmin/iredadmin.pid
│ ├─964 /usr/bin/uwsgi --ini /opt/www/iredadmin/rc_scripts/uwsgi/debian.ini --pidfile /var/run/iredadmin/iredadmin.pid
│ └─966 /usr/bin/uwsgi --ini /opt/www/iredadmin/rc_scripts/uwsgi/debian.ini --pidfile /var/run/iredadmin/iredadmin.pid
├─system-postfix.slice
│ └─postfix@-.service
│ ├─ 1981 /usr/lib/postfix/sbin/master -w
│ ├─ 1983 qmgr -l -t unix -u
│ ├─ 3852 tlsmgr -l -t unix -u
│ ├─ 7727 pickup -l -t unix -u -o content_filter=smtp-amavis:[127.0.0.1]:10026
│ └─10064 showq -t unix -u
├─dovecot.service
│ ├─814 /usr/sbin/dovecot -F
│ ├─871 dovecot/lmtp -L
│ ├─872 dovecot/anvil
│ ├─873 dovecot/log
│ ├─874 dovecot/lmtp -L
│ ├─875 dovecot/lmtp -L
│ ├─876 dovecot/lmtp -L
│ ├─877 dovecot/lmtp -L
│ ├─878 dovecot/config
│ └─883 dovecot/stats
├─iredapd.service
│ └─1108 /usr/bin/python3 /opt/iredapd/iredapd.py
├─systemd-resolved.service
│ └─3672 /lib/systemd/systemd-resolved
├─php7.4-fpm.service
│ ├─ 686 php-fpm: master process (/etc/php/7.4/fpm/php-fpm.conf)
│ ├─9690 php-fpm: pool inet
│ ├─9692 php-fpm: pool inet
│ ├─9694 php-fpm: pool inet
│ ├─9697 php-fpm: pool inet
│ ├─9699 php-fpm: pool inet
│ └─9892 php-fpm: pool inet
├─dbus.service
│ └─672 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
├─systemd-timesyncd.service
│ └─491 /lib/systemd/systemd-timesyncd
├─system-getty.slice
│ └─getty@tty1.service
│ └─815 /sbin/agetty -o -p -- \u --noclear tty1 linux
├─mlmmjadmin.service
│ ├─719 /usr/bin/uwsgi --ini /opt/mlmmjadmin/rc_scripts/uwsgi/debian.ini --pidfile /var/run/mlmmjadmin/mlmmjadmin.pid
│ ├─898 /usr/bin/uwsgi --ini /opt/mlmmjadmin/rc_scripts/uwsgi/debian.ini --pidfile /var/run/mlmmjadmin/mlmmjadmin.pid
│ ├─899 /usr/bin/uwsgi --ini /opt/mlmmjadmin/rc_scripts/uwsgi/debian.ini --pidfile /var/run/mlmmjadmin/mlmmjadmin.pid
│ ├─900 /usr/bin/uwsgi --ini /opt/mlmmjadmin/rc_scripts/uwsgi/debian.ini --pidfile /var/run/mlmmjadmin/mlmmjadmin.pid
│ ├─901 /usr/bin/uwsgi --ini /opt/mlmmjadmin/rc_scripts/uwsgi/debian.ini --pidfile /var/run/mlmmjadmin/mlmmjadmin.pid
│ └─902 /usr/bin/uwsgi --ini /opt/mlmmjadmin/rc_scripts/uwsgi/debian.ini --pidfile /var/run/mlmmjadmin/mlmmjadmin.pid
└─systemd-logind.service
└─693 /lib/systemd/systemd-logind
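The full tree is long to scan by eye. A shorter check over just the mail-related units could look like the sketch below. The function name report_services is made up for illustration; the unit names in the example are the ones visible in the output above.

```shell
#!/bin/sh
# Hypothetical helper: print one line per unit saying whether systemd
# currently considers it active.
report_services() {
    for svc in "$@"; do
        if systemctl is-active --quiet "$svc"; then
            echo "$svc: active"
        else
            echo "$svc: NOT active"
        fi
    done
}

# Example, using unit names from the status output above:
#   report_services nginx php7.4-fpm postfix dovecot mariadb amavis iredapd iredadmin fail2ban
```

Running it once while the pages load and again after the timeouts start would show whether any unit actually changes state between the two.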
Then I thought maybe the machine was running out of memory, so I resized my Linode from 4GB/2 cores to 8GB/4 cores. No change.
Next I read up on and tried various tweaks of the /etc/hosts file. It may have mistakes, BUT wouldn't they either work or not, versus work for a little while and then stop?
Then I scoured back and forth through the DNS records: DMARC, DKIM, SPF, MX, A. I just can't see anything wrong, and even if there is, again, why would it work fine for a bit and then fail? I'm sure there are ways, but I'm just not guessing what they might be.
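To pin down exactly when "works" flips to "fails", and what kind of failure it is, a timestamped probe from the client side might help. This is a sketch that assumes curl is installed on the client. The function names are hypothetical, and the descriptions are my own short paraphrases of curl's documented exit codes. A timeout would more likely mean packets are being dropped somewhere, whereas a stopped service would more typically show up as a refused connection.

```shell
#!/bin/sh
# Hypothetical helper: turn a curl exit code into a short description.
describe_curl_exit() {
    case "$1" in
        0)  echo "ok" ;;
        6)  echo "dns: could not resolve host" ;;
        7)  echo "connect failed (for example, connection refused)" ;;
        28) echo "timeout: no answer in time" ;;
        35) echo "tls: handshake failed" ;;
        60) echo "tls: certificate not trusted" ;;
        *)  echo "other: curl exit code $1" ;;
    esac
}

# Hypothetical helper: fetch a URL once and log a timestamped result.
# The 10 second cap makes a hang show up as exit code 28.
probe_once() {
    curl --silent --output /dev/null --max-time 10 "$1"
    rc=$?
    echo "$(date -u +%FT%TZ) $1 $(describe_curl_exit "$rc")"
}

# Example: probe every 30 seconds from the machine running the browser.
#   while :; do probe_once https://mx.mydomain.tld/mail/; sleep 30; done
```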
In the process I rebuilt the node from scratch many times, using Ubuntu 18.04, 20.04, and 22.04. I tried different sources for certbot, and several restarts ago I tried all sorts of config file changes I had read about on this forum and elsewhere. Many times I ran the command below and it reported no errors. Nevertheless, I ran it again an hour or so ago and got this. A clue, perhaps?
# amavisd-new testkeys
TESTING#1 MyDomain.TLD: dkim._domainkey.mydomain.tld => fail (bad RSA signature)
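One way to look into this failure might be to compare the public key published in DNS with the one amavisd holds locally. If they differ, the DNS record may be stale (for example, left over from an earlier rebuild). The sketch below assumes dig is installed and that only one signing domain is configured. The function name dkim_pubkey is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read a DKIM TXT record on stdin (possibly split into
# several quoted strings, as both dig and amavisd print it) and output only
# the base64 public key that follows "p=".
dkim_pubkey() {
    # Join the pieces: drop quotes, whitespace, and parentheses.
    tr -d '" \t\n()' |
        # Keep what follows the p= tag, up to the end of the base64 text.
        sed -nE 's#^(.*;)?p=([A-Za-z0-9+/]*=*).*$#\2#p'
}

# Example (assumes a single DKIM key; with several, the records would run
# together and only the last key would be printed):
#   dig +short TXT dkim._domainkey.mydomain.tld | dkim_pubkey > /tmp/dns.key
#   amavisd-new showkeys | dkim_pubkey > /tmp/local.key
#   cmp /tmp/dns.key /tmp/local.key && echo "DNS matches local key"
```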
I put up iRedMail 1.4.0 a few months ago on an Ubuntu instance, with a Let's Encrypt cert and five domains. It took all of two days and ran trouble-free until Jan 2, then suddenly started rejecting SMTP connections. I came to find that one of my registrars had reset the DNS records for three of the domains the mail server had been serving, so at first I blamed that for the crash, even though the troubles weren't limited to those three. Then I thought something in the network policy at Linode may have changed, perhaps via their Network Helper feature, which is on by default and actually changes config files on client machines, or perhaps it was attributable to some aspect of the gestapo ramrodding of IPv6 services on the world. But, again, none of that seemed to pan out.
So, as of this writing, I've been at this night and day for about 20 days now. I can't shake the feeling there is something super simple which I am daftly missing, perhaps due in part to a dose of fatigue, adding to my already generous endowment of innate ineptitude.
I'd be most grateful if someone would be so kind as to enlighten me. Thanks to all in advance!