1

Topic: policyd stops responding

==== Required information ====
- iRedMail version: newest
- Store mail accounts in which backend (LDAP/MySQL/PGSQL): MySQL
- Linux/BSD distribution name and version: CentOS 6.5
- Related log if you're reporting an issue: /var/log/maillog, cbpolicyd
====

I've had to remove policyd from the main.cf config because, after a short period of time, it appears to stop responding and interrupts the flow of mail. I suspect it isn't spawning enough processes to handle the load, but I'm not positive. At log level 4 (most verbose), /var/log/cbpolicyd consists almost entirely of lines like this:

[2014/08/08-10:31:56 - 25141] [CBPOLICYD] INFO: Got request #4 (pipelined)
[2014/08/08-10:31:57 - 18227] [CBPOLICYD] INFO: Got request #10 (pipelined)
[2014/08/08-10:31:57 - 25141] [CBPOLICYD] INFO: Got request #5 (pipelined)
[2014/08/08-10:31:58 - 25725] [CBPOLICYD] INFO: Got request #4 (pipelined)
[2014/08/08-10:31:59 - 25141] [CBPOLICYD] INFO: Got request #6 (pipelined)
[2014/08/08-10:31:59 - 25266] [CBPOLICYD] INFO: Got request #2 (pipelined)
[2014/08/08-10:31:59 - 25141] [CBPOLICYD] INFO: Got request #7 (pipelined)
[2014/08/08-10:32:00 - 23590] [CBPOLICYD] INFO: Got request #8 (pipelined)
[2014/08/08-10:32:01 - 25176] [CBPOLICYD] INFO: Got request #2 (pipelined)
[2014/08/08-10:32:02 - 24343] [CBPOLICYD] INFO: Got request #8 (pipelined)
[2014/08/08-10:32:05 - 21324] [CBPOLICYD] INFO: Got request #4 (pipelined)
[2014/08/08-10:32:05 - 18224] [CBPOLICYD] INFO: Got request #4 (pipelined)
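
To see how the requests in a log like this are spread across worker processes, the lines can be tallied per PID. This is only a rough sketch: the regular expression is inferred from the sample lines above rather than from any documented cbpolicyd log format, and the function name `requests_per_worker` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from lines such as:
# [2014/08/08-10:31:56 - 25141] [CBPOLICYD] INFO: Got request #4 (pipelined)
LINE_RE = re.compile(
    r"^\[[^\]]+ - (?P<pid>\d+)\] \[CBPOLICYD\] INFO: Got request #(?P<seq>\d+)"
)


def requests_per_worker(lines):
    """Return a Counter mapping worker PID -> number of 'Got request' lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("pid")] += 1
    return counts


# A few lines copied from the log excerpt above.
SAMPLE = """\
[2014/08/08-10:31:56 - 25141] [CBPOLICYD] INFO: Got request #4 (pipelined)
[2014/08/08-10:31:57 - 18227] [CBPOLICYD] INFO: Got request #10 (pipelined)
[2014/08/08-10:31:57 - 25141] [CBPOLICYD] INFO: Got request #5 (pipelined)
"""

if __name__ == "__main__":
    for pid, count in requests_per_worker(SAMPLE.splitlines()).most_common():
        print(pid, count)
```

To run it against the real log, pass an open file instead of the sample, for example `requests_per_worker(open("/var/log/cbpolicyd"))`. Many distinct PIDs with only a few requests each would suggest that most workers are held by connections that are largely idle.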

And the output of `grep ':10031' /var/log/maillog` is:

Aug  8 10:34:18 mail01 postfix/smtpd[28812]: warning: problem talking to server 127.0.0.1:10031: Connection timed out
Aug  8 10:34:18 mail01 postfix/smtpd[27695]: warning: problem talking to server 127.0.0.1:10031: Connection timed out
Aug  8 10:34:19 mail01 postfix/smtpd[27687]: warning: problem talking to server 127.0.0.1:10031: Connection timed out
Aug  8 10:34:19 mail01 postfix/smtpd[29422]: warning: connect to 127.0.0.1:10031: Connection timed out
Aug  8 10:34:19 mail01 postfix/smtpd[29422]: warning: problem talking to server 127.0.0.1:10031: Connection timed out
Aug  8 10:34:19 mail01 postfix/smtpd[27341]: warning: problem talking to server 127.0.0.1:10031: Connection reset by peer
Aug  8 10:34:19 mail01 postfix/smtpd[28207]: warning: connect to 127.0.0.1:10031: Connection timed out
Aug  8 10:34:19 mail01 postfix/smtpd[28207]: warning: problem talking to server 127.0.0.1:10031: Connection timed out
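
The warnings above come in two flavours: "connect to" (the connection could not be established) and "problem talking to server" (the exchange failed or timed out). Counting them per minute can show when the failures start and which kind dominates. The sketch below assumes the syslog layout seen in the excerpt; the pattern and the name `tally_policy_warnings` are illustrative, not part of Postfix.

```python
import re
from collections import Counter

# Pattern inferred from the maillog excerpt above; port 10031 is the
# policy service address used in this thread.
WARN_RE = re.compile(
    r"^(?P<minute>\w{3}\s+\d+ \d{2}:\d{2}):\d{2} \S+ postfix/smtpd\[\d+\]: "
    r"warning: (?P<op>connect to|problem talking to server) "
    r"127\.0\.0\.1:10031: (?P<err>.+)$"
)


def tally_policy_warnings(lines):
    """Count warnings grouped by (minute, operation, error text)."""
    counts = Counter()
    for line in lines:
        match = WARN_RE.match(line.rstrip("\n"))
        if match:
            key = (match.group("minute"), match.group("op"), match.group("err"))
            counts[key] += 1
    return counts


# Two lines copied from the excerpt above.
SAMPLE = """\
Aug  8 10:34:18 mail01 postfix/smtpd[28812]: warning: problem talking to server 127.0.0.1:10031: Connection timed out
Aug  8 10:34:19 mail01 postfix/smtpd[29422]: warning: connect to 127.0.0.1:10031: Connection timed out
"""

if __name__ == "__main__":
    for (minute, op, err), count in sorted(tally_policy_warnings(SAMPLE.splitlines()).items()):
        print(minute, "|", op, "|", err, "|", count)
```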

The relevant cluebringer.conf config sections are:

# Protocols to load
protocols=<<EOT
Postfix
EOT

# Modules to load
modules=<<EOT
Core
AccessControl
#CheckHelo
#CheckSPF
Greylisting
Quotas
EOT

min_servers=8
min_spare_servers=8
max_spare_servers=16
max_servers=300
max_requests=2000

I'm not sure which process hands mail over to policyd, so I don't know how to make sure it is spawning enough listening processes. Can you offer any advice? Once I put the directive 'check_policy_service inet:127.0.0.1:10031' in 'smtpd_end_of_data_restrictions' and at the end of 'smtpd_recipient_restrictions' and run `postfix reload`, I begin receiving timeouts around four or five minutes later.
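
One way to check whether the policy service still answers, independently of Postfix, is to send it a single request by hand. In the Postfix policy delegation protocol a request is a series of `name=value` lines ended by an empty line, and the reply contains an `action=` line. The sketch below is a minimal probe under those assumptions; the addresses and attribute values are placeholders, and the helper names are made up for illustration.

```python
import socket
import time


def build_request(attrs):
    """Serialize attributes as name=value lines, ended by an empty line."""
    return "".join(f"{name}={value}\n" for name, value in attrs.items()) + "\n"


def parse_response(text):
    """Return the value of the action= line from a policy reply."""
    for line in text.splitlines():
        if line.startswith("action="):
            return line[len("action="):]
    raise ValueError("no action= line in reply")


def probe(host="127.0.0.1", port=10031, timeout=5.0):
    """Send one request; return (action, elapsed seconds).

    Raises OSError (including socket timeouts) if the service does not answer.
    """
    request = build_request({
        "request": "smtpd_access_policy",
        "protocol_state": "RCPT",
        "protocol_name": "ESMTP",
        "client_address": "192.0.2.1",        # placeholder test address
        "client_name": "probe.example.com",   # placeholder
        "helo_name": "probe.example.com",     # placeholder
        "sender": "probe@example.com",        # placeholder
        "recipient": "postmaster@example.com",  # placeholder
    })
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(request.encode("ascii"))
        data = b""
        while b"\n\n" not in data:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return parse_response(data.decode("ascii", "replace")), time.monotonic() - start
```

Calling `probe()` every few seconds after `postfix reload` would show whether response times climb before the timeouts begin. Note that with the Greylisting or Quotas modules loaded, a probe request may leave tracking records for the placeholder addresses.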

2

Re: policyd stops responding

*) How many emails per day does your server process?
*) Cluebringer should run well on a busy server, so I'm not sure why this happened. Do you update your OS regularly? Some Perl modules it depends on may need updating. Either way, we still need to debug Cluebringer to figure it out.

3 (edited by jford 2014-08-09 00:09:17)

Re: policyd stops responding

ZhangHuangbin wrote:

*) How many emails per day does your server process?
*) Cluebringer should run well on a busy server, so I'm not sure why this happened. Do you update your OS regularly? Some Perl modules it depends on may need updating. Either way, we still need to debug Cluebringer to figure it out.

Per the stats in the control panel, we usually process around 150,000 emails per day (sent and received).  I'm not sure if that is absolutely accurate, as grepping the logs indicates around 212,000 messages passed 'CLEAN', 175,000 were marked 'SPAM' and blocked, and around 1,035,000 'NOQUEUE' messages are in the logs.  The logs are rotated daily.  The OS is fully updated (although we do need to reboot for a recent kernel update). 

I tested further, and began to get the connection errors again in /var/log/maillog.  The number of processes listed for cbpolicyd was 302, and a max of 300 was set in the config.  I've upped it to 500 and am retesting.  I'm just not sure what a good guideline is to set this number.  Should it be pegged to the number of smtp-amavis processes, for example?  I'm presuming that when postfix gets an error talking to policyd it simply defers the message and does not drop it, right?

4

Re: policyd stops responding

I haven't tested Cluebringer under heavy traffic, so I suggest you report this on the Cluebringer mailing list to get support directly from its developers:
http://wiki.policyd.org/support

jford wrote:

Should it be pegged to the number of smtp-amavis processes, for example?

Only Postfix talks to Cluebringer; Amavisd-new does not.
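
Since Postfix is the only client, one rough way to size `max_servers` is against the number of smtpd processes Postfix may run, rather than against Amavisd-new. The sketch below assumes that each smtpd process holds at most one connection to the policy service at a time; that is an assumption, not something confirmed in this thread. The smtpd limit comes from `default_process_limit` in main.cf (100 unless changed) or the maxproc column of the smtpd line in master.cf. The function names and the 10% headroom are hypothetical.

```python
def suggested_max_servers(smtpd_process_limit, headroom_pct=10):
    """Return the smtpd process limit plus a percentage of headroom.

    Assumes at most one policy connection per smtpd process (unverified).
    """
    # Ceiling division with integers avoids floating-point rounding surprises.
    extra = -(-smtpd_process_limit * headroom_pct // 100)
    return smtpd_process_limit + extra


def is_sufficient(max_servers, smtpd_process_limit, headroom_pct=10):
    """True if max_servers covers the smtpd limit plus headroom."""
    return max_servers >= suggested_max_servers(smtpd_process_limit, headroom_pct)


if __name__ == "__main__":
    # 300 is the max_servers value from the cluebringer.conf posted above;
    # the smtpd limits tried here are examples, not values from this server.
    for limit in (100, 300, 500):
        print(limit, suggested_max_servers(limit), is_sufficient(300, limit))
```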

jford wrote:

I'm presuming that when postfix gets an error talking to policyd it simply defers the message and does not drop it, right?

Yes.