1 (edited by lol7344 2020-08-19 20:48:45)

Topic: SpamAssassin training not working

==== REQUIRED BASIC INFO OF YOUR IREDMAIL SERVER ====
- iRedMail version (check /etc/iredmail-release): 1.2.1
- Deployed with iRedMail Easy or the downloadable installer? downloadable installer
- Linux/BSD distribution name and version: Ubuntu 20.04 LTS
- Store mail accounts in which backend (LDAP/MySQL/PGSQL): MariaDB
- Web server (Apache or Nginx): Apache
- Manage mail accounts with iRedAdmin-Pro? No
- [IMPORTANT] Related original log or error message is required if you're experiencing an issue.
====

Hello, I have a problem with SpamAssassin.
I followed this guide: https://docs.iredmail.org/dovecot.imapsieve.html
And set up everything as it says.

I confirmed that everything works, since when I move an email to "Spam", "/var/log/dovecot/imap.log" reports:

Jan 31 21:10:42 c7 dovecot: imap(<email>): sieve: pipe action: piped message to program `imapsieve_copy'
Jan 31 21:10:42 c7 dovecot: imap(<email>): sieve: left message in mailbox 'Junk'
Jan 31 21:10:42 c7 dovecot: imap(<email>): expunge: box=INBOX, uid=7, msgid=, size=7805, from=<email>, subject=<subject>

And "/var/log/syslog" reports:

Jan 31 05:03:16 mail scan_reported_mails: [SPAM] Learned tokens from 1 message(s) (1 message(s) examined)

However, when I send exactly the same email again, it is still delivered to the "Inbox" folder.
I tried moving it to spam multiple times, but the situation does not improve.


Any idea why?
Thanks!


P.S.: The email is a spammy email that contains phishing links and grammar errors, so it shouldn't be too difficult to identify.
Sending the GTUBE ( XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-S.........) in an otherwise empty mail body gets it delivered to the Spam folder, so SpamAssassin and Amavisd are working correctly.


2

Re: SpamAssassin training not working

The Bayesian classifier can only score new messages after it has learned at least 200 known spam messages and 200 known ham messages.

You may want to check this tutorial also:
https://docs.iredmail.org/store.spamass … n.sql.html
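
The threshold described above can be sketched in a few lines of Python. This is only an illustration, assuming the stock SpamAssassin defaults of 200 for `bayes_min_spam_num` and `bayes_min_ham_num`; the function name and the sample counts are made up for the example.

```python
# Assumed stock defaults for SpamAssassin's bayes_min_spam_num and
# bayes_min_ham_num; a local configuration may override them.
BAYES_MIN_SPAM_NUM = 200
BAYES_MIN_HAM_NUM = 200


def bayes_is_active(spam_count: int, ham_count: int,
                    min_spam: int = BAYES_MIN_SPAM_NUM,
                    min_ham: int = BAYES_MIN_HAM_NUM) -> bool:
    """Return True once BOTH counters have reached their minimums.

    Hypothetical helper for illustration; it is not part of SpamAssassin.
    """
    return spam_count >= min_spam and ham_count >= min_ham


# Hypothetical counts: plenty of ham, but far too little spam learned.
print(bayes_is_active(10, 500))
# Both minimums reached.
print(bayes_is_active(200, 200))
```

Both counters matter: a large ham count does not help while the spam count is still below its minimum.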

3

Re: SpamAssassin training not working

ZhangHuangbin wrote:

The Bayesian classifier can only score new messages after it has learned at least 200 known spam messages and 200 known ham messages.

You may want to check this tutorial also:
https://docs.iredmail.org/store.spamass … n.sql.html

Understood, thanks for the clarification.
I have enabled storing the Bayes data in MySQL. However, after manually running

# sa-learn --spam --username=amavis <spammy_email>

Learned tokens from 1 message(s) (1 message(s) examined)

I noticed that, in MySQL, both "spam_count" and "ham_count" values increase:

MariaDB [sa_bayes]> SELECT username,spam_count,ham_count FROM bayes_vars;
+----------+------------+-----------+
| username | spam_count | ham_count |
+----------+------------+-----------+
| amavis   |          4 |         4 |
+----------+------------+-----------+

Is this how it's meant to work?
Thanks!

4

Re: SpamAssassin training not working

lol7344 wrote:

Is this how it's meant to work?

No.
But are you sure there's only one message processed by SpamAssassin while testing? Maybe your server receives a message at the same time?
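
One way to check this is to take the `bayes_vars` row before and after a single `sa-learn --spam` run on an otherwise idle server and compare the two. A rough Python sketch of that comparison follows; the function name and the "before" values are assumptions for illustration, and only the 4/4 row comes from the post above.

```python
def counter_deltas(before: dict, after: dict) -> dict:
    """Return how much each bayes_vars counter changed between two snapshots."""
    return {key: after[key] - before[key]
            for key in ("spam_count", "ham_count")}


# "before" is a hypothetical snapshot taken just before the sa-learn run;
# "after" matches the row quoted in the previous post.
before = {"spam_count": 3, "ham_count": 3}
after = {"spam_count": 4, "ham_count": 4}

deltas = counter_deltas(before, after)
print(deltas)
```

If only one message was learned as spam and nothing else was processed in between, only `spam_count` would be expected to move.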

5

Re: SpamAssassin training not working

ZhangHuangbin wrote:
lol7344 wrote:

Is this how it's meant to work?

No.
But are you sure there's only one message processed by SpamAssassin while testing? Maybe your server receives a message at the same time?

Yeah, I'm pretty sure no other email is getting processed. I've been looking around for the past few days, but I really don't know what to try anymore.

6

Re: SpamAssassin training not working

Try moving some spam messages from your Junk folder to INBOX (or any other folder except Junk/Spam), then wait a few minutes and check again.

7 (edited by lol7344 2020-10-03 05:11:22)

Re: SpamAssassin training not working

Ok, quick update.
After a few weeks, the MySQL situation is this:

+----------+------------+-----------+
| username | spam_count | ham_count |
+----------+------------+-----------+
| amavis   |          4 |       243 |
+----------+------------+-----------+

I am sure not a single user has moved spam out of the SPAM folder. The only possible explanation is that every newly received email is counted as ham, instead of only the ones moved from SPAM to INBOX (or any other folder except TRASH).
This would also explain why, when I moved new spam emails into the Spam folder, "spam_count" increased, but so did "ham_count".
Yet, I'm unable to find any possible solution to this problem.

8

Re: SpamAssassin training not working

Wait until "spam_count" reaches 200 or more; the Bayes system will then start tagging new spam.

9

Re: SpamAssassin training not working

Yes, I understood that. But I think there's something else I didn't get.
AFAIK, only emails that I explicitly move from "SPAM" to "INBOX" should get flagged as "ham", and only emails I move from "INBOX" to "SPAM" should get flagged as spam.
This is working correctly for spam emails (the number only increases if I manually move an email into the SPAM folder), but all new emails are automatically flagged as ham (I haven't moved 200+ emails out of the spam folder, so the value shouldn't be that high).
Unless it automatically flags all new non-spam emails as ham, in which case everything is working correctly.
I'm still wondering, however, why both the "ham" and "spam" counts increase when I move an email from inbox to spam.
It means that:

1. I receive the email
2. the email is automatically flagged as HAM
3. I move the email into the SPAM folder
4. the email is now also tagged as SPAM

So, both counters increase.
Isn't this really confusing? Spam emails should not also be flagged as HAM, unless I'm missing something...
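
A possible explanation, not confirmed anywhere in this thread: SpamAssassin has a `bayes_auto_learn` option, enabled by default, which feeds very low-scoring messages to Bayes as ham and very high-scoring ones as spam during normal scanning. The Python sketch below is a simplified model of that behaviour, assuming the stock thresholds (`bayes_auto_learn_threshold_nonspam` 0.1 and `bayes_auto_learn_threshold_spam` 12.0). The function name is made up, and the real logic uses a separate auto-learn score plus extra conditions for spam.

```python
from typing import Optional

# Assumed stock defaults; a local configuration may override them.
AUTOLEARN_NONSPAM_THRESHOLD = 0.1   # bayes_auto_learn_threshold_nonspam
AUTOLEARN_SPAM_THRESHOLD = 12.0     # bayes_auto_learn_threshold_spam


def autolearn_class(score: float, enabled: bool = True) -> Optional[str]:
    """Simplified model of which class, if any, a scanned message is
    auto-learned as. Hypothetical helper, not SpamAssassin code."""
    if not enabled:
        return None
    if score < AUTOLEARN_NONSPAM_THRESHOLD:
        return "ham"
    if score > AUTOLEARN_SPAM_THRESHOLD:
        return "spam"
    return None  # scores in between are not auto-learned


for score in (-1.0, 5.0, 15.0):
    print(score, autolearn_class(score))
```

Under this model, a message that scores very low on arrival would be counted as ham before anyone moves it, which would fit a "ham_count" that grows with ordinary incoming mail.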