SpamAssassin is a great piece of software. However, the documentation is not up to par. It took me some trial and error to find out how to train the Bayes filter. This is what I found out. You have to feed sa-learn a file that lists one folder per line (without the prefix that the sa-learn manual calls for).
sa-learn --spam -C /etc/mail/spamassassin --folders=/path/to/spam-folders &
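For illustration, here is a minimal sketch of how such a folder list could be put together. The list location (/tmp/spam-folders) and the two Maildir paths are made up for the example and are not from my setup; substitute your own.

```shell
# Hypothetical paths: replace with your own spam folders.
list_file=/tmp/spam-folders

# Write one folder per line, with no "spam:" or "ham:" prefix.
printf '%s\n' \
    /home/alice/Maildir/.Spam/cur \
    /home/bob/Maildir/.Spam/cur \
    > "$list_file"

# Show the result so it can be checked by eye.
cat "$list_file"
```

The path of the resulting file is what gets passed to --folders. The same approach should work for ham, with a second list and --ham instead of --spam.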
| SA needs to have learned at least 200 ham and 200 spam messages before the Bayes filter operates. |
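To see how far along the training is, the counters can be read from sa-learn --dump magic. The helper below is only a sketch: the name bayes_counts is mine, and it assumes the usual dump layout in which the count is the third column and the line ends in nspam or nham. Check the output on your own system before relying on it.

```shell
# Print the spam and ham counts from `sa-learn --dump magic` output read on stdin.
# Assumes the count is the third column and the label is the last field.
bayes_counts() {
    awk '$NF == "nspam" { print "spam", $3 }
         $NF == "nham"  { print "ham", $3 }'
}

# Usage, if sa-learn is installed:
#   sa-learn --dump magic | bayes_counts
```

Both numbers should be at 200 or more before the Bayes filter kicks in.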
The Bayes path, that is, the place where the token database is stored, can be configured in /etc/mail/spamassassin/local.cf:
bayes_path /var/spamassassin/bayes-db/bayes
Now the spamassassin daemon knows where to look.
| Make sure the spamassassin user is able to read the database files. |
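A quick way to check this is sketched below. The function name check_bayes_readable is hypothetical. It relies on SpamAssassin treating bayes_path as a filename prefix (the database files are named bayes_toks, bayes_seen and so on), and it tests readability for whichever user runs it, so run it as the spamassassin user.

```shell
# Report any Bayes database file the current user cannot read.
# $1 is the bayes_path value, which SpamAssassin uses as a filename prefix.
check_bayes_readable() {
    status=0
    for f in "$1"_*; do
        [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
        if [ ! -r "$f" ]; then
            echo "not readable: $f"
            status=1
        fi
    done
    return "$status"
}

# Usage, run as the spamassassin user:
#   check_bayes_readable /var/spamassassin/bayes-db/bayes
```

No output and a zero exit status mean every file found was readable. Note that it also stays silent when no database files exist yet.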