This step-by-step tutorial will explain how to establish statistics and fuzzy storage replication on FreeBSD. Configuration procedures for other operating systems are very similar.
This tutorial describes a centralized model when Bayesian classifier and fuzzy storage learning is performed on one host and distributed among Rspamd installations at remote locations. For simplicity, the tutorial only covers replication to one slave
database for each of the masters
.
To accomplish this we should replicate bayes and fuzzy storage backend data to the remote host. As the rest of the Redis cache should not be mirrored we need to use dedicated Redis instances. Probably separating bayes and fuzzy is good idea as well.
We will create 3 Redis instances on both master
and slave
sides: bayes
, fuzzy
and redis
for the rest of the cache:
instance | Redis socket |
---|---|
redis |
localhost:6379 |
bayes |
localhost:6378 |
fuzzy |
localhost:6377 |
First install the databases/redis
package:
# pkg install redis
Create separate working directories for instances:
# cd /var/db/redis && mkdir bayes fuzzy && chown redis bayes fuzzy
To enable redis
and its specific instances add the following lines to the /etc/rc.conf
:
redis_enable="YES"
redis_profiles="redis bayes fuzzy"
Enable Redis logs rotation by creating a newsyslog configuration file /usr/local/etc/newsyslog.conf.d/redis.newsyslog.conf
:
# logfilename [owner:group] mode count size when flags [/pid_file] [sig_num]
/var/log/redis/redis.log redis:redis 644 5 100 * J
/var/log/redis/bayes.log redis:redis 644 5 100 * J
/var/log/redis/fuzzy.log redis:redis 644 5 100 * J
Create the default configuration on both master
and slave
(common for all of the instances):
# cp /usr/local/etc/redis.conf.sample /usr/local/etc/redis.conf
Since it is not safe to expose Redis to external interfaces, configure Redis to listen on loopbacks and use stunnel
to create TLS tunnels between slave
and master
hosts. As this approach has security vulnerabilities as well, you should not setup replication if the slave host’s users cannot be trusted.
Set listening sockets and memory limit (optional) as follows:
# diff -u1 /usr/local/etc/redis.conf.sample /usr/local/etc/redis.conf
--- /usr/local/etc/redis.conf.sample 2016-11-03 06:30:49.000000000 +0300
+++ /usr/local/etc/redis.conf 2016-11-27 13:10:43.671584000 +0300
@@ -60,3 +60,3 @@
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-bind 127.0.0.1
+bind 127.0.0.1 ::1
@@ -537,2 +537,3 @@
# maxmemory <bytes>
+maxmemory 200M
Configure the redis
instance on both master
and slave
in a manner to keep it compatible with a single instance configuration. So if you already had a single instance database it will keep working.
/usr/local/etc/redis-redis.conf
:
include /usr/local/etc/redis.conf
/usr/local/etc/redis-bayes.conf
:
include /usr/local/etc/redis.conf
port 6378
pidfile /var/run/redis/bayes.pid
logfile /var/log/redis/bayes.log
dbfilename bayes.rdb
dir /var/db/redis/bayes/
maxmemory 600M
/usr/local/etc/redis-fuzzy.conf
:
include /usr/local/etc/redis.conf
port 6377
pidfile /var/run/redis/fuzzy.pid
logfile /var/log/redis/fuzzy.log
dbfilename fuzzy.rdb
dir /var/db/redis/fuzzy/
If needed, the maxmemory
is adjusted for specific instances according to expected database size.
# service redis start
Please refer to the Setting up encrypted tunnel using stunnel guide.
/usr/local/etc/redis-bayes.conf
:
include /usr/local/etc/redis.conf
port 6378
pidfile /var/run/redis/bayes.pid
logfile /var/log/redis/bayes.log
dbfilename bayes.rdb
dir /var/db/redis/bayes/
slaveof localhost 6478
maxmemory 600M
/usr/local/etc/redis-fuzzy.conf
:
include /usr/local/etc/redis.conf
port 6377
pidfile /var/run/redis/fuzzy.pid
logfile /var/log/redis/fuzzy.log
dbfilename fuzzy.rdb
dir /var/db/redis/fuzzy/
slaveof localhost 6477
As slaves
do not connect to masters
directly, stunnel's
sockets are specified in the slaveof
directives.
# service redis start
Check slave instances logs. If resynchronization with the masters was successful, you are done.
On the master
side configure Rspamd to use distinct Redis instances respectively:
local.d/redis.conf
:
servers = "localhost";
local.d/classifier-bayes.conf
:
backend = "redis";
servers = "localhost:6378";
override.d/worker-fuzzy.inc
:
backend = "redis";
servers = "localhost:6377";
On the slave
side Rspamd should use local redis
instance for both reading and writing as it is not replicated.
local.d/redis.conf
:
servers = "localhost";
Since local bayes
and fuzzy
Redis instances are slaves, Rspamd should use them for reading, but write to the replication master
.
local.d/classifier-bayes.conf
:
backend = "redis";
read_servers = "localhost:6378";
write_servers = "localhost:6478";
override.d/worker-fuzzy.inc
:
backend = "redis";
read_servers = "localhost:6377";
write_servers = "localhost:6477";