This file README.performance is part of the amavisd-new distribution, which can be found at http://www.ijs.si/software/amavisd/ Updated: 2002-05-13, 2002-08-01, 2003-01-09, 2005-01-19 Here are some excerpts from my mail(s) on the topic of performance. Mark [...] | What I use now is FreeBSD+Postfix+amavisd+Sophie, Good choice in my opinion. (P.S.: add clamd to the mix) Hopefully hardware matches expectations, fast disks and enough memory are paramount. You may want to put Postfix spool on different disk than /var/amavis, where amavisd does mail unpacking. | is there any suggested configuration for this | environment? Especially if my server is a high loaded | busy mail hub/gateway? Any parameters for performance tuning? | Do I need to increase this number to fit a busy server? | Or any other related parameters should I notice? How many messages per day are we talking about? Both the amavisd child processes, and (to much lesser degree) the Postfix smtpd services consume quite some chunks of memory, so the memory size can determine how many parallel processes you can run. Note that the Perl interpreter in amavisd-new processes occupies the same memory if fork on a Unix system uses copy-on-write for memory pages, as most modern Unixes do. This however does not apply to memory allocated after the child processes have forked. I would start small, e.g. by 2 or 3 child processes per CPU (parameter $max_servers), then see how machine behaves. If you see heavy swapping or load regularly going beyond 2 or 3 (per CPU), decrease the number of parallel streams, otherwise increase it - gradually. This number is probably the most important tuning parameter. Going beyond 10 usually brings no more improvements in overall system throughput, it just wastes memory. If this does not come close to your needs, you may want to place amavisd-new with Sophie on a different host than Postfix. They talk via SMTP so there is no particular advantage in having both MTA and amavisd on the same host. Actually there are now three quite independent modules, which can share the same host, or not: incoming Postfix (MTA-IN) -> amavisd+Sophie -> outgoing Postfix (MTA-OUT) Both MTA-IN and MTA-OUT can be the same single Postfix, but need not be. If you decide to split MTA-IN and MTA-OUT, you can position one of them on the same host as amavisd, although I guess it would be better to either have three boxes, or have MTA-IN and MTA-OUT be a single Postfix, as in the normal setup, while optionally moving amavisd+Sophie to a different host. As amavisd-new is just a regular SMTP server/client to Postfix, one can use the usual load sharing mechanisms as available for normal mail delivery, like having multiple MX records for the content filter (applies to feeding amavisd by the Postfix service smtp, but not to lmtp which does not care for MX). [...] | I would like to know the possibility of email loss? Especially | under unawareness! What if amavisd or Sophie suddenly/abnormally | terminated? Is there any recovery procedures should be take? Mail loss should not be possible (except with disk failure holding MTA spool directories). I am continually testing some awkward situations like disk full, process restarts, child dies, even programming errors :) ... Amavisd never takes the responsibility for mail delivery away from MTA, it just acts as an intermediary between MTA-IN and MTA-OUT. Only when MTA-OUT confirms it has received mail, the MTA-IN does a SMTP session close-down with a success status code. All breakdowns and connection losses are handled by MTA, and Postfix is very good in doing it in a reliable way. The only cause of concern is DoS in some unpackers. This part of code in amavisd-new is still mostly the same as in the amavisd version, and although it does exercise some care, there is still a lot to be desired. Let me tell a heretic secret: if your AV scanner (e.g. Sophie) can handle all archives used by current viruses (except MIME decoding, which is done by amavisd), it is reasonably safe, good and fast to set $bypass_decode_parts to 1 (see amavisd.conf). And more: later Postfix versions can do the MIME syntax checking and enforce 7bit header RFC 2822 requirements (see parameters like: $ postconf | egrep 'mime|[78]bit' ) so you can block invalid MIME even before it hits the MIME::Parser Perl module. Instead of wasting 5 minutes for some particularly nasty archive, Sophie can do it in 5 seconds !!! I have yet to see a virus (in the wild) that Sophos would ONLY detect if first unpacked by amavisd. (P.S. not always true, but most of the time this is so) This does not take care of manual malicious intents, but one can always bring in a virus on a floppy, or download it some other way (e.g. PGP encrypted), if one really wants to. --------- See article by Cor Bosman for a high-end installation: http://www.xs4all.nl/~scorpio/sane2002/paper.ps --------- Limit the number of AV-scanning processes, don't let MTA run arbitrary number of AV-scanning processes (P.S. this is easy to ensure with Postfix, hard to do with pre-queue content filtering like sendmail milter or Postfix smtp proxy). Also limiting based on CPU load (like in sendmail) is not a good idea in my opinion - set the fixed limit based on the number of concurrent AV-checking processes you host (memory,disk,cpu) can handle, not on the current load or mail rate, otherwise when the situation goes bad, it is more likely it will go bad all the way - disk and memory thrashing is the last thing you desire when load goes high. --------- | I have a question about how to distribute amavisd-new directories across | different disks for optimal performance. There are usually 4 directories | in the amavisd-new mail path. | 1) The amavis TEMPBASE directory (Where incoming emails are scanned) | 2) The postfix queue directory | 3) The directory for amavis and mail system logs | 4) The directory where mail is delivered. | What would be the best distribution of these directories over multiple disks? | Obviously, having each one on a different disk would be best. However, if | you only have 3 disks to use, which two services should be combined? If you | have only two disks, which services should be put together? !!! Let amavisd-new log via syslog and make sure your syslogd does not call flush for every log entry !!! (as Linux does by default, but is configurable per log file)! This way the disk with log files becomes non-critical. The disk with Postfix mail queue is likely to be most heavily beaten by file creates/deletes. I would put it on its own disk. The $TEMPBASE (amavis work directory) is probably not as heavily exercised (in the SMTP-in/SMTP-out amavisd-new setup, as with Postfix), unless your mail messages often contain many MIME parts that need to be decoded. If you can afford it, it can even reside on a RAM disk / tempfs or with delayed-syncing without risking any mail loss. --------- Perl running in Unicode mode is reported to be noticably slower than otherwise. It is wise to disable it, e.g. by setting environment variable LC_ALL=C before starting amavisd on systems where this is not a default (Linux RedHat 8.0). See also 'Speed up amavisd-new by using a tmpfs filesystem for $TEMPDIR' at http://www.stahl.bau.tu-bs.de/~hildeb/postfix/amavisd_tmpfs.shtml by Ralf Hildebrandt --------- | define(`confMAX_DAEMON_CHILDREN', 20) | should we limit MaxDaemonChildren in MTA-RX ? ... what would be a magic | formula to define it? I assume it should be based on the number of | amavisd-new child processes (which should match queue runners) | and Max No. of msgs per connection ? Here the charm of dual-sendmail setup (or Postfix setup) is most apparent. The MaxDaemonChildren sendmail option is almost completely independent from the number of amavisd-new child processes. The MaxDaemonChildren in MTA-RX should be sufficiently large so that most of the time all incoming mail connections can each get its own sendmail process which is willing to accept the mail trickle. These smtp server processes are relatively lightweight (hopefully sharing the program code in memory), so they don't cost much. The upper limit is the number of sendmail receiving processes the host can comfortably handle, including disk I/O they produce. One may set this value high and observe the usual number of incoming parallel SMTP sessions during normally busy hours, then set the limit comfortably above that value. This applies to Postfix as well (maxproc for smtpd service on port 25). The number of amavisd-new child processes and the number of queue runners is another matter. Since content filtering (especially with SA enabled) is CPU and memory intensive, the number of content filtering processes is limited by the host power and its memory. Never have this number so high that swapping occurs, or that the time for each individual mail check gets too large, say over a couple of seconds. Long content checking times can also increase the locking contention on the SA Bayes database. P.S.: It is advised to move Bayes database to a SQL server, it need not be on a separate host. A very rough rule of thumb may be that the MaxDaemonChildren can easily be 10 times the number of content filtering processes. --------- > How did you determine this optimal number of child processes? > Is there a nice scientific way to do it or via simple trial and error. Measure and plot a diagram of maximum sustained mail troughput (msgs/h) for a couple of values of max_proc (and a matching value of $max_servers). About 5 or 10 strategically placed data points could already give a useful picture. Measuring throughput for each max_proc value takes a restart of amavisd and a postfix reload (and possibly: postfix flush), the measuring period should last for say 10 minutes or preferably more, assuming that the supply of mail to be processed does not run out and keeps mail processing saturated, i.e. that there are plenty of mail messages in the mail queue waiting to be processed by the content filter. An opportunity for such an experiment on a production machine arises when some backlog of mail accumulates, e.g. after a network outage or some other problem occurred that stopped mail flow. It is certainly possible to create a synthetic mail traffic. For mail sink (if needed) one could use the src/smtpstone/smtp-sink.c from a Postfix distribution, but for generating synthetic mail the smtp-source.c is not realistic regarding the mail contents, so it is best to use a real ham/spam/viruses mail mix, either on a running system, or mail collected and saved for the purpose during normal business hours. The exact mail rate can be deduced from the mail log for each measurement period. I usually choose to do a plot of cumulative message count vs. wall-clock time, which makes it easier to find the slope and ignore startup or other anomalies which stand out more obviously in the plot. A good enough mail rate fugure is given by the amavisd-agent program, which for the first couple of screens shows the counter averages since the start of amavisd (which is probably what we need here), then after about 2.5 minutes its starts reporting exact last 5 minute running averages. The usual counter of interest is the 'InMsgs' (which also happens to be the same as the 'CacheAttempts', which is the first counter on the reported list). One should get a diagram like: msgs/h | | * * * * | * | * | * | | * best |* -----------------------------> max_proc The optimum max_proc is where the function starts to level off, it gives the best throughput at a minimum waste of memory for excessive processes that gain no benefit. > We're running dual 3.0Ghz xeons (64bit fedora core 3) with 2GB or RAM - > what do you think the optimal number should be? > We have local copies of surbl.org and a local dcc server. Not using razor. I wouldn't dare to guess, you just have to measure it. It depends on too many factors. The more network-based high latency SA tests you have enabled, the higher would be the optimum max_proc value. The exact spot varies over daily network conditions and usage patterns, so don't bother to narrow it down too exactly. --------- LDAP lookups note from Michael Hall in reply to Matt Juszczak on Aug 12, 2005: > >You might try to make better indexes on the LDAP server before > >upgrading the hardware. > > Yep, tried this yesterday :) that was the problem. Added another index > for mailRoutingAddress (I already had mailLocalAddress created but I > guess the mailRoutingAddress needed one too) and now we're experiencing > instant mail delivery with no queues :) You definitely want to make sure you have indexes on any attribute used in searches, it can make a huge difference as you found out. If you're using OpenLDAP and 'bdb' databases you also want to make sure and configure the Berkely DB environment with a DB_CONFIG file, and use db_stat to check things. Below is an excerpt from one of our mail servers at work: $ db_stat-4.2 -h /var/db/openldap-data -m ... Correctly sizing the cache can make a big difference as answers can be pulled from it vs accessing the disks.