The server is still experiencing mail problems; I'll try some further tweaks shortly.
In the meantime I have asked DediPower to move the server onto a remotely controllable power supply. This means I can reboot the server as and when required, 24/7, which should ensure we're not left without service over a weekend.
Obelisk has been rebooted again :(
I've tracked the problem down tho :)
It appears some mail is getting stuck in a loop between the mail sorting script and spamassassin and thus invoking hundreds of instances of spamc and the sorting script. This is leathering the box until it reaches the point where it can't fork any more processes.
I've temporarily limited the number of concurrent spamc processes to 10 so some of your mail might not get spam checked. I'll try to find the root cause shortly.
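For the curious, the idea behind the limit looks roughly like the sketch below. It is not the actual change made on the box: the class name `SpamCheckLimiter` and the `scanner` callable are made up for illustration, and it caps threads inside one Python process rather than separate spamc processes. When all 10 slots are busy, the message skips the spam check instead of waiting, which is why some mail may arrive unchecked.

```python
import threading


class SpamCheckLimiter:
    """Cap how many spam checks run at once (hypothetical helper)."""

    def __init__(self, max_concurrent=10):
        # One slot per allowed concurrent spam check.
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def check(self, message, scanner):
        """Run scanner(message) if a slot is free.

        Returns the scanner's result, or None when every slot is busy,
        meaning the message is passed on without a spam check.
        """
        if not self._slots.acquire(blocking=False):
            return None  # no free slot: skip the check rather than queue up
        try:
            return scanner(message)
        finally:
            # Always free the slot, even if the scanner raised.
            self._slots.release()
```

Skipping rather than blocking is the point here: a queue of waiting checks would just be another pile of processes for the box to run out of.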
Cheers and apologies for the continued problems.
Update: Tracked the errant mail to a message that cron is sending. I've stopped the cron job that's causing it for the time being.
Postfix is back online. I've also added an additional 1 GB of swap, which should help out. It appears large mail queues may have been screwing the box over.
All pending mail has been delivered.
BTW I have patched the kernel for http://www.gentoo.org/security/en/glsa/glsa-200407-16.xml
I've also enabled support for LVM and will probably move some stuff around over the next few weeks as time allows. /var's beginning to get a bit tight.
Postfix is currently down.
Unfortunately, attempts to reboot the server earlier were thwarted by an errant module that wouldn't 'rmmod'. The system has seemed a little unstable since, so I've stopped postfix. Will fix in the morning.
I've upgraded msnt (the Jabber MSN transport) to 1.3-cvs3 which should hopefully be a bit more stable.
named seems to have been partially implicated in some of the recent problems, i.e. when named dies, jabber will stop working and mail will not be delivered (it's received, it's just not delivered).
I've added a short script to check named is running and restart it if not. Hopefully this should help a bit.
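A check-and-restart script of that kind might look something like the sketch below. This is not the script running on the server, just a minimal illustration: the function names are made up, it assumes `pgrep` is available, and the restart command is whatever you pass in.

```python
import subprocess


def is_running(process_name, run=subprocess.run):
    """Return True if pgrep finds a process with exactly this name."""
    result = run(
        ["pgrep", "-x", process_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # pgrep exits with 0 when at least one process matched.
    return result.returncode == 0


def ensure_running(process_name, restart_cmd, run=subprocess.run):
    """Restart the service if it is not running.

    Returns True if a restart was attempted, False if nothing was needed.
    The `run` parameter exists only so the check can be tested with a fake.
    """
    if is_running(process_name, run):
        return False
    run(restart_cmd, check=False)
    return True
```

Run from cron every few minutes, a call such as `ensure_running("named", ["/etc/init.d/named", "restart"])` would do the job; the init script path is an assumption about the local setup.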
Well it all screwed up *again* this morning. Hmm....
I've built a new kernel based on 2.4.26-gentoo-r3 which should hopefully prove to be more stable. If not then I guess we'll have to start thinking about possible hardware issues.
Apologies for any inconvenience caused recently.
Server stopped spawning new processes again. Server has been rebooted and appears to be okay now.
This is now the third time this has happened; I think a new kernel might be in order.
Looks like named died for some reason on Friday night. As a result, no mail was received after this point and jabber didn't work.
It should all be okay now; any probs, let me know.