July 28, 2004

Remote reboot

The server is still experiencing mail problems; I'll try some further tweaks shortly.

In the meantime I have asked DediPower to move the server onto a remotely controllable power supply. This means I can reboot the server as and when required, 24/7. This should ensure we're not left without service over a weekend.

Posted by gordonj at 11:30 AM

July 27, 2004

Another Outage

Obelisk has been rebooted again :(

I've tracked the problem down tho :)

It appears some mail is getting stuck in a loop between the mail sorting script and spamassassin and thus invoking hundreds of instances of spamc and the sorting script. This is leathering the box until it reaches the point where it can't fork any more processes.
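For anyone curious how a loop like this is usually guarded against: a common trick is to stamp each message with a marker header the first time it is sorted, and refuse to re-process anything already carrying the stamp. The sketch below is only an illustration of that idea, not the sorting script used on this box; the header name `X-Sorted-By` and the token `mail-sorter` are made up for the example.

```python
from email import message_from_string

# Hypothetical marker; neither name comes from the real sorting script.
LOOP_HEADER = "X-Sorted-By"
LOOP_TOKEN = "mail-sorter"


def already_sorted(raw_message):
    """Return True if the message already carries our loop marker."""
    msg = message_from_string(raw_message)
    values = msg.get_all(LOOP_HEADER) or []
    return any(value.strip() == LOOP_TOKEN for value in values)


def mark_sorted(raw_message):
    """Return the message text with the loop marker header added."""
    msg = message_from_string(raw_message)
    msg[LOOP_HEADER] = LOOP_TOKEN
    return msg.as_string()
```

A sorter built this way would call `already_sorted()` first and deliver the mail straight away if it returns True, rather than handing it back to the spam checker.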

I've temporarily limited the number of concurrent spamc processes to 10 so some of your mail might not get spam checked. I'll try to find the root cause shortly.
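As a rough illustration of what a cap like this means in practice (how the limit was actually applied isn't covered here), the sketch below lets at most 10 checks run at once and passes mail through unchecked when all slots are busy. It assumes a single long-running process using threads; `check_if_capacity` and the `spam_check` callable are hypothetical names, and separate per-message processes would need a different mechanism such as lock files.

```python
import threading

MAX_CHECKS = 10  # the limit mentioned above
_slots = threading.BoundedSemaphore(MAX_CHECKS)


def check_if_capacity(message, spam_check, slots=_slots):
    """Run spam_check(message) only if a slot is free.

    Returns (message, checked). When every slot is taken the original
    message is returned untouched with checked=False, so mail keeps
    flowing instead of piling up more processes.
    """
    if not slots.acquire(blocking=False):
        return message, False
    try:
        return spam_check(message), True
    finally:
        slots.release()
```

The trade-off is the one described above: under load some mail skips the spam check, but the box never runs more than 10 checks at a time.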

Cheers, and apologies for the continued problems.

Update: Tracked the errant mail to a message that cron is sending. I've stopped the cron job that's causing it for the time being.

Posted by gordonj at 12:22 PM

July 23, 2004

postfix online

Postfix is back online. I've also added an additional 1gig of swap which should help out. It appears large mail queues may have been screwing the box over.

All pending mail has been delivered.

BTW I have patched the kernel for http://www.gentoo.org/security/en/glsa/glsa-200407-16.xml

I've also enabled support for LVM and will probably move some stuff around over the next few weeks as time allows. /var's beginning to get a bit tight.

Posted by gordonj at 11:50 AM

July 22, 2004

postfix down

Postfix is currently down.

Unfortunately, attempts to reboot the server earlier were thwarted by an errant module that wouldn't 'rmmod'. The system has seemed a little unstable since, so I've stopped postfix. Will fix in the morning.

Posted by gordonj at 10:15 PM

msnt: Upgraded

I've upgraded msnt (the Jabber MSN transport) to 1.3-cvs3 which should hopefully be a bit more stable.

Posted by gordonj at 12:11 PM

July 20, 2004

named monitoring

named seems to have been partially implicated in some of the recent problems, i.e. when named dies, jabber stops working and mail is not delivered (it's received, it's just not delivered).

I've added a short script to check named is running and restart it if not. Hopefully this should help a bit.
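For the curious, a check-and-restart script along these lines can be very small. The version below is a minimal sketch rather than the script running on the server: it assumes `pgrep` is available, and the restart command shown in the usage note is an assumed Gentoo-style init script path.

```python
import subprocess


def is_running(process_name, run=subprocess.run):
    """Return True if pgrep finds a process with exactly this name."""
    result = run(
        ["pgrep", "-x", process_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def ensure_running(process_name, restart_cmd, run=subprocess.run):
    """Restart the service if it is down.

    Returns True if a restart was attempted, False if the process
    was already running.
    """
    if is_running(process_name, run):
        return False
    run(restart_cmd)
    return True
```

Run from cron every few minutes, a call such as `ensure_running("named", ["/etc/init.d/named", "restart"])` would do the job, with the path adjusted to wherever the init script actually lives.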

Posted by gordonj at 05:15 PM

July 19, 2004

Kernel Upgrade

Well it all screwed up *again* this morning. Hmm....

I've built a new kernel based on 2.4.26-gentoo-r3 which should hopefully prove to be more stable. If not then I guess we'll have to start thinking about possible hardware issues.

Apologies for any inconvenience caused recently.

Posted by gordonj at 12:49 PM

July 15, 2004

Hmm...

Server stopped spawning new processes again. Server has been rebooted and appears to be okay now.

This is now the third time this has happened; I think a new kernel might be in order.

Posted by gordonj at 02:55 PM

July 03, 2004

DNS -> Postfix & Jabber screw up

Looks like named died for some reason on Friday night. The impact is that no mail was received after this point and jabber didn't work.

It should all be okay now; any probs, let me know.

Posted by gordonj at 08:13 PM