Maintenance


Just like with backups, you should be able to run any maintenance script at any time under any load. You never know when something will be run by accident! For tips on running these processes under live load, see the backups section.

 

Packing Zeo 

The number one rule for maintaining a healthy system is "you must pack your ZEOs!" The ZODB keeps eons of revisions in its database, many more than you will ever need. If you leave it running without packing, its size can double quickly when there are lots of edits. Think of a pack as a colon cleanse for your data. Many will testify that after a pack the site feels speedy and responsive.

 

If you are worried about recovering data for legal reasons, have no fear. When it packs, ZEO leaves a copy of the pre-pack database "in place", called Data.fs.old. In theory, you can rename it back to Data.fs and get this baby up and running with the pre-pack database. In reality, a lot can go wrong here, so keep taking regular backups and do not depend on Data.fs.old as one.

 

One thing to remember: once you have packed your database, your repozo backups need to start over with a full copy. After each pack, repozo does a full backup and then incrementals until the next pack, so expect extra load on your disk while that full backup is taken. It would behoove you to schedule packing for hours when the system is quiet, immediately followed by a fresh backup.
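The "pack, then take a fresh backup" sequence can be sketched in Python as below. This is a minimal sketch under assumptions, not a definitive script: the function names are made up for illustration, and every path you pass in is your own. The repozo options used are -B (backup), -F (force a full backup), -z (gzip), -r (backup repository directory) and -f (the Data.fs to back up). Since repozo falls back to a full backup after a pack anyway, -F only makes that explicit.

```python
import subprocess


def repozo_full_backup_command(repozo, repository, data_fs):
    """Build a repozo call for a full, gzipped backup of data_fs."""
    return [repozo, "-B", "-F", "-z", "-r", repository, "-f", data_fs]


def run_in_order(commands, runner=subprocess.call):
    """Run each command in turn and stop at the first failure.

    Returns the exit code of the failing command, or 0 if all succeeded,
    so a failed pack is never followed by a backup.
    """
    for command in commands:
        code = runner(command)
        if code != 0:
            return code
    return 0
```

To use it, pass your own pack command first and the repozo command second, for example run_in_order([pack_command, repozo_full_backup_command(repozo, repository, data_fs)]), where pack_command is whatever argument list your installation uses to pack.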

 

If you have not been a careful little code monkey or your sys admins are plotting your demise, things can go very wrong. In particular, if you are mounting multiple databases and have cut & pasted content between multiple mounts you will start to see POSKeyErrors and the like as soon as you pack/restart. Make sure you have those backups in place or you will immediately regret it.

 

Sample packing script for Linux that you can run with cron weekly, nightly, or ever so rightly.
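The original script is not reproduced here, so what follows is only a minimal Python sketch of what such a cron job might look like. PACK_SCRIPT is a placeholder path, and build_pack_command and main are illustrative names. No pack options are passed, because they differ between ZEO versions; add the ones your install documents through extra_args. The command is wrapped in "nice -n 19" so that the pack runs at the lowest CPU priority.

```python
import subprocess

# Assumption: replace with the pack script that your ZEO install provides.
PACK_SCRIPT = "/path/to/zeo/bin/zeopack"


def build_pack_command(pack_script=PACK_SCRIPT, extra_args=()):
    """Return the argument list that runs the pack at the lowest CPU priority."""
    return ["nice", "-n", "19", pack_script, *extra_args]


def main(runner=subprocess.call):
    """Run the pack and return its exit code so cron can report failures."""
    return runner(build_pack_command())
```

Ending the file with "raise SystemExit(main())" makes it runnable as a script. A crontab entry such as "0 3 * * 0 python3 /path/to/pack_zeo.py" (the script path is hypothetical) would then run it every Sunday at 03:00.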

 

Managing Multiple Boxes 

When you start building systems, you move from building modules of software to automating system management tasks. Anything that can be automated should be. Scripts for deploys, packing, backups, etc. should all be in place. It becomes particularly interesting when you need to run these on multiple machines, e.g. to do a rolling restart of all Zope servers. There are many tools out there to help with that.
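To make the rolling restart idea concrete, here is a minimal Python sketch that restarts one machine at a time over ssh and stops as soon as a restart fails, so the remaining servers keep serving traffic. The host names and the restart command are assumptions you supply yourself, and rolling_restart is an illustrative name. A dedicated multi-machine tool would normally replace this.

```python
import subprocess
import time


def rolling_restart(hosts, restart_command, pause_seconds=30,
                    runner=subprocess.call, sleep=time.sleep):
    """Restart one host at a time over ssh, stopping at the first failure.

    Returns the list of hosts that were restarted successfully.
    """
    restarted = []
    for index, host in enumerate(hosts):
        # "ssh <host> <command>" runs the command on the remote machine.
        if runner(["ssh", host, restart_command]) != 0:
            break
        restarted.append(host)
        # Give the host time to come back before touching the next one.
        if index < len(hosts) - 1:
            sleep(pause_seconds)
    return restarted
```

The pause between hosts is deliberately crude. Checking that each server answers requests again before moving on would be more robust.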

 

Running Batch/Cron Jobs Smartly

Summary: When writing maintenance, backup, and similar scripts, make sure they can run at ANY time of the day under ANY load without affecting the stability and responsiveness of the system.

 

Explanation: When running maintenance jobs that take up a lot of resources, be it CPU or disk, there is a strong tendency to just say "screw it, we'll run everything at night since no one is really on the system at that time". This works for a while: as long as you aren't doing a lot of maintenance and your site stays small, it won't hurt you. However, there comes a point where there is just too much to be done and it can't all happen at night. Then you are forced to sit back and say "I need to do the same thing, but it needs to run during the day and not affect the system". It turns out this is almost always a trivial change with a little bit of smart system thinking.
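One possible form of that "smart system thinking", offered as an assumption and not necessarily the approach the author used, is to have every heavy job check the system load before it starts and hold off while the box is busy. The sketch below uses Python's os.getloadavg. The function name, the threshold and the timings are illustrative.

```python
import os
import time


def wait_until_quiet(max_load, poll_seconds=60, max_wait_seconds=3600,
                     get_load=os.getloadavg, sleep=time.sleep):
    """Block until the 1-minute load average is at or below max_load.

    Returns True once the system is quiet, or False if it stayed busy
    for max_wait_seconds and we gave up.
    """
    waited = 0
    while get_load()[0] > max_load:
        if waited >= max_wait_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
    return True
```

A job would call wait_until_quiet(2.0) first and only start its real work if the result is True. A job triggered by accident at noon then waits or gives up instead of piling onto peak load.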

 

After going through this process, I wish I had done it from the beginning, and not because it would have saved me coding time later. I can't count how many times the system almost collapsed because a newb accidentally triggered a backup in the middle of the day, or because I personally misconfigured cron to run at peak load (noon) instead of midnight. It turns out that if you, Mr(s). System Administrator, know that anything can be run at any time without consequence, you'll sleep a whole lot better at night.

 

Links