The Software Stack

Page history last edited by Jean 7 years, 8 months ago

A breakdown of each piece of the stack and the software options for it. The emphasis is less on config files and more on when each piece is the right choice.

 

Web Servers

Some general advice about picking a web server - no matter what you choose, 99.9999% of the time you shouldn't be looking at speed. Any one of these servers will handle what you throw at it just fine. So look at features and ease of configuration first. Is it possible that you will need HTTPS now or in the future? That could be a problem if you pick Mr. Fast Slick Server and it turns out not to support it.

  • Apache - Apache is an all-purpose web server that has been around since you were born. It's incredibly stable and full featured, which also means that it is slightly slower than some of the young gun servers out there (at least so people say). With Apache you get some awesome support for things like HTTPS (including wildcards out of the box), virtual hosting, rewrite rules, a proxy & load balancing module (you probably don't want to use it), and the underappreciated mod_python in case you need to take it to the next level later on. It's installed by default on almost every Linux install, and there are gazillions of installers for every system to get you up and going.

    • Recommended for: newbies, old school sys admins, people working with sloppy requirements and flaky customers, anyone who doesn't know what else to use.

    • Cons: Many people complain about the huge config files, but veterans will argue they're as clean and clear as a schoolgirl on a Sunday. ACLs and rewrite rules will blow your mind at first, but they are very powerful.

    • System Resources: Thread intensive; it tends to work your CPU. Not recommended on the same box as a Zope instance.

  • Nginx - Nginx is an asynchronous, event-driven server (each worker process is single threaded, just like Twisted) and is very good at reverse proxying. It's basically a drop-in replacement for Apache and is wicked good at serving static files. Typical configurations use a fraction of the memory of Apache. URL rewriting, HTTPS support, gzip compression and logging are all solid.

    • Recommended for: people who need to serve mega mega load

    • Cons: newer than the rest of these guys

    • System Resources: CPU and, if you are rocking high load, memory; both typically less than Apache.

  • Squid - Squid is an old favourite, and can double as both a server and a cache. It doesn't have module support for serving anything CGI style, but you can do some wicked stuff with request routing. The main benefit is its caching support, which I'll talk about in the caching section.

    • Recommended for: People with weird network requirements, or people who want to be lazy, keep it simple and have only one layer that does all the caching, load balancing, etc... I don't know when else to recommend this.

    • Cons: The config can be like getting a lobotomy - setting up load balancing with cache-peers almost never works after the first hour of messing with it, ESPECIALLY if you are running everything on one machine. 

    • System Resources: Squid likes to eat everything, and still has manual configuration of how much RAM it uses, which is not in vogue these days (see the Varnish architecture notes for an explanation). Squid performs best on a box all by itself and will interfere with almost any other servers you use, so watch out.

  • Zope

  • A million other options
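The "single threaded asynchronous" model mentioned for Nginx and Twisted can be illustrated with a small Python sketch. This is not how Nginx itself is implemented; it only shows the idea, using the standard asyncio module, that one thread can keep many slow requests in flight at once. The function names and the 0.05 second wait are made up for the example.

```python
import asyncio
import threading
import time


async def handle(request_id, thread_ids):
    """Pretend to serve one request that spends most of its time waiting."""
    thread_ids.add(threading.get_ident())  # record which thread ran us
    await asyncio.sleep(0.05)  # stands in for a slow client or backend
    return f"response {request_id}"


async def serve(count):
    """Serve `count` requests concurrently on a single event loop."""
    thread_ids = set()
    responses = await asyncio.gather(
        *(handle(i, thread_ids) for i in range(count))
    )
    return responses, thread_ids


def run(count=100):
    """Run the event loop and report responses, threads used, and wall time."""
    start = time.perf_counter()
    responses, thread_ids = asyncio.run(serve(count))
    elapsed = time.perf_counter() - start
    return responses, thread_ids, elapsed
```

Calling run(100) handles all one hundred waits on one thread and finishes in roughly the time of a single wait, which is why this style of server tends to need far less memory than one thread or process per connection.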

 

Proxies/Load Balancers

A good proxy/load balancer will make your life so much easier. A good balancer will detect when hosts are down, retry dropped calls, slowly warm up a server when it comes back online, reconfigure without restarting, and do all sorts of crazy stuff at the routing level. When it comes to speed, if you don't need a lot of fancy URL manoeuvring, you want to look for an L4 balancer (one that routes at the TCP level without reading the request). Otherwise get with the big dogs and just hit up the L7 (one that opens up the HTTP request). It's not that much slower and you'll get a lot more flexibility in the long run.
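To make "detect when hosts are down" and "slowly warm up" concrete, here is a minimal Python sketch of the bookkeeping a balancer does. It is not taken from HAProxy or any other tool listed here; the class names, the weighted-random choice, and the warmup_step knob are all assumptions made for illustration.

```python
import random


class Backend:
    """One upstream server, for example a single Zope instance."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.weight = 1.0  # share of traffic this backend may take (1.0 = full)


class Balancer:
    """Weighted-random balancer with health marking and slow warm-up."""

    def __init__(self, names, warmup_step=0.25):
        self.backends = {name: Backend(name) for name in names}
        self.warmup_step = warmup_step  # hypothetical tuning knob

    def mark_down(self, name):
        """Called by a health check when a backend stops answering."""
        self.backends[name].up = False

    def mark_up(self, name):
        """Bring a backend back, but only at a fraction of full traffic."""
        backend = self.backends[name]
        backend.up = True
        backend.weight = self.warmup_step

    def tick(self):
        """Run periodically: let recovering backends take a bit more traffic."""
        for backend in self.backends.values():
            if backend.up and backend.weight < 1.0:
                backend.weight = min(1.0, backend.weight + self.warmup_step)

    def pick(self, rng=random.random):
        """Choose a live backend with probability proportional to its weight."""
        live = [b for b in self.backends.values() if b.up]
        if not live:
            raise RuntimeError("no backends available")
        threshold = rng() * sum(b.weight for b in live)
        for backend in live:
            threshold -= backend.weight
            if threshold < 0:
                return backend.name
        return live[-1].name  # guard against floating-point rounding
```

With three backends, marking one down removes it from pick() entirely; marking it up again gives it a quarter of a normal share, and each tick() raises that share until it is back to full weight.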

  • HAProxy - Example config for a basic setup here. If you don't know where to start and you know you are building a big system, just go here and kiss all your worries goodbye. Seriously. It's the shit. Offers both L4 and L7 configurations.

    • Recommended for: People expecting to manage multiple Zope servers, frequent restarts, big systems, lovers.

    • Cons: I can't complain about anything.

  • Perlbal - I've never used Perlbal, but the guy who wrote it taught me everything I know about load balancing (from afar - he has great talks), so I'm positive that Perlbal kicks ass. I just didn't need an L7 balancer at the time.

    • Recommended for: Those that like to try new and exciting things and believe in theories that the writer preaches.

    • Cons: No buildout recipes yet, not a lot of documentation compared to other proxies. Perl.

  • Pound  - Oh Pound. Pound, Pound, Pound.... L4 Cache. People love it or hate it. 

    • Recommended for: Simple solutions, masochists.

    • Cons: Configuration gives me a headache and I've NEVER gotten Pound to work the way I wanted.

  • Apache - Apache actually does a great job at proxying and an O.K. job at load balancing. It's not as sophisticated as HAProxy et al., but it will get the job done for smaller setups. It is an L7 proxy and is actually quite efficient if you are already opening up headers to do rewriting or any of the other cool Apache features.

  • Squid - Same as Apache really.

  • Enfold Proxy - Possibly the best solution if you must use IIS and Windows. It's built specifically for Plone by Alan Runyan's company, so even though it's not Free, buying it is supporting Plone. It's very simple to set up and configure, and does a respectable job. Can proxy http and https, or basically anything IIS can do.

     

 

External Web Caches

Web caches sit between the web server and Plone and cache pages after Plone has rendered them. Sometimes they even serve the pages right to the users. Sometimes web caches are built into proxy servers and web servers and load balancers - there is really a lot of variety here. You can get a lot of value if you have a lot of non-personalized and/or anonymous content. If you serve everything under login, just stick with in-browser cache optimizations and bypass this part of the system. The more obvious things you want to be sure to cache are CSS and JS files, as well as any anonymous content and content that doesn't change very much. Sound like a lot of configuring? You bet. A lot of monitoring? Yep. A giant headache? Sometimes. If you are looking to optimize, I don't recommend starting here. You can throw one in at any time.
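The rules of thumb above (cache CSS and JS, cache anonymous content, leave logged-in pages alone) can be written down as a small decision function. This is only a Python sketch of the reasoning, not configuration for any particular cache: the policy names, the suffix list, and the use of "__ac" as the login cookie are assumptions, so check what your own site actually sets.

```python
# Assumptions for illustration: the suffix list, the policy names, and the
# "__ac" login cookie. Adjust all three to match your own site.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def cache_policy(path, method="GET", cookies=None, auth_cookie="__ac"):
    """Return "long", "short", or "bypass" for one incoming request."""
    cookies = cookies or {}
    if method not in ("GET", "HEAD"):
        return "bypass"  # writes must always reach Plone
    bare_path = path.split("?", 1)[0].lower()  # ignore any query string
    if bare_path.endswith(STATIC_SUFFIXES):
        return "long"  # CSS, JS and images are the same for everyone
    if auth_cookie in cookies:
        return "bypass"  # personalized page: leave it to the browser cache
    return "short"  # anonymous content that changes only now and then
```

For example, cache_policy("/portal.css") gives "long", an anonymous request for "/news" gives "short", and the same page requested with a login cookie gives "bypass".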

 

  • Varnish - Varnish is the de facto Plone cache these days, and it has nice tools to see what's in the cache. However, if you find yourself doing some complicated stuff, cough, purging, cough, it may be more trouble than it's worth (I know, I know, it's like I'm trolling for comments with that statement...). The good thing is that there are lots of recipes for including Varnish in your setup already if you want to give it a shot. Just remember to monitor and make sure you're actually getting value out of it! Here's a good tutorial about setting up Varnish with Plone.

  • Apache Traffic Server - Wyn (asigottech) considers it much better than Varnish for medium to large sites (Yahoo, for instance) and the next generation on from Varnish or Squid, with all of their features and more. He says ATS supports forward and reverse proxying, SSL termination / HTTPS (authenticating the client, then serving from cache or re-encrypting the connection and retrieving the object from the origin server to store in its cache), hierarchical caching (it can search parent or child caches in other locations for the object), and cache clusters in the same or different physical locations, as well as handling URL rewriting, cookies and more. His only major advice is to use 2.1.1 unstable or greater and, if using Ubuntu, a release higher than 9.04. They are deploying it across a global cache/CDN system.

  • Apache - There is a mod_cache module, I don't know jack about it, and most people don't use it. Q.E.D.

  • Squid - I know what you're thinking - what doesn't Squid do?!?! Again, it can be quite a drain on your system if your disk is already spinning a lot. There is a lot of fanboyism between Squid and Varnish and the "RIGHT" way to do caching. Decide for yourself. Then dump Squid and get on the boat with the cool kids...
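Whichever cache you pick, "getting value out of it" mostly comes down to the hit ratio: the share of requests answered from the cache instead of Plone. A minimal Python sketch of that check follows. The counter names are modeled on the hit and miss counters that tools like varnishstat report, but treat them, and the 0.5 threshold, as assumptions to adapt to your own monitoring.

```python
def hit_ratio(cache_hit, cache_miss):
    """Fraction of lookups served from the cache (0.0 when there is no traffic)."""
    total = cache_hit + cache_miss
    if total == 0:
        return 0.0
    return cache_hit / total


def worth_keeping(cache_hit, cache_miss, minimum=0.5):
    """Hypothetical rule of thumb: is the cache absorbing enough traffic?"""
    return hit_ratio(cache_hit, cache_miss) >= minimum
```

A cache reporting 750 hits and 250 misses has a ratio of 0.75; one that mostly misses is adding a layer to your stack without taking much load off Plone.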

 

Other Cache Options

  • Memcache

 

Other Database Options 

 

Example setups

  • Plone.org (using Nginx, Varnish, Pound and XDV)

 

Comments (2)

Nate Aune said

at 3:14 pm on Dec 18, 2009

Here is a good blog post about Apache vs. Nginx. http://www.joeandmotorboat.com/2008/02/28/apache-vs-nginx-web-server-performance-deathmatch/
According to this post, it appears that Nginx has a much better page response time and uses fewer CPU resources. It can also make better use of multi-core machines.

Nejc Zupan said

at 3:37 am on Dec 20, 2009

We've been using Nginx for a year and a half now and it's brilliant. It has never crashed and consumes almost no resources at all. Here is another blog post benchmarking Nginx against Apache, showing how much faster and lighter it is:
http://blog.webfaction.com/a-little-holiday-present

And since its footprint is so small, it's suitable for sites that run on limited memory (256MB of RAM is enough to run CentOS, Nginx and Zope for a site with anonymous traffic).

One more thing: it does pretty good round-robin load balancing, so it should also be on the load balancers list. And since you already need Nginx to act as a reverse proxy, why not also give it the task of load balancing? However, you're limited to anonymous traffic. Logged-in traffic should be load-balanced by a more advanced piece of software.

Did I mention that you have a buildout recipe for Nginx? :)
http://pypi.python.org/pypi/gocept.nginx
