Originally Posted by Plan9
Haha, I was puzzled when I (mis)read your post. I literally asked myself how someone who spends their entire working day with web frameworks wasn't aware of caching.
Sorry about that mate.
In answer to your question, I don't really know myself. I can't see the benefit of running both nginx and Apache on the same box either (bar some fringe cases like the server I'm currently building). In your case, it would make more sense to let Apache manage the entire httpd stack.
I think where people talk about nginx in that regard is one of two things:
- A free load balancer (not applicable in your case as you only have 1 node)
- A replacement for lighttpd for static content (not applicable in your case as you're going to run Apache anyway, so you'd be adding to your system's footprint by running two httpds)
I appreciate your diplomacy, but compiling and benchmarking Apache is something I do nearly every day for work (it's actually one of my main roles as the sys admin for a smallish datacenter). I've found that the benefits of compiling Apache yourself are minimal and vastly outweighed by the cons:
- Apache is one of those bits of software where keeping up with patches is paramount. It's significantly easier to do that with the repos than it is if you compile your own build of Apache.
- To expand a little more on the previous point, doing so with next to zero downtime is very difficult without a web farm and load balancer (and even then I end up having to install to non-standard directories and use symlinks to perform the switchover).
- Apache has a number of dependencies, some of which you may end up having to manually compile in addition to Apache (over the years I've run into issues with zlib and openssl, amongst others that I've since forgotten).
- Apache has dozens of modules, some included by default and some not. You really need to know what you want before you start compiling if you want any kind of performance boost (yeah, you can compile the modules as shared objects, but then you might as well just use the repos, which do the same).
- To further the above point, by compiling in the modules, you're making it massively harder to disable unused modules (there's a reason why even the Apache devs state that using shared objects is the preferred method for compiling Apache).
- You're making it dead easy to miss security modules like user sandboxing, which are not compiled in by default.
- You're responsible for your own testing (STABLE repos will have tested your Zend / mod_perl / etc VMs against Apache. When I've compiled my own, I've found bugs that led to segfaults in Apache threads (which led to pages not loading), and those had to be fixed by myself).
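To make the static vs shared point above a bit more concrete, here is a minimal Python sketch that groups modules by how they were linked. It assumes the listing is in the `name_module (static|shared)` format printed by `apachectl -M`; the function name `split_modules` and the sample listing are hypothetical and not taken from any real server.

```python
from __future__ import annotations


def split_modules(listing: str) -> dict[str, list[str]]:
    """Group module names from an `apachectl -M` style listing by link type."""
    groups: dict[str, list[str]] = {"static": [], "shared": []}
    for line in listing.splitlines():
        parts = line.split()
        # Module lines look like "rewrite_module (shared)"; skip headers and anything else.
        if len(parts) == 2 and parts[1] in ("(static)", "(shared)"):
            groups[parts[1].strip("()")].append(parts[0])
    return groups


# Hypothetical sample listing, for illustration only.
SAMPLE = """Loaded Modules:
 core_module (static)
 so_module (static)
 rewrite_module (shared)
 status_module (shared)
"""

if __name__ == "__main__":
    for kind, names in split_modules(SAMPLE).items():
        print(kind + ":", ", ".join(names))
```

Anything that shows up as shared can be switched off in the config without a rebuild, whereas anything static stays in the binary until you recompile.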
Then let's look at how Apache works:
- All you're doing by compiling in your own modules is speeding up the thread creation time, but as threads can handle hundreds of simultaneous HTTP requests and can sit idle when there's no traffic, you're really not going to see any performance increase there unless your servers are hammered (and I mean > thousands of hits a minute) for hours on end. A personal website server would see zero performance increase.
- Most of the heavy processing in Apache happens in the language interpreters / VMs (eg mod_perl, Zend (PHP), Tomcat (Java servlets), etc). Compiling Apache would have zero performance gains with those VMs.
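A rough way to put numbers on "hammered" is Little's law: the average number of requests in flight equals the arrival rate multiplied by the average time each request takes. The sketch below is only a back-of-envelope estimate; the function name and the figures (3000 hits a minute, 0.25 seconds per response) are assumptions chosen for illustration, not measurements.

```python
def average_concurrency(hits_per_minute: float, avg_response_seconds: float) -> float:
    """Little's law: requests in flight = arrival rate (per second) * time per request."""
    return (hits_per_minute / 60.0) * avg_response_seconds


if __name__ == "__main__":
    # Assumed figures: 3000 hits a minute, each taking a quarter of a second.
    print(average_concurrency(3000, 0.25))  # prints 12.5
```

Under those assumptions only about a dozen requests are being served at any moment, which suggests why shaving thread creation time is unlikely to be noticeable on a lightly loaded box.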
People who are really interested in Apache's performance should be looking at the following:
- Most people configure their language VMs with suboptimal settings (eg enabling ENVs in mod_perl, not scanning through the php.ini for optimizations, etc) or even use the wrong handlers entirely (eg CGI instead of bespoke mod_perl handlers). These will have a significantly bigger performance impact than compiling your own build of Apache.
- Also, there are many different interpreters for a lot of different languages. For PHP, for example, there's Zend, which is the most popular PHP engine but isn't the fastest. HipHop (Facebook's creation) is supposed to offer up to 50% performance increases.
- But most of the time it's the web developers who have created those bottlenecks (needlessly heavy modularised code - as seen with many popular CMSs, inefficient SQL requests, etc)
- Sometimes Apache itself hasn't been configured optimally (inefficient threading ratios for that server's hardware, etc). That's a lot harder to configure right though, and often is just trial and error with load testing to benchmark each tweak.
- Then you have external configuration (eg static pages held on slower storage mediums, database server not able to handle capacity, badly configured SQL connection pooling, etc). So many people overlook the obvious there. Sometimes even just storing cache on a RAM disk can improve the responsiveness of sites.
- And, if you're really obsessed with going low level, there's a lot that can be done to streamline Linux's TCP/IP stack for web serving, and you'd see bigger improvements than compiling Apache manually. If you're running iptables, I've read that there are a few tweaks that can be done there as well (though I haven't personally tested those tweaks as we have dedicated hardware firewalls).
- Also, what version of the Linux kernel are you running? Some of the newer versions have TCP/IP cookies that are designed to significantly reduce the footprint of TCP handshakes, which is very handy for web servers (sadly that specific patch requires client support as well).
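On the point about threading ratios, a common rule of thumb (not something stated in the post) is to cap the number of workers so they cannot use more memory than the box has spare. The sketch below assumes workers of roughly equal size; the function name and all the figures are hypothetical. The result is only a starting value for the worker limit (`MaxClients`, renamed `MaxRequestWorkers` in Apache 2.4), to be refined by load testing.

```python
def max_workers(total_ram_mb: int, reserved_mb: int, per_worker_mb: int) -> int:
    """Estimate how many workers fit in the RAM left after the OS, database, etc."""
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    usable_mb = total_ram_mb - reserved_mb
    # Never return a negative count if the reservation exceeds total RAM.
    return max(usable_mb // per_worker_mb, 0)


if __name__ == "__main__":
    # Assumed figures: 4 GB box, 1 GB kept back, about 48 MB per worker.
    print(max_workers(4096, 1024, 48))  # prints 64
```

The per-worker figure is the part worth measuring on the real server, since it depends heavily on which language VM is loaded into each worker.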
So, with the greatest of respect, telling someone to compile their own instance of Apache is just terrible advice. It overlooks the mass of performance gains that can be achieved without compiling, introduces a number of new security and potential stability issues, and diverts people's valuable free time from focusing on the real bottlenecks within Apache.
Honestly, if we didn't run a bespoke setup in my current workplace, even I wouldn't bother compiling Apache.
Sorry if there are spelling or grammar mistakes, or even if bits read really badly. It's a long post and I should be working (coincidentally, I'm writing a new Nagios plugin to monitor some Apache upgrades as the last few I ran created unexpected conflicts with some existing software), so shamefully I've not bothered to read this back.
I found your post very informative and appreciate the effort you went to in creating it. +REP
I will confess that my knowledge of Linux does not stand up to yours, so people should probably trust in your opinion over mine.
I will further research the subject.
However, another advantage to compiling a basic Apache install from source rather than installing via a package manager is that many distros have repos that are severely out of date. I know of several people who would not think to check a thing like that; after all, "the OS thinks it's up to date so it must be".
I'll concede that this is not the most compelling reason though. The main reason I personally compile most of my stuff from source is for fun, though in many cases (not just web servers) there are performance and efficiency gains, especially for esoteric architectures (which I encounter from time to time).
Edited by dushan24 - 3/19/13 at 5:37am