
nginx or apache: which and reasons why?

#1 ·
I'm currently tasked with setting up a new dedicated server for a client.

They are one of those people who don't know a whole lot about computers, but they have that <insert friend/family member here> who took a computer class in 1978 and knows how to use Google...

So according to this "person", I should configure this guy's dedicated server with nginx instead of Apache. I've done some research on Apache vs nginx, but most of what I can find is one camp beating its drum louder than the other.

From the concrete evidence I can find, I would make these assumptions about nginx:

It's event-based, meaning it uses less RAM than Apache in certain circumstances.

It's better than Apache at serving static content under high connection counts.

Less performance degradation than Apache when serving static pages.

Less documentation than Apache, fewer mods than Apache.

Apache:

Tried and true

Lots of mods

Highly documented

I know Apache quite well

Scalability

About the guy's site:

Very little, if any, of the site will be "static". Most of it will be generated dynamically via a PHP framework and a MySQL database (using a cache system, more than likely memcache). It is expected to have about 3k to 6k members browsing the site at a time on average. We will be using a CDN to deliver images and JavaScript.

Now, everything I've read and know says Apache is better suited for this task than nginx. Does any sys/server admin have experience with nginx in an environment where the site is rendered almost completely dynamically, using a cache and CDN, compared to the same setup running on Apache?

The server is quite powerful, so resources aren't much of an issue atm. I do want to provide the client with the best solution. I just always take "this <insert friend/family member> knows a thing or two about these things, and they have said..." type statements with a grain of salt, and when my own research doesn't return (imho) valid results to support such a claim, I usually stick with what I know. But something has told me to delve deeper into the subject, so if anyone can provide any factual information, I would appreciate it.
 
#2 ·
They're both great web servers. So I think your decision boils down to one of two things:
Would you rather have a low-footprint server or the one with the most online tutorials?

By the sounds of it, I think you'd opt for the latter, otherwise you'd have done your own thing rather than asking on a forum. But in all honesty, you can't go far wrong with either of them.

[edit]

Just spotted that you already know Apache quite well. In that case just stick with what you know. :)
 
#4 ·
Yeah, I figured as much, that there wasn't a huge amount of difference between the two. I'm sure they both suit their purposes. I'm not particularly tied to Apache by any means; it's just the one I know better because it's the one I've used the most. I will probably set up a vbox later and test out nginx on my own, but it will probably be a while before I consider running it on a production server, until I find out more about it.
 
#5 ·
I would go with Apache too! There are hundreds and hundreds of tutorials, and it's very tweakable. There are no real reasons not to choose it... and if you know Apache better than the other, well... :)
Just take it! I used to do web server development on Linux, and it was mainly Apache I was working on. I would take Apache without hesitation. :)
 
#10 ·
Quote:
Originally Posted by Transhour View Post

Very little if any of the site will be "static", Most of the site will be generated dynamically via a php framework and mysql database (using a cache system, more than likely memcache).
Cached content is static until it is invalidated by updates. It's possible to use Nginx as a front end webserver which will serve from the cache and hand off the request to Apache if the cache doesn't contain the content requested. Of course if your pages are constantly changing then this is a bad option.
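
For anyone who hasn't seen this kind of setup, here is a minimal Python sketch of the request flow described above. It is a toy model, not nginx or Apache configuration: the `CachingFrontEnd` class, the `render_page` function and the in-memory dict are all hypothetical stand-ins, assuming the front end keeps whole rendered pages keyed by path.

```python
from typing import Callable, Dict


class CachingFrontEnd:
    """Toy model of a caching front end sitting in front of a dynamic backend."""

    def __init__(self, backend: Callable[[str], str]) -> None:
        self._backend = backend           # stands in for Apache + PHP + MySQL
        self._cache: Dict[str, str] = {}  # stands in for the front end's page cache
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> str:
        """Serve from the cache if possible, otherwise hand off to the backend."""
        if path in self._cache:
            self.hits += 1
            return self._cache[path]
        self.misses += 1
        body = self._backend(path)
        self._cache[path] = body
        return body

    def invalidate(self, path: str) -> None:
        """Drop a cached page, e.g. after the underlying content was updated."""
        self._cache.pop(path, None)


def render_page(path: str) -> str:
    """Hypothetical dynamic renderer; a real one would run PHP and query the DB."""
    return f"<html><body>rendered {path}</body></html>"


if __name__ == "__main__":
    front = CachingFrontEnd(render_page)
    front.get("/forum/1")         # miss: goes to the backend
    front.get("/forum/1")         # hit: served straight from the cache
    front.invalidate("/forum/1")  # content changed
    front.get("/forum/1")         # miss again after the update
    print(front.hits, front.misses)  # prints "1 2"
```

The more often `invalidate` has to be called, the fewer requests ever reach the cheap path, which is the "constantly changing pages" caveat above.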
 
#11 ·
Use apache if:
1) Tutorials are needed (AKA inexperienced client wants to modify it)
2) It requires .htaccess
3) You don't mind setting up a caching system such as Varnish

Apache is very time-tested and is the default for many things. Tons of tutorials, lots of flexibility. Probably the best choice unless you know otherwise.
Nginx is great for static content. In fact, you could run it as the front-facing server and have Apache do the back-end work. I know of a few that do this and it's very fast.
LiteSpeed should be considered too. It's a drop-in replacement for Apache and can have speed benefits over Apache. It supports .htaccess too.
 
#12 ·
Quote:
Originally Posted by randomizer View Post

Cached content is static until it is invalidated by updates. It's possible to use Nginx as a front end webserver which will serve from the cache and hand off the request to Apache if the cache doesn't contain the content requested. Of course if your pages are constantly changing then this is a bad option.
I agree with this.
 
#13 ·
I was also going to mention Varnish...

Anyhow, another vote for Apache, compile it from source if you want better performance or don't need certain modules.

Also, just to stir the pot, my friend swears by lighttpd. No one has mentioned it yet.
 
#14 ·
Quote:
Originally Posted by dushan24 View Post

Anyhow, another vote for Apache, compile it from source if you want better performance or don't need certain modules.
Terrible advice. It's absolutely pointless, long winded, needlessly complicated, can introduce a large number of faults, and can potentially make apache less secure if you don't know what you're doing.

If there are modules you don't want, then just comment out the shared object in your httpd.conf. That's literally all you need to do. If your pages are still slow, then it's either because you've written crap server-side code (PHP / Perl / etc) or you've badly configured the language VM (eg using non-cached CGI instead of cached binaries).
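
As a rough illustration of the "comment out the shared object" step, here is a small Python sketch that prefixes the matching `LoadModule` lines with `#`. The `disable_modules` helper and the sample config text are made up for the example; only the `LoadModule <identifier> <file>` line format is taken from Apache itself. In practice you would just edit httpd.conf by hand and reload.

```python
import re
from typing import Iterable


def disable_modules(conf_text: str, modules: Iterable[str]) -> str:
    """Comment out LoadModule lines whose module identifier is in `modules`."""
    unwanted = set(modules)
    out = []
    for line in conf_text.splitlines():
        # Matches e.g. "LoadModule status_module modules/mod_status.so".
        # Lines that are already commented out start with "#" and do not match.
        match = re.match(r"\s*LoadModule\s+(\S+)\s+\S+", line)
        if match and match.group(1) in unwanted:
            line = "#" + line
        out.append(line)
    return "\n".join(out)


# Illustrative fragment only; a real httpd.conf lists many more modules.
SAMPLE = """\
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule status_module modules/mod_status.so
LoadModule autoindex_module modules/mod_autoindex.so
"""

if __name__ == "__main__":
    print(disable_modules(SAMPLE, ["status_module", "autoindex_module"]))
```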
Quote:
Originally Posted by dushan24 View Post

Also, just to stir the pot, my friend swears by lighttpd. No one has mentioned it yet.
Because it's less sophisticated than nginx (hence why quite a number of lighttpd boxes have been migrated over to nginx in recent years) and is also out of scope from this question.

Really, all these alternative HTTP daemon suggestions are pointless because the OP has already said he has experience in Apache, the resources to run Apache and would rather host Apache unless nginx -specifically- has any significant advantages. While nginx is a fantastic bit of kit, it doesn't have any significant advantages in this specific scenario. And all these other suggestions make me wonder if you guys even read the brief or if you're just showing off a little knowledge of Linux servers by listing off a number of irrelevant suggestions. Sorry if this sounds harsh / judgmental, but I see this time and time again on forums where members get a little over enthusiastic and end up spec'ing their own systems instead of answering the OP's questions.
 
#15 ·
Quote:
Originally Posted by randomizer View Post

Cached content is static until it is invalidated by updates. It's possible to use Nginx as a front end webserver which will serve from the cache and hand off the request to Apache if the cache doesn't contain the content requested. Of course if your pages are constantly changing then this is a bad option.
I'm curious how this is better. It would require maintaining two HTTP daemons, eating up resources and chewing through cycles to determine which one to use for the requested data.

We've gone ahead with Apache for this server. Something interesting did come from this though: we discovered the benefits of MariaDB over MySQL. On average it is 15 to 25% faster than MySQL in most cases, and uses about 15% less resources when querying large amounts of data. It's a bit tricky to get it to work well with an SSL cert; something is wonky with the handshake. I've got one of the guys, our DB administrator, looking into it. He says it's configured correctly; not sure why it flakes out at times.

Oh, and the other good thing that has come from this project: CodeIgniter :). We've been looking for something to replace our in-house framework for a while, and I think we've found it with CodeIgniter :).
 
#16 ·
Quote:
Originally Posted by Transhour View Post

I'm curious how this is better, it would require maintenance on two http daemons, eating up resources, and chewing thru cycles to determine which one to use for the called upon data.
Cached pages load quicker than dynamic pages. Massively quicker because you're just serving up a static page instead of calling the various scripting handlers, making database calls and so forth.

As to how you implement cache, well that's entirely up to you. There's as many different ways as there are programming languages.
 
#17 ·
Quote:
Originally Posted by Plan9 View Post

Cached pages load quicker than dynamic pages. Massively quicker because you're just serving up a static page instead of calling the various scripting handlers, making database calls and so forth.

As to how you implement cache, well that's entirely up to you. There's as many different ways as there are programming languages.
I know how caches work. What I was asking is how setting up nginx to handle the cached pages, with Apache handling the non-cached/new content, is going to be "better" than just letting one or the other do all of it. I've never seen a setup like this before, so I honestly would be at a loss to even approach something like this.
 
#18 ·
Quote:
Originally Posted by Plan9 View Post

Terrible advice. It's absolutely pointless, long winded, needlessly complicated, can introduce a large number of faults, and can potentially make apache less secure if you don't know what you're doing.
Let's agree to disagree.
Quote:
Originally Posted by Plan9 View Post

If there's modules you don't want, then just comment out the shared object from your httpd.conf. That's literally all you need to do.
Granted, that is true.
Quote:
Originally Posted by Plan9 View Post

And all these other suggestions make me wonder if you guys even read the brief or if you're just showing off a little knowledge of Linux servers by listing off a number of irrelevant suggestions. Sorry if this sounds harsh / judgmental, but I see this time and time again on forums where members get a little over enthusiastic and end up spec'ing their own systems instead of answering the OP's questions.
I see this from time to time as well, but it is never my intention to do so, I am simply putting forward my own ideas and those of others such that we can have an interesting and relevant debate.
 
#19 ·
Quote:
Originally Posted by Transhour View Post

I know how caches work, what I was asking, how is setting up nginx to handle the cached pages and apache handling the non-cache/new content pages, going to be "better" than just letting one or the other do all of it. I've never seen a setup like this before, so i honestly would be at a loss to even approach something like this.
Haha, I was puzzled when I (mis)read your post. I literally asked myself how someone who spends their entire working day with web frameworks wasn't aware of caching. :D
Sorry about that, mate.


In answer to your question, I don't really know myself. I can't see the benefit of running both nginx and Apache on the same box either (bar some fringe cases like the server I'm currently building). In your case, it would make more sense to let Apache manage the entire httpd stack.

I think where people talk about nginx in that regard is one of two things:
  1. A free load balancer (not applicable in your case as you only have 1 node)
  2. A replacement for lighttpd for static content (not applicable in your case as you're going to run Apache anyway, so you'd be adding to your system's footprint by running two httpds)
Quote:
Originally Posted by dushan24 View Post

Let's agree to disagree.
I appreciate your diplomacy, but compiling and benchmarking Apache is something I do nearly every day for work (it's actually one of my main roles as the sys admin for a smallish datacenter). I've found that the benefits of compiling Apache yourself are minimal and vastly outweighed by the cons:
  • Apache is one of those bits of software where keeping up with patches is paramount. It's significantly easier to do that with the repos than it is if you compile your own build of Apache.
  • To expand a little more on the former point, doing so with next to zero down time is very difficult without a web farm and load balancer (And even then I end up having to install to non-standard directories and using symlinks to perform the switch over).
  • Apache has a number of dependencies, some of which you may end up having to manually compile in addition to Apache (over the years I've run into issues with zlib and openssl, amongst others that I've since forgotten)
  • Apache has dozens of modules, some included by default, some of which are not. And you really need to know what you want before you start compiling if you want any kind of performance boost (yeah you can compile the modules as shared objects, but then you might as well just use the repos which do the same).
  • To further the above point, by compiling in the modules, you're making it massively harder to disable unused modules (there's a reason why even the Apache devs state that using shared objects is the preferred method for compiling Apache)
  • You're making it dead easy to miss security modules like user sandboxing, which are not compiled in by default.
  • You're responsible for your own testing (STABLE repos will have tested your Zend / mod_perl / etc VMs against Apache - when I've compiled my own I've found bugs that led to segfaults in Apache threads (which led to pages not loading) and those had to be fixed by myself).

Then let's look at how Apache works:
  • All you're doing by compiling in your own modules is speeding up the thread creation time, but as threads can handle hundreds of simultaneous HTTP requests and can sit idle when there's no traffic, you're really not going to see any performance increase there unless your servers are hammered (and I mean > thousands of hits a minute) for hours on end. A personal website server would see zero performance increase.
  • Most of the heavy processing in Apache happens in the language interpreters / VMs (eg mod_perl, Zend (PHP), Tomcat (Java servlets), etc). Compiling Apache would have zero performance gains with those VMs.

People really interested in Apache's performance should really be looking at the following:
  • Most people configure their language VMs with sub optimal settings (eg enabling ENVs in mod_perl, not scanning through the php.ini for optimizations, etc) or even using the wrong handlers entirely (eg CGI instead of bespoke mod_perl handlers). These will have a significantly bigger performance impact than compiling your own build of Apache.
  • Also, there are many different interpreters for a lot of different languages. eg for PHP there's Zend, which is the most popular PHP engine, but it isn't the fastest. HipHop (Facebook's creation) is supposed to offer performance increases of up to 50%.
  • But most of the time it's the web developers who have created those bottlenecks (needlessly heavy modularised code - as seen with many popular CMSs, inefficient SQL requests, etc)
  • Sometimes Apache itself hasn't been configured optimally (inefficient threading ratios for that server's hardware, etc). That's a lot harder to configure right though - and often is just trial and error with load testing to benchmark each tweak.
  • Then you have external configuration; (eg static pages held on slower storage mediums, database server not able to handle capacity or badly configured SQL connection pooling, etc). So many people overlook the obvious there. Sometimes even just storing cache on RAM disk can improve the responsiveness of sites.
  • And, if you're really obsessed with going low level, there's a lot that can be done to streamline Linux's TCP/IP stack for web serving and you'd see bigger improvements than compiling Apache manually. If you're running iptables, I've read that there are a few tweaks that can be done there as well (though I haven't personally tested those tweaks as we have dedicated hardware firewalls).
  • Also, what version of the Linux kernel are you running? Some of the newer versions have TCP/IP cookies that are designed to significantly reduce the footprint of TCP handshakes, very handy for web servers (sadly that specific patch requires client support as well).
  • and lastly, are you even compressing the bloody data (mod_deflate, minified CSS and Javascript, etc)? So many developers and sys admins don't even bother with this step yet it's essentially a free pass for faster page load times.
So, with the greatest of respect, telling someone to compile their own instance of Apache is just terrible advice. It overlooks the mass of performance gains that can be achieved without compiling, introduces a number of new security and potential stability issues, and diverts people's valuable free time from focusing on the real bottlenecks within Apache.
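
On the compression point in the list above, here is a quick Python sketch of why it is close to a free win. It uses the standard library `gzip` module as a stand-in for what mod_deflate does on the wire; the `compression_ratio` helper and the `PAGE` payload are invented for the example, and real pages will compress less dramatically than this deliberately repetitive one.

```python
import gzip


def compression_ratio(payload: bytes, level: int = 6) -> float:
    """Return compressed size divided by original size for a payload."""
    if not payload:
        raise ValueError("payload must not be empty")
    return len(gzip.compress(payload, compresslevel=level)) / len(payload)


# Hypothetical page: markup is repetitive, which is why it compresses well.
PAGE = (
    "<html><body>"
    + "<div class='post'><p>Hello, forum!</p></div>" * 200
    + "</body></html>"
).encode("utf-8")

if __name__ == "__main__":
    ratio = compression_ratio(PAGE)
    print(f"original: {len(PAGE)} bytes, compressed to {ratio:.1%} of that")
```

Running the same helper over a saved copy of one of your own pages is a cheap way to see what you would gain before touching the server config.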

Honestly, if we didn't run a bespoke set up in my current work place, even I wouldn't bother compiling Apache. :)


[edit]
Sorry if there's spelling or grammar mistakes, or even if bits read really badly. It's a long post and I should be working (coincidentally, I'm writing a new Nagios plugin to monitor some Apache upgrades as the last few I ran created unexpected conflicts with some existing software) so shamefully I've not bothered to read this back.
 
#20 ·
Quote:
Originally Posted by Transhour View Post

I know how caches work, what I was asking, how is setting up nginx to handle the cached pages and apache handling the non-cache/new content pages, going to be "better" than just letting one or the other do all of it. I've never seen a setup like this before, so i honestly would be at a loss to even approach something like this.
Because you are playing to the strengths of two webservers instead of the strengths and "weaknesses" of one. There is obviously an overhead for cache misses compared to simply having Apache handle the request every time, but if 90% of your requests can be served from the cache then you have a large net gain in performance because you don't have to touch your DB. However, that's also why I said that if your pages change often that this is a bad idea. If you are nearly always needing to query your DB anyway then you're simply adding performance and maintenance overhead.
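
The trade-off in the paragraph above can be put into rough numbers. This is only a back-of-the-envelope sketch: the `expected_latency_ms` function is hypothetical, and the timings in the demo (2 ms for a hit, 100 ms for a dynamic page, 1 ms for the cache lookup) are made-up assumptions, not measurements.

```python
def expected_latency_ms(
    hit_ratio: float, hit_ms: float, miss_ms: float, lookup_ms: float = 0.0
) -> float:
    """Average time per request for a cache in front of a dynamic backend.

    Every request pays lookup_ms for the cache check; hits then cost hit_ms
    and misses cost miss_ms (the full dynamic render).
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return lookup_ms + hit_ratio * hit_ms + (1.0 - hit_ratio) * miss_ms


if __name__ == "__main__":
    # Assumed timings: 2 ms cached, 100 ms dynamic, 1 ms to check the cache.
    for ratio in (0.9, 0.5, 0.0):
        avg = expected_latency_ms(ratio, hit_ms=2.0, miss_ms=100.0, lookup_ms=1.0)
        print(f"hit ratio {ratio:.0%}: about {avg:.1f} ms per request")
```

With these assumed numbers a 90% hit ratio cuts the average to a fraction of the uncached 100 ms, while a 0% hit ratio ends up slightly worse than having no cache at all, because every request still pays for the lookup.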

I worked on a site last year that desperately needed something like this (it was built poorly from the start so performance was bad). Unfortunately, while all assets were static (images, styles etc), all but the home page required queries to the DB to update displayed data, and that was what performed the worst (well, that and the bucket loads of JavaScript attached to the stupidest events).

If you do some Googling you will find some people have seen performance improvements of more than an order of magnitude with this setup, but you should always consider your specific situation (as you clearly are). I am not recommending it, I'm just throwing out another option to consider.
 
#21 ·
Quote:
Originally Posted by randomizer View Post

Because you are playing to the strengths of two webservers instead of the strengths and "weaknesses" of one. There is obviously an overhead for cache misses compared to simply having Apache handle the request every time, but if 90% of your requests can be served from the cache then you have a large net gain in performance because you don't have to touch your DB. However, that's also why I said that if your pages change often that this is a bad idea. If you are nearly always needing to query your DB anyway then you're simply adding performance and maintenance overhead.

I worked on a site last year that desperately needed something like this (it was built poorly from the start so performance was bad). Unfortunately, while all assets were static (images, styles etc), all but the home page required queries to the DB to update displayed data, and that was what performed the worst (well, that and the bucket loads of JavaScript attached to the stupidest events).

If you do some Googling you will find some people have seen performance improvements of more than an order of magnitude with this setup, but you should always consider your specific situation (as you clearly are). I am not recommending it, I'm just throwing out another option to consider.
Apache itself doesn't touch the DB (that would be your PHP / etc framework), so you can still use Apache for cache and not call any databases. (in fact, that is exactly what we do on our web sites where I work)
 
#22 ·
Sorry, yes, I worded that pretty badly (time for bed methinks :) ). However, if you're hitting a DB then you're not serving static content, ergo you would not want to use Nginx but Apache. That was what I was implying, it just got lost in my ramblings.
 
#23 ·
Quote:
Originally Posted by Plan9 View Post

I appreciate your diplomacy, but compiling and benchmarking Apache is something I do nearly every day for work (it's actually one of my main roles as the sys admin for a smallish datacenter). I've found that the benefits of compiling Apache yourself are minimal and vastly outweighed by the cons: [...]
I found your post very informative and appreciate the effort you went to in creating it +REP

I will confess that my knowledge of Linux does not stand up to yours, so people should probably trust in your opinion over mine.

I will further research the subject.

However, another advantage to compiling a basic Apache install from source rather than installing via a package manager is that many distros have repos that are severely out of date, and I know of several people who would not think to check a thing like that. After all, "the OS thinks it's up to date so it must be".

I'll concede that this is not the most compelling reason though, the main reason I personally compile most of my stuff from source is for fun, though in many cases (not just web servers) there are performance and efficiency gains, especially for esoteric architectures (which I encounter from time to time).
 
#24 ·
Quote:
Originally Posted by dushan24 View Post

I found your post very informative and appreciate the effort you went to in creating it +REP

I will confess that my knowledge of Linux does not stand up to yours, so people should probably trust in your opinion over mine.

I will further research the subject.

However, another advantage to compiling a basic Apache install from source rather than installing via a package manager is that many distros have repos that are severely out of date and I know of several people who would not think to check a thing like that, after all, "the OS thinks it's up to date so it must be"

I'll concede that this is not the most compelling reason though, the main reason I personally compile most of my stuff from source is for fun, though in many cases (not just web servers) there are performance and efficiency gains, especially for esoteric architectures (which I encounter from time to time).
Thanks mate :)


With regards to package versions, Apache is one of those packages that is up-to-date on pretty much every distro (it needs to be really). In fact Debian applied the 2.2.20* update before any other distro. So you don't even need to compile to get the latest versions - well, that is unless you want to run the test branch of Apache (odd numbered version numbers) or 2.4. But 2.4 isn't widely supported yet so best avoided (for now) and obviously test builds aren't suitable for production environments.

I do take your point about compiling your own builds for the latest features though - that argument does make sense in some instances, but thankfully Apache is well maintained on distros with even the slower of release cycles.

*the 2.2.20 update addressed a bug where packets of data could be multiplied up within Apache and form a DoS attack - so it was quite a serious security update.
Quote:
Originally Posted by randomizer View Post

Sorry, yes, I worded that pretty badly (time for bed methinks :) ). However, if you're hitting a DB then you're not serving static content, ergo you would not want to use Nginx but Apache. That was what I was implying, it just got lost in my ramblings.
Ahh, I see what you meant. Sorry mate :)
 