Survival in times of critical load

Posted by admin | Interesting,Useful | Monday 16 May 2011 11:36 am

Every site sooner or later hits a moment when the hardware cannot handle the current load.

The causes vary, and it is not always possible to eliminate them at the root within a reasonable time.

In such cases, the developers face the task of reducing the load with minimal disruption to visitors.

I do not claim that my idea is brilliant, but I hope it can help someone.

In my case the problem is that the load on the database rises sharply from time to time.

As usually happens in such cases, an avalanche follows: requests start to slow down, users get nervous and press reload, and the server "hangs".

In addition, search engines make things worse: they tend to crawl old pages of the site that are normally not in the cache.

As a result, the pages for them are generated and the cache is filled with unnecessary data.

The second part of the problem is easy enough to handle: simply do not cache requests for old pages.

But it is quite possible to go further and shed the load generated by robots altogether, at least temporarily while the server is overloaded.

I faced two problems:

1. How to detect the moment of increased load
2. What to do to make life easier at that moment

To solve the first problem I used the well-known miracle tool zabbix, which I have long been using actively and successfully to monitor multiple servers.

For my case I chose the load average of the database server as the metric.

The reaction, however, has to happen on another server, the one where nginx runs.

I created a trigger with the condition system.cpu.load[,avg1].last(0)>3.5 and attached a remote command to it:

HOST: /path/lowerage.sh {STATUS} {ITEM.LASTVALUE} {TIME}
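
The trigger condition boils down to a floating-point comparison of the 1-minute load average against 3.5. As a rough sketch, the same check can be reproduced in shell, for example to sanity-check the {ITEM.LASTVALUE} argument that zabbix passes to the remote command. The helper name load_exceeds is hypothetical and not part of zabbix; awk is used because the shell's test command compares integers only.

```shell
#!/bin/sh
# Hypothetical helper, not part of zabbix: succeeds when load > limit.
# $1 = load value (e.g. the {ITEM.LASTVALUE} argument), $2 = threshold
load_exceeds() {
    # awk does the floating-point comparison; adding 0 forces numeric values.
    # Exit status 0 means "load exceeds the limit".
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load + 0 > limit + 0) }'
}

if load_exceeds 4.2 3.5; then
    echo "overloaded"
else
    echo "normal"
fi
```

With the sample values 4.2 and 3.5 the check reports the overloaded state.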

I put a simple script on that server:

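#!/bin/sh
# Called by zabbix: $1 = trigger status, $2 = last load value, $3 = time
if [ "${1}" != "ON" ] ; then
    # Load is back to normal: remove the flag (no error if it is already gone)
    /bin/rm -f /tmp/cpu_load_high
else
    # Overload: create the flag file
    /usr/bin/touch /tmp/cpu_load_high
fi
# Report every state change by mail
echo "lowerage ${1} ${2} ${3}" | /usr/bin/mail -s lowerage tmp@tmpmail.ru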

As a result, a flag file exists for the duration of the overload, and anything else on the server can check for it.
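
Before wiring the script to zabbix, the ON/OFF handling can be tried by hand. The following is only a sketch: set_flag is a hypothetical function that repeats the flag logic without sending mail, and it works on a temporary path rather than on the real /tmp/cpu_load_high.

```shell
#!/bin/sh
# Hypothetical dry-run of the flag logic (no mail is sent).
# $1 = trigger status as passed by zabbix, $2 = flag file path
set_flag() {
    if [ "$1" != "ON" ]; then
        rm -f "$2"      # anything other than ON clears the flag
    else
        touch "$2"      # ON raises the flag
    fi
}

# Temporary path used only for this dry-run
FLAG="${TMPDIR:-/tmp}/cpu_load_high.dryrun.$$"

set_flag ON "$FLAG"
if [ -e "$FLAG" ]; then echo "flag present after ON"; fi

set_flag OFF "$FLAG"
if [ ! -e "$FLAG" ]; then echo "flag absent after OFF"; fi
```

Note that any status other than ON removes the flag, so an unexpected value fails towards normal operation.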

Now on to the actions themselves.

First, I stopped cron scripts from running during the critical period. The solution is trivial: a condition is added to the crontab entry. Example:

/bin/test ! -r /tmp/cpu_load_high && /usr/bin/fetch -o - site/cron.php
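
If many cron entries need the same guard, the condition could be moved into a small wrapper. This is a sketch under assumptions: run_unless_overloaded is a hypothetical name, and the FLAG variable defaults to the flag file used above but can be overridden for testing.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command only when the overload flag is absent.
# FLAG defaults to the flag file from the article; override it for testing.
FLAG="${FLAG:-/tmp/cpu_load_high}"

run_unless_overloaded() {
    if [ -r "$FLAG" ]; then
        # Overload in progress: skip the job quietly, report on stderr
        echo "skipped: overload flag $FLAG is present" >&2
        return 0
    fi
    # No flag: run the command exactly as given
    "$@"
}

# Example with a harmless command in place of the real cron job
run_unless_overloaded echo "cron job would run now"
```

In a crontab the real job would take the place of the echo, for instance run_unless_overloaded /usr/bin/fetch -o - site/cron.php. Returning 0 when the job is skipped is a deliberate choice here, so that cron does not report a skipped run as a failure.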

Then I started gently squeezing the search engines.

I decided to cut them off at the nginx level during the critical period.

That last part required some ingenuity: nginx does not support nested if blocks, and I need at least two conditions. Still, I managed to do it.

Below is the piece of the configuration file that implements the above:

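# Initialize both flags so nginx does not log uninitialized-variable warnings
set $troubleflag "";
set $oblomflag "";
# T when the overload flag file exists
if (-f /tmp/cpu_load_high) {
    set $troubleflag T;
}
# Y for search engine robots, followed by T during overload
if ($http_user_agent ~ (?:Yandex|Google|Yahoo|Rambler|msnbot) ) {
    set $oblomflag Y$troubleflag;
}
# Both conditions hold: close the connection
if ($oblomflag = YT) {
    return 444;
}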
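
The trick is that two independent conditions are folded into one string, which a single if can then compare. The same logic can be traced outside nginx with a shell emulation. This is only an illustration: decide is a hypothetical helper, and the flag path is passed as an argument so that the real /tmp/cpu_load_high is not touched.

```shell
#!/bin/sh
# Shell emulation of the nginx flag trick; decide is a hypothetical helper.
# $1 = User-Agent string, $2 = path of the overload flag file
decide() {
    troubleflag=""
    oblomflag=""
    # First condition: the overload flag file exists
    if [ -f "$2" ]; then
        troubleflag="T"
    fi
    # Second condition: the User-Agent belongs to a search engine robot
    case "$1" in
        *Yandex*|*Google*|*Yahoo*|*Rambler*|*msnbot*)
            oblomflag="Y$troubleflag" ;;
    esac
    # Only the combined value YT means "robot during overload"
    if [ "$oblomflag" = "YT" ]; then
        echo 444
    else
        echo pass
    fi
}

# Trace a robot request against a temporary flag file
DEMO_FLAG="${TMPDIR:-/tmp}/decide_demo.$$"
touch "$DEMO_FLAG"
decide "Mozilla/5.0 (compatible; Googlebot/2.1)" "$DEMO_FLAG"
rm -f "$DEMO_FLAG"
```

Against the live server the behavior could be checked with a request that carries a robot User-Agent, for example with curl -A; while the flag file exists, such a request should end with an empty reply instead of a page.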

By the way, I am not sure that returning 444 is the best option. With this code nginx simply closes the connection without sending a response. I believe robots should not be too offended by such behavior.
