Surviving the reddit hug
Preparing for a reddit hug is a waste of time. You should never do that if you
 - have just set up your blog in the middle of nowhere. No one knows your site yet.
 - are a hobbyist developer
 - don't have a marketing team
 - have no plans to advertise your site
 - find that the only visitors in your server log are Google / bing / Yandex / 360Spider / *bot*

You should prepare for a reddit hug if you
 - have nothing else to do
 - want to go hardcore
 - are a masochistic developer hosting a server at home on a regular PC


So, how do you survive a reddit hug?

The Story

Say your blog is just a personal site you use to jot down your life and some useful notes for your future self...

SUDDENLY YOU GET REDDIT HUGGED?!

Now your server is down. It just couldn't handle the enormous traffic, because someone decided to share a picture of lovely Voistitoevitz. You try to bring it back up, but it goes down again shortly after. You feel both happy and sad: you can't keep your site up, and there are so many people waiting to see Voistitoevitz!

Days later, your site is back up. But the surge has already passed. There are only a few visits. Good ol' Google and bing are back. It's like a dream bubble: it popped, and nothing really happened.

You are alone, again.

===

So that's totally not the reason I wanted to implement this feature, by the way. I just like making enterprise-level stuff for tiny use cases. OVERKILL IS THE BEST KILL!

The C10k problem

To survive a reddit hug, we need to solve the C10k problem. As you know, node.js is already event-driven, so it should be able to handle 10k connections, right?

Well, not quite.

There are several things between your node.js runtime and the client's browser.

First is the ISP cap: the ISP might cap your connection because it's a regular home plan ( !! ). Generally you are very *unlikely* to get 10k simultaneous connections within an hour anyway, so capping the connection is pointless, but it is *very likely* you are hooked into a much smaller stack that has been rusting in some remote location for god knows how many decades!

Second are the network adapter, the routers and the modem. In my experience, the routers are the likely bottleneck, because TP-LINK is just being TP-LINK.

So fuck TP-LINK, I'm using a switch and a standalone AP.

Third is the server configuration. If you are using KVM, you should be aware of the routing configuration.

But if you are using the *cloud*, you don't need to care about any of that, because those problems are theirs!

Server spec

I've allocated only 2 CPUs & 512MB of RAM for the web machine ( the node.js runtime ), and 1 CPU & 512MB of RAM for the database machine.

Generally, it is not that you can't serve 10k connections because of the shitty server spec. You can't serve 10k connections because of the way the program handles them. Running a loop like
for( var i = 0; i < 10000; i ++ ) console.log( "<html> MyWebSite </html>" );

shouldn't take long for any program. In fact, it doesn't. Using this simple c10k-server.js snippet, you can actually see the machine handle it quite easily:
var http = require('http');
var i = 0;
http.createServer(function (req, res) {
    res.writeHead(200, {'Content-Type': 'text/plain'});
    res.end('Hello World\n');
    console.log( "here: " + ( i ++ ) );
}).listen( 12345, "0.0.0.0" );
console.log( 'Server running at http://0.0.0.0:12345/' );
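( To see it for yourself, run the snippet with node c10k-server.js and point ApacheBench at it with roughly the same flags used in the real benchmark below, e.g. ab -k -c 1000 -n 10000 http://127.0.0.1:12345/ — the console counter is only there so you can watch the requests fly by. )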
So why can't your full-fledged website do the same?

The wait

Because it is waiting! Even with an event-driven runtime, you can't eliminate the fact that it is waiting for the database connection, waiting for the query results to come back, waiting to read the HTML template file, waiting for the article to be paginated.

Technically it might be faster if you do these things simultaneously, say fetching the template and sending the queries at the same time instead of one after the other.
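A minimal sketch of that idea, with hypothetical readTemplate() and queryArticles() stand-ins for the real work ( each just returns a Promise ); the only difference between the two versions is whether the waits overlap:
// hypothetical stand-ins for the real waits; each returns a Promise
function readTemplate( name ) {
    return new Promise( function ( resolve ) {
        setTimeout( function () { resolve( "<html> " + name + " </html>" ); }, 100 );
    } );
}
function queryArticles( page ) {
    return new Promise( function ( resolve ) {
        setTimeout( function () { resolve( [ "article 1", "article 2" ] ); }, 100 );
    } );
}

// One at a time: the second wait doesn't start until the first finishes ( ~200ms total )
readTemplate( "index" ).then( function ( template ) {
    return queryArticles( 1 ).then( function ( articles ) {
        console.log( "sequential done" );
    } );
} );

// At the same time: the total wait is only the slower of the two ( ~100ms total )
Promise.all( [ readTemplate( "index" ), queryArticles( 1 ) ] ).then( function ( results ) {
    var template = results[ 0 ], articles = results[ 1 ];
    console.log( "parallel done" );
} );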

But that still doesn't solve the C10k problem, because gathering the data takes time. There WILL be overhead for 10k connections!

Cache

The only way I can think of, other than beefing up the server ( sometimes you do need a good server, because some services only ever serve dynamic data, such as WhatsApp ), is caching the data.

In theory, a 10,000-iteration for loop doesn't hurt. Doing
// shared resources
var cache = { "index": "<html> My home page </html>" };

// Connections begin here
for( var i = 0; i < 10000; i ++ )
{
	console.log( cache[ "index" ] );
}
shouldn't hurt either. ( Of course there will need to be a mechanism to invalidate the cache and such. But it-should-not-hurt. )
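Here's a minimal sketch of how that could sit in front of the toy server above, assuming a hypothetical renderIndex() that does the slow template + database work; the invalidate() helper is likewise just an assumption about how a cached page would be cleared when a post changes:
var http = require( 'http' );

// shared in-memory page cache
var cache = {};

// hypothetical slow path: templates, queries, pagination...
function renderIndex( callback )
{
    setTimeout( function () { callback( "<html> My home page </html>" ); }, 200 );
}

// drop a cached page when its content changes, e.g. after editing a post
function invalidate( key )
{
    delete cache[ key ];
}

http.createServer( function ( req, res ) {
    res.writeHead( 200, { 'Content-Type': 'text/html' } );

    if( cache[ "index" ] !== undefined )
    {
        // cache hit: nothing to wait for
        res.end( cache[ "index" ] );
        return;
    }

    // cache miss: do the slow work once, then remember the result
    renderIndex( function ( html ) {
        cache[ "index" ] = html;
        res.end( html );
    } );
} ).listen( 12345, "0.0.0.0" );
Only the very first visitor ( or the first one after an invalidate ) pays the rendering cost; everyone behind them is served straight from memory.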

Let's do this!

Results

Testing a real-world scenario: two different machines in different areas, i.e. across different ISPs.

Because of the ISP cap, I couldn't actually push 10k connections; some connections just get randomly dropped by them. But I think I could get to around 1,000 concurrent connections here, and you can still see the drastic difference before and after.

Before the page cache was implemented, my blog would just crash if there were too many connections:
$ ab -r -k -c 1000 -n 10000 https://blog.astropenguin.net/
This is ApacheBench, Version 2.3 <$Revision: 1604373 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking blog.astropenguin.net (be patient)
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
SSL handshake failed (5).
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
Completed 10000 requests
Finished 10000 requests


Server Software:        Apache/2.4.10
Server Hostname:        blog.astropenguin.net
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256

Document Path:          /
Document Length:        18236 bytes

Concurrency Level:      1000
Time taken for tests:   100.228 seconds
Complete requests:      10000
Failed requests:        7516
   (Connect: 0, Receive: 0, Length: 7516, Exceptions: 0)
Non-2xx responses:      7288
Keep-Alive requests:    2616
Total transferred:      50230448 bytes
HTML transferred:       48165552 bytes
Requests per second:    99.77 [#/sec] (mean)
Time per request:       10022.805 [ms] (mean)
Time per request:       10.023 [ms] (mean, across all concurrent requests)
Transfer rate:          489.42 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0 1433 9469.1    215   92580
Processing:    15 1363 3857.1     53   28077
Waiting:        0 1037 3076.4     50   25263
Total:         15 2795 10174.3    300   93579

Percentage of the requests served within a certain time (ms)
  50%    300
  66%    399
  75%   1069
  80%   1811
  90%   5878
  95%  11942
  98%  24983
  99%  75332
 100%  93579 (longest request)
As you can see, it crashed after around 2,616 connections: 7,288 of the 10,000 requests came back as non-2xx responses.

Now with the cache implemented:
$ ab -r -k -c 1000 -n 10000 https://blog.astropenguin.net/
This is ApacheBench, Version 2.3 <$Revision: 1604373 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking blog.astropenguin.net (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
Completed 8000 requests
Completed 9000 requests
SSL handshake failed (5).
Completed 10000 requests
Finished 10000 requests


Server Software:        Apache/2.4.10
Server Hostname:        blog.astropenguin.net
Server Port:            443
SSL/TLS Protocol:       TLSv1.2,ECDHE-RSA-AES256-GCM-SHA384,2048,256

Document Path:          /
Document Length:        18232 bytes

Concurrency Level:      1000
Time taken for tests:   26.143 seconds
Complete requests:      10000
Failed requests:        1
   (Connect: 0, Receive: 0, Length: 1, Exceptions: 0)
Keep-Alive requests:    9
Total transferred:      183381454 bytes
HTML transferred:       182309960 bytes
Requests per second:    382.51 [#/sec] (mean)
Time per request:       2614.335 [ms] (mean)
Time per request:       2.614 [ms] (mean, across all concurrent requests)
Transfer rate:          6850.06 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0 1452 2228.0    646   17026
Processing:    18  138 104.3    143    5552
Waiting:       15   74  55.2     68    1810
Total:         18 1590 2229.4    774   17245

Percentage of the requests served within a certain time (ms)
  50%    774
  66%   1206
  75%   1556
  80%   1670
  90%   3576
  95%   6520
  98%   8756
  99%  13835
 100%  17245 (longest request)
It's super duper fast!

Hell, this method is even better than generating static HTML, because the entire thing is stored in memory and ready to serve at any time.

Now I am immune to the reddit hug. How pointless!
斟酌 鵬兄
Fri Oct 14 2016 17:15:04 GMT+0000 (Coordinated Universal Time)
Last modified: Fri Oct 14 2016 17:28:44 GMT+0000 (Coordinated Universal Time)