Saturday, September 22, 2012

Boltsort for better Varnish cacheability

At Vimeo we use Varnish to cache our player traffic, and it has worked out great for us. At this point Varnish has no built-in module to sort query strings. Say I have two of these:


Obviously we want to treat the two above as one and the same:

We evaluated a few vmods, but we were not happy with their performance or stability. So I wrote boltsort, which is 2 times faster than the existing implementations and has been running nice and stable.
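To make the idea concrete, here is a minimal Python sketch of query-string normalization. It is not boltsort itself (boltsort is a Varnish vmod); the function name `sort_querystring` and the example.com URLs are hypothetical and only illustrate why sorting parameters lets two equivalent URLs map to the same cache key.

```python
from urllib.parse import urlsplit, urlunsplit


def sort_querystring(url):
    """Return url with its query parameters sorted lexicographically."""
    parts = urlsplit(url)
    # Sort the raw "key=value" pairs without decoding them, so each
    # parameter is left byte-for-byte unchanged; empty pairs are dropped.
    params = sorted(p for p in parts.query.split("&") if p)
    return urlunsplit(parts._replace(query="&".join(params)))


# Two hypothetical URLs that differ only in parameter order.
a = sort_querystring("http://example.com/video?b=2&a=1")
b = sort_querystring("http://example.com/video?a=1&b=2")
```

After normalization `a` and `b` are the same string, so a cache keyed on the URL stores a single object for both requests.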

Monday, December 12, 2011

ps returns incorrect etime

I run ps -Ao pid,pcpu,rss,etime,args to check for long-running processes. If the etime (elapsed time) of a process is greater than, say, 10 hours, I kill the process. Lately I have been seeing valid processes getting killed. I noticed in the logs that etime was returning 49710-06:28:15, which is 4294967295 seconds, or 2^32-1. Any time I see one of these magic numbers, 2^N or 2^N-1, I know something weird is going on. Turns out I was right. The procps fix states: "the ps utility's "etime" field shows the elapsed time since a process was started. On heavily-loaded systems, it was possible for this value to return negative due to an integer overflow." A value of -1 read as an unsigned 32-bit integer is exactly 2^32-1.
I didn't update procps; instead I fixed my Python script.
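The script itself is not shown here, so the following is only a sketch of what such a guard could look like. The names `etime_to_seconds` and `is_long_running`, the 10-hour default, and the choice to ignore any value at or above 2^32-1 seconds are all assumptions for illustration.

```python
import re

# 2**32 - 1 seconds: the bogus value ps printed when etime overflowed.
ETIME_OVERFLOW = 2**32 - 1

# ps prints etime as [[dd-]hh:]mm:ss.
_ETIME_RE = re.compile(r"^(?:(?:(\d+)-)?(\d+):)?(\d+):(\d+)$")


def etime_to_seconds(etime):
    """Convert a ps etime string ([[dd-]hh:]mm:ss) to seconds."""
    match = _ETIME_RE.match(etime.strip())
    if match is None:
        raise ValueError("unrecognized etime: %r" % etime)
    # Missing day/hour groups come back as None and count as zero.
    days, hours, minutes, seconds = (int(g or 0) for g in match.groups())
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def is_long_running(etime, limit_seconds=10 * 3600):
    """True if the process exceeded the limit; overflowed values are ignored."""
    elapsed = etime_to_seconds(etime)
    if elapsed >= ETIME_OVERFLOW:
        # Assumption: treat the overflow value as "unknown", not "very old".
        return False
    return elapsed > limit_seconds
```

With this check, a process reporting 49710-06:28:15 is skipped instead of killed, while one reporting 11:00:00 is still flagged.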

Saturday, December 10, 2011

Fetching EC2 on-demand instances (excluding spot instances)

The EC2 DescribeInstances API has no filter that selects only on-demand instances. This is how I do it using the Python boto library:

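from boto.ec2.connection import EC2Connection

# Credentials are read from the environment or the boto config file.
ec2 = EC2Connection()
reservations = ec2.get_all_instances(filters = {"instance-state-name":"running"})
ondemand_instances = []
for r in reservations:
    for i in r.instances:
        # Spot instances carry an instanceLifecycle attribute; on-demand ones do not.
        if not hasattr(i, "instanceLifecycle"):
            ondemand_instances.append(i)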

However, if I want to fetch only the running spot instances, there is a nice filter to do so:

reservations = ec2.get_all_instances(filters = {"instance-lifecycle":"spot",
                                               "instance-state-name":"running"})
spot_instances = []
for r in reservations:
    for i in r.instances:
        spot_instances.append(i)