Wherein I recap the things I’ve done and learned this year as a software developer (mainly at work). I’ve fixed many bugs and uttered many curses. Through it all I’ve constantly strived to make the code a better place.
Amazon Web Services
Video encoding instances Launch/Start when there’s a backlog, and Stop/Terminate when there’s not. Said goodbye to Amazon SQS for job queues, and hello to Redis. Hello consistency and speed.
My first motivation for investigating redis was due to an optimization challenge at work. Our network of websites covers track and field and cross-country. We have a database of 13 million performance records, and naturally our site provides visitors with the ability to view up-to-date rankings. Currently we use Sphinx for this, but Sphinx is geared for full-text search, and was quite slow (we’ve since optimized sphinx further, which I hope to write about later).
Had the energy and focus to make more progress on my ideas for redis last night. See my previous post here if you don’t remember what those ideas were.
The only things left to do:
Fix freeSetObject() such that it frees intset and/or hash table memory that may have been allocated Fix code in rdb.c that’s responsible for loading the data from disk and writing to it when redis is shut down.
Had an idea for reducing the memory footprint of redis sets. A redis set may be encoded as an intset (if the set contains nothing but integers) or a hash table. But if a set contains integers and other values, you lose out on the benefit of intsets (speed and memory efficiency) and in some cases you have to wait while redis converts your intset to a hash table. This may not be a big penalty, but it all depends on the order in which you add items.
Pieter responded to my thread. He’s worked on compressed sorted sets recently, and while doing that he fixed the issue I was having with intsets. His code should be merged into 2.2 in a few weeks after enough people have tested. Apparently I’m the only one to report the problem with intsets.
His branch’s code is similar to my approach, in that he also extended zsetopsrc with type and encoding variables.
At Thursday night’s PDX Hackathon I worked more on the redis changes I wanted to make. I mostly completed them, and announced my progress on the mailing list that night. Seems no one was aware that zinterstore was broken for a set encoded as an “intset” (a more efficient way to store sets of integers). So yay, I found a bug! But now another programmer has alerted me he’s made progress in another area of the code that should be relevant to me, so I’ve got to check that out.
Spent another hour reading the redis source today, and I’m beginning to feel more enlightened. I’m thankful for the comment at the top of t_zset.c that describes how sorted sets are implemented with a hash and a skiplist. Weighing that knowledge against the t_set.c code for un-sorted sets, it’s now much clearer how the two types are alike and how they differ.
This is all in an effort to teach redis to zinterstore sets against sorted sets.
Yesterday, during lunch, I read through these blog posts that walk through the redis source code:
I won’t go into the details now, but while investigating redis for work I saw a use case where being able to intersect sorted and unsorted sets would be nice. Un-sorted elements would have a 0 score (even better if it were customizable). Redis doesn’t currently support this, so I hope to contribute by adding this feature.
Moved away from storing individual string values to storing whole objects/documents in redis hashes. Hashes make more sense when you need to store a group of data values like a blog post.
But I’m not making a blog. My task it to create an updated version of my personal journal and information storage system. Daily I use it to log my thoughts, to stage blog posts (like this one), keep track of links to check later (which often doesn’t happen), keep todo lists for projects, and other things.
In my quest to learn new things, specifically the following things, I’ve been both excited and drained:
node.js Redis document database haml sass a micro-framework By “new things” I really mean “some cutting edge stuff that seems pretty cool”, the “cutting edge” portion being the source of most of my struggle. New tools, languages and software have sharp edges.
Since I love fast and light software, node.js is my current casual focus.