Spent another hour reading the redis source today, and I’m beginning to feel more enlightened. I’m thankful for the comment at the top of t_zset.c that describes how sorted sets are implemented with a hash and a skiplist. Weighing that knowledge against the t_set.c code for un-sorted sets, it’s now much clearer how the two types are alike and how they differ.
This is all in an effort to teach redis to zinterstore sets against sorted sets. The difficulty is due to the fact that sets can be “encoded” as a hash table, OR as an intset. So the zinterstore logic needs to be taught how to deal with un-sorted sets more completely.
Specifically, it seems length calculation is where the intersection is failing due to the way quicksort is invoked on the sets being intersected. Quicksort sorts the sets in size from small to large, which gives the logic the ability to short-circuit a bunch of code if the first set has a size of 0. And since the quicksort callback doesn’t correctly report the size of un-sorted sets, we don’t get the desired result.
Teaching t_zset.c how to measure un-sorted sets isn’t going to be hard but I’d like to do it elegantly, with re-use and some abstraction if possible.