My former coworker Can Duruk recently posted an excellent article covering some web development tips he picked up working at Digg.
One of his comments about pagination provoked some confusion on Hacker News.
How could, one wonders from their comfy armchair, someone who works at a so-called “internet company”
be so incompetent as to have trouble with pagination?
First, let’s take a look at Can’s comment:
> Counts are hard. Pagination is harder. Therefore, don’t show counts on pagination. Flawless logic.
Fundamentally, there are two straightforward conceptual ways to describe a starting
position in a list of content (an abstract list conceptually, but probably
rows in a database):

- an offset, which is an index into the list;
- a cursor (probably a string, but maybe a timestamp or some such), which can uniquely
describe an item to start at (or after), probably a primary key or identifier of some kind.

Both approaches would also use an integer count, which describes the quantity of things to return.
The offset approach has a lot of user-experience benefits going for it, as it allows jumping into the list
at any position, as opposed to forcing the user to scroll (perhaps endlessly!) through the content to
get to a given page. It’s also incredibly simple to paginate correctly: the next page always starts at
offset + count, and the previous page starts at offset - count.
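To make that arithmetic concrete, here is a minimal sketch of offset pagination over an in-memory list. Everything here is illustrative rather than any actual Digg code; `stories` stands in for whatever backing store you query.

```python
def offset_page(stories, offset, count):
    """Return one page plus the offsets of the adjacent pages."""
    page = stories[offset:offset + count]
    # The next page always starts at offset + count, the previous at offset - count.
    next_offset = offset + count if offset + count < len(stories) else None
    prev_offset = max(offset - count, 0) if offset > 0 else None
    return page, prev_offset, next_offset

stories = [f"story-{i}" for i in range(10)]
page, prev, nxt = offset_page(stories, 3, 3)
```

Jumping straight to, say, page five is just `offset_page(stories, 4 * count, count)`, which is exactly the user-experience win described above.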
The cursor approach does not have much going for it. It’s clunky because you can’t jump into arbitrary
positions in the list, and if you use a different cursor than the thing’s unique identifier you can confuse
the hell out of yourself, especially if they’re both formatted as timestamps prepended to uuids
(not that we did that). (We did.)
The only positive thing you can say for the
cursor approach is that, within some constraints, it’s the only thing that works.
In our case we were storing most lists of content within Cassandra,
where the data structures available are ColumnFamilies filled with Rows, which are filled with Columns.
In normal Cassandra configurations, Rows are randomly ordered to support hashing content across
a ring of nodes, and Columns within a given Row are ordered by their key. In that sense, you can
think of a ColumnFamily as an unordered dictionary with strings as keys and Rows as values, and you
can think of Rows as an ordered dictionary with strings as keys and strings as values.
(Ok, they’re probably all bytes instead of strings, hang me later.)
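To make that mental model concrete, here is a toy sketch of the layout and of cursor pagination within a single Row, using plain dicts rather than any real Cassandra client; all keys and names below are illustrative.

```python
from bisect import bisect_right

# A ColumnFamily modeled as an unordered dict of rows; each row is a
# dict of columns that we treat as ordered by column key.
column_family = {
    "user:123:stories": {             # row key (randomly placed in the ring)
        "2011-01-01:aaaa": "story-1",  # column key -> value, sorted by key
        "2011-01-02:bbbb": "story-2",
        "2011-01-03:cccc": "story-3",
        "2011-01-04:dddd": "story-4",
    },
}

def cursor_page(row_key, cursor, count):
    """Return up to `count` columns strictly after `cursor`, plus the next cursor."""
    columns = sorted(column_family[row_key].items())
    keys = [k for k, _ in columns]
    # A cursor of None means "start at the beginning".
    start = bisect_right(keys, cursor) if cursor else 0
    page = columns[start:start + count]
    next_cursor = page[-1][0] if page else None
    return page, next_cursor
```

Note there is no way to ask for "page three" here: you can only walk forward from a cursor, which is the clunkiness described above.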
Within the constraints of storing stories within Cassandra, you’re generally going to end up
using cursors instead of offsets if you’re dealing with frequent writes, which generally we
were. If you’re dealing with a low number of writes, you can do any number of asinine things
to work around the limitation. With a large number of writes, any strategy I’ve been able to think up
for index based access against Cassandra runs into the tombstone implementation.
(Who knows, maybe there are new APIs which make this stuff possible.)
In the end, we were able to move away from using cursors and standardize on offset
pagination, which we accomplished by moving our dynamically calculated storylists into Redis sorted sets.
These are a great complement to Cassandra for handling dynamically sorted content: the sorted set holds the
unique identifiers, and the stories themselves are stored in Cassandra, accessible via the unique identifiers retrieved from the Redis sorted set.
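That division of labor might be sketched like this, with plain dicts standing in for both stores; with real Redis the reads and writes would be ZADD and ZREVRANGE calls. Everything here is a hypothetical stand-in, not the actual Digg implementation.

```python
sorted_set = {}   # story_id -> score; stands in for a Redis sorted set
cassandra = {}    # story_id -> story blob; stands in for Cassandra

def add_story(story_id, score, blob):
    sorted_set[story_id] = score
    cassandra[story_id] = blob

def page(offset, count):
    """Highest-scored stories first, like ZREVRANGE key offset offset+count-1."""
    ranked = sorted(sorted_set, key=lambda sid: sorted_set[sid], reverse=True)
    # The sorted set yields ids at arbitrary offsets; the bodies come from Cassandra.
    return [cassandra[sid] for sid in ranked[offset:offset + count]]

add_story("a", 10, {"title": "first"})
add_story("b", 30, {"title": "second"})
add_story("c", 20, {"title": "third"})
```

Because the sorted set supports ranged reads at any offset, this restores the jump-to-any-page behavior that raw Cassandra storage made impractical.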
The one storylist we kept in Cassandra was the Top News storylist, which was only recalculated periodically.
We prematerialized each combination of filters (type of content, topic, etc.) and stored each one as a freaking
JSON blob within a Cassandra column. As far as implementations go, all you can really say about that one
was “it worked”.
Anyway, that’s why pagination was hard.
Followup on Pagination and Inconsistent Ordering
Another aspect of pagination which is awkward is when you’re ranking content by a dynamic score,
such that a given piece of content can go from position three to position ten, and then in a later
surge of popularity from position ten to position two.
In this case offsets are liable to show the same content multiple times on different pages
(because the content moved up or down between loading the first and second pages),
and cursors are liable to lose position entirely (if the item your cursor points at drops from tenth to fifteenth position,
then when you load the next page you’ll skip the content now sitting in positions eleven through fifteen entirely).
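A tiny illustration of the offset half of that problem, with single-letter ids standing in for stories:

```python
# First page load: the user sees the top three stories.
ranking = ["a", "b", "c", "d", "e", "f"]
page_one = ranking[0:3]

# Before page two loads, "e" surges from position five to position two,
# pushing everything below it down one slot.
ranking = ["a", "e", "b", "c", "d", "f"]
page_two = ranking[3:6]  # "c" shows up again, having already been on page one
```

The same reshuffle hurts cursors in the opposite way: a cursor pinned to a story that *dropped* in rank resumes after its new, lower position, silently skipping everything in between.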
We generally didn’t try to solve this problem, but for the third iteration of MyNews the storylist
was changing so aggressively that we did implement a solution for consistent pagination. That version
of MyNews took content your friends were Digging, submitting or commenting on, plus content from Newsrooms
you participated in most, plus trending content across Digg itself, threw it all together and ranked it
according to a variety of factors. In the initial internal implementation, that meant that content could
jump around quite a bit from one page to the other.
Our solution was:

- If the offset was zero, generate the first 225 items in the MyNews storylist,
store that ordered list of item_ids as a JSON blob inside Redis,
and return the first count item_ids.
- If the offset was not zero and the MyNews JSON blob existed in Redis, retrieve the blob
and slice the count item_ids from within.
- If the offset was not zero and the MyNews JSON blob did not exist, then generate it,
store it, and slice the count item_ids as above.

(This last case was an edge case for people refreshing old pages or if the blob got evicted from the
cache, which in general shouldn’t happen in a short period of time, but could happen.)
This implementation has the drawback of content becoming stale on non-first pages, but our
experience with user behavior was that this wasn’t important, as people looking for newer
content would go to the first page rather than hanging around on later pages.
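The three cases above can be sketched roughly like this, with a dict standing in for Redis and a stub in place of the real MyNews ranking; all names here are illustrative.

```python
import json

redis_cache = {}   # stands in for Redis
LIST_SIZE = 225

def generate_storylist(user_id):
    # Stub for the expensive MyNews ranking computation.
    return [f"item-{i}" for i in range(LIST_SIZE)]

def mynews_page(user_id, offset, count):
    key = f"mynews:{user_id}"
    if offset == 0 or key not in redis_cache:
        # Case one (first page) and case three (missing/evicted blob):
        # regenerate the full list and cache it as a JSON blob.
        redis_cache[key] = json.dumps(generate_storylist(user_id))
    # Case two (and the tail of the others): slice the cached blob.
    item_ids = json.loads(redis_cache[key])
    return item_ids[offset:offset + count]
```

The first-page read pins an ordering, so every later page slices the same frozen list, which is what makes the pagination consistent.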
Follow-Followup on Inconsistent Materialized Views
Can also alludes to another problem we experienced with our storylist implementations
which relied on Cassandra when he says
> Therefore, don’t show counts on pagination.
This one was really entirely our fault, but as we materialized more and more lists of
stories, we often lost track of some of them, such that when a story was deleted (because the
submitter closed their account, etc) or hidden because it was identified as spam,
then it would still be returned to the frontend when it queried for stories.
The frontend would then hide that story, but would only have, say, twelve valid stories
to display when it had asked the backend for fifteen.
As the number of materialized views grew to include half a dozen per Newsroom,
six per user’s MyNews, and so on, we ended up with an awkward compromise:
we would repair storylists on read (removing anything invalid, expired, spammy, and so on).
In most cases this was a backup mechanism rather than the primary mechanism for
keeping the views consistent, but it meant some reads would return fewer than the
desired quantity of items.
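Repair-on-read might look something like this sketch, with stand-ins for the storylist store and the validity check; none of these names are from the actual Digg codebase.

```python
storylists = {"topnews": ["a", "b", "c", "d"]}
deleted = {"b"}   # stories removed for spam, closed accounts, etc.

def is_valid(story_id):
    return story_id not in deleted

def read_storylist(name, count):
    items = storylists[name]
    valid = [s for s in items if is_valid(s)]
    if len(valid) != len(items):
        # Repair the materialized view as a side effect of reading it.
        storylists[name] = valid
    return valid[:count]   # may return fewer than `count` items
```

The trade-off described above is visible in the last line: the caller may get fewer items than requested, in exchange for never having to track every list a story appears in.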
Like a lot of our solutions at Digg, all you can say is that it worked, and worked
at a moderately high scale. Maintaining a registry of all the storylists
a given article was in and then cleaning them up on modification would have been
much prettier, but if a story was in several hundred thousand storylists, many of
which would never actually be read, then the repair-on-read mechanism may be preferable
(it also self-repairs in the case of failures, which is a bonus when dealing
with distributed systems with non-zero error rates).