f(x) -> ~dac

IX and the Throughput/latency Tradeoff

Since Saem put me onto the Coursera “Learning How To Learn” course and Tavis (as well as Tyler iirc) pointed me at Scott Young, I’ve been trying to figure out how to better approach learning all of the material I try to throw at myself.

This coming Monday for the Polyglot reading group we’re covering the IX protected dataplane OS paper which describes a set of tradeoffs that operating systems must make when deciding how to handle IO (network in this case).

While initially skimming the paper to try to figure out how I might chunk it to build a more cohesive understanding, I realized that I had an intuitive grip on the tradeoff between throughput and latency but I’d never tried to properly define it for myself. Being able to clearly define something I think is a mark of actual understanding and builds the ability to leverage it. I figured this would be a good exercise in that spirit and both Dr Oakley and Scott Young seem to agree from what I’ve seen so far.

My Attempt

Describe throughput, latency, and the tradeoff between them as generally as possible but with a view to how they pertain to the IX paper.


Is a measure of how much uninterrupted work can be done.


Is a measure of how long it takes from submission of some work until its completion.

I think “submission” is likely interchangeable with any of the following if you prefer:

  • definition
  • creation
  • identification

The Tradeoff

In a system with limited capacity to do work, latency is proportional to the ability to interrupt existing work. If interruption of current work is not permitted, meaning existing work proceeds with no breaks, the latency is equivalent to the time it takes to complete the pre-existing work because the system cannot do the new work until it finishes the existing work. If interruptions are permitted without restriction, throughput is entirely bound by the frequency of interruptions.

In a system that preempts work, the balance is struck by limiting the frequency of interruptions.

In a cooperative system, the balance is struck by yielding control after an alotted amount of work is done (note that this depends on altruistic workers).


I’m going to try to apply this same sort of approach for simpler base concepts. I think it’s roughly in line with the Feynman technique albeit simplified for the moment perhaps.

If you disagree with my definition with respect to the IX paper, I’d love to hear about it. As mentioned I’ve only skimmed the paper so far and so it’s entirely likely my definition and/or understanding will change a bit as I dig in deeper.


Thanks to Saem poking holes in the above definition, I’ve started to refine it a bit. The primary issue he turned up is that nowhere did I specify a cost for the interruptions required by latency and this further turned up an error in my definition of latency. I’ve adjusted the definition of latency and the tradeoff as follows:


Is a measure of the time from submission of work until some acknowledgement that the work is proceeding. Servicing latency requires a non-zero amount of work because the expected minimum of the acknowledgement itself is work.

Note that this definition is primarily concerned with responsiveness rather than completeness of the submitted work and that latency consumes some measure of available throughput since it requires work.

The Tradeoff

A system with a limited capacity to do work, like a computer, is one with a bounded or static throughput. In order for this system to provide the lowest possible latency, it must be able to interrupt existing work without restriction and consume throughput for at least acknowledgement of the receipt of new work. On the other hand, to provide the highest possible throughput the system must prevent interruptions that also wish to consume its limited resource in favour of any existing work’s use of it.