TDD and Specification exercise
Let's compute the columns of the Kanban view when it is set to Month visualisation mode
Following a discussion about TDD, I recently shared on LinkedIn a programming exercise inspired by a recent Klaro Cards feature.
The Exercise
In Klaro Cards, when you have a series of cards with a date, you can view them in columns. In the example, blog posts can be viewed by month, quarter, or year. You can also decide whether or not to display empty columns.
You are responsible for writing the logic behind this feature. Given a set of cards with a publication date, calculate the columns to display. Once your program is written, add the option that allows showing or hiding empty columns.
To keep the exercise simple, we focus only on the function that calculates the columns to display, and only with a monthly periodicity. You’re not even required to calculate the groups of cards for each column (but you can).
Use the language you want, the method you want (BDD, TDD, Example Mapping, Vibe Coding, etc.). We are particularly interested in what you call the “Specification” once the job is finished.
The Context
The reason I proposed this exercise is that sometimes TDD tends to become an obligatory ritual.
I myself use TDD very often, because it allows discovering a specification by induction: we start from simple examples, then increase complexity in a process of generalization (see further). But I don’t use TDD all the time, because I have other tools:
- if I know how to write the specification directly, I use it in a more classic testing scheme (see below)
- if I can reduce the problem to a trivial mathematical property, I can even skip tests completely
In practice, I use all three together. I propose to start from the end, taking them in the following order:
- The mathematical approach, using SQL and Bmg
- The specification + tests approach, in TypeScript
- The TDD approach, also in TypeScript
The Mathematical Approach
With a bit of abstraction, we quickly understand that we want to find all months between a MIN (start of the month of the earliest publication date) and a MAX (start of the month of the latest publication date), both bounds included. My real and favorite “mathematical” language is still SQL, and the solution is straightforward.
The intuition (pseudo code), which I don’t even imagine needing to test personally:
SELECT generate_series([min date], [max date], interval '1 month')
In practice, because SQL is a bit verbose, we have to write more code to find these MIN and MAX than to solve the problem itself. However, the solution fits in a few lines and does not require excessive testing time:
WITH
date_bounds AS (
SELECT
date_trunc('month', MIN(publication_date)) AS start_month,
date_trunc('month', MAX(publication_date)) AS end_month
FROM
cards
)
SELECT
a_month
FROM
date_bounds,
generate_series(start_month, end_month, INTERVAL '1 month') AS a_month
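For readers less familiar with PostgreSQL, the key property of generate_series here is that both bounds are included. A minimal TypeScript sketch of the same idea follows. It assumes months are encoded as integers (year * 12 + zero-based month) and uses the native Date in UTC; toMonth, formatMonth and monthSeries are hypothetical helpers for illustration, not part of Klaro Cards or Bmg.

```typescript
// Assumption for this sketch: a month is year * 12 + zero-based month,
// so two consecutive months always differ by exactly 1.
type Month = number;

const toMonth = (d: Date): Month => d.getUTCFullYear() * 12 + d.getUTCMonth();

// Renders a month as YYYY-MM for display.
const formatMonth = (m: Month): string => {
  const mm = (m % 12) + 1;
  return `${Math.floor(m / 12)}-${mm < 10 ? "0" : ""}${mm}`;
};

// Inclusive on both ends, like generate_series(start, end, '1 month').
const monthSeries = (start: Month, end: Month): Month[] => {
  const out: Month[] = [];
  for (let m = start; m <= end; m++) out.push(m);
  return out;
};

const dates = [new Date("2024-11-20T00:00:00Z"), new Date("2025-02-03T00:00:00Z")];
const months = dates.map(toMonth);
const series = monthSeries(Math.min(...months), Math.max(...months));

console.log(series.map(formatMonth).join(", ")); // 2024-11, 2024-12, 2025-01, 2025-02
```

Encoding months as integers sidesteps calendar arithmetic entirely: crossing a year boundary is just another increment.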
Removing empty columns
The magic of SQL, or declarative languages in general, is that we don’t even have to refactor the program to add a requirement. To remove empty columns, we add a WHERE clause and that’s it. Here too, I’m not sure I’d spend many hours testing it:
[...]
WHERE EXISTS (
SELECT
*
FROM
cards
WHERE
publication_date >= a_month
AND
publication_date < a_month + INTERVAL '1 month'
);
(It took me 4 minutes.)
The Relational Approach
You might say: OK, but if SQL is not an option, what should I do? Well, there's a reason I promote the relational approach outside of databases. In Bmg, 100% Ruby, the solution looks like this:
## Find the min & max, and align them to the beginning of the month
mm = cards.summarize([], {
:min => Bmg::Summarizer.min(:publication_date),
:max => Bmg::Summarizer.max(:publication_date),
}).transform(&:beginning_of_month).one
## Generate the months (like generate_series)
## and filter those that actually have at least one card
Bmg::Relation
.generate(mm[:min], mm[:max], step: ->(d){ d.next_month }, as: :publication_date)
.matching(cards.transform(:publication_date => ->(t){ t.beginning_of_month }))
I'm not going to go into detail about this solution here; I'll write a longer post about it later. I usually frame writing code like this with one or two tests (it's less declarative than SQL). Once the test passes, I know it's 100% correct, because I understand the mathematical semantics of the summarize, transform, generate, and matching operators.
(It took me 6 minutes.)
The Specification Approach
If I first think in terms of specification, I approach the problem quite differently. I write the following pseudo code:
/**
* Returns a list of months covering the publication dates of the cards
*
* PRE 1: the set of cards can be empty
* PRE 2: each card has a mandatory publication date
*
* POST 1: the returned list is complete, i.e. each card appears in at least one column
* POST 2: the returned list is continuous, i.e. the timeline is not interrupted
* POST 3: the returned list is minimal, i.e. the first and last month have at least one card
*/
function compute_columns(cards)
...
end
We see that the angle is completely different, and complementary:
- We didn’t consider NULL in the previous solution, but we naturally do here
- We better understand the qualities the solution should have
- We specify what should ideally always be tested (in all examples or test cases): the three POST-conditions
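The three postconditions can also be read as executable predicates. The sketch below is one possible reading, not the code used in Klaro Cards. It assumes months are encoded as integers (year * 12 + zero-based month) and that cards carry a native Date in UTC; isComplete, isContinuous and isMinimal are hypothetical names.

```typescript
type Card = { publication_date: Date };
type Month = number; // assumption: year * 12 + zero-based month, in UTC

const monthOf = (c: Card): Month =>
  c.publication_date.getUTCFullYear() * 12 + c.publication_date.getUTCMonth();

// POST 1: complete, every card falls in one of the returned months.
const isComplete = (cards: Card[], cols: Month[]): boolean =>
  cards.every(c => cols.some(m => m === monthOf(c)));

// POST 2: continuous, each month directly follows the previous one.
const isContinuous = (cols: Month[]): boolean =>
  cols.every((m, i) => i === 0 || m === cols[i - 1] + 1);

// POST 3: minimal, the first and the last month each hold at least one card.
const isMinimal = (cards: Card[], cols: Month[]): boolean =>
  cols.length === 0 ||
  [cols[0], cols[cols.length - 1]].every(m => cards.some(c => monthOf(c) === m));

const cards: Card[] = [
  { publication_date: new Date("2025-01-15T00:00:00Z") },
  { publication_date: new Date("2025-03-02T00:00:00Z") },
];
const jan = 2025 * 12; // January 2025

console.log(isComplete(cards, [jan, jan + 1, jan + 2]));         // true
console.log(isContinuous([jan, jan + 2]));                       // false: February is missing
console.log(isMinimal(cards, [jan, jan + 1, jan + 2, jan + 3])); // false: April has no card
```

Written this way, each postcondition can be asserted in every test case, whatever the exact expected values are.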
Linking back to user goals
These PRE and POST conditions are a real invitation to better understand user goals. We ask “WHY”:
- PRE 1: why support the empty set? Because the user might have no cards
- PRE 2: why require a date? To keep it simple for now (can be lifted later)
- POST 1: why complete? Because the user doesn’t want to lose sight of any card
- POST 2: why continuous? Because it seems more intuitive to the user
- POST 3: why minimal? Because the user doesn’t want to have to scroll left to see the first cards, nor unnecessarily right where there will be no cards
Discovering a conflict
If the list of cards is empty (PRE 1), no month is returned (POST 3), so there will be no columns.
This can cause a UX problem since clicking on a column allows creating a new card.
Hiding empty columns
Adding the option to hide empty columns refines the specification:
/**
* [...]
* POST 2: if hide_empty_columns=false, then the returned list is continuous, i.e. the timeline is not interrupted
* [...]
* POST 4: if hide_empty_columns=true, then the list is compact, i.e. each month has at least one card
*/
function compute_columns(cards, hide_empty_columns)
...
end
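The refined specification makes the option select between two postconditions: continuity when empty columns are shown, compactness when they are hidden. A minimal sketch of that rule follows, under the same assumptions as before (months as integers, native Date in UTC); isCompact and gapRuleHolds are hypothetical names.

```typescript
type Card = { publication_date: Date };
type Month = number; // assumption: year * 12 + zero-based month, in UTC

const monthOf = (c: Card): Month =>
  c.publication_date.getUTCFullYear() * 12 + c.publication_date.getUTCMonth();

// POST 2: continuous, each month directly follows the previous one.
const isContinuous = (cols: Month[]): boolean =>
  cols.every((m, i) => i === 0 || m === cols[i - 1] + 1);

// POST 4: compact, every returned month holds at least one card.
const isCompact = (cards: Card[], cols: Month[]): boolean =>
  cols.every(m => cards.some(c => monthOf(c) === m));

// The option decides which of the two postconditions must hold.
const gapRuleHolds = (cards: Card[], cols: Month[], hide_empty_columns: boolean): boolean =>
  hide_empty_columns ? isCompact(cards, cols) : isContinuous(cols);

const cards: Card[] = [
  { publication_date: new Date("2025-01-15T00:00:00Z") },
  { publication_date: new Date("2025-03-02T00:00:00Z") },
];
const jan = 2025 * 12; // January 2025

console.log(gapRuleHolds(cards, [jan, jan + 2], true));           // true
console.log(gapRuleHolds(cards, [jan, jan + 2], false));          // false
console.log(gapRuleHolds(cards, [jan, jan + 1, jan + 2], false)); // true
console.log(gapRuleHolds(cards, [jan, jan + 1, jan + 2], true));  // false
```

Note that POST 4 implies POST 3: if every month holds a card, so do the first and the last.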
Deriving test cases
The specification makes it easier to find test cases we care about:
- No cards
- Two cards published the same month
- Two cards published in two consecutive months
- Two cards published at least one month apart, without hiding empty columns
- Two cards published at least one month apart, hiding empty columns
- Three cards distributed somewhat randomly
We will likely test exact values, but also the three postconditions (see below).
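The cases listed above translate naturally into a table of examples. The sketch below is only an illustration: it runs the table against a dependency-free stand-in for compute_columns (months as integers, native Date in UTC), and the card helper and the chosen dates are assumptions made for this example.

```typescript
type Card = { publication_date: Date };
type Month = number; // assumption: year * 12 + zero-based month, in UTC

const monthOf = (c: Card): Month =>
  c.publication_date.getUTCFullYear() * 12 + c.publication_date.getUTCMonth();

// Stand-in implementation, used only to make the table executable.
const computeColumns = (cards: Card[], hide_empty_columns = false): Month[] => {
  if (cards.length === 0) return [];
  const months = cards.map(monthOf);
  const min = Math.min(...months);
  const max = Math.max(...months);
  const result: Month[] = [];
  for (let m = min; m <= max; m++) {
    if (!hide_empty_columns || months.some(x => x === m)) result.push(m);
  }
  return result;
};

// Hypothetical helper: builds a card from a YYYY-MM-DD string.
const card = (day: string): Card => ({ publication_date: new Date(day + "T00:00:00Z") });
const jan = 2025 * 12; // January 2025

const cases: { name: string; cards: Card[]; hide: boolean; expected: Month[] }[] = [
  { name: "no cards", cards: [], hide: false, expected: [] },
  { name: "same month", cards: [card("2025-01-03"), card("2025-01-28")], hide: false, expected: [jan] },
  { name: "consecutive months", cards: [card("2025-01-03"), card("2025-02-10")], hide: false, expected: [jan, jan + 1] },
  { name: "gap, empty shown", cards: [card("2025-01-03"), card("2025-03-10")], hide: false, expected: [jan, jan + 1, jan + 2] },
  { name: "gap, empty hidden", cards: [card("2025-01-03"), card("2025-03-10")], hide: true, expected: [jan, jan + 2] },
  { name: "three scattered cards", cards: [card("2025-04-09"), card("2024-12-31"), card("2025-01-01")], hide: false,
    expected: [jan - 1, jan, jan + 1, jan + 2, jan + 3] },
];

// Collect the names of the cases whose result differs from the expectation.
const failures = cases
  .filter(c => JSON.stringify(computeColumns(c.cards, c.hide)) !== JSON.stringify(c.expected))
  .map(c => c.name);

console.log(failures.length); // 0 when every case matches
```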
The TDD Approach
Solving the problem with TDD is much more incremental.
I took about 10 steps. It took me about 45 minutes, patiently applying Uncle Bob’s increments from The Transformation Priority Premise. A silly mistake: I tried too quickly to integrate hiding empty columns (step 6) and had to change my mind at step 8 to come back to it later.
The final algorithm
I finish here with the algorithm below. I admit I don’t find it particularly elegant. It took me much longer to write than the SQL version.
import { DateTime } from 'luxon';
export type Card = { publication_date: DateTime }
export const monthOf = (card: Card): DateTime => card.publication_date.startOf('month')
export const computeColumns = (cards: Card[], hide_empty_columns: boolean = false) => {
if (!cards.length) return [];
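// Sort chronologically with an explicit comparator: the default sort compares string representations
const months = cards.map(card => monthOf(card)).sort((a, b) => a.toMillis() - b.toMillis());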
const min = months[0];
const max = months[months.length - 1];
const result = [];
let current = min;
while (current <= max) {
if (!hide_empty_columns || cards.some(card => monthOf(card).equals(current))) {
result.push(current);
}
current = current.plus({month: 1})
}
return result;
}
Adding “real” coverage tests
Since I wasn’t really convinced that this algorithm met its specification, I ended up coding real tests for the postconditions, including one that verifies them on random data. Since the tests didn’t reveal any bugs, I finish relatively confident, but much less sure than with the SQL version. The latter might hide a bug, though, since I didn’t write tests for it. Obviously that should be done using the specification as a guide.
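To give an idea of what such a random-data test can look like, here is a minimal sketch. It is not the test from my codebase: it checks a dependency-free stand-in for computeColumns (months as integers, native Date in UTC, no luxon), and makeRandom and violations are hypothetical helpers. The generator is seeded so that a failing run can be replayed.

```typescript
type Card = { publication_date: Date };
type Month = number; // assumption: year * 12 + zero-based month, in UTC

const monthOf = (c: Card): Month =>
  c.publication_date.getUTCFullYear() * 12 + c.publication_date.getUTCMonth();

// Dependency-free stand-in for computeColumns, used only by this sketch.
const computeColumns = (cards: Card[], hide_empty_columns = false): Month[] => {
  const months = cards.map(monthOf);
  const result: Month[] = [];
  if (months.length === 0) return result;
  const max = Math.max(...months);
  for (let m = Math.min(...months); m <= max; m++) {
    if (!hide_empty_columns || months.some(x => x === m)) result.push(m);
  }
  return result;
};

// Deterministic pseudo-random generator (linear congruential), values in [0, 1).
const makeRandom = (seed: number) => (): number => {
  seed = (seed * 1664525 + 1013904223) % 4294967296;
  return seed / 4294967296;
};

const hasCard = (cards: Card[], m: Month): boolean => cards.some(c => monthOf(c) === m);

// Returns the names of the postconditions violated by one result.
const violations = (cards: Card[], cols: Month[], hide: boolean): string[] => {
  const broken: string[] = [];
  if (!cards.every(c => cols.some(m => m === monthOf(c)))) broken.push("POST 1");
  if (!hide && !cols.every((m, i) => i === 0 || m === cols[i - 1] + 1)) broken.push("POST 2");
  if (cols.length > 0 && !(hasCard(cards, cols[0]) && hasCard(cards, cols[cols.length - 1]))) broken.push("POST 3");
  if (hide && !cols.every(m => hasCard(cards, m))) broken.push("POST 4");
  return broken;
};

const random = makeRandom(42);
const failures: string[] = [];
for (let run = 0; run < 200; run++) {
  const count = Math.floor(random() * 8); // 0 to 7 cards
  const cards: Card[] = [];
  for (let i = 0; i < count; i++) {
    // Dates spread over 36 months starting in January 2023.
    const month = Math.floor(random() * 36);
    const day = 1 + Math.floor(random() * 28);
    cards.push({ publication_date: new Date(Date.UTC(2023, month, day)) });
  }
  for (const hide of [false, true]) {
    failures.push(...violations(cards, computeColumns(cards, hide), hide));
  }
}

console.log(failures.length); // 0 when every postcondition held on every run
```

A test like this does not replace the exact-value examples; it complements them by checking the properties the specification promises on inputs nobody thought of writing down.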
Conclusion
Excellent exercise, which took me the morning, including writing this blog post. It allowed me to show the three approaches. I find them really complementary. Personally, I use the mathematical approach whenever the problem allows it, using the specification as leverage to write functional tests. I systematically use BDD and TDD in two cases I consider very frequent:
- when the specification isn’t clear at the start, and it’s not possible to solve the problem just by trying to write it
- when the problem requires real architectural work, in which case I prefer BDD over TDD
And you, which approach do you prefer?