Neither if nor while; neither map nor reduce

Relational algebra as an alternative to imperative and functional programming.

Tech 27.05.2025
Table of contents

Introduction

I’ve been advocating for relational algebra for over 20 years now. For example, I wrote Relations as First-Class Citizen more than 10 years ago. More recently, together with my colleague Felix, we launched https://www.relational-algebra.dev/ to document the Bmg library used in the Klaro Cards backend.

Relational algebra is often categorized as "only for databases." No one seems to realize that it's a standalone tool, which can be applied in regular programming with excellent results: less verbosity, more productivity, and greater correctness. That’s what I aim to demonstrate here.

The example

We’ll build upon the example from the previous post but make it a bit more complex. We want to compute the columns of the Kanban "organized by month," but also the groups of cards to display in each column. The user should be able to show or hide empty columns.

Before continuing to read, I invite you to take a moment for yourself. How would you implement this feature?

  • It’s simple: group blog posts by month
  • But not simplistic: we need all the months between two dates, even those without blog posts

You can revisit the previous solution which computed the necessary columns. In this blog post, I’m not going to improve the TypeScript algorithm given at the end of the article. Instead, I’ll walk through how to use relational algebra step by step for a solution with no loops, no ifs, no maps, and no reduce.

A Data-Driven approach

One thing that might surprise you—because it’s unusual in object-oriented programming—is that we’re going to follow a resolutely data-oriented, or informational, approach. In other words, I won’t be bothering with classes or domain objects like Article or KanbanColumn. These classes and objects are implementation details (if that statement stings a bit, see the last section of this article).

Our problem and its solution are illustrated as follows.

As input, we have blog articles in a Relation:

+----+------------------+----------------------------------+ 
| id | publication_date | title                            |
+----+------------------+----------------------------------+
|  1 | 2025-01-02       | Out of the Tar Pit               |
|  2 | 2025-01-17       | Relations as First-Class Citizen |
|  3 | 2025-03-11       | Bmg, a Relational Algebra        |
+----+------------------+----------------------------------+

That is, in JavaScript—or here, Ruby:

articles = [
  { id: 1, publication_date: "2025-01-02", title: "Out of the Tar Pit" },
  { id: 2, publication_date: "2025-01-17", title: "Relations as First-Class Citizen" },
  { id: 3, publication_date: "2025-03-11", title: "Bmg, a Relational Algebra" }
]

And the result we’re aiming for should look like the Relation below (it contains "sub"-Relations as values, which is perfectly allowed by the theory).

This Relation captures the following information:

  • In January, two articles were published: "Out of the Tar Pit" on the 2nd and "Relations as First-Class Citizen" on the 17th;
  • February saw no publications;
  • March had one publication: "Bmg, a Relational Algebra" on the 11th.

Or in summary: In month [month], the listed [articles] were published. In the following, I will always include this kind of “informational semantics” as comments. It is essential to the approach: it allows us to reason about what we want, but also to document the solution.

// In month [month], the listed [articles] were published
+------------+--------------------------------------------------------------+
| month      | articles                                                     |
+------------+--------------------------------------------------------------+
| 2025-01-01 | +----+------------------+----------------------------------+ |
|            | | id | publication_date | title                            | |
|            | +----+------------------+----------------------------------+ |
|            | |  1 | 2025-01-02       | Out of the Tar Pit               | |
|            | |  2 | 2025-01-17       | Relations as First-Class Citizen | |
|            | +----+------------------+----------------------------------+ |
|                                                                           |
| 2025-02-01 | +----+------------------+----------------------------------+ |
|            | | id | publication_date | title                            | |
|            | +----+------------------+----------------------------------+ |
|            | +----+------------------+----------------------------------+ |
|                                                                           |
| 2025-03-01 | +----+------------------+----------------------------------+ |
|            | | id | publication_date | title                            | |
|            | +----+------------------+----------------------------------+ |
|            | |  3 | 2025-03-11       | Bmg, a Relational Algebra        | |
|            | +----+------------------+----------------------------------+ |
+------------+--------------------------------------------------------------+

That is:

result = [
  {
    month: "2025-01-01",
    articles: [
      { id: 1, publication_date: "2025-01-02", title: "Out of the Tar Pit" },
      { id: 2, publication_date: "2025-01-17", title: "Relations as First-Class Citizen" }
    ]
  },
  {
    month: "2025-02-01",
    articles: [
    ]
  },
  {
    month: "2025-03-01",
    articles: [
      { id: 3, publication_date: "2025-03-11", title: "Bmg, a Relational Algebra" }
    ]
  }
]

Implementation overview

The solution fits in just a few lines. You can also find the full code on GitHub. Don’t worry—we’ll take a detailed look at everything in the following sections.

# 1. Compute the month of each article
monthly_articles = articles.extend({
  month: ->(t) { t[:publication_date].beginning_of_month },
})

# 2. Find min and max month
minmax = monthly_articles.summarize([], {
  :min => Bmg::Summarizer.min(:month),
  :max => Bmg::Summarizer.max(:month),
}).one

# 3. Generate all months of interest
all_months = Bmg::Relation.generate(minmax[:min], minmax[:max], step: ->(d){ d.next_month }, as: :month)

# 4. Find, for each month, the articles published
articles_per_month = all_months.image(monthly_articles, :articles, [:month])

# 5. Remove empty months
result = articles_per_month.exclude(->(t) { t[:articles].empty? })

As the code suggests, we break the problem down into subproblems:

  1. Compute the publication month for each article, using extend
  2. Find the first and last month, using summarize
  3. Generate the full list of months to consider, using generate
  4. Attach to each month its list of articles, using image
  5. Optional: remove empty months, using restrict or exclude

Although I use the term "list" below, we will only be using Relations and the operators mentioned. Each takes a Relation as input and produces a Relation as output. I will show all intermediate results.

Compute the month of each publication

The extend operator extends its input relation with one or more computed attributes. Here, we simply compute the start date of the month using ActiveRecord#beginning_of_month, which now appears as a new attribute called month:

monthly_articles = articles.extend({
  month: ->(t) { t[:publication_date].beginning_of_month },
})
// The publication [id] was published in the month of [month]
+----+------------------+----------------------------------+------------+
| id | publication_date | title                            | month      |
+----+------------------+----------------------------------+------------+
|  1 | 2025-01-02       | Out of the Tar Pit               | 2025-01-01 |
|  2 | 2025-01-17       | Relations as First-Class Citizen | 2025-01-01 |
|  3 | 2025-03-11       | Bmg, a Relational Algebra        | 2025-03-01 |
+----+------------------+----------------------------------+------------+

Find the first and last months

The summarize operator allows us to find the two months that will form the boundaries of the Kanban board. It’s somewhat equivalent to SELECT MIN(...) GROUP BY ... in SQL: it lets us “summarize” the information represented in a relation. Here, we'll use it with an empty set of grouping attributes ([]) and specify that we want to compute the smallest (min) and largest (max) month:

minmax_relation = monthly_articles.summarize([], {
  :min => Bmg::Summarizer.min(:month),
  :max => Bmg::Summarizer.max(:month),
})

The result is as follows, with its informational semantics in the comment:

// The first month in which an article was published is [min] and the last is [max]  
// (each represented by its first day)
+------------+------------+
| min        | max        |
+------------+------------+
| 2025-01-01 | 2025-03-01 |
+------------+------------+

Generate the full List of months to consider

Here we will apply the same technique as in the previous blog post, the equivalent of generate_series in PostgreSQL:

minmax = minmax_relation.one
months = Bmg::Relation.generate(minmax[:min], minmax[:max], step: ->(d){ d.next_month }, as: :month)

The first line uses one, a non-relational operator that extracts the one and only record from a relation.

{ min: '2025-01-01', max: '2025-03-01' }

The generate operator generates a relation (without taking another relation as input). Here we give it our two min and max dates and an increment function, and it generates the following relation:

// The month [month] lies between two months in which articles were published
+------------+
| month      |
+------------+
| 2025-01-01 |
| 2025-02-01 |
| 2025-03-01 |
+------------+

Attach to each month its list of articles

For this final step, we use a somewhat magical operator, image. This operator extends its input relation—the "list" of months—with a new attribute articles. This attribute will be the sub-relation obtained by filtering the articles that share the same value for the common attribute (:month). It is similar to a LEFT JOIN + ARRAY_AGG in PostgreSQL:

articles_per_month = months.image(monthly_articles, :articles, [:month])
// The month [month], the listed [articles] were published
+------------+--------------------------------------------------------------+
| month      | articles                                                     |
+------------+--------------------------------------------------------------+
| 2025-01    | +----+------------------+----------------------------------+ |
|            | | id | publication_date | title                            | |
|            | +----+------------------+----------------------------------+ |
|            | |  1 | 2025-01-02       | Out of the Tar Pit               | |
|            | |  2 | 2025-01-17       | Relations as First-Class Citizen | |
|            | +----+------------------+----------------------------------+ |
|                                                                           |
| 2025-02    | +----+------------------+----------------------------------+ |
|            | | id | publication_date | title                            | |
|            | +----+------------------+----------------------------------+ |
|            | +----+------------------+----------------------------------+ |
|                                                                           |
| 2025-03    | +----+------------------+----------------------------------+ |
|            | | id | publication_date | title                            | |
|            | +----+------------------+----------------------------------+ |
|            | |  3 | 2025-03-11       | Bmg, a Relational Algebra        | |
|            | +----+------------------+----------------------------------+ |
+------------+--------------------------------------------------------------+

Another way to understand image is to see it as a shortcut for extend + restrict (restrict is introduced in the next section):

articles_per_month = months.extend({
  articles: ->(t) { monthly_articles.restrict(month: t[:month]) }
})

Hide or show empty months

If you want to hide empty months, you can use the restrict

result = articles_per_month.restrict(->(t) { !t[:articles].empty? })

Or more elegantly, exclude, its opposite:

result = articles_per_month.exclude(->(t) { t[:articles].empty? })
// The month [month], the listed [articles] were published
+------------+--------------------------------------------------------------+
| month      | articles                                                     |
+------------+--------------------------------------------------------------+
| 2025-01    | +----+------------------+----------------------------------+ |
|            | | id | publication_date | title                            | |
|            | +----+------------------+----------------------------------+ |
|            | |  1 | 2025-01-02       | Out of the Tar Pit               | |
|            | |  2 | 2025-01-17       | Relations as First-Class Citizen | |
|            | +----+------------------+----------------------------------+ |
|                                                                           |
| 2025-03    | +----+------------------+----------------------------------+ |
|            | | id | publication_date | title                            | |
|            | +----+------------------+----------------------------------+ |
|            | |  3 | 2025-03-11       | Bmg, a Relational Algebra        | |
|            | +----+------------------+----------------------------------+ |
+------------+--------------------------------------------------------------+

An alternative, a bit less naive, is to avoid the effort of generating all months with generate if you are going to remove them afterward. The next section offers an alternative along these lines.

Going further

The previous section showed how to apply pure relational algebra to the problem of grouping articles by month. In this section, we broaden the scope: how to encapsulate the whole thing into a function, and how to test it?

The complete program

If you want a function that takes the articles and returns the desired grouping, with or without empty months, you can encapsulate all the code like this:

def group_articles_per_month(articles, hide_empty_columns = false)
  monthly_articles = articles.extend({
    month: ->(t) { t[:publication_date].beginning_of_month },
  })
  all_months = if hide_empty_columns
    monthly_articles.project([:month])
  else
    minmax = monthly_articles.summarize([], {
      :min => Bmg::Summarizer.min(:month),
      :max => Bmg::Summarizer.max(:month),
    }).one
    Bmg::Relation.generate(minmax[:min], minmax[:max], step: ->(d){ d.next_month }, as: :month)
  end
  all_months.image(monthly_articles, :articles, [:month])
end

In this variant, the project operator is used to extract from the original relation the months with at least one published article. This operator never returns duplicates, so the month of January appears only once:

// monthly_articles.project([:month])
// The month [month] has at least one article published
+------------+
| month      |
+------------+
| 2025-01-01 |
| 2025-03-01 |
+------------+

Where do the data come from?

The advantage of this approach is that the source of the articles doesn’t really matter. It can be a literal (for example, in a unit test):

articles = Bmg::Relation.new([...])
result = group_articles_per_month(articles)

A JSON file will also do, if you convert the publication date into an instance of the Date class. The transform operator allows applying a transformation to all values of an attribute. Bmg supports using a class that has a parse method:

articles = Bmg::Relation
  .json_file("articles.json")
  .transform(publication_date: Date)
   # shortcut for .transform(publication_date: ->(str) { Date.parse(str) })
result = group_articles_per_month(articles)

Finally, relations can obviously come from an SQL database:

database = Sequel.connect('sqlite://blog.sqlite3')
articles = Bmg.sequel(:articles, database)
result = group_articles_per_month(articles)

Testing the solution

In the previous post I already talked about post-conditions in the specification, which guided the testing effort. In this section, I propose to review them, still using relational algebra.

This testing effort is somewhat over the top, since it partly involves deconstructing what we just built. Still, it remains a nice exercise in algebra. We'll start by helping ourselves a bit for what follows by adding the next month for each row:

result = group_articles_per_month(articles).extend({
  :next_month => ->(t) { t[:month].next_month }
})

Testing a single case

Based on the example, a first test allows us to intuitively verify that we have the desired results. If we simply count the results, we can take the following approach:

assert_equal(
  result.count,
  3
)
assert_equal(
  result
    .transform({
      :articles => :count,
      :month => :to_s
    })
    .y_by_x(:articles, :month),
  {
    "2025-01-01" => 2,
    "2025-02-01" => 0,
    "2025-03-01" => 1
  }
)

In this example, the y_by_x method allows us to leave the relational world and obtain a mapping between the month names and the number of publications for each.

Complete colution

The solution is complete if all the original articles are present in the final result. In other words, none are missing:

assert_empty (
  articles
    .not_matching(result.ungroup(:articles), [:id])
)

We use the not_matching operator, which is equivalent to WHERE NOT EXISTS in SQL with an equality clause on the id.

Correct solution

The solution is correct if the publication date of each article is indeed in the indicated month. That is, there is no article that violates this constraint:

assert_empty (
  result
    .ungroup(:articles)
    .exclude(Predicate.between(:publication_date, :month, :next_month))
)

We use the ungroup operator which somewhat reverses what image did earlier and returns a "flat" Relation.

Continuous solution

The solution is continuous if all months are represented between the min month and the max month. In other words, if every month is also a following month, except for the first:

assert_empty (
  result
    .exclude(month: minmax[:min])
    .not_matching(
      result
        .project([:next_month])
        .rename(next_month: :month),
      [:month]
    )
)

The project and rename operators allow us to compare the results to themselves. This is probably the least intuitive condition. Think of it as testing the following assertion: except for the first, there is no month that is not the following month of another.

Minimal solution

Finally, the solution is minimal if the first and last months have at least one article.

assert_empty (
  result
    .restrict(Predicate.in(:month, minmax.values))
    .restrict(Predicate.empty(:articles))
)

Conclusion

I have presented here a relational approach to solve a simple but non-trivial data grouping problem. The solution is original in the sense that it does not use any imperative programming constructs (if, repeat, while, until, foreach) nor functional programming constructs (map, reduce, fold), nor recursion. The relational approach is both powerful, elegant, and concise. It allows implementing the solution as well as testing the specification in terms of post-conditions.

I hope you enjoyed this article and that it showed you something you did not know before :-) You can find the full code on GitHub.

The Bmg library is used here and offers relational algebra to Ruby programmers. It is inspired by a more theoretical algebra Tutorial D, proposed by Chris Date & Hugh Darwen in their books on the relational model, and implemented in the tool Rel. If you like this approach and/or Bmg, know that I recently started a Typescript implementation, to which you are invited to contribute on GitHub: https://github.com/enspirit/bmg.js

Finally, if you appreciate this work, the best way to support us is to use Klaro Cards.

Appendix: Classes and Objects, an implementation detail?

If you are an object-oriented enthusiast, you might be wondering: why not introduce any class like Article, Month, List<Article>, or KanbanColumn<Article> to represent the domain concepts?

The "resolutely data-oriented" approach introduced here reflects what "flows" from one architectural component to another: information. Indeed, frontend and backend do not exchange classes and objects. They exchange data that capture useful information.

It is in this sense that I dared to say that classes and objects are an implementation detail: one can build software where frontend and backend collaborate perfectly without ever sharing classes and objects. These latter are not essential. However, one cannot create this software without sharing the information. That, for sure, is essential.

This also recalls Codd’s principle of essentiality from the Great Debate. But that is a topic for another blog post.

Table of contents