Modeling

The modeling module is where the recommender model itself lives. It is deliberately the smallest substantial module in the tutorial. ADR-0007 records the decision to keep the model intentionally modest: content similarity over article text, with a popularity-by-category cold-start fallback. No training, no fine-tuning, no two-stage architecture.

The point is not that this model is impressive. The point is that the platform around it does enough work that the model does not have to be. That is the platform-as-leverage argument made concrete.

The runnable rep

Modeling ships three runnable lessons plus the standard rep verbs:

make -C tutorial/modeling run     # lesson 01: register a model + refresh embeddings
make -C tutorial/modeling check   # the module's full test suite
make -C tutorial/modeling clean   # remove the lesson work dir

The three lessons walk the model lifecycle (ADR-0031), not the math:

lesson-01-embedding-refresh.py — register a candidate embedding model in the local Parquet registry and refresh article embeddings for it.
lesson-02-registry.sql — query the registry as data: who is candidate, who is production, which revision produced which vectors.
lesson-03-eval-gate-and-promote.py — run the offline eval gate and promote a candidate to production only if it clears the thresholds.

The model itself is deliberately modest (below); the depth you drill here is the MLOps lifecycle around it — register, evaluate, gate, promote, roll back.
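The lifecycle verbs can be sketched in a few lines. This is a minimal in-memory illustration, not the tutorial's registry: the real one is Parquet-backed, and the names Registry, ModelRecord, the "retired" status, and the recall_at_10 metric are all hypothetical stand-ins chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ModelRecord:
    model_id: str
    revision: str
    status: str = "candidate"  # hypothetical states: candidate, production, retired


@dataclass
class Registry:
    """Hypothetical in-memory registry; the tutorial's registry lives in Parquet."""
    records: Dict[str, ModelRecord] = field(default_factory=dict)
    production_history: List[str] = field(default_factory=list)

    def register(self, model_id: str, revision: str) -> ModelRecord:
        # Every newly registered model starts as a candidate.
        record = ModelRecord(model_id, revision)
        self.records[model_id] = record
        return record

    def production(self) -> Optional[str]:
        # The most recently promoted model, or None before any promotion.
        return self.production_history[-1] if self.production_history else None

    def promote(self, model_id: str, metrics: Dict[str, float],
                thresholds: Dict[str, float]) -> bool:
        # Gate: every thresholded metric must be present and meet its floor.
        cleared = all(metrics.get(name, float("-inf")) >= floor
                      for name, floor in thresholds.items())
        if not cleared:
            return False  # the candidate stays a candidate
        current = self.production()
        if current is not None:
            self.records[current].status = "retired"
        self.records[model_id].status = "production"
        self.production_history.append(model_id)
        return True

    def roll_back(self) -> Optional[str]:
        # Restore the previous production model, if there is one.
        if len(self.production_history) < 2:
            return None
        demoted = self.production_history.pop()
        self.records[demoted].status = "retired"
        restored = self.production_history[-1]
        self.records[restored].status = "production"
        return restored
```

In this sketch a failed gate changes nothing, so a candidate that misses a threshold can be re-evaluated later without any cleanup.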

What the code looks like

Modeling is split across two homes, tutorial/modeling and tutorial/serving, in four files:

tutorial/modeling/src/tutorial_modeling/embedding_identity.py owns the ADR-0031 production embedding identity: model id, Hugging Face name, exact revision, and the bootstrap fallback used before any production model has been promoted.
tutorial/serving/src/serving/embeddings.py exposes the deterministic embedding interface the rest of the platform calls; it resolves the active model key through the modeling lifecycle instead of carrying its own production constant.
tutorial/serving/dbt/models/staging/article_embeddings.py is the dbt Python model that runs the embedding once per article and materialises the vector column plus lifecycle model id/name/revision into the Parquet substrate. This means the embeddings are part of the analytical contract — analysts can query them just like any other column.
tutorial/serving/src/serving/recommendations.py does candidate generation: average the embedding of the user’s recently read articles, find the top-N nearest articles by cosine similarity, exclude already-read articles, and return the candidate set.
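The candidate-generation steps described for recommendations.py can be sketched in plain Python. This is an illustration of the technique only, under assumptions: the function names, the dict-of-lists embedding store, and the tie-break by article id are hypothetical and not taken from the actual module.

```python
import math
from typing import Dict, List, Sequence


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise average of equal-length vectors."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; defined as 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def generate_candidates(embeddings: Dict[str, List[float]],
                        read_ids: List[str],
                        top_n: int) -> List[str]:
    """Return the top_n unread articles nearest to the user's mean read vector."""
    read_vectors = [embeddings[a] for a in read_ids if a in embeddings]
    if not read_vectors:
        return []  # no usable history: the caller takes the cold-start path
    profile = mean_vector(read_vectors)
    already_read = set(read_ids)
    scored = [(cosine(profile, vector), article_id)
              for article_id, vector in embeddings.items()
              if article_id not in already_read]
    # Highest similarity first; ties broken by id so the output is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [article_id for _, article_id in scored[:top_n]]


# Tiny two-dimensional fixture, invented for the example.
embeddings = {
    "a1": [1.0, 0.0],
    "a2": [0.9, 0.1],
    "a3": [0.0, 1.0],
    "a4": [0.7, 0.7],
}
print(generate_candidates(embeddings, ["a1"], top_n=2))  # ['a2', 'a4']
```

A brute-force scan like this is linear in the number of articles, which is in keeping with a deliberately modest model; nothing in the source suggests an approximate nearest-neighbour index.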

The cold-start path is in the same file: when a user has no read history, the code falls back to popularity-by-category over a recent time window. No fancy bandits, no warmup model. Acknowledging honestly that cold-start is a real problem matters more than the particular technique chosen.
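A popularity-by-category fallback might look like the sketch below. The ReadEvent shape, the seven-day window in the test data, and the per-category cutoff are assumptions made for illustration; the source does not say how long the window is or how categories are merged into one list.

```python
from collections import Counter
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, NamedTuple


class ReadEvent(NamedTuple):
    # Hypothetical event shape; the real schema may differ.
    article_id: str
    category: str
    read_at: datetime


def popular_by_category(events: Iterable[ReadEvent],
                        now: datetime,
                        window: timedelta,
                        per_category: int) -> Dict[str, List[str]]:
    """Most-read articles per category among reads inside [now - window, now]."""
    cutoff = now - window
    counts: Dict[str, Counter] = {}
    for event in events:
        if cutoff <= event.read_at <= now:
            counts.setdefault(event.category, Counter())[event.article_id] += 1
    result: Dict[str, List[str]] = {}
    for category in sorted(counts):
        # Most reads first; ties broken by id so the output is deterministic.
        ranked = sorted(counts[category].items(), key=lambda kv: (-kv[1], kv[0]))
        result[category] = [article_id for article_id, _ in ranked[:per_category]]
    return result
```

Passing now in as a parameter, rather than reading the clock inside the function, keeps the fallback testable with fixed timestamps.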

Tests anchor the behaviour

test_embeddings.py covers the encoder behaviour, and test_recommendations.py covers candidate generation end-to-end with seeded user histories and embedding fixtures. Both run in milliseconds without a database, which is the test discipline ADR-0007 promises.

Why the model deliberately stays this small

ADR-0004 predicts the failure mode: if the platform argument only works because the model is impressive, the platform is not actually doing much work. Keeping the model modest puts the architecture under pressure in the right place.

It also keeps the lab’s effort where the learning is: the depth worth drilling here is platform, product, and domain reasoning, not deep recommender research. Overclaiming the model would put the practice in the wrong place and teach the wrong lesson.

What makes this a good rep

Following the foundations template (ADR-0033), each rep verb does one job. make run exercises the entry lesson hermetically: the embedding model is a deterministic local stand-in, so there are no downloads. make check runs the full test suite (registry, eval gate, offline eval, lessons) as the correctness gate. make clean resets the work dir. Each lesson rewrites its own work directory, so the result is identical on the first run and the hundredth.
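One way to build a deterministic local stand-in is to derive the vector from a hash of the text. The source does not say how the tutorial's stand-in is built, so the function name, the SHA-256 construction, and the default dimension below are assumptions. Vectors produced this way carry no semantic meaning; they only make the pipeline repeatable without a model download.

```python
import hashlib
import math
from typing import List


def stand_in_embedding(text: str, dim: int = 8) -> List[float]:
    """Map text to a unit-length vector that depends only on the text and dim."""
    values: List[float] = []
    counter = 0
    while len(values) < dim:
        # Each digest yields 32 bytes; the counter extends the stream when dim > 32.
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        for byte in digest:
            values.append(byte / 255.0 * 2.0 - 1.0)  # scale a byte into [-1, 1]
            if len(values) == dim:
                break
        counter += 1
    # No component can be exactly 0.0 (that would need byte 127.5), so norm > 0.
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm for v in values]
```

Because the output depends only on the input text and dim, repeated runs produce identical vectors on any machine.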

After this module

The candidate sets this module produces flow into editorial, where the ranker applies the five editorial constraints. The editor interface displays that same candidate set, before and after, when an editor moves a slider. And the evaluation harness sweeps constraint configurations against these candidates to render the Pareto chart.