Editorial lessons

The editorial lessons walk through the platform's headline deep module, the ranker, plus the supporting deep module for sensitive-topic detection. Both are pure functions with exhaustive test suites.

tutorial/serving/src/serving/ranker.py is one file. Read it top to bottom. The signature is the entire interface:

def rank(candidate_set, configuration) -> ranked_list

No database, no HTTP, no filesystem. Dependency-free.

The function:

  1. Scores each candidate with the relevance term (cosine similarity) plus the three soft-term contributions (diversity, recency, sentiment), weighted by the configuration’s three weight columns.
  2. Sorts descending by score.
  3. Applies hard rules — first inserts promoted articles at their target positions, then enforces the sensitive-topic cap by dropping sensitive articles from the bottom of the list until the cap holds.

The exact math is in ADR-0015. Compare the ADR formulas line-by-line against the code.
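The three steps above can be sketched as follows. This is a minimal illustration, not the contents of ranker.py: the Candidate and Configuration classes, their field names, the precomputed per-candidate soft-term values, the 0-indexed promotion positions, and the article_id tie-break are all assumptions made for the example. The real formulas are the ones in ADR-0015.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    # Hypothetical shape; the real candidate record may differ.
    article_id: str
    relevance: float            # assumed to be the cosine similarity, computed upstream
    diversity: float = 0.0      # assumed precomputed soft-term values
    recency: float = 0.0
    sentiment: float = 0.0
    is_sensitive: bool = False


@dataclass(frozen=True)
class Configuration:
    # Hypothetical names for the three weight columns and the hard rules.
    w_diversity: float = 0.0
    w_recency: float = 0.0
    w_sentiment: float = 0.0
    max_sensitive: int = 1
    promotions: tuple = ()      # pairs of (article_id, 0-indexed target position)


def rank(candidate_set, configuration):
    def score(c):
        return (c.relevance
                + configuration.w_diversity * c.diversity
                + configuration.w_recency * c.recency
                + configuration.w_sentiment * c.sentiment)

    # Steps 1 and 2: score, then sort descending (article_id breaks ties).
    ranked = sorted(candidate_set, key=lambda c: (-score(c), c.article_id))

    # Step 3a: pull promoted articles out, then insert them at their targets,
    # lowest target first so earlier insertions are not shifted.
    targets = dict(configuration.promotions)
    promoted = [c for c in ranked if c.article_id in targets]
    ranked = [c for c in ranked if c.article_id not in targets]
    for c in sorted(promoted, key=lambda c: targets[c.article_id]):
        ranked.insert(min(targets[c.article_id], len(ranked)), c)

    # Step 3b: drop sensitive articles from the bottom until the cap holds.
    excess = sum(c.is_sensitive for c in ranked) - configuration.max_sensitive
    for i in range(len(ranked) - 1, -1, -1):
        if excess <= 0:
            break
        if ranked[i].is_sensitive:
            del ranked[i]
            excess -= 1
    return ranked
```

With every soft weight at zero, the sketch reduces to a pure relevance sort, which mirrors the click-only case in the test suite.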

test_ranker.py covers the cardinal cases:

  • Default config — produces a sensible list
  • Click-only config (all soft weights zero) — ranker reduces to pure relevance sort
  • Diversity-max — visible reordering toward more category variety
  • Recency-max — fresh articles win
  • Sentiment-target shift — list re-centres
  • Editorial promotion — promoted articles at exact positions
  • Sensitive-topic cap respected — never exceeds max_sensitive
  • Empty candidate set, single candidate, ties broken deterministically
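The last bullet is worth a concrete illustration. One common way to make ties deterministic is to sort on a composite key whose second element is a stable identifier. The helper and test names below are hypothetical and are not taken from test_ranker.py; whether the real ranker breaks ties by article identifier is an assumption.

```python
def sort_by_score(scored):
    """Sort (article_id, score) pairs by score descending.

    Ties fall back to article_id ascending, so the result does not depend
    on the order in which the candidates arrive. (Assumed tie-break rule.)
    """
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


def test_ties_broken_deterministically():
    forward = [("a1", 0.5), ("a2", 0.5), ("a3", 0.9)]
    backward = list(reversed(forward))
    assert sort_by_score(forward) == sort_by_score(backward)


def test_empty_and_single_candidate():
    assert sort_by_score([]) == []
    assert sort_by_score([("a1", 0.5)]) == [("a1", 0.5)]
```

The test functions use plain assert statements, so pytest can collect them without any extra imports.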

Run the tests:

uv run --package tutorial-serving pytest tutorial/serving/tests/test_ranker.py -v

Read the test names — they read like a specification of editorial behaviour.

sensitivity.py combines two signals into a per-article boolean:

  • A small list of sensitive named-entity categories (e.g. crime, violence, death indicators)
  • The EB-NeRD sentiment_score, when it exists: articles below a negative-sentiment threshold are weighted toward being flagged as sensitive
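One way to combine the two signals into a boolean is to accumulate evidence and compare it against a cut-off. The sketch below is an assumption about the general shape, not the logic in sensitivity.py: the category names, the weights, both thresholds, and the convention that a lower sentiment_score means more negative are all hypothetical.

```python
# Illustrative values only; the real category list and thresholds may differ.
SENSITIVE_CATEGORIES = frozenset({"crime", "violence", "death"})
NEGATIVE_SENTIMENT_THRESHOLD = -0.5   # assumes lower scores are more negative
CATEGORY_WEIGHT = 0.5                 # evidence added per matching category
SENTIMENT_WEIGHT = 0.5                # evidence added by strongly negative sentiment
DECISION_THRESHOLD = 1.0


def is_sensitive(entity_categories, sentiment_score=None):
    """Return True when the combined evidence reaches the decision threshold."""
    hits = SENSITIVE_CATEGORIES & set(entity_categories)
    evidence = CATEGORY_WEIGHT * len(hits)
    # The sentiment signal is optional: it only contributes when a score exists.
    if sentiment_score is not None and sentiment_score < NEGATIVE_SENTIMENT_THRESHOLD:
        evidence += SENTIMENT_WEIGHT
    return evidence >= DECISION_THRESHOLD
```

Under these made-up weights, a single sensitive category is not enough on its own, but the same category combined with strongly negative sentiment is, which is one reading of sentiment pushing an article toward sensitive.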

The dbt-side wrapper at article_sensitivity.py runs the detector against the article corpus and materialises an is_sensitive column. The custom dbt test assert_article_sensitivity_seeded_cases.sql pins known-sensitive seed articles to true and known-benign to false.

The detector tests live at test_sensitivity.py.

constraint_configurations.sql defines the configuration table. The Python side at configurations.py is the repository layer the FastAPI app uses to read and write rows.

Two important properties:

  • The table is part of the analytical contract (ADR-0006). Analysts can query historical configurations directly in SQL.
  • Every row has updated_at and updated_by columns, so the audit trail is part of the data, not a separate logging concern.
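The second property can be illustrated with a small repository sketch in which every write stamps the audit columns itself. It uses the standard-library sqlite3 module as a stand-in for whatever database the platform runs on. Apart from the table name and the updated_at and updated_by columns, every column and function name here is hypothetical, and overwriting a row by name is an assumption about the write semantics.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema; only updated_at and updated_by come from the text above.
SCHEMA = """
CREATE TABLE IF NOT EXISTS constraint_configurations (
    name          TEXT PRIMARY KEY,
    w_diversity   REAL NOT NULL,
    w_recency     REAL NOT NULL,
    w_sentiment   REAL NOT NULL,
    max_sensitive INTEGER NOT NULL,
    updated_at    TEXT NOT NULL,
    updated_by    TEXT NOT NULL
)
"""


def open_store(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row   # rows can be read by column name
    conn.execute(SCHEMA)
    return conn


def save_configuration(conn, name, w_diversity, w_recency, w_sentiment,
                       max_sensitive, updated_by):
    # The repository, not the caller, records when the write happened.
    updated_at = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT OR REPLACE INTO constraint_configurations "
        "(name, w_diversity, w_recency, w_sentiment, max_sensitive, "
        " updated_at, updated_by) VALUES (?, ?, ?, ?, ?, ?, ?)",
        (name, w_diversity, w_recency, w_sentiment, max_sensitive,
         updated_at, updated_by),
    )
    conn.commit()


def get_configuration(conn, name):
    row = conn.execute(
        "SELECT * FROM constraint_configurations WHERE name = ?", (name,)
    ).fetchone()
    return dict(row) if row is not None else None
```

Because the audit fields live in the same row as the weights, an analyst can answer "who last changed this configuration, and when" with an ordinary SELECT rather than by searching application logs.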

The ranker is consumed by serving — every /recommendations/ and /preview response calls rank(). The editor module’s UI reads and writes the configuration table. The evaluation module sweeps configurations through the ranker to produce the Pareto chart.