
How The New York Times Cooking Team Makes Personalized Recipe Recommendations

A look into how recommendation algorithms are used to help readers find the right recipes for themselves.

NYT Open
Oct 13, 2023

Illustration by JooHee Yoon

By Kyelee Fitts and Celia Eddy

With tens of thousands of recipes in our catalog, The New York Times Cooking team faces a challenging task — how can we serve the best recipes for our users, taking into account their varying preferences, including diet, nutrition, cook time, cuisine, and ingredients? Each week, The New York Times Cooking editors manually curate a set of recipes and collections to promote on the homepage and in email newsletters. To enhance our editorial curation, we have been experimenting with personalized recipe recommendations with the goal of making sure our users find the right recipes for them.

In August, we launched a new personalized homepage in The New York Times Cooking mobile app. This new homepage includes both editorially and algorithmically programmed content, grouped into different carousels. Examples of editorially curated content include the recipe of the day, recipes from Sam Sifton and Melissa Clark’s newsletter, and editorially curated collections.

Left: Example of an editorially curated carousel on the Web. Right: Carousels on the iOS app.

Other carousels are powered by algorithmic recommendations; for example, the We Think You’ll Love carousel recommends recipes based on what users have engaged with in the past.

In this post, we’ll focus on how we use machine learning algorithms to deliver recipe recommendations for our users. Different algorithms serve different user needs, and we’ll walk through a few examples of how we apply different algorithms across The New York Times Cooking website and mobile app.

How do our algorithms work?

First, we need to define a pool of eligible content to recommend to users. Pools can be manually curated by editors or can be generated via a query of our recipe database, using a set of rules we determine for each recommendation carousel. Next, we rank the items in that pool using a ranking algorithm that we define. This ranking algorithm can be as simple as ordering recipes by popularity, but we can also use reinforcement learning methods or natural language processing techniques. Let’s go through a few of the ranking algorithms that power our carousels, each of which fulfills a different user need.
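
The snippet below is a minimal sketch of this pool-then-rank pattern. The recipe fields and the popularity-based ranker are illustrative, not our production schema or ranking code.

from dataclasses import dataclass

@dataclass
class Recipe:
    recipe_id: str
    title: str
    weekly_pageviews: int
    tags: list

def build_pool(catalog, required_tag):
    # Pool step: select eligible recipes with a simple rule (here, a tag filter).
    return [r for r in catalog if required_tag in r.tags]

def rank_by_popularity(pool, k=10):
    # Rank step: the simplest ranking algorithm, ordering by recent pageviews.
    return sorted(pool, key=lambda r: r.weekly_pageviews, reverse=True)[:k]

catalog = [
    Recipe("r1", "Stone Fruit Caprese Salad", 12000, ["salad", "vegetarian"]),
    Recipe("r2", "Roast Chicken", 45000, ["dinner"]),
    Recipe("r3", "Cucumber Salad", 8000, ["salad", "vegan"]),
]
print([r.title for r in rank_by_popularity(build_pool(catalog, "salad"))])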

Recipe Recirculation

One tool we often use in our ranking algorithms is text embedding. Embeddings are structured numeric representations of words or documents that enable us to use text as input to machine learning models. Once we represent our recipes as fixed-length vector embeddings, we can take the distance between the vector embeddings for any pair of recipes as a measurement of how similar the recipes are to each other. For example, if we were representing these three recipes as a length-2 vector embedding, we can plot them in 2 dimensions and get an idea of how “far apart” the recipes are:
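
As a toy illustration (the vectors below are made up), the two salads sit close together in this two-dimensional space, while the roast chicken recipe lands far from both.

import numpy as np

# Made-up length-2 embeddings for three recipes.
embeddings = {
    "Cucumber Salad": np.array([0.9, 0.2]),
    "Stone Fruit Caprese Salad": np.array([0.8, 0.3]),
    "Roast Chicken": np.array([0.1, 0.9]),
}

def distance(a, b):
    # Euclidean distance between two embedding vectors.
    return float(np.linalg.norm(a - b))

print(distance(embeddings["Cucumber Salad"], embeddings["Stone Fruit Caprese Salad"]))  # small: similar
print(distance(embeddings["Cucumber Salad"], embeddings["Roast Chicken"]))              # large: dissimilar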

In practice, these embeddings encompass more than just two values, allowing them to capture the multifaceted nuances of recipes. We use several different parts of the recipe text to generate these embeddings: the title, the description, the ingredients, and the cooking steps.

We’ve experimented with many different methods for embedding recipes. Recently, we upgraded our recipe embeddings to use a pre-trained sentence transformer model that is optimized for measuring similarity between two pieces of text. These models are easy to use, fast to score, and, most importantly, outperform other embedding methods in our experiments in terms of engagement with our content.
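
A simplified sketch of what generating these embeddings looks like with the sentence-transformers library is below; the specific model name and the recipe text are placeholders, not the exact model or data we use in production.

from sentence_transformers import SentenceTransformer

# The model name here is a common off-the-shelf choice, used for illustration.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Concatenate the text fields we embed: title, description, ingredients, and steps.
recipe_text = " ".join([
    "Stone Fruit Caprese Salad",
    "A summery twist on caprese, swapping tomatoes for ripe peaches and plums.",
    "peaches; plums; fresh mozzarella; basil; olive oil; flaky salt",
    "Slice the fruit and mozzarella, layer with basil, then drizzle with oil and season.",
])

embedding = model.encode(recipe_text)  # a fixed-length vector (384 dimensions for this model)
print(embedding.shape)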

We use these embeddings directly on the Similar Recipes ribbon (pictured below), a recirculation module that appears below the recipe steps. It directs users to recipes that are similar to the one that they are currently viewing.

Once we generate embeddings for all the recipes we want to consider for the carousel, we compute each recipe’s similarity to every other recipe in the catalog using a metric called cosine similarity. Cosine similarity ranges from -1 to 1, where a number closer to 1 means the recipes are more similar. We can then sort all the recipes by cosine similarity to find the ones that are most similar. On this carousel, we use a weighted combination of the similarity score and the number of pageviews each recipe has recently received, so the top recipes shown are both similar to the current recipe and fairly popular.

Recirculation ribbon for Stone Fruit Caprese Salad
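
As a concrete sketch of the ranking described above, the snippet below blends cosine similarity with recent pageviews; the 70/30 weighting and the pageview normalization are illustrative assumptions rather than our production values.

import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_similar(current_embedding, candidates, similarity_weight=0.7, k=5):
    # candidates: list of (title, embedding, recent_pageviews) tuples.
    max_views = max(views for _, _, views in candidates)
    scored = []
    for title, embedding, views in candidates:
        score = (similarity_weight * cosine_similarity(current_embedding, embedding)
                 + (1 - similarity_weight) * views / max_views)
        scored.append((score, title))
    # Highest combined score first: similar to the current recipe and fairly popular.
    return [title for _, title in sorted(scored, reverse=True)[:k]]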

Most Popular This Week / We Think You’ll Love

The model we use to power the majority of our algorithmic carousels is a contextual multi-armed bandit algorithm. As a quick recap (see our previous blog post for more information), a bandit is a reinforcement learning algorithm that suggests the best decision (i.e., an article or recipe) among a list of options (in our case, a pool of content that’s eligible to be recommended in a certain location) in order to maximize a reward. We use a modified version of a model that predicts the “reward” (in our case, a user clicking on a piece of recommended content) of showing a particular article or recipe to a user as a linear combination of contextual features plus a term that estimates the uncertainty of the prediction. The bandit balances exploiting known high-CTR content with exploring new content, and over time it learns what recommendations to make in order to optimize reader engagement. We have made several modifications to the bandit algorithm that we’re using, including introducing stochasticity and “forgetfulness” to ensure we can model changes in article and recipe CTR over time. Through experimentation, we determined that these model updates improved reader engagement and recommendation freshness.
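
To make the idea concrete, here is a heavily simplified, LinUCB-style sketch of a single arm: a linear reward estimate plus an uncertainty term, with a decay factor standing in for the “forgetfulness” described above. The alpha and decay values are illustrative, not our production parameters.

import numpy as np

class LinUCBArm:
    # One candidate recipe (arm) in a LinUCB-style contextual bandit.
    def __init__(self, n_features, alpha=1.0, decay=0.99):
        self.alpha = alpha            # width of the uncertainty bonus (exploration)
        self.decay = decay            # "forgetfulness": down-weight old observations
        self.A = np.eye(n_features)   # accumulated feature outer products (plus prior)
        self.b = np.zeros(n_features) # accumulated reward-weighted features

    def score(self, x):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b                         # estimated linear reward weights
        uncertainty = self.alpha * np.sqrt(x @ A_inv @ x)
        return float(theta @ x + uncertainty)          # predicted reward + exploration bonus

    def update(self, x, reward):
        # Decay past evidence so the model can track changes in CTR over time.
        self.A = self.decay * self.A + np.outer(x, x)
        self.b = self.decay * self.b + reward * x

At serving time, each eligible recipe’s arm would be scored against the user’s contextual feature vector, and the highest-scoring recipes fill the carousel.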

A bandit with no contextual features will typically recommend the most engaging articles or recipes to all users. This is the algorithm that we currently use to power the Most Popular This Week carousel on the mobile app and the web (pictured below). The pool consists of the recipes that received the most pageviews over the last week. This allows us to show the recipes that have been most popular recently and order them efficiently in terms of what will drive the most engagement on the carousel.

We also have the ability to add features to our bandits to help them learn what content will likely result in a click. One contextual feature we use frequently is something we’ll call the personalization feature, which tells us how similar a recommendation candidate is to the recipes a user has engaged with in the past. To create this feature, we first calculate embeddings for all our recipes using a sentence transformer model, as described above. Then, we use prior engagement data to generate user vectors for each of our users. In our case, we have chosen recipe saves as a strong indication of a user’s interest in a recipe, since a user saves a recipe if they intend to cook it or return to it later. For a given user, we take all of the recipes that they have saved, get the embeddings of those recipes, and average them pointwise to create a user history vector. Then, we can measure what recipes are closest to that user history vector according to cosine similarity. We use the cosine similarity score as a contextual feature in the bandit and let it learn how much weight to give the feature as a predictor of clicks.

For a user (yellow smiley face) who has saved the 3 red dots, we can find their average user history vector (yellow dot) and recommend the Cucumber Salad over the Roast Chicken recipe.
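
In code, the personalization feature looks roughly like the sketch below; the embeddings themselves come from the sentence transformer model described earlier.

import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def user_history_vector(saved_recipe_embeddings):
    # Element-wise average of the embeddings of the recipes the user has saved.
    return np.mean(saved_recipe_embeddings, axis=0)

def personalization_feature(saved_recipe_embeddings, candidate_embedding):
    # The contextual feature the bandit sees for this candidate recipe.
    return cosine_similarity(user_history_vector(saved_recipe_embeddings), candidate_embedding)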

When we first tested this bandit personalization feature on the We Think You’ll Love carousel, we realized that the same popular recipes tended to be recommended to the majority of users. In other words, the bandit was not paying as much attention to each individual user’s preferences as we’d like for a carousel that’s explicitly personalized. To ensure that our recommendations were aligned with the business goals for this carousel, we decided to modify the bandit reward. Instead of rewarding the bandit equally for every click it got, we rewarded it more if it got a click and the recipe was similar to recipes the user had engaged with in the past. Modifying the reward increased the diversity of the recipes we were showing between users and resulted in more personalized recommendations. We found via experimentation that a bandit with this personalized reward drove more engagement than our previous approach, which involved ranking recipes purely by the similarity between candidate recipes and a user’s saved recipes.
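
One way to express this kind of modified reward is sketched below; treat the 1 + similarity form as an illustrative assumption, not our exact formula.

def personalized_reward(clicked, similarity_to_user_history):
    # No click, no reward.
    if not clicked:
        return 0.0
    # A click on a recipe close to the user's saved recipes earns a larger reward
    # than a click on a recipe far from them.
    return 1.0 + max(similarity_to_user_history, 0.0)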

Further improvements to cooking algorithms: Seasonality and Diets

Many users are interested in cooking with produce that is in season in their area. To meet this need, we developed a seasonality score: we match the seasonal produce of different regions of the US with the ingredients in our recipes and score each recipe based on how seasonal it is in each region at each time of the year. The seasonality score is based on several factors, including the number of seasonal ingredients the recipe contains, the percentage of total ingredients that are seasonal, and whether the recipe contains ingredients that are newly in season.
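
A simplified version of such a score, combining the three factors above, might look like the sketch below; the weights and the seasonal-ingredient lists are made up for illustration.

def seasonality_score(recipe_ingredients, in_season, newly_in_season,
                      w_count=1.0, w_fraction=1.0, w_new=0.5):
    if not recipe_ingredients:
        return 0.0
    seasonal = [i for i in recipe_ingredients if i in in_season]
    count = len(seasonal)                                   # number of seasonal ingredients
    fraction = count / len(recipe_ingredients)              # share of ingredients in season
    has_new = any(i in newly_in_season for i in seasonal)   # anything newly in season?
    return w_count * count + w_fraction * fraction + w_new * has_new

# Example: a stone fruit salad scored for the Northeast in August (made-up data).
print(seasonality_score(
    ["peaches", "mozzarella", "basil", "olive oil"],
    in_season={"peaches", "basil", "tomatoes", "corn"},
    newly_in_season={"peaches"},
))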

This method of ranking recipes by seasonality allows us to build pools of recipes that are locally seasonal in different regions of the US. Instead of just ranking by the seasonality score, we use a bandit on top of the seasonal pool in order to optimize engagement within that locally seasonal pool of recipes.

This algorithm is what powers the In Season Near You carousel on the personalized homepage (pictured below).

Finally, let’s cover dietary needs and preferences. Dietary preferences are a huge aspect of recipe personalization: Users want their New York Times Cooking app to cater to their specific dietary needs. We don’t currently collect data from users on their explicit dietary preferences, so instead we infer their preferences from their activity within the product, which lets us more effectively provide the personalized recommendations people expect. On the new, more personalized homepage, we introduced Dietary Preference carousels (pictured below).

When our recipe developers and editors create a recipe, they tag it with a variety of tags, including cuisine, main ingredients, and whether it adheres to a specific diet. The dietary preference carousels use contextual bandits to recommend a set of the most popular diet-tagged (in particular, vegetarian, vegan, dairy-free, or gluten-free) recipes for users. These carousels only appear if a user has saved a certain number of those diet-tagged recipes to their recipe box.

Another way that we’ve used dietary preferences is as a contextual feature in our bandits. This is a feature we’ve introduced recently to the We Think You’ll Love carousel. To create this feature, we first count how many recipes are of a particular diet in a user’s saved recipes. We then construct a vector with these counts and normalize them:

# [vegetarian, vegan, gluten-free, dairy-free]
# For user "Jane": counts of diet-tagged recipes among her saved recipes
diet_vector_counts = [10, 5, 5, 0]
# Normalized by dividing each count by the total number of counted saves (20)
diet_vector_normalized = [0.5, 0.25, 0.25, 0]

The bandit feature is the cosine similarity between this dietary preference vector and a vector of the recipe’s diet tags. So for a vegetarian recipe, the recipe diet tag vector is:

recipe_diet_vector = [1, 0, 0, 0]

And the feature that the bandit would use to predict whether or not user Jane clicks on this particular vegetarian recipe is ~0.82. Adding this contextual feature about diet tag similarity enables the bandit to make recipe recommendations that are more in line with users’ implicit dietary preferences.
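
Working through the numbers in that example:

import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

diet_vector_normalized = np.array([0.5, 0.25, 0.25, 0.0])  # Jane's diet preference vector
recipe_diet_vector = np.array([1.0, 0.0, 0.0, 0.0])        # a vegetarian recipe

print(round(cosine_similarity(diet_vector_normalized, recipe_diet_vector), 2))  # 0.82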

These were just a couple of examples of how we’re improving our recipe recommendations to better meet reader needs. As we learn more about our users’ preferences, we are continuing to refine our algorithms to improve the personalization within The New York Times Cooking app, working closely with our newsroom editors to ensure that we are always bringing our users a curated, enjoyable Cooking experience.

Happy Cooking!

Kyelee Fitts is a former Senior Data Scientist on the Algorithmic Recommendations team at The New York Times, and she is currently a data scientist at Google. Outside of work, Kyelee likes to dance, travel, and cook.

Celia Eddy is a Lead Data Scientist on the Algorithmic Recommendations team at The New York Times. Outside of work, Celia enjoys board games, reading, and watercolor painting.

