Analyze pizza recipes using R for customer insights and forecast trends

Presentation made with Canva. Click on the image below to access the Executive Summary, or see it on Canva.

Analysis of pizza recipes by anrimissor

Full R script available on my GitHub profile.

I used R to filter all the recipes containing the word “pizza” from a data set scraped from “” (retrieved here). The original dataset included a large variety of recipes, not only for pizza (32367 observations and 16 variables).
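The filtering step can be sketched in base R as below. This is a minimal illustration rather than the script from my GitHub profile: the tiny `recipes` data frame and its `name` column are made-up stand-ins for the scraped data.

```r
# Hypothetical stand-in for the scraped recipe data; the column name
# `name` is an assumption, not necessarily the one in the real dataset.
recipes <- data.frame(
  id   = 1:4,
  name = c("beth's pizza crust", "banana bread", "Pizza Joes", "chicken soup"),
  stringsAsFactors = FALSE
)

# Keep only the rows whose name contains "pizza", ignoring case.
pizza <- recipes[grepl("pizza", recipes$name, ignore.case = TRUE), ]
pizza$name
```

Matching on the word alone also keeps recipes that merely mention pizza in the title, which is worth remembering when reading the results.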

The result is a dataset with 14599 observations and many more variables (e.g., I split the “nutrition” variable into “calories,” “total_fat,” “sugar,” “sodium,” “protein,” “sat_fat,” and “carbs”).
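As a rough sketch of that split, the code below assumes that “nutrition” is stored as a bracketed, comma-separated string with the seven values in the order listed above. Both the storage format and the sample values are assumptions made for illustration.

```r
# Hypothetical sample; the bracketed string format is an assumption.
pizza <- data.frame(
  name      = c("recipe a", "recipe b"),
  nutrition = c("[173.4, 18.0, 0.0, 17.0, 22.0, 35.0, 1.0]",
                "[51.5, 0.0, 13.0, 0.0, 2.0, 0.0, 4.0]"),
  stringsAsFactors = FALSE
)

cols <- c("calories", "total_fat", "sugar", "sodium",
          "protein", "sat_fat", "carbs")

# Strip the brackets, split on commas, and convert each piece to a number.
clean  <- gsub("\\[|\\]", "", pizza$nutrition)
parts  <- strsplit(clean, ",")
values <- t(sapply(parts, function(p) as.numeric(trimws(p))))
colnames(values) <- cols

# Replace the single text column with seven numeric columns.
pizza <- cbind(pizza["name"], as.data.frame(values))
str(pizza)
```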

The outcomes of the analysis were striking, especially with regard to the ingredients used as toppings and the presence of frozen pizza crust in most high-rated recipes. I also compared the site’s “search” functions with those of Giallo Zafferano, one of Italy’s most famous food blogs. The comparison confirms that Americans like many topping variations, while Italian searches mostly focus on methods for preparing the dough and the type of flour. While performing the data analysis I also noticed some technical differences between the two websites in terms of usability and organization, along with some potential insights about usage.

The differences between American and Italian pizza recipes stem at least in part from interesting historical facts (see, e.g., Turim, 2018). When Italians migrated to the United States, they first settled primarily in big cities such as New York and Chicago; immigrants were used to baking basic recipes prepared with the ingredients available in their places of origin (wheat flour, tomatoes, basil, mozzarella cheese). After World War II, many Italian-Americans moved to the suburbs and pizza’s popularity increased; at the same time, they added local ingredients that were not available elsewhere. This helps to explain why corn flour and ground beef are common ingredients in American pizza recipes, but are almost never found in Italian pizza.

This screenshot shows the pizza results on the website. Click on the image for details.

Top Recipes

The above screenshot shows the first results that appear after searching for “pizza” on the website in November 2022. The results are grouped according to the number of reviews.

In the screenshot below I count the pizza recipes from the dataset according to the number of ratings they received between 2005 and 2018, when the website was scraped. The results are almost the same as in the screenshot above, except for “beth’s pizza crust.” Note also that the number of ratings has not increased significantly, even though at least 4 years separate the two snapshots.

Table 1. Most rated pizzas as per scraped data between 2005 and 2018


Most recipes had high ratings, with an average of 4.47; 10900 recipes obtained 5-star ratings — suggesting an appreciation for anything named pizza. Many recipes listed in the pizza category are not really pizza. One example is the top-ranked “my family’s favorite sloppy joes (pizza joes).” Another is the “banana berry brownie pizza,” a sort of pizza-inspired dessert with dough covered with slices of bananas.

Table 2
Table 3


There is no clear relationship between calories and ratings, but recipes with 2-star ratings have more calories on average.
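An average-calories-per-rating summary of this kind can be computed with `aggregate()`. The numbers below are invented purely to show the mechanics; the column names `rating` and `calories` are assumptions.

```r
# Hypothetical toy data: one row per rated recipe.
reviews <- data.frame(
  rating   = c(5, 5, 4, 2, 2, 1),
  calories = c(300, 500, 450, 900, 700, 400)
)

# Mean calories within each rating level.
avg_cal <- aggregate(calories ~ rating, data = reviews, FUN = mean)
avg_cal
```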

Table 4. Recipe calories and their ratings

Average calories per rating.

Table 5
Table 6. Distribution of calories on a log scale for compact visualization

Preparation Time

The average preparation time of the pizza recipes is 50.17 minutes, and the median prep time is 35 minutes.

Table 7

Some recipes just require a pre-made pizza crust and toppings, while in others the pizza dough is prepared from scratch and the rising times have been included. Dispersion in the next boxplot confirms that there are outliers.
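The gap between the mean and the median is what one would expect when a few very long recipes pull the average up. A small sketch with made-up preparation times shows the effect, and uses `boxplot.stats()` to list the points a boxplot would flag as outliers (by default, values more than 1.5 times the hinge spread beyond the hinges).

```r
# Hypothetical preparation times in minutes; the last one stands in for a
# recipe whose dough rests for a full day.
minutes <- c(15, 20, 25, 30, 35, 35, 40, 45, 60, 1440)

mean(minutes)    # pulled far upward by the single long recipe
median(minutes)  # barely affected by it

# Points that a default boxplot would draw as outliers.
out <- boxplot.stats(minutes)$out
out
```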

Table 8

The most-rated recipes do not require much prep time, except for the “easy peezy pizza dough bread machine pizza dough,” which is made using a bread-making machine:

Table 9

Here we see the relationship between ratings and the average number of minutes required to prepare recipes; recipes rated with 2 stars require a bit more time, on average, while 5-star recipes tend to require the least amount of time.

Table 10

Number of Steps

One recipe requires 66 steps, as we can see in the screenshot below. Recipes that require more steps have fewer reviews (count), probably because long recipes discourage users. However, after checking the web pages corresponding to those recipes, it appears that the actual number of “steps” does not match the numbers in the data.

The low number of reviews might be explained by the absence of a photo showing the final dish, by too many details, or by a problem in the way the number of steps is counted. For example, the recipe “bacon cheeseburger pizza” (5th on the list) is a short one with just 7 steps and a “ready-to-use” pizza crust, so 47 steps appears to be an error in the dataset.

Table 11
Table 12

The most-rated recipes require between 5 and 20 steps.

Table 13. The most rated recipes.

Number of Ingredients

The mean and the median number of ingredients required to prepare pizza recipes are similar: 9.03 and 9, respectively.

Table 14

Pizza recipes rated with 2 stars require more ingredients compared to the other recipes.

Table 15

The most-rated recipes have between 5 and 15 ingredients. The “quick and easy pizza dough” recipe requires 5 ingredients while the “ultimate pizza sauce” has 15 ingredients (this seems excessive).

Table 16

Pizza ingredients from the most rated recipes on Giallo Zafferano – Italy

Out of curiosity, I wanted to compare the ingredients used in the most-rated pizza recipes on the website with those of their Italian counterparts. Giallo Zafferano’s blog automatically groups recipes into some useful categories, including most-rated, most-visited, and those receiving the most comments. I scraped the sections with the most-rated recipes, which have between 10 and 16 ingredients. Note that Giallo Zafferano shows the average star rating of each recipe. See the appendix for a translation of the ingredients from Italian into English.

Table 17. The most-rated pizza recipes from Giallo Zafferano – Italy

Review and Submission Year

The pizza recipes posted on the website received more reviews between 2006 and 2010 than in other years. Both recipe postings and reviews declined starting in 2011. In fact, the differences between the current number of ratings and the number at the time this dataset was scraped are minimal (see the screenshot at the beginning of this article).

Table 18

Overall, there are more reviews in February and January than in other months.

Table 19
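Counts by year and by month can be obtained by formatting the review date and tabulating the result. The sketch below uses a handful of invented dates, and the column name `date` is an assumption.

```r
# Hypothetical review dates.
reviews <- data.frame(
  date = as.Date(c("2008-01-15", "2008-02-03", "2009-02-20", "2010-07-04"))
)

# Extract year and month as text, then count reviews in each group.
by_year  <- table(format(reviews$date, "%Y"))
by_month <- table(format(reviews$date, "%m"))

by_year
by_month
```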

Text analysis of reviews

Reviews with 5 stars and 0 stars (0 means that users have not tried the recipe) have more exclamation points.
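One simple way to count exclamation points per review and average them by rating is sketched below. The review texts are invented, and the column names `rating` and `review` are assumptions.

```r
# Hypothetical reviews; a rating of 0 means the user has not tried the recipe.
reviews <- data.frame(
  rating = c(5, 5, 3, 0),
  review = c("Great crust!!", "Loved it!", "It was fine.",
             "Can't wait to try this!!!"),
  stringsAsFactors = FALSE
)

# gregexpr() returns the match positions for each string, with -1 meaning
# "no match", so counting the positive positions gives the number of "!".
count_bang <- function(x) {
  sapply(gregexpr("!", x, fixed = TRUE), function(m) sum(m > 0))
}

reviews$n_bang <- count_bang(reviews$review)

# Average number of exclamation points per rating level.
bang_by_rating <- aggregate(n_bang ~ rating, data = reviews, FUN = mean)
bang_by_rating
```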

Table 20

Ingredients on the source website

The most common ingredients in the website’s pizza recipes are as follows:

Table 21
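A frequency count like this one can be produced roughly as follows. I am assuming here that each recipe stores its ingredients as a single bracketed, quoted string; that format and the sample rows are illustrative only.

```r
# Hypothetical sample; the "['a', 'b']" string format is an assumption.
pizza <- data.frame(
  name        = c("a", "b", "c"),
  ingredients = c("['salt', 'olive oil', 'mozzarella cheese']",
                  "['salt', 'sugar', 'olive oil']",
                  "['salt', 'water']"),
  stringsAsFactors = FALSE
)

# Remove brackets and quotes, split on commas, and trim whitespace.
# (Dropping every apostrophe is a shortcut that would also alter an
# ingredient name containing one.)
clean <- gsub("\\[|\\]|'", "", pizza$ingredients)
ingredient_list <- lapply(strsplit(clean, ","), trimws)

# Count how many recipes mention each ingredient, most common first.
counts <- sort(table(unlist(ingredient_list)), decreasing = TRUE)
head(counts, 3)
```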

Ingredients on Giallo Zafferano

The ingredients in the 12 most-rated pizza recipes on Giallo Zafferano.

Table 22. The most rated pizza recipes on Giallo Zafferano – frequent ingredients

The two most common ingredients are the same on both sites: salt and oil (in Italy, extra virgin olive oil).

The third most common ingredient on the source website is mozzarella cheese, while on Giallo Zafferano it is water.

Note that sugar is the fourth most common ingredient in the source website’s pizza recipes, while on Giallo Zafferano the closest equivalent is honey, in 19th place.

The Italian analysis of ingredients mostly shows “basic” ingredients required for the dough and for the sauce. For the dough there are water, salt, yeast, and specific types of flour such as 00, Manitoba, and Kamut; for the sauce there are tomatoes and basil. Cheese is mostly mozzarella or mozzarella “fior di latte.”

Not-so-good recipes

Out of 14599 reviews, 10900 gave ratings of 5 stars, which seems high. Maybe we can learn more by examining recipes that never received a 5-star review.

I used the “ggVennDiagram” package to create a Venn diagram to distinguish 5-star recipes from 1- and 2-star recipes. I then used the setdiff function to filter the names of those recipes, and to obtain their ingredients.
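The set logic behind the diagram can be sketched with base R’s `setdiff()` and `intersect()`. The recipe names and ratings below are invented, and this shows only the filtering step, not the ggVennDiagram plot.

```r
# Hypothetical reviews: one row per (recipe, rating) pair.
reviews <- data.frame(
  name   = c("recipe a", "recipe a", "recipe b",
             "recipe c", "recipe c", "recipe d"),
  rating = c(1, 5, 1, 2, 1, 5),
  stringsAsFactors = FALSE
)

# Names of recipes that received at least one review at each level.
one_star  <- unique(reviews$name[reviews$rating == 1])
two_star  <- unique(reviews$name[reviews$rating == 2])
five_star <- unique(reviews$name[reviews$rating == 5])

# Recipes with a 1-star review but never a 5-star one.
one_no_five <- setdiff(one_star, five_star)

# Among those, recipes that also received a 2-star review.
both_low <- intersect(one_no_five, setdiff(two_star, five_star))

one_no_five
both_low
```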

Table 23

18 recipes received at least one 1-star review but no 5-star ratings — see their names in the table below. Out of these 18 recipes, one received both 1-star and 2-star reviews, as we can see in the Venn diagram (the wedge between 1- and 2-star recipes). It is the “mac cheese stuffed crust pizza.” Perhaps it has too many carbs?

Table 24

13 recipes had at least one 2-star review and no 5-star reviews:

Table 25

Odd ingredients (plus some common ones)

Below we see the ingredients that appear in pizza recipes with at least one 1-star review and no 5-star ratings:

peach preserve, canola oil, pineapple juice, bisquick, baking soda, rolled oats, spaghetti noodles, dried thyme, cheese soup, celery ribs, and other more common ingredients.

Table 26

2 is the new 1 star

Recipes rated with 2 stars have more calories, require more time to be prepared, and have more ingredients. As we can see from the screenshot below, they also have fewer ratings (count).

Table 27

The favorite ingredients

These ingredients appear more than 100 times in recipes rated above 4.5:

Table 28. Ingredients appearing at least 100 times in recipes rated 4 and 5
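A filter of this kind can be sketched as below, assuming a long table with one row per recipe and ingredient pair plus the recipe’s average rating. The data and column names are hypothetical, and the count threshold is lowered so that the toy example returns something.

```r
# Hypothetical long-format data: one row per (recipe, ingredient) pair.
pairs <- data.frame(
  ingredient = c("salt", "salt", "salt", "ketchup", "ketchup",
                 "peach preserves"),
  avg_rating = c(4.8, 4.6, 3.9, 4.9, 4.7, 2.0),
  stringsAsFactors = FALSE
)

min_count <- 1  # the article uses 100; lowered here for the toy data

# Keep only rows from highly rated recipes, then count each ingredient.
high   <- pairs[pairs$avg_rating > 4.5, ]
counts <- table(high$ingredient)

# Ingredients appearing more than `min_count` times.
favorites <- names(counts[counts > min_count])
favorites
```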

The ingredients correlated with the highest ratings are:

granulated sugar, fast-rise yeast, chives, fresh ground black pepper, mayonnaise, bread flour, minced garlic cloves, dried Italian seasoning, bay leaf, cream cheese, fresh garlic, dry mustard, barbecue sauce, fresh cilantro, onion powder, and ketchup.

Ingredients appearing in recipes with ratings around 4.80 and liked more than 1000 times are:

celery, cornmeal, Italian seasoning, garlic, dried oregano, garlic powder, ground beef, pizza sauce, sugar, parmesan cheese, water, olive oil, salt, onion, and mozzarella cheese.

Eggs are listed more than 1000 times in recipes ranking between 4.75 and 4.80.

Prepared pizza crust is frequently mentioned among the favorite recipes.

A mix of good and bad

The network below shows all the ingredients that appeared at least 5 times in recipes rated with 1 star. There are many, and some are obviously not ingredients at all (ziploc bags? yes, for the “pizza lunchable” recipe). Some of them are quite unusual: English muffins? processed cheese food? creamy peanut butter? The reason for some “sweet” ingredients is that these recipes include some pizza desserts. Perhaps “pizza” is an overused term.

Table 29


Recipes for Pizza. (n.d.). Retrieved December 5, 2022, from

Ricette Pizza – Le ricette di GialloZafferano. (n.d.). Retrieved December 5, 2022, from

Schacht, E. (2019). EDA and Text Analysis.

Silge, J. (2021, December 15). Topic modeling for #TidyTuesday Spice Girls lyrics. Julia Silge.

Silge, J. (2022, January 21). Text predictors for #TidyTuesday chocolate ratings. Julia Silge.

Silge, J., & Hvitfeldt, E. (2022). Supervised Machine Learning for Text Analysis in R. In Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Turim, G. (2018, August 23). A Slice of History: Pizza Through the Ages. HISTORY.

Yan, Q. (2020). 4.2 Counting and correlating pairs of words with widyr | Notes for “Text Mining with R: A Tidy Approach.” In


Translation of Italian Recipes

Pizza a lunga lievitazione= Slow-Rise Pizza

Pizza alla Napoletana= Pizza Neapolitan Style

Pizza con farina integrale= Whole Wheat Pizza Dough

Pizza di Kamut= Khorasan Wheat Pizza Dough

Pizza Fritta= Deep Fried Pizza

Pizza Fritta con Mortadella e Fior di Latte= Deep Fried Pizza With Mortadella and Fior di Latte Mozzarella

Pizza in Teglia Alta e Morbida= Thick Crust Sheet Pan Pizza

Pizza napoletana= Same as Pizza Neapolitan Style

Pizza Romana= Romana Pizza

Pizza senza Glutine= Gluten Free Pizza

Pizza senza Impasto= No Knead Pizza Dough

Pizza Arrotolata Stromboli= Rolled Pizza

Translation of Italian Ingredients

Sale fino= fine ground salt

Olio extravergine di oliva= extra virgin olive oil

Acqua= water

Passata di pomodoro= tomato puree

Origano= oregano

Lievito di birra fresco= fresh yeast

Lievito di birra secco= dry yeast

Farina 00= 00 Flour (double zero)

Basilico= basil

Farina= flour

Pepe nero= ground black pepper

Mozzarella or Fiordilatte= mozzarella

Farina Manitoba= High protein flour

Provola= provolone

Miele= honey

Acciughe sott’olio= anchovy fillets in oil

