Generating The Total Nutritional Content For a Week
During a conversation with one of my friends, I showed them some of my work with analyzing the protein content of various meats and vegetables. After seeing the analysis, they asked me if I would be interested in analyzing the total caloric intake for a week's worth of meals.
Introduction
During a conversation with one of my friends, I showed them some of my work with analyzing the protein content of various meats and vegetables. After seeing the analysis, they asked me if I would be interested in analyzing the total caloric intake for a week's worth of meals.
I learned a great amount concatenating, droping, and manipulating indexes in Pandas.
import pandas as pd
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))
df_days = pd.read_csv('Days.csv')
df_ingredients = pd.read_csv('Ingredients.csv')
df_meals = pd.read_csv('Meals.csv')
df_results = pd.read_csv('Results.csv')
Investigating Dataframes
Before calculating the totals, I wanted to take a look at the dataframes, I transposed the columns in libreoffice such that the Days were columns, and the meals were rows. The majority of Pandas operations manipulate columns rather than rows.
I also transposed the columns for the other dataframes (df_meals
and df_ingredients
, )as well
df_days
df_meals
df_ingredients
df_results
Generating The Code
After looking at the raw data and transposing the data, I decided to use multiple for loops to solve the problem.
My general plan followed
- get meals for each day
- make a dataframe for each day where there is 1
- get foods for each meal
- add each food to a list
- add each ingredient to a list
- store sum in results
The following code iterates through all the dataframesdf_days
,df_meals
, and df_ingredients
intialized in the previous step to generate a dataframe with all the individual ingredients for all the days
Variables Intialized outside of the loop
meal_list
,food_list
, andingredient_list
are initially empty, we will set values for them once the loop begins.days
is a list of the days of the week, we will use to keep track of the day. Furthermore, we will usedays
to specify which column we want python to focus ondf_summary
is a empty dataframe we will append the ingredients tocounter
is a integer variable that that is used as a index when iterating thedays
list, it is set to be -1 to offset the starting column ofdf_days
all print()
and display()
statements were used to view the variables as the loop iterated
Explanation For The First loop
The first for
loop iterates accross the columns of df_days
dataframe in the first loop, generates a list of meals consumed in a day
column_title
is the title of the column fromdf_days
day
is the current daydf_day_non_zero
isdf_days
, but all values greater than 0meal_list
usesdf_day_non_zero
but has only the meals quantity for a single day,.reset_index()
resets the index as filtering for the day changes the index,.drop(columns = "index")
drops the index created by.reset_index()
Explanation For The Second Loop
The second for
loop iterates accross the columns of meal_list
dataframe generated for a day from the previous step, generates a list of foods consumed in meal
meal_item
is the title of the column fromdf_meals
df_empty
is a empty temporary dataframe that we generate each iterationfood_quantity
stores the number of meal consumed in a daydf_meal_day
isdf_meals
, but all values greater than 0food_list
dataframe that usesdf_meal_day
to get the meal item for a particular mealfood_list[meal_item]
dataframe that multiplies the specific quantity by thefood_quantity
Explanation For The Third Loop
The third for
loop iterates through the food_list
dataframe generated for a meal from the previous step, concatenates the ingredients to df_empty
dataframe, uses a conditional that will multiply the nutritional value by the number of values consumed
df_current_ingredient
isdf_ingredients
but when the food item matches the value from`food_list
ingredient_quantity
stores the number of meal items consumed in a single meal
After the third for
loop, we will concatenate all the ingredients per meal into a single dataframe with two columns for the day and meal the item was consumed from. When creating new columns using the df["new column"] = value
Pandas will add the new column to the end of the columns.
meal_list = []
food_list = []
ingredient_list = []
days = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]
df_summary = pd.DataFrame()
counter = -1
# loop through days table
for column_title in df_days:
# get meals for each day
# make a dataframe for each day where there is 1
day = days[counter]
df_day_non_zero = df_days[df_days[day] > 0]
meal_list = df_day_non_zero[["Meal",day]].reset_index().drop(columns = "index")
#print statements for debugging and tracking variable values
#meal_quantity = meal_list[day][]
#print("days[i]")
#display(days[counter])
#print("df_day")
#display(df_day)
#print("column_title")
#display(column_title)
#print("meal_list")
#display(meal_list)
food_counter = 0
# get foods for each meal
for meal_item in meal_list["Meal"]:
#print("meal_item")
#display(meal_item)
df_empty = pd.DataFrame()
food_quantity = meal_list[day][food_counter]
#print("food_quantity multiplication factor")
#display(food_quantity)
df_meal_day = df_meals[df_meals[meal_item] >= 1]
food_list = df_meal_day[["Meal", meal_item]].reset_index().drop(columns = "index")
#print("food_list before multiplication")
#display(food_list)
food_list[meal_item] = food_list[meal_item]*food_quantity # multiplies the list by the number of times from days ie if someone eats 2 meals, the meal item will be multipled by 2
#print("food_list after multiplication")
#display(food_list)
# add each food to a list
for food_list_index in range(len(food_list)):
#print("food_list")
#display(food_list)
#print("food_list_index")
#display(food_list_index)
df_current_ingredient = df_ingredients[df_ingredients["Item"] == food_list["Meal"][food_list_index]]
ingredient_quantity = food_list[meal_item][food_list_index]
#print("df_current_ingredient")
#display(df_current_ingredient)
#print("food_list[Meal][food_list_index]")
#display(food_list["Meal"][food_list_index])
#print("ingredient_quantity")
#display(ingredient_quantity)
if(ingredient_quantity>=1):
df_empty = pd.concat([df_empty, df_current_ingredient * ingredient_quantity])
else:
df_empty = pd.concat([df_empty, df_current_ingredient])
#display(df_empty)
# add each ingredient
#need to make a running total, and then append to a seperate df for each day
df_empty.loc[column_title,:] = df_empty[["Item","Calories", "Total Fat", "Saturated Fat", "Monounsaturated Fat", "Polyunsaturated Fat", "Cholesterol", "Carbs", "Sugar","Protein", "Fiber", "Sodium"]].sum(axis=0)
df_empty["Meal"] = meal_item
#display(df_empty)
df_summary = pd.concat([df_summary, df_empty.tail(1)])
#display(df_summary)
food_counter+=1
counter+=1
# store sum in results
df_summary = df_summary.reset_index().rename(columns = {"index" : "Day"})
df_summary
df_grouped = df_summary[4:].groupby(['Day']).sum().reindex(days)
df_grouped
df_test = df_days[df_days["Thursday"] == 1]
#df_test[["Meal","Monday"]]
df_test
print(df_days.columns.values[1:8])
df_test2 = df_meals[df_meals["Standard breakfast"] ==1]
df_test2
df_test2[["Meal"]]
ingredient_list = []
ingredient_list = df_test2[["Meal"]].reset_index().drop(columns = "index")
ingredient_list
len(ingredient_list)
df_test3 = df_ingredients[df_ingredients["Item"] == ingredient_list["Meal"][0]]
df_test3
df_test4 = df_ingredients[df_ingredients["Item"] == ingredient_list["Meal"][1]]
df_test4
df_test5 = df_test3.append(df_test4)
df_test5
df_test5.loc['Total',:]= df_test5.sum(axis=0)
df_test5
df_test5
df_meal_day = df_meals[df_meals["Weekend egg muffin breakfast"] >= 1]
display(df_meal_day)
display(df_meals)
food_list = df_meal_day[["Meal", "Standard breakfast"]].reset_index().drop(columns = "index")
food_list
df_day[["Meal", "Monday"]]
meal_list
df_test6 = meal_list[meal_list["Sunday"]>1]["Sunday"]
df_test6
df_test7 = meal_list["Meal"][meal_list["Sunday"]==2]
df_test7
cats = [ 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
df_weekday = df_summary[4:].groupby(['Day']).sum().reindex(cats)
df_weekday