Guest post by Tsuji Ryutaro, Data Scientist at dely 

Founded in 2016 by entrepreneur Yusuke Horie, dely is the company behind the top Japanese cooking video app kurashiru. To date, the app has been downloaded by 21 million users and allows them to search through over 35,000 video recipes.

Kurashiru statistics

On the recipe side, all of kurashiru’s recipes are created by cooking professionals with diverse culinary backgrounds and are supervised by professional nutritionists.

On the technical side, kurashiru’s engineers have worked hard to develop the recipe search engine and have learned to consider a variety of details when building out the app. Some user considerations include:

  • What leftover ingredients are in the user’s refrigerator?
  • Which ingredients do the user’s family members like and dislike?
  • Which of the user’s family members ate lunch today?

Each of these considerations is more critical than you might think, because the recipe for a single dinner depends on many variables. For example, if a user opens the app inside a grocery store, the recipe they look up likely depends mainly on the price and quality of the food available. Recipe choices might also depend on schedules, such as which family members are available to cook.

To manage all of these moving parts, we started using Amazon SageMaker. In this post, we detail how we addressed each user’s preferences, in keeping with our principles, and how we took care of the “whole process of cooking” to provide users with the true value of our service.

diagram of dely sagemaker architecture

Architecture of AutoML in action

The Business Problem

Building a personalized meal-building service is difficult because so many variables must be accounted for to match everyone’s different tastes. To design a pipeline that offers a product for everyone, the architecture needs to be flexible: it must be able to switch each component as needed, offer alternative proposals, and satisfy the customer. Machine learning makes this task easier. To introduce ML into the kurashiru app, we first annotated professionally created meals based on predefined rules. Then, we trained category classification models on the annotated dataset using the XGBoost built-in algorithm and hyperparameter optimization (HPO) in Amazon SageMaker, which dramatically shortened the implementation time.

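Training using XGBoost, a built-in algorithm in Amazon SageMaker:

import boto3
import sagemaker
from sagemaker.amazon.amazon_estimator import get_image_uri
region = boto3.Session().region_name
smclient = boto3.Session().client('sagemaker')
sess = sagemaker.Session()
container = get_image_uri(region, 'xgboost', repo_version='latest')
# The remaining Estimator arguments are truncated in the source.
xgb = sagemaker.estimator.Estimator(container,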

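Passing the XGBoost Estimator to HPO in Amazon SageMaker:

from sagemaker.tuner import ContinuousParameter, HyperparameterTuner, IntegerParameter
hyperparameter_ranges = {'eta': ContinuousParameter(0, 1), 'min_child_weight': ContinuousParameter(1, 10), 'alpha': ContinuousParameter(0, 2), 'max_depth': IntegerParameter(1, 10)}
objective_metric_name = 'validation:auc'
tuner = HyperparameterTuner(xgb, objective_metric_name, hyperparameter_ranges,
                            max_parallel_jobs=3)  # other arguments are truncated in the source
# train and validation are the S3 input channels; their definitions are not shown in the source.
tuner.fit({'train': train, 'validation': validation})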

However, as time passed, we learned that the meals users actually wanted were very different from the nutritionally balanced meals created by kurashiru’s professional chefs and nutritionists. We therefore challenged ourselves to reflect our users’ true preferences in our recipe recommendations. We also gave users the ability to create their own original dish combinations, and we used k-means to recommend meal sets in a professionally refined manner.

The Technical Solution

To do this, we went through a great deal of trial and error. The first version of the kurashiru app was released with a feature that recommended a combination of dishes, such as a main dish, a side dish, and more. The latest version of kurashiru instead lets users create their own original combination of dishes. We believe this fits users’ needs better than merely providing a textbook combination. Although kurashiru’s chefs and nutritionists were pros at creating meals, their ideas were based on textbook nutritional studies.

Additionally, the combination of meals suggested by the app would occasionally not align with what the user actually wanted. The following is the logic for how we refined the meal.

When we recommended a dish in the “suggested” section, or as a side dish to pair with the user’s chosen main dish, we factored in positive feedback based on the user’s activity. Examples of positive feedback included a high number of video plays, favorites, the rate of matching recipes in the dish, and matches with the user’s search keywords.

We also removed recipes and food items that we inferred the user felt negatively about. Negative feedback was recorded when a user changed the dish that was recommended, or when suggested videos were left unplayed. We inferred which recipes and food items were disliked from the co-occurrence rate, and from which recipes were removed from the user’s list of recommendations.
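The feedback logic described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not our production code: the field names, the weights in `POSITIVE_WEIGHTS`, and the `score_recipe` and `filter_and_rank` helpers are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights for the positive signals; real values would need tuning.
POSITIVE_WEIGHTS = {
    "video_plays": 1.0,
    "favorites": 2.0,
    "match_rate": 3.0,
    "keyword_match": 1.5,
}


@dataclass
class RecipeFeedback:
    recipe_id: str
    video_plays: int = 0
    favorites: int = 0
    match_rate: float = 0.0        # rate of matching recipes, between 0 and 1
    keyword_match: bool = False    # recipe matched the user's search keywords
    was_swapped_out: bool = False  # user replaced the recommended dish
    left_unplayed: bool = False    # suggested video was never played


def score_recipe(fb: RecipeFeedback) -> float:
    """Weighted sum of the positive feedback signals."""
    return (
        POSITIVE_WEIGHTS["video_plays"] * fb.video_plays
        + POSITIVE_WEIGHTS["favorites"] * fb.favorites
        + POSITIVE_WEIGHTS["match_rate"] * fb.match_rate
        + POSITIVE_WEIGHTS["keyword_match"] * float(fb.keyword_match)
    )


def filter_and_rank(feedback):
    """Drop recipes with negative feedback, then rank the rest by score."""
    kept = [fb for fb in feedback if not (fb.was_swapped_out or fb.left_unplayed)]
    return sorted(kept, key=score_recipe, reverse=True)
```

Separating the removal step from the scoring step keeps the negative signals as hard filters, which matches the description above, while the positive signals only affect ordering.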

Image flow of differences between meals users built and annotated meals

Differences between meals users built and annotated meals


Thumbnail of meal built by annotated data

Meals built by annotated data

Inferring the visual balance of a meal with a model trained on labeled meal data:

import json
import boto3
import numpy as np
# Runtime client for calling the endpoint (not defined in the original snippet).
runtime = boto3.client('sagemaker-runtime')
with open(file_name1, 'rb') as f:
    payload = f.read()
    payload = bytearray(payload)
response = runtime.invoke_endpoint(EndpointName='image-classification-XXXXXXXXXX',
                                   ContentType='application/x-image',
                                   Body=payload)
result = response['Body'].read()
# The endpoint returns a JSON list of class probabilities.
result = json.loads(result)
index = np.argmax(result)
object_categories = ['Balanced visual!', 'Imbalanced visual!']
print("Result: label - " + object_categories[index] + ", probability - " + str(result[index]))
## => Result: label - Balanced visual!, probability - 0.6134527325630188

How We Designed Recipe Recommendations

To personalize the service further, we needed the recommended recipes to reflect the user’s preferences, their family’s preferences, and their lifestyle needs.

The recipe recommendation has to consider three main points:

  1. Usage Frequency
  2. Seasonality and Recurrence
  3. User Context

Usage Frequency

The first point refers to a user’s app usage frequency. If a user uses the service frequently, they are considered “warm,” and it becomes easier for the service to use collaborative filtering to implement the rating functionality.

Alternatively, if the user uses the service infrequently, it is difficult to collect enough information to recommend recipes because the process starts “cold.” We configured kurashiru to select users whose usage frequency is higher than a certain threshold, so that the recipe recommendation service could launch with a “warm” start.
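As a minimal sketch of the warm/cold split described above, the following assumes that usage frequency is measured as a session count per user. The `split_warm_cold` helper, the input format, and the threshold value are hypothetical and chosen only for illustration.

```python
from collections import Counter


def split_warm_cold(session_user_ids, threshold):
    """Split users into warm and cold sets by session count.

    session_user_ids: iterable of user IDs, one entry per app session.
    threshold: users with strictly more sessions than this are "warm".
    """
    counts = Counter(session_user_ids)
    warm = {user for user, n in counts.items() if n > threshold}
    cold = set(counts) - warm
    return warm, cold


# Example with made-up session logs.
warm, cold = split_warm_cold(["u1", "u2", "u1", "u3", "u1", "u2"], threshold=1)
print(sorted(warm), sorted(cold))
```

Only the warm set would then be passed to collaborative filtering; cold users would need a different fallback, which is outside the scope of this sketch.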

Seasonality and Recurrence

Seasonality can affect a recipe video’s play rate. For example, it would not be appropriate to recommend somen (cold noodles) to users in the winter, even if the data indicated that a particular user loved cold noodles. Additionally, a user might make the same recipe again a week later with a few variations not outlined in the original recipe. The time window of the data we use therefore needs to take into account the user’s latest data, as well as predefined seasonal data and event data, which apply to all users.
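One way to combine a seasonal filter with a preference for recent user data is sketched below. It is an assumption-laden illustration rather than the method used in kurashiru: the `IN_SEASON_MONTHS` table, the 30-day half-life, and the function names are all hypothetical.

```python
from datetime import date

# Hypothetical seasonal table: recipe ID -> months in which it is in season.
# Recipes absent from the table are treated as suitable all year round.
IN_SEASON_MONTHS = {
    "somen": {6, 7, 8},
    "oden": {11, 12, 1, 2},
}


def in_season(recipe_id, today):
    """Return True if the recipe is in season on the given date."""
    months = IN_SEASON_MONTHS.get(recipe_id)
    return months is None or today.month in months


def recency_weight(last_interaction, today, half_life_days=30.0):
    """Halve the weight of an interaction for every half_life_days of age."""
    age_days = (today - last_interaction).days
    return 0.5 ** (age_days / half_life_days)


def seasonal_score(recipe_id, base_score, last_interaction, today):
    """Zero out off-season recipes; otherwise decay the score by recency."""
    if not in_season(recipe_id, today):
        return 0.0
    return base_score * recency_weight(last_interaction, today)
```

With this sketch, somen scores zero in January no matter how much a user liked it in summer, while an in-season recipe the user interacted with 30 days ago keeps half of its base score.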

External Factor User Context

Recipe recommendations can also be influenced by external factors. For example, when a popular Japanese TV show featured a variety of special ingredients, the scene boosted the number of in-app searches for recipes using those ingredients. Popular culture clearly influences recipe searches. Recipe searches are also affected by the temperature outside a user’s home.

We decided to ask users to pick from several recommendations according to several contexts. We are currently developing this method.

Personalized recommendation flow of how meals get suggested

Personalized recommendation flow

It is also essential to meet users’ hidden requirements. To make sure that no event data that should be collected goes missing, we adopted an “event design first” method, in which we design the events in our architecture before beginning development.

Diagram flow of "event design first" architecture

Architecture of the “event design first” method

Method: Extracting User Attributes

Below are the methods we used to extract user profiles:

  1. Extracting profiles based on user activity
  2. Clustering
  3. Analyzing tendency per cluster

Extracting Profiles Based on User Activity

To use the user activity data, we looked at all contact between users and recipes. Specifically, we kept logs of which recipe videos the users played, how long they played them, and how many of the videos were marked as “favorites.” We then used only the data from the more frequent half of our users, so that we had enough data to start from a “warm” position. We did not use the data from the less frequent half.

Next, we analyzed that data based on recipe features. For example, if a user looked up a recipe with a short cooking time, it might indicate that the user was interested in saving time. On the other hand, if the contact with such recipes was limited to recent activity, the user’s interest might be only temporary, and we considered it reckless to conclude that a short cooking time was an essential feature for this user. Only if the preference for quick cooking recurred, and the tendency lasted a certain length of time, would it be safe to say that the user likes quick cooking times as a lasting characteristic. To observe such tendencies, we need data in which the feature recurs frequently, and a sufficient number of users who show it. If the sample is too small, the level of confidence drops. To keep the confidence high enough, rather than looking at each user on their own, we used the data from a certain number (k) of user clusters.
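The recurrence check described above could look something like the following sketch. The `is_persistent_preference` helper and its default thresholds (five events spanning at least 28 days) are hypothetical; the post does not state the actual criteria.

```python
from datetime import date, timedelta


def is_persistent_preference(event_dates, min_events=5, min_span=timedelta(days=28)):
    """Return True if a preference recurs often enough over a long enough period.

    event_dates: dates on which the user interacted with a recipe that has
    the feature of interest (for example, a short cooking time).
    """
    if len(event_dates) < min_events:
        return False
    return max(event_dates) - min(event_dates) >= min_span


# A burst of recent activity does not count as a lasting preference...
burst = [date(2020, 3, d) for d in (1, 1, 2, 2, 3)]
# ...but the same number of events spread over four weeks does.
weekly = [date(2020, 3, d) for d in (1, 8, 15, 22, 29)]
print(is_persistent_preference(burst), is_persistent_preference(weekly))
```

Requiring both a minimum count and a minimum time span is what separates a temporary interest from a recurring one in this sketch.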

Both rule-based clustering and k-means clustering were applied to form the user clusters. As shown below, we used k-means, which is available as a built-in algorithm in Amazon SageMaker.

Training with k-means and deploying an inference endpoint:

import os
from sagemaker import KMeans
role = os.getenv("SAGE_ROLE", "arn:aws:iam::XXXXXXXXXXXXXXXXXXXXXXX")
bucket = os.getenv("SAGE_OUT_BUCKET", "XXXXXXXXXXXXXXX")
data_location = 's3://ZZZZZZZZZ'
output_location = 's3://YYYYYYYYYYY'
k = N  # N is a placeholder for the number of clusters
# The instance settings and the training call are truncated in the source.
kmeans = KMeans(role=role,
                output_path=output_location,
                k=k,
                data_location=data_location)
predictor = kmeans.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge')

Analyze Tendency per Cluster

Analyzing each cluster reveals its particular features. For example, in one cluster, a recipe’s click-through rate (CTR) and number of favorites tend to decline significantly from the day the recipe is recommended to the following day.

We hypothesized that the CTR of this cluster is influenced by the freshness of the recipe entry. This hypothesis was confirmed in a test against another cluster.

Also, users in this cluster tend to use the service on weekends, and they play videos for a relatively short time. We therefore inferred that when they mark a recipe as a favorite, they may be saving it as an alternative to their usual cooking style. They are experienced cooks, so they do not even play the video while they are cooking. Based on this analysis, a recommendation to users in this cluster might be much more effective if the recipe is highly unique. The high CTR and number of favorites on the first day of a recommendation were due to the freshness of the recipe entry.

Moreover, because these users use the service frequently and tend to move on to new recipes quickly, we deducted points from the ratings of recipes that they had already viewed or marked as a favorite. This adjustment increased CTR by 15% compared with the case where the unadjusted recommendations were offered.

Line graph depicting the personalization improvements in terms of click through rate

CTR shift per cluster
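The rating adjustment for this cluster can be illustrated with a short sketch. The `adjust_ratings` and `rank_recipes` helpers, the fixed penalty of 1.0, and the sample ratings are hypothetical; the post does not say how many points were deducted.

```python
def adjust_ratings(ratings, seen_recipe_ids, penalty=1.0):
    """Deduct a fixed penalty from recipes the user already viewed or favorited.

    ratings: dict mapping recipe ID to its predicted rating.
    seen_recipe_ids: set of recipe IDs the user has viewed or favorited.
    """
    return {
        recipe_id: (rating - penalty) if recipe_id in seen_recipe_ids else rating
        for recipe_id, rating in ratings.items()
    }


def rank_recipes(ratings):
    """Return recipe IDs ordered from highest to lowest rating."""
    return sorted(ratings, key=ratings.get, reverse=True)


# Made-up ratings: recipe "a" would rank first, but the user has already seen it.
ratings = {"a": 4.5, "b": 4.0, "c": 3.8}
adjusted = adjust_ratings(ratings, seen_recipe_ids={"a"})
print(rank_recipes(adjusted))
```

A deduction rather than outright removal keeps previously viewed recipes eligible when nothing fresher scores well, which seems to fit a cluster that favors new entries.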


In this article, we introduced our service, kurashiru.

At dely, because we create our own recipe content, we can control both large-scale manual annotation and rule-based annotation, as well as how feedback from user activity is applied. In particular, Amazon SageMaker allows us to train and deploy models quickly without the cost of additional resources.

This year, we plan to expand this service to fine-tune the preference data for the latest recipes using reviews posted by users. Additionally, when creating meals, we plan to include information on particular family members, such as their ingredient preferences or food allergies. This will help the created meals meet user requirements more closely. We believe these plans will help the service deliver more value to our users.

Also, by periodically analyzing the contents of recipes and videos, and promoting better reproducibility of recipes, we plan to keep providing high-quality content.

This was originally posted in Japanese at the AWS Japan blog.