KGRec: An interpretable recommendation system using a knowledge graph

The Japanese version is available here. 日本語の記事はこちらになります。

Hello everyone. I am @joisino. I am doing an internship in Mercari from August 1st. I built a recommendation system using a knowledge graph. I introduce the results in this blog post.


As the number of items in Mercari increases, it becomes more and more difficult for users to find their favorite items. Thus the recommendation system plays a central role in finding items. However, in Mercari, each item is unique, and the same item will not be sold again. In other words, all items are suffered from the cold-start problem. This fact makes implementing an effective recommendation system difficult. In this internship, I built a catalog-based recommendation system using a knowledge graph, named KGRec, based on KGCN [Wang et al., 2019]. This system
(1) alleviates the cold-start problem by using the catalog and knowledge graph,
(2) is interpretable by figuring out which attributes (e.g., an author of a book and a genre) affect the result,
(3) can measure similarities between any attributes (e.g., item-item similarity, author-author similarity), and
(4) is scalable because it can recommend items just by running the nearest neighbor search in the 32-dimension Euclidean space, which works fast in practice.
Especially, KGRec can perform zero-shot recommendation: KGRec can recommend items that have never been bought by any user thanks to the information of the knowledge graph.
The table below contrasts KGRec against other recommendation systems.

Query Solves cold-start Zero-shot Interpretable Attribute Similarity Scalable
KGRec, KGCN [Wang+ 19]
RippleNet [Wang+ 18]
KPRN [Wang+ 19]
Catalog-based MF
Item-based MF

KGRec inherits the strong points of KGCN. KGRec further improves its performance and interpretability by the user-attribute loss and tuning for Mercari dataset. Existing KG-based recommendation systems such as RippleNet and KPRN also have some of the characteristics of KGRec, but not all of them. For example, KPRN is not scalable because it uses LSTMs to compute user-item scores.


In this section, I introduce what a knowledge graph is. A knowledge graph stores complex relations of information in a single graph. There are a number of knowledge graphs available online such as DBPedia and YAGO. However, they do not suit our purpose because most of the items in Mercari do not appear in such knowledge graphs. Therefore, I build an original knowledge graph based on the catalog data in Mercari. The table below shows examples of catalog data.

Title Authors Publisher Target Category
ぼくは王さま 和歌山静子/寺村輝夫 理論社 Infants
あいうえおうさま 和歌山静子/寺村輝夫/杉浦範茂 理論社 Infants 仮名

The figures below show a part of the knowledge graph I constructed for a book recommendation.


This knowledge graph stores which author wrote a book, which company published a book, what a book is about, and so on. Books that have common attributes (e.g., authors, genres) are connected indirectly via attribute nodes (e.g., ぼくは王さま — written by –> 寺町輝夫 <– written by — あいうえおうさま). This information enhances the performance of the recommendation. For example, thanks to the knowledge graph, we can recommend a book that was written by the same author as a book the user bought before.

Related work

There are many KG-based recommendation systems in the literature. In this section, I review the existing work briefly. HeteRec [Yu et al., 2014] is one of the first methods that utilize a knowledge graph for a recommendation. This enumerates paths between a user and an item in the knowledge graph to measure the similarity between the user and item. CKE [Zhang et al., 2016] utilizes embedding of entities of a knowledge graph for a recommendation as well as text and visual information extracted by autoencoders. RippleNet [Wang et al., 2018] learns how to aggregate information in a knowledge graph by the attention mechanism, which is utilized for the recommendation task. Recently, many methods have combined graph neural networks with KG-based recommendation systems. KGCN [Wang et al., 2019] is one of such methods. This aggregates information from neighboring nodes of a knowledge graph based on user-relation scores, which model how much important a user thinks each relation is. For example, thanks to the user-relation scores, KGCN can recommend a book based on its author to users that choose books based on the author, and this can also recommend a book based on its topic (e.g., self-development, family, and physics) to users that think topics are important to choose a book. KGCN suits best for our purpose because it is catalog-based, interpretable, and scalable. Therefore, I built KGRec based on KGCN.


In this section, I explain the method in detail. In KGRec, each node  v in the knowledge graph, including item nodes and attribute nodes, has an embedding vector  mathbf{h}_v in mathbb{R}^{d}. Each user  u and each relation  r (e.g., “is written by” and “is about”) also have embeddings  mathbf{h}_u, mathbf{h}_r in mathbb{R}^{d}, respectively. First, KGRec calculates an embedding of item  i customized for user  u by aggregating attributes from neighboring nodes:

 displaystyle hat{mathbf{h}}_{iu} = text{tanh}(sum_{(v, r) in mathcal{N}(i)} alpha_{iur} mathbf{W} mathbf{h}_v),

where  mathcal{N}(i) is the set of incident edges of  i,  v is a neighboring attribute,  r is a relation,  mathbf{W} in mathbb{R}^{d times d} is a paramter matrix,  alpha_{iur} is the user-relation score, which I describe next. User-relation scores model how much important a user thinks each relation is. This is modeled by the attention mechanism:

 displaystyle alpha_{iur} = frac{exp(mathbf{h}_u^{top} mathbf{h}_r)}{sum_{(v, r') in mathcal{N}(i)} exp(mathbf{h}_u^{top} mathbf{h}_{r'})}.

Finally, KGRec calculates the recommendation scores of items by inner product:

 text{score}(u, i) = mathbf{h}_u^{top} hat{mathbf{h}}_{iu},

and recommends top-K items in terms of the recommendation scores. In the training, I used BPR-loss [Randle et al., 2009]:

 displaystyle mathcal{L}_{text{BPR}} = sum_{(u, p, n) in mathcal{T}} sigma(mathbf{h}_u^{top} hat{mathbf{h}}_{pu} - mathbf{h}_u^{top} hat{mathbf{h}}_{nu})

where  mathcal{T} is training data, which contains triplets of a user, positive item, and negative item, and  sigma is the sigmoid function. In addition to this loss, I found that user-attribute BPR loss improves performance and stability:

 displaystyle mathcal{L}_{text{UA-BPR}} = sum_{(u, p_{text{att}}, n_{text{att}}) in mathcal{T}_{text{att}}} sigma(mathbf{h}_u^{top} mathbf{h}_{p_{text{att}}} - mathbf{h}_u^{top} mathbf{h}_{n_{text{att}}})

where  mathcal{T}_{text{att}} is training data, which contains triplets of a user, attribute of a positive item, and attribute of a negative item. This loss explicitly associates user embeddings with their favorite attributes, which improves interpretability. The loss function for KGRec is:

 mathcal{L} = mathcal{L}_{text{BPR}} + gamma mathcal{L}_{text{UA-BPR}} + eta |theta|_2^2,

where  theta denotes all parameters of KGRec (i.e.,  mathbf{h}_v, mathbf{h}_u, mathbf{h}_r, and  mathbf{W}) and  gamma and  eta are hyperparamters.


I aim to answer the following questions through extensive experiments:

  • Q1: Does KGRec perform better than ordinary catalog-based recommendations especially in the cold-start setting?
  • Q2: Is the result of KGRec interpretable?
  • Q3: Can KGRec recommend attributes as well as items?
  • Q4: Is KGRec scalable?

Experimental Setup:

I use Mercari book data through experiments. I set the dimension of embeddings as 32 through experiments. I train models using Adam optimizer [Kingma and Ba, 2015] with learning rate = 0.001.

Experiments for Q1:

To show the effectiveness of KGRec, I measure the recall of KGRec and the catalog-based matrix factorization [Koren et al., 2009] trained by the BPR-loss for Mercari book data. I conduct two experiments. In the first experiment, I use items that are purchased at least 5 times, which is a common setting in the recommendation research. In the second experiment, I use all items, including items that are not bought by any user in the training dataset, which is extremely difficult to predict. This is a more realistic setting because most of the items are purchased less than 5 times.

Since there are as many as 262661 and 774716 items in the subsampled and all item sets, respectively, it is too time-consuming to exactly compute the exact ranking for all users. Therefore, I resort to the sampling scheme. Namely, For each user, I sample 199 items randomly (therefore 200 items with the ground truth item in total) and measure recall@5, recall@10, recall@20, and MRR of the ground truth item in the test data.

The tables below report the result. The result of the first experiment indicates that KGRec outperforms the catalog-based matrix factorization method. More importantly, KGRec further outperforms the matrix factorization in the second experiment. This shows that KGRec effectively alleviates the cold-start problem. Note that the scores in the second experiment tend to be higher than those in the first experiment because the number of irrelevant items increases in the negative sample in the latter experiment.

Subsampled Items
Matrix Factorization
Chance Rate
All Items
Matrix Factorization
Chance Rate

Experiments for Q2:

We can infer why KGRec outputs the result by examining user-relation scores and vector similarities between user embedding  mathbf{h}_u and attribute embeddings  mathbf{h}_i. The table below shows examples of recommendations generated by KGRec1. For example, User #1 bought many novels written by Leo Tolstoy and Fyodor Dostoevsky. Therefore, KGRec recommends novels written by them. User #2 bought many books about dressmaking for infants. Therefore, KGRec recommends similar books about dressmaking to this user. User #3 bought many textbooks for high school students. Therefore, KGRec recommends similar textbooks to this user. Furthermore, KGRec succeeds in providing reasons for the recommendation. For example, in the second example, KGRec recommends textbooks because these books target at high school students. It should be noted that these reasons are also provided by KGRec automatically. Note also that KGRec does not use the attribution of users (e.g., a user is a high school student) but implicitly learns them from buying history data.

Buying History of User #1 Recommended Items for User #1 Reason for the Recommendation
Anna Karenina War and Peace This book is written by Leo Tolstoy
Resurrection Snow Country This book is written by Yasunari Kawabata
Notes from Underground Crime and Punishment This book is written by Fyodor Dostoevsky
Demons The Double This book is written by Fyodor Dostoevsky
Walk in the Light While There Is Light 女であること This book is written by Yasunari Kawabata
One Day in the Life of Ivan Denisovich The Idiot This book is written by Fyodor Dostoevsky
The Raw Youth Poor Folk This book is written by Fyodor Dostoevsky
White Nights The Death of Ivan Ilyich/The Kreutzer Sonata This book is written by Leo Tolstoy
Ivan the Fool Call If You Need Me This book is translated by Haruki Murakami
The Overcoat/The Nose Will You Please Be Quiet, Please? This book is translated by Haruki Murakami
Buying History of User #2 Recommended Items for User #1 Reason for the Recommendation
おうちで給食ごはん 小さくてもきちんとした服 This book is about dressmaking for children
愛情いっぱい手作りの赤ちゃん服 かんたん!かわいい!はじめての赤ちゃん服と小物 This book is about dressmaking for children
ハンドメイドベビー服 きれいに縫うための基礎の基礎 This book is about dressmaking
いちばんよくわかる赤ちゃんと小さな子の服 女の子のまいにちの服 This book is about dressmaking for children
3歳からのおべんとう 女の子のシンプルでかわいい服 This book is about dressmaking for children
ぼく、アンパンマン! 簡単に作れて、とことん使える日常着 This book is about dressmaking for women
ちくちくはじめて赤ちゃんスタイ 初めての育児 This book is published by ひよこクラブ
ワンポイント刺しゅう図案365 女の子のおしゃれ服 This book is about dressmaking for children
1色刺繍と小さな雑貨 おしゃれが好きな女の子の服 This book is about dressmaking for children
あかちゃんのために作るもの 子供服ソーイングLESSON BOOK This book is about dressmaking for children
Buying History of User #3 Recommended Items for User #2 Reason for the Recommendation
漢文ヤマのヤマ 漢文道場 This book is published by Z会
得点奪取現代文 古文上達 This book is written by 仲光雄
パラグラフリーディングのストラテジー 1 大学入試全レベル問題集現代文 3 This book targets at high school student
大学入試全レベル問題集現代文 1 中堅私大古文演習 This book targets at high school student
大学入試全レベル問題集現代文 2 実力をつける日本史100題 This book is published by Z会出版編集部
実力をつける世界史100題 パラグラフリーディングのストラテジー 2 This book targets at high school student
はじめる日本史 要点&演習 古文 This book targets at high school student
生物〈生物基礎・生物〉基礎問題精講 センター試験現代社会集中講義 This book targets at high school student
センター古文満点のコツ 「源氏」でわかる古典常識 This book targets at high school student
金の漢字最強編 大学受験 大学入試全レベル問題集現代文 4 This book targets at high school student

Experiments for Q3:

KGRec learns to embed attributes such as authors, publishers, and categories as well as items. Therefore, KGRec can recommend attributes based on other attributes via the nearest neighbor search of  mathbf{h}_i. In this section, I demonstrate an author recommendation based on other authors. The table below shows examples of author recommendations generated by KGRec. For example, 東野圭吾 (Keigo Higashino) and 伊坂幸太郎 (Kotaro Isaka) are famous novelists, known for their mystery novels. 辻村深月 (Mizuki Tsujimura) and 湊かなえ (Kanae Minato) are also famous novelists known for their mystery novels. 池上彰 (Akira Ikegami) is a journalist and writes many books for business people about economics and politics. 斎藤孝 (Takashi Saito) and 茂木健一郎 (Ken Mogi) also write many books for business people. 荒木飛呂彦 (Hirohiko Araki) is a manga artist known for JoJo’s Bizarre Adventure. 諫山創 (Hajime Isayama) and 荒川 弘 (Hiromu Arakawa) are also manga artists known for Attack on Titan and Fullmetal Alchemist, respectively. 有元葉子 (Yoko Arimoto) writes many books about home cooking for homemakers. ワタナベマキ (Maki Watanabe) and 梶晶子 (Akiko Kaji) also write many books about home cooking. 柳沢小実 (Konomi Yanagisawa) and 石村由起子 (Yukiko Ishimura) write books about interiors for homemakers. These results demonstrate that KGRec can make not only for user-based item recommendation but also author-based author recommendation. This result can be naturally extended to other attributes such as user-based author recommendation and item-based item recommendation.

1st 辻村深月 有川浩 齋藤孝 諫山創 ワタナベマキ
2nd 湊かなえ 東野圭吾 茂木健一郎 荒川弘 柳沢小実
3rd 薬丸岳 朝井リョウ 林修 小畑健 石村由起子
4th 西加奈子 恩田陸 古市憲寿 石田スイ 石井佳苗
5th 三浦しをん 荻原浩 塚本亮 森薫 梶晶子
6th 雫井脩介 西加奈子 池谷裕二 長月達平 中川たま
7th 伊岡瞬 中村文則 野中香方子 堀越耕平 山脇りこ
8th 芦沢央 真梨幸子 テレビ東京報道局 冨樫義博 青山有紀
9th 池井戸潤 乾くるみ 岸見一郎 小太刀右京 三宅郁美
10th 村田沙耶香 島本理生 外山滋比古 押見修造 渡辺有子

Experiments for Q4:

In this section, I confirm the scalability of KGRec. Since the dimension of embeddings in KGRec is 32, it is expected that KGRec is highly scalable. I conduct item-item recommendation, where similarity is calculated by cosine similarity of  mathbf{h}_i. I conduct this experiment on an instance of Google Compute Engine with single vCPU without GPU. There are 262661 items in total, and I calculate scores of all items with exact computation one by one. In average, it takes only 0.09 second to calculate the full ranking for each item. Considering the computation is done with only a single CPU, and this is exact computation, KGRec is highly efficient. KGRec can be further accelerated by multiple CPUs or GPUs, or efficient search libraries such as faiss, FLANN, and annoy. I leave this direction for future work.


In this internship, I built KGRec, a catalog-based recommendation system using a knowledge graph. This alleviates the cold-start problem, is interpretable, and scalable. I demonstrated the effectiveness of KGRec through extensive experiments.

There are many existing knowledge graphs such as YAGO, DBpedia, Google’s knowledge graph, and Satori. Utilizing such off-the-shelf knowledge graphs for the recommendation in Mercari is an interesting future direction. Most importantly, deploying KGRec to the Mercari C2C marketplace app to measure the effectiveness of this method in the real-world environment is promising future work.


Wenqi Fan, Yao Ma, Qing Li, Yuan He, Eric Zhao, Jiliang Tang, Dawei Yin. Graph Neural Networks for Social Recommendation. WWW, 2019.

Diederik P. Kingma, Jimmy Ba. Adam: A Method for Stochastic Optimization. ICLR, 2015.

Yehuda Koren, Robert M. Bell, Chris Volinsky. Matrix Factorization Techniques for Recommender Systems. IEEE Computer, 2009.

Steffen Rendle, Christoph Freudenthaler, Zeno Gantner, Lars Schmidt-Thieme. BPR: Bayesian Personalized Ranking from Implicit Feedback. UAI, 2009.

Hongwei Wang, Fuzheng Zhang, Jialin Wang, Miao Zhao, Wenjie Li, Xing Xie, Minyi Guo. RippleNet: Propagating User Preferences on the Knowledge Graph for Recommender Systems. CIKM, 2018.

Hongwei Wang, Miao Zhao, Xing Xie, Wenjie Li, Minyi Guo. Knowledge Graph Convolutional Networks for Recommender Systems. WWW, 2019.

Hongwei Wang, Fuzheng Zhang, Mengdi Zhang, Jure Leskovec, Miao Zhao, Wenjie Li, Zhongyuan Wang. Knowledge-aware Graph Neural Networks with Label Smoothness Regularization for Recommender Systems. KDD, 2019.

Xiang Wang, Xiangnan He, Yixin Cao, Meng Liu, Tat-Seng Chua. KGAT: Knowledge Graph Attention Network for Recommendation. KDD, 2019.

Xiang Wang, Dingxian Wang, Canran Xu, Xiangnan He, Yixin Cao, Tat-Seng Chua. Explainable Reasoning over Knowledge Graphs for Recommendation. AAAI, 2019.

Xiao Yu, Xiang Ren, Yizhou Sun, Quanquan Gu, Bradley Sturt, Urvashi Khandelwal, Brandon Norick, Jiawei Han. Personalized entity recommendation: a heterogeneous information network approach. WSDM, 2014.

Fuzheng Zhang, Nicholas Jing Yuan, Defu Lian, Xing Xie, Wei-Ying Ma. Collaborative knowledge base embedding for recommender systems. KDD, 2016.

  1. In this experiment, I do not show buying histories of real users but virtual users to protect the privacy of Mercari users.