The purpose of this paper is to analyze the differences in facial expressions between male and female figures in picture scrolls created by multiple painters in medieval Japan (13th to 16th centuries), and to clarify the circumstances of their production.

Fig. 1: Gender with different codes (left) and same codes (right).

In the Japanese picture scrolls of that time, male and female figures are distinguished by combinations of codes. For example, among warriors (武士) and nobles (貴族), gender is indicated by codes such as clothing and hairstyle. However, monks (僧侶) and nuns (尼僧) are ambiguous because they basically share the same codes (Fig. 1). Hence monks and nuns can only be distinguished by their facial expressions.

Analysis of facial expressions is also important for reconstructing the original works, because many manuscripts were copied in medieval Japan. Research questions, such as the individuality of the characters in the original works, can be answered by examining facial expressions in the copied manuscripts.

Multiple facial expressions have traditionally been analyzed as follows. First, labeled photos that capture all or part of the artworks are placed on a large table. Second, the photos are rearranged for comparison and grouping. We call this the “GM method”. We show that digital technology, such as the IIIF Curation Platform (ICP) (Kitamoto et al., 2018) and machine learning with the KaoKore Dataset (Tian et al., 2020), greatly enhances the usefulness and potential of the GM method.

The name GM method comes from the Italian art historian Giovanni Morelli (1816-1891), the founder of the comparative study of style in art history, which pays special attention to the details of artworks. GM also stands for Gazing Microcontents, which means taking advantage of fragmented content (microcontent) for research (Suzuki and Kitamoto, 2019).


The target of this paper is one copy of Yugyo Shounin Engi-Emaki, “Shojo-Kouji Kouhon” 遊行上人縁起絵巻 清浄光寺 甲本 (Kouhon), held at Shojo-Kouji Temple. The original version of Yugyo Shounin Engi-Emaki, now lost, was a 10-volume picture scroll depicting the establishment of the Jishu 時宗 sect of Buddhism in the Kamakura period (12th to 14th centuries). Yugyo Shounin Engi-Emaki was copied many times between the early 14th and 17th centuries. According to previous research, multiple painters shared the work on Kouhon, and modern and classical styles are mixed in it (Takagishi, 2020).

We have a particular interest in Yugyo Shounin Engi-Emaki in the context of the Jishu sect. In the Middle Ages, Jishu was criticized for mixing different genders, namely monks and nuns, in the same group. Therefore, in every copy of Yugyo Shounin Engi-Emaki, monks and nuns are drawn in the same scene but arranged in two separate groups, with clear signs such as Juuni-Kou-Bako 十二光筥, to assert that Jishu is a disciplined sect (Fig. 2). These signs allow us to distinguish monks from nuns even though they share the same codes.

Fig. 2: Two groups with Juuni-Kou-Bako (three-colored lid box in the center).

Kouhon has 28 scenes in which monks and nuns are drawn separately. According to previous research, these scenes were drawn by two painters, A and B, whose names are unknown (Iwase, 1989). By focusing on the differences between painters A and B in the facial expressions of monks and nuns, these scenes allow us to analyze how Kouhon was copied from the lost original work.


The GM method with ICP consists of two steps. First, we used ICP to create curations for all facial expressions in Kouhon. This step involves two tasks: 1) drawing a rectangle around each facial expression, and 2) adding basic metadata (gender, status, direction, source) to each facial expression. Both tasks should be performed by human experts, but the first could be done faster using a machine learning-based face detector. The effect of machine learning on reducing annotation time is discussed in another paper (Mermet et al., 2020).
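As a rough illustration of the two annotation tasks, the record below pairs a face rectangle with the four metadata fields (gender, status, direction, source). This is only a sketch under stated assumptions: the class name `FaceAnnotation`, the field values, and the example.org identifiers are hypothetical and do not reflect ICP's actual data model. The `#xywh=` fragment and the `{region}/{size}/{rotation}/{quality}.{format}` URL pattern follow the general conventions of media fragments and the IIIF Image API 3.0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceAnnotation:
    """One annotated face: a rectangle plus basic metadata (hypothetical model)."""
    canvas_id: str   # identifier of the scroll image (hypothetical)
    x: int           # left edge of the rectangle, in pixels
    y: int           # top edge of the rectangle, in pixels
    w: int           # rectangle width, in pixels
    h: int           # rectangle height, in pixels
    gender: str
    status: str
    direction: str
    source: str

    def fragment(self) -> str:
        """Return the canvas identifier with an xywh media fragment."""
        return f"{self.canvas_id}#xywh={self.x},{self.y},{self.w},{self.h}"

    def image_url(self, image_base: str) -> str:
        """Build an IIIF Image API 3.0 style URL that crops the face region."""
        region = f"{self.x},{self.y},{self.w},{self.h}"
        return f"{image_base}/{region}/max/0/default.jpg"


# Placeholder values for illustration only, not taken from Kouhon.
face = FaceAnnotation(
    canvas_id="https://example.org/canvas/1",
    x=120, y=340, w=64, h=80,
    gender="female", status="nun", direction="left", source="vol. 1",
)
print(face.fragment())
print(face.image_url("https://example.org/iiif/scroll1"))
```

Keeping the rectangle and the metadata in a single immutable record means each face can later be treated as one self-contained unit when it is compared and grouped.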

Second, we analyzed the 579 facial expressions that appear in the 28 scenes. For this step, we used the IIIF Curation Board (ICBoard). ICBoard shows each element of a curation as a card placed on a flat mat, where it can be moved and arranged. ICBoard has many functions that support the GM method, such as grouping mats and color-coded stickers.
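The idea of sorting cards onto mats can be pictured with a small grouping routine. The sketch below is not ICBoard's implementation. The function `group_cards`, the field names `painter` and `status`, and the sample cards are all hypothetical, chosen only to show how cards might be grouped by any combination of metadata.

```python
from collections import defaultdict


def group_cards(cards, keys):
    """Group card ids by the values of the given metadata keys.

    Each distinct combination of values plays the role of one "mat".
    """
    mats = defaultdict(list)
    for card in cards:
        label = tuple(card[k] for k in keys)
        mats[label].append(card["id"])
    return dict(mats)


# Illustrative placeholder cards, not data from Kouhon.
cards = [
    {"id": "face-001", "painter": "A", "status": "monk"},
    {"id": "face-002", "painter": "A", "status": "nun"},
    {"id": "face-003", "painter": "B", "status": "monk"},
    {"id": "face-004", "painter": "B", "status": "monk"},
]

mats = group_cards(cards, ("painter", "status"))
for label, ids in sorted(mats.items()):
    print(label, len(ids), ids)
```

Changing the `keys` argument regroups the same cards along a different axis, for example by painter alone, which loosely mirrors rearranging photos on a table for a new comparison.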

Rearranging the 579 facial expressions made the differences in drawing style between the two painters clear. Painter A draws a distinctive facial expression in deep colors. Both genders are drawn similarly, the outlines of both monks and nuns have smooth curves, and the variation within each gender is small. Painter B draws in light colors, and the differences between genders are apparent. The outlines of the monks are rugged, while those of the nuns are smooth. The variation within each gender is large, for example in the hair shaving marks (Fig. 3).

Fig. 3: Facial expressions by painter A (above) and painter B (below). Marked: Red = Painter A, Blue = Painter B, Yellow = Monk

This result suggests that the two painters had different attitudes toward copying the lost original scroll. Painter A copied the original composition but added his own style. Painter B, on the other hand, drew different facial expressions for monks and nuns because he tried to copy the original work faithfully. We argue that monks and nuns might have been individually identifiable in the original picture scroll, as illustrated by painter B.


Using the GM method with digital technology, we were able to analyze a large number of facial expressions smoothly. Analysis at this scale, on the order of 500 photos, has been hard with physical photos, but the digital method using ICP allows us to embark on a comprehensive analysis of facial expressions. In addition, the GM method can be made more powerful with machine learning, such as object detection and classification algorithms.

Our results support the previous finding that Kouhon was created by multiple painters. Furthermore, they raise a new research question. Similar facial expressions appear in different scenes by painter B. If painter B faithfully copied the characters in the original, these similar characters may have been intentionally drawn in the lost original picture scroll as important persons. This question is important both for art history and for the Jishu sect.


We thank the Shojo-Kouji Temple and Yugyoji Museum for allowing us to use images for this research.


Kitamoto, A., Homma, J. and Saier, T. (2018). “IIIF Curation Platform: Next Generation IIIF Open Platform Supporting User-Driven Image Sharing.” Jinmoncom Symposium 2018: 327-334.

Tian, Y., Suzuki, C., Clanuwat, T., Bober-Irizar, M., Lamb, A., and Kitamoto, A. (2020). “KaoKore: A Pre-modern Japanese Art Facial Expression Dataset.” arXiv: 2002.08595.

Suzuki, C. and Kitamoto, A. (2019). “Using Pre-Modern Japanese Book 江戸買物独案内 as Microcontents for Reconstruction the City of Edo.” Jinmoncom Symposium 2019: 11-18.

Takagishi, A. (2020). Reflections on medieval Yamato-e. Tokyo: Yoshikawa-koubunkan.

Iwase, H. (1989). “Yugyo-Shonin-Engi-Emaki; Seijoko-Ji-Bon ni tsuite.” Ars buddhica (185): 51-59.

Mermet, A., Kitamoto, A., Suzuki, C. and Takagishi, A. (2020). “Automated Face Detection for Pre-modern Japanese Artworks using Deep Neural Networks.” JADH2020 (in press).

IIIF Curation Board