3. Discovering Characteristic Patterns from Collection of Classical Japanese Poems

   Masayuki Takeda, Mayumi Yamasaki, Tomoko Fukuda and Ichiro Nanri
   Proc. First International Conference on Discovery Science (DS'98), pp. 129-140, December 1998

WAKA is a form of traditional Japanese poetry with a 1,300-year history. In this paper, we attempt to discover characteristics common to a collection of WAKA poems. As a formalism for these characteristics, we use regular patterns whose constant parts are limited to sequences of auxiliary verbs and postpositional particles. We call such patterns FUSHI. The problem is to automatically find significant FUSHI patterns that characterize the poems.
Solving this problem requires a reliable significance measure for the patterns. Brazma et al. (1996) proposed such a measure based on the minimum description length (MDL) principle. Using this method, we report successful results in finding patterns from five anthologies. Some of the results are quite stimulating, and we hope that they will lead to new discoveries. Based on our experience, we also propose a pattern-based text data mining system. Further research into WAKA poetry is now proceeding using this system.