Multi-task Data Mining Based on Dynamic Representation Bias
Grant-in-Aid for Scientific Research (B) (General), Intelligent Informatics, Project No. 21300053
2009/4/1 - 2013/3/31
Research organization
Principal investigator, Professor
Research summary:
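The goal of this research is to develop a new data mining method that automatically changes the representation of data and patterns in order to deal effectively with multiple, mutually related pattern discovery tasks, to implement it as a computer system, and to demonstrate its effectiveness on artificial and real data. This fiscal year, we mainly did two things. First, we examined the effectiveness of the multi-task data mining method for rule-set discovery and classification learning based on the extended MDL principle, which we developed last fiscal year, and improved and extended it. Second, we developed a multi-task data mining method for clustering that uses an information distance based on Kolmogorov complexity, and examined its effectiveness. For the former, we first improved the method for disjunctive normal form concepts based on the extended MDL criterion, developed last fiscal year, by rigorously re-examining details such as the code-length computation. We then applied the improved method to many artificial data sets and to standard machine learning benchmark data sets, and showed its effectiveness in terms of accuracy, noise tolerance, and related measures. For the latter, targeting sequence data such as strings, we first devised a new information distance that, within an approximate information distance computed with an LZW compressor, iteratively finds and uses sets of related instances in the task itself and in the other tasks. We then developed a clustering method that uses this information distance, and confirmed its effectiveness by applying it to artificial data, monolingual and multilingual text data, web data, and other data. In addition, we developed a dimensionality reduction method for multi-task data mining that treats the distribution of examples as a manifold and uses geometric properties such as centroids together with constraints on the class membership of example pairs, and confirmed its effectiveness on artificial data and text data.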
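As an illustration of the kind of compression-based approximate information distance mentioned in the summary, the following is a minimal sketch of the standard normalized compression distance (NCD) computed with a textbook LZW encoder. It is not the multi-task distance developed in this project: the iterative use of related instances from the own task and other tasks is not reproduced here. The function names `lzw_code_count` and `ncd` are hypothetical, and the use of the number of emitted LZW codes as the compressed size is an assumption made for simplicity.

```python
def lzw_code_count(s: str) -> int:
    """Return the number of codes a textbook LZW encoder emits for s.

    Assumption: the code count stands in for the compressed size C(s).
    """
    data = s.encode("utf-8")
    if not data:
        return 0
    # Start with one dictionary entry per possible byte value.
    dictionary = {bytes([i]): i for i in range(256)}
    current = b""
    count = 0
    for value in data:
        candidate = current + bytes([value])
        if candidate in dictionary:
            # Keep extending the current phrase while it is known.
            current = candidate
        else:
            # Emit the code for the current phrase and learn the new one.
            count += 1
            dictionary[candidate] = len(dictionary)
            current = bytes([value])
    # Emit the code for the final phrase.
    return count + 1


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between x and y.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    a computable approximation of the normalized information distance.
    """
    cx, cy = lzw_code_count(x), lzw_code_count(y)
    cxy = lzw_code_count(x + y)
    largest = max(cx, cy)
    if largest == 0:
        # Both strings are empty; treat them as identical.
        return 0.0
    return (cxy - min(cx, cy)) / largest


if __name__ == "__main__":
    a = "ab" * 100
    b = "cd" * 100
    print("ncd(a, a) =", ncd(a, a))
    print("ncd(a, b) =", ncd(a, b))
```

In this sketch, a string compared with itself yields a smaller distance than two strings over disjoint alphabets, because the dictionary built while encoding the first string helps encode the second only when the two share substrings. A clustering method would then use such pairwise distances in place of a feature-based metric.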
Main results:
1. Einoshin Suzuki: "Compression-based Measures for Mining Interesting
Rules", Next-Generation Applied Intelligence (IEA/AIE), LNAI
5579, pp. 741-746, Springer-Verlag, Tainan, Taiwan, 2009 (invited talk at a special session).
2. Daisuke Ikeda, Einoshin Suzuki: "Mining Peculiar Compositions of
Frequent Substrings from Sparse Text Data using Background Texts",
Machine Learning and Knowledge Discovery in Databases (ECML/PKDD),
Vol. 1, LNAI 5781, Springer-Verlag, pp. 596-611, September 2009, Bled, Slovenia.
3. JianBin Wang, Bin-Hui Chou, Einoshin Suzuki: "Finding the k-Most Abnormal Subgraphs from a Single Graph",
Discovery Science (DS), LNAI 5808, Springer-Verlag, pp. 441-448,
October 2009, Porto, Portugal.
4. Bin Tong and Einoshin Suzuki: "Subclass-oriented Dimension
Reduction with Constraint Transformation and Manifold
Regularization", Advances in Knowledge Discovery and Data Mining
(PAKDD), Part II, LNAI 6119, Springer-Verlag,
pp. 1-13, June 2010, Hyderabad, India.
5. Bin Tong, Shao Hao, Bin-Hui Chou, and Einoshin Suzuki:
"Semi-Supervised Projection Clustering with Transferred Centroid
Regularization", Machine Learning and Knowledge Discovery in
Databases (ECML/PKDD), Part III, LNCS 6323, Springer-Verlag,
pp. 306-321, September 2010, Barcelona.
6. Bin Tong, ZhiGuang Qin, and Einoshin Suzuki:
"Topology Preserving SOM with Transductive Confidence Machine", Discovery Science (DS), LNAI 6332, Springer-Verlag,
pp. 27-41, October 2010, Canberra.
7. Einoshin Suzuki: "Discovering a Partial Decision List for
Understanding the Controller of a Reactive Robot", Mining patterns
and subgroups (MPS). Lorentz Center Workshop, University of Leiden,
November 2010, Leiden, Netherlands.
8. Bin-Hui Chou and Einoshin Suzuki: "Role Discovery for Graph
Clustering", Web Technologies and Applications (APWeb 2011), pp. 17-28, LNCS 6612,
Springer-Verlag, Beijing, April 2011.
9. Bin Tong, Junbin Gao, Nguyen Huy Thach, and Einoshin Suzuki: "Gaussian Process for Dimensionality
Reduction in Transfer Learning", Proc. Eleventh SIAM International
Conference on Data Mining (SDM 2011),
pp. 783-270, Phoenix/Mesa, Arizona, April 2011.
10. Shao Hao and Einoshin Suzuki: "Feature-based Inductive Transfer Learning through Minimum Encoding", Proc. Eleventh SIAM International
Conference on Data Mining (SDM 2011),
pp. 259-270, Phoenix/Mesa, Arizona, April 2011.
11. Nguyen Huy Thach, Shao Hao, Bin Tong, and Einoshin Suzuki: "A
Compression-based Dissimilarity Measure for Multi-task Clustering",
Foundations of Intelligent Systems, LNAI 6804 (ISMIS 2011), pp. 123-132, Springer, Warsaw, June 2011.
12. Shao Hao, Bin Tong, and Einoshin Suzuki:
"Compact Coding for Hyperplane Classifiers in Heterogeneous Environment", Machine Learning and Knowledge Discovery in
Databases (ECML/PKDD), Part III, LNCS 6913, Springer-Verlag,
pp. 207-222, September 2011, Athens.
13. Hiroshi Hirai, Bin-Hui Chou, and Einoshin Suzuki:
"A Parameter-Free Method for Discovering Generalized Clusters in a Network",
Discovery Science (DS 2011), LNAI 6926, Springer-Verlag, pp. 135-149, October 2011,
Espoo - Helsinki.
14. Shin Ando and Einoshin Suzuki: "Role-Behavior Analysis from Trajectory
Data by Cross-Domain Learning", Proc. Eleventh IEEE
International Conference on Data Mining (ICDM 2011), pp. 21-30,
December 2011, Vancouver.
15. Bin Tong, Weifeng Jia, Yanli Ji, and Einoshin Suzuki:
"Linear Semi-Supervised Dimensionality Reduction with Pairwise
Constraint for Multiple Subclasses", IEICE Transactions on Information and Systems,
Vol. E95-D, No. 3, pp. 812-820, March 2012.
16. Bin Tong, Hao Shao, Bin-Hui Chou, and Einoshin Suzuki:
"Linear Semi-Supervised Projection Clustering by Transferred Centroid Regularization", Journal of Intelligent Information Systems,
Vol. 39, No. 2, pp. 461-490, Springer, October 2012.
17. Bin-Hui Chou and Einoshin Suzuki:
"RoClust: Role Discovery for Graph Clustering", Web Intelligence
and Agent Systems, An International Journal, Vol. ?, No. ?,
pp. ?-?, IOS Press (accepted for publication).
18. Hao Shao, Bin Tong, and Einoshin Suzuki:
"Extended MDL Principle for Feature-Based Inductive Transfer
Learning", Knowledge and Information Systems, An International Journal, Vol. ?, No. ?,
pp. ?-?, Springer (accepted for publication).
19. Thach Nguyen Huy, Hao Shao, Bin Tong, and Einoshin Suzuki:
"A Feature-Free and Parameter-Light Multi-Task Clustering Framework", Knowledge and Information Systems, An International Journal, Vol. ?, No. ?,
pp. ?-?, Springer (accepted for publication).
20. Bin Tong, Junbin Gao, Thach Nguyen Huy, Hao Shao, and Einoshin Suzuki:
"Transfer Dimensionality Reduction by Gaussian Process in Parallel", Knowledge and Information Systems, An International Journal, Vol. ?, No. ?,
pp. ?-?, Springer (accepted for publication).
21. Thach Nguyen Huy, Bin Tong, Hao Shao, and Einoshin Suzuki:
"Transfer Learning by Centroid Pivoted Mapping in Noisy Environment", Journal of Intelligent Information Systems,
Vol. ?, No. ?, pp. ?-?, Springer (accepted for publication).
22. Bin-Hui Chou and Einoshin Suzuki:
"Detecting Academic Plagiarism with Graphs",
Extraction et Gestion des Connaissances (EGC'2013),
pp. 293-304, Toulouse, France, January 2013.
23. Shin Ando and Einoshin Suzuki:
"Time-sensitive Classification of Behavioral Data", Proc. Thirteenth SIAM International
Conference on Data Mining (SDM 2013),
pp. ?-?, Austin, Texas, April 2013 (accepted for publication).