Reaction Graph (graph of graph) data

Explanation
Reaction graphs are described in a modified MOL format. The differences from the original format
are:
- The first line includes the KEGG REACTION ID and the EC number(s) of the reaction
- Atom identification numbers are replaced by KEGG COMPOUND IDs
- Bond identification numbers are replaced by KEGG REACTION PAIR types (1:group, 2:main, 3:leave, 4:cofactor, 5:transferase, 6:ligase)
For example, reaction R03555
Loganin(C01433) +
NADPH(C00005) +
H^+(C00080) +
Oxygen(C00007)
<=> Secologanin(C01852) +
NADP^+(C00006) +
H2O(C00001)
is represented in this way.
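As a rough illustration of the format described above, the following Python sketch reads one reaction graph into nodes and typed edges. The exact column layout of the data files is not given here, so the sketch makes several assumptions: the counts line is the fourth line (as in a standard MOL file), fields are whitespace-separated, and each node line contains one KEGG COMPOUND ID. The function name parse_reaction_graph and the SAMPLE text are hypothetical; the sample's edges and EC number are placeholders, not values copied from the data.

```python
import re

# Edge-type codes as listed in the format description.
RPAIR_TYPES = {1: "group", 2: "main", 3: "leave",
               4: "cofactor", 5: "transferase", 6: "ligase"}

# KEGG COMPOUND IDs look like "C" followed by five digits.
COMPOUND_ID = re.compile(r"\bC\d{5}\b")


def parse_reaction_graph(text):
    """Parse one reaction graph (layout assumed, see notes above)."""
    lines = text.splitlines()
    header = lines[0].split()
    reaction_id, ec_numbers = header[0], header[1:]
    # Assumption: line 4 is the counts line, and its first two
    # whitespace-separated fields are the node and edge counts.
    n_nodes, n_edges = (int(tok) for tok in lines[3].split()[:2])
    node_lines = lines[4:4 + n_nodes]
    edge_lines = lines[4 + n_nodes:4 + n_nodes + n_edges]
    nodes = [COMPOUND_ID.search(line).group() for line in node_lines]
    edges = []
    for line in edge_lines:
        # Assumption: an edge line starts with two 1-based node
        # indices followed by the RPAIR type code.
        src, dst, code = (int(tok) for tok in line.split()[:3])
        edges.append((nodes[src - 1], nodes[dst - 1], RPAIR_TYPES[code]))
    return {"reaction": reaction_id, "ec": ec_numbers,
            "nodes": nodes, "edges": edges}


# Toy input: the EC number and both edges are placeholders.
SAMPLE = """R03555 0.0.0.0
toy sample
illustrative only
  4  2
    0.0    0.0    0.0 C01433
    1.0    0.0    0.0 C01852
    0.0    1.0    0.0 C00005
    1.0    1.0    0.0 C00006
  1  2  2
  3  4  4
M  END
"""

graph = parse_reaction_graph(SAMPLE)
print(graph["reaction"], graph["ec"])
for src, dst, kind in graph["edges"]:
    print(src, dst, kind)
```

If the real files use fixed-width columns in which adjacent numbers can run together, the whitespace splitting above would need to be replaced by column slicing.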
Data Download
For Leave-one-out cross validation experiment
Leave-one-out cross-validation accuracy using RGK (q=0.1, 0.1) and a nearest-neighbor classifier
| EC class | EC subclass | EC subsubclass |
full-edge | 94.8% | 86.0% | 82.5% |
RPAIR | 92.3% | 81.4% | 78.1% |
main-pair | 77.8% | 69.8% | 66.2% |
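The evaluation protocol behind this table can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a precomputed kernel matrix K (here a toy 4x4 matrix, not RGK values), uses the standard kernel-induced squared distance K[i][i] + K[j][j] - 2*K[i][j] to find the nearest neighbor, and scores a prediction as correct when the EC numbers agree on the first one, two, or three fields. The function names and toy labels are hypothetical.

```python
def ec_prefix(ec_number, level):
    """Return the first `level` fields of an EC number (1=class, 2=subclass, 3=subsubclass)."""
    return ".".join(ec_number.split(".")[:level])


def loocv_nearest_neighbor(K, labels):
    """Leave-one-out accuracy of a 1-nearest-neighbor rule on a kernel matrix."""
    n = len(labels)
    correct = 0
    for i in range(n):
        best_j, best_d = None, float("inf")
        for j in range(n):
            if j == i:
                continue  # the held-out item cannot be its own neighbor
            # Squared distance in the feature space induced by the kernel.
            d = K[i][i] + K[j][j] - 2 * K[i][j]
            if d < best_d:
                best_j, best_d = j, d
        correct += labels[best_j] == labels[i]
    return correct / n


# Toy kernel matrix and labels, for illustration only.
K = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
ec_numbers = ["1.1.1.1", "1.1.1.2", "2.4.1.1", "2.4.2.1"]

for level, name in [(1, "EC class"), (2, "EC subclass"), (3, "EC subsubclass")]:
    labels = [ec_prefix(ec, level) for ec in ec_numbers]
    print(name, loocv_nearest_neighbor(K, labels))
```

On the toy data the accuracy drops at the subsubclass level, because matching more fields of the EC number is a stricter criterion.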
For blind test
Number of correct predictions and accuracy in 36 test sets
| Coverage | EC main | EC subclass | EC subsubclass |
RGK(top1) | 100% | 22(61.1%) | 14(38.9%) | 12(33.3%) |
RGK(top3) | 100% | 56(51.9%) | 30(27.8%) | 24(22.2%) |
RGK(top5) | 100% | 86(47.8%) | 37(20.6%) | 27(15.0%) |
E-zyme(top1) | 61.1% | 14(63.6%) | 10(45.5%) | 8(36.4%) |
E-zyme(top3) | 61.1% | 42(63.6%) | 24(36.4%) | 18(27.3%) |
E-zyme(top5) | 61.1% | 57(51.8%) | 30(27.3%) | 24(21.8%) |
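The percentages in this table are consistent with the following reading, which is inferred from the numbers rather than stated on this page: coverage is the fraction of the 36 test reactions for which a method returns predictions (36 for RGK, 22 for E-zyme), and each top-k accuracy is the number of correct predictions divided by k times the number of covered reactions. The short Python check below recomputes a few cells under that assumption; the function name is hypothetical.

```python
def top_k_accuracy(correct, covered, k):
    """Percentage of correct predictions among k predictions per covered reaction."""
    return 100.0 * correct / (covered * k)


TOTAL = 36          # number of test reactions
RGK_COVERED = 36    # 100% coverage
EZYME_COVERED = 22  # assumed from 61.1% coverage of 36

# (method, k, covered reactions, correct predictions at EC main level)
rows = [
    ("RGK", 1, RGK_COVERED, 22),
    ("RGK", 3, RGK_COVERED, 56),
    ("RGK", 5, RGK_COVERED, 86),
    ("E-zyme", 1, EZYME_COVERED, 14),
    ("E-zyme", 3, EZYME_COVERED, 42),
    ("E-zyme", 5, EZYME_COVERED, 57),
]

for method, k, covered, correct in rows:
    coverage = round(100.0 * covered / TOTAL, 1)
    accuracy = round(top_k_accuracy(correct, covered, k), 1)
    print(f"{method}(top{k}) coverage={coverage}% EC main={correct}({accuracy}%)")
```

Under this reading, the E-zyme accuracies are computed only over the reactions it covers, so they are not directly comparable to the RGK figures, which are computed over all 36.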
Reference
[Saigo2010], Hiroto Saigo, Masahiro Hattori, Hisashi Kashima and Koji Tsuda
Reaction Graph Kernels Predict EC Numbers of Unknown Enzymatic Reactions in Plant Secondary Metabolism
BMC Bioinformatics 11 (Suppl 1), 1-7, (01 2010), also appeared in Asia Pacific Bioinformatics Conference (APBC2010)
Contact: saigo@i.kyushu-u.ac.jp