Data Mining in Reaction Databases: Towards Predicting Biodegradation Products and Pathways
by
Stefan Kramer
Institute for Informatics
Technical University of Munich, Germany
Friday, November 14, 2008
12:00 lunch
12:15 seminar
402 Walter Library
http://wwwkramer.in.tum.de/kramer/stefan.html
I will present data mining methods for databases of chemical reactions. The overall goal is to predict reaction products and pathways for compounds for which no experimental data are available. Our approach builds on the University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD, http://umbbd.msi.umn.edu/), a database containing information on microbial biocatalytic reactions and biodegradation pathways for about 1,000 chemical compounds. Recently, the UM-BBD has been extended with a knowledge-based system for predicting plausible microbial catabolic reactions and pathways, the University of Minnesota Pathway Prediction System (UM-PPS, http://umbbd.msi.umn.edu/predict/). I will explain how data mining and machine learning can be used to address the main problem of such systems, namely the over-prediction of transformation products. The first approach extracts so-called relative reasoning rules, which resolve conflicts among transformation rules. The second learns one classifier for each UM-PPS rule and assigns probabilities to the suggested transformation products. The prediction of reaction products and pathways is a challenging and rewarding area for data mining research.
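As a rough illustration of how the two approaches could reduce over-prediction, the sketch below filters a set of triggered transformation rules. It is not the UM-PPS implementation: the rule identifiers, the priority pairs, the probabilities, and the 0.5 threshold are all hypothetical, and the abstract does not state how the learned rules or classifier outputs are actually applied.

```python
# Minimal sketch under stated assumptions; all rule names and numbers are made up.

def resolve_conflicts(triggered, priorities):
    """Apply relative reasoning rules, assumed here to be (high, low) pairs.

    If both rules of a pair are triggered for a compound, the lower-priority
    rule is suppressed. Returns the surviving rules in sorted order.
    """
    triggered = set(triggered)
    suppressed = {
        low for high, low in priorities
        if high in triggered and low in triggered
    }
    return sorted(triggered - suppressed)


def filter_by_probability(probabilities, threshold=0.5):
    """Keep rules whose per-rule classifier probability reaches the threshold.

    `probabilities` maps a rule identifier to the probability that its
    suggested product is correct (assumed to come from one classifier per rule).
    """
    return sorted(rule for rule, p in probabilities.items() if p >= threshold)


# Hypothetical example: r1 takes priority over r2, and r3 over r4.
priorities = [("r1", "r2"), ("r3", "r4")]
print(resolve_conflicts(["r1", "r2", "r4"], priorities))

# Hypothetical classifier outputs for three triggered rules.
print(filter_by_probability({"r1": 0.9, "r2": 0.2, "r4": 0.5}))
```

In this toy setting the first function drops r2 because r1 also applies, while r4 survives because r3 was not triggered. The second function keeps only rules whose estimated probability is at least the chosen threshold.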