TY - GEN
T1 - Using Secondary Knowledge to Support Decision Tree Classification of Retrospective Clinical Data
AU - O'Sullivan, Dympna
AU - Elazmeh, William
AU - Wilk, Szymon
AU - Farion, Ken
AU - Matwin, Stan
AU - Michalowski, Wojtek
AU - Sehatkar, Morvarid
N1 - Funding Information: The support of the Natural Sciences and Engineering Research Council of Canada, the Canadian Institutes of Health Research and the Ontario Centres of Excellence is gratefully acknowledged.
PY - 2008
Y1 - 2008
N2 - Retrospective clinical data presents many challenges for data mining and machine learning. The transcription of patient records from paper charts and subsequent manipulation of data often result in high volumes of noise as well as a loss of other important information. In addition, such datasets often fail to represent expert medical knowledge and reasoning in any explicit manner. In this research, we describe applying data mining methods to retrospective clinical data to build a prediction model for asthma exacerbation severity for pediatric patients in the emergency department. Difficulties in building such a model forced us to investigate alternative strategies for analyzing and processing retrospective data. This paper describes this process together with an approach to mining retrospective clinical data by incorporating formalized external expert knowledge (secondary knowledge sources) into the classification task. This knowledge is used to partition the data into a number of coherent sets, where each set is explicitly described in terms of the secondary knowledge source. Instances from each set are then classified in a manner appropriate for the characteristics of the particular set. We present our methodology and outline a set of experimental results that demonstrate some advantages and some limitations of our approach.
AB - Retrospective clinical data presents many challenges for data mining and machine learning. The transcription of patient records from paper charts and subsequent manipulation of data often result in high volumes of noise as well as a loss of other important information. In addition, such datasets often fail to represent expert medical knowledge and reasoning in any explicit manner. In this research, we describe applying data mining methods to retrospective clinical data to build a prediction model for asthma exacerbation severity for pediatric patients in the emergency department. Difficulties in building such a model forced us to investigate alternative strategies for analyzing and processing retrospective data. This paper describes this process together with an approach to mining retrospective clinical data by incorporating formalized external expert knowledge (secondary knowledge sources) into the classification task. This knowledge is used to partition the data into a number of coherent sets, where each set is explicitly described in terms of the secondary knowledge source. Instances from each set are then classified in a manner appropriate for the characteristics of the particular set. We present our methodology and outline a set of experimental results that demonstrate some advantages and some limitations of our approach.
UR - http://www.scopus.com/inward/record.url?scp=44649141930&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-68416-9_19
DO - 10.1007/978-3-540-68416-9_19
M3 - Conference contribution
AN - SCOPUS:44649141930
SN - 3540684158
SN - 9783540684152
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 238
EP - 251
BT - Mining Complex Data - ECML/PKDD 2007 Third International Workshop, MCD 2007, Revised Selected Papers
T2 - 3rd International Workshop on Mining Complex Data, MCD 2007
Y2 - 17 September 2007 through 21 September 2007
ER -