Automated information extraction from free-text EEG reports

Biswal, S., Nip, Z., Moura Junior, V., Bianchi, M. T., Rosenthal, E. S. & Westover, M. B. (2015). Automated information extraction from free-text EEG reports. 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2015, doi: 10.1109/EMBC.2015.7319956


Abstract

In this study we developed a supervised learning method that detects, with high accuracy, EEG reports describing seizures and epileptiform discharges. We manually labeled 3,277 documents as describing one or more seizures vs no seizures, and as describing epileptiform discharges vs no epileptiform discharges. We then used Naïve Bayes to develop a system able to automatically classify EEG reports into these categories. Our system consisted of normalization techniques, extraction of key sentences, and automated feature selection using cross-validation. As candidate features we used keywords and special word patterns called elastic word sequences (EWS). Final feature selection was accomplished via sequential backward selection. We used cross-validation to predict out-of-sample performance. Our automated feature selection procedure resulted in a classifier with 38 features for seizure detection and 23 features for epileptiform discharge detection. The average [95% CI] area under the receiver operating characteristic curve was 99.05 [98.79, 99.32]% for detecting reports with seizures, and 96.15 [92.31, 100.00]% for detecting reports with epileptiform discharges. The methodology described herein greatly reduces the manual labor involved in identifying large cohorts of patients for retrospective neurophysiological studies of patients with epilepsy.
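
The pipeline outlined in the abstract (word-pattern features fed to a Naïve Bayes classifier) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not define elastic word sequences, so the sketch assumes an EWS is an ordered list of words that must appear in sequence with at most `max_gap` other tokens between consecutive words. The function names, the `PATTERNS` list, and the four toy reports are all invented for illustration. Sentence extraction, sequential backward selection, and cross-validation are omitted.

```python
import math
import re

def _match_from(tokens, pos, words, max_gap):
    # Try to match the remaining words, each within max_gap tokens of the last match.
    if not words:
        return True
    window = tokens[pos:pos + max_gap + 1]
    for offset, tok in enumerate(window):
        if tok == words[0] and _match_from(tokens, pos + offset + 1, words[1:], max_gap):
            return True
    return False

def ews_match(text, words, max_gap=2):
    # Assumed EWS definition: words in order, at most max_gap tokens between neighbors.
    tokens = re.findall(r"[a-z]+", text.lower())
    return any(
        tokens[i] == words[0] and _match_from(tokens, i + 1, words[1:], max_gap)
        for i in range(len(tokens))
    )

def featurize(text, patterns):
    # One binary feature per pattern: 1 if the pattern occurs in the report.
    return [int(ews_match(text, p)) for p in patterns]

def train_nb(X, y):
    # Bernoulli Naive Bayes with Laplace (add-one) smoothing.
    model = {}
    for c in set(y):
        rows = [x for x, label in zip(X, y) if label == c]
        prior = math.log(len(rows) / len(y))
        probs = [(sum(col) + 1) / (len(rows) + 2) for col in zip(*rows)]
        model[c] = (prior, probs)
    return model

def predict(model, x):
    # Return the class with the highest log posterior (up to a constant).
    def score(c):
        prior, probs = model[c]
        return prior + sum(math.log(p if xi else 1 - p) for xi, p in zip(x, probs))
    return max(model, key=score)

# Hypothetical patterns and toy reports (1 = describes seizures, 0 = does not).
PATTERNS = [("no", "seizures"), ("seizure", "captured"), ("seizures", "recorded")]
REPORTS = [
    ("Two electrographic seizures were recorded.", 1),
    ("No seizures were recorded during the study.", 0),
    ("A seizure was captured arising from the left temporal region.", 1),
    ("There were no electrographic seizures.", 0),
]

X = [featurize(text, PATTERNS) for text, _ in REPORTS]
y = [label for _, label in REPORTS]
model = train_nb(X, y)

print(predict(model, featurize("No definite seizures were seen.", PATTERNS)))
print(predict(model, featurize("Seizures were recorded overnight.", PATTERNS)))
```

On this toy data the first report is classified as 0 and the second as 1. The gap tolerance is what lets the single pattern `("no", "seizures")` cover both "no seizures" and "no electrographic seizures" without enumerating every variant.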

Item Type: Article
Additional Information: © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Uncontrolled Keywords: Bayes Theorem; Diagnosis, Computer-Assisted; Electroencephalography; Epilepsy; Humans; Machine Learning; ROC Curve; Retrospective Studies
Divisions: Cass Business School > Faculty of Management
URI: http://openaccess.city.ac.uk/id/eprint/18355
