City Research Online

Automated information extraction from free-text EEG reports

Biswal, S., Nip, Z., Moura Junior, V., Bianchi, M. T., Rosenthal, E. S. & Westover, M. B. (2015). Automated information extraction from free-text EEG reports. 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2015, 2015-N, pp. 6804-6807. doi: 10.1109/embc.2015.7319956


In this study we developed a supervised learning method that automatically detects, with high accuracy, EEG reports that describe seizures and epileptiform discharges. We manually labeled 3,277 documents as describing one or more seizures vs. no seizures, and as describing epileptiform discharges vs. no epileptiform discharges. We then used Naïve Bayes to develop a system able to automatically classify EEG reports into these categories. Our system consisted of normalization techniques, extraction of key sentences, and automated feature selection using cross-validation. As candidate features we used key words and special word patterns called elastic word sequences (EWS). Final feature selection was accomplished via sequential backward selection. We used cross-validation to estimate out-of-sample performance. Our automated feature selection procedure resulted in a classifier with 38 features for seizure detection and 23 features for epileptiform discharge detection. The average [95% CI] area under the receiver operating characteristic curve was 99.05 [98.79, 99.32]% for detecting reports with seizures, and 96.15 [92.31, 100.00]% for detecting reports with epileptiform discharges. The methodology described herein greatly reduces the manual labor involved in identifying large cohorts of patients for retrospective neurophysiological studies of patients with epilepsy.
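As a rough illustration of the kind of classifier the abstract describes, the following is a minimal sketch of a Bernoulli Naïve Bayes model over binary keyword-presence features, scored by area under the ROC curve. It is not the authors' implementation: the keyword list, the example reports, and all function and class names are invented for illustration, and the normalization, key-sentence extraction, elastic word sequences, sequential backward selection, and cross-validation steps are not reproduced here.

```python
import math

def keyword_features(text, keywords):
    """Binary presence vector: 1 if the keyword occurs in the lowercased text."""
    lowered = text.lower()
    return [1 if kw in lowered else 0 for kw in keywords]

class BernoulliNaiveBayes:
    """Two-class Naive Bayes over binary features, with Laplace smoothing."""

    def fit(self, X, y):
        n_features = len(X[0])
        self.log_prior, self.log_on, self.log_off = {}, {}, {}
        for c in (0, 1):
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            # Smoothed estimate of P(feature present | class c).
            probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                     for j in range(n_features)]
            self.log_on[c] = [math.log(p) for p in probs]
            self.log_off[c] = [math.log(1 - p) for p in probs]
        return self

    def log_odds(self, x):
        """Log-odds of class 1 versus class 0 for one feature vector."""
        def joint(c):
            return self.log_prior[c] + sum(
                self.log_on[c][j] if v else self.log_off[c][j]
                for j, v in enumerate(x))
        return joint(1) - joint(0)

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical keywords and toy reports (label 1 = describes a seizure).
KEYWORDS = ["seizure", "no seizure", "ictal"]
train_reports = [
    ("Three electrographic seizures were recorded.", 1),
    ("Ictal rhythmic activity over the left temporal region; one seizure captured.", 1),
    ("Normal awake EEG. No seizures were recorded.", 0),
    ("Diffuse slowing. No seizures or epileptiform discharges.", 0),
]
held_out = [
    ("Two brief seizures with ictal onset in the right frontal region.", 1),
    ("No seizures seen during this recording.", 0),
]

X_train = [keyword_features(text, KEYWORDS) for text, _ in train_reports]
y_train = [label for _, label in train_reports]
model = BernoulliNaiveBayes().fit(X_train, y_train)

scores = [model.log_odds(keyword_features(text, KEYWORDS)) for text, _ in held_out]
labels = [label for _, label in held_out]
held_out_auc = auc(scores, labels)
print(scores, held_out_auc)
```

The toy example shows why single keywords are not enough on their own: "seizure" appears in both positive and negative reports, so the negated phrase "no seizure" carries most of the discriminating weight. Plain substring matching is used only to keep the sketch short; it would also match terms such as "interictal", which a real feature set would need to handle.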

Publication Type: Article
Additional Information: © 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Publisher Keywords: Bayes Theorem; Diagnosis, Computer-Assisted; Electroencephalography; Epilepsy; Humans; Machine Learning; ROC Curve; Retrospective Studies
Departments: Bayes Business School > Management
Text - Accepted Version

