
Reliable and fast automatic artifact rejection of Long-Term EEG recordings based on Isolation Forest

Med Biol Eng Comput. 2023 Nov 9. doi: 10.1007/s11517-023-02961-5. Online ahead of print.

ABSTRACT

Long-term electroencephalography (Long-Term EEG) can monitor patients over long periods, making it a valuable tool in medical institutions. However, because of the large volume of patient data, selecting clean data segments from raw Long-Term EEG for further analysis is extremely time-consuming and labor-intensive. Furthermore, the various actions of patients during recording make it difficult for algorithms to denoise part of the EEG data, so these data have to be rejected. Tools for the quick rejection of heavily corrupted epochs in Long-Term EEG records are therefore highly beneficial. In this paper, a new reliable and fast automatic artifact rejection method for Long-Term EEG based on Isolation Forest (IF) is proposed. Specifically, the IF algorithm is applied repeatedly to detect outliers in the EEG data, and the boundary of inliers is adjusted at each pass by using a statistical indicator, so that the algorithm proceeds iteratively. The iteration terminates when the distance metric between clean epochs and artifact-corrupted epochs remains unchanged. Six statistical indicators (min, max, median, mean, kurtosis, and skewness) are evaluated by setting each as the centroid used to adjust the boundary during iteration, and the proposed method is compared with several state-of-the-art methods on a retrospectively collected dataset. The experimental results indicate that using the min value of the data as the centroid yields the best performance, and that the proposed method is effective and reliable for automatic artifact rejection in Long-Term EEG, significantly improving the overall data quality. Furthermore, the proposed method surpasses the compared methods on most data segments with poor data quality, demonstrating its superior capacity to improve heavily corrupted data. In addition, owing to the linear time complexity of IF, the proposed method is much faster than the other methods, which is an advantage when dealing with extensive datasets.
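The iterative procedure described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the epoch features, the boundary-adjustment rule, or the distance metric. Here the features (per-epoch standard deviation and peak-to-peak range), the boundary rule (a ball around the inlier "centroid"), the Euclidean gap between group centroids, and the names `epoch_features` and `iterative_if_rejection` are all hypothetical choices. Only `IsolationForest` from scikit-learn is a real library class.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def epoch_features(epochs):
    """Assumed features: per-epoch standard deviation and peak-to-peak range."""
    epochs = np.asarray(epochs, dtype=float)
    return np.column_stack([epochs.std(axis=1), np.ptp(epochs, axis=1)])


def iterative_if_rejection(features, centroid_fn=np.min, max_iter=10, tol=1e-9, seed=0):
    """Return a boolean mask over epochs (True = kept as clean).

    centroid_fn is the statistical indicator used as the "centroid"
    (np.min, np.max, np.median, np.mean, ...); it must accept axis=0.
    """
    features = np.asarray(features, dtype=float)
    keep = np.ones(len(features), dtype=bool)
    prev_gap = None
    for _ in range(max_iter):
        # Refit Isolation Forest on the epochs currently treated as clean.
        forest = IsolationForest(n_estimators=100, random_state=seed)
        forest.fit(features[keep])
        flagged = forest.predict(features) == -1  # outlier candidates
        tentative = keep & ~flagged
        if tentative.sum() < 2:  # too few inliers to define a boundary
            break
        # Hypothetical boundary adjustment: keep every epoch that lies no
        # farther from the inlier centroid than the farthest unflagged epoch.
        centroid = centroid_fn(features[tentative], axis=0)
        dist = np.linalg.norm(features - centroid, axis=1)
        radius = dist[tentative].max()
        keep = keep & (dist <= radius)  # rejections are never undone
        if keep.all():  # nothing rejected so far, nothing to compare
            break
        # Assumed distance metric: gap between clean and rejected centroids.
        gap = np.linalg.norm(
            centroid_fn(features[keep], axis=0) - centroid_fn(features[~keep], axis=0)
        )
        if prev_gap is not None and abs(gap - prev_gap) <= tol:
            break  # distance unchanged, so stop iterating
        prev_gap = gap
    return keep


# Usage on synthetic data: 210 one-channel epochs, 10 of them heavily corrupted.
rng = np.random.default_rng(0)
epochs = rng.normal(0.0, 1.0, size=(210, 256))
artifact_idx = np.arange(0, 210, 21)
epochs[artifact_idx] *= 20.0  # simulated large-amplitude artifacts
keep = iterative_if_rejection(epochs_features := epoch_features(epochs))
print(f"kept {keep.sum()} of {len(keep)} epochs")
```

With amplitude-like features, measuring distance from the per-feature minimum grows with artifact amplitude, which is one plausible intuition for why a min-based centroid could work well. Whether this matches the paper's reasoning is not stated in the abstract. This sketch may also discard some clean epochs at the fringe of the distribution, so `max_iter` is kept small.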

PMID: 37943419 | DOI: 10.1007/s11517-023-02961-5

By Nevin Manimala
