Education

Ph.D. 2012-2017

Izmir Institute of Technology

Electronics & Communication Engineering

 

Thesis Title: Development of a Comparative Analysis Framework for High Dimensional Data Based on Quasi-supervised Learning
Supervisor: Prof. Dr. Bilge KARAÇALI

Abstract:

In this dissertation, automatic compensation and gating strategies are investigated for multi-color flow cytometry data analysis. We propose two clustering algorithms that combine the quasi-supervised learning algorithm with an expectation-maximization routine for automatic gating. The quasi-supervised learning algorithm estimates the posterior probabilities of the different cell populations at each sample in a dataset without fitting a parametric model to the data.

We have developed two binary divisive clustering algorithms based on expectation maximization, in which the responsibility values are calculated with the quasi-supervised learning algorithm instead of the probabilistic models used in conventional expectation-maximization applications. Our clustering algorithms determine the number of clusters at run time by measuring the overlap between the estimated clusters in each provisional division and comparing it with that of the previous division to decide whether the division is warranted. Since this type of clustering is indifferent to the underlying distribution of the dataset, it is well suited to automatic flow cytometry gating.
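As a rough illustration of the idea above, the sketch below performs a single binary split in which the responsibilities come from a nonparametric estimate rather than a fitted density model. It is not the quasi-supervised learning algorithm of the thesis: a plain k-nearest-neighbor vote stands in for the posterior estimator, and the function names, the principal-axis initialization, and the overlap measure are all assumptions made for this example.

```python
import numpy as np

def knn_responsibilities(X, labels, k=10):
    """Nonparametric E-step stand-in (assumed, not the thesis' estimator):
    fraction of each point's k nearest neighbors currently in cluster 1."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)              # a point does not vote for itself
    nn = np.argsort(d2, axis=1)[:, :k]        # indices of the k nearest neighbors
    return labels[nn].mean(axis=1)

def binary_split(X, k=10, max_iter=100):
    """Alternate hard assignments and k-NN responsibilities until stable."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    labels = (Xc @ vt[0] > 0).astype(float)   # initial split along the principal axis
    for _ in range(max_iter):
        resp = knn_responsibilities(X, labels, k)
        new_labels = (resp > 0.5).astype(float)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    resp = knn_responsibilities(X, labels, k)
    # Hypothetical overlap measure: average ambiguity of the responsibilities.
    overlap = np.minimum(resp, 1.0 - resp).mean()
    return labels.astype(int), overlap

# Two well-separated synthetic populations in two dimensions.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(10.0, 1.0, (50, 2))])
labels, overlap = binary_split(X)
```

In a divisive scheme, a split like this would be accepted only if its overlap compares favorably with that of the previous division, and the procedure would then recurse on each half. That acceptance rule is not shown here.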

The second clustering algorithm improves upon the first one using a simulated annealing approach. Its iterative structure allows finding the global minimum of a cost functional that achieves the best separation point by gradually smoothing the decision regions in each iteration.

Finally, we have developed a joint diagonalization and clustering method for automatic compensation of flow data based on the methods above. The proposed method identifies cell subgroups using the annealing-based model-free expectation-maximization algorithm and finds a data transformation matrix that achieves orthogonality of the covariance structure of each identified cell cluster using fast Frobenius diagonalization.
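To make the notion of a shared diagonalizing transformation concrete, the sketch below handles the simplest case of two cluster covariance matrices. It does not implement fast Frobenius diagonalization, which handles more than two matrices approximately. Instead it uses the exact two-matrix solution obtained by whitening the first covariance and eigendecomposing the second. The function name `joint_diagonalizer` and the example mixing matrix are assumptions made for illustration.

```python
import numpy as np

def joint_diagonalizer(C1, C2):
    """Return V such that V @ C1 @ V.T is the identity and V @ C2 @ V.T is
    diagonal. Exact for two symmetric positive-definite matrices."""
    w, E = np.linalg.eigh(C1)
    W = (E / np.sqrt(w)).T                  # whitening: W @ C1 @ W.T = I
    _, U = np.linalg.eigh(W @ C2 @ W.T)     # rotate to diagonalize the second matrix
    return U.T @ W

# Hypothetical mixing matrix applied to two clusters with diagonal covariances.
A = np.array([[1.0, 0.2, 0.1],
              [0.3, 1.0, 0.2],
              [0.1, 0.4, 1.0]])
C1 = A @ np.diag([1.0, 2.0, 3.0]) @ A.T
C2 = A @ np.diag([3.0, 1.0, 0.5]) @ A.T

V = joint_diagonalizer(C1, C2)
D1 = V @ C1 @ V.T                           # identity up to rounding
D2 = V @ C2 @ V.T                           # diagonal up to rounding
```

With three or more clusters an exact common diagonalizer generally does not exist, which is where an approximate joint diagonalization method becomes necessary.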

We have tested all proposed algorithms on both synthetically created datasets and real multi-color flow cytometry datasets. The results show that our automated gating algorithms successfully identify the distinct cell groups as long as there is enough statistical evidence for their presence. In addition, the automated compensation procedure was successfully applied to the synthetically created dataset and to real multi-color flow cytometry data of lymphocytes, a low-autofluorescence cell group. However, the automated compensation algorithm needs further study to be generalized to high-autofluorescence cell types, where proper compensation does not necessarily coincide with an orthogonal covariance structure.

M.S. 2009-2012

Izmir Institute of Technology

Electronics & Communication Engineering

Thesis Title: Separation of Stimulus-Specific Patterns in Electroencephalography Data Using Quasi-Supervised Learning
Supervisor: Prof. Dr. Bilge KARAÇALI

Abstract:

In this study, the separation of electroencephalography (EEG) data recorded under different visual stimuli is investigated using the quasi-supervised learning algorithm. The quasi-supervised learning algorithm estimates the posterior probabilities associated with the different stimuli, thus identifying the EEG data samples that are exclusively specific to their respective stimuli directly and automatically from the data. The data used in this study contain 32-channel EEG recordings under six different visual stimuli presented in random successive order. After data preprocessing, we first constructed EEG profiles representing instantaneous brain activity using various combinations of independent component analysis and the wavelet transform. We then applied binary and M-ary quasi-supervised learning to identify condition-specific EEG profiles in different comparison scenarios. The results reveal that the quasi-supervised learning algorithm is successful in capturing the distinction between the samples. In addition, feature extraction using independent component analysis increased the performance of quasi-supervised learning, and the wavelet decomposition revealed the different frequency bands of the features, making the separation of the samples more explicit. We obtained the best results by combining the wavelet decomposition and independent component analysis before applying the quasi-supervised learning algorithm.
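The wavelet step described above can be pictured with a minimal sketch that splits a single channel into frequency bands and summarizes each band by its energy. The Haar wavelet, the three decomposition levels, the synthetic test signal, and the function names are assumptions made for this example. The abstract does not state which wavelet or features were used, and the independent component analysis step is omitted.

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar transform (even-length input)."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # low-pass half
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)   # high-pass half
    return approx, detail

def band_energies(x, levels=3):
    """Energy of the detail coefficients at each level, followed by the
    energy of the final approximation (hypothetical feature vector)."""
    energies = []
    approx = np.asarray(x, dtype=float)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        energies.append(float(np.sum(detail ** 2)))
    energies.append(float(np.sum(approx ** 2)))
    return energies

# Synthetic single-channel signal with a slow and a fast component.
t = np.arange(256) / 256.0
signal = np.sin(2 * np.pi * 4 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)
features = band_energies(signal, levels=3)
```

Because the Haar transform is orthonormal, the band energies sum to the total signal energy, so the feature vector shows how that energy is distributed across frequency bands.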

B.S. 2005-2009

 

Izmir Institute of Technology

Electronics & Communication Engineering