Research

"Scalable. Interpretable. Trustworthy."

My research lies at the intersection of artificial intelligence, optimization, and systems theory — with a strong emphasis on clean design and explainability.

Multi-View Clustering: Developing efficient clustering algorithms that integrate heterogeneous feature spaces (views).
Federated Learning: Training models across distributed clients without centralizing raw data or compromising privacy.
Edge AI: Designing lightweight models suitable for real-time, low-resource environments.
Anomaly Detection: Identifying patterns that don't belong — from machines to behavior.

I value clarity in implementation and reproducibility in results. My research balances theoretical foundation with practical deployment — ensuring what I publish is not just elegant on paper, but robust in the real world.

Selected Publications

Federated Multi-View K-Means Clustering

Kristina P. Sinaga, Miin-Shen Yang

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024 (Impact Factor: 20.8)

Summary: This work introduces a novel federated learning framework for multi-view clustering, enabling multiple data holders to collaboratively cluster heterogeneous data without sharing raw features or sensitive information. The proposed Federated Multi-View K-Means (FedMVKM) algorithm integrates local multi-view K-means computations with a secure aggregation protocol, ensuring privacy preservation and data locality. Key contributions include:

Formulation of a unified objective for multi-view clustering in federated environments, supporting arbitrary numbers of views and clients.
Development of a privacy-preserving protocol that aggregates only model parameters (not raw data), protecting both feature and membership information.
Explicit update rules for cluster centers and memberships that guarantee convergence and scalability across distributed nodes.
Comprehensive experiments on real-world multi-view datasets demonstrating that FedMVKM achieves clustering performance comparable to centralized methods, while maintaining strict privacy guarantees and communication efficiency.
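
The idea of aggregating only model parameters can be illustrated with a deliberately simplified sketch. This is not the FedMVKM algorithm from the paper: it assumes a single view, hard K-means assignments, and plain (unencrypted) aggregation, and the function names `local_statistics`, `aggregate`, and `federated_kmeans` are hypothetical. Each client sends only per-cluster sums and counts, and the server turns them into new global centers.

```python
import numpy as np

def local_statistics(X, centers):
    """Client side: assign local points to the nearest center and return
    per-cluster feature sums and counts. Raw rows of X are not returned."""
    # Squared Euclidean distance from every point to every center: shape (n, k)
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    k = centers.shape[0]
    sums = np.zeros_like(centers, dtype=float)
    np.add.at(sums, labels, X)  # accumulate each point into its cluster's sum
    counts = np.bincount(labels, minlength=k).astype(float)
    return sums, counts

def aggregate(stats, old_centers):
    """Server side: combine client summaries into new global centers.
    Clusters that received no points keep their previous center."""
    total_sums = sum(s for s, _ in stats)
    total_counts = sum(c for _, c in stats)
    new_centers = old_centers.copy()
    nonempty = total_counts > 0
    new_centers[nonempty] = total_sums[nonempty] / total_counts[nonempty, None]
    return new_centers

def federated_kmeans(client_data, init_centers, rounds=20):
    """Run a fixed number of communication rounds (an assumed stopping rule)."""
    centers = np.asarray(init_centers, dtype=float)
    for _ in range(rounds):
        stats = [local_statistics(X, centers) for X in client_data]
        centers = aggregate(stats, centers)
    return centers
```

Because the global mean of a cluster equals the count-weighted combination of the client sums, this round reproduces the centralized K-means update without moving raw data. A real deployment would additionally protect the sums and counts themselves, for example with a secure aggregation protocol as described above.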

Impact: This approach bridges the gap between multi-view learning and federated analytics, making collaborative clustering feasible for sensitive domains such as healthcare, finance, and cross-organization research, where data cannot be centralized due to privacy or regulatory constraints.

Article | MATLAB Functions | Python Package | PDF

Unsupervised K-means clustering algorithm

Kristina P. Sinaga, Miin-Shen Yang

IEEE Access, 2020

See the list on IEEE Xplore

Abstract:
This article presents a comprehensive and technically robust exploration of the classic K-means algorithm, focusing on its unsupervised learning capabilities for partitioning unlabeled data into meaningful clusters. The work advances the field by providing a rigorous mathematical formulation, detailed algorithmic steps, and practical considerations for real-world deployment. It revisits foundational principles—such as centroid initialization, iterative assignment, and update steps—while addressing common challenges like cluster initialization sensitivity, convergence criteria, and scalability to large datasets.

Enhanced Initialization Strategies: Discusses improved centroid initialization (e.g., k-means++), which reduces the risk of poor local minima and improves clustering stability.
Convergence Analysis: Provides formal analysis of convergence properties, including proofs of monotonic decrease in the objective function and clear algorithm termination criteria.
Scalability and Efficiency: Explores computational optimizations such as efficient distance computations and parallelization, making K-means suitable for large-scale and high-dimensional data.
Evaluation Metrics: Uses quantitative measures (e.g., inertia, silhouette score) to assess clustering quality, enabling objective comparison across runs and parameter settings.
Practical Impact: Offers guidelines for parameter selection and best practices, empowering practitioners to apply K-means in diverse domains including image analysis, bioinformatics, and market segmentation.
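
The assignment and update steps, the k-means++ seeding, and the monotonic decrease of the objective mentioned above can be sketched in a few lines of NumPy. This is the standard textbook K-means, written under stated assumptions, and is not the specific algorithm of the article; the names `kmeans_pp_init` and `kmeans`, the tolerance, and the iteration cap are illustrative choices. The sketch assumes `X` contains at least `k` distinct points.

```python
import numpy as np

def kmeans_pp_init(X, k, rng):
    """k-means++ seeding: each new center is sampled with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        chosen = np.array(centers)
        d2 = ((X[:, None, :] - chosen[None, :, :]) ** 2).sum(axis=2).min(axis=1)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers, dtype=float)

def kmeans(X, k, max_iter=100, tol=1e-8, seed=0):
    rng = np.random.default_rng(seed)
    centers = kmeans_pp_init(X, k, rng)
    history = []  # inertia recorded after each assignment step
    for _ in range(max_iter):
        # Assignment step: nearest center for every point
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        history.append(d2[np.arange(len(X)), labels].sum())
        # Update step: each center moves to the mean of its members;
        # an empty cluster keeps its previous center
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:
                new_centers[j] = members.mean(axis=0)
        # Termination criterion: centers have stopped moving
        shift = np.linalg.norm(new_centers - centers)
        centers = new_centers
        if shift <= tol:
            break
    return centers, labels, history
```

The `history` list makes the convergence property directly checkable: each recorded inertia should be no larger than the one before it, since neither the assignment step nor the update step can increase the objective.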

Impact: This article has made a significant mark in the machine learning community, with over 2,000 citations since publication. Its exceptional citation rate demonstrates the paper's foundational contribution to clustering algorithms and its widespread adoption across multiple domains. The article's clarity, technical depth, and practical relevance have established it as one of the most influential works on K-means clustering in recent years, providing both theoretical foundations and implementation guidance that continue to shape research and applications in data mining, pattern recognition, and unsupervised learning.

Article | Code | PDF

Collaborative feature-weighted multi-view fuzzy c-means clustering

Kristina P. Sinaga, Miin-Shen Yang

Pattern Recognition, 2021 (Impact Factor: 7.5)

Summary: This paper introduces a collaborative feature-weighted multi-view fuzzy c-means clustering (Co-FW-MVFCM) algorithm. It aims to improve clustering performance by simultaneously optimizing cluster assignments, view weights, and feature weights within each view in a collaborative manner. Key aspects include:

Development of an objective function that integrates feature weighting at both view and individual feature levels.
A collaborative optimization strategy that allows views to influence each other's feature weighting schemes.
Derivation of update rules for memberships, cluster centers, view weights, and feature weights.
Demonstration of the algorithm's effectiveness on various multi-view datasets, showing improved accuracy and robustness.
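
For readers unfamiliar with the base method, the membership and cluster-center updates that Co-FW-MVFCM builds on can be sketched with plain fuzzy c-means. This is a minimal single-view sketch, assuming the standard FCM objective with fuzzifier `m`; it omits the view weights, feature weights, and collaborative terms of the paper, and the function name `fuzzy_c_means` and its defaults are hypothetical.

```python
import numpy as np

def fuzzy_c_means(X, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (not the feature-weighted multi-view variant)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row is normalized to sum to 1
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        # Center update: mean of the points weighted by u_ij^m
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Membership update: u_ij proportional to d_ij^(-2/(m-1))
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against a point sitting on a center
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    # centers are the ones used to compute the final memberships
    return U, centers
```

The feature-weighted multi-view setting adds further update rules of the same alternating form (for view weights and per-view feature weights), which are derived in the paper and are not reproduced here.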

Impact: This method provides a more nuanced approach to multi-view clustering by allowing for fine-grained control over feature importance, leading to better discovery of underlying data structures in complex, heterogeneous datasets.

Article | Code | PDF

Technical Skills & Expertise

Core Research Areas

Multi-View Clustering & Learning
Federated & Distributed Learning
Edge AI & Lightweight Models
Anomaly & Outlier Detection
Optimization Algorithms
Explainable AI (XAI) & Interpretability
Privacy-Preserving Machine Learning

Methodologies & Techniques

Fuzzy C-Means & Variants
Kernel Methods (incl. Heat-Kernels)
Deep Learning (CNNs, RNNs, Transformers)
Statistical Modeling & Analysis
Algorithm Design & Complexity Analysis
Reinforcement Learning
Time Series Analysis

Programming & Tools

Python (NumPy, Pandas, SciPy, Scikit-learn)
TensorFlow, Keras, PyTorch
MATLAB
SQL & NoSQL Databases
Git & Version Control
Docker & Containerization
Cloud Platforms (AWS, GCP, Azure basics)

Current Projects

Cross-Modal Learning for Healthcare Applications

Developing AI systems that can integrate diverse healthcare data modalities (imaging, text reports, vitals) to improve diagnostic accuracy while maintaining interpretability.

Distributed Optimization for Edge Computing

Creating novel optimization algorithms that enable efficient distributed computation across edge devices with heterogeneous capabilities and unreliable connectivity.

Privacy-Preserving Machine Learning

Building learning frameworks that maintain data privacy while enabling collaborative model development across organizations and institutions.

https://github.com/KristinaP09/collaborative-feature-weighted-fcm

https://github.com/KristinaP09/Fed-MVKM

Research Opportunities

I'm actively seeking collaborations in the following areas:

Industry partnerships for applying AI techniques to real-world problems
Academic collaborations on federated learning and multi-view data analysis
Mentoring opportunities for graduate and undergraduate researchers

If you're interested in collaborating, please contact me with your research interests and potential collaboration ideas.
