Cost-Efficient Feature Selection for Horizontal Federated Learning
Background and Motivation
As Federated Learning (FL) is increasingly recognized as a distributed machine learning paradigm that safeguards data privacy, its application to multi-client scenarios has garnered significant attention. In Horizontal Federated Learning (HFL), all clients share the same feature space but possess distinct data samples. However, challenges such as feature redundancy and the curse of dimensionality profoundly impact model performance and training efficiency. Feature Selection (FS)—an essential preprocessing technique in machine learning—plays a critical role in eliminating redundant features and enhancing model performance.
This study presents novel methods to address these challenges specifically in HFL scenarios.
Paper Information
The paper, “Cost-Efficient Feature Selection for Horizontal Federated Learning”, is authored by Sourasekhar Banerjee, Devvjiit Bhuyan, Erik Elmroth, and Monowar Bhuyan. The work is a collaboration between the Department of Computing Science at Umeå University and the Department of Electronics & Communication Engineering at Tezpur University. Published in IEEE Transactions on Artificial Intelligence, December 2024 (Volume 5, Issue 12), the research was supported by the Knut and Alice Wallenberg Foundation.
Research Challenges
Traditional FS methods assume centralized access to data and therefore cannot be applied directly in FL, where privacy constraints prevent clients from sharing local data and locally selected feature subsets often differ significantly. Key challenges in this context include:
- Biased local feature importance evaluation due to statistical heterogeneity across clients.
- Lack of consistency in selected feature subsets among clients, hindering unified model training.
- High feature dimensionality, which increases training time and degrades model performance.
Existing federated FS methods often fall short on non-IID (non-independent and identically distributed) data and are rarely integrated end-to-end with global model training.
To address these issues, the authors present a new method, Fed-MOFS (Federated Multi-Objective Optimization Feature Selection), while also revisiting an existing method, Fed-FIS (Federated Feature Importance Scoring).
Methodology
The research investigates two strategies for federated FS:
1. Fed-MOFS: Multi-Objective Optimization for Feature Selection
Fed-MOFS uses mutual information (MI) and clustering for local FS and applies Pareto optimization to achieve unbiased global feature selection.
Workflow:
Local Feature Selection (Local FS):
- Each client calculates feature-class MI (Fcmi) to assess relevance and feature-feature MI (Affmi) to measure redundancy.
- Features are clustered using K-Means, optimizing for high Fcmi and low Affmi values.
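A minimal sketch of this local step, assuming scikit-learn and continuous features; the function name, the two-cluster default, and the centroid-based cluster-selection rule are illustrative simplifications rather than the paper's exact procedure:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def local_feature_scores(X, y, n_clusters=2, seed=0):
    """Return (selected feature indices, Fcmi, Affmi) for one client."""
    n_features = X.shape[1]
    # Fcmi: relevance of each feature to the class label.
    fcmi = mutual_info_classif(X, y, random_state=seed)
    # Affmi: average redundancy of each feature w.r.t. the other features.
    affmi = np.zeros(n_features)
    for j in range(n_features):
        others = np.delete(np.arange(n_features), j)
        mi = mutual_info_regression(X[:, others], X[:, j], random_state=seed)
        affmi[j] = mi.mean()
    # Cluster features in the (Fcmi, Affmi) plane; keep the cluster whose
    # centroid is most relevant (high Fcmi) and least redundant (low Affmi).
    points = np.column_stack([fcmi, affmi])
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(points)
    best = np.argmax(km.cluster_centers_[:, 0] - km.cluster_centers_[:, 1])
    selected = np.where(km.labels_ == best)[0]
    return selected, fcmi, affmi
```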
Global Feature Selection (Global FS):
- Clients send local feature scores to the server.
- The server applies multi-objective optimization on average Fcmi and Affmi values, generating Pareto fronts.
- Features are globally ranked based on dominance principles derived from Pareto optimization.
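A minimal sketch of the server-side Pareto ranking, assuming the server has already averaged each feature's Fcmi and Affmi over clients; the naive non-dominated sorting below is illustrative, and Fed-MOFS's exact sorting procedure may differ:

```python
import numpy as np

def dominates(a, b):
    """True if feature a dominates b: Fcmi no lower and Affmi no higher,
    with strict improvement in at least one objective."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])

def pareto_rank(avg_fcmi, avg_affmi):
    """Assign each feature the index of its Pareto front (0 = best front)."""
    pts = list(zip(avg_fcmi, avg_affmi))
    remaining = set(range(len(pts)))
    rank = np.empty(len(pts), dtype=int)
    front = 0
    while remaining:
        # Features not dominated by any other remaining feature form a front.
        current = {i for i in remaining
                   if not any(dominates(pts[j], pts[i])
                              for j in remaining if j != i)}
        for i in current:
            rank[i] = front
        remaining -= current
        front += 1
    return rank

# Global ranking: features sorted by ascending front index, e.g.
# order = np.argsort(pareto_rank(avg_fcmi, avg_affmi))
```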
2. Fed-FIS: Feature Ranking with Scoring Function
Fed-FIS ranks features globally using a scoring function, Score = Fcmi − λ·Affmi, that balances relevance against redundancy. Although straightforward, this method lacks the multi-objective trade-off mechanism offered by Fed-MOFS.
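Since the ranking reduces to the formula above, a sketch is short; avg_fcmi and avg_affmi are assumed to be the server-side averages, and the λ default is illustrative rather than the paper's value:

```python
import numpy as np

def fed_fis_rank(avg_fcmi, avg_affmi, lam=0.5):
    """Rank features by Score = Fcmi - lam * Affmi (higher score is better)."""
    score = np.asarray(avg_fcmi) - lam * np.asarray(avg_affmi)
    return np.argsort(score)[::-1]  # feature indices, best first
```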
Model Training and Evaluation
- Feature rankings from each method are utilized for training with Federated Averaging (FedAvg) or Federated Forest algorithms.
- Both IID (independent and identically distributed) and non-IID settings were tested, alongside variations in client participation, to evaluate performance, stability, scalability, and efficiency.
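For the aggregation step that consumes the reduced feature set, here is a minimal FedAvg sketch, assuming each client model is represented as a list of per-layer NumPy arrays (a representational assumption, not the paper's implementation):

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of per-layer client parameters."""
    total = float(sum(client_sizes))
    coeffs = [s / total for s in client_sizes]
    n_layers = len(client_weights[0])
    return [sum(c * w[k] for c, w in zip(coeffs, client_weights))
            for k in range(n_layers)]
```

Each round, participating clients train locally on the globally selected feature subset, and the server averages their updates as above.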
Experimental Results and Analyses
1. Performance Evaluation
The study tested various datasets (e.g., NSL-KDD99, ISOLET, IoT) for classification and regression tasks.
Classification Tasks:
- On 7 of 12 datasets, Fed-MOFS surpassed all baselines, including traditional methods (e.g., ANOVA, RFE) and state-of-the-art federated methods (FSHFL, Fed-MRMR). For example:
- IoT dataset: Fed-MOFS achieved 91% accuracy, outperforming Fed-FIS and Fed-MRMR.
- In regression tasks, Fed-MOFS and Fed-FIS outperformed Fed-MRMR; on the Boston Housing dataset, Fed-FIS achieved an RMSE below 9.0.
Feature Reduction:
- Fed-MOFS and Fed-FIS reduced the feature space by more than 50% while maintaining strong performance.
2. Stability and Efficiency
Stability:
- Both methods showed high precision in ranking the most relevant features.
- Significant accuracy improvements were observed even with substantial (>50%) feature reduction.
Efficiency:
- Fed-MOFS and Fed-FIS were at least 2× faster than FSHFL.
- For instance:
- On the ACC dataset, Fed-FIS ran 26 seconds faster than FSHFL.
- On IoT datasets, Fed-MOFS reduced runtime by an average of 15 seconds.
3. Performance Under Non-IID Settings
The models were assessed under varying degrees of non-IID data (γ) and client participation rates (δ); a sketch of a typical experimental setup follows this list:
Non-IID Factors (γ):
- As γ increased (data distribution became closer to IID), both Fed-MOFS and Fed-FIS exhibited improved performance.
Client Participation Rate (δ):
- With 50% client participation (δ = 0.5), Fed-MOFS achieved generalized models in most cases, whereas Fed-FIS required 80% participation (δ = 0.8) for better performance.
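The paper's exact γ parameterization is not reproduced here; a common way to simulate such settings is a Dirichlet label-skew partition, sketched below, where the concentration parameter alpha plays the role of γ (larger alpha means closer to IID) and delta controls per-round client sampling. Function names and defaults are illustrative assumptions:

```python
import numpy as np

def dirichlet_partition(y, n_clients, alpha, seed=0):
    """Split sample indices across clients with label-skewed proportions;
    larger alpha yields splits closer to IID."""
    rng = np.random.default_rng(seed)
    client_idx = [[] for _ in range(n_clients)]
    for c in np.unique(y):
        idx = rng.permutation(np.where(y == c)[0])
        props = rng.dirichlet(alpha * np.ones(n_clients))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_idx[client].extend(part.tolist())
    return client_idx

def sample_clients(n_clients, delta, seed=0):
    """Select a random delta-fraction of clients for one training round."""
    rng = np.random.default_rng(seed)
    k = max(1, int(delta * n_clients))
    return rng.choice(n_clients, size=k, replace=False)
```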
4. Scalability
Both Fed-MOFS and Fed-FIS scaled effectively to 100 clients and performed well with partial client participation, indicating robustness.
5. Convergence Analysis
The inclusion of FS did not hinder global model convergence. Both Fed-MOFS and Fed-FIS maintained convergence stability with reduced feature subsets.
6. Comparison with Baselines
Fed-MOFS frequently outperformed Fed-FIS due to its advanced multi-objective optimization strategy, especially in non-IID scenarios.
Conclusion and Implications
This study demonstrates that Fed-MOFS and Fed-FIS are efficient and scalable FS methods tailored for HFL. Key achievements include:
- Improved Model Performance: Both methods significantly improved classification and regression results compared with state-of-the-art methods.
- Reduced Computational Overhead: Running at least 2× faster than FSHFL in wall-clock time, Fed-MOFS and Fed-FIS offer substantial efficiency gains.
- Non-IID Robustness: Both systems excel in handling heterogeneous data and partial client participation.
- Convergence Stability: FS does not compromise model training stability or convergence rates.
Future Directions
Potential applications of federated FS include human activity recognition for anomaly detection, financial fraud analysis, and more. The authors plan to explore multi-modal data environments and extend these algorithms for anomaly-based applications in FL.
This study contributes significant advancements in federated FS, emphasizing the critical balance between feature relevance, redundancy, and distributed decision-making.