Simplified Kernel-Based Cost-Sensitive Broad Learning System for Imbalanced Fault Diagnosis
Research Report on the Simplified Kernel-Based Cost-Sensitive Broad Learning System (SKCSBLS) for Imbalanced Fault Diagnosis
Research Background and Significance
With the advent of Industry 4.0, smart manufacturing increasingly relies on industrial big data analytics. Extracting critical insights from machine operation data can greatly improve equipment health management and help keep production systems safe and efficient. In real-world industrial applications, however, imbalanced data poses a significant challenge to fault diagnosis. Equipment operation data consists overwhelmingly of normal-state samples, and fault samples are scarce. This uneven class distribution can degrade a model's predictive performance and make the minority (fault) classes hard to identify.
Deep learning methods, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have been widely applied to fault detection. However, these models require large training datasets and can suffer from overfitting on imbalanced datasets. Additionally, the computational complexity of these methods can result in extended training times. Consequently, researchers are turning to the Broad Learning System (BLS), which features simpler structures and efficient training procedures.
BLS has a single-layer network structure in which feature nodes and enhancement nodes feed a linear output computation, offering data consistency and incremental training capabilities. Its performance is suboptimal, however, on imbalanced data distributions. To overcome this limitation, the article introduces a novel approach named Simplified Kernel-Based Cost-Sensitive Broad Learning System (SKCSBLS). The proposed SKCSBLS is efficient and robust, and it addresses imbalanced classification problems with applications in industrial fault diagnosis.
Paper Source and Author Information
This research, titled “Simplified Kernel-Based Cost-Sensitive Broad Learning System for Imbalanced Fault Diagnosis,” was published in IEEE Transactions on Artificial Intelligence (December 2024, Volume 5, Issue 12). The research team comprises scholars from South China University of Technology and Huaqiao University. Key authors include Kaixiang Yang (Member, IEEE), Wuxing Chen, Yifan Shi (Member, IEEE), Zhiwen Yu (Senior Member, IEEE), and C. L. Philip Chen (Life Fellow, IEEE). The research is funded by multiple grants, including the National Natural Science Foundation of China, the Fujian Provincial Natural Science Foundation, and the High-Level Talent Team Project in Quanzhou City.
Research Summary and Technical Workflow
Proposed Approach and Research Design
The core contribution of this article is SKCSBLS, which integrates a cost-sensitive mechanism and simplified kernel mapping into the BLS framework. The method consists of the following components:
Construction of Cost-Sensitive Broad Learning System (CSBLS):
- Based on the traditional BLS model, cost-sensitive parameters are introduced for different classes, making the training process more focused on minority class samples.
- The cost-sensitive mechanism assigns different misclassification penalty coefficients to the classes (e.g., C+ and C-), which reduces the misclassification rate for minority classes.
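The cost-sensitive idea above can be sketched as a weighted ridge regression over the BLS node matrix. This is a minimal illustration, not the paper's exact formulation: the inverse-class-frequency costs, the function names `class_costs` and `cost_sensitive_ridge`, and the regularization value `lam` are all assumptions made for the example.

```python
import numpy as np

def class_costs(y):
    """Per-sample cost set to 1 / (size of the sample's class).

    Assumed heuristic: rarer classes receive larger penalties.
    """
    classes, counts = np.unique(y, return_counts=True)
    cost_of = {c: 1.0 / n for c, n in zip(classes, counts)}
    return np.array([cost_of[label] for label in y])

def cost_sensitive_ridge(A, Y, costs, lam=1e-3):
    """Solve min_W sum_i c_i * ||a_i W - y_i||^2 + lam * ||W||^2.

    A: (n_samples, n_nodes) node matrix, Y: (n_samples, n_outputs) targets,
    costs: (n_samples,) per-sample penalties.
    Closed form: W = (A^T C A + lam I)^-1 A^T C Y with C = diag(costs).
    """
    CA = A * costs[:, None]                      # same as diag(costs) @ A
    lhs = A.T @ CA + lam * np.eye(A.shape[1])
    rhs = CA.T @ Y                               # equals A^T C Y
    return np.linalg.solve(lhs, rhs)
```

With uniform costs this reduces to ordinary ridge regression, so the cost vector is the only place where class imbalance enters the solution.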
Introduction of Kernel Mapping:
- To address issues such as noise and class overlap in imbalanced data, kernel mapping is incorporated into CSBLS to map raw features into a higher-dimensional kernel space, enhancing classification robustness.
- A Gaussian kernel function is employed, with its parameters optimized using a grid search approach.
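A Gaussian kernel mapping with a grid search over its parameters might look like the sketch below. The kernel parameterization (bandwidth `sigma` in the form exp(-d^2 / (2 sigma^2))), the candidate grids, and the `score_fn` callback are assumptions for illustration; the report does not state which grid or validation score the authors used.

```python
import itertools
import numpy as np

def gaussian_kernel(X, Z, sigma):
    """K[i, j] = exp(-||x_i - z_j||^2 / (2 * sigma^2))."""
    sq = (X ** 2).sum(axis=1)[:, None] + (Z ** 2).sum(axis=1)[None, :] - 2.0 * X @ Z.T
    sq = np.maximum(sq, 0.0)  # clip tiny negatives caused by rounding
    return np.exp(-sq / (2.0 * sigma ** 2))

def grid_search(score_fn, sigmas, lams):
    """Return the (sigma, lam) pair with the highest score.

    score_fn(sigma, lam) is a hypothetical callback that trains a model
    and returns a validation score such as G-Mean.
    """
    return max(itertools.product(sigmas, lams), key=lambda pair: score_fn(*pair))
```

In practice `score_fn` would wrap a cross-validated training run, so the cost of the search grows with the product of the two grid sizes.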
Application of Simplified Kernel Techniques:
- To improve computational efficiency, the study proposes an innovative simplified kernel mapping method that reduces the dimensionality and time complexity of kernel computations.
- A substantial reduction in computational cost is achieved by randomly sampling smaller subsets from the original kernel matrix.
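One way to read the sampling step above is as a reduced kernel: instead of the full N x N kernel matrix, only the columns belonging to m randomly chosen samples are computed, giving an N x m matrix. The sketch below follows that reading; the function name `reduced_gaussian_kernel`, the uniform sampling without replacement, and the parameters `m`, `sigma`, and `seed` are assumptions, and the paper's actual sampling scheme may differ.

```python
import numpy as np

def reduced_gaussian_kernel(X, m, sigma, seed=0):
    """Gaussian kernel between all samples and m random landmark samples.

    Returns K of shape (n_samples, m) and the landmarks of shape (m, n_features).
    Cost is O(N * m * d) instead of O(N^2 * d) for the full kernel matrix.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)   # m distinct rows
    landmarks = X[idx]
    diff = X[:, None, :] - landmarks[None, :, :]          # (N, m, d)
    sq = (diff ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2)), landmarks
```

The landmarks must be kept, because test samples are mapped against the same m points at prediction time.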
Streamlined Optimization Process:
- The output weight matrix is derived using a fast pseudo-inverse optimization method, enabling the model to efficiently handle large-scale datasets.
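The pseudo-inverse step can be illustrated with the ridge-regularized closed form commonly used for BLS output weights. This is a sketch under assumptions: the function name `ridge_output_weights` and the value of `lam` are illustrative, and the paper's "fast" variant may include further optimizations not shown here. The code picks whichever of two algebraically equivalent forms needs the smaller linear system.

```python
import numpy as np

def ridge_output_weights(A, Y, lam=1e-3):
    """Output weights W = (A^T A + lam I)^-1 A^T Y.

    A: (n_samples, n_nodes), Y: (n_samples, n_outputs).
    Uses the identity (A^T A + lam I)^-1 A^T = A^T (A A^T + lam I)^-1
    so that the solved system is min(n_samples, n_nodes) square.
    """
    n, m = A.shape
    if m <= n:
        # Few columns (e.g. a reduced kernel): solve an m x m system.
        return np.linalg.solve(A.T @ A + lam * np.eye(m), A.T @ Y)
    # Few rows: solve an n x n system instead.
    return A.T @ np.linalg.solve(A @ A.T + lam * np.eye(n), Y)
```

When the kernel matrix has only m columns, the system to solve is m x m regardless of the number of training samples, which is why shrinking the kernel matrix also shrinks this step.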
Experimental Design
The proposed method was validated on a diverse set of datasets: 19 benchmark datasets from the UCI and KEEL repositories and two real-world industrial datasets (CWRU bearing data and IMS data). The datasets have imbalance ratios (IR) ranging from 2.48 to 36.67. The experiments compared SKCSBLS against various state-of-the-art imbalanced learning algorithms, such as Weighted Extreme Learning Machine (WELM), Weighted Broad Learning System (WBLS), Cost-Sensitive Extreme Learning Machine (CS-ELM), and AMSCO, among others.
Evaluation metrics included G-Mean (Geometric Mean) and AUC (Area Under the Receiver Operating Characteristic Curve). Five-fold cross-validation was implemented to ensure robustness.
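For reference, the two metrics can be computed as in the sketch below for the binary case. The function names and the `positive` label argument are assumptions for illustration; for multi-class problems, G-Mean is usually taken as the geometric mean of the per-class recalls, which this sketch does not cover.

```python
import math

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (minority recall) and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return math.sqrt(sensitivity * specificity)

def auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Unlike plain accuracy, G-Mean drops to zero if either class is missed entirely, which is why it suits imbalanced evaluation.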
Research Results and Analysis
Experimental Results
Performance Comparison: SKCSBLS outperformed competing models on 14 of the 19 benchmark datasets. Notably, it achieved superior results on highly imbalanced datasets such as Page Blocks and Ecoli, with G-Mean scores exceeding 0.93.
Processing Speed: On both the CWRU and IMS datasets, SKCSBLS demonstrated competitive runtime, ranking third among all methods while being noticeably faster than ensemble-based techniques like AMSCO. Its speed is attributed to the reduced dimensionality of the simplified kernel matrix, which lowers the cost of the pseudo-inverse computation.
Industrial Application: In bearing fault diagnosis applications, SKCSBLS exhibited stable high-accuracy metrics on both real-world datasets. For instance:
- G-Mean: 0.987 (CWRU dataset) and 0.852 (IMS dataset)
- AUC: 0.985 (CWRU dataset) and 0.85 (IMS dataset)
The model also handled noisy data effectively, showcasing its reliability in anomaly detection tasks.
Key Highlights
Innovative Methodology: The integration of cost-sensitive parameters and simplified kernel mapping significantly enhanced SKCSBLS’s sensitivity to minority class samples while optimizing computational efficiency.
Industrial Relevance: Beyond fault diagnosis, SKCSBLS offers actionable insights for addressing classification challenges in areas like medical diagnostics and text classification, where data distribution is highly imbalanced.
Research Significance and Future Directions
This study presented a robust, effective framework for handling imbalanced datasets, combining cost-sensitive strategies with kernel mapping. SKCSBLS’s ability to improve minority sample classification proves valuable not only in industrial fault detection but also in domains requiring precise anomaly detection.
Future Work:
- Enhancing the automated optimization of cost-sensitive parameters.
- Extending the approach to multi-class imbalance scenarios.
- Evaluating the model’s scalability with additional real-world industrial datasets across diverse domains.