Mitigating Social Biases of Pre-trained Language Models via Contrastive Self-Debiasing with Double Data Augmentation

Introduction: Pre-trained language models (PLMs) are widely used in natural language processing, but they inherit and amplify social biases present in their training corpora. These biases can create unpredictable risks in real-world applications of PLMs, such as automatic job screening systems te...

A Unified Momentum-based Paradigm of Decentralized SGD for Non-Convex Models and Heterogeneous Data

Research Background: In recent years, with the rise of the Internet of Things and edge computing, distributed machine learning has developed rapidly, especially the decentralized training paradigm. However, in practical scenarios, non-convex objectiv...

Acquiring and Modeling Abstract Commonsense Knowledge via Conceptualization

Introduction: The lack of commonsense knowledge has long been one of the main bottlenecks holding back artificial intelligence. Although great strides have been made in recent years through neural language models and commonsense knowledge graphs, the key component of human intelligence, “conceptualization,” has ...

A Multi-graph Representation for Event Extraction

Background: Event extraction is a widely studied task in natural language processing that aims to identify event trigger words and their related arguments in a given text. The task is typically divided into two subtasks: event detection (extracting event trigger words) and argument extraction. The traditional pipeline method per...
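The two subtasks named above can be illustrated with a toy sketch. It is not the paper's method: the `Event` class, the `TRIGGERS` lexicon, the role names, and the neighbour-token rule for arguments are all hypothetical, chosen only to show what "trigger" and "arguments" mean and how one step can feed the next.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Event:
    """One extracted event: a trigger word, its type, and role -> argument."""
    trigger: str
    event_type: str
    arguments: Dict[str, str] = field(default_factory=dict)


# Hypothetical trigger lexicon (an assumption for illustration only).
TRIGGERS = {"acquired": "Acquisition"}


def detect_events(tokens: List[str]) -> List[Event]:
    """Subtask 1, event detection: find trigger words and assign event types."""
    return [Event(trigger=t, event_type=TRIGGERS[t]) for t in tokens if t in TRIGGERS]


def extract_arguments(tokens: List[str], event: Event) -> Event:
    """Subtask 2, argument extraction: a toy rule that takes the neighbouring tokens."""
    i = tokens.index(event.trigger)
    if i > 0:
        event.arguments["Buyer"] = tokens[i - 1]
    if i + 1 < len(tokens):
        event.arguments["Target"] = tokens[i + 1]
    return event


tokens = "Acme acquired Globex yesterday".split()
events = [extract_arguments(tokens, e) for e in detect_events(tokens)]
print(events)
```

Because the second step only sees what the first step found, a trigger missed during detection can never receive arguments in this sketch.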

A Neurosymbolic Cognitive Architecture Framework for Handling Novelties in Open Worlds

Background: Traditional AI research assumes that intelligent agents operate in a “closed world”, where all task-relevant concepts in the environment are known, and no new unknown situations will arise. However, in the open real world, novelties that vi...

Learning Spatio-Temporal Dynamics on Mobility Networks for Adaptation to Open-World Events

Research Background: In modern society, Mobility-as-a-Service (MaaS) systems seamlessly integrate various transportation modes (such as public transport, ride-sharing, and shared bicycles). To achieve efficient MaaS operation, modeling the spatio-tempo...

Hyperbolic secant representation of the logistic function: Application to probabilistic multiple instance learning for CT intracranial hemorrhage detection

“Weak supervision” is a long-standing problem in artificial intelligence: only some of the labels are observable in the training data, while the remaining labels are unknown. Multiple Instance Learning (MIL) is one paradigm for addressing this issue. In MIL, the training data is divided into multiple “bags”, each containing multi...
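As a minimal sketch of the bag structure, the example below uses the standard MIL assumption that a bag is positive if at least one of its instances is positive. That assumption, the scan and slice names, and the `bag_label` helper are illustrative and not taken from the paper. In real MIL training the instance labels are hidden and only the bag labels are observed.

```python
from typing import Dict, List, Sequence


def bag_label(instance_labels: Sequence[int]) -> int:
    """Standard MIL assumption: a bag is positive (1) iff any instance is positive."""
    return int(any(label == 1 for label in instance_labels))


# Hypothetical data: each bag is a CT scan, each instance is one slice.
# The per-slice labels are shown only to illustrate how bag labels arise.
bags: Dict[str, List[int]] = {
    "scan_a": [0, 0, 1, 0],  # one positive slice makes the bag positive
    "scan_b": [0, 0, 0],     # no positive slice, so the bag is negative
}

bag_labels = {name: bag_label(slices) for name, slices in bags.items()}
print(bag_labels)  # {'scan_a': 1, 'scan_b': 0}
```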

Investigating the Properties of Neural Network Representations in Reinforcement Learning

Traditional representation learning methods usually design a fixed basis function architecture to achieve desired properties such as orthogonality and sparsity. In contrast, the idea of deep reinforcement learning is that the designer should not encode the properties of the representation, but instead let the data flow determine the properties of t...

Critical Observations in Model-Based Diagnosis

In model-based fault diagnosis, the ability to identify the key observational data that leads to system abnormalities is highly valuable. This paper introduces a framework and algorithm for identifying key observational data. The framework determines which observations are crucial for diagnosis by abstracting the raw observational data into “sub-ob...

Polarized Message-Passing in Graph Neural Networks

With the widespread application of graph-structured data in various fields, Graph Neural Networks (GNNs) have attracted significant attention as a powerful tool for analyzing graph data. However, existing GNNs primarily rely on neighborhood node similarity information when learning node representations, overlooking the potential of node differences...