Acquiring and Modeling Abstract Commonsense Knowledge via Conceptualization
Introduction
The lack of commonsense knowledge in artificial intelligence systems has long been one of the main bottlenecks in the field. Although neural language models and commonsense knowledge graphs have brought great progress in recent years, a key component of human intelligence, “conceptualization,” is still poorly reflected in artificial intelligence systems. Humans make sense of the endless entities and situations in the world by conceptualizing specific things or situations into abstract concepts and reasoning over those concepts. Finite knowledge graphs, by contrast, cannot cover the diverse entities and situations in the real world, let alone the relationships and inferences between them.
This research examines the role of conceptualization in commonsense reasoning and constructs a framework that simulates the human process of concept induction: from existing situational commonsense knowledge graphs, it extracts event knowledge about abstract concepts, together with higher-level triples (inferences) about those concepts. The framework first performs concept recognition and conceptualization on event instances in the commonsense knowledge graph ATOMIC, using language models and heuristic rules to generate abstract events and abstract triples. The researchers built a large-scale, manually annotated dataset to supervise the training of the relevant neural network models, and used those models to build a large abstract knowledge graph, “Abstract ATOMIC,” on top of ATOMIC. Experimental results show that incorporating this abstract knowledge graph into existing commonsense models can significantly improve performance on downstream tasks such as commonsense reasoning and zero-shot question answering.
Research Background
Existing commonsense knowledge is typically represented as event-centric commonsense knowledge graphs whose nodes are natural language text. ATOMIC is a typical example, containing a large number of manually annotated triples about everyday situations and their causes and effects. Despite its scale, a finite knowledge graph cannot cover the endless entities and situations in the real world.
The researchers argue that humans rely on “conceptualization” to acquire this commonsense knowledge. People capture real-world commonsense by conceptualizing each concrete experience into an abstract concept and connecting these concepts, which lets them understand new instances. Concepts are the glue that connects our mental world, and intelligent systems that lack concepts cannot fully understand this world. Replicating this human concept induction process is not easy, however: it requires handling the inherent flexibility of language, the many-to-many relationship between entities/events and concepts, and reporting bias, among other challenges.
This research focuses on acquiring and modeling abstract commonsense knowledge from text-based commonsense knowledge graphs and concept hierarchies, using neural language models and rule-based methods. It models the conceptualization process at three levels: 1) identifying the entities and events mentioned within an event and conceptualizing them into concepts; 2) constructing abstract events based on those concepts; 3) verifying the typicality of inferences (abstract triples) about abstract events.
Research Methods
The researchers first used heuristic rules and language models to identify entities and events within ATOMIC events and conceptualize them, generating candidate abstract events. To ensure quality, they manually annotated large-scale datasets for event conceptualization and triple conceptualization, which supervise the training of neural network models such as the concept validator and the inference validator. The process consists of the following steps:
1) Identification: Heuristic rules built on syntactic and semantic features identify the entities and events mentioned within an event as candidates for conceptualization.
2) Conceptualization: Candidate concepts are generated through two paths: a language-model-based concept generator predicts concepts directly, and heuristic rules link candidates to concepts in the concept hierarchy. Every candidate concept must pass the concept validator before it is used to form an abstract event.
3) Inference Validation: Validate the inference triples of each abstract event instance to determine which inferences are typically valid for that type of event, thereby forming abstract triples.
4) Instantiation: Conceptualize any new event that appears and perform inference based on the corresponding abstract triples.
Through this process, researchers built a large-scale “Abstract ATOMIC” knowledge base containing 70,000 abstract events and 2.95 million abstract triples on top of ATOMIC.
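The four steps above can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration: the tiny `HYPERNYMS` dictionary stands in for a real concept hierarchy, `validate_concept` is a placeholder for the trained concept validator, and the `min_support` count is a crude stand-in for the trained inference validator. It is not the researchers' implementation.

```python
from __future__ import annotations

from collections import defaultdict
from dataclasses import dataclass

# Hypothetical toy concept hierarchy (stand-in for a real taxonomy).
HYPERNYMS = {
    "coffee": ["beverage", "drink"],
    "tea": ["beverage", "drink"],
    "juice": ["beverage", "drink"],
}


@dataclass(frozen=True)
class Triple:
    head: str      # event text, e.g. "PersonX drinks coffee"
    relation: str  # ATOMIC-style relation name, e.g. "xEffect"
    tail: str      # inference text


def identify(event: str) -> list[str]:
    """Step 1: candidate spans. Here: any token present in the toy hierarchy."""
    return [tok for tok in event.split() if tok in HYPERNYMS]


def conceptualize(event: str, span: str) -> list[str]:
    """Step 2: replace the span with each linked concept to form abstract events."""
    return [
        " ".join(f"[{concept}]" if tok == span else tok for tok in event.split())
        for concept in HYPERNYMS[span]
    ]


def validate_concept(abstract_event: str) -> bool:
    """Placeholder for the trained concept validator (accepts everything)."""
    return True


def build_abstract_triples(triples: list[Triple], min_support: int = 2) -> set[Triple]:
    """Step 3: keep an inference if enough distinct instances support it.

    The support threshold is an assumption standing in for a learned validator.
    """
    support = defaultdict(set)  # (abstract head, relation, tail) -> instance heads
    for t in triples:
        for span in identify(t.head):
            for abstract_event in conceptualize(t.head, span):
                if validate_concept(abstract_event):
                    support[(abstract_event, t.relation, t.tail)].add(t.head)
    return {Triple(*key) for key, heads in support.items() if len(heads) >= min_support}


def instantiate(event: str, abstract_triples: set[Triple]) -> list[tuple[str, str]]:
    """Step 4: conceptualize a new event and read off matching abstract inferences."""
    abstract_events = {a for span in identify(event) for a in conceptualize(event, span)}
    return sorted({(t.relation, t.tail) for t in abstract_triples if t.head in abstract_events})


kg = [
    Triple("PersonX drinks coffee", "xEffect", "feels refreshed"),
    Triple("PersonX drinks tea", "xEffect", "feels refreshed"),
    Triple("PersonX drinks coffee", "xEffect", "stays awake"),
]
abstract = build_abstract_triples(kg)
# "PersonX drinks juice" is not in the toy graph, but its concept is.
print(instantiate("PersonX drinks juice", abstract))
```

In this toy run, "feels refreshed" is kept because two different instances support it, while "stays awake" is dropped because only the coffee event supports it, so it is not treated as typical of the abstract event.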
Application Evaluation
The researchers evaluated the performance of incorporating this abstract knowledge base into existing commonsense models on downstream tasks:
1) Commonsense Modeling: Incorporating abstract knowledge into the training of generative commonsense models such as COMET can significantly improve the model’s performance on the ATOMIC dataset.
2) Zero-Shot Commonsense Question Answering: Adding abstract knowledge to the synthetic question-answer pairs used for training can significantly improve the performance of pretrained language models (such as DeBERTa) on multiple commonsense QA benchmarks, with an average improvement of 1.4%, surpassing the level of ChatGPT.
3) Transfer to ConceptNet: Preliminary attempts show that the constructed language model-based concept generator can be successfully applied to other commonsense knowledge bases like ConceptNet.
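The synthetic question-answer pairs mentioned in item 2 can be pictured with a small sketch. The source does not describe how the pairs are built, so the relation templates, the distractor sampling strategy, and the function name `make_qa` are all hypothetical, chosen only to show how a triple (abstract or concrete) could become a multiple-choice item.

```python
import random

# Hypothetical question templates per relation (illustrative only).
TEMPLATES = {
    "xEffect": "{head}. As a result, PersonX",
    "xWant": "{head}. After that, PersonX wants",
}


def make_qa(triple, all_triples, n_distractors=2, rng=None):
    """Turn one (head, relation, tail) triple into a multiple-choice item.

    Distractors are tails of the same relation that are NOT attached to this
    head, so a known valid answer is never used as a wrong option.
    """
    rng = rng or random.Random(0)
    head, relation, tail = triple
    valid = {t for h, r, t in all_triples if h == head and r == relation}
    pool = sorted({t for _, r, t in all_triples if r == relation and t not in valid})
    distractors = rng.sample(pool, min(n_distractors, len(pool)))
    options = distractors + [tail]
    rng.shuffle(options)
    return {
        "question": TEMPLATES[relation].format(head=head),
        "options": options,
        "answer": options.index(tail),
    }


triples = [
    ("PersonX drinks [beverage]", "xEffect", "feels refreshed"),
    ("PersonX drinks [beverage]", "xEffect", "is no longer thirsty"),
    ("PersonX plays [instrument]", "xEffect", "gets applause"),
    ("PersonX misses the bus", "xEffect", "is late for work"),
]
qa = make_qa(triples[0], triples)
print(qa["question"])
for i, option in enumerate(qa["options"]):
    print(i, option)
```

Excluding every tail already known to be valid for the head is a simple guard against false negatives among the distractors; a real pipeline would likely need stronger filtering.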
This research systematically addresses how to introduce conceptualization into commonsense modeling and reasoning. It proposes a process for acquiring abstract commonsense knowledge and demonstrates that incorporating this knowledge into existing systems can significantly improve performance, which may help artificial intelligence systems reason with commonsense more effectively.