mšŸixkg: Mixing for harder negative samples in knowledge graph

Academic Report

Background

A Knowledge Graph (KG) is structured data that records information about entities and relationships, widely used in question-answering systems, information retrieval, machine reading, and other fields. Knowledge Graph Embedding (KGE) technology maps the entities and relationships in the graph into a low-dimensional dense vector space, significantly enhancing the performance of related applications. In the training process of KGE models, however, generating high-quality negative samples is crucial.

Currently, mainstream KGE models face numerous challenges in generating negative samples. Some models draw negative samples from simple static distributions, such as the uniform or Bernoulli distribution, but the resulting negatives are often too easy for the model to distinguish. Moreover, existing methods typically select negative samples only from entities already present in the knowledge graph, which limits their ability to generate harder negative samples.

This paper proposes a novel mixing strategy called m²ixkg, which adopts two mixing operations to generate harder negative samples: one mixes heads and tails under the same relation to enhance the robustness and generalization ability of entity embeddings; the other mixes high-scoring negative samples to produce even harder negatives. The paper aims to address the limitations of existing methods in generating high-quality negative samples and validates the effectiveness of m²ixkg through experiments.

Paper Source

The paper, titled “m²ixkg: Mixing for Harder Negative Samples in Knowledge Graphs”, was written by Feihu Che and Jianhua Tao from Tsinghua University and appears in the journal Neural Networks in 2024.

Research Process

This paper details the research process of m²ixkg, including the following main steps:

1. Dataset and Model Selection

The research uses three widely recognized benchmark datasets: FB15k-237, WN18, and WN18RR. These datasets originate from well-known knowledge bases such as Freebase and WordNet. The scoring functions selected include TransE, RotatE, DistMult, and ComplEx, which are classic models in current KGE research.
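For context, these four scoring functions have well-known standard forms. The following is a minimal NumPy sketch of those standard forms (not the authors' implementation), assuming the head, relation, and tail embeddings h, r, and t are already given as vectors, complex-valued for ComplEx and RotatE.

```python
import numpy as np

# Standard scoring functions of the four models (higher score = more plausible).
# h, r, t are the embedding vectors of the head, relation, and tail.

def transe_score(h, r, t, p=1):
    # TransE: -||h + r - t||_p
    return -np.linalg.norm(h + r - t, ord=p)

def distmult_score(h, r, t):
    # DistMult: <h, r, t> = sum_i h_i * r_i * t_i
    return np.sum(h * r * t)

def complex_score(h, r, t):
    # ComplEx: Re(<h, r, conj(t)>) with complex-valued embeddings
    return np.real(np.sum(h * r * np.conj(t)))

def rotate_score(h, r, t):
    # RotatE: -||h ∘ r - t||, where r encodes an element-wise rotation (|r_i| = 1)
    return -np.linalg.norm(h * r - t)
```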

2. Experimental Setup

The experimental setup includes training the model using the Adam optimizer and performing hyperparameter tuning on the validation set. The hyperparameters in the study include batch size, fixed margin, negative sample set size, and mixing coefficients.
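As an illustration only, a configuration covering these hyperparameters might look like the sketch below; the concrete values are hypothetical placeholders, not the settings reported in the paper.

```python
# Hypothetical hyperparameter configuration (placeholder values,
# not the settings reported in the paper).
config = {
    "optimizer": "Adam",
    "learning_rate": 1e-4,   # tuned on the validation set (assumed range)
    "batch_size": 512,       # batch size
    "margin": 9.0,           # fixed margin
    "num_negatives": 256,    # negative sample set size
    "mix_alpha": 0.5,        # mixing coefficient (assumed interpolation parameter)
}
```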

3. mĀ²ixkg Methodology

m²ixkg includes two main mixing operations: mixing heads and tails (mix1) and mixing between hard negative samples (mix2). Specifically:

- Mix1 (mixing heads and tails): the head entity serves as the input, the relation as the model encoding, and the tail entity as the label. Triplets that share the same relation are mixed with one another, and the resulting virtual triplets enhance the model's generalization ability (see the sketch below).
- Mix2 (mixing between hard negative samples): high-scoring negative samples are selected from the already sampled negatives and mixed with each other to generate more challenging negative samples.
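A minimal sketch of mix1 is given below, under the assumption that "mixing" means mixup-style linear interpolation of the embeddings of two triplets that share the same relation; the function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def mix1_same_relation(h1, t1, h2, t2, alpha=0.2):
    """Mixup-style interpolation of two triplets (h1, r, t1) and (h2, r, t2)
    that share the same relation r. Returns the head and tail embeddings of a
    virtual triplet. Illustrative sketch, not the authors' code."""
    lam = np.random.beta(alpha, alpha)    # interpolation coefficient
    h_mix = lam * h1 + (1.0 - lam) * h2   # mixed head embedding
    t_mix = lam * t1 + (1.0 - lam) * t2   # mixed tail embedding
    return h_mix, t_mix
```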

The specific steps of mix2 are as follows (a sketch follows the list):

1. Randomly select entities from the knowledge graph to form a candidate set of negative samples.
2. Calculate the scores of these candidates and sample from them according to a score-based probability distribution.
3. Randomly select pairs of the sampled negatives and linearly interpolate their tail entities to form the mixed negatives.
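These three steps can be sketched as follows. The softmax-style score-based sampling and the Beta-distributed interpolation coefficient are assumptions about how the sampling and mixing are realized; the scoring function is passed in as a callable.

```python
import numpy as np

def mix2_hard_negatives(score_fn, h, r, entity_emb, n_candidates=64,
                        n_hard=8, alpha=0.5, rng=np.random.default_rng()):
    """Sketch of mix2: build harder negatives by mixing high-scoring ones.
    score_fn(h, r, t) -> scalar; entity_emb is the (num_entities, dim) table.
    Softmax sampling and Beta-distributed mixing weights are assumptions."""
    # 1. Randomly pick candidate tail entities from the knowledge graph.
    idx = rng.choice(len(entity_emb), size=n_candidates, replace=False)
    candidates = entity_emb[idx]

    # 2. Score the candidates and sample hard negatives with score-based probabilities.
    scores = np.array([score_fn(h, r, t) for t in candidates])
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    hard_idx = rng.choice(n_candidates, size=2 * n_hard, replace=False, p=probs)
    hard = candidates[hard_idx]

    # 3. Pair up the sampled negatives and interpolate their tail embeddings.
    lam = rng.beta(alpha, alpha, size=(n_hard, 1))
    mixed_tails = lam * hard[:n_hard] + (1.0 - lam) * hard[n_hard:]
    return mixed_tails
```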

4. Loss Function

The loss functions used in this paper fall into two categories, one for each family of scoring functions:

- Translational distance models, such as TransE.
- Semantic matching models, such as DistMult and ComplEx.
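As an illustration, commonly used loss formulations for these two families are the margin-based ranking loss and the negative-sampling logistic loss; the sketch below shows these standard forms as an assumption, not the paper's exact loss functions.

```python
import numpy as np

def margin_ranking_loss(pos_score, neg_scores, gamma=9.0):
    # Translational distance models (e.g. TransE):
    # mean over negatives of max(0, gamma - f(pos) + f(neg)), with a fixed margin gamma.
    return np.mean(np.maximum(0.0, gamma - pos_score + neg_scores))

def logistic_loss(pos_score, neg_scores):
    # Semantic matching models (e.g. DistMult, ComplEx):
    # -log sigmoid(f(pos)) - mean(log sigmoid(-f(neg))).
    softplus = lambda x: np.logaddexp(0.0, x)   # numerically stable log(1 + e^x)
    return softplus(-pos_score) + np.mean(softplus(neg_scores))
```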

The setting of the loss function is crucial in model training and directly impacts the model’s performance.

Research Results

The research validates the effectiveness of the m²ixkg method through experiments. The results show that this method outperforms existing negative sample generation algorithms in multiple scenarios.

1. Experimental Results and Analysis

Compared to other classic negative sample generation methods, m²ixkg shows significant improvements in evaluation metrics such as MRR and Hits@10. Specifically, m²ixkg achieves average MRR improvements of 0.0025 and 0.0011 on the FB15k-237 and WN18RR datasets, respectively, and significant Hits@10 improvements of 0.21, 0.14, 0.94, and 0.27.

2. Ablation Study

The ablation study further verifies the contribution of the mixing operations to the model’s performance improvement. The results show that mix1 and mix2 significantly enhance the model’s performance across different scoring functions and datasets, with better results when used in combination.

Conclusion and Significance

The m²ixkg method proposed in this research generates harder negative samples through mixing operations, a simple yet effective technique for enhancing the performance of knowledge graph embedding models. The research validates the positive impact of mixing operations on knowledge graph embeddings: mixing head and tail entities enhances the generalization and robustness of the learned embeddings, while mixing hard negative samples yields more challenging negatives and improves the model's ability to distinguish between positive and negative samples.

The significant contributions of the m²ixkg method are as follows:

- Generating high-quality negative samples by incorporating virtual entities.
- Enhancing the generalization of the learned embeddings by mixing heads and tails under the same relation.
- Designing a soft quantity selection mechanism for different head-relation pairs to precisely select hard negative samples.

In conclusion, this paper provides a new perspective and method for generating hard negative samples, and validates its broad applicability across multiple datasets and scoring functions, offering a new avenue for optimizing KGE models.