Deep Graph Memory Network for Forgetting-Robust Knowledge Tracing

In recent years, Knowledge Tracing (KT) has attracted widespread attention as an important method for personalized learning. The goal of KT is to predict whether a student will answer new questions correctly by using their past answer history to estimate their knowledge state. However, current KT methods still face several challenges, including modeling forgetting behavior and identifying relationships between latent concepts. To address these issues, the paper proposes a novel KT model called Deep Graph Memory Network (DGMN). This article outlines the design of the DGMN model, the experimental process, and its performance on various datasets.

Research Background

Since it was first posed, the knowledge-tracing problem has been a significant research direction in education. Its core aim is to predict the probability that a student will correctly answer a future question based on their historical answer data. Early KT methods were mainly Bayesian and state-space models, such as Hidden Markov Models (HMMs). Although conceptually simple, these methods rely on overly simplified assumptions about knowledge states and latent concepts, and their inference can be computationally complex.

In recent years, deep learning methods have been introduced into the KT field, using deep neural networks to model question-answer sequences, and have significantly improved prediction accuracy. For example, Piech et al. proposed the Deep Knowledge Tracing (DKT) model, which uses a Recurrent Neural Network (RNN) to track students’ knowledge states.

[Figure: Schematic of the DGMN model]

Although deep learning methods have made significant progress in the KT domain, challenges in modeling forgetting behavior and identifying relationships between latent concepts remain. To address these, the DGMN model introduced in this paper incorporates a forgetting gate mechanism within an attention memory structure to dynamically capture forgetting behavior during knowledge tracing.

Article Source

This paper was written by Ghodai Abdelrahman and Qing Wang from the Australian National University (ANU) and was published in the IEEE Transactions on Knowledge and Data Engineering (TKDE) journal on September 9, 2022.

Research Methods and Process

Method Overview

The DGMN model combines two main components, an attention memory and a latent concept graph, and proposes a new forgetting modeling mechanism built from the following parts (a sketch of the attention read follows this list):

  1. Concept Embedding Memory: stores an embedding vector for each latent concept and uses an attention mechanism to calculate the relevance between the current question and the stored embeddings.
  2. Concept State Memory: stores the student’s current knowledge state and uses an attention mechanism to read the relevant knowledge state from the answer sequence.
  3. Forget Gating Mechanism: combines forgetting features with the current knowledge state, dynamically adjusting the knowledge state based on past answer sequences for the final answer prediction.
  4. Latent Concept Graph: extracts relationships between latent concepts through a Graph Convolutional Network (GCN) and weights these relationships during prediction.
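As a concrete illustration of how the two memories interact, here is a minimal PyTorch sketch of the attention read, assuming a DKVMN-style key-value memory layout; the function and variable names are illustrative, not the authors’ implementation.

```python
import torch
import torch.nn.functional as F

def attention_read(question_emb, M_k, M_v):
    """Read a knowledge state from memory via attention (illustrative sketch).

    M_k: concept embedding (key) memory, shape (num_concepts, key_dim)
    M_v: concept state (value) memory,   shape (num_concepts, state_dim)
    """
    scores = M_k @ question_emb          # inner product with each concept key
    weights = F.softmax(scores, dim=0)   # relevance distribution over concepts
    read_state = weights @ M_v           # attention-weighted knowledge state
    return weights, read_state

# Example with assumed sizes: 50 latent concepts, 64-d keys, 128-d states.
M_k, M_v = torch.randn(50, 64), torch.randn(50, 128)
weights, state = attention_read(torch.randn(64), M_k, M_v)
```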

Specific Process

  1. Question and Answer Embedding: Given a set of questions, DGMN first embeds the question vectors and stores these embeddings in the memory matrix.
  2. Attention Mechanism Calculation: Calculates the relevance distribution between the current question embedding and the memory matrix through an inner product, forming a relevance vector.
  3. Reading Relevant Knowledge State: Reads the corresponding knowledge state information from the concept state memory based on the relevance vector.
  4. Building Forgetting Features: Calculates forgetting features over the question-answer sequence, such as time intervals and the number of past attempts, and combines them with the knowledge state via the forgetting gate mechanism (see the gate sketch after this list).
  5. Updating Memory: Updates the stored knowledge state using a new vector generated through the gating mechanism based on the latest question-answer data.
  6. Constructing the Latent Concept Graph: Uses a GCN to extract relationships between latent concepts from the embedding matrix, dynamically adjusts the graph structure, and tracks the relationships as the student’s knowledge state changes (see the GCN sketch after this list).
  7. Predicting Answers: Inputs the combined attention memory information and latent concept graph relationships into a fully connected layer for correct probability prediction.
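For steps 4 and 5, a minimal sketch of a forgetting gate is shown below, assuming the gate is a sigmoid over the concatenated knowledge state and forgetting features; the class name and exact gating form are assumptions, not the paper’s equations.

```python
import torch
import torch.nn as nn

class ForgetGate(nn.Module):
    """Element-wise forgetting gate applied to a read knowledge state (sketch)."""

    def __init__(self, state_dim, forget_feat_dim):
        super().__init__()
        self.gate = nn.Linear(state_dim + forget_feat_dim, state_dim)

    def forward(self, read_state, forget_features):
        # forget_features might encode the time gap since a concept was last
        # practised and the number of past attempts (step 4 above).
        z = torch.cat([read_state, forget_features], dim=-1)
        g = torch.sigmoid(self.gate(z))   # per-dimension retention in (0, 1)
        return g * read_state             # knowledge state after forgetting
```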
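For step 6, the sketch below shows a generic Kipf & Welling-style graph convolution layer over the latent concept graph; DGMN’s exact propagation rule, and how the adjacency matrix is built, may differ.

```python
import torch
import torch.nn as nn

class ConceptGCNLayer(nn.Module):
    """One graph-convolution layer over the latent concept graph (sketch)."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, H, A):
        # H: concept features, shape (num_concepts, in_dim)
        # A: adjacency between latent concepts, e.g. from embedding similarity
        A_hat = A + torch.eye(A.size(0))         # add self-loops
        d_inv_sqrt = A_hat.sum(dim=1).pow(-0.5)  # D^{-1/2} of the degree matrix
        A_norm = d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]
        return torch.relu(self.linear(A_norm @ H))  # propagate and transform
```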

Experimental Setup and Datasets

The study conducted experiments on four widely used benchmark datasets:

  1. ASSISTments2009: Contains school mathematics problems, collected during the 2009-10 academic year, with 110 questions, 4151 students, and a total of 325,637 question-answer pairs.
  2. Statics2011: Data collected from Carnegie Mellon University engineering courses, containing 1223 problems, 335 students, and a total of 189,297 question-answer pairs.
  3. Synthetic-5: Simulated data from the authors of the DKT model, containing 4000 students, 50 questions, and a total of 200,000 answers.
  4. KDDCup2010: Based on algebra course data from 2005-06, with 436 questions, 575 students, and a total of 607,026 question-answer pairs.

Model Optimization

The model was optimized using the Adam optimizer, with the memory matrix and embedding matrix parameters initialized from a zero-mean Gaussian distribution. A cross-entropy loss was minimized via gradient descent; a runnable sketch of this setup is shown below.
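The following is a runnable sketch of this setup using a toy stand-in model rather than the full DGMN architecture; the hyperparameter values (learning rate, dimensions, standard deviation) are assumptions.

```python
import torch
import torch.nn as nn

class ToyKTModel(nn.Module):
    """Stand-in model, not the DGMN architecture, to show the training setup."""

    def __init__(self, num_questions=110, dim=64, num_concepts=50):
        super().__init__()
        self.embedding = nn.Embedding(num_questions, dim)
        self.memory = nn.Parameter(torch.empty(num_concepts, dim))
        self.out = nn.Linear(dim, 1)

    def forward(self, q_ids):
        scores = self.embedding(q_ids) @ self.memory.T       # attention logits
        state = torch.softmax(scores, dim=-1) @ self.memory  # attended read
        return torch.sigmoid(self.out(state)).squeeze(-1)    # P(correct)

model = ToyKTModel()
# Memory and embedding parameters drawn from a zero-mean Gaussian, as described.
nn.init.normal_(model.memory, mean=0.0, std=0.1)
nn.init.normal_(model.embedding.weight, mean=0.0, std=0.1)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.BCELoss()  # binary cross-entropy on correct/incorrect labels

q_ids = torch.randint(0, 110, (32,))         # dummy batch of question ids
labels = torch.randint(0, 2, (32,)).float()  # dummy correctness labels
optimizer.zero_grad()
loss = criterion(model(q_ids), labels)
loss.backward()
optimizer.step()
```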

Experimental Results and Discussion

Model Performance Comparison

The experimental results show that DGMN outperforms state-of-the-art KT models on all four datasets. Compared with models such as SAINT+, AKT, and DKVMN, it demonstrated significant performance improvements and strong generalization across different datasets.

Ablation Study

Comparison experiments on different model variants found that the latent concept graph, the forgetting gate mechanism, and the question-ordering technique each significantly enhance DGMN’s performance. When any of these modules was removed, the model’s AUC dropped noticeably, indicating each component’s contribution to overall performance.

Latent Concept Graph Analysis

Latent concept graph analysis on the ASSISTments2009 and Statics2011 datasets visualized the relationships between latent concepts, further validating DGMN’s effectiveness in tracking knowledge states and capturing concept relationships.

Forgetting Features Modeling Analysis

Comparing heatmaps of predicted question accuracy for DGMN and the DKT+Forget model showed that DGMN more accurately captures forgetting behavior across different concepts, further validating the effectiveness of its forgetting mechanism.

Significance and Value of the Research

The DGMN model provides an efficient way to dynamically incorporate forgetting behavior and the relationships between latent concepts into the knowledge tracing process. This is valuable not only for research but also for practical education scenarios, such as personalized teaching, learning path optimization, and question recommendation on online education platforms. Future work could further explore applying latent concept graphs to course learning and student practice recommendation, continuing to improve the model’s predictive ability and applicability.