A Comprehensive Survey of Loss Functions and Metrics in Deep Learning

Deep learning, as a crucial branch of artificial intelligence, has achieved significant progress in recent years across fields such as computer vision and natural language processing. Its success, however, depends heavily on the choice of loss functions and performance metrics: loss functions measure the difference between model predictions and true values and guide the optimization process, while performance metrics evaluate the model's performance on unseen data. Despite their importance, researchers and practitioners often struggle to determine the most suitable methods for a specific task, given the multitude of options available.

To address this, the article aims to provide a comprehensive review of the most commonly used loss functions and performance metrics in deep learning, helping researchers and practitioners understand and select the appropriate tools for their tasks. It covers not only classic regression and classification tasks but also delves into loss functions and performance metrics in fields such as computer vision, natural language processing (NLP), and retrieval-augmented generation (RAG).

Source of the Paper

This article is co-authored by Juan Terven, Diana-Margarita Cordova-Esparza, Julio-Alejandro Romero-González, Alfonso Ramírez-Pedraza, and E. A. Chávez-Urbiola, affiliated with Mexican research institutions including the Instituto Politécnico Nacional (CICATA-Querétaro) and the Universidad Autónoma de Querétaro. The article was accepted on March 13, 2025, and published in the journal Artificial Intelligence Review (DOI: 10.1007/s10462-025-11198-7).

Key Points

1. The Difference Between Loss Functions and Performance Metrics

Loss functions and performance metrics play different roles in deep learning. Loss functions are used during training to optimize model parameters: they measure the difference between model predictions and true values, and optimization methods such as gradient descent minimize this difference. Performance metrics, on the other hand, are used after training to evaluate the model's generalization ability and to compare different models or configurations. The article details four key differences between loss functions and performance metrics: their timing of use, selection criteria, optimization goals, and interpretability.
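This division of roles can be seen in a minimal NumPy sketch (not from the paper; the toy model and names are illustrative): a mean-squared-error loss drives the gradient updates during training, while a separate metric (here MAE) is computed only afterward to judge the fit.

```python
import numpy as np

# Toy regression data: y = 2x, so the optimal weight is w = 2.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

w = 0.0   # single learnable parameter
lr = 0.02
for _ in range(500):
    pred = w * x
    # The MSE loss guides training: its gradient w.r.t. w is 2*mean((pred - y)*x).
    grad = 2.0 * np.mean((pred - y) * x)
    w -= lr * grad

# After training, a separate performance metric (MAE) evaluates the model.
mae = np.mean(np.abs(w * x - y))
```

Note that the metric never influences the weight update; it only summarizes how well the trained model generalizes.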

2. Properties of Loss Functions

When selecting a loss function, several properties must be considered, including convexity, differentiability, robustness, smoothness, sparsity, and monotonicity. These properties determine the suitability of loss functions for different tasks. For example, convexity guarantees that any local minimum is also a global minimum, while differentiability allows the use of gradient-based optimization methods.
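These properties have concrete consequences for the gradients an optimizer sees. A small sketch (illustrative, not from the paper) compares the gradients of MSE, MAE, and Huber loss as a function of the residual:

```python
import numpy as np

residuals = np.linspace(-3.0, 3.0, 7)   # [-3, -2, -1, 0, 1, 2, 3]

# MSE: convex and smooth everywhere; the gradient grows linearly with the
# residual, which is why outliers dominate the updates.
mse_grad = 2.0 * residuals

# MAE: convex but not differentiable at 0; the gradient is a constant +/-1,
# which is robust to outliers but does not shrink near the minimum.
mae_grad = np.sign(residuals)

# Huber (delta = 1): quadratic near zero, linear in the tails, so it is
# differentiable everywhere yet more robust to outliers than MSE.
delta = 1.0
huber_grad = np.where(np.abs(residuals) <= delta,
                      residuals,
                      delta * np.sign(residuals))
```

For a residual of 3, MSE contributes a gradient of 6 while Huber caps it at 1, illustrating the robustness trade-off in gradient terms.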

3. Loss Functions and Performance Metrics in Regression Tasks

Regression tasks involve predicting continuous values in supervised learning. The article provides a detailed introduction to commonly used loss functions in regression, including Mean Squared Error (MSE), Mean Absolute Error (MAE), Huber Loss, Log-Cosh Loss, Quantile Loss, and Poisson Loss. Each loss function has its specific advantages and limitations. For instance, MSE is sensitive to outliers, while MAE is more robust.
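The trade-offs among these losses are easy to see in code. The following is a minimal NumPy sketch of several of the regression losses named above (function names and the toy data are my own, not the paper's); the last data point is an outlier, so MSE blows up while MAE and Huber stay moderate:

```python
import numpy as np

def mse(y_true, y_pred):
    return np.mean((y_true - y_pred) ** 2)

def mae(y_true, y_pred):
    return np.mean(np.abs(y_true - y_pred))

def huber(y_true, y_pred, delta=1.0):
    r = np.abs(y_true - y_pred)
    quadratic = 0.5 * r ** 2                  # used for small residuals
    linear = delta * (r - 0.5 * delta)        # used for large residuals
    return np.mean(np.where(r <= delta, quadratic, linear))

def log_cosh(y_true, y_pred):
    return np.mean(np.log(np.cosh(y_pred - y_true)))

def quantile_loss(y_true, y_pred, q=0.5):
    e = y_true - y_pred
    return np.mean(np.maximum(q * e, (q - 1) * e))

y_true = np.array([1.0, 2.0, 3.0, 100.0])     # last target is an outlier
y_pred = np.array([1.1, 1.9, 3.2, 10.0])
```

On this data the squared error of the outlier dominates MSE (roughly 2025) while MAE is about 22.6, and the quantile loss at q = 0.5 reduces to half the MAE, as expected from its definition.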

In terms of performance metrics, the article discusses Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), Symmetric MAPE (SMAPE), Coefficient of Determination (R²), and Adjusted R². These metrics have their respective application scenarios and pros and cons. For example, RMSE is sensitive to outliers, while MAPE is more suitable for scenarios where relative error is important.
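These regression metrics can be sketched in a few lines of NumPy (the helper names are illustrative; MAPE is assumed to be computed only on nonzero targets, since it is undefined otherwise):

```python
import numpy as np

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def mape(y_true, y_pred):
    # Assumes y_true has no zeros; otherwise the ratio is undefined.
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

def smape(y_true, y_pred):
    # Symmetric variant: bounded and less biased toward small targets.
    denom = (np.abs(y_true) + np.abs(y_pred)) / 2.0
    return 100.0 * np.mean(np.abs(y_true - y_pred) / denom)

def r2(y_true, y_pred):
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def adjusted_r2(y_true, y_pred, n_features):
    # Penalizes R^2 for the number of predictors used.
    n = len(y_true)
    return 1.0 - (1.0 - r2(y_true, y_pred)) * (n - 1) / (n - n_features - 1)
```

A perfect prediction yields RMSE = 0 and R² = 1, and predicting 110 for a true value of 100 gives a MAPE of exactly 10%, which shows why MAPE is the natural choice when relative error matters.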

4. Loss Functions and Performance Metrics in Classification Tasks

Classification tasks involve predicting discrete labels in supervised learning. The article provides a detailed introduction to commonly used loss functions in classification, including Binary Cross-Entropy Loss (BCE), Categorical Cross-Entropy Loss (CCE), Sparse CCE, Weighted Cross-Entropy Loss, Cross-Entropy Loss with Label Smoothing, Negative Log-Likelihood Loss (NLL), PolyLoss, and Hinge Loss. These loss functions have their respective advantages in different scenarios. For example, weighted cross-entropy loss can effectively handle class imbalance.
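Several of these cross-entropy variants differ only in how the targets or per-example terms are weighted, which a short NumPy sketch makes explicit (function names and the smoothing/weighting parameters are illustrative, not the paper's notation):

```python
import numpy as np

def bce(y_true, p, eps=1e-12):
    # Binary cross-entropy; clipping avoids log(0).
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p))

def cce(y_onehot, probs, eps=1e-12):
    # Categorical cross-entropy over one-hot targets.
    return -np.mean(np.sum(y_onehot * np.log(np.clip(probs, eps, 1.0)), axis=1))

def cce_label_smoothing(y_onehot, probs, alpha=0.1):
    # Mixes the one-hot target with a uniform distribution over K classes.
    k = y_onehot.shape[1]
    smoothed = (1 - alpha) * y_onehot + alpha / k
    return cce(smoothed, probs)

def weighted_bce(y_true, p, pos_weight=1.0, eps=1e-12):
    # Up-weights the positive class to counter class imbalance.
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(pos_weight * y_true * np.log(p)
                    + (1 - y_true) * np.log(1 - p))
```

Label smoothing raises the loss of an over-confident correct prediction, discouraging the model from pushing probabilities to exactly 0 or 1, while the positive-class weight simply scales the positive term of BCE.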

In terms of performance metrics, the article discusses the Confusion Matrix, Accuracy, Precision, Recall, F1-Score, Specificity, False Positive Rate (FPR), Negative Predictive Value (NPV), and False Discovery Rate (FDR). These metrics help comprehensively evaluate the performance of classification models. For instance, the F1-Score is particularly important in imbalanced datasets.
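All of these classification metrics derive from the four cells of the binary confusion matrix, as the following sketch shows (illustrative helper names; it assumes a binary task with labels 0/1 and at least one example in each relevant cell, since some ratios are otherwise undefined):

```python
import numpy as np

def confusion_counts(y_true, y_pred):
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return tp, tn, fp, fn

def binary_metrics(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy":    (tp + tn) / (tp + tn + fp + fn),
        "precision":   precision,
        "recall":      recall,
        "f1":          2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),
        "fpr":         fp / (fp + tn),
        "npv":         tn / (tn + fn),
        "fdr":         fp / (fp + tp),
    }

y_true = np.array([1, 1, 1, 0, 0, 0, 0, 0])
y_pred = np.array([1, 1, 0, 1, 0, 0, 0, 0])
m = binary_metrics(y_true, y_pred)
```

On this imbalanced toy set accuracy is 0.75 even though only two of three positives are found, while the F1-score of 2/3 reflects the balance of precision and recall, illustrating why F1 is preferred for imbalanced data.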

5. Loss Functions and Performance Metrics in Computer Vision and Natural Language Processing

The article also delves into loss functions and performance metrics in the fields of computer vision and natural language processing. In computer vision, commonly used loss functions include cross-entropy loss, Focal Loss, and Contrastive Loss, while performance metrics include Mean Intersection over Union (mIoU) and Average Precision (AP). In natural language processing, commonly used loss functions include cross-entropy loss and contrastive loss, while performance metrics include BLEU, ROUGE, and Perplexity.
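Three of the quantities mentioned here are simple enough to sketch directly: the binary focal loss of Lin et al., box IoU (the building block behind mIoU and AP), and perplexity. This is a minimal illustration under my own naming and parameter choices, not the paper's implementation:

```python
import numpy as np

def focal_loss(y_true, p, gamma=2.0, alpha=0.25, eps=1e-12):
    # Binary focal loss: (1 - p_t)^gamma down-weights easy, well-classified
    # examples so training focuses on hard ones.
    p = np.clip(p, eps, 1 - eps)
    p_t = np.where(y_true == 1, p, 1 - p)
    alpha_t = np.where(y_true == 1, alpha, 1 - alpha)
    return -np.mean(alpha_t * (1 - p_t) ** gamma * np.log(p_t))

def iou(box_a, box_b):
    # Intersection over Union for axis-aligned boxes (x1, y1, x2, y2).
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def perplexity(token_probs):
    # exp of the average negative log-likelihood of the observed tokens.
    return np.exp(-np.mean(np.log(token_probs)))
```

A model that assigns uniform probability 1/4 to each token has perplexity exactly 4, matching the intuition that perplexity is the effective number of equally likely choices per token.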

6. Loss Functions and Performance Metrics in Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) combines a retrieval component with a generative model and is widely used in question-answering systems and text generation tasks. The article provides a detailed introduction to the loss functions commonly used in RAG, including cross-entropy loss and contrastive loss, as well as performance metrics such as Answer Semantic Similarity, Answer Correctness, and Context Relevance. These metrics help evaluate the faithfulness and relevance of generated text.
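At its core, Answer Semantic Similarity is typically computed as the cosine similarity between embedding vectors of the generated answer and a reference answer. A minimal sketch, assuming the embeddings have already been produced by some embedding model (which is the hard part and is not shown here):

```python
import numpy as np

def answer_semantic_similarity(emb_answer, emb_reference):
    # Cosine similarity between the embedding of the generated answer and
    # the embedding of the reference answer; the embedding model is assumed.
    a = np.asarray(emb_answer, dtype=float)
    b = np.asarray(emb_reference, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Identical embeddings score 1.0 and orthogonal embeddings score 0.0; metrics such as Answer Correctness and Context Relevance build richer judgments (often LLM-assisted) on top of this kind of similarity signal.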

Significance and Value of the Paper

The significance of this article lies in providing a comprehensive reference framework for loss functions and performance metrics for deep learning researchers and practitioners. By systematically analyzing loss functions and performance metrics across different tasks, the article helps readers better understand their selection criteria and application scenarios. Furthermore, the article proposes future research directions such as multi-loss setups and automated loss function search, offering new insights for the further development of deep learning.

Highlights Summary

  1. Comprehensiveness: The article covers loss functions and performance metrics across multiple fields, from classic regression and classification to computer vision, natural language processing, and retrieval-augmented generation, providing a broad reference.
  2. Practicality: By analyzing the pros and cons of each loss function and performance metric in detail, the article offers practical selection advice for researchers and practitioners.
  3. Forward-Looking: The article proposes future research directions such as automated loss function search and robust, interpretable evaluation metrics, providing new insights for the further development of deep learning.

This article serves not only as a detailed reference but also as an important guide for future research in the field of deep learning.