CLASH: Complementary Learning with Neural Architecture Search for Gait Recognition

CLASH: A Gait Recognition Framework Based on Complementary Learning and Neural Architecture Search

Research Background

Gait recognition is a biometric technology that identifies individuals by their walking patterns. It is widely used in security screening, video retrieval, and identity recognition because it works at a distance and does not require the subject's cooperation. However, methods based on human silhouettes have a key limitation: a binarized silhouette is a sparse boundary representation that carries little spatiotemporal information, so most silhouette pixels are insensitive to gait patterns. To increase sensitivity to gait patterns while keeping recognition robust, the paper introduces a framework called Complementary Learning with Neural Architecture Search (CLASH).

Figure: Neural network structure used for gait recognition

Paper Source

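The paper was written by Huanzhang Dou, Pengyi Zhang, Yuhan Zhao, Lu Jin, and Xi Li of Zhejiang University and Ant Group. The manuscript header reads “Journal of LaTeX Class Files, Vol. 14, No. 8, August 2021”; this is the default placeholder of the IEEE journal LaTeX template rather than an actual venue, so it most likely indicates a preprint prepared in that template.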

Research Process

The research process has three main parts: developing a gait descriptor, performing complementary learning, and validating the method experimentally.

Development of the Gait Descriptor

First, the authors propose a gait descriptor called the Dense Spatial-Temporal Field (DSTF), which captures subtle motion variations by replacing the binarized boundary representation with a dense, distance-based texture representation. The descriptor uses a Bidirectional Distance Transform (Bi-DT) to set each pixel's value to its distance from the nearest boundary pixel. Because the foreground and background differ in semantics and pixel distribution, the authors also propose a foreground/background separation strategy that explicitly distinguishes the two regions using a signed distance function and normalization.
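The idea of a signed, normalized distance field can be illustrated with a small sketch. It is not the authors' implementation: the 4-connected boundary definition, the brute-force Euclidean distance, the per-region max normalization, and the names `boundary_pixels` and `dense_field` are all assumptions made for illustration.

```python
import math

def boundary_pixels(sil):
    """Foreground pixels with at least one 4-connected background neighbour."""
    h, w = len(sil), len(sil[0])
    edges = []
    for r in range(h):
        for c in range(w):
            if not sil[r][c]:
                continue
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                # Assumption: pixels outside the frame count as background.
                outside = not (0 <= rr < h and 0 <= cc < w)
                if outside or not sil[rr][cc]:
                    edges.append((r, c))
                    break
    return edges

def dense_field(sil):
    """Signed distance to the nearest boundary pixel, normalized per region.

    Foreground values fall in [0, 1], background values in [-1, 0].
    """
    h, w = len(sil), len(sil[0])
    edges = boundary_pixels(sil)
    if not edges:
        return [[0.0] * w for _ in range(h)]
    # Brute-force Euclidean distance; fine for a toy grid, slow for real frames.
    dist = [[min(math.hypot(r - er, c - ec) for er, ec in edges)
             for c in range(w)] for r in range(h)]
    cells = [(r, c) for r in range(h) for c in range(w)]
    fg_max = max((dist[r][c] for r, c in cells if sil[r][c]), default=0.0)
    bg_max = max((dist[r][c] for r, c in cells if not sil[r][c]), default=0.0)
    field = [[0.0] * w for _ in range(h)]
    for r, c in cells:
        if sil[r][c]:
            field[r][c] = dist[r][c] / fg_max if fg_max > 0 else 0.0
        else:
            field[r][c] = -dist[r][c] / bg_max if bg_max > 0 else 0.0
    return field

# Toy 5x5 silhouette: a 3x3 foreground block on an empty background.
silhouette = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
field = dense_field(silhouette)
```

Unlike the binary silhouette, every pixel in `field` now changes when the boundary moves, which is the sensitivity the descriptor is meant to provide. Normalizing the two regions separately keeps their value ranges comparable even when their sizes differ.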

Complementary Learning

To combine the sensitivity of the DSTF descriptor with the robustness of human silhouettes, the paper proposes a complementary learning method based on Neural Architecture Search (NAS). Specifically, the authors design a task-specific search space and integrate silhouette and DSTF features through bi-level optimization and Multi-Descriptor (MD) cells.
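One common way to make such a search differentiable is a softmax-weighted mixture over candidate operations, as in DARTS-style NAS. The sketch below shows that relaxation for fusing two feature vectors. Whether CLASH uses exactly this relaxation is an assumption, and the candidate operations, `mixed_fusion`, and `derive_architecture` are hypothetical names, not the paper's search space.

```python
import math

# Hypothetical candidate fusion operations (not the paper's actual search space).
CANDIDATE_OPS = {
    "add": lambda s, d: [a + b for a, b in zip(s, d)],
    "max": lambda s, d: [max(a, b) for a, b in zip(s, d)],
    "silhouette_only": lambda s, d: list(s),
    "dstf_only": lambda s, d: list(d),
}

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def mixed_fusion(sil_feat, dstf_feat, alpha):
    """Weighted sum of all candidate outputs; weights come from softmax(alpha)."""
    weights = softmax([alpha[name] for name in CANDIDATE_OPS])
    out = [0.0] * len(sil_feat)
    for weight, op in zip(weights, CANDIDATE_OPS.values()):
        for i, value in enumerate(op(sil_feat, dstf_feat)):
            out[i] += weight * value
    return out

def derive_architecture(alpha):
    """After the search, keep only the operation with the largest parameter."""
    return max(CANDIDATE_OPS, key=lambda name: alpha[name])

# With uniform architecture parameters every candidate contributes equally.
uniform_alpha = {name: 0.0 for name in CANDIDATE_OPS}
fused = mixed_fusion([1.0, 2.0], [3.0, 0.0], uniform_alpha)

searched_alpha = {"add": 0.1, "max": 2.0, "silhouette_only": -1.0, "dstf_only": 0.3}
chosen = derive_architecture(searched_alpha)
```

In a bi-level setup of this kind, the network weights are typically fitted on training data while the architecture parameters (`alpha` here) are updated on held-out validation data. The sketch covers only the relaxation and the final discretization, not the optimization loop.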

Experimental Results

Experimental results show that the proposed method outperforms existing methods on multiple mainstream datasets in both laboratory and real-world environments.

Performance in Laboratory Environments

On the CASIA-B database, the CLASH framework achieves significant performance improvements under three common test conditions (normal walking, carrying a bag, and clothing variation). In particular, at a resolution of 128×88, the Rank-1 accuracy reaches 98.8%, 96.5%, and 89.3% for the three conditions, respectively.

On the OU-MVLP database, the CLASH framework achieves an average Rank-1 accuracy of 91.9% across all angles, significantly outperforming previous best methods.
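For readers unfamiliar with the metric, Rank-1 accuracy is the fraction of probe samples whose nearest gallery sample has the same identity. The sketch below is a minimal illustration under stated assumptions: it uses Euclidean distance on already-extracted feature vectors, and it does not model dataset-specific protocols such as per-view evaluation. The function name `rank1_accuracy` is hypothetical.

```python
import math

def rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids):
    """Share of probes whose closest gallery feature carries the same identity."""
    hits = 0
    for feat, pid in zip(probe_feats, probe_ids):
        # Index of the gallery sample with the smallest Euclidean distance.
        nearest = min(range(len(gallery_feats)),
                      key=lambda j: math.dist(feat, gallery_feats[j]))
        if gallery_ids[nearest] == pid:
            hits += 1
    return hits / len(probe_ids)

# Toy data: three gallery subjects and three probes, one of them mismatched.
gallery_feats = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
gallery_ids = ["a", "b", "c"]
probe_feats = [[0.1, 0.0], [0.9, 1.2], [4.0, 4.0]]
probe_ids = ["a", "b", "a"]

accuracy = rank1_accuracy(probe_feats, probe_ids, gallery_feats, gallery_ids)
```

Here two of the three probes retrieve the correct identity, so `accuracy` is two thirds.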

Performance in Real-World Environments

On the recent real-world datasets Gait3D and GREW, the CLASH framework improves the Rank-1 accuracy by 16.3% and 19.7%, respectively. It significantly outperforms silhouette-based methods and, under some conditions, even methods that rely on additional 3D information.

Research Conclusions and Value

The CLASH framework improves the accuracy and robustness of gait recognition by combining a dense, distance-based texture representation with NAS-based complementary learning. The DSTF descriptor increases sensitivity to gait patterns by capturing subtle motion variations, and its foreground/background separation strategy resolves numerical issues caused by the differing pixel distributions of the two regions. Complementary learning through NAS reduces the manual tuning workload and lets the different gait descriptors complement each other efficiently. Together, these contributions offer new ideas and tools for gait recognition research, with both scientific and practical value.

Highlights and Innovations

  1. Gait Descriptor DSTF: Enhances sensitivity to walking patterns through bidirectional distance transform and a foreground/background separation strategy.
  2. Neural Architecture Search for Complementary Learning: Utilizes NAS to automatically design complementary learning architectures, improving the integration of silhouette and DSTF features.
  3. Experimental Results: The CLASH framework exhibits excellent performance in both laboratory and real-world environments across multiple datasets, validating its effectiveness and robustness.

The proposed method advances the accuracy and robustness of gait recognition and provides solid technical support for practical applications such as security surveillance and identity recognition. Future research can test and optimize the method in more real-world scenarios to further the development and application of gait recognition technology.