    2025,34(4):1-17, DOI: 10.15888/j.cnki.csa.009839, CSTR: 32024.14.csa.009839
    Abstract:
    In recent years, face forgery technology has advanced so rapidly that synthesized faces are extremely hard for the human eye to identify. Its misuse by criminals seriously threatens social stability and personal privacy, which makes forged face detection increasingly important. This review systematically surveys the current state of forged face detection from two perspectives: forged face image detection and forged face video detection. For image detection, it discusses methods based on the image spatial and frequency domains, identity consistency detection, and face region localization. For video detection, it covers the integration of spatio-temporal features, the use of physiological features, and the combination of audiovisual information. The review also introduces commonly used evaluation metrics and analyzes a range of important datasets, including their characteristics and application scenarios. It points out limitations in the current literature, such as insufficient robustness to adversarial samples and poor adaptability of detection methods to new forgery techniques. Based on these analyses, it proposes possible future research directions, including optimizing cross-domain detection, exploring new algorithms, and studying model interpretability. The review gives researchers a comprehensive picture of forged face detection, indicates directions for subsequent research, and is of both theoretical and practical value.
    2025,34(4):18-33, DOI: 10.15888/j.cnki.csa.009865, CSTR: 32024.14.csa.009865
    Abstract:
    The neural radiance field (NeRF) has significant advantages in generating high-fidelity maps thanks to its neural implicit scene representation. Applying NeRF to simultaneous localization and mapping (SLAM), that is, NeRF-based SLAM, enables continuous 3D modeling alongside high-precision localization, and improves the quality and detail of scene reconstruction by rendering novel views and predicting unobserved regions. To track the latest results in this field, this study reviews and summarizes the key NeRF-based SLAM algorithms of recent years. First, the core principle of NeRF is introduced and an overview of the framework of NeRF-based SLAM methods is given. The review then focuses on improvements and optimizations of NeRF-based SLAM: improving the efficiency of the neural implicit representation, handling large-scale scene mapping, adding loop closure and global optimization to achieve global consistency, and mitigating dynamic interference. Finally, an outlook on NeRF-based SLAM is presented to provide a reference for researchers and encourage further innovative work.
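
The abstract refers to the core principle of NeRF without stating it. As a minimal sketch of the standard NeRF volume rendering quadrature (not of any specific SLAM system covered by the review), the color of a pixel is a weighted sum of the colors sampled along its camera ray, where each weight is the accumulated transmittance times the opacity of the sample. The function name `render_ray` and its list-based inputs are assumptions made for illustration.

```python
import math


def render_ray(sigmas, colors, deltas):
    """Composite samples along one ray with the NeRF quadrature rule.

    sigmas: volume density at each sample (front to back)
    colors: (r, g, b) tuple at each sample
    deltas: distance between consecutive samples
    Returns the rendered pixel color and the per-sample weights.
    """
    transmittance = 1.0  # fraction of light that has not been absorbed yet
    pixel = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha
        weights.append(weight)
        for channel in range(3):
            pixel[channel] += weight * color[channel]
        transmittance *= 1.0 - alpha  # light remaining after this segment
    return pixel, weights
```

In a NeRF-based SLAM pipeline the densities and colors would come from the learned implicit representation; here they are plain lists so that the compositing step can be read in isolation.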
    2025,34(4):34-44, DOI: 10.15888/j.cnki.csa.009819, CSTR: 32024.14.csa.009819
    Abstract:
    Monocular 3D object detection algorithms suffer from poor accuracy because objects at different depths appear at different scales in monocular images. To address this, a detection algorithm based on fused sampling and depth-scale constraints is proposed. First, to strengthen the ability of the sampled features to represent objects at different scales, a multi-scale fusion module (MFM) is constructed. It fuses sampled features from different levels and scales through hierarchical aggregation and iterative aggregation, improving the extraction of implicit scale features of the objects. Second, a depth-scale correlation module (DSCM) is constructed. It uses the linear projection constraint between depth and scale to rescale objects of different scales to the same feature level, balancing the model's attention to objects at different distances. Quantitative results on the KITTI and Waymo datasets show that the proposed algorithm improves the overall average precision AP3D by 1.56 and 3.07 percentage points, respectively, over comparable algorithms across multiple difficulty levels, which verifies its effectiveness and generalization. Qualitative results on the two datasets further show that the algorithm significantly mitigates the impact of object scale differences on detection performance.
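
The linear projection constraint between depth and scale mentioned above can be illustrated with the pinhole camera model, in which the image height of an object is inversely proportional to its depth. The sketch below is only an illustration of that relation under a pinhole assumption; the abstract does not describe how DSCM is implemented, and the names `projected_height` and `compensation_factor` are hypothetical.

```python
def projected_height(focal_px, object_height_m, depth_m):
    """Pinhole projection: image height in pixels of an object at a given depth."""
    return focal_px * object_height_m / depth_m


def compensation_factor(depth_m, reference_depth_m):
    """Scale factor that maps the apparent size at depth_m to the size
    the same object would have at reference_depth_m."""
    return depth_m / reference_depth_m


# Example: a 1.5 m tall object seen with a 700 px focal length.
near = projected_height(700.0, 1.5, 10.0)
far = projected_height(700.0, 1.5, 40.0)
rescaled_far = far * compensation_factor(40.0, 10.0)
```

After rescaling, the far object has the same apparent height as the near one, which is the sense in which objects at different distances can be brought to a common feature level.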
    2025,34(4):45-54, DOI: 10.15888/j.cnki.csa.009838, CSTR: 32024.14.csa.009838
    Abstract:
    To address the limited sample size and class imbalance of existing rural road image datasets, a data augmentation method based on an improved StyleGAN is proposed. The approach introduces a decoupled mapping network into the original StyleGAN framework to reduce the coupling of the W-space latent code. Combining the advantages of convolution and Transformer, the study also designs a convolution-coupled transfer block (CCTB), whose core cross-window self-attention mechanism enhances the network’s ability to capture complex context and spatial layouts. Ablation experiments comparing the original and improved StyleGAN networks show that the Inception Score (IS) increases from 42.38 to 77.31 and the Fréchet Inception Distance (FID) decreases from 25.09 to 12.42, indicating a substantial improvement in the quality and realism of the generated data. To verify the impact of data augmentation on model performance, two classic mainstream object detection algorithms are tested on the original and augmented datasets, and the performance differences further confirm the effectiveness of the improved method.
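
For readers unfamiliar with the FID reported above: it is the Fréchet distance between two Gaussians fitted to Inception features of real and generated images, so lower is better. The sketch below is a simplified illustration that assumes diagonal covariances (per-dimension variances), which avoids the matrix square root needed in the full metric; the function name `frechet_distance_diag` and the toy inputs are assumptions, and this is not the evaluation code used in the study.

```python
import math


def frechet_distance_diag(mu1, var1, mu2, var2):
    """Frechet distance between two Gaussians with diagonal covariances.

    mu1, mu2: per-dimension feature means
    var1, var2: per-dimension feature variances (non-negative)
    """
    # Squared distance between the means.
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    # Covariance term: tr(S1 + S2 - 2 * sqrt(S1 * S2)) for diagonal S1, S2.
    cov_term = sum(
        v1 + v2 - 2.0 * math.sqrt(v1 * v2) for v1, v2 in zip(var1, var2)
    )
    return mean_term + cov_term
```

Identical statistics give a distance of zero, and the distance grows as the means or variances of the generated features drift away from those of the real images.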
    2025,34(4):55-63, DOI: 10.15888/j.cnki.csa.009825, CSTR: 32024.14.csa.009825
    Abstract:
    Various methods exist for identifying lies, including the use of polygraphs. However, these methods are of limited practical use: they require physical contact with the subject and must be operated by personnel with professional knowledge, which makes them inconvenient. Psychological research shows that micro-expressions are subtle facial muscle movements of extremely short duration that reflect a person’s true inner state when they occur, and related studies show that micro-expression features can serve as cues for deception recognition. This study focuses on deception recognition based on micro-expression features. First, a dataset called MED, which contains micro-expression data of people who are lying, is constructed. Second, a micro-expression feature learning model named MEDR, based on a multi-layer self-attention mechanism, is designed. It recognizes lies from micro-expression features learned in both lying and non-lying situations. Finally, the proposed model is compared experimentally with several existing models on the newly constructed dataset. The results show that the proposed model achieves an accuracy of 94.33% on the self-built high-quality dataset, indicating excellent performance in deception recognition.
    2025,34(4):64-75, DOI: 10.15888/j.cnki.csa.009808, CSTR: 32024.14.csa.009808
    Abstract:
    With the spread of network video platforms (NVPs), network videos shared across different platforms often face copyright infringement and the difficulty of cross-platform copyright detection. This study therefore proposes a blockchain-based cross-platform network video copyright protection scheme (BCVCP), which protects network video copyrights across platforms by means of blockchain and the generation and detection of an ownership sequence (OS). The scheme comprises identity authentication, keyframe extraction, ownership sequence generation and detection, and network video control management. First, identity authentication is carried out before operations such as video uploading or access to ensure the security of identity information. Second, when a network video is uploaded, an ownership sequence is generated and stored in distributed nodes. Third, the keyframes of the video are extracted and the generated ownership sequence is embedded into them. Finally, smart contracts are invoked for cross-platform ownership sequence detection and network video dissemination management to prevent infringement. The experiments verify the quality of the ownership encoding and the robustness of ownership recognition during cross-platform network video transmission, showing that the scheme protects the copyright of network videos.

Copyright: Institute of Software, Chinese Academy of Sciences. Beijing ICP No. 05046678-3
Address: 4# South Fourth Street, Zhongguancun, Haidian, Beijing. Postal Code: 100190
Phone: 010-62661041  Email: csa (a) iscas.ac.cn