Volume 33, Issue 3, 2024 Table of Contents

Survey on Relation Extraction Research Based on Graph Neural Network

SHEN Xin-Yi , LI Hua-Yu , YAN Yang , ZHANG Zhi-Kang

2024, 33(3):1-11. DOI: 10.15888/j.cnki.csa.009433 CSTR: 32024.14.csa.009433

Abstract:In relation extraction tasks, dependency trees or syntactic trees are usually built to obtain deeper and richer structural information. Graph neural networks, as a powerful representation learning method for graph-structured data, can model such complex structures well. This study surveys relation extraction methods based on graph neural networks, aiming to provide a deep understanding of the latest research progress and trends in this field. It first briefly introduces the classification and structure of relation graph neural networks and then elaborates on the core techniques and application scenarios of relation extraction methods based on graph neural networks, covering sentence-level methods, document-level methods, and joint entity-relation extraction methods. The advantages, disadvantages, and performance of each method are analyzed and compared, and possible future research directions and challenges are discussed.

Research Progress in Segmentation of Retinal Blood Vessel Images Based on Deep Learning

HE Xin , WANG Xiao-Yan , ZHOU Qi-Xiang , ZHANG Wen-Kai

2024, 33(3):12-23. DOI: 10.15888/j.cnki.csa.009421 CSTR: 32024.14.csa.009421

Abstract:Retinal blood vessel image segmentation is a useful aid in diagnosing eye diseases such as glaucoma and diabetic retinopathy. Deep learning, with its powerful ability to discover abstract features, is expected to meet the need for extracting feature information from retinal blood vessel images for automatic segmentation, and it has become a research hotspot in this field. To better grasp the research progress, this study summarizes the relevant datasets and evaluation metrics and describes in detail the application of deep learning to retinal blood vessel image segmentation. It focuses on the basic ideas, network structures, and improvements of the various segmentation methods, analyzes the limitations and challenges of existing methods, and discusses future research directions in this field.

Time Series Anomaly Detection With External Autoencoder Based on Graph Deviation Network

ZHANG Fu-Rong , GU Lei

2024, 33(3):24-33. DOI: 10.15888/j.cnki.csa.009423 CSTR: 32024.14.csa.009423

Abstract:With advances in the Internet and connectivity technology, the data generated by sensors is becoming increasingly complex. Deep learning methods have made great progress in anomaly detection for high-dimensional data. The graph deviation network (GDN) learns the relationships between sensor nodes to predict anomalies and has achieved certain results. Since the GDN model fails to handle the time dependence and instability of abnormal data, an external attention autoencoder based on GDN (AEEA-GDN) is proposed to extract features more deeply. In addition, an adaptive learning mechanism is introduced during model training to help the network better adapt to changes in abnormal data. Experimental results on three real-world sensor datasets show that the AEEA-GDN model detects anomalies more accurately than baseline methods and has better overall performance.

Outlier Detection Based on Autoencoder Normalizing Flow

ZHONG Hai-Xin , WANG Hui , GUO Gong-De

2024, 33(3):34-42. DOI: 10.15888/j.cnki.csa.009420 CSTR: 32024.14.csa.009420

Abstract:Detecting outliers is crucial for practical applications involving large, high-dimensional datasets. Outlier detection is the process of identifying data points that deviate from the typical data distribution, and it primarily involves density estimation. Substantial advances have been achieved by models such as the deep autoencoder Gaussian mixture model, which first reduces dimensionality and then estimates density. However, this model introduces noise into the low-dimensional latent space and faces limitations in optimizing the density estimation module, such as the requirement to keep the covariance matrix positive definite. To overcome these constraints, this study introduces the deep autoencoder normalizing flow (DANF) for unsupervised outlier detection. The model employs deep autoencoders to produce low-dimensional latent space representations and reconstruction errors for individual input samples. These outputs are then fed into a normalizing flow (NF) that transforms them into a Gaussian distribution. Experimental results on several widely recognized benchmark datasets show that the DANF model consistently surpasses state-of-the-art outlier detection methods. The most notable improvement is a 26.43% increase in the F1-score.

Link Prediction for Social Networks Based on Interacting Degree of Nodes

XU Rui-Yang , XU Zhen-Yu , LI Jia-Yin , XU Li

2024, 33(3):43-51. DOI: 10.15888/j.cnki.csa.009430 CSTR: 32024.14.csa.009430

Abstract:Link prediction mines potential future relationships between nodes from known network topology and node attributes. It is an effective way to predict missing links and identify false links, and it has practical significance for studying the evolution of social network structure. Traditional link prediction methods rely on the similarity of either node information or path information. However, the former considers only a single index, which limits prediction accuracy, and the latter is unsuitable for large-scale networks because of its excessive computational complexity. Through an analysis of network topology, this study proposes a social network link prediction method based on the interacting degree of nodes (IDN). The method first introduces the concept of node efficiency, based on the path characteristics between nodes in the network, which improves the accuracy of link prediction between nodes without common neighbors. To further exploit the attributes of common neighbors, the method analyzes the topology of the common neighbors between nodes and integrates path characteristics with local information to define the IDN in a social network. This definition accurately characterizes the degree of similarity between nodes and thus enhances link prediction ability. Finally, the IDN method is validated on six real network datasets. The experimental results show that, compared with current mainstream algorithms, the proposed method performs better on both the AUC and Precision metrics, with average improvements of 22% and 54%, respectively. The interacting degree of nodes is therefore feasible and effective for link prediction.
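
The abstract names two ingredients, path-based node efficiency and common neighbors, without giving the IDN formula. The sketch below only illustrates those two ingredients on a toy graph. It assumes that node efficiency is the reciprocal of the shortest-path length (a common definition), and `toy_score` is a hypothetical combination for illustration, not the paper's IDN.

```python
from collections import deque


def shortest_path_length(adj, source, target):
    """Hop count between two nodes via breadth-first search; None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adj[node]:
            if neighbor == target:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def node_efficiency(adj, u, v):
    """Assumed definition: reciprocal of the shortest-path length (0 if none)."""
    dist = shortest_path_length(adj, u, v)
    return 0.0 if not dist else 1.0 / dist


def common_neighbors(adj, u, v):
    """Local information: nodes adjacent to both u and v."""
    return adj[u] & adj[v]


def toy_score(adj, u, v):
    # Hypothetical combination for illustration only, NOT the paper's IDN.
    return len(common_neighbors(adj, u, v)) + node_efficiency(adj, u, v)


# Toy path graph a - b - c - d (undirected adjacency sets).
adj = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
print(toy_score(adj, "a", "c"))  # one common neighbor plus efficiency 1/2
print(toy_score(adj, "a", "d"))  # no common neighbor, efficiency 1/3
```

The pair (a, d) has no common neighbor, yet the path term still gives it a nonzero score, which is the situation the abstract says node efficiency is meant to handle.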

Session Recommendation Incorporating Dual-branch Dynamic Preferences

SHEN Xue-Li , WANG Le , TIAN Xue-Cheng

2024, 33(3):52-62. DOI: 10.15888/j.cnki.csa.009447 CSTR: 32024.14.csa.009447

Abstract:Session-based recommendation algorithms only statically model a single user preference and fail to capture the fluctuations in preference caused by the environment, which reduces recommendation accuracy. Therefore, this study proposes a session recommendation method that integrates dual-branch dynamic preferences. First, a heterogeneous hypergraph is used to model different types of information, and a dual-branch aggregation mechanism is designed to acquire and integrate the information in the heterogeneous hypergraph and learn the relationships between multiple types of nodes. Then, a price-embedded enhancer is used to strengthen the relationship between categories and prices. Next, a two-layer preference encoder is designed, which uses a multi-scale temporal Transformer to extract the user’s dynamic price preference and uses a soft attention mechanism with reverse position encoding to learn the user’s dynamic interest preference. Finally, a gating mechanism integrates the user’s multi-type dynamic preferences to make recommendations. Experiments on two datasets, Cosmetics and Diginetica-buy, show a significant improvement in the Precision and MRR metrics compared with other algorithms.

Traffic Object Tracking Based on In-vehicle Environment

MENG Ling-Chen , MENG Qiao , HUANGFU Jun-Yi , LI Xin

2024, 33(3):63-72. DOI: 10.15888/j.cnki.csa.009410 CSTR: 32024.14.csa.009410

Abstract:This study proposes a traffic object tracking method based on improved YOLOv5 and ByteTrack to address the drop in tracking accuracy caused by camera movement and by the difficulty of recognizing small objects in the in-vehicle environment. Firstly, the study introduces the Transformer and the weighted bidirectional feature pyramid network (BiFPN) structure to reconstruct the YOLOv5 detection network. This effectively captures the global dependencies of features, alleviates the loss of small-object information in deep convolutional layers, and improves object detection performance in vehicular environments. Subsequently, based on ByteTrack, the study proposes the CMC-ByteTrack tracking strategy, which adds camera motion compensation. The method describes the data association between consecutive video frames more accurately, improving tracking accuracy during significant camera displacement. Experimental results show that the improved YOLOv5 achieves a mean average precision (mAP) of 82.2%, a 3.9% increase over the original algorithm. After integration with CMC-ByteTrack, the multiple object tracking accuracy (MOTA) increases by 2.8% compared with that of the original tracking method.

Semantic Segmentation of Street Scenes Images Based on Multi-scale Feature Pyramid Fusion

QU Hai-Cheng , WANG Ying , DONG Kang-Long , LIU Wan-Jun

2024, 33(3):73-84. DOI: 10.15888/j.cnki.csa.009411 CSTR: 32024.14.csa.009411

Abstract:This study proposes a semantic segmentation network called LDPANet to address the large variation in target sizes and the difficulty of efficiently extracting multi-scale features in semantic segmentation of street scene images. Firstly, atrous (dilated) convolution is combined with depthwise separable convolution and introduced into the residual learning unit to optimize the encoder structure, which reduces computational complexity and alleviates the vanishing gradient problem. Secondly, the network uses a layer-wise iterative atrous spatial pyramid to sequentially fuse top-down feature information, enhancing the effective interaction of contextual information. After multi-scale feature fusion, an attribute attention module is introduced to suppress redundant information and strengthen important features. Furthermore, channel-extended upsampling replaces bilinear interpolation upsampling in the decoder to further improve the resolution of feature maps. Finally, the accuracy of LDPANet on the Cityscapes and CamVid datasets reaches 91.8% and 87.52%, respectively. Compared with network models from recent years, the proposed model accurately extracts pixel position information and spatial dimension information and improves the accuracy of semantic segmentation.

Few-shot Hyperspectral Classification Siamese Network Combining Attention and Improved Sample Selection Method

YANG Yu-Xin , GUO Gong-De , WANG Hui

2024, 33(3):85-94. DOI: 10.15888/j.cnki.csa.009432 CSTR: 32024.14.csa.009432

Abstract:To address the shortage of labeled hyperspectral image samples caused by the difficulty of manual labeling, this study proposes a few-shot Siamese network algorithm that combines attention and spatial neighborhood information. Firstly, the hyperspectral image is preprocessed with PCA to reduce the data dimensionality. Secondly, the training samples are selected by interval sampling and edge sampling to effectively reduce redundant information. After that, the Siamese network combines the samples as patches of different sizes and constructs sample pairs as the training set, which not only provides data augmentation but also fully extracts both the spectral information of the target pixels and the spatial information of their neighborhoods. Finally, an attention module for the spectral dimension and a similarity measurement module for the spatial dimension are added to weight the spectral information and the spatial neighborhood information, respectively, so as to improve classification performance. The experimental results show that the proposed method outperforms common methods on some public datasets.

Pneumonia Detection Based on Deep Residual Network and Dictionary Learning

ZHU Zhi-Qiang , BIAN Wei-Xin , JIE Biao , HUANG Yi , LI Wen-Hu

2024, 33(3):95-102. DOI: 10.15888/j.cnki.csa.009436 CSTR: 32024.14.csa.009436

Abstract:Due to factors such as air pollution and smoking, pneumonia has become one of the diseases with the highest mortality rates in humans. The application of machine learning and deep learning to medical image detection has helped clinical experts diagnose various diseases. However, effective paired lung X-ray datasets are lacking, and existing pneumonia detection methods use general-purpose classification models that are not specific to the pneumonia task. As a result, subtle differences between pneumonia images and normal images are difficult to detect, which leads to recognition failures. Therefore, this study expands the normal images in the dataset through cropping, rotation, and other methods and uses a 50-layer deep residual network to learn the shallow pneumonia features in chest X-rays. Then, a two-layer dictionary further abstracts the pneumonia features learned by the residual network and uncovers subtle differences between lung images. Finally, a pneumonia detection model is constructed by fusing the multi-level pneumonia features extracted by the residual network and dictionary learning. To validate the algorithm, the performance of the model is evaluated on the chest X-ray pneumonia dataset. According to the test results, the proposed model has a detection accuracy of 97.12%, and its score on the harmonic mean of precision and recall (F1-score) is 97.73%. Compared with existing methods, it achieves higher recognition accuracy.
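
The harmonic-mean score mentioned in the abstract is conventionally the F1-score, computed from precision and recall. A minimal sketch of that computation follows. The confusion-matrix counts are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for illustration only.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(precision, recall, f1)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a model cannot reach a high F1-score by maximizing precision or recall alone.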

CT Multi-organ Segmentation Based on MAU-Net

BU Hong-Xi , HE Li-Wen

2024, 33(3):103-110. DOI: 10.15888/j.cnki.csa.009417 CSTR: 32024.14.csa.009417

Abstract:Accurate segmentation of multiple organs in computed tomography (CT) images enables the precise diagnosis of lesions, facilitates rapid treatment planning, and improves the efficiency of clinical work. However, traditional segmentation algorithms often struggle with organs that have large deformations, small volumes, and blurry boundaries, resulting in relatively poor segmentation performance. This study proposes an improved U-Net medical image segmentation network called MAU-Net, which aims to segment multiple organs accurately by introducing two modules. The multi-scale dilated convolution module captures multi-scale features of the target organs using different kernel sizes. The dynamic attention module precisely extracts important features to balance the weights between branches. The superiority of MAU-Net is confirmed through ablation experiments and comparative experiments with other mainstream networks. Compared with the traditional U-Net model, MAU-Net improves the average Dice similarity coefficient (DSC) by 3.39% and reduces the average 95% Hausdorff distance (HD) by 4.84 mm across all organs. MAU-Net demonstrates strong robustness and potential for multi-organ segmentation tasks, contributing to better clinical workflow efficiency and diagnostic accuracy.
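
For readers unfamiliar with the metric, the Dice similarity coefficient between a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|). A minimal sketch on flattened binary masks follows. The toy masks are hypothetical, and returning 1.0 when both masks are empty is a convention assumed here.

```python
def dice_coefficient(pred, target):
    """Dice similarity coefficient of two equal-length binary masks (flattened)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Hypothetical 5-voxel masks for one organ label.
pred_mask = [1, 1, 0, 0, 1]
true_mask = [1, 0, 0, 1, 1]
print(dice_coefficient(pred_mask, true_mask))
```

In multi-organ segmentation the coefficient is usually computed per organ label and then averaged, which is what an "average DSC across all organs" refers to.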

Indoor Scene Segmentation Algorithm Based on Fusion of Deep Information

WANG Liu , LIANG Ming-Ju

2024, 33(3):111-117. DOI: 10.15888/j.cnki.csa.009429 CSTR: 32024.14.csa.009429

Abstract:A lightweight semantic segmentation network based on an encoder-decoder architecture with a fused attention mechanism is proposed to address feature loss and the need for effective bimodal fusion in semantic segmentation of complex indoor scenes. Firstly, two residual networks are used as backbone networks to extract features from RGB and depth images, and a polarized self-attention (PSA) module is introduced into the encoder. Then, a bimodal fusion module is designed to effectively fuse RGB and depth features at different stages, and a context module is introduced to capture dependencies between regions. Finally, three decoders of different sizes use skip connections to fuse the preceding multi-scale feature maps and improve the segmentation accuracy of small targets. The proposed network is trained and tested on the NYUDv2 dataset and compared with more advanced RGB-D semantic segmentation networks. The experiments show that the proposed network has good segmentation performance.

Image Arbitrary Style Transfer with Preserving Detailed Features

JIANG Heng-Chang , ZHANG Du-Zhen

2024, 33(3):118-125. DOI: 10.15888/j.cnki.csa.009449 CSTR: 32024.14.csa.009449

Abstract:Some mainstream arbitrary image style transfer models still have limitations in preserving the saliency information and detailed features of content images, resulting in problems such as blurred content and loss of detail in the generated images. To solve these problems, this study proposes an arbitrary style transfer model that effectively preserves the detailed features of content images. The model flexibly fuses the shallow-to-deep multi-layer image features extracted by the encoder. A new feature fusion method is proposed, which allows a high-quality fusion of content features and style features. In addition, a new loss function is proposed, which preserves the global structure of content and style well and eliminates artifacts. The experimental results show that the proposed model effectively balances style and content, preserves the complete semantic information and detailed features of the content image, and generates stylized images with better visual effects.

Cross Layer Collaborative Attention and Channel Group Attention for Fine-grained Image Classification

HE Zhi-Xiang , QI Qi , HE Wei , GUO Long-Yuan

2024, 33(3):126-133. DOI: 10.15888/j.cnki.csa.009419 CSTR: 32024.14.csa.009419

Abstract:The main challenge of fine-grained image classification lies in the high similarity between classes and the large differences within classes. Most existing research is based on deep features and ignores shallow details, yet deep semantic features often lose many details because of repeated convolution and pooling operations. To better integrate shallow and deep information, this study proposes a fine-grained image classification method based on cross-layer collaborative attention and channel grouping attention. First, a pre-trained ResNet50 is used as the backbone network to extract features, and the features from its last three stages are output as three branches. The features of each branch are then interactively fused with those of the other two branches through cross-layer collaborative attention. In addition, the features of the last stage pass through the channel grouping attention module to enhance the learning of semantic features. The model can be trained efficiently end to end without bounding boxes or annotations. Experimental results show that the algorithm performs well on three common fine-grained image datasets, CUB-200-2011, Stanford Cars, and FGVC-Aircraft, with accuracy rates of 89.5%, 94.8%, and 94.7%, respectively.

Multi-view Low-rank Sparse Subspace Clustering Algorithm Based on Three-way Decision

FANG Ying-Jie , JIA Tian-Xia , XU Yi , LUO Fan

2024, 33(3):134-145. DOI: 10.15888/j.cnki.csa.009424 CSTR: 32024.14.csa.009424

Abstract:Multi-view subspace clustering learns a unified representation of all views from subspaces and explores the latent clustering structure of the data. As an approach to clustering high-dimensional data, subspace clustering has become a focal point in the field of multi-view clustering. The multi-view low-rank sparse subspace clustering method combines low-rank representation with sparse constraints. When constructing the affinity matrix, it uses low-rank sparse constraints to capture both the global and local structures of the data, thereby improving the performance of subspace clustering. The three-way decision, rooted in the rough set model, is a decision-making concept often applied in clustering algorithms to reflect the uncertain relationship between objects and clusters during clustering. In this study, inspired by the idea of the three-way decision, a voting system is designed as the decision basis. The system is integrated with multi-view sparse subspace clustering into a unified framework, resulting in a novel algorithm. Experimental results on various artificial and real-world datasets demonstrate that this algorithm can improve the accuracy of multi-view clustering.

Algorithm for Maximizing Algebraic Connectivity Based on Graph Neural Network

XIA Chun-Yan , HOU Xin-Min

2024, 33(3):146-157. DOI: 10.15888/j.cnki.csa.009435 CSTR: 32024.14.csa.009435

Abstract:As the number of agents increases, the number of potential communication links in a multi-agent system grows exponentially. Excessive redundant links waste considerable energy and maintenance costs, while blindly removing links reduces the stability and security of the system. Algebraic connectivity is one of the important metrics for measuring the connectivity of a graph. However, traditional semidefinite programming (SDP) methods and heuristic algorithms for maximizing algebraic connectivity are time-consuming in large-scale scenarios. This study proposes a supervised graph neural network model to optimize the algebraic connectivity of multi-agent systems. The study applies the traditional SDP method in small-scale task scenarios to obtain a sufficient number of diverse training samples and labels. Based on these, it trains a graph neural network model that can be used in larger-scale task scenarios. The experimental results indicate that, when 15 edges are removed, the proposed model achieves on average 98.39% of the performance of the traditional SDP method. In addition, the model requires very little computation time and can be extended to real-time scenarios.
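
Algebraic connectivity is the second-smallest eigenvalue of the graph Laplacian L = D - A. The sketch below computes it for small graphs and shows how removing one edge lowers it. It uses a plain cyclic Jacobi eigenvalue iteration in pure Python as a stand-in solver. It illustrates only the metric, not the paper's SDP formulation or its graph neural network, and the example graphs are hypothetical.

```python
import math


def laplacian(n, edges):
    """Graph Laplacian (degree matrix minus adjacency matrix) of an undirected graph."""
    lap = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1.0
        lap[v][v] += 1.0
        lap[u][v] -= 1.0
        lap[v][u] -= 1.0
    return lap


def symmetric_eigenvalues(matrix, sweeps=100, tol=1e-18):
    """Sorted eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes a[p][q]; kept within [-pi/4, pi/4].
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4.0
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))


def algebraic_connectivity(n, edges):
    """Second-smallest Laplacian eigenvalue (the Fiedler value)."""
    return symmetric_eigenvalues(laplacian(n, edges))[1]


# Hypothetical 4-agent system: fully connected, then with link (0, 1) removed.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(algebraic_connectivity(4, k4_edges))
print(algebraic_connectivity(4, k4_edges[1:]))
```

A link-pruning method of the kind the abstract describes would choose which edges to drop so that this value stays as large as possible.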

Multi-objective Path Planning Based on Reinforcement Learning

ZHOU Yi , LIU Jun

2024, 33(3):158-169. DOI: 10.15888/j.cnki.csa.009418 CSTR: 32024.14.csa.009418

Abstract:The path planning problem for mobile robots involves a large number of nodes and a wide search space. It also considers factors such as safety and real-time requirements. To address the multi-objective path planning problem for mobile robots, this study proposes a novel multi-objective intelligent optimization algorithm that combines reinforcement learning. Firstly, the algorithm adopts NSGA-II as the base framework and equips individuals with learning capabilities by reinforcement learning. A SARSA operator is designed to improve the global search efficiency of the algorithm. Secondly, to accelerate the convergence speed and ensure population diversity, the study introduces an adaptive simulated binary crossover operator (tanh-SBX) as an auxiliary operator and divides the population into two sub-populations with different properties: elite and non-elite populations. Finally, the study designs four different strategies and calculates the probability of updating strategies using the Metropolis criterion of the simulated annealing algorithm. It allows the most suitable strategy to guide the population’s optimization direction, balancing exploration and exploitation. Simulation experiments demonstrate that the proposed algorithm can find optimal paths in environments with different complexities. Compared to traditional intelligent biomimetic algorithms, the proposed algorithm effectively balances optimization objectives and discovers safer and better paths in more complex environments.
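
The Metropolis criterion from simulated annealing accepts an improving move with probability 1 and a worsening move with probability exp(-Δ/T), where Δ is the cost increase and T is the temperature. A minimal sketch follows. How the paper defines Δ for its four strategies is not stated in the abstract, so the values here are hypothetical.

```python
import math
import random


def metropolis_probability(delta, temperature):
    """Acceptance probability for a move that changes the cost by `delta` (lower cost is better)."""
    if delta <= 0:
        return 1.0  # improvements are always accepted
    return math.exp(-delta / temperature)


def metropolis_accept(delta, temperature, rng):
    """Draw one accept/reject decision."""
    return rng.random() < metropolis_probability(delta, temperature)


rng = random.Random(0)
# Hypothetical cost increase of 1.0 evaluated at a high and a low temperature.
print(metropolis_probability(1.0, 10.0))  # worse moves are often accepted when hot
print(metropolis_probability(1.0, 0.1))   # and almost never accepted when cold
print(metropolis_accept(-0.5, 1.0, rng))  # an improving move is always accepted
```

Lowering the temperature over time shifts the search from exploration toward exploitation, which matches the balance the abstract describes.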

Multi-image Encryption Algorithm Based on Multi-chaotic System

GAO Ruo-Yun , BAI Mu-Dan , HUANG Jia-Xin , GUO Ya-Li

2024, 33(3):170-177. DOI: 10.15888/j.cnki.csa.009412 CSTR: 32024.14.csa.009412

Abstract:Aiming at the security problem of multiple images in transmission, this study proposes a multi-image encryption algorithm based on multi-chaotic systems. First, discrete wavelet transform is adopted to preprocess multiple images to get a large mosaic image. Then, a chaotic sequence is generated using logistic-sine-cosine (LSC) map to generate a matrix O for scrambling pixel positions. Finally, a hyper-chaotic Lorenz system is applied to generate a four-dimensional chaotic sequence. It is used to perform bidirectional diffusion and row-column scrambling on the scrambled image to obtain the final ciphertext image. The proposed method has a simple encryption and decryption process and high execution efficiency. The experimental results have been analyzed from multiple aspects and show that the algorithm has a large key space and can resist multiple attack methods, with good encryption performance and security.
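
Position scrambling driven by a chaotic sequence is commonly implemented by sorting the sequence and using the resulting index order as a permutation. The sketch below shows that idea with the plain logistic map substituted for the paper's logistic-sine-cosine (LSC) map, whose formula the abstract does not give. The key values `x0` and `r`, the burn-in length, and the toy pixel list are all hypothetical.

```python
def logistic_sequence(x0, r, length, burn_in=100):
    """Iterate the logistic map x -> r * x * (1 - x), discarding a burn-in prefix."""
    x = x0
    for _ in range(burn_in):
        x = r * x * (1.0 - x)
    values = []
    for _ in range(length):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


def permutation_from_key(x0, r, length):
    """Index order that sorts the chaotic sequence; acts as the scrambling permutation."""
    sequence = logistic_sequence(x0, r, length)
    return sorted(range(length), key=sequence.__getitem__)


def scramble(pixels, perm):
    """Move the pixel at position perm[i] to position i."""
    return [pixels[src] for src in perm]


def unscramble(scrambled, perm):
    """Invert `scramble` using the same permutation."""
    restored = [None] * len(perm)
    for dst, src in enumerate(perm):
        restored[src] = scrambled[dst]
    return restored


pixels = list(range(16))  # flattened 4x4 toy image
perm = permutation_from_key(x0=0.3141, r=3.99, length=len(pixels))
cipher = scramble(pixels, perm)
print(cipher)
print(unscramble(cipher, perm) == pixels)
```

Scrambling alone only moves pixels without changing their values, which is why schemes like the one in the abstract follow it with a diffusion stage.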

Foreign Object Recognition Based on Coordinated Attention and Atrous Convolution

WANG Chun-Lin , WU Chun-Lei , LI Can-Wei , ZHU Ming-Fei

2024, 33(3):178-186. DOI: 10.15888/j.cnki.csa.009416 CSTR: 32024.14.csa.009416

Abstract:Belt conveyors play an important role in industrial production in factories in China. However, when materials are transported, wooden boards, metal pipes, large metal sheets, and similar objects are often mixed in with them, damaging the conveyor belt and causing huge economic losses. To detect irregular foreign objects on the conveyor belt, this study proposes a single-stage foreign object recognition method based on coordinated attention and atrous convolution, which addresses the insufficient image feature extraction ability and relatively small receptive field of traditional foreign object detection methods. Firstly, the coordinated attention mechanism makes the network pay more attention to the spatial information of images and enhances their important features, improving network performance. Secondly, when multi-scale features are extracted, the static convolution of the original network is replaced with atrous convolution, which effectively reduces the information loss caused by conventional convolution. In addition, the study uses a new loss function that further improves network performance. The experimental results show that the proposed network effectively identifies foreign objects on the conveyor belt and completes the foreign object detection task.
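
Atrous (dilated) convolution enlarges the receptive field by spacing the kernel taps `dilation` samples apart, so a kernel of size k covers (k - 1) * dilation + 1 inputs without extra weights. The following one-dimensional sketch shows that effect. It is a generic illustration, not the network from the paper, and the signal and kernel are hypothetical.

```python
def receptive_field(kernel_size, dilation):
    """Number of consecutive inputs spanned by one dilated kernel application."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D dilated convolution (cross-correlation, as in deep learning frameworks)."""
    span = receptive_field(len(kernel), dilation)
    return [
        sum(weight * signal[start + tap * dilation] for tap, weight in enumerate(kernel))
        for start in range(len(signal) - span + 1)
    ]


signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]  # three weights in both cases
print(receptive_field(3, 1), dilated_conv1d(signal, kernel, dilation=1))
print(receptive_field(3, 2), dilated_conv1d(signal, kernel, dilation=2))
```

With the same three weights, a dilation of 2 makes each output depend on a window of five inputs instead of three.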

6D Pose Refiner Network Combining Residual Attention and Standard Deviation

DENG Jiang , CHEN Yao-Jie , ZHANG Meng-Jie

2024, 33(3):187-194. DOI: 10.15888/j.cnki.csa.009444 CSTR: 32024.14.csa.009444

Abstract:In the domain of 6D object pose estimation, existing algorithms often struggle to achieve precise and robust pose estimation of the target objects. To address this challenge, this study introduces an object 6D pose refinement network that incorporates residual attention, hybrid dilated convolution, and standard deviation information. Firstly, in the Gen6D image feature extraction network, traditional convolutional modules are replaced with hybrid dilated convolution modules to expand the receptive field and enhance the capability to capture global features. Subsequently, within the 3D convolutional neural network, a residual attention module is integrated. This assists in distinguishing the importance of feature channels, hence extracting key features while minimizing the loss of shallow-layer features. Finally, the study introduces standard deviation information into the average distance loss function, enabling the model to discern more pose information of the object. Experimental results demonstrate that the proposed network achieves ADD scores of 68.79% and 56.03% on the LINEMOD dataset and GenMOP dataset, respectively. Compared to the Gen6D network, there is an improvement of 1.78% and 5.64% in the ADD scores, validating the significant enhancement in the accuracy of 6D pose estimation brought about by the proposed network.

Non-normal Data Synthesis Based on Mixed Data Type Correlation Measurement

WANG Chun-Dong , ZHANG Shi-Peng

2024, 33(3):195-205. DOI: 10.15888/j.cnki.csa.009441 CSTR: 32024.14.csa.009441

Abstract:Data plays an extremely important role in research and development in fields such as machine learning and artificial intelligence. However, some real-world factors prevent data consumers from obtaining real datasets that meet their work requirements, such as privacy issues, data scarcity, and poor data quality. In response to this situation, this study develops a non-normal data synthesis algorithm (KMSI) as an improvement to the sampling-iteration (SI) technique. This algorithm utilizes a mixed-type correlation coefficient matrix to reduce measurement errors in various steps of the SI technique, including target setting and control loops. It replaces Bootstrap sampling with kernel density estimation sampling to avoid using real data. Experimental results show that, compared to the SI technique, KMSI is capable of handling complex and mixed-type datasets and does not include real data in the synthetic results. Furthermore, compared to other enhancement methods, KMSI offers users more customization options for the sample size in synthetic datasets.
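
The difference between Bootstrap sampling and kernel density estimation (KDE) sampling can be seen in a few lines. Bootstrap resamples the real records themselves, whereas drawing from a Gaussian KDE amounts to picking a real record and adding kernel noise, so the output values are new. This is a one-dimensional sketch under assumptions: the Gaussian kernel, the bandwidth, and the data are hypothetical, and the rest of the KMSI procedure is not shown.

```python
import random


def bootstrap_sample(data, size, rng):
    """Resample with replacement: every output value is a real record."""
    return [rng.choice(data) for _ in range(size)]


def kde_sample(data, bandwidth, size, rng):
    """Draw from a Gaussian KDE: pick a record, then add N(0, bandwidth^2) noise."""
    return [rng.choice(data) + rng.gauss(0.0, bandwidth) for _ in range(size)]


rng = random.Random(0)
real = [1.2, 3.4, 2.2, 5.0, 4.1]  # hypothetical real measurements
resampled = bootstrap_sample(real, 8, rng)
synthetic = kde_sample(real, bandwidth=0.3, size=8, rng=rng)
print(resampled)
print(synthetic)
```

The requested size (8 here) is independent of the number of real records, which corresponds to the customizable sample size the abstract mentions.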

Medical Image Segmentation Method Based on DH-Swin Unet

WANG Yi-Ni , SHI Hong-Wei

2024, 33(3):206-212. DOI: 10.15888/j.cnki.csa.009440 CSTR: 32024.14.csa.009440

Abstract (526) HTML (878) PDF 1.54 M (1422) Comment (0) Favorites

Abstract:Bone and joint diseases are among the most prevalent diseases in human history. As population aging accelerates, these diseases have become increasingly widespread, posing great challenges to orthopedic surgeons. Research on image segmentation of human joints can assist doctors in clinical diagnosis and treatment. However, because of noise, blurring, and low contrast, feature extraction is more difficult for medical images than for ordinary images. In addition, most segmentation models use only simple skip connections between the encoder and decoder and do not address the information gaps and losses that arise in the skip connection process. To this end, this study proposes the DH-Swin Unet algorithm for medical image segmentation. Building on the Swin-Unet model, it introduces a densely connected Swin Transformer block and a hybrid attention mechanism into the skip connections to enhance the transmission of feature information. Real clinical data provided by a hospital ranking top three are used to evaluate the performance of the proposed method. The results show that the DSC and HD of the model reach 86.79% and 32.05 mm, respectively, and that the model has practical value in the clinical diagnosis of joint diseases.

Predicting Grouped Rollers Speed Series in Annealing Furnace Based on BSCWEformer

YUE Xiao-Guang , SHI Yuan-Bo

2024, 33(3):213-219. DOI: 10.15888/j.cnki.csa.009434 CSTR: 32024.14.csa.009434

Abstract (387) HTML (669) PDF 1.34 M (1034) Comment (0) Favorites

Abstract:The length of the strip steel in the annealing furnace is affected by temperature, tension, and other factors, resulting in changes in roller speed and uncertainty in weld position and threatening production safety. To accurately predict roller speed, this study proposes the banded sparse Cauchy weight enhanced Transformer (BSCWEformer) model. The model adopts a banded sparse self-attention structure enhanced by Cauchy distribution weight values calculated from relative positions, which improves the importance of adjacent input sequences and reduces the complexity of self-attention from quadratic to linear. Through experiments with actual production data and comparison with LogSparse Transformer, Transformer, RNMT+, and other models, the BSCWEformer model shows higher accuracy in predicting grouped roller speed series.

Road Defect Detection Based on Image Point Cloud

LI Wei-Xiang , LI Wu-Jin , CHEN Si-Yuan

2024, 33(3):220-225. DOI: 10.15888/j.cnki.csa.009443 CSTR: 32024.14.csa.009443

Abstract (470) HTML (1108) PDF 1.46 M (1892) Comment (0) Favorites

Abstract:To address the challenge of detecting road defects in drone-captured image point clouds, this study introduces a road defect detection method based on point cloud slicing, plane fitting, and clustering. Firstly, drone images are captured to facilitate 3D reconstruction and the generation of image point clouds. Subsequently, point cloud data undergoes slope filtering and statistical outlier filtering to eliminate noise and anomalous data points. Next, the point clouds are sliced, and a random sample consensus (RANSAC) plane fitting algorithm is applied to estimate the road’s plane model. Then, the point cloud DBSCAN clustering algorithm is employed to differentiate between edge noise and road damage point clouds. Finally, the point cloud slicing technique is utilized to assess the extent of the damage. In the experiments, the study employs actual drone-collected point cloud data and compares the proposed method with an approach relying on point cloud verticality features. The experimental results demonstrate that the proposed method exhibits a high level of accuracy and robustness, with a volume estimation error of only 1307 cm³. Compared to traditional methods, the proposed method excels in precisely detecting road damage and adapting to intricate road shape variations.
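
The RANSAC plane-fitting step mentioned above can be sketched as follows. This is a minimal, generic RANSAC implementation under stated assumptions (a fixed distance threshold, a fixed iteration count, and synthetic toy points); it is not the authors' pipeline, and all names and parameter values are hypothetical.

```python
import math
import random

def plane_from_points(p, q, r):
    """Return (a, b, c, d) with unit normal for the plane through p, q, r, or None."""
    ux, uy, uz = q[0] - p[0], q[1] - p[1], q[2] - p[2]
    vx, vy, vz = r[0] - p[0], r[1] - p[1], r[2] - p[2]
    nx, ny, nz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    norm = math.sqrt(nx * nx + ny * ny + nz * nz)
    if norm < 1e-12:          # collinear sample: no unique plane
        return None
    nx, ny, nz = nx / norm, ny / norm, nz / norm
    return nx, ny, nz, -(nx * p[0] + ny * p[1] + nz * p[2])

def ransac_plane(points, threshold=0.02, iterations=200, seed=0):
    """Keep the candidate plane supported by the most points within `threshold`."""
    rng = random.Random(seed)
    best_plane, best_inliers = None, []
    for _ in range(iterations):
        plane = plane_from_points(*rng.sample(points, 3))
        if plane is None:
            continue
        a, b, c, d = plane
        inliers = [pt for pt in points
                   if abs(a * pt[0] + b * pt[1] + c * pt[2] + d) <= threshold]
        if len(inliers) > len(best_inliers):
            best_plane, best_inliers = plane, inliers
    return best_plane, best_inliers

# Toy road surface: a flat 10 x 10 grid at z = 0 plus a few depressed points.
road = [(float(x), float(y), 0.0) for x in range(10) for y in range(10)]
defect = [(4.5, 4.5, -0.5), (4.6, 4.5, -0.5), (4.5, 4.6, -0.6),
          (5.0, 5.2, -0.4), (4.8, 4.4, -0.5)]
plane, inliers = ransac_plane(road + defect)
```

Points that fall outside the inlier set of the fitted plane are the candidates that a later clustering step, such as DBSCAN in the abstract, would separate into noise and damage.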

Improved Clonal Selection Algorithm Based on Directed Mutation Strategy

PENG Xu , YANG Chao , ZHANG Wen-Hao , WANG Dao-Wei , JIANG Bi-Bo

2024, 33(3):226-232. DOI: 10.15888/j.cnki.csa.009359 CSTR: 32024.14.csa.009359

Abstract (488) HTML (790) PDF 1.86 M (1481) Comment (0) Favorites

Abstract:This study proposes an improved clonal selection algorithm based on a directed mutation strategy (DMSCSA) to address the problems of the clonal selection algorithm (CSA), such as slow search speed, low convergence accuracy, and a tendency to fall into local optima. The algorithm uses the Halton sequence to initialize the population, which makes the initial population distribution more uniform and enables a more efficient search of the solution space. The golden sine mutation strategy is adopted to apply directed mutation to the best antibodies during iteration, which improves the convergence speed of the algorithm. The Cauchy mutation strategy is introduced to improve the algorithm’s ability to escape local optima while preserving population diversity. DMSCSA is compared with other algorithms of the same type on eight test functions from the CEC2019 test function set. The experimental results show that DMSCSA improves optimization accuracy and convergence speed.
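
Halton-sequence initialization can be sketched in a few lines. The example below is a minimal illustration that assumes one prime base per dimension and simple box bounds; the function names and the bounds are hypothetical and are not taken from the paper.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result

def halton_population(size, bounds, bases=(2, 3, 5, 7, 11, 13)):
    """Build `size` points inside `bounds`, using one prime base per dimension."""
    if len(bounds) > len(bases):
        raise ValueError("need one prime base per dimension")
    population = []
    for i in range(1, size + 1):      # index 0 would map to the lower corner
        population.append([
            low + radical_inverse(i, bases[dim]) * (high - low)
            for dim, (low, high) in enumerate(bounds)
        ])
    return population

# Four antibodies in a hypothetical 2D search space.
antibodies = halton_population(4, [(-10.0, 10.0), (0.0, 1.0)])
```

Unlike independent uniform random draws, consecutive Halton points fill the gaps left by earlier ones, which is why the sequence yields a more even initial spread for small populations.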

Vehicle Scheduling Optimization at Unsignalized Intersection Based on Improved Sparrow Search Algorithm

LI Jin-Long , LIU Wei

2024, 33(3):233-244. DOI: 10.15888/j.cnki.csa.009409 CSTR: 32024.14.csa.009409

Abstract (418) HTML (817) PDF 1.93 M (1359) Comment (0) Favorites

Abstract:In this study, the internal area of an unsignalized intersection is divided into multiple road right points, and the road right points occupied by a traffic accident caused by a collision between a vehicle and a pedestrian or non-motor vehicle are set as “failure points”. The study examines the traffic efficiency of the unsignalized intersection when such a vehicle failure occurs. The sparrow search algorithm (SSA) is selected to improve traffic efficiency, but SSA easily falls into local extrema in the early stage and has low optimization accuracy in the later stage. To this end, the study introduces adaptive learning parameters and level-based opposition-based learning to enhance the global search ability in the early stage and the deep exploration ability in the later stage, yielding an SSA based on adaptive parameters and level-based opposition-based learning (ALSSA). ALSSA is verified on 13 benchmark test functions and with the Wilcoxon rank-sum test P value. Experimental results show that ALSSA substantially improves global search capability and convergence compared with other algorithms. Finally, the optimal traffic time under different traffic flows is calculated for two-way two-lane, two-way four-lane, and two-way eight-lane configurations.

Multimodal Entity Alignment Based on Relation-aware Multi-subgraph Graph Neural Network

JIN Jia-Hui , LI Zhi-Jiang , LIU Yi-Zhang

2024, 33(3):245-254. DOI: 10.15888/j.cnki.csa.009437 CSTR: 32024.14.csa.009437

Abstract (396) HTML (995) PDF 2.26 M (1675) Comment (0) Favorites

Abstract:Multi-modal entity alignment (MMEA) is a crucial technique for integrating multi-source heterogeneous multi-modal knowledge graphs (MMKGs). This integration is typically achieved by encoding graph structure and calculating the plausibility of multi-modal representation between entities. However, existing MMEA methods tend to directly employ pre-trained models and overlook the fusion between modalities as well as the fusion between modal features and graph structures. To address these limitations, this study proposes a novel approach called relation-aware multi-subgraph graph neural network (RAMS) for obtaining multi-modal representation in the context of entity alignment. RAMS utilizes a multi-subgraph graph neural network for fusing modality information and graph structure to derive entity representation. The alignment results are subsequently obtained through cross-domain similarity calculation. Extensive experiments demonstrate that RAMS outperforms baseline models in terms of accuracy, efficiency, and robustness.

Sentiment Triple Extraction Combining Grammatical Structure and Semantic Information

YANG Fang-Jie , FENG Guang , TANG Ye-Kai

2024, 33(3):255-263. DOI: 10.15888/j.cnki.csa.009438 CSTR: 32024.14.csa.009438

Abstract (395) HTML (949) PDF 1.32 M (1605) Comment (0) Favorites

Abstract:Most current aspect sentiment triplet extraction methods do not fully consider syntactic structure and semantic relevance. This study proposes an aspect sentiment triplet extraction model that combines syntactic structure and semantic information. First, a syntactic graph is constructed with a dependency parser to obtain the probability matrices of all dependency arcs, which captures rich syntactic structure information. Second, the self-attention mechanism is used to construct a semantic graph that represents the semantic correlation between words, thus reducing the interference of noisy words. Finally, a mutual affine transformation layer is designed so that the model can better exchange relevant features between the syntactic graph and the semantic graph, improving its performance in sentiment triplet extraction. The model is validated on several public datasets. The experiments show that, compared with existing sentiment triplet extraction models, the precision (P), recall (R), and F1 value are all improved, which validates the effectiveness of combining syntactic structure and semantic information in aspect sentiment triplet extraction.

Multi-domain Fake News Detection Based on Cross-feature Perception Fusion

WANG Zhen-Qi , CHEN Tao , ZHANG Bao-Yu , ZHANG Ming-Li , SUN Chen-Yu , ZHANG Wei-Shan

2024, 33(3):264-272. DOI: 10.15888/j.cnki.csa.009439 CSTR: 32024.14.csa.009439

Abstract (682) HTML (1039) PDF 1.71 M (2094) Comment (0) Favorites

Abstract:The dissemination of fake news in various domains has a serious impact on society. Domain shift and cross-domain correlation of news between different domains also pose a great challenge to the prediction ability of detection models. To address these problems, this study proposes a multi-domain fake news detection method based on cross-feature perception fusion. The method captures multiple feature differences in news between different domains, mines the correlations between news items, and controls the model’s feature fusion strategy in different domains from multiple dimensions. In addition, the study proposes a joint training framework to train the proposed model. The model achieves F1 scores of 92.84% and 85.49% on the English and Chinese datasets, respectively. Compared with the state-of-the-art model, these results are improvements of 1.16% and 1.07%, respectively.

Multimodal Ship Trajectory Prediction Based on S-Transformer

KE Yan , CHEN Yao-Jie

2024, 33(3):273-280. DOI: 10.15888/j.cnki.csa.009446 CSTR: 32024.14.csa.009446

Abstract (602) HTML (1190) PDF 1.57 M (2240) Comment (0) Favorites

Abstract:Ship trajectory prediction is the premise and basis of intelligent ship navigation. At present, most studies on ship trajectory prediction rely only on historical data from the automatic identification system (AIS), without using other sensor information on the ship. This study proposes a multi-modal trajectory prediction model, S-Transformer. In this network, seawater/land segmentation of the electronic chart serves as an auxiliary training target and is integrated with real AIS data from Zhoushan port to train the model, which then predicts the future sailing trajectory of the ship. The study also introduces Segment Recurrence to capture long-term dependencies in AIS data. The experimental results show that S-Transformer predicts well in different ship-traveling situations and outperforms the unimodal benchmark model on related prediction tasks.

Adaptive Real-time Workshop Scheduling Based on Contextual Bandits

CHEN Ming , WANG Chuang , XU Zheng

2024, 33(3):281-287. DOI: 10.15888/j.cnki.csa.009454 CSTR: 32024.14.csa.009454

Abstract (417) HTML (772) PDF 1.32 M (1136) Comment (0) Favorites

Abstract:Traditional multi-agent workshop scheduling methods use a single scheduling rule, ignoring the influence of production environment changes on the applicability of scheduling rules and resulting in poor scheduling results. This study proposes an adaptive real-time workshop scheduling method that models the workpiece scheduling process by analogy with contextual bandits. After several rounds of learning, the contextual bandit model can make scheduling decisions according to the production environment and obtain excellent scheduling results. Simulation experiments verify the effectiveness of the proposed method.
