• Volume 33, Issue 6, 2024 Table of Contents
    • Survey
    • Research Progress of SAR Weak Object Detection Based on Deep Learning

      2024, 33(6):1-15. DOI: 10.15888/j.cnki.csa.009531

      Abstract:Advancements in synthetic aperture radar (SAR) technology have enabled large-scale observation and high-resolution imaging. Consequently, SAR images now contain numerous small objects with weak features, including aircraft, vehicles, tanks, and ships, which are high-value targets in both civilian and military applications. However, accurately detecting these objects poses a significant challenge due to their small size, dense distribution, and variable morphology. Deep learning technology has ushered in a new era of progress in SAR object detection. Researchers have made substantial strides by fine-tuning and optimizing deep learning networks to address the imaging characteristics and detection challenges associated with weak SAR objects. This study provides a comprehensive review of deep learning-based methods for weak object detection in SAR images. The primary focus is on datasets and methods, with a thorough analysis of the principal challenges in SAR weak object detection. The study also summarizes the characteristics and application scenarios of recent detection methods and collates publicly available datasets and common performance evaluation metrics. Finally, it surveys the current application status of SAR weak object detection and offers insights into future development trends.

    • Overview on Quantum Simulator Optimization

      2024, 33(6):16-27. DOI: 10.15888/j.cnki.csa.009559

      Abstract:In recent years, rapidly evolving quantum computing has attracted wide attention. However, quantum hardware remains scarce and noisy, so the study of quantum algorithms and the verification of quantum chips rely on quantum simulators running on classical computers. This study discusses the main simulation methods used by different quantum simulators and explores various optimizations of mainstream full-amplitude state vector simulators and tensor network-based quantum simulators. Finally, the current status and future directions of quantum simulators are summarized.

    • Unsupervised Fire Detection Based on Contrastive Learning and Synthetic Pseudo Anomalies

      2024, 33(6):28-36. DOI: 10.15888/j.cnki.csa.009529

      Abstract:Traditional fire detection methods are mostly based on object detection techniques, which suffer from difficulties in acquiring fire samples and high manual annotation costs. To address this issue, this study proposes an unsupervised fire detection model based on contrastive learning and synthetic pseudo anomalies. A cross-input contrastive learning module is proposed for achieving unsupervised image feature learning. Then, a memory prototype that learns the feature distribution of normal scene images to discriminate fire scenes through feature reconstruction is introduced. Moreover, a method for synthesizing pseudo anomaly fire scenes and an anomaly feature discrimination loss based on Euclidean distance are proposed, making the model more targeted toward fire scenes. Experimental results demonstrate that the proposed method achieves an image-level AUC of 89.86% and 89.56% on the publicly available Fire-Flame-Dataset and Fire-Detection-Image-Dataset, respectively, surpassing mainstream image anomaly detection algorithms such as PatchCore, PANDA, and Mean-Shift.
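      The image-level AUC figures quoted above can be checked from per-image anomaly scores with a rank-based estimator. A minimal sketch (not the authors' code; function and variable names are illustrative):

```python
def image_level_auc(scores, labels):
    """ROC-AUC via the Mann-Whitney statistic: the fraction of
    (anomalous, normal) image pairs in which the anomalous image
    receives the higher anomaly score (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]  # fire / anomaly images
    neg = [s for s, y in zip(scores, labels) if y == 0]  # normal scene images
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

      An AUC of 1.0 means every anomalous image scores above every normal image; 0.5 is chance level.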

    • Image Segmentation Based on Improved Hesitant Fuzzy C-means

      2024, 33(6):37-47. DOI: 10.15888/j.cnki.csa.009530

      Abstract:The hesitant fuzzy C-means (HFCM) clustering algorithm has addressed the uncertainty between different pixel blocks in an image to some extent. However, as its objective function does not contain any local information, it is very sensitive to noise and cannot achieve good segmentation accuracy when the noise is large. This study proposes an image segmentation method based on improved HFCM (IHFCM) to address the above issues. Firstly, the completion method of hesitant fuzzy elements is given, and then a similarity measure between hesitant fuzzy elements is defined. Using the defined similarity measure, the study constructs a novel fuzzy factor and fuses it into the objective function of HFCM. The new fuzzy factor considers not only spatial information in the local window but also the similarity between pixels, balancing the impact of noise while preserving image details. Finally, experimental results on synthesized images, BSDS500 dataset images, and natural images show that the proposed IHFCM algorithm has good robustness to noise and improves segmentation accuracy.
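      For readers unfamiliar with the objective that the hesitant variant extends, the two alternating updates of standard fuzzy C-means can be sketched as follows (plain FCM on scalar intensities; the hesitant fuzzy elements and the paper's fuzzy factor are omitted, and all names are illustrative):

```python
def fcm_memberships(pixels, centers, m=2.0):
    """Standard FCM membership update:
    u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)); a pixel coinciding
    with a center gets a one-hot membership row."""
    u = []
    for x in pixels:
        d = [abs(x - c) for c in centers]
        if 0.0 in d:
            i0 = d.index(0.0)
            u.append([1.0 if j == i0 else 0.0 for j in range(len(centers))])
            continue
        u.append([1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1)) for j in range(len(d)))
                  for i in range(len(d))])
    return u

def fcm_centers(pixels, u, m=2.0):
    """Weighted centroid update: v_i = sum_k u_ik^m x_k / sum_k u_ik^m."""
    return [sum((u[k][i] ** m) * pixels[k] for k in range(len(pixels)))
            / sum(u[k][i] ** m for k in range(len(pixels)))
            for i in range(len(u[0]))]
```

      The IHFCM modification fuses a local-window fuzzy factor into the objective before these updates are derived; the skeleton of the alternation is unchanged.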

    • Inverse Target Interference for Image Data Augmentation

      2024, 33(6):48-57. DOI: 10.15888/j.cnki.csa.009507

      Abstract:Mixed-sample data augmentation methods focus only on the model’s forward representation of the category to which an image belongs while ignoring the reverse determination of whether the image belongs to a specific category. This one-sided description of image categories limits model performance. To address it, this study proposes an image data augmentation method with inverse target interference. To prevent overfitting of the network model, the method first modifies the original image to increase the diversity of backgrounds and target images. Secondly, the idea of reverse learning is adopted so that the network model correctly identifies the category of the original image while fully learning that the filled-in content does not belong to that category, increasing the model’s confidence in identifying the original image’s category. Finally, to verify the method’s effectiveness, the study uses different network models to perform extensive experiments on five datasets including CIFAR-10 and CIFAR-100. Experimental results show that compared with other state-of-the-art data augmentation methods, the proposed method can significantly enhance the model’s learning effect and generalization ability in complex settings.

    • Entity Alignment Integrating Structure and Attribute Attention Mechanism

      2024, 33(6):58-69. DOI: 10.15888/j.cnki.csa.009542

      Abstract:Entity alignment is a key step in fusing knowledge graph data from different sources; its purpose is to determine equivalent entity pairs between different knowledge graphs. Most existing entity alignment methods are based on graph embedding and align entities by considering the structure and attribute information of the graph, but they handle the interaction between the two poorly and ignore the use of relationships and multi-order neighbor information. To solve these problems, this study proposes an entity alignment method based on a fused structural and attribute attention mechanism model (FSAAM). The model first divides the graph data characteristics into attribute and structural channel data and then uses the attribute attention mechanism to learn attribute information. It adds the learning of relationship information to that of structural information and uses the graph attention mechanism to focus on neighbor features beneficial to entity alignment. The Transformer encoder is introduced to better correlate information between entities, and the Highway network is utilized to reduce the impact of noise that may be learned. Finally, the model applies the LS-SVM network to the similarity matrices of the learned structural and attribute channel information to obtain an integrated similarity matrix for entity alignment. The proposed model is verified on three sub-datasets of the public dataset DBP15K. Experimental results show that compared with the best results among the baseline models, its Hits@1 has increased by 2.7%, 4.3%, and 1.7% respectively, and Hits@10 and MRR have also improved, indicating that the model can effectively improve entity alignment accuracy.

    • Colon Polyp Segmentation Fusing Multi-scale Gate Convolution and Window Attention

      2024, 33(6):70-80. DOI: 10.15888/j.cnki.csa.009509

      Abstract:Accurate segmentation of colon polyps is important for removing abnormal tissue and reducing the risk of polyps converting to colon cancer. Current colon polyp segmentation models suffer from high misjudgment rates and low segmentation accuracy on polyp images. To achieve accurate segmentation, this study proposes a colon polyp segmentation model (MGW-Net) combining multi-scale gated convolution and window attention. Firstly, it designs an improved multi-scale gated convolution module (MGCM) to replace the U-Net convolutional block to fully extract colon polyp image information. Secondly, to reduce information loss at the skip connections and make full use of the information at the bottom of the network, the study builds a multi-information fusion enhancement module (MFEM) by combining improved dilated convolution and hybrid enhanced residual window attention to optimize feature fusion at the skip connections. Experimental results on the CVC-ClinicDB and Kvasir-SEG datasets show that the Dice similarity coefficients of MGW-Net are 93.8% and 92.7%, and the mean intersection over union is 89.4% and 87.9%, respectively. Experimental results on the CVC-ColonDB, CVC-300, and ETIS datasets show that MGW-Net has strong generalization performance, which verifies that MGW-Net can effectively improve the accuracy and robustness of colon polyp segmentation.
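      The two segmentation metrics quoted above, the similarity coefficient (Dice) and the mean intersection over union, are computed from binary masks; a minimal sketch on flattened 0/1 masks (names are illustrative):

```python
def dice_and_iou(pred, gt):
    """Dice = 2|P∩G| / (|P| + |G|); IoU = |P∩G| / |P∪G|,
    on flattened binary masks (lists of 0/1)."""
    inter = sum(p and g for p, g in zip(pred, gt))
    p_sum, g_sum = sum(pred), sum(gt)
    union = p_sum + g_sum - inter
    dice = 2 * inter / (p_sum + g_sum) if (p_sum + g_sum) else 1.0
    iou = inter / union if union else 1.0
    return dice, iou
```

      The two metrics are related by IoU = Dice / (2 - Dice), which is a quick consistency check on reported numbers.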

    • Sparse Convolutional Network with Global Context Enhancement for Anti-external Force Damage Detection of Power Grid

      2024, 33(6):81-90. DOI: 10.15888/j.cnki.csa.009514

      Abstract:In the anti-external force damage inspection of transmission lines, current lightweight object detection algorithms deployed at the edge suffer from insufficient detection accuracy and slow inference. To solve these problems, this study proposes Fast-YOLOv5, a sparse convolutional network (SCN) with global context enhancement for anti-external force damage detection of the power grid. Based on the YOLOv5 algorithm, the FasterNet+ network is designed as a new feature extraction network, which maintains detection accuracy, improves the inference speed of the model, and reduces computational complexity. In the bottleneck layer of the algorithm, an ECAFN module with efficient channel attention is designed, which improves the detection effect by adaptively calibrating the feature response in the channel direction, efficiently obtaining cross-channel interactive information and further reducing the number of parameters and the amount of computation. The study then replaces the model's detection layer with the context-enhanced sparse convolutional network SCN, which strengthens foreground features and improves the prediction ability of the model by capturing global context information. The experimental results show that compared with the original model, the accuracy of the improved model is increased by 1.9%, and the detection speed is doubled, reaching 56.2 frames/s. The number of parameters and the amount of computation are reduced by 50% and 53% respectively, which better meets the requirements for efficient detection of transmission lines.

    • Joint Entity and Relation Extraction by Integrating Adversarial Training and Global Pointers

      2024, 33(6):91-98. DOI: 10.15888/j.cnki.csa.009537

      Abstract:Joint entity and relation extraction aims to extract entity relation triples from text and is one of the most important steps in building a knowledge graph. Joint extraction faces issues such as weak information expression, poor generalization ability, entity overlap, and relation redundancy. To address these issues, a joint entity and relation extraction model named RGPNRE is proposed. The pre-trained RoBERTa model is used as the encoder to enhance the model’s information expression capability. Adversarial training is introduced in the training process to improve the model’s generalization ability. A global pointer addresses entity overlap, and relation prediction excludes impossible relations, reducing relation redundancy. Entity and relation extraction experiments on the schema-based Chinese medical information extraction dataset CMeIE show that the final model achieves a 2% improvement in F1 score over the baseline model, with a 10% F1 gain in cases of entity pair overlap and a 1% F1 gain in cases of single entity overlap, indicating that the model can more accurately extract entity relation triples and thereby assist knowledge graph construction. In comparison experiments on sentences containing one to five triples, the F1 score increased by about 2 percentage points on sentences with four triples and by about 1 percentage point on complex sentences with five or more triples, indicating that the model can effectively handle complex sentence scenarios.
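      The F1 figures above are typically computed with exact-match triple evaluation; a minimal sketch of that metric (not the authors' evaluation script; names are illustrative):

```python
def triplet_f1(predicted, gold):
    """Exact-match evaluation for entity-relation triples: a predicted
    (subject, relation, object) triple counts as correct only if all
    three elements match a gold triple exactly."""
    pred_set, gold_set = set(predicted), set(gold)
    tp = len(pred_set & gold_set)                       # true positives
    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1
```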

    • U-Net Based on Residual and Wavelet for Retinal Image Segmentation

      2024, 33(6):99-107. DOI: 10.15888/j.cnki.csa.009501

      Abstract:Retinal vessel segmentation is a common task in medical image segmentation. Retinal vessel images are characterized by small and numerous segmentation targets. Previous networks could effectively extract coarse vessels during segmentation but tended to overlook fine vessels, whose extraction affects network performance to some extent and even the diagnostic results. Therefore, to extract more continuous fine vessels while ensuring accurate extraction of coarse vessels, this study uses a symmetric encoder-decoder network as the base network and a new convolution module, DR-Conv, to prevent overfitting while improving the learning capability of the network. Regarding the information loss caused by the max-pooling layer, the study uses the discrete wavelet transform for image decomposition and the inverse discrete wavelet transform for image reconstruction, and it employs a mixed loss function to combine the characteristics of different loss functions and compensate for the insufficient optimization ability of a single loss function. The network's performance is evaluated on three public retinal vessel datasets and compared with the latest networks, showing better performance of the proposed network.

    • Reservoir Lithology Identification Using Hybrid Model BiLSTM-XGBoost

      2024, 33(6):108-116. DOI: 10.15888/j.cnki.csa.009522

      Abstract:Reservoir lithology classification is the foundation of geological research. Although data-driven machine learning models can effectively identify reservoir lithology, the special nature of well logging data as sequential data makes it difficult for the model to effectively extract the spatial correlation of the data, resulting in limitations in reservoir identification. To address this issue, this study proposes a bidirectional long short-term memory extreme gradient boosting (BiLSTM-XGBoost, BiXGB) model for predicting reservoir lithology by combining bidirectional long short-term memory (BiLSTM) and extreme gradient boosting decision tree (XGBoost). By integrating BiLSTM into the traditional XGBoost, the model significantly enhances the feature extraction capability for well logging data. The BiXGB model utilizes BiLSTM to extract features from well logging data, which are then input into the XGBoost classification model for training and prediction. The BiXGB model achieves an overall prediction accuracy of 91% when applied to a reservoir lithology dataset. To further validate its accuracy and stability, the model is tested on the publicly available UCI Occupancy dataset, achieving an overall prediction accuracy of 93%. Compared to other machine learning models, the BiXGB model accurately classifies sequential data, improving the accuracy of reservoir lithology identification and meeting the practical needs of oil and gas exploration. This provides a new approach for reservoir lithology identification.

    • Acute Lymphocytic Leukemia Classification Based on Attention Residual Network

      2024, 33(6):117-125. DOI: 10.15888/j.cnki.csa.009541

      Abstract:Currently, classification of acute lymphoblastic leukemia (ALL) faces cluttered background information and subtle inter-class differences. Since it is difficult to select key features and reduce background noise in blood sample images, traditional methods struggle to capture important and subtle features and to effectively classify and identify the various blood cell types, which affects the accuracy and reliability of the results. This study proposes a classification model based on ResNeXt50, which uses image enhancement to reduce background noise. The model enhances the perception of multiple scales and context information by improving the atrous pyramid feature extraction method. By adding an improved SA attention mechanism, the model can better focus on and learn the information that most affects the outcome. The model is tested on the public Blood Cells Cancer dataset from Taleqani Hospital in Tehran, Iran, and the accuracy and precision reach 98.39% and 98.33%, respectively. The results show that the model not only has clinical significance and practical value but also provides a new idea for the auxiliary diagnosis of ALL.

    • Visual Recognition System for Farmed Animals Based on UAV Live Broadcast Linkage

      2024, 33(6):126-132. DOI: 10.15888/j.cnki.csa.009543

      Abstract:Efficient recognition of farmed animals is the basis for all kinds of precision breeding on animal husbandry farms, so a corresponding recognition system is essential to support it. The system designed in this study uses a UAV live broadcast linkage method for sample collection and cruise recognition. This method uploads video to the data center in real time and, compared with ordinary UAV shooting, alleviates problems such as small targets and occlusion. On this basis, the study selects the YOLOv7 algorithm model to recognize animal behavior and quantity, and it optimizes and lightweights the YOLOv7 model to enhance recognition accuracy and reduce system load. Finally, the recognition data is output to a standard interface for convenient calls by various precision breeding programs. The system not only adapts to the scene needs of the farm but also takes into account efficient operation. It can provide unified data support for implementing diverse precision breeding on the farm and reduce the cost of repeated design and decentralized management.

    • Batch Reconfiguration Algorithm in Dynamic Flow Scheduling Scenario of Time-sensitive Network

      2024, 33(6):133-142. DOI: 10.15888/j.cnki.csa.009548

      Abstract:Time-sensitive network technologies are widely used in industrial automation. Flow scheduling methods in this field are mainly static or dynamic. Static scheduling computes all flows at once, which saves link and time resources to the greatest extent but suffers from long computation time and a lack of flexibility in handling new flows. Dynamic scheduling computes new flows incrementally, with short computation time but suboptimal resource allocation, resulting in time slot fragmentation. The global flow reconfiguration mechanism can periodically replan all flows in the network to optimize the allocation of link and time resources. However, this mechanism only applies to small networks with few flows; as the number of flows increases, computation time rises sharply, affecting subsequent flows. This study designs a batch reconfiguration algorithm based on an existing dynamic scheduling algorithm. The algorithm introduces a new evaluation indicator, network throughput, and can periodically reconfigure some flows to optimize network resource allocation while meeting the second-level response time requirement of dynamic scheduling. In addition, the algorithm gives selection criteria for reconfigured flows and optimizes flow path selection and transmission start time calculation. Simulation experiments on the original algorithm and the improved algorithm with the batch reconfiguration mechanism show that the improved algorithm can run in large networks with thousands of flows and improves network throughput by 16.5% and the number of successfully scheduled flows by 5.5% while keeping the algorithm's computation time at the second level.

    • Siamese Low-light Video Enhancement Network with Fusion of Local and Global Features

      2024, 33(6):143-152. DOI: 10.15888/j.cnki.csa.009533

      Abstract:Videos captured in low-illumination environments often suffer from low contrast, high noise, and unclear details, which seriously affect computer vision tasks such as object detection and segmentation. Most existing low-light video enhancement methods are built on convolutional neural networks; since convolution cannot fully exploit the long-range dependencies between pixels, the generated videos often suffer from loss of detail and color distortion in some regions. To address these problems, this study proposes a Siamese low-light video enhancement network that couples local and global features. The model obtains local features of video frames through a deformable convolution-based local feature extraction module and designs a lightweight self-attention module to capture the global features of video frames. Finally, the extracted local and global features are fused by a feature fusion module, which guides the model to generate enhanced videos with more realistic colors and details. The experimental results show that the proposed method can effectively improve the brightness of low-light videos and generate videos with richer colors and details. It also outperforms methods proposed in recent years on evaluation metrics such as peak signal-to-noise ratio and structural similarity.

    • Super-resolution Reconstruction of Remote Sensing Images with Cross-scale Hybrid Attention

      2024, 33(6):153-160. DOI: 10.15888/j.cnki.csa.009525

      Abstract:To address the inadequacy of existing remote sensing image super-resolution reconstruction models in long-term feature similarity and multi-scale feature relevance, this study proposes a novel remote sensing image super-resolution reconstruction algorithm based on a cross-scale hybrid attention mechanism. Initially, the study introduces a global layer attention (GLA) mechanism and employs layer-wise attention to weight and merge global features across different levels, thereby modeling the long-range dependency between low-resolution and high-resolution image features. Concurrently, it designs a cross-scale local attention (CSLA) mechanism to identify and integrate local information patches in multi-scale low-resolution feature maps that correspond with high-resolution images, enhancing the model’s ability to restore image details. Finally, the study proposes a local information-aware loss function to guide the image reconstruction process, further improving the visual quality and detail preservation of the reconstructed images. Experiments on the UC-Merced dataset demonstrate that the proposed method outperforms most mainstream methods in terms of average PSNR/SSIM across three magnification factors and exhibits superior quality and detail preservation in visual results.

    • Application of Sample-optimized PPO Algorithm in Single Intersection Signal Control

      2024, 33(6):161-168. DOI: 10.15888/j.cnki.csa.009544

      Abstract:Optimizing the control strategy of traffic signals can improve the efficiency of vehicular traffic on roads and alleviate congestion. To overcome the difficulty that value function-based deep reinforcement learning algorithms have in efficiently optimizing signal control strategies at single intersections, this study develops a sample-optimized method called modified proximal policy optimization (MPPO). The approach improves the quality of model sample selection by maximizing over the agent objective functions of the traditional PPO algorithm, and it employs a multi-dimensional traffic state vector as the model's observation input, enabling it to promptly track and exploit dynamic changes in road traffic conditions. The accuracy and effectiveness of the MPPO algorithm are verified by comparison with value function-based reinforcement learning control methods in the urban traffic micro-simulation software SUMO. Simulation experiments show that, compared with value function-based reinforcement learning control methods, this approach more closely matches real traffic scenarios, significantly accelerates the convergence of cumulative vehicle waiting time, noticeably reduces the average vehicle queue length and waiting time, and effectively improves traffic throughput at the intersection.
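      The abstract does not spell out the MPPO surrogate; for context, the standard per-sample PPO clipped objective that MPPO modifies can be sketched as follows (illustrative names; not the authors' code):

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate for one sample:
    min(r * A, clip(r, 1 - eps, 1 + eps) * A),
    where r is the new/old policy probability ratio and A the
    advantage estimate. Taking the min yields a pessimistic bound
    that discourages large policy updates."""
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)
```

      In training, this quantity is averaged over a batch and maximized (equivalently, its negative is minimized) with respect to the policy parameters.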

    • Simulation Calculation of Pipeline Network Yield Based on Spatio-temporal Graph Convolutional Neural Network

      2024, 33(6):169-176. DOI: 10.15888/j.cnki.csa.009524

      Abstract:In crude oil gathering and transmission pipeline networks, flowmeter measurements deviate substantially, and manual correction in simulation software is cumbersome and poorly adaptive. To solve these problems, this study proposes an adaptive spatio-temporal graph convolutional neural network method to simulate the production of crude oil gathering and transmission pipeline networks. The topology of the pipeline network is constructed with electric submersible pump wells as nodes and oil pipelines as edges. The study uses the graph convolutional neural network to extract the spatial information of well distribution and the temporal convolutional network to obtain the time-series characteristics of the production data, so as to compute accurate production simulation results. Experimental validation is carried out on the crude oil gathering and transmission pipeline network system of an oil field. The results show that the proposed method can accurately calculate the production of each electric pump well in the pipeline network system. Compared with other baseline network models, the error indexes are reduced: the mean absolute error falls to 0.87, the mean absolute percentage error to 4.45%, and the mean square error to 0.84, which proves the validity and accuracy of the proposed method.
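      The three error indexes reported above (mean absolute error, mean absolute percentage error, mean square error) are standard regression metrics; a minimal sketch for checking such figures (names are illustrative):

```python
def regression_errors(pred, true):
    """MAE, MAPE (in percent), and MSE between predicted and
    measured production values; `true` must contain no zeros
    for MAPE to be defined."""
    n = len(pred)
    mae = sum(abs(p - t) for p, t in zip(pred, true)) / n
    mape = 100.0 * sum(abs(p - t) / abs(t) for p, t in zip(pred, true)) / n
    mse = sum((p - t) ** 2 for p, t in zip(pred, true)) / n
    return mae, mape, mse
```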

    • Decision Method for Processes of Parts Machining Features Driven by Data and Knowledge

      2024, 33(6):177-184. DOI: 10.15888/j.cnki.csa.009549

      Abstract:In the process planning stage of parts, the generated process schemes strongly depend on the process knowledge selected and applied by designers. However, because of the many deviations between actual manufacturing logic and the process knowledge selected by designers, the mismatch between the generated process scheme and the actual process has become a concern in the parts manufacturing field. This study proposes a data- and knowledge-driven decision method for the processes of machining features to solve this problem. In this method, an MLP deep learning algorithm based on an attention mechanism mines process knowledge from structured process data and correlates machining features with feature process labels. After data processing, the method is applied to train a neural network model. Verification shows that the method can take the feature process data of parts as input and output the distributions of corresponding feature process labels, providing decision support for generating the process schemes of parts.

    • Change Detection for Buildings in Multitemporal BIT Remote Sensing Images

      2024, 33(6):185-191. DOI: 10.15888/j.cnki.csa.009546

      Abstract:A multi-temporal remote sensing image change detection method based on binary change detection with image transformation (BIT) is proposed to address seasonal and radiometric variations (color discrepancies) between remote sensing images acquired at different times over the same geographic area. The method incorporates remote sensing images from multiple past time points and combines the results of pairwise change detection between the current image and each past image to obtain a stable change detection outcome. This helps mitigate false alarms caused by seasonal and radiometric variations, thereby enhancing detection accuracy. Multiple remote sensing images from different past time points are used to eliminate the influence of non-target building changes, and the pixel difference value of change points is introduced as a regularization term in the loss function, further improving the robustness and reliability of change detection. This study provides a three-temporal example (three images from different time points) and conducts experiments on a remote sensing image dataset of building changes. The experimental results demonstrate that the multi-temporal BIT method outperforms change detection methods that consider only two temporal images.

    • Road Object Detection Based on Historical Information and Improved SimSiam

      2024, 33(6):192-200. DOI: 10.15888/j.cnki.csa.009520

      Abstract:Visual navigation uses visual information in the environment as the navigation basis, and one of its key tasks is object detection. Traditional object detection methods require a large number of annotations and focus only on the image itself, failing to fully utilize the data similarity inherent in visual navigation tasks. To solve this problem, this paper proposes a self-supervised training task based on historical image information. In this method, images taken at the same location at multiple moments are aggregated, the foreground and background are distinguished by information entropy, and the images are augmented and then fed into the simple siamese (SimSiam) self-supervised paradigm for training. In addition, the multi-layer perceptron (MLP) networks in the projection and prediction layers of the SimSiam paradigm are upgraded to a convolutional attention module and a convolution module, and the loss function is modified to combine losses over multi-dimensional vectors, thereby extracting multi-dimensional features from the images. Finally, the model pre-trained by the self-supervised paradigm is used to train models for downstream tasks. Experiments show that the proposed method effectively improves the precision of downstream classification and detection tasks on the processed nuScenes dataset, reaching a Top5 precision of 66.95% on downstream classification and a mean average precision (mAP) of 40.02% on downstream detection.
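      The Top5 figure quoted above is standard top-k accuracy: a prediction counts as correct if the true class appears among the five highest-scoring classes. A minimal sketch (names are illustrative; not the authors' code):

```python
def top5_accuracy(scores, labels):
    """Fraction of samples whose true label is among the five
    classes with the highest predicted scores."""
    hits = 0
    for row, y in zip(scores, labels):
        # indices of the five highest-scoring classes
        top5 = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:5]
        hits += y in top5
    return hits / len(labels)
```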

    • Semantic and Syntactic Dependency Enhanced Span-based Aspect Sentiment Triplet Extraction

      2024, 33(6):201-210. DOI: 10.15888/j.cnki.csa.009551 CSTR:

      Abstract (274) HTML (427) PDF 1.98 M (647) Comment (0) Favorites

      Abstract:This study aims to address the issues of current span-based aspect sentiment triplet extraction models, which ignore part-of-speech and syntactic knowledge and encounter conflicts in triplets. A semantic and syntactic enhanced span-based aspect sentiment triplet extraction model named SSES-SPAN is proposed. Firstly, part-of-speech and syntactic dependency knowledge is introduced into the feature encoder to enable the model to more accurately distinguish aspect and opinion terms in the text and gain a deeper understanding of their relationships. Specifically, for part-of-speech information, a weighted sum approach is employed to fuse part-of-speech contextual representation with sentence contextual representation to obtain semantic enhanced representation, aiding in the precise extraction of aspect and opinion terms. For syntactic dependency information, attention-guided graph convolution networks are used to capture syntactic dependency features and obtain syntactic dependency enhanced representation to handle complex relationships between aspect and opinion terms. Furthermore, considering the lack of a mutual exclusivity guarantee in span-level inputs, an inference strategy is employed to eliminate conflicting triplets. Extensive experiments on benchmark datasets demonstrate that the proposed model outperforms state-of-the-art methods in terms of effectiveness and robustness.
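The conflict-elimination inference step mentioned above can be illustrated with a greedy filter over scored candidate triplets. This is a hypothetical sketch: the tuple layout, the overlap rule, and the greedy order are assumptions, not the paper's exact strategy.

```python
def eliminate_conflicts(triplets):
    """Keep the highest-scoring triplets first and drop any later candidate
    whose aspect span AND opinion span both overlap an already-kept triplet.

    Each triplet: (aspect_span, opinion_span, sentiment, score),
    with spans as (start, end) inclusive token indices.
    """
    def overlaps(a, b):
        return not (a[1] < b[0] or b[1] < a[0])

    kept = []
    for t in sorted(triplets, key=lambda t: t[3], reverse=True):
        aspect, opinion, _, _ = t
        conflict = any(
            overlaps(aspect, k[0]) and overlaps(opinion, k[1]) for k in kept
        )
        if not conflict:
            kept.append(t)
    return kept
```

Because span-level enumeration offers no mutual-exclusivity guarantee, two candidates can claim overlapping spans for the same aspect-opinion pair; a filter of this kind resolves such collisions in favor of the higher-confidence prediction.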

    • Denoising and Phase Picking of Microseismic Waveforms Based on Fully Supervised Learning Using Complex Wavelet Transform

      2024, 33(6):211-222. DOI: 10.15888/j.cnki.csa.009563 CSTR:

      Abstract (198) HTML (366) PDF 3.88 M (574) Comment (0) Favorites

      Abstract:The low signal-to-noise ratio of underground microseismic signals reduces phase-picking accuracy. Current denoising algorithms based on wavelet thresholding generalize poorly and make threshold selection difficult when the signal-to-noise ratio is low. To address this issue, this study investigates a fully supervised learning method based on the complex wavelet transform for microseismic waveform denoising. The proposed method combines the complex wavelet transform with a convolutional autoencoder, designing an encoder-decoder with multiple convolution and deconvolution operations to perform the denoising. To verify its effectiveness, a dataset named Earthquake2023 is first constructed from the Stanford Earthquake dataset for training and testing, on which the method shows good fitting performance and training results. A seismic phase-picking method is then designed on the denoised signals and achieves high picking accuracy. Multiple sets of comparative experiments show that the denoising method increases the peak signal-to-noise ratio of the signal by 16 dB and improves the root mean square error by 24%. Moreover, the first-arrival picking error for P-waves and S-waves is reduced by 0.3 ms compared with STA/LTA.
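For context, the classical wavelet soft-thresholding that the paper identifies as hard to tune can be stated in a few lines; the learned complex-wavelet encoder-decoder is precisely a way to avoid hand-picking `thresh`. This is the textbook baseline, not the paper's network.

```python
import math

def soft_threshold(coeffs, thresh):
    """Classical wavelet soft-thresholding: shrink each wavelet coefficient
    toward zero by `thresh`, zeroing anything below it. Choosing `thresh`
    well for low-SNR signals is the generalization problem noted above."""
    return [math.copysign(max(abs(c) - thresh, 0.0), c) for c in coeffs]
```

In a full pipeline these coefficients would come from (and return through) a wavelet transform; here the shrinkage rule alone is shown.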

    • Degraded Calligraphic Document Binarization Based on Multidimensional Side Window Clustering Segmentation

      2024, 33(6):223-231. DOI: 10.15888/j.cnki.csa.009526 CSTR:

      Abstract (217) HTML (363) PDF 2.55 M (527) Comment (0) Favorites

      Abstract:The distribution of grayscale values in calligraphic document images varies significantly under poor lighting conditions, resulting in lower contrast in low-light areas and degradation of the morphological texture features of the strokes. Traditional methods typically focus on local statistics such as the mean, variance, and entropy, while giving less consideration to morphological texture, rendering them insensitive to the features of low-contrast areas. To address these issues, this study proposes a binarization method for degraded calligraphic documents called clustering segmentation-based side-window filter (CS-SWF). Firstly, the method uses multi-dimensional SWFs to describe pixel blocks with similar morphological features. Then, with multiple correction rules, it uses downsampling to extract low-dimensional information and correct the feature regions. Finally, the clustered blocks in the feature map are classified to obtain the binarization result. The method is compared with existing methods using F-measure (FM), peak signal-to-noise ratio (PSNR), and distance reciprocal distortion (DRD) as indicators. Experimental results on a self-constructed dataset of 100 handwritten degraded document images demonstrate that the proposed method is more stable in low-contrast dark regions and outperforms the comparison algorithms in accuracy and robustness.
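The side-window idea underlying SWF can be illustrated in one dimension: instead of a window centered on the pixel, candidate windows aligned to each side are tried, and the side whose average best matches the pixel wins, so stroke edges are not smeared. This 1D sketch is illustrative only; the paper's filter is multi-dimensional and clustering-based.

```python
def side_window_mean(signal, i, radius=2):
    """1D side-window mean filter at index i: compare the left-aligned and
    right-aligned windows and return the mean of whichever side is closer
    to the center value, preserving step edges."""
    left = signal[max(0, i - radius): i + 1]
    right = signal[i: i + radius + 1]
    means = [sum(w) / len(w) for w in (left, right)]
    return min(means, key=lambda m: abs(m - signal[i]))
```

On a step edge such as `[0, 0, 0, 10, 10, 10]`, a centered mean would blur indices 2 and 3 toward 5, while the side-window choice keeps each pixel on its own side of the edge.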

    • Multi-level Feature Interaction Transformer for Multi-organ Image Segmentation

      2024, 33(6):232-241. DOI: 10.15888/j.cnki.csa.009528 CSTR:

      Abstract (264) HTML (503) PDF 4.37 M (733) Comment (0) Favorites

      Abstract:Clinical diagnosis can be facilitated by multi-organ medical image segmentation. This study proposes a multi-level feature interaction Transformer model to address the weak global feature extraction of CNNs, the weak local feature extraction of Transformers, and the quadratic computational complexity of the Transformer in multi-organ medical image segmentation. The proposed model employs a CNN to extract local features, which are transformed into global features by a Swin Transformer. Multi-level local and global features are generated through down-sampling, and the local and global features at each level interact with and enhance each other. After enhancement, the features at each level are cross-fused by multi-level feature fusion modules. The fused features then pass through up-sampling and segmentation heads to produce segmentation masks. Experiments on the Synapse and ACDC datasets achieve an average dice similarity coefficient (DSC) of 80.16% and an average 95th percentile Hausdorff distance (HD95) of 19.20 mm, respectively, outperforming representative models such as LGNet and RFE-UNet. The proposed model is thus effective for multi-organ medical image segmentation.
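The DSC metric reported above is the standard overlap measure for segmentation masks; a minimal reference implementation for flat binary masks is:

```python
def dice_coefficient(pred, target, eps=1e-7):
    """Dice similarity coefficient (DSC) between two flat 0/1 masks:
    2 * |intersection| / (|pred| + |target|), with eps guarding the
    empty-mask case."""
    inter = sum(p and t for p, t in zip(pred, target))
    return (2.0 * inter + eps) / (sum(pred) + sum(target) + eps)
```

DSC of 1.0 means perfect overlap; HD95, the paper's other metric, instead measures the 95th-percentile boundary distance and so penalizes stray outlier pixels that barely affect DSC.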

    • Optimization of Raft Consensus Algorithm Based on Log Replica

      2024, 33(6):242-250. DOI: 10.15888/j.cnki.csa.009550 CSTR:

      Abstract (184) HTML (422) PDF 1.85 M (567) Comment (0) Favorites

      Abstract:In a distributed storage system based on a three-replica strategy, when a hard disk on a storage node fails, a common approach is to wait for the system's preset timeout; if the faulty disk does not recover within it, recovery of the replicas on that disk begins. The problem with this approach is that while one replica in a three-replica group is faulty, a second disk failure in the same group leaves the system unable to continue providing service or to recover automatically. This study introduces an improved Raft consensus algorithm based on log replicas, namely log replica based Raft (LR-Raft). A log replica has no complete state machine, so it can quickly join the cluster and participate in voting and consensus, enhancing system availability while a disk is faulty; this addresses the unavailability and data loss caused when two of the three replicas fail within a short period. The experimental results indicate that with log replicas introduced into the replica group, LR-Raft significantly reduces read and write latency and substantially improves throughput compared with the original Raft across various workloads.
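The availability argument above comes down to quorum arithmetic: letting lightweight log replicas count as voters keeps a majority reachable while a full replica's disk is down. The sketch below shows only that majority check; the membership layout and function names are assumptions, not LR-Raft's actual protocol code.

```python
def can_commit(acks, full_replicas, log_replicas):
    """Sketch of Raft-style commit: an entry commits once a majority of
    voting members acknowledge it. Log replicas store the log but no state
    machine, so they can rejoin quickly after a failure and keep the
    quorum alive while a full replica is being rebuilt."""
    voters = full_replicas + log_replicas
    acked_voters = set(acks) & set(voters)
    return len(acked_voters) > len(voters) // 2
```

With three full replicas, losing two normally halts progress (1 of 3 is no majority); an extra log replica that can vote restores a reachable majority of the enlarged group.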

    • Construction and Feature Analysis of WUI Fire Causal Factor Network Based on Text Mining

      2024, 33(6):251-258. DOI: 10.15888/j.cnki.csa.009527 CSTR:

      Abstract (211) HTML (348) PDF 1.77 M (623) Comment (0) Favorites

      Abstract:To prevent and reduce the occurrence of WUI fires, this study mines the key causal factors of WUI fires and clarifies the mechanisms linking them. First, causal factors are extracted from WUI fire accident cases with the proposed mining technique, and the Apriori algorithm is used to obtain association rules between the causal factors. Complex network theory is then used to construct the WUI fire causal factor network, calculate its topological parameters, and analyze its characteristics. Finally, the study introduces a risk index for the WUI fire causal chain, mines high-risk edges, and proposes chain-breaking measures. The results show that the WUI fire causal factor network has the small-world property, and that high temperature, strong wind, and drought exert a strong influence on other causal factors. Burning waste, plant fire, emergency response speed, human arson, and strong wind play important roles in the conversion between causal factors and should be controlled more tightly. The highest-risk edge in the network is burning waste → plant fire; this risk chain can be cut off by enacting regulations such as prohibiting unauthorized waste burning, thereby achieving the prevention and active control of WUI fires.
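The Apriori-style rule mining mentioned above can be sketched for single-antecedent rules over causal-factor co-occurrence records. The factor names, thresholds, and restriction to pairs are illustrative; the study mines the full rule set from accident cases.

```python
from itertools import combinations

def mine_rules(transactions, min_support=0.4, min_confidence=0.6):
    """Toy association-rule pass: for each factor pair (a, b), emit a -> b
    when support(a, b) and confidence(a -> b) = support(a, b) / support(a)
    clear the thresholds. transactions: list of sets of factor names."""
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = sorted({i for t in transactions for i in t})
    rules = []
    for a, b in combinations(items, 2):
        for ante, cons in ((a, b), (b, a)):
            sup = support({ante, cons})
            if sup >= min_support:
                conf = sup / support({ante})
                if conf >= min_confidence:
                    rules.append((ante, cons, sup, conf))
    return rules
```

High-confidence rules such as "strong wind → plant fire" then become candidate edges of the causal factor network, whose topology (small-world property, high-risk edges) is analyzed afterwards.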

    • Combining Roberta and Bi-FLASH-SRU for Chinese Event Causality Extraction

      2024, 33(6):259-267. DOI: 10.15888/j.cnki.csa.009539 CSTR:

      Abstract (222) HTML (375) PDF 2.27 M (608) Comment (0) Favorites

      Abstract:Aiming at the problems of excessive table construction and insufficient extraction of table features in existing table-based event relation extraction methods, this study proposes TF-ChineseERE, a Chinese event causality extraction method that combines Roberta and Bi-FLASH-SRU. The method transforms the text into labeled tables through a table-filling strategy that exploits the labeled relationships in the text. The Roberta pre-training model and the bidirectional built-in flash attention simple recurrent unit (Bi-FLASH-SRU) are proposed to obtain subject-object event features. A table-feature recurrent learning module then mines global features, and table decoding finally yields event causality triplets. Experiments on two public datasets in the financial domain show that the F1 values of the proposed method reach 59.2% and 62.5%, respectively, with a faster training speed for the Bi-FLASH-SRU model and fewer filled tables, proving the effectiveness of the method.
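For background, the speed of SRU-family models comes from their "light recurrence": the forget gate depends only on the current input, so the heavy matrix work parallelizes across time steps. The scalar sketch below shows that published SRU recurrence only; Bi-FLASH-SRU's second direction and built-in flash attention block are not reproduced here, and the parameter names are assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sru_cell(xs, wf, bf):
    """Minimal one-direction SRU light recurrence over a scalar sequence:
    f_t = sigmoid(wf * x_t + bf)
    c_t = f_t * c_{t-1} + (1 - f_t) * x_t
    Only the cheap elementwise c_t update is sequential; the gate
    pre-activations for all t can be computed in parallel."""
    c, out = 0.0, []
    for x in xs:
        f = sigmoid(wf * x + bf)
        c = f * c + (1.0 - f) * x
        out.append(c)
    return out
```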

Contact Information
  • 《计算机系统应用》 (Computer Systems & Applications)
  • Founded in 1992
  • Sponsor: Institute of Software, Chinese Academy of Sciences
  • Postal code: 100190
  • Phone: 010-62661041
  • Email: csa (a) iscas.ac.cn
  • Website: http://www.c-s-a.org.cn
  • ISSN 1003-3254
  • CN 11-2854/TP
  • Domestic price: CNY 50
Copyright: Institute of Software, Chinese Academy of Sciences. Beijing ICP No. 05046678-3
Address: 4# South Fourth Street, Zhongguancun, Haidian, Beijing. Postal code: 100190