2024, 33(8):1-17. DOI: 10.15888/j.cnki.csa.009535
Abstract:Adjacent container terminals in the same region typically have similar logistics functions, overlapping cargo hinterlands, severe disorderly competition, and low resource utilization rates. In view of these characteristics, this study focuses on the tactical berth and yard integrated scheduling problem for multiple container terminals (MCT-TBY-IIS) that are managed by the same organization and located adjacent to each other. Based on computational logistics, the MCT-TBY-IIS problem is decomposed into two moderately coupled subproblems, the multi-terminal dynamic and continuous berth allocation problem (MDC-BAP) and the multi-terminal periodic and rolling yard allocation problem (MPR-YAP), by drawing on the multiple knapsack problem and considering berth depth constraints and export containers whose destination terminal can be transferred. Subsequently, the hierarchical nesting-oriented two-stage improved imperialist competitive algorithm (HNO-TSI-ICA) is customized to optimize MCT-TBY-IIS under the guidance of problem-oriented exploration. Lastly, with typical cases of multi-terminal joint operations along the southeast coast of China, a combination of two algorithms is selected and embedded in HNO-TSI-ICA to solve the MCT-TBY-IIS problem: the prosperity and destruction-oriented improved imperialist competitive algorithm with double assimilation, and the binary imperialist competitive algorithm for the 0-1 knapsack problem. Moreover, the structure of the target cost of the storage yard operation subsystem is stable and unaffected by the port load or the length of the planning period. Notably, the horizontal transportation cost of containers in the export container area contributes the most to the sub-target cost of storage yard operations, maintaining a stable proportion of 83%. The modeling and optimization of MCT-TBY-IIS show that the multi-terminal cooperative operation mode has great potential to help neighboring terminals under the same organization reduce costs, increase efficiency, and improve the utilization rate of core resources.
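As a point of reference for the knapsack-based decomposition and the binary imperialist competitive algorithm mentioned above, the sketch below solves the classical 0-1 knapsack subproblem by dynamic programming. It is purely illustrative: the paper tackles the knapsack structure metaheuristically, and the toy weights, values, and capacity here are hypothetical.

```python
def knapsack_01(weights, values, capacity):
    """Classical 0-1 knapsack via dynamic programming.

    Only illustrates the subproblem structure; the abstract above solves it
    with a binary imperialist competitive algorithm instead of exact DP.
    """
    # dp[c] = best total value achievable within capacity c
    dp = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # iterate capacities downwards so each item is used at most once
        for c in range(capacity, w - 1, -1):
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]

# toy usage: three cargo batches competing for a shared capacity of 8
print(knapsack_01([3, 4, 5], [4, 5, 6], 8))  # -> 10
```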
2024, 33(8):18-29. DOI: 10.15888/j.cnki.csa.009582
Abstract:Clustering algorithms based on the minimum spanning tree (MST) can identify clusters with arbitrary shapes, but they struggle to construct the minimum spanning tree efficiently and to identify invalid edges, and they are easily influenced by noise points. This study proposes an MST clustering algorithm based on local density peaks and label propagation (DPMST), which combines the ability of the density peaks clustering algorithm to find local density peaks and exclude noise points with the MST algorithm. DPMST adopts a shared-neighbors-based distance between local density peaks and uses the neighborhood information between them to construct minimum spanning trees efficiently and identify invalid edges, enabling the discovery of clusters with complex structures. Label propagation is used to strengthen strong labels and weaken weak labels so as to refine wrong labels, which improves the quality of clustering results, especially for border-region points, and helps reveal complex manifolds. Experimental results on several synthetic and real-world datasets show that DPMST outperforms the classical clustering algorithms DPC, MST, K-means, DBSCAN, AP, SC, and BIRCH.
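For orientation, the following minimal sketch shows the generic build-tree-then-cut-edges idea that MST clustering rests on; it works on raw points with Euclidean distances, whereas DPMST builds the tree over local density peaks with a shared-neighbors-based distance and refines labels by propagation, so this is only a baseline illustration with hypothetical parameters.

```python
import numpy as np
from scipy.sparse.csgraph import connected_components, minimum_spanning_tree
from scipy.spatial.distance import pdist, squareform

def mst_cluster(points, n_clusters):
    """Baseline MST clustering: build an MST over pairwise distances and
    cut the heaviest edges; the remaining components are the clusters."""
    dist = squareform(pdist(points))             # dense pairwise distances
    mst = minimum_spanning_tree(dist).toarray()  # MST edge weights (0 = no edge)
    # remove the (n_clusters - 1) heaviest edges to split the tree
    for flat_idx in np.argsort(mst, axis=None)[::-1][:n_clusters - 1]:
        mst[np.unravel_index(flat_idx, mst.shape)] = 0
    _, labels = connected_components(mst, directed=False)
    return labels

pts = np.vstack([np.random.randn(50, 2), np.random.randn(50, 2) + 8])
print(np.bincount(mst_cluster(pts, 2)))  # two groups of roughly 50 points each
```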
GUI Xiang-Quan , TIAN Shi-Wen , LI Li , LYU Rui
2024, 33(8):30-39. DOI: 10.15888/j.cnki.csa.009621
Abstract:Gansu painted pottery has the most complete spatial and temporal sequence among all kinds of painted pottery cultures in China. However, no study has been specifically designed for the style transfer of Gansu painted pottery. To promote the excellent traditional Chinese culture, this research constructs the Gansu painted pottery dataset and proposes a geometric style transfer method. The method generates a neural distortion field that deforms Gansu painted pottery into the geometric style of the target object while maintaining the texture of the pottery. Two modules are incorporated into the network structure, namely position embedding and feature enhancement, to improve the quality of feature encoding. Shape consistency loss and a smooth regularization term are introduced to the loss function to prevent the details of the painted pottery from mutating and improve the deformation effect. The experimental results show that the model can achieve large-scale geometric style transfer between Gansu painted pottery and objects from different classes, maintaining the details of the pottery and providing new visual experiences.
GAO Qin-Qin , LING Song-Song , YU Jie , YU Xu
2024, 33(8):40-50. DOI: 10.15888/j.cnki.csa.009558
Abstract:Cross-project defect prediction (CPDP) has emerged as a crucial research area in software engineering and data mining. Using defective code from other data-rich projects to build prediction models alleviates the problem of insufficient data during model construction. However, the distribution difference between the code files of source and target projects results in poor cross-project prediction. Most studies adopt domain adaptation methods to solve this problem, but existing methods only focus on the influence of either the conditional or the marginal distribution on domain adaptation, ignoring the dynamics between the two, and they fail to choose appropriate pseudo-labels. Considering these two aspects, this study proposes a cross-project defect prediction method based on dynamic distribution alignment and pseudo-label learning (DPLD). Specifically, the proposed method reduces the marginal and conditional distribution differences between projects in the domain alignment and category alignment modules, respectively, by means of adversarial domain adaptation, and it dynamically and quantitatively characterizes the relative importance of the two distributions using dynamic distribution factors. Furthermore, this study proposes a pseudo-label learning method that improves the accuracy of pseudo-labels, used as stand-ins for real labels, through the geometric similarity between data points. Experiments conducted on the PROMISE dataset show that DPLD achieves average improvements of 22.98% and 15.21% in terms of F-measure and AUC, respectively. These results demonstrate the effectiveness of DPLD in reducing distribution differences between projects and improving the performance of cross-project defect prediction.
SUN Zi-Xiang , QIAN Xu-Wei , YANG Ping , HANG Ren-Long
2024, 33(8):51-59. DOI: 10.15888/j.cnki.csa.009593
Abstract:Semantic segmentation of remote sensing images plays a crucial role in environmental detection, land cover classification, and urban planning. Convolutional neural networks and their improved models are the mainstream methods for semantic segmentation of remote sensing images. However, these methods focus more on learning local contextual features and cannot effectively model the global distribution relationship between different objects, thereby restricting the segmentation performance of the model. To address this issue, this study constructs a global semantic relationship learning module based on convolutional neural networks, which fully learns the symbiotic relationships between different objects and effectively enhances the model’s representation ability. In addition, a multi-scale relationship learning module is constructed to integrate global semantic relationships of different scales, given the scale differences of the objects to be segmented in the same scene. To evaluate the performance of the model, sufficient experiments are conducted on two commonly used remote sensing image datasets, Vaihingen and Potsdam. The experimental results show that the proposed method can achieve higher segmentation performance than existing models based on convolutional neural networks.
2024, 33(8):60-67. DOI: 10.15888/j.cnki.csa.009573
Abstract:The lattice Boltzmann method (LBM) is a computational fluid dynamics (CFD) method based on molecular kinetic theory. Improving the parallel computing capability of LBM is an important research topic in high-performance computing. Targeting the SW26010Pro processor, this study achieves multi-level parallelism of LBM through optimization methods such as domain decomposition, data reconstruction, double buffering, and vectorization. With these optimizations, a problem with 56 million grid points is tested, and the results show that, compared with message passing interface (MPI) level parallelism, the average speedup of the collision process reaches 61.737 and that of the streaming process reaches 17.3. In addition, strong scaling tests are conducted on the lid-driven cavity flow case with a grid size of 1200×1200×1200: relative to a baseline of 62 000 computing cores, the parallel efficiency at one million cores exceeds 60.5%.
LI Yuan-Lu , WANG Jian-Xiang , FAN Xiao-Ting , ZHOU Xin , WU Ming-Xuan
2024, 33(8):68-77. DOI: 10.15888/j.cnki.csa.009596
Abstract:Effective segmentation of clouds and their shadows is a critical issue in remote sensing image processing and plays a significant role in surface feature extraction, climate detection, atmospheric correction, and more. However, clouds and cloud shadows in remote sensing images have complex characteristics: they are diverse and irregularly distributed, and their boundary information is fuzzy and easily disturbed by the background, which makes accurate feature extraction challenging; moreover, few networks have been designed specifically for this task. To address these issues, this study proposes a dual-path network combining the vision Transformer (ViT) and D-UNet. The network is divided into two branches: one is a convolutional local feature extraction branch based on the dilated convolution module of D-UNet, which introduces multi-scale atrous spatial pyramid pooling (ASPP) to extract multi-dimensional features; the other comprehends global context semantics through the vision Transformer, enhancing feature extraction. Finally, upsampling is performed through a feature fusion decoder. The model achieves superior performance on both a self-built cloud and cloud shadow dataset and the publicly available HRC_WHU dataset, leading the second-best model by 0.52% and 0.44% in MIoU and reaching 92.05% and 85.37%, respectively.
WANG Zhong , WANG Jing-Yu , YU Hao-Ran , XU Wen , LIANG Hong-Tao
2024, 33(8):78-89. DOI: 10.15888/j.cnki.csa.009590
Abstract:The knowledge tracing task aims to track students' knowledge states accurately in real time and predict their future performance by analyzing their historical learning data. This study proposes a deep memory network knowledge tracing model incorporating knowledge point relationships (HRGKT) to address the fact that current research neglects the complex higher-order relationships among the knowledge points covered by questions. Firstly, HRGKT uses a knowledge point relationship graph to define the relationship information between nodes, which represents the rich information among knowledge points, and a graph attention network (GAT) is used to obtain the higher-order relationships between them. Secondly, since forgetting occurs during learning, HRGKT considers four factors affecting knowledge forgetting to track students' knowledge states more accurately. Finally, experimental comparisons on real online education datasets show that HRGKT traces students' knowledge mastery more accurately and achieves better prediction performance than current knowledge tracing models.
ZHANG Cheng , LIU Yan , SONG Hui-Hui
2024, 33(8):90-97. DOI: 10.15888/j.cnki.csa.009595
Abstract:The task of camouflaged object detection involves locating and identifying camouflaged objects in complex scenes. While deep neural network-based methods have been applied to this task, many of them struggle to fully utilize multi-level features of the target for extracting rich semantic information in complex scenes with interference, often relying solely on fixed-size features to identify camouflaged objects. To address this challenge, this study proposes a camouflaged object detection network based on multi-scale and neighbor-level feature fusion. This network comprises two innovative designs: the multi-scale feature perception module and the two-stage neighbor-level interaction module. The former aims to capture rich local-global contrast information in complex scenes by combining multi-scale features. The latter integrates features from adjacent layers to exploit cross-layer correlations and transfer valuable contextual information from the encoder to the decoder network. The proposed method has been evaluated on three public datasets: CHAMELEON, CAMO-Test, and COD10K-Test, and compared with the current mainstream methods. The experimental results demonstrate that the proposed method outperforms the current mainstream methods, achieving excellent performance across all metrics.
ZHANG Deng-Fan , YUAN Yi-Lin , YANG Fan , LI Zi-Chen
2024, 33(8):98-107. DOI: 10.15888/j.cnki.csa.009589
Abstract:This study addresses the issues of group user authorization management and integrity verification for shared medical data. First, to prevent group users from overstepping their authority, authorization identifiers are introduced. Medical data owners use authorization identifiers to allocate different access rights to group users according to user identities, and the mathematical construction of the authorization identifiers effectively ensures that they cannot be forged. Second, to record revoked users and deprive them of access rights, a revoked user list based on a skip list is introduced. As a skip list supports fast lookup and insertion, the overhead of revoking a user is only O(log n). Afterward, the concrete process and mathematical design of shared data integrity verification are improved. Finally, security analysis and simulation experiments prove the security and efficiency of the scheme.
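To make the O(log n) revocation cost concrete, here is a minimal, self-contained skip list holding revoked user IDs; it illustrates the data structure only, not the paper's scheme, and all identifiers are hypothetical.

```python
import random

class SkipList:
    """Minimal skip list for a revoked-user list: expected O(log n)
    lookup and insertion, matching the revocation overhead cited above."""
    MAX_LEVEL, P = 16, 0.5

    class _Node:
        def __init__(self, key, level):
            self.key = key
            self.forward = [None] * level  # next node at each level

    def __init__(self):
        self.head = self._Node(None, self.MAX_LEVEL)
        self.level = 1

    def _random_level(self):
        lvl = 1
        while random.random() < self.P and lvl < self.MAX_LEVEL:
            lvl += 1
        return lvl

    def insert(self, key):
        update = [self.head] * self.MAX_LEVEL  # rightmost node before key, per level
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = self._Node(key, lvl)
        for i in range(lvl):  # splice the new node into each of its levels
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def contains(self, key):
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

revoked = SkipList()
for uid in [42, 7, 19]:
    revoked.insert(uid)
print(revoked.contains(19), revoked.contains(8))  # True False
```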
WANG Yao-Kun , FU Xiao-Wei , LI Xi , XU Wei
2024, 33(8):108-114. DOI: 10.15888/j.cnki.csa.009575
Abstract:Image segmentation of surface defects on solid oxide fuel cells (SOFCs) is of great significance for the quality inspection of monolithic SOFCs. Aiming at the blurred edges and complex backgrounds of monolithic SOFC surface defect images, this study proposes a self-attention fusion method for SOFC surface defect image segmentation. Firstly, a multi-channel self-attention module is proposed to enhance inter-channel correlation and improve channel representation. Secondly, a multi-scale attention fusion module is utilized to further improve the network's ability to extract defect features at different scales. Finally, a triplet joint loss function is proposed to supervise the training process. Experiments show that the proposed method can effectively extract surface defects of monolithic SOFCs while improving segmentation performance.
2024, 33(8):115-122. DOI: 10.15888/j.cnki.csa.009601
Abstract:Sparse mobile crowdsensing (MCS) is an emerging paradigm that collects data from a subset of sensing areas and then infers data from other areas. However, there is a shortage or uneven distribution of workers when sparse MCS is applied. Therefore, with a limited budget, it is important to prioritize the involvement of the more important workers in data collection. Additionally, many sparse MCS applications require timely data. Consequently, this study considers data freshness, with age of information (AoI) serving as a freshness metric. To address these challenges, a simplified AoI-aware sensing and inference (SASI) framework is proposed in this study. This framework aims to optimize AoI and inference accuracy by selecting suitable workers for data collection under budget constraints and accurately capturing spatiotemporal relationships in sensed data for inference. Moreover, limited budgets and worker availability may result in a reduced volume of data. Thus, methods for streamlining data inference models are also proposed to enhance inference efficiency. Experiments have substantiated the superiority of this framework in practice.
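As a reminder of the freshness metric mentioned above, the snippet below computes the age of information of a sensing cell in the usual way, as the time elapsed since the generation of the freshest update delivered so far; the function and its inputs are illustrative and not part of the SASI framework.

```python
def age_of_information(now, delivered_generation_times):
    """AoI of a sensing cell: time since the generation timestamp of the
    freshest update delivered so far (infinite if nothing was delivered)."""
    if not delivered_generation_times:
        return float("inf")
    return now - max(delivered_generation_times)

# a cell was sensed at t = 2 and t = 5; at t = 9 its data is 4 time units old
print(age_of_information(9, [2, 5]))  # -> 4
```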
XIA Jing-Ming , DAI Ru-Chen , TAN Ling
2024, 33(8):123-131. DOI: 10.15888/j.cnki.csa.009627
Abstract:This study proposes a deep learning model for short-term precipitation forecasting, called MSF-Net, to address the limitations of traditional methods. This model integrates multi-source data, including GPM historical precipitation data, ERA5 meteorological data, radar data, and DEM data. A meteorological feature extraction module is employed to learn the meteorological features of the multi-source data. An attention fusion prediction module is used to achieve feature fusion and short-term precipitation forecasting. The precipitation forecasting results of MSF-Net are compared with those of various artificial intelligence methods. Experimental results indicate that MSF-Net achieves optimal threat score (TS) and bias score (Bias). This suggests that it can enhance the effectiveness of data-driven precipitation forecasting within a 6 h prediction horizon.
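The two verification scores reported above are standard contingency-table metrics for precipitation forecasts; the sketch below computes them from predicted and observed rainfall at an assumed rain/no-rain threshold (the threshold and sample values are hypothetical).

```python
import numpy as np

def ts_and_bias(pred, obs, threshold=0.1):
    """Threat score (TS, also called CSI) and frequency bias from a binary
    rain/no-rain contingency table at the given threshold (mm)."""
    p, o = pred >= threshold, obs >= threshold
    hits = np.sum(p & o)
    misses = np.sum(~p & o)
    false_alarms = np.sum(p & ~o)
    ts = hits / (hits + misses + false_alarms)       # perfect forecast: 1
    bias = (hits + false_alarms) / (hits + misses)   # > 1 means over-forecasting
    return float(ts), float(bias)

pred = np.array([0.0, 0.3, 1.2, 0.0, 0.5])
obs = np.array([0.0, 0.2, 0.0, 0.4, 0.6])
print(ts_and_bias(pred, obs))  # -> (0.5, 1.0)
```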
SUN Hao , SHUAI Hui , XU Xiang , LIU Qing-Shan
2024, 33(8):132-144. DOI: 10.15888/j.cnki.csa.009574
Abstract:As point cloud acquisition technology develops and the demand for 3D applications increases, real-world scenarios require point cloud analysis networks to be continuously and dynamically updated with streaming data. This study proposes a dual feature enhancement method for class-incremental 3D point cloud object learning, which adapts point cloud object classification, through incremental learning, to scenarios where objects of new categories keep emerging in newly acquired data. It introduces a discriminative local enhancement module and a knowledge injection network to alleviate the new-class bias problem in class-incremental learning by exploiting the characteristics of point cloud data and old-class information. Specifically, the discriminative local enhancement module characterizes the varied local structures of 3D point cloud objects by perceiving expressive local features; the importance weight of each local structure is then obtained from its global information, enhancing the perception of discriminative local features and improving the separability of new- and old-class features. Furthermore, the knowledge injection network injects old knowledge from the old model into the feature learning process of the new model, and the enhanced hybrid features effectively mitigate the new-class bias caused by the lack of old-class information. Under incremental learning settings on the 3D point cloud datasets ModelNet40, ScanObjectNN, ScanNet, and ShapeNet, extensive experiments show that, compared with existing state-of-the-art methods, the proposed method improves the average incremental accuracy by 2.03%, 2.18%, 1.65%, and 1.28% on the four datasets, respectively.
DING Yu-Chen , XU Jian-Jun , CUI Wen-Quan
2024, 33(8):145-154. DOI: 10.15888/j.cnki.csa.009606
Abstract:Implicit feedback data plays a crucial role in recommender systems, but it often suffers from sparsity and from biases such as exposure bias and conformity bias. Existing debiasing methods tend to address only one type of bias, which limits personalized recommendation effectiveness, or require an expensive unbiased dataset as auxiliary information for removing multiple biases. To address this issue, a collaborative filtering recommendation algorithm designed for sparse implicit feedback data, which removes exposure bias and conformity bias simultaneously, is proposed. The algorithm uses the proposed dual inverse propensity weighting method and a contrastive learning auxiliary task to remove the two biases contained in the implicit feedback data, which is fed into dual-tower autoencoders so that the complete algorithm can estimate users' preference probabilities for items. Experimental results demonstrate that the proposed algorithm outperforms comparative algorithms in terms of normalized discounted cumulative gain (NDCG@K), mean average precision (MAP@K), and recall (Recall@K) on publicly available unbiased datasets such as Coat and Yahoo!R3.
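As background for the weighting idea above, the snippet below shows a standard single-weight inverse-propensity-scored loss on implicit feedback: each observed interaction's error is re-weighted by the inverse of its exposure propensity so that rarely exposed items are not under-counted. The paper's dual variant, which corrects exposure and conformity bias separately, is more involved, and all names and values here are illustrative.

```python
import numpy as np

def ips_weighted_loss(clicks, preds, propensities, eps=1e-6):
    """Inverse-propensity-scored squared-error loss on implicit feedback:
    observed pairs (click = 1) are up-weighted by 1 / propensity."""
    weights = clicks / np.clip(propensities, eps, None)
    per_pair = (preds - 1.0) ** 2          # error against assumed relevance 1
    return float(np.mean(weights * per_pair))

clicks = np.array([1.0, 0.0, 1.0, 1.0])        # observed implicit feedback
preds = np.array([0.9, 0.2, 0.6, 0.8])         # model's preference estimates
props = np.array([0.9, 0.5, 0.3, 0.6])         # estimated exposure propensities
print(round(ips_weighted_loss(clicks, preds, props), 4))  # -> 0.1528
```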
ZHU Jia-Jia , YANG Xue-Zhi , LIANG Hong-Bo , YANG Xiang-Yu
2024, 33(8):155-165. DOI: 10.15888/j.cnki.csa.009521
Abstract:Synthetic aperture radar (SAR) and optical image fusion aims to leverage the complementary imaging characteristics of satellite sensors to generate more comprehensive geomorphological information. However, owing to the heterogeneous data distributions of the individual sensors and the differences in their imaging physics, existing network models often exhibit low accuracy during fusion. This study proposes DNAP-Fusion, a novel SAR and optical image fusion network with dual non-local attention perception. The proposed method uses a dual non-local perceptual attention module to extract structural information from SAR images and texture details from optical images within a multi-level image pyramid of gradually decreasing spatial scale and then fuses their complementary features in both the spatial and channel dimensions. The fused features are subsequently injected into the upsampled optical image through image reconstruction to produce the final fusion result. Additionally, before network training, image encapsulation decisions are employed to enhance the commonality between objects in SAR and optical images of the same scene. Qualitative and quantitative experiments demonstrate that the proposed method outperforms state-of-the-art (SOTA) multisensor fusion methods, with a correlation coefficient (CC) of 0.9906 and a peak signal-to-noise ratio (PSNR) of 32.1560 dB among the objective evaluation indices. Moreover, the proposed method effectively fuses the complementary features of SAR and optical images, offering a valuable approach for improving the accuracy and effectiveness of remote sensing image fusion.
WANG Yan-Gen , CHEN Fei , CHEN Quan
2024, 33(8):166-175. DOI: 10.15888/j.cnki.csa.009571
Abstract:Because fine-grained images exhibit small inter-class differences and large intra-class differences, the key to fine-grained image classification is finding subtle differences between categories. Recent Vision Transformer-based networks mostly focus on mining the most prominent discriminative region features in images, which raises two problems. Firstly, such networks ignore classification clues in other discriminative regions and thus easily confuse similar categories. Secondly, they ignore the structural relationships of images, resulting in inaccurate extraction of category features. To solve these problems, this study proposes two modules: dynamic adaptive modulation and structural relationship learning. The dynamic adaptive modulation module forces the network to search for multiple discriminative regions, and the structural relationship learning module then constructs structural relationships between the discriminative regions. Finally, a graph convolutional network fuses the semantic and structural information to obtain the predicted classification results. The proposed method achieves test accuracies of 92.9% and 93.0% on the CUB-200-2011 and NA-Birds datasets, respectively, which is superior to existing state-of-the-art networks.
LI Chen-Wei , MO Zi-Peng , ZHAO Meng-Fei
2024, 33(8):176-186. DOI: 10.15888/j.cnki.csa.009599
Abstract:With the development of GPS positioning technology and mobile Internet, various location-based services (LBS) applications have accumulated a large amount of spatio-textual data with location and text markup. These data are widely used in location selection decision-making scenarios such as marketing and urban planning. The goal of spatio-textual location selection is to mine the optimal locations from a given candidate set to build new facilities to influence the largest number of spatio-textual objects, such as people or vehicles, where the closer the spatial location and the more similar the text, the greater the influence. However, existing solutions not only fail to consider prevalent peer competition in real life but also ignore user evaluation factors for facilities. To make more reasonable location selection decisions in a peer competition environment combined with user ratings, this study proposes a more rational spatio-textual location selection problem, CoSTUR. To solve the limitation in traditional models where objects can only be influenced by a single facility, a threshold that makes a trade-off between the certainty and quantity of facility influence on objects is introduced, which also models the real-world situation in which multiple facilities could simultaneously influence a specific user. Based on the classical competitive equalization model, quantification of competition among facilities with different ratings is achieved. To reduce the high computational cost for large volumes of data, a novel spatio-textual index structure, TaR-tree, is constructed and two pruning strategies based on influence range are designed with a combination of thresholds to achieve two branch-and-bound solutions for spatial connectivity and range queries. Experimental results on real and synthetic datasets demonstrate that the computational efficiency can be improved by nearly one order of magnitude compared to baseline algorithms, verifying the effectiveness of the proposed method.
LI Zhi-Jie , YANG Sheng-Jie , LI Chang-Hua , ZHANG Jie , DONG Wei , JIE Jun
2024, 33(8):187-195. DOI: 10.15888/j.cnki.csa.009591
Abstract:Ancient Chinese texts are rich in historical and cultural information, and studying entity relation extraction for such texts and constructing the corresponding knowledge graphs play an important role in cultural inheritance. Given the large number of rare Chinese characters, semantic fuzziness, and ambiguity in ancient Chinese texts, a joint entity relation extraction model based on the BERT-ancient-Chinese pre-trained model (JEBAC) is proposed. First, the BERT-ancient-Chinese pre-trained model is integrated with a BiLSTM neural network and an attention mechanism (BACBA) to identify all subject and object entities in a sentence, providing a basis for the joint extraction of relations and object entities. Next, the normalized coding vector of the subject entity is added to the embedding vector of the whole sentence to better capture the semantic features of the subject entity in the sentence. Finally, combining the sentence vector carrying subject-entity features with the prompt information of the object entity, BACBA jointly extracts the relations and object entities in the sentence to obtain all triples (subject entity, relation, object entity). The proposed model is compared with existing methods on the Chinese entity relation extraction dataset DuIE2.0 and on C-CLUE, a small-sample classical Chinese entity relation extraction dataset from CCKS 2021. Experimental results show that the proposed method achieves better extraction performance, with F1 values of 79.2% and 55.5%, respectively.
LIU Pan-Pan , AN Dian-Long , FENG Yan
2024, 33(8):196-204. DOI: 10.15888/j.cnki.csa.009613
Abstract:In computer vision, Transformer-based image segmentation models need a large amount of image data to achieve their best performance, yet medical image data are very scarce compared with natural images. Convolution, with its stronger inductive bias, is better suited to medical images. To combine the long-range representation learning of the Transformer with the inductive bias of CNNs, a residual ConvNeXt module is designed in this research to mimic the design of the Transformer block. The module, composed of depthwise convolution and point-wise convolution, extracts feature information while greatly reducing the number of parameters, and the receptive field and feature channels are effectively scaled and expanded to enrich the feature information. In addition, an asymmetric 3D U-shaped network called ASUNet is proposed for brain tumor image segmentation. In the asymmetric U-shaped structure, the output features of the last two encoders are connected by residual connections to expand the number of channels, and deep supervision is used during upsampling to promote the recovery of semantic information. Experimental results show that the Dice scores of ET, WT, and TC reach 77.08%, 90.83%, and 83.41% on the BraTS 2020 dataset and 75.63%, 90.45%, and 84.21% on the FeTS 2021 dataset, respectively. Comparative experiments show that ASUNet is fully competitive with Transformer-based models in accuracy while retaining the simplicity and efficiency of standard convolutional neural networks.
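A rough sketch of a depthwise-plus-pointwise residual block in the ConvNeXt spirit is shown below for orientation; the 3D layout, kernel size, normalization, and expansion ratio are assumptions for illustration, not the actual ASUNet configuration.

```python
import torch
import torch.nn as nn

class ConvNeXtLikeBlock3d(nn.Module):
    """Residual block of a depthwise convolution followed by point-wise
    convolutions, loosely mimicking the module described above."""
    def __init__(self, channels):
        super().__init__()
        self.dw = nn.Conv3d(channels, channels, kernel_size=7, padding=3,
                            groups=channels)          # depthwise: large receptive field
        self.norm = nn.InstanceNorm3d(channels)
        self.pw1 = nn.Conv3d(channels, 4 * channels, kernel_size=1)  # expand
        self.act = nn.GELU()
        self.pw2 = nn.Conv3d(4 * channels, channels, kernel_size=1)  # project back

    def forward(self, x):
        return x + self.pw2(self.act(self.pw1(self.norm(self.dw(x)))))

block = ConvNeXtLikeBlock3d(16)
print(block(torch.randn(1, 16, 8, 32, 32)).shape)  # torch.Size([1, 16, 8, 32, 32])
```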
HE Ke-Cai , XU Lin , JIANG Jin-Kang , TAO Yu-Chuan , WANG Xue-Yuan
2024, 33(8):205-213. DOI: 10.15888/j.cnki.csa.009594
Abstract:In solid tumor oncology, DNA amplification often appears as diffraction-limited blobs on fluorescence microscopy images of interphase nuclei processed with fluorescence in situ hybridization (FISH). Imaging conditions limit image quality, resulting in a low signal-to-noise ratio, serious background interference, and interference from non-blob structures. Designing suitable blob detection methods to provide objective and quantitative data helps doctors diagnose cancer. The proposed algorithm first uses three-layer wavelet multiscale summation to denoise the fluorescence image, then uses a multiscale Laplacian of Gaussian operator to enhance blob areas, and finally suppresses non-blob areas with unilateral second-order Gaussian kernels in four directions to complete blob detection. Experimental results show that, on 83 images in a self-built database, the average F-score reaches 0.96 and the average running time is less than 0.5 s.
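The core enhancement step, multiscale Laplacian-of-Gaussian filtering, can be sketched as follows; the wavelet denoising and the directional second-order Gaussian suppression stages of the full algorithm are omitted, and the scales and threshold are illustrative.

```python
import numpy as np
from scipy.ndimage import gaussian_laplace, maximum_filter

def multiscale_log_blobs(image, sigmas=(1.0, 2.0, 3.0), thresh=0.05):
    """Keep, per pixel, the strongest scale-normalized LoG response across
    scales, then report local maxima above a threshold as blob candidates."""
    responses = np.stack([-(s ** 2) * gaussian_laplace(image, s) for s in sigmas])
    best = responses.max(axis=0)                      # bright blobs give positive peaks
    peaks = (best == maximum_filter(best, size=5)) & (best > thresh)
    return np.argwhere(peaks)                         # (row, col) candidates

img = np.zeros((64, 64))
img[30:34, 30:34] = 1.0                               # one synthetic bright blob
print(multiscale_log_blobs(img))                      # coordinates near (31, 31)
```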
2024, 33(8):214-221. DOI: 10.15888/j.cnki.csa.009572
Abstract:The original artificial fish swarm algorithm (AFSA) has weak global search ability and poor robustness and easily falls into local extrema. To address these problems, an adaptive and differential mutation artificial fish swarm algorithm (ADMAFSA) is proposed. Firstly, it uses an adaptive visual field and step length strategy to improve the fine search ability of individuals in better areas of the population and enhance optimization accuracy. Secondly, to explore potentially better areas, an opposition-based learning mechanism is introduced into the random behavior of the artificial fish swarm, giving the algorithm better global search ability and helping it avoid premature convergence. Finally, inspired by the differential evolution algorithm, a mutation operation is applied to poorly performing artificial fish to increase the diversity of the fish swarm and reduce the possibility of falling into local extrema. To validate the performance of the improved algorithm, it is tested on six benchmark functions and eight CEC2019 functions. The experimental results indicate that, compared with other AFSA variants and recent intelligent algorithms, ADMAFSA achieves improvements in optimization accuracy and robustness. Furthermore, its effectiveness is further demonstrated on a gear train design problem.
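The opposition-based learning mechanism mentioned above admits a very small sketch: mirror a candidate inside the search bounds and keep the better of the pair. The objective, bounds, and acceptance rule below are illustrative rather than the paper's exact formulation.

```python
import numpy as np

def opposition_based_restart(position, lower, upper, fitness):
    """Evaluate the point mirrored inside the bounds (x' = lower + upper - x)
    and keep whichever of the pair has the lower fitness."""
    opposite = lower + upper - position
    return position if fitness(position) <= fitness(opposite) else opposite

sphere = lambda x: float(np.sum(x ** 2))         # toy objective to minimize
lower, upper = np.full(3, 0.0), np.full(3, 10.0)
x = np.array([9.0, 8.0, 9.5])                    # a fish stuck near the upper bound
print(opposition_based_restart(x, lower, upper, sphere))  # keeps the mirror [1, 2, 0.5]
```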
DUAN Wei , GAO Chen-Qiang , LI Peng-Cheng , ZHU Chang-Jie
2024, 33(8):222-230. DOI: 10.15888/j.cnki.csa.009562
Abstract:The malicious use of facial recognition technology may lead to personal information leakage and poses a significant threat to individual privacy. Safeguarding facial privacy through universal adversarial attacks therefore has important research significance. However, existing universal adversarial attack algorithms mainly target image classification tasks; when applied to face recognition models, they often suffer from low attack success rates and noticeable perturbations. To address these challenges, this study proposes a universal adversarial attack method for face recognition based on commonality gradients. The method optimizes a universal adversarial perturbation from the gradient shared across the adversarial perturbations of multiple face images and uses a dominant feature loss to improve the attack capability of the perturbation; combined with a multi-stage training strategy, it balances attack effectiveness and visual quality. Experiments on public datasets show that the method outperforms methods such as Cos-UAP and SGA in attack performance on face recognition models, and the generated adversarial samples have better visual quality, indicating the effectiveness of the proposed method.
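A simplified view of optimizing one perturbation from gradients pooled over many images is sketched below in PyTorch. It assumes a classification-style model for brevity (face recognition models usually output embeddings, in which case a feature-similarity loss would replace cross-entropy), and it omits the dominant feature loss and multi-stage training of the actual method; all names and hyperparameters are illustrative.

```python
import torch
import torch.nn.functional as F

def common_gradient_uap(model, images, labels, epsilon=8 / 255, steps=50, lr=0.01):
    """Sign-gradient loop for one universal perturbation: the gradient is
    computed on the whole batch at once, i.e. averaged over all images."""
    delta = torch.zeros_like(images[0], requires_grad=True)  # one shared perturbation
    for _ in range(steps):
        loss = F.cross_entropy(model(images + delta), labels)
        grad = torch.autograd.grad(loss, delta)[0]            # gradient common to the batch
        with torch.no_grad():
            delta += lr * grad.sign()                # ascend: make recognition fail
            delta.clamp_(-epsilon, epsilon)          # keep the perturbation imperceptible
    return delta.detach()
```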
LI Xin , WANG Xue-Zhen , HONG Jin-Sheng , ZHONG Jing , SHI Peng
2024, 33(8):231-239. DOI: 10.15888/j.cnki.csa.009577
Abstract:In recent years, image segmentation based on convolutional neural networks (CNNs) has been applied widely, and great progress has been made in feature extraction. However, as convolutional layers grow deeper, the receptive field keeps enlarging, which causes the model to lose local feature information and degrades performance. A graph convolutional network (GCN), which processes information on graph-structured data, preserves local information even as the layers deepen. This study combines CNN-based U-Net feature extraction (a symmetric fully convolutional network) with GCN-based image segmentation to extract global and local, shallow and deep multi-scale feature sets for multimodal glioma MR sequence image segmentation. The process has two stages. First, U-Net extracts features from multimodal brain glioma MR sequence images, with multiple pooling layers realizing multi-scale feature extraction and upsampling for feature fusion, where the bottom layers output lower-level features and the top layers output more abstract high-level features. Second, the feature maps obtained by U-Net are converted into the graph-structured data required by the GCN through neighborhood dilation and sparsification, turning the image segmentation problem into a graph node classification problem, and the graph nodes are finally classified by cosine similarity. Experiments achieve a segmentation accuracy of 0.996 and a sensitivity of 0.892 on the public BraTS 2018 database. Compared with other deep learning models, the method fuses multi-scale features and uses the GCN to establish topological connections between high- and low-level features, ensuring that local information is not lost and achieving better segmentation results, which meets the needs of clinical glioma MR image analysis and thereby effectively improves the diagnostic accuracy of gliomas.
2024, 33(8):240-249. DOI: 10.15888/j.cnki.csa.009629
Abstract:To solve the flow shop scheduling problem with limited buffers and machine processing gears (FSSP_LBMPG), this research establishes a mathematical programming model for green flow shops with limited buffers. The model minimizes two objectives: the maximum completion time and the processing energy consumption. With buffer capacity as a constraint, processing speed and energy consumption are coordinated by reasonably selecting machine processing gears. Based on the characteristics of the problem model, an improved dandelion optimization algorithm (IDOA) is proposed. According to the characteristics of the scheduling problem, the algorithm first designs a double-layer real-valued encoding mechanism for DOA to represent solutions, and an initialization mechanism is introduced to improve the quality and efficiency of the initial solutions. During iteration, a real-valued crossover strategy and a variable neighborhood search strategy are designed to compensate for the poor local search ability of the original dandelion algorithm and to enhance the exploitation capability of the improved algorithm. Comparative experiments on the designed cases show that the improvements effectively enhance the performance of the original algorithm, verifying the effectiveness and robustness of the improved algorithm.
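For orientation, the basic completion-time recursion of a permutation flow shop, without the limited buffers and machine gears that the paper's model adds, can be sketched as follows; the job order and processing times are hypothetical.

```python
def flowshop_makespan(job_order, proc_times):
    """Makespan of a permutation flow shop via the recursion
    C[j][m] = max(C[j-1][m], C[j][m-1]) + p[j][m]."""
    n_machines = len(proc_times[job_order[0]])
    completion = [0.0] * n_machines    # completion time of the last scheduled job per machine
    for job in job_order:
        prev = 0.0                     # completion of this job on the previous machine
        for m in range(n_machines):
            prev = max(completion[m], prev) + proc_times[job][m]
            completion[m] = prev
    return completion[-1]

# two machines, three jobs, processing times per machine
times = {0: [3, 2], 1: [2, 4], 2: [4, 1]}
print(flowshop_makespan([1, 0, 2], times))  # -> 10
```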
LIANG An-Yuan , XIAO Xue-Zhong
2024, 33(8):250-256. DOI: 10.15888/j.cnki.csa.009602
Abstract:In 3D human pose estimation, the complex topology formed by the connections between human joints presents a challenge. Modeling this structure with a graph convolutional network effectively captures the connections between local joints. Although non-adjacent joints lack direct physical connections, Transformer encoders establish contextual relationships between them, which is crucial for better pose inference because human motion and pose are governed by biomechanical constraints and by the synergistic interaction of joints. Balancing model performance against the number of parameters is particularly important for large models. To tackle these challenges, a multi-layer spatial feature fusion network model (MLSFFN) based on graph convolution and the Transformer is designed, which fuses local and global spatial features effectively with a relatively small number of parameters. Experimental results show that the proposed method achieves a mean per joint position error (MPJPE) of 49.9 mm on the Human3.6M dataset with only 2.1M parameters. Moreover, the model demonstrates robust generalization capability.
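A single symmetrically normalized graph-convolution layer over the skeleton graph, the kind of building block such models rely on, can be sketched as follows; the joint graph, feature sizes, and activation are illustrative and not taken from MLSFFN.

```python
import numpy as np

def gcn_layer(X, A, W):
    """One graph-convolution layer: ReLU(D^{-1/2} (A + I) D^{-1/2} X W),
    where A is the joint adjacency matrix and X holds per-joint features."""
    A_hat = A + np.eye(A.shape[0])                  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))   # inverse square-root degrees
    A_norm = A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(A_norm @ X @ W, 0.0)          # aggregate neighbors, project, ReLU

# toy 3-joint chain (hip - knee - ankle) with 2-D inputs and 4-D outputs
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X, W = np.random.randn(3, 2), np.random.randn(2, 4)
print(gcn_layer(X, A, W).shape)  # (3, 4)
```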
YE Ming-Wei , TANG Jia , GUO Yan , WU Gui-Xing
2024, 33(8):257-263. DOI: 10.15888/j.cnki.csa.009586
Abstract:While natural language generation (NLG)-based large language models, represented by ChatGPT, perform well in various natural language processing tasks, their performance in sequence labeling tasks such as named entity recognition is somewhat inferior to that of deep learning models based on bidirectional encoder representations from Transformers (BERT). To address this issue, this study first transforms the Chinese named entity recognition problem into a machine reading comprehension problem and proposes a new named entity recognition method based on in-context learning and fine-tuning, which enables NLG-based language models to achieve good results in named entity recognition without changing the pre-trained parameters of the base model. Additionally, since named entities are generated by the model rather than classified from the original text, there are no boundary issues. To verify the effectiveness of the new framework, experiments are conducted on several Chinese named entity recognition datasets. On the Resume and Weibo datasets, the F1 scores reach 96.04% and 67.87%, respectively, a gain of 0.4 and 2.7 percentage points over state-of-the-art models, confirming that the new framework can effectively utilize the text generation strengths of NLG-based language models to complete named entity recognition tasks.