LI Rui-Xin , CAI Zhao-Xin , WANG Bing-Bing , PAN Jia-Hui
2021, 30(2):1-11. DOI: 10.15888/j.cnki.csa.007777
Abstract:Continuous emotion recognition based on multimodal physiological data plays an important role in many fields. However, it needs more physiological data to train emotion recognition models due to the lack of subjects’ data and subjectivity of emotion, and it is largely affected by homologous subjects’ data. In this study, we propose multiple emotion recognition methods based on facial expressions and EEG. Regarding the modality of facial images, we propose a multi-task convolutional neural network trained by transfer learning to avoid over-fitting induced by small datasets of facial images. With respect to the modality of EEG, we propose two emotion recognition models. The first is a subject-dependent model based on a support vector machine, which achieves high accuracy when the validation and training data are homogeneous. The second is a cross-subject model for reducing the impact of the individual variation and non-stationarity of EEG. It is based on a long short-term memory network and performs stably when the validation and training data are heterogeneous. To improve the accuracy of emotion recognition for homogeneous data, we propose two methods for decision-level fusion of multimodal emotion prediction: weight enumeration and adaptive boost. According to the experiments, when the validation and training data are homogeneous, the best average accuracies of the multimodal emotion recognition models in the arousal and valence dimensions are 74.23% and 80.30%, respectively; when the validation and training data are heterogeneous, the accuracies of the cross-subject model in the arousal and valence dimensions are 58.65% and 51.70%, respectively.
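The weight enumeration idea named in this abstract can be illustrated with a minimal sketch. It assumes that each modality outputs a per-sample probability of the positive class and that fusion is a convex combination of the two; the function names, the 0.5 decision threshold, the step count, and the toy data are all hypothetical and are not taken from the paper.

```python
def fuse(p_face, p_eeg, w):
    """Weighted average of two per-sample positive-class probabilities."""
    return [w * a + (1.0 - w) * b for a, b in zip(p_face, p_eeg)]


def accuracy(probs, labels, threshold=0.5):
    """Fraction of samples whose thresholded probability matches the label."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, labels))
    return hits / len(labels)


def enumerate_weight(p_face, p_eeg, labels, steps=100):
    """Try w = 0, 1/steps, ..., 1 and keep the first weight with the best accuracy."""
    best_w, best_acc = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        acc = accuracy(fuse(p_face, p_eeg, w), labels)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc


# Toy validation data (made up for illustration only).
p_face = [0.9, 0.4, 0.2, 0.7]
p_eeg = [0.3, 0.8, 0.1, 0.2]
labels = [1, 1, 0, 0]
best_w, best_acc = enumerate_weight(p_face, p_eeg, labels)
print(best_w, best_acc)
```

In this sketch the weight is tuned on held-out data, so the fused accuracy can never fall below that of the better single modality on the same data, because w = 0 and w = 1 are both in the search grid.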
DING Xiao-Qian , XIANG Yong , LI Xi-Wang , WU Yi
2021, 30(2):12-19. DOI: 10.15888/j.cnki.csa.007520
Abstract:With the widespread application of information technologies in Industrial Control Systems (ICSs), ICSs have gradually transformed from closed systems to open and interconnected ones, encountering challenges to network security. This paper elaborates on the current situation of information security in ICS through security events. Then, it focuses on the ICS’s architecture and the differences between ICS information security and traditional network security. Moreover, it systematically analyzes the proceedings of the ICS & SCADA Cyber Security Research 2018 (ICS-CSR 2018). Besides, it classifies and examines the proposed security solutions regarding the system architecture and communication protocols. Finally, drawing on the current solutions and in response to actual requirements, this paper summarizes three key directions: network attack models, the ICS’s simulation platforms, and the non-technical Human-Machine Interface (HMI) technology.
CHEN Ying-Rou , TIAN Qiu-Hong , YANG Hui-Min , LIANG Qing-Long , BAO Jia-Xin
2021, 30(2):20-27. DOI: 10.15888/j.cnki.csa.007748
Abstract:Static hand gesture recognition based on multi-feature weighted fusion is proposed to solve the problems of singularity and omission in convolutional neural network for feature extraction. Firstly, the Fourier and Hu moments of the segmented gesture image are extracted and fused as the local features. Besides, a dual-channel convolutional neural network is designed to extract the deep features of the gesture image, which are further treated by dimensionality reduction by principal component analysis. Secondly, the extracted local and deep features are weighted and fused as effective description for hand gesture recognition. Finally, gesture images are classified with Softmax classifier. Experimental results verify the proposed method, and the recognition accuracy reaches over 99% on the image dataset.
CAO Jin-Bo , PAN Hai-Peng , ZHANG Yi-Bo
2021, 30(2):28-34. DOI: 10.15888/j.cnki.csa.007601
Abstract:A trajectory tracking control scheme based on an improved nonlinear disturbance observer is presented to address the position and attitude errors caused by inaccurate modeling and vulnerability to external disturbances of wall-climbing robots. First, a kinematic controller is designed through back stepping control to provide reference centroid velocity and angular velocity for dynamic control of robots. Secondly, an improved nonlinear disturbance observer serves as a feed forward controller to estimate modeling errors and external disturbances of the dynamic model, ensuring exponential convergence of disturbance errors. Finally, a sliding mode controller is designed based on the dynamic model with an interference observer. The scheme compensates for external disturbances quickly and its stability is proven by Lyapunov’s theorem. The simulation results demonstrate that the control scheme performs well in avoiding modeling errors and external interferences.
2021, 30(2):35-42. DOI: 10.15888/j.cnki.csa.007769
Abstract:Singular Value Decomposition (SVD) is adopted for image compression of the data matrix to obtain an optimal compression ratio and a clear compressed image. The principle of SVD and its application to compressing images are elaborated. Two methods for obtaining the better number of eigenvalues are proposed including the ratio threshold of eigenvalue number and the ratio threshold of eigenvalue sum. The experiments reveal that when the ratio threshold of eigenvalue number is 0.1, a clear image is obtained with the compression ratio of 5.99. When the ratio threshold of eigenvalue sum is 0.85, a clear image is also acquired with the compression ratio for PNG images of 7.89 and that for JPG images of 5.92. Case study indicates that the first 1% of eigenvalues represent more data characteristics. When the ratio threshold of eigenvalue number is determined, the compression ratios for PNG and JPG images are identical. When the ratio threshold of the eigenvalue sum is determined, the compression ratio for PNG images is higher than that for JPG images. The method for obtaining the eigenvalue number according to the ratio threshold of eigenvalue sum is more universal. It can be applied to solving alpha channel redundancy and setting a unified ratio threshold of eigenvalue sum for large-scale image compression.
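The two selection rules described in this abstract can be sketched as follows. The sketch assumes that the "eigenvalues" are the singular values of the image matrix sorted in descending order, and that the compression ratio compares the original rows × cols entries with the k(rows + cols + 1) numbers kept by a rank-k truncation; the paper may define the ratio differently, and the function names and sample values are hypothetical.

```python
import math


def k_by_count_ratio(singular_values, ratio):
    """Keep a fixed fraction of the singular values (at least one)."""
    return max(1, math.ceil(ratio * len(singular_values)))


def k_by_sum_ratio(singular_values, ratio):
    """Keep the smallest k whose cumulative sum reaches ratio * total sum."""
    total = sum(singular_values)
    running = 0.0
    for k, s in enumerate(singular_values, start=1):
        running += s
        if running >= ratio * total:
            return k
    return len(singular_values)


def compression_ratio(rows, cols, k):
    """Original entry count divided by the storage of a rank-k truncation."""
    return (rows * cols) / (k * (rows + cols + 1))


# Made-up singular values, sorted in descending order, for illustration only.
sigma = [50.0, 20.0, 10.0, 8.0, 5.0, 3.0, 2.0, 1.0, 0.6, 0.4]
k_count = k_by_count_ratio(sigma, 0.1)
k_sum = k_by_sum_ratio(sigma, 0.85)
print(k_count, k_sum, compression_ratio(100, 100, k_sum))
```

The sum-based rule adapts k to how quickly the singular values decay, which is one plausible reading of why the abstract calls it the more universal of the two.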
WU Yuan-Hao , XIE Da , WANG Yu
2021, 30(2):43-51. DOI: 10.15888/j.cnki.csa.007798
Abstract:In the age of globalization, with the increase in personnel mobility, the transmission modes and speed of epidemic diseases go far beyond those in the past. The basic principles and methods of building a personnel monitoring and management system based on blockchain technology are expounded. When a major infectious disease breaks out, people from the epidemic area and confirmed or suspected patients are monitored in real time. The system can query the history of personnel mobility, find close contacts, and search for the sources of infection, which can support isolation measures for epidemic prevention and control.
XU Si-Te , HUANG Zi-Shuo , MA Zhen-Kai , WU Bin , LIU Jia-Xing , SHENG Tao , DAI Rui-Ming , LUO Li , ZHANG Tian-Tian
2021, 30(2):52-62. DOI: 10.15888/j.cnki.csa.007791
Abstract:A performance evaluation system is developed for medical insurance departments to objectively supervise the medicare-designated pharmacy. According to the standard set of the medicare-designated pharmacy, Vue.js and Django Rest Framework are selected to build an evaluation system in Node.js and REST styles. The system has evaluated the performance of 881 medicare-designated pharmacies in Shanghai. It builds a database of Shanghai’s pharmacy performance and issues a customized assessment report. The system makes the management and assessment of the medicare-designated pharmacy more efficient, reducing the supervision cycle from 4 weeks to 1 week and administrative manpower by 128 person-months/year. As an information-based tool, it can support pharmacy managers, expert assessment teams and administrative supervisors and help medical insurance services better benefit people.
MA Zhi-Rou , MA Xin-Yu , LIU Jie , YE Dan
2021, 30(2):63-69. DOI: 10.15888/j.cnki.csa.007639
Abstract:During the court’s case-trial-enforcement process, the parties involved in multiple cases or the facts of the case are sometimes the same, namely “one person with multiple cases”, resulting in the waste and unreasonable allocation of judicial resources. Then a risk pre-warning system of one person with multiple cases based on deep learning is designed and implemented. This system is based on deep learning technology and massive judgment documents. Case identification and similarity measure for legal documents are proposed by modeling the vector representation of the case text, and one person with multiple cases is associated with the legal business rules, providing the risk pre-warning report. This system can offer technical support for judicial resource coordination, supporting courts to try cases fairly and efficiently.
JIA Jin-Ming , SONG Huan-Sheng , LIANG Hao-Xiang , YUN Xu , DAI Zhe
2021, 30(2):70-76. DOI: 10.15888/j.cnki.csa.007731
Abstract:People’s behavior in industrial inspections is closely bound up with safe production, and the design of inspection monitoring methods has become a hot research area. Aiming at the problem that current monitoring and analysis of inspection depend on manual judgment with low accuracy, this study proposes a monitoring and analysis system for industrial inspection based on machine vision. Firstly, the people in the video stream are detected by the YOLOv3 network. According to the detection results, in-scene interferences are removed by behavior analysis to obtain real behavior of inspectors. Finally, the inspection process is evaluated based on the behavior, and then results are stored in the database and posted to the web page. Videos with multiple monitoring perspectives are used for experiments. Results demonstrate that the system proposed in this study can accurately detect inspectors and analyze their behavior in complex environments, while achieving real-time processing. This result can serve as a reference for the intelligent monitoring of industrial inspection.
2021, 30(2):77-82. DOI: 10.15888/j.cnki.csa.007778
Abstract:With fine parallel processing capability and flexibility, Field Programmable Gate Array (FPGA) has been widely applied to hardware-accelerated computation, especially in Convolution Neural Networks (CNN). However, traditional image convolution on FPGA has limited modular design and large space overhead. This study builds a general experiment platform of image convolution for hardware acceleration. Through the modular design, it greatly improves the flexibility in image convolution for different convolution kernels. In addition, an image batch-processing system is adopted to enable memory sharing due to data repetition, reducing the need for storage space. Experimental results present that the proposed platform boasts a better reconfigurable architecture in terms of modular design. Besides, the complexity of BRAM only increases linearly with higher parallelism, which has the advantage of reducing power consumption.
CAI Zhao-Xin , LI Rui-Xin , DAI Yi-Dan , PAN Jia-Hui
2021, 30(2):83-88. DOI: 10.15888/j.cnki.csa.007818
Abstract:The textile industry is a pillar of China’s economy. In the supply chain of fabrics, fabric defects largely affect their quality. At present, textile production enterprises mainly rely on people’s eyes to check the quality of fabrics. This method has high cost of employment, low efficiency, and a high rate of missed and false detections, failing to meet the requirement for high-speed industrial development. This study enhances the data to address the uneven number of categories in the data set. Categories of fabric defects are identified by the Faster RCNN model, and the RPN network in the Faster RCNN model is improved regarding the small concentrated defect targets in the dataset. In addition, this study develops a fabric defect recognition system to display categories of fabric defects and precisely locate the defects. Through the comparison of experimental results, the average detection rate of this method is 79.3%, which is 5.75% higher than that of Fast RCNN.
HU Zhong-Qi , ZHAO Feng , LI Xue-Qiang , YAN Jun , JIANG Fan
2021, 30(2):89-96. DOI: 10.15888/j.cnki.csa.007795
Abstract:With regard to the hidden security and reliability problems of dam monitoring data, this study proposes a system architecture composed of sensor cloud and the Unmanned Aerial Vehicle (UAV) cloud based on BlockChain (BC) technology for monitoring dams to ensure the security and reliability of data. The sensor cloud provides various sensing data, while the UAV cloud collects these data and transmits them to the Dam Monitoring Center (DMC), and the BC technology is applied to ensuring integrity, authenticity, security and traceability of data. The analysis shows that the proposed system has good scalability, which can effectively guarantee the reliable sources of monitoring data and the security of data transmission and prevent potential data attacks. Finally, this study assesses the work performance by evaluating the delay rate of data transfer. The simulation results reveal that the delay rate of the designed system is positively correlated with the probability of generating events and revisit time, but negatively correlated with the alarm interval time, and it presents a higher payment success rate.
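How a blockchain-style structure can protect the integrity and traceability of monitoring data is illustrated by the hash chain below. This is a generic sketch, not the architecture of the paper: the block layout, the field names, and the sample sensor readings are hypothetical, and consensus, signatures, and networking are omitted.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder predecessor hash for the first block


def block_hash(index, prev_hash, payload):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    body = json.dumps(
        {"index": index, "prev": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a block that commits to the hash of its predecessor."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({
        "index": index,
        "prev": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    })


def verify_chain(chain):
    """Recompute every hash and check every link; False if anything changed."""
    prev_hash = GENESIS_PREV
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical sensor readings, for illustration only.
chain = []
append_block(chain, {"sensor": "s1", "water_level_m": 41.2})
append_block(chain, {"sensor": "s2", "seepage_l_per_s": 0.8})
print(verify_chain(chain))
```

Because each block stores the hash of the previous one, altering any stored reading invalidates that block and breaks the chain check unless every later hash is recomputed as well.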
KE Qiang , CHEN Zhi-Hua , HU Jing-Wei , CHEN Huan-Jun , PI Zhi-Wang , ZHANG Han , ZHOU Xue-Song
2021, 30(2):97-102. DOI: 10.15888/j.cnki.csa.007796
Abstract:At present, the power grid contains a large amount of multi-source information data, but because the data types are numerous and high-dimensional, it is difficult to achieve effective data retrieval. According to the data structure of an actual power operation system and a sample analysis of multi-source databases, an improved decision tree algorithm based on mutual information is proposed as the kernel of data mining, and a parallel processing architecture suitable for power systems is put forward, which can retrieve multi-source data quickly and efficiently. The information is directly extracted from the original data of multi-source information according to the representative feature subset during searching. The index information is judged and sorted to form the decision tree model, and multi-source data is extracted simultaneously through Spark MapReduce Python data decomposition and parallel retrieval, so as to shorten the retrieval time. Simulation and verification on a regional power grid database show that the method can realize multi-source heterogeneous information extraction of the power distribution network, effectively avoid duplicate data, and meet the requirements of online engineering decision.
GAO Yi-Yuan , LI Hao , GE Rong-Cun , LI Jia-Qi , LI Chuang , SHE Jiang-Feng
2021, 30(2):103-109. DOI: 10.15888/j.cnki.csa.007773
Abstract:Vegetation, terrain, and man-made buildings are the primary parts of a 3D geographic scene. Trees, as the main component of vegetation, are more difficult to present in a 3D scene than other objects because of their complex natural forms. A geometry-based model presents 3D tree details much better but places a huge computational burden on real-time scene rendering. A simplified texture-based tree model generally has better rendering performance than the geometry-based model, but with a rather rough 3D effect. Reaching a reasonable compromise between rendering performance and visual 3D presentation is a big challenge and has become a research focus in the field of virtual geographic environments. This study proposes a self-adapting billboard to present a tree in a 3D scene. Firstly, a set of tree pictures is taken from different viewpoints or pre-constructed from different perspectives of a 3D tree model. Then, a billboard is taken as the geometric carrier of a dynamic texture. As the viewpoint moves, the direction from the viewpoint to the target tree is used as the determining parameter to pick the best-matching picture of the tree from the pre-constructed picture set and replace the current texture on the billboard. This makes the rendering effect of trees much more similar to real observations from the corresponding viewing angle. Finally, a 3D scene with many trees is rendered in a browser based on WebGL. The results show that the rendering efficiency and effect are both acceptable.
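The picture selection step of the self-adapting billboard can be sketched as a nearest-angle lookup. The sketch assumes that the pre-constructed pictures were captured at evenly spaced horizontal view directions, with picture i captured while looking along azimuth i × 360° / N, and that only the horizontal direction matters; the function name and this capture convention are hypothetical.

```python
import math


def best_picture_index(viewpoint, tree, num_pictures):
    """Index of the picture whose capture azimuth is closest to the current
    horizontal direction from the viewpoint to the tree."""
    dx = tree[0] - viewpoint[0]
    dy = tree[1] - viewpoint[1]
    azimuth = math.atan2(dy, dx) % (2.0 * math.pi)  # normalise to [0, 2*pi)
    step = 2.0 * math.pi / num_pictures
    # Round to the nearest capture direction and wrap around past the last one.
    return int(round(azimuth / step)) % num_pictures


# With 8 pictures, looking along +x selects picture 0 and along +y picture 2.
print(best_picture_index((0.0, 0.0), (10.0, 0.0), 8))
print(best_picture_index((0.0, 0.0), (0.0, 10.0), 8))
```

A renderer would call such a lookup whenever the viewpoint moves and swap the billboard texture only when the returned index changes, which keeps per-frame work small.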
QI Xin-Jiu , HUANG Feng-Hua , LI Chuan-Lin , LIN Guo-Bin , CAO Jun
2021, 30(2):110-116. DOI: 10.15888/j.cnki.csa.007633
Abstract:This study takes the campus of Yango University in Mawei District, Fuzhou City as an example to study the feasibility and accuracy of 3D modeling of UAV tilt photography in complex terrain. It adopts DJI Matrice series of UAVs carrying cloud eye series of five-lens cameras to collect tilt image data in the survey area. A real-time kinematic instrument connects Qianxun CORS account to complete the acquisition of control points in the survey area. With ContextCapture, a real-world modeling software of Bentley company, the data collected from the external operation is processed for internal operation. Consequently, the high-resolution 3D scene model, Digital Surface Model (DSM) and True Digital Orthophoto Map (TDOM) of the campus are obtained, and the accuracy of the 3D model is analyzed. To ensure the accuracy of the model, the experiment sets more control points, conducts sub-regional aerial surveys, and increases the heading and side overlap. Experimental results reveal that the mean square error of the plane position and the mean square error of the elevation of the 3D real scene model are less than 5 cm, which can meet the requirements of large-scale measurement and provide important data support for the secondary development of the 3D real scene model of the campus in the later stage.
ZHANG Lu-Lu , GUANG Xiao-Li , LIU Ji-Zeng
2021, 30(2):117-124. DOI: 10.15888/j.cnki.csa.007768
Abstract:At present, Vehicle Ad hoc NETworks (VANETs) have received great attention in the automotive industry and research areas, especially in terms of users’ privacy protection. As an extension of cloud computing, fog computing is more reactive since it can effectively reduce network latency. Compared with cloud computing, fog computing decreases the volume of data sent to and from the cloud and further lowers the security risk. Since Ciphertext Policy Attribute-Based Encryption (CP-ABE) is suitable for fine-grained access control of data stored on the cloud and searchable encryption based on keywords, users can quickly find interesting data stored on cloud servers and not leak information about any search keywords. For this reason, we propose attribute-based searchable encryption and attribute update in this study, which is a combination of attribute-based encryption schemes and keyword-search encryption schemes. The proposed scheme supports user attribute update, and illegal vehicle users will not have access to the stored data, thereby realizing the cancellation of illegal vehicle users. At the same time, it also achieves the communication among the vehicles, fog, and cloud. In the communication process, it outsources part of the encryption and decryption calculations to the fog nodes, reducing the users’ calculation cost. In addition, performance analysis shows that the proposed scheme has better advantages in both functionality and computational overhead.
2021, 30(2):125-131. DOI: 10.15888/j.cnki.csa.007530
Abstract:The common strategy adopted by most existing multi-label learning algorithms in model training is to predict all the label categories based on the same label feature set. However, this idea does not take into account the label-specific features of each label, which are very helpful for distinguishing other categories of labels and describing itself in the label space. For this reason, an improved ML-KNN algorithm based on label-specific features, i.e., MLF-KNN, is proposed in this study. Different from the previous multi-label algorithms which directly operate on the original training data set, the algorithm proposed in this study first builds features for each category of label by preprocessing the training data set. Then, it further constructs and optimizes L1-norm in the obtained label space, thus introducing the correlation between labels. Finally, the improved algorithm is applied for prediction and classification. The experimental results show that the improved algorithm has achieved certain advantages compared with the ML-KNN algorithm and other three multi-label learning algorithms on the public image and yeast data sets.
HUI Fei , TANG Shu-Yu , XING Mei-Hua , GUO Jing
2021, 30(2):132-139. DOI: 10.15888/j.cnki.csa.007821
Abstract:LTE-V2X can provide vehicles with reliable and efficient communication capabilities. In LTE-V2X, Vehicle User Equipment (VUE) has two communication modes: centralized scheduling (Mode 3) and distributed scheduling (Mode 4). Aiming at the packet collision caused by excessive VUEs in vehicle-dense scenes, in this study, we classify the V2X applications according to such indices as communication delay and distance and propose an autonomous vehicle scheduling algorithm. In this way, different V2X applications can choose an appropriate communication mode according to their own performance requirements, alleviating the pressure of V2X resource allocation in the dense scenes. Furthermore, we build a Matlab simulation platform to verify the performance of the proposed algorithm. The simulation results show that the proposed algorithm can keep the Packet Delivery Ratio (PDR) above 0.6 in dense scenes, superior to the mainstream resource scheduling algorithms in a single communication mode.
JIANG Zong-Li , TIAN Cong-Cong
2021, 30(2):140-146. DOI: 10.15888/j.cnki.csa.007762
Abstract:The traditional collaborative filtering algorithms do not fully consider the user-item interaction information and face problems such as data sparseness or cold start, which results in inaccurate results of the recommendation system. For this reason, we propose a new recommendation algorithm, which is a collaborative filtering algorithm of graph neural network based on fusion meta-path. To be specific, first, the user-item historical interactions are embedded by a bipartite graph and the high-level features of users and items are obtained through multi-layer neural network propagation. Then, latent semantic information in the heterogeneous information network is acquired according to the random walk of meta-paths. Finally, the high-level features and latent features of users and items are combined for scoring prediction. The experimental results show that compared with the traditional recommendation algorithms, the proposed algorithm has been significantly improved.
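The meta-path random walk mentioned in this abstract can be sketched on a toy heterogeneous graph. The sketch assumes an adjacency-list graph with a type label per node and a cyclic type pattern such as user, item, user, item; the graph, the node names, and the function name are hypothetical, and the embedding and prediction stages of the paper are not shown.

```python
import random


def metapath_walk(graph, node_type, start, metapath, walk_length, rng):
    """Random walk whose node types follow a cyclic meta-path pattern.

    metapath is a list of node types repeated cyclically, for example
    ["user", "item"] yields user, item, user, item, ...; the start node
    must have type metapath[0].
    """
    walk = [start]
    position = 0
    while len(walk) < walk_length:
        position = (position + 1) % len(metapath)
        wanted = metapath[position]
        candidates = [n for n in graph[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break  # dead end: no neighbour of the required type
        walk.append(rng.choice(candidates))
    return walk


# Toy user-item interaction graph (made up for illustration only).
graph = {
    "u1": ["i1", "i2"],
    "u2": ["i2"],
    "i1": ["u1"],
    "i2": ["u1", "u2"],
}
node_type = {"u1": "user", "u2": "user", "i1": "item", "i2": "item"}
walk = metapath_walk(graph, node_type, "u1", ["user", "item"], 5, random.Random(0))
print(walk)
```

Walks of this kind are typically fed to a skip-gram style model so that nodes co-occurring along the same meta-path receive similar embeddings; whether the paper does exactly this is not stated in the abstract.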
2021, 30(2):147-153. DOI: 10.15888/j.cnki.csa.007770
Abstract:In the safety management of the operation sites, the supervision of fence crossing by non-construction personnel has always been essential. However, at present, there are many problems in the construction sites, such as a wide range of operation and a difficulty in the management of construction personnel, leading to the inefficiency of manual supervision. As an important research hot spot in the field of computer vision, video-based human action detection is widely used in public security monitoring. Therefore, in view of the shortcomings of the traditional manual supervision, in combination with the current computer vision technology, an intelligent detection and recognition method for fence crossing violations is proposed in this paper. In this method, video frames are acquired continuously through monitoring, and clips composed of video frames are taken as input. In addition, temporal and spatial features are extracted by 3D and 2D convolutions respectively. After fusion of the two parts of features, classification and boundary box regression are carried out. Furthermore, a comparative experiment is conducted to verify the effect of this method. The experimental results show that the proposed method can detect the fence crossing behavior accurately in a short time, featuring strong working ability.
ZHANG Xiao-Bin , ZHANG Jia-Cheng
2021, 30(2):154-159. DOI: 10.15888/j.cnki.csa.007698
Abstract:In a participatory sensing system, since the quality of the perceived data may be affected by the participants, a reputation calculation model based on the cumulative behavior of users is proposed to help select the trustworthy users. According to the extensiveness of the user groups and the uncertainty of the core users in the perceived environment, this model uses the OPTICS clustering algorithm to define the user scenarios and divide the behavioral data set. Furthermore, it introduces time stamps to label information and discard some old behaviors, thus updating the user reputation. The experimental results show that the proposed reputation model can combine old and new behaviors to calculate and adjust the user reputation well, displaying a good application prospect with respect to the evaluation of user reputation in the perceived environment.
FANG Zhou , LI Lei , ZHU Feng , LU Xing-He , LI Ze-Yu
2021, 30(2):160-164. DOI: 10.15888/j.cnki.csa.007801
Abstract:In order to achieve fast and accurate hand-eye calibration of a monocular vision system, we propose a new two-step hand-eye calibration method, which divides hand-eye calibration into two steps: solving the rotation relationship and solving the translation relationship. First, a robot carrying a calibration plate performs two translation motions to solve the rotation relationship. Then, the tool frame on the robot performs several rotation motions to solve the translation relationship. The proposed method is simple and fast and does not require expensive external equipment. Finally, the feasibility of this method is experimentally verified.
WANG Meng , LI Wei , GAO Rong , WANG Sa
2021, 30(2):165-170. DOI: 10.15888/j.cnki.csa.007761
Abstract:Aiming at the Foreign Object Debris (FOD) detection of runways, this study designs a system based on intelligent vehicle-mounted 3D cameras to collect road information and detect foreign objects. This system preliminarily screens out normal roads through the difference in the distribution of the quantized depth values of the depth image. It then applies point cloud abnormal value filtering and an uneven reduction algorithm to correct the parameters and reduce the amount of data, and the streamlined point cloud is fed to an improved network, adapted to the road data, for foreign object detection. This network uses the X convolution in the PointCNN network to extract spatial features from the point cloud data through four convolutions, which preserves the spatial information of foreign objects as much as possible and improves the detection accuracy. In tests on the collected data, the method designed in this study can accurately identify foreign objects and uneven roads with an accuracy rate close to 90%.
MOU Sen , CHEN Hong-Gang , QING Lin-Bo , HE Xiao-Hai , WANG Si-Yi
2021, 30(2):171-175. DOI: 10.15888/j.cnki.csa.007779
Abstract:Text detection in natural scenes is one of the difficulties in the field of image processing. An efficient and accurate scene text detector (EAST) algorithm is an excellent text detection algorithm in recent years, but the AdvancedEAST algorithm after the addition of post processing still has the problem of missed detection caused by the loss of the head and tail boundaries of the activated pixels. Thus, the detection effect of dense texts is not ideal. For this reason, an improved algorithm of dilated-corner attention EAST (DCA_EAST) is proposed, and a dilated convolution module and a corner attention module are added to the network structure to improve the missed detection. For the loss function, weight factors of category and sample difficulty are introduced to effectively improve the detection effect of dense texts. The experimental results show that the proposed algorithm has an accuracy of 93.02%, a recall rate of 76.69%, and an F-measured value of 84.07% on the ReCTS dataset of ICDAR2019, thus being superior to the AdvancedEAST algorithm.
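A loss with a category weight and a sample-difficulty weight can be illustrated with the well-known focal loss for binary classification. Using focal loss here is an assumption: the abstract does not give the exact loss of DCA_EAST, and the parameter names alpha and gamma and their default values follow common focal loss usage rather than the paper.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one pixel.

    p     : predicted probability of the positive (text) class
    y     : ground-truth label, 1 for text and 0 for background
    alpha : category weight for the positive class (assumed value)
    gamma : difficulty exponent; larger values down-weight easy samples
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # category weight
    p_t = min(max(p_t, 1e-12), 1.0)             # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive sample contributes far less than a hard one.
print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```

With gamma set to 0 the difficulty term disappears and the expression reduces to class-weighted cross-entropy, which makes the separate roles of the two weight factors easy to see.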
2021, 30(2):176-181. DOI: 10.15888/j.cnki.csa.007775
Abstract:Video surveillance, military object recognition, consumer photography, and many other fields have high requirements for image sharpness. In recent years, deep neural networks have made great progress in the applied research on visual and quantitative evaluation, but the results generally lack the details of image textures, and the edges are too smooth, providing blurry visual experience. For this reason, we propose a method of improving image sharpness based on the generative adversarial network in this study. In order to better deliver the image details, this method adopts the improved residual block and skip connection as the main structure of the generative network, and the generator loss function consists of content loss, perception loss, and texture loss in addition to adversarial loss. Finally, the experiments on the DIV2K dataset prove that the proposed method exhibits good visual experience and quantitative evaluation in terms of improving image sharpness.
2021, 30(2):182-187. DOI: 10.15888/j.cnki.csa.007780
Abstract:In this study, we propose a method of unsupervised domain adaptive person re-identification. Given a labeled source-domain training set and an unlabeled target-domain training set, we explore how to improve the generalization ability of the person re-identification model on the target-domain test set. For this purpose, during the training of the model, the source-domain and target-domain training sets are simultaneously input into the model for training. While extracting global features, we extract local features to describe the person images and learn more fine-grained features. Furthermore, we apply Long Short-Term Memory (LSTM) for the modeling of a person in an end-to-end manner, treating the person as a sequence of body parts from the head to feet. Specifically, the method in this paper mainly includes two steps: (1) StarGAN is adopted to enhance the data of unlabeled target domain images; (2) the data sets of source domain and target domain are input into global branch and LSTM-based local branch at the same time for joint training. Finally, on the Market-1501 and DukeMTMC-reID data sets, the proposed model has achieved sound performance, which fully reflects its effectiveness.
YIN Shi-Gang , AN Yang , CAI Xin-Hua , QU Xiao-E
2021, 30(2):188-193. DOI: 10.15888/j.cnki.csa.007782
Abstract:Feature selection, whose premise is feature extraction, is a key step to improve the accuracy and efficiency in retweeting prediction through machine learning methods. Currently, the approaches commonly adopted in feature selection include Information Gain (IG), mutual information, and CHI-square test (CHI). In the traditional feature selection methods, such problems of IG and CHI as negative correlation and interference calculation elicited by low-frequency words lead to low classification accuracy. In view of these problems, we introduce a balance factor and a word frequency factor in this study to increase the algorithm accuracy. Then, according to the spread characteristics of Weibo information, combined with the improved IG and CHI algorithms, we propose the feature selection method based on Balance Information Gain-Word Frequency CHI-square test (BIG-WFCHI). Furthermore, we experimentally test the proposed method with five classifiers including maximum entropy model, support vector machine, naive Bayes classifier, K-nearest neighbor, and multi-layer perceptron on two heterogeneous data sets. The results show that our method can effectively eliminate both irrelevant and redundant features, increase the classification accuracy, and reduce the running time.
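The negative correlation problem of the CHI-square test mentioned in this abstract can be seen in the standard term-class statistic, sketched below. This is the textbook formula, not the improved BIG-WFCHI method of the paper; the function name and the document counts are hypothetical.

```python
def chi_square(a, b, c, d):
    """Standard CHI-square statistic for a term t and a class k.

    a: documents in class k that contain t
    b: documents outside class k that contain t
    c: documents in class k that do not contain t
    d: documents outside class k that do not contain t
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # degenerate table: the term or class is constant
    return n * (a * d - c * b) ** 2 / denom


# A term found only inside the class and a term found only outside it
# receive the same score, because the sign of (a*d - c*b) is squared away.
print(chi_square(10, 0, 0, 10), chi_square(0, 10, 10, 0))
```

Because the statistic is symmetric in this way, a term that signals the absence of a class ranks as highly as one that signals its presence, which is one reading of the negative correlation issue the abstract sets out to correct.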
2021, 30(2):194-200. DOI: 10.15888/j.cnki.csa.007783
Abstract:To address the poor noise robustness and insufficient image feature extraction of existing fuzzy clustering segmentation algorithms, we propose a multi-feature FCM segmentation algorithm combining morphological reconstruction and superpixels. First, the original image is subjected to morphological closing reconstruction, which improves the robustness and detail-preserving ability of the algorithm. Second, the mean-shift method is employed to pre-segment the reconstructed image and obtain a set of superpixels. Third, the color, texture, and gradient features of each superpixel in the reconstructed image are extracted and averaged to form multi-dimensional feature vectors. Finally, these vectors are clustered within the framework of the EWFCM algorithm, taking superpixels as the unit and the kernel-induced distance as the distance measure. Six images in the BSDS300 data set are selected for the experimental comparison. The results show that the algorithm in this study has higher segmentation accuracy.
WANG Hong , ZHANG Qiang , WANG Ying , GUO Yu-Jie
2021, 30(2):201-206. DOI: 10.15888/j.cnki.csa.007789
Abstract:In order to increase the accuracy of the Classification And Regression Tree (CART) regression algorithm, we propose an improved CART regression algorithm based on Extreme Learning Machine (ELM-CART for short). The proposed algorithm mainly applies the ELM for modeling at each leaf node in the process of creating a CART, which can get the true regression prediction value, improve the generalization ability, and compensate for such disadvantages of the CART regression algorithm as easy overfitting and constant predictive output. The experimental results show that the proposed algorithm can effectively improve the prediction accuracy of target data in regression analysis, and its accuracy is higher than that of the counterparts.
2021, 30(2):207-212. DOI: 10.15888/j.cnki.csa.007785
Abstract:With the advent of the 5G era, a large number of Internet of Things (IoT) terminals have appeared in open campus networks such as those of industrial areas and campuses. Due to the huge data flow of IoT terminals, the problem of counterfeit IoT terminals being used for network attacks is becoming increasingly serious, and the computing resources consumed by existing IoT terminal identification technologies keep growing in the face of massive data. To solve these problems, we propose a real-time IoT terminal identification algorithm for large-scale flow based on the time-sharing index of files. Firstly, the metadata for the time-sharing index of memory is established. Secondly, the time-sharing index of files is used to store the intermediate data of the construction session. Thirdly, the metadata trigger for the time-sharing index of memory is controlled to extract features from a small number of files and perform IoT terminal identification. In the experiment, the accuracy of the IoT terminal identification algorithm is maintained while only a little disk space is occupied and the memory consumption is reduced by 92%. These results show that the proposed algorithm can be used in the framework of real-time IoT terminal identification.
XU Jia-Yu , LIN Chu-Ye , CHEN Zhi-Tao , DENG Zhuo-Ran , PAN Jia-Hui , LIANG Yan
2021, 30(2):213-218. DOI: 10.15888/j.cnki.csa.007819
Abstract:In order to solve the problem of difficult recognition due to the large variety of handwritten calligraphy fonts and reduce the threshold for people to appreciate calligraphy, we propose a handwritten calligraphy font recognition algorithm based on deep learning. In the process of recognition, image processing methods, such as projection method, are first used to locate and segment the Chinese characters in the calligraphy works. Then, the GoogLeNet Inception-v3 model and ResNet-50 residual network are used to recognize the styles and shapes of Chinese characters. Consequently, this algorithm can recognize the styles and shapes of regular script and seal script in an entire calligraphy work at single-character recognition rates of 91.57% and 81.70%, demonstrating its practicability.
2021, 30(2):219-225. DOI: 10.15888/j.cnki.csa.007792
Abstract:MapReduce-based systems are increasingly being used for large-scale data analysis applications. Apache Hadoop is one of the most common open-source implementations of this paradigm. Minimizing the execution time is vital for MapReduce as well as for all data-processing applications, and the accurate estimation of execution time is essential for optimization. In this study, the author creates a MapReduce performance model for Hadoop 2.x that can precisely estimate the execution time of a MapReduce workload. This model combines a precedence tree model that captures dependencies between different tasks in one MapReduce job with a queueing network model that captures the intra-job synchronization constraints. Such an analytical performance model is a particularly attractive tool as it might provide reasonably accurate job response time at significantly lower cost than the simulation experiment of real data-analysis systems. Furthermore, a clear understanding of systematic job response time under different circumstances is key to making decisions in MapReduce workload management and resource capacity planning.
SUN Wei , SHEN Ke-Qin , ZHANG Xin-Nan , HE Ya-Jin
2021, 30(2):226-230. DOI: 10.15888/j.cnki.csa.007793
Abstract:Since most Fractional Repetition (FR) codes in distributed storage systems are homogeneous, two new and simpler algorithms are proposed to construct heterogeneous FR codes with different storage capacities, one based on the Hadamard matrix and the other based on simple graphs. With the Hadamard matrix, FR codes with different storage capacities can be converted from homogeneous to heterogeneous. The heterogeneous FR codes based on simple graphs are extensible. A comparison with the theoretical analysis of RS codes shows that the designed heterogeneous FR codes have lower repair locality, repair bandwidth overhead, and repair complexity. Moreover, this method can realize accurate and non-coding repair of fault nodes at high repair efficiency, reducing the repair time of fault nodes.
2021, 30(2):231-236. DOI: 10.15888/j.cnki.csa.007788
Abstract:It is inefficient and highly risky to check with the naked eye whether pedestrians are wearing masks during the prevention and control of the COrona VIrus Disease 2019 (COVID-19). To solve this, we devise an algorithm that detects whether pedestrians are wearing masks in natural scenes by improving the loss function of bounding box regression. Specifically, the algorithm improves the YOLOv3 loss function and uses GIoU to calculate the bounding box loss. The algorithm is trained on the open-source WIDER FACE dataset and MAFA dataset. When tested on collected natural scene pictures, the mean Average Precision (mAP) of mask-wearing detection is as high as 88.4%. In the detection of natural scene videos, the average number of frames per second is 38.69, which meets the requirements of real-time detection.
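As a rough illustration of the GIoU-based bounding box loss named in the abstract above, the sketch below computes Generalized IoU for two axis-aligned boxes and the commonly used loss 1 − GIoU. The `(x1, y1, x2, y2)` box format and the function names are assumptions; how the term is weighted inside the modified YOLOv3 loss is not stated here and is not modelled.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2), x1 < x2, y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection rectangle (zero area if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Smallest axis-aligned box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    if union <= 0 or enclose <= 0:
        raise ValueError("boxes must have positive area")

    iou = inter / union
    return iou - (enclose - union) / enclose


def giou_loss(box_a, box_b):
    """Bounding box regression loss; 0 for identical boxes, up to 2 when far apart."""
    return 1.0 - giou(box_a, box_b)
```

Unlike plain IoU, which is 0 for every pair of disjoint boxes, GIoU becomes more negative as the boxes move apart, so the loss still provides a useful signal when a predicted box does not overlap the ground truth.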
2021, 30(2):237-242. DOI: 10.15888/j.cnki.csa.007797
Abstract:Studying information dissemination in social marketing is of great significance for putting forward reasonable strategy suggestions and enhancing the competitiveness of enterprises. Information dissemination is a complex process that involves individual interaction. Most information dissemination models simplify the reality: they neither consider the influence of individual heterogeneity on information dissemination nor reflect the interaction between individuals. This work studies the information dissemination process from the perspective of a complex system, uses the multi-Agent method to establish the information dissemination model, and builds the multi-Agent interaction mechanism based on the improved Deffuant model. In addition, this study analyzes the influence of different factors on information dissemination through simulation and finds that individual heterogeneous attributes, mutual influence between individuals, and external environmental factors all affect the speed and scope of information dissemination.
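For orientation, the sketch below shows the pairwise update of the standard Deffuant bounded-confidence model, on which the interaction mechanism described above is built. The improvements made in the study are not described here and are not reproduced; the parameter names `epsilon` (confidence threshold) and `mu` (convergence rate), their default values, and the random pairing in `simulate` are assumptions of this example.

```python
import random


def deffuant_step(x_i, x_j, epsilon, mu):
    """One interaction between two agents holding opinions x_i and x_j.

    The agents move towards each other only if their opinions differ by
    less than epsilon; otherwise both opinions stay unchanged.
    """
    if abs(x_i - x_j) < epsilon:
        return x_i + mu * (x_j - x_i), x_j + mu * (x_i - x_j)
    return x_i, x_j


def simulate(opinions, steps, epsilon=0.3, mu=0.5, seed=0):
    """Repeatedly pick two distinct agents at random and let them interact."""
    rng = random.Random(seed)
    opinions = list(opinions)
    for _ in range(steps):
        i, j = rng.sample(range(len(opinions)), 2)
        opinions[i], opinions[j] = deffuant_step(opinions[i], opinions[j], epsilon, mu)
    return opinions
```

With `mu` at most 0.5 each interaction preserves the mean of the two opinions and never widens the overall opinion range, which is why such simulations settle into one or more opinion clusters depending on `epsilon`.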
LI Can , TIAN Xiu-Xia , ZHAO Bo
2021, 30(2):243-249. DOI: 10.15888/j.cnki.csa.007557
Abstract:The power customer service order data records the demand of power users in text. A reasonable work order classification method is helpful to accurately identify the demand of users and improve the operating efficiency of the power system. To solve the problems of sparse feature data and strong dependency of work order data, this study optimizes the structural model that combines character-level embedded Bidirectional Long-Short-Term Memory network (BiLSTM) and Convolution Neural Network (CNN). Firstly, this model obtains the feature representation of text by noise reduction on the term vectors trained by the Word2Vec model. Secondly, it uses the BiLSTM network to recursively learn the time sequence information of the text to extract the feature information of sentences. Finally, those obtained are input into the double-channel pooled CNN for the extraction of local features. The test experiments on the real work order data set of power customer service demonstrate that the model has good accuracy and robustness in the task of classifying work orders of power customer service.
2021, 30(2):250-254. DOI: 10.15888/j.cnki.csa.007763
Abstract:As big data and artificial intelligence technologies are booming, high-performance, real-time streaming computing systems are gradually replacing traditional batch computing systems based on data warehouses. As an open-source distributed big-data streaming computing platform that is highly fault-tolerant and can realize real-time processing, Apache Storm supports a variety of task distribution schemes such as the average task distribution strategy and the single-machine task assignment strategy. When there are multiple tasks in the task topology and only certain machines in the cluster support the execution of a certain task, the traditional task scheduling method can only allocate a single task to a single designated machine, failing to make the best use of resources in the entire cluster. In this study, the task scheduling strategy is adjusted so that a queue of eligible machines is obtained first. Then, the assigned tasks are evenly distributed to available work nodes in the machine queue, and other tasks are distributed to the remaining machines in the cluster through the default strategy. In this way, a multi-task group scheduling strategy can be achieved.
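The group scheduling idea described above can be sketched in a few lines of plain Python. This is a simplified model and does not use the Apache Storm scheduler API: the function name, the task and machine names, and the use of round-robin as a stand-in for the default strategy are all assumptions of this example.

```python
from itertools import cycle


def group_schedule(constrained_tasks, other_tasks, machines, eligible):
    """Assign tasks to machines.

    constrained_tasks: tasks that may only run on machines in `eligible`
    other_tasks: tasks without placement constraints
    machines: all machines in the cluster, in a fixed order
    eligible: set of machines able to run the constrained tasks
    """
    if not machines:
        raise ValueError("cluster has no machines")

    # Queue of eligible machines, kept in cluster order.
    queue = [m for m in machines if m in eligible]
    if constrained_tasks and not queue:
        raise ValueError("no eligible machine for constrained tasks")

    # Remaining machines take the other tasks; fall back to the whole
    # cluster if every machine is eligible.
    remaining = [m for m in machines if m not in eligible] or list(machines)

    assignment = {m: [] for m in machines}
    # Spread constrained tasks evenly over the eligible queue.
    for task, machine in zip(constrained_tasks, cycle(queue)):
        assignment[machine].append(task)
    # Round-robin stands in for the default strategy here (an assumption).
    for task, machine in zip(other_tasks, cycle(remaining)):
        assignment[machine].append(task)
    return assignment
```

For example, with machines `m1` to `m4`, of which only `m1` and `m2` are eligible, four constrained tasks are split two per eligible machine instead of all landing on one designated machine.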
LI Xiao-Hui , MIAO Miao , RAN Biao-Jian , ZHAO Yi , LI Gang
2021, 30(2):255-259. DOI: 10.15888/j.cnki.csa.007772
Abstract:In recent years, transportation has been playing an important role in the booming logistics industry. Indeed, transportation accounts for more than 50% of the whole logistics cost. Express UAVs could effectively reduce the cost. Moreover, proper path planning for UAVs is also essential. In particular, UAVs should accurately avoid no-fly zones during flight. In this study, the obstacle avoidance path planning of UAVs is comprehensively discussed. With the improved A* algorithm and considering various obstacles, we propose a method that can find the optimal obstacle avoidance path between any two customer points. The simulation results demonstrate the effectiveness of the method in solving the obstacle avoidance path planning problem for express UAVs.
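As background for the path planning described above, the following is a minimal sketch of the textbook A* algorithm on a grid in which blocked cells stand for no-fly zones. It is not the improved A* variant of the study, whose modifications are not given here; the grid representation, 4-connected moves with unit cost, and the Manhattan heuristic are assumptions of this example.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid; cells with value 1 are no-fly zones.

    Returns the list of (row, col) cells from start to goal, or None if the
    goal cannot be reached. Start and goal are assumed to be free cells.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance never overestimates with unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    g_cost = {start: 0}
    parent = {start: None}

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if g > g_cost[cell]:
            continue  # stale heap entry; a cheaper route was found already
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < g_cost.get((nr, nc), float("inf")):
                    g_cost[(nr, nc)] = new_g
                    parent[(nr, nc)] = cell
                    heapq.heappush(open_heap, (new_g + heuristic((nr, nc)), new_g, (nr, nc)))
    return None
```

Calling `astar(grid, (0, 0), (3, 3))` on a 4x4 grid whose central 2x2 block is marked 1 returns a path that goes around the block. A real planner would work in continuous space with polygonal no-fly zones, so the grid is only a convenient simplification.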
2021, 30(2):260-264. DOI: 10.15888/j.cnki.csa.007790
Abstract:E-commerce is a new, large-scale business mode with great potential that is flourishing along with emerging Internet technology. Forecasting short-term sales of products can help e-commerce companies respond more quickly to market changes. This study establishes a forecast model of short-term sales applied to the e-commerce accounting system based on historical data on e-commerce sales and clicks on portal products. Following the idea of AdaBoost, the forecast results of multiple traditional BP neural networks are combined, leading to higher accuracy. According to the characteristics of short-term sales in e-commerce, we design the time window and establish a daily sales forecast model that considers the weekend effect. Experiments show that the forecast error of this model can be controlled within 20%.
2021, 30(2):265-267. DOI: 10.15888/j.cnki.csa.007781
Abstract:Naive Bayes classifier is a simple and effective machine learning tool. Based on the principle of naive Bayes classifier, this study deduces the formula of “naive Bayes combination” and constructs the corresponding classifier. It is found through testing that the classifier has superior classification performance and practicality as it overcomes the shortcoming of poor accuracy of naive Bayes classifier and is faster than other classifiers without significant loss of accuracy.