ZHANG Qiao-Li , CHI Xue-Bin , ZHAO Di
2018, 27(9):1-9. DOI: 10.15888/j.cnki.csa.006538
Abstract:Worldwide, about seven to ten million elderly people suffer from Parkinson's Disease (PD), a common degenerative disease of the nervous system. Its clinical characteristics are tremor, muscle rigidity, bradykinesia, and the decline of independent ability. These characteristics are similar to those of Multiple System Atrophy (MSA). Research shows that patients with PD are often irreparably diagnosed, so people are constantly exploring new ways to differentiate PD from MSA and obtain an early diagnosis. With the advent of the big data era, deep learning has made major breakthroughs in image recognition and classification. Therefore, this study uses deep learning methods to differentiate PD patients, MSA patients, and healthy people. The data come from the 301 Hospital of Beijing, and the preprocessing of the original Magnetic Resonance Images (MRI) was directed by physicians of that hospital. The focus of the experiment is to optimize the neural network so that it achieves good results in medical image recognition and diagnosis. Based on the pathological characteristics of PD, the study proposes an improved algorithm that obtains better experimental results in loss, accuracy, and other indicators.
CHEN Ye-Tian , MI Chuan-Min , XIAO Lin
2018, 27(9):10-17. DOI: 10.15888/j.cnki.csa.006503
Abstract:Existing attraction recommendation algorithms ignore users' implicit trust and trust transfer when handling user relationships, and they struggle to make accurate recommendations for users in a new city because user history records are lacking. To address these issues, this paper presents a personalized attraction recommendation algorithm based on users' social trust and tag preferences. From the user's rating behavior and context information, implicit trust is mined, and the trust among users is obtained through trust transfer, which effectively alleviates data sparsity. Then, by analyzing the relationships among users, attractions, and tags, the user's preference is decomposed into preferences for different attraction tags to further explore the user's long-term interests. Experimental results on data collected from the Flickr website show that the proposed hybrid recommendation algorithm effectively improves recommendation accuracy and relieves the cold-start and new-city problems to a certain extent.
WANG Jie , ZHANG Rui-Dong , WU Chen-Sheng
2018, 27(9):18-24. DOI: 10.15888/j.cnki.csa.006552
Abstract:Named entity recognition is a basic task of natural language processing. Traditional recognition methods often require external knowledge and manually selected features, which incurs high labor and time costs. To address this limitation, this study proposes a named entity recognition model based on the Gated Recurrent Unit (GRU). The model takes word vectors as input, extracts features through a bi-directional GRU layer, and obtains the label sequence through an output layer. The model has been tested on named entities in a specific domain. The experimental results show that the proposed recurrent neural network model identifies named entities well, removes the tedious work of designing features manually, and provides an end-to-end recognition method.
2018, 27(9):25-32. DOI: 10.15888/j.cnki.csa.006518
Abstract:As an effective way to find correspondences between images, Belief Propagation (BP) has been widely used for estimating optical flow in recent years. Nevertheless, applying it directly to high-accuracy, large-displacement optical flow requires a huge label space and a long processing time. To overcome this drawback of BP, we propose a Hierarchical Belief Propagation (HBP) algorithm for estimating high-accuracy large-displacement optical flow. We treat input images as Markov Random Fields (MRFs). To accelerate computation, we perform BP on hierarchical MRFs, i.e., a superpixel MRF and a pixel MRF. The basic displacements obtained on the superpixel MRF are used as a coarse reference to constrain the label space to a smaller size on the pixel MRF. Based on this constrained label space, we can estimate accurate optical flow efficiently. Experiments on the MPI Sintel dataset show that the proposed method is competitive in speed and accuracy.
2018, 27(9):33-39. DOI: 10.15888/j.cnki.csa.006479
Abstract:Image feature extraction has always been a core task of computer vision and image processing. With the rapid development of deep learning, the Convolutional Neural Network (CNN) has gradually replaced traditional image feature operators and become the main algorithm for feature extraction. Combining CNN and sum pooling, we propose a new image feature extraction algorithm based on a depth prior, aimed at the data association problem in a crowdsourced labeling system for urban remote sensing data. The feature effectively focuses on nearby objects in outdoor images, and image retrieval experiments verify that it characterizes outdoor images well.
2018, 27(9):40-46. DOI: 10.15888/j.cnki.csa.006533
Abstract:Many natural language applications need to represent the input text as a fixed-length vector. Existing technologies such as word embeddings and document representations provide natural representations for natural language tasks, but they consider neither the importance of each word in a sentence nor the significance of each sentence in a document. This study proposes a Document Representation model based on a Hierarchical Attention mechanism (HADR), which takes into account the important sentences in a document and the important words in a sentence. Experimental results show that document representations that account for the importance of words and sentences perform better. The accuracy of this model in document sentiment classification on IMDB is higher than that of the Doc2Vec and Word2Vec models.
2018, 27(9):47-51. DOI: 10.15888/j.cnki.csa.006542
Abstract:Deep learning is a branch of machine learning that has opened a new era in the development of neural networks. As an important part of deep learning architectures, the self-encoding (autoencoder) algorithm plays a crucial role in unsupervised learning and nonlinear feature extraction. Firstly, the basic concepts and principles of the self-encoding algorithm are introduced. Then, improved algorithms based on the self-encoding algorithm are presented. Finally, well-known applications of the self-encoding algorithm in several fields and its development trends are elaborated.
ZHANG Shao-Wei , GE Bin , ZHANG Lei , WEI Ling-Xuan , WU Jin-Ping
2018, 27(9):52-60. DOI: 10.15888/j.cnki.csa.006523
Abstract:Radionuclide imaging agents are necessary for PET/CT imaging. At present, they are mainly dispensed by traditional manual dilution, which suffers from low dispensing efficiency and a large radiation dose to the human body. To solve these problems, researchers have worked on automatic dispensing systems, but the control interfaces of such systems offer only a single function and cannot monitor the dispensing work in real time. This study therefore designs a Qt-based control interface for automated Dispensing Hot Cells (DHC). The interface receives data transmitted over the network from remote cameras to display real-time video of the inside of the DHC, and it uses a brightness equalization algorithm to process the images. The interface also connects to an STM32F429 demo board over a serial port protocol and provides functional controls to operate the board, realizing automated control of the DHC together with the dispensing mechanical system. Verification shows that the interface provides friendly human-machine interaction and improves the monitoring quality and the response speed of the system. Compared with the commonly used monitoring software MiniVCap and VCam, the video delay and the CPU occupancy rate are significantly reduced, the interface connects stably with the STM32F429, and it meets the design requirements in terms of performance and indexes.
YE Feng , OUYANG Zhi-Chao , CHEN Wei-Biao , ZHOU Yi-Qin , ZHOU Xiao-Ling
2018, 27(9):61-67. DOI: 10.15888/j.cnki.csa.006398
Abstract:To schedule taxi resources more reasonably, this study proposes an intelligent taxi forecasting system based on machine learning. Firstly, the GPS data set of Porto taxis is preprocessed, and a part of the training set is taken as the research object. Then, the echo state network algorithm is used to predict the travel destination of the taxi. Finally, the taxi arrival time is predicted with the random forest algorithm under the same conditions. Experiments show that the system can predict the actual taxi destination from part of the journey and the time required for the journey, thus achieving the purpose of reducing the waste of taxi resources based on the current Porto taxi GPS data set.
2018, 27(9):68-73. DOI: 10.15888/j.cnki.csa.006504
Abstract:Qinghai Lake is China's largest inland lake and plays a crucial role in the local ecosystem, so effectively monitoring its water body has become a research direction. Current water body recognition is mostly carried out on a single machine, which is slow and has a low degree of automation. With the increasing amount of remote sensing data, traditional recognition methods cannot meet the demand. Based on the Hadoop and Spark distributed big data frameworks, this study designs and implements an automatic water body recognition system. The system implements functional modules for remote sensing images, including data storage, data reading, data processing, and model prediction, and automates the execution of the whole system through shell scripting. Finally, three days of remote sensing image data of the Qinghai Lake area are selected to verify the system. The experimental results show that the system can automatically complete the water body recognition process and accurately predict the water body.
CHEN Lei-Ming , ZHANG Wei-Guang , LI Xiao-Ran , LI Ning-Ning
2018, 27(9):74-80. DOI: 10.15888/j.cnki.csa.006551
Abstract:With the rapid development of Internet informatization, more and more IT systems are in wide use. Because oilfield applications are numerous and complex, how to quickly evaluate the operation and safety status of the various systems has become an important issue in oilfields. When a business system is used, access information is recorded in the form of logs. By analyzing the log data, users' access preferences can be mined and potential network security problems of the business system can be found, which provides a decision basis for evaluating oilfield applications. However, with the rapid increase of business access, the volume of logs and the storage they require also increase, and analyzing massive log data in a single-computer environment can no longer meet the needs of applications. To address this problem, this study proposes a log behavior analysis method based on the Spark computing framework and designs a Web-based service platform for visual management.
2018, 27(9):81-86. DOI: 10.15888/j.cnki.csa.006528
Abstract:The logs generated by Docker containers are scattered across different isolated containers, and containers have the characteristic of being "ready to use". The traditional solution is to mount the log files on the host, but containers often drift, which makes a unified view of the logs challenging, and traditional Docker container log analysis systems suffer from weak extensibility and low efficiency. This study uses Kubernetes for container management, service discovery, and scheduling, Filebeat to collect log files on containers and host computers, Redis as a cache, Logstash for forwarding, and the mainstream open source log collection system ELK to store, view, and retrieve logs. The system is real-time, reliable, and extensible, and it improves the efficiency of operation and maintenance personnel.
SHENG Nian-Zu , ZHAO He , WANG Wei-Dong , ZHANG Zhong-Xian , LYU Bo , LI Xiao-Feng
2018, 27(9):87-92. DOI: 10.15888/j.cnki.csa.006520
Abstract:The exercise bike is a piece of indoor fitness equipment used in cardiopulmonary endurance assessment and self-service fitness exercise. However, existing exercise bikes are costly and lack universality because they rely on serial communication. The mobile client presented here is based on a Wireless Local Area Network (WLAN): it communicates with the exercise bike, helps the bike connect to the WLAN, and provides real-time display of exercise information as well as control of the bike. Compared with traditional serial communication, it widens the range of use of the exercise bike and makes it easier to popularize. The results show that the mobile client communicates with the exercise bike stably and reduces the cost markedly.
LIU Zheng-Yin , GE Jing-Guo , LI Tong , HAN Chun-Jing , WU Jia-Lei
2018, 27(9):93-99. DOI: 10.15888/j.cnki.csa.006561
Abstract:The Service Function Chain (SFC) proposed by the IETF addresses the problems that service functions are tightly coupled with hardware devices and are inflexible to deploy. The NSH protocol is used to support the implementation of the SFC. However, the standard OpenFlow protocol does not sufficiently support the NSH protocol, which causes compatibility problems after implementation. Based on Software Defined Network (SDN) and Network Function Virtualization (NFV) technologies, this study proposes an SFC based on Protocol Oblivious Forwarding (POF), which implements the NSH protocol by exploiting POF's ability to be deeply programmed in the data plane. We implement the SFC on the FloodLight controller and a POF switch. The experimental results show that the SFC based on POF can efficiently implement the deployment of service functions.
WANG Jing-Zhong , YANG Yuan , HE Yun-Hua
2018, 27(9):100-106. DOI: 10.15888/j.cnki.csa.006517
Abstract:To filter the wide variety of pornographic images on the real-world Internet, this study proposes a Pornographic Images Recognition (PIR) framework based on multi-classification and a deep Residual Network (ResNet). Traditional methods usually treat the PIR task as binary classification, whereas the approach presented in this paper divides pornographic images into 7 detailed classes according to their varied features and adds 2 benign image classes (with or without a human in the image). The approach relies on a 50-layer ResNet to extract image features automatically, and then decides whether an image is pornographic based on the highest class score and a given threshold value. At the training stage, a feedback-reconstruct training tactic is adopted so that the network learns better features. To deal with images at different scales, a monolateral sliding window method is used to obtain better performance. Tests on a data set constructed from images collected from the Internet show that the approach reaches high accuracy with a lower time cost.
WANG Feng , ZHANG Jing , ZHANG Tong , MA Wei-Gang
2018, 27(9):107-111. DOI: 10.15888/j.cnki.csa.006553
Abstract:Multi-tenancy is at the core of cloud computing: it solves the problem of sharing system resources and applications among multiple users and improves the utilization of a system's software and hardware. How to improve server utilization without degrading the quality of tenant service is a challenging problem. This study proposes a rural drinking water survey database system based on the Eucalyptus platform and multi-tenant technology. Data storage adopts the shared-database mode, and the paper discusses a Multi-tenant Improved Genetic Placement Algorithm (MIGPA), which places each tenant at a proper location in a virtual environment with minimum consumption of hardware resources and guaranteed service quality. Experimental results show that the algorithm is feasible and effective.
LIU Huan , CHEN Neng-Cheng , CHEN Ze-Qiang
2018, 27(9):112-117. DOI: 10.15888/j.cnki.csa.006534
Abstract:In response to the computing problems posed by massive remote sensing images, a method based on Apache Spark is proposed and implemented for retrieving MODIS Sea Surface Temperature (SST), with optimizations to image acquisition, the algorithm, and the computing process. To improve the efficiency of image acquisition, four rounds of network requests are used to acquire user-defined data for specific times and zones. To make the algorithm parallelizable, the split window algorithm is modified to reduce parameters and simplify intermediate models, which adapts it to fast parallel computing. By taking advantage of narrow dependencies between Resilient Distributed Datasets (RDDs), delays caused by interactions between partitions are avoided. In a comparison between single-machine mode and cluster mode, the latter, running on Apache Spark, is ten times as efficient as the former. This study shows that programs retrieving MODIS SST with cluster computing techniques are more efficient than those running on a single machine.
ZHENG Hai-Yang , GAO Jun-Bo , QIU Jie , JIAO Feng
2018, 27(9):118-123. DOI: 10.15888/j.cnki.csa.006498
Abstract:With the development of the mobile Internet, Microblog topics have become popular, and a single hot topic may have tens of thousands of comments. Stance detection for Microblog topics aims to automatically determine whether the author of a text is in favor of the given target, against it, or neither. Firstly, Word2Vec is used to train a vector for each word in the corpus in order to extract semantic information from sentences. Then, the TextRank keyword extraction method is used to construct a set of thematic words as the stance feature, while a sentiment lexicon is used to extract the sentiment information of the sentence. Finally, the word vectors of the selected features are used to train a Support Vector Machine (SVM) and make predictions, which completes the stance detection model. The experimental results show that the stance feature based on the combination of thematic words and sentiment words achieves good stance detection performance.
2018, 27(9):124-129. DOI: 10.15888/j.cnki.csa.006513
Abstract:The Minima Controlled Recursive Averaging (MCRA) algorithm estimates noise accurately and can track changes of the noise power spectrum in speech. However, if the noise power spectrum suddenly increases sharply, the original algorithm needs a period of time to reach an accurate noise estimate, and during this adaptation period it leaves strong residual noise that degrades the listening experience. This paper introduces a Voice Activity Detection (VAD) algorithm that uses the maximum log-likelihood ratio together with the energy-zero ratio, and obtains an improved noise estimation algorithm on the basis of MCRA. Simulation experiments also show that the improved algorithm is faster than the original algorithm in noise estimation.
WANG Meng-Xue , XU Zhe-Xin , WU Yi , LIN Xiao
2018, 27(9):130-136. DOI: 10.15888/j.cnki.csa.006540
Abstract:For Vehicular Ad-hoc NETworks (VANET), a Modified Decentralized Adaptive Time Division Multiple Access (TDMA) Scheduling mechanism (MDATS) is proposed that combines distributed TDMA with Space Division Multiple Access (SDMA) to solve the problem of contention collisions when multiple nodes access the channel simultaneously. Nodes running the MDATS protocol obtain the time slot allocation of their two-hop neighbors through Frame Information (FI), from which they determine the set of available time slots. The competition zone is then divided into several logical sections according to the number of available time slots, and nodes competitively select their time slots based on the mapping between the logical sections and the available time slots. By spatially dispersing the available time slots, the protocol reduces contention collisions among nodes that access the channel simultaneously. The simulation results show that, compared with other similar MAC protocols, the MDATS protocol achieves a higher channel access success rate, lower access delay, and higher time slot utilization.
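The section-to-slot mapping described in the MDATS abstract can be illustrated with a minimal sketch. It assumes a one-dimensional competition zone, equal-length logical sections, and a one-to-one mapping from sections to slots in ascending order; none of these details are given in the abstract, and the function name `pick_slot` and its parameters are hypothetical.

```python
def pick_slot(position, zone_length, available_slots):
    """Map a node's position in the competition zone to one available slot.

    Assumption (not from the abstract): the zone is split into as many
    equal-length logical sections as there are available slots, and the
    i-th section maps to the i-th smallest slot index.
    """
    if zone_length <= 0:
        raise ValueError("zone_length must be positive")
    if not 0 <= position <= zone_length:
        raise ValueError("position must lie inside the competition zone")
    if not available_slots:
        return None  # no free slot in this frame
    slots = sorted(available_slots)
    section_length = zone_length / len(slots)
    # A node on the far edge of the zone belongs to the last section.
    section = min(int(position // section_length), len(slots) - 1)
    return slots[section]
```

Under these assumptions, nodes that are far apart in the zone fall into different sections and therefore contend for different slots, which is the spatial dispersion the abstract refers to.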
2018, 27(9):137-142. DOI: 10.15888/j.cnki.csa.006532
Abstract:Scientific and accurate land cover classification of the Qinghai Lake area is of great significance to the study of ecological environment changes in this region. In this study, we use 30-meter-resolution Landsat 8 OLI remote sensing image data of Qinghai Lake, which is medium resolution. Existing methods for classifying medium-resolution remote sensing images still suffer from difficult feature extraction and low classification accuracy. Using the GoogLeNet inception structure, we design a Convolutional Neural Network (CNN) model for feature extraction and classification. We analyze the effect of the neighborhood window size used for sample generation on the classification results, and compare the model with maximum likelihood classification and SVM classification. The results show that the CNN model achieves its best overall classification performance when the window size is 9×9, and that its classification results are clearly better than those of maximum likelihood classification and SVM.
NI Yong-Feng , YAN Lian-Shan , CUI Yun-He , LI Sai-Fei
2018, 27(9):143-150. DOI: 10.15888/j.cnki.csa.006559
Abstract:To detect advanced persistent threats in software defined networks (SDN), this study proposes an efficient mechanism for detecting covert communication in SDN, based on an analysis of the SDN architecture and of covert communication in advanced persistent threats. To detect covert communication, the mechanism first captures the transmitted traffic from the underlying network. It then extracts SSL certificates from the captured packets and calculates several eigenvalues of the extracted certificates. Finally, using the isolation forest algorithm on these eigenvalues, it detects whether the SSL certificates are abnormal. Based on the detection result for the SSL certificates, the mechanism judges whether covert communication is present in the network. Experimental results verify that the proposed mechanism improves the detection accuracy and reduces the false positives of covert communication detection. The mechanism is also highly scalable, which makes it easy to implement in other scenarios.
MA Chao , DU Jun-Wei , HU Qiang
2018, 27(9):151-156. DOI: 10.15888/j.cnki.csa.006516
Abstract:Scenarios are an effective mechanism for analyzing the occurrence, development, and possible consequences of an accident. However, because effective modeling approaches are lacking and existing models are limited in their analytical capability, scenario-based early warning mechanisms are difficult to popularize in practice. An abstract fault tree is a high-level abstraction of fault trees of the same kind. Based on historical cases and expert experience, it can characterize the mechanism, evolution process, and possible consequences of an accident, and can effectively support scenario-based early warning analysis. A method for early warning of chemical accidents based on the abstract fault tree is proposed. Based on the abstract mapping relation, the hazard degree and importance level of nodes are calculated. The scenario-evolved cut set model is transformed into a Bayesian network model. The Board method is used to measure the risk of accident hazards. The ranking of defense events can be used to predict the accident risk and propose the best coping strategies for the different evolution paths of scenarios. The experimental results show the effectiveness of this method in accident analysis and early warning.
SHI Meng-Fei , YANG Yan , HE Liang , CHEN Cheng-Cai
2018, 27(9):157-162. DOI: 10.15888/j.cnki.csa.006536
Abstract:The goal of question categorization is to classify natural language questions raised by users into predefined categories. Classifying questions accurately and efficiently is an important task in community question answering. In this study, we propose a question categorization method based on a deep neural network. Firstly, the words of the question are transformed into vectors. Then, we use a novel Bidirectional Long Short-Term Memory (Bi-LSTM) based Convolutional Neural Network (CNN) model with an attention mechanism to capture the most important features in a question. Finally, the features are fed into the classifier to predict the category of the question. We use the Bi-LSTM and CNN to capture the features of the question because of their strengths in representing sentence-level documents. We also use the answer set to enrich the information of the question. The experimental results on several datasets demonstrate the effectiveness of the proposed approach.
ZHU Bin , QI He-Yuan , MA Jun-Cai
2018, 27(9):163-169. DOI: 10.15888/j.cnki.csa.006545
Abstract:In the field of species identification, the traditional algorithm is based on the BLAST method, which is regarded as the authoritative method but suffers from a complex calculation process and high time and space consumption. In this study, we propose an improved VSM algorithm based on the K-String compositional vector method, and give alternative norm-based formulas for calculating the genetic distance between species in Banach space, for the reference of other researchers. The improved method is evaluated on two aspects: computational efficiency and the species identification result. The conclusion is that the computing time of the improved VSM algorithm based on the 2-norm is clearly lower than that of the BLAST algorithm; in addition, the classification result shows good consistency and convergence with the comparison result in terms of detection rate.
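The idea of comparing species through K-String vectors and a norm-based distance can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it uses raw K-string frequencies only, whereas compositional vector methods commonly also subtract an expected background frequency, and the function names `k_string_vector` and `norm_distance` are hypothetical.

```python
from collections import Counter


def k_string_vector(sequence, k):
    """Relative frequency of every overlapping length-k substring."""
    total = len(sequence) - k + 1
    if k <= 0 or total <= 0:
        raise ValueError("k must be positive and no longer than the sequence")
    counts = Counter(sequence[i:i + k] for i in range(total))
    return {s: c / total for s, c in counts.items()}


def norm_distance(u, v, p=2):
    """p-norm of the difference of two sparse K-string vectors."""
    keys = set(u) | set(v)
    return sum(abs(u.get(s, 0.0) - v.get(s, 0.0)) ** p for s in keys) ** (1.0 / p)


# Usage: distance between two short illustrative sequences with K = 2.
d = norm_distance(k_string_vector("ACGTACGT", 2), k_string_vector("ACGTTTTT", 2))
```

Changing `p` swaps in a different norm without touching the vector construction, which is one way to read the abstract's "alternative norm-format formula".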
YU Bo , FANG Ye-Quan , LIU Min , DONG Jun-Tao
2018, 27(9):170-175. DOI: 10.15888/j.cnki.csa.006548
Abstract:During video or image transmission, random errors, burst errors, packet loss, and similar faults may occur, which seriously affect the decoded image data. This paper presents an image reconstruction algorithm based on deep learning: an unsupervised image reconstruction neural network model that generates the content of fuzzy regions based on image background prediction. To reconstruct a vivid image, the neural network model not only needs to understand the content of the image, but also needs to produce a reasonable hypothesis for the missing part. The loss function includes a standard pixel-level reconstruction loss and an adversarial loss. When training the convolutional neural network model, this loss function handles structural details in the image better and produces clearer results. Experiments show that the deep convolutional neural network model designed in this study reconstructs images better than the algorithm based on sample interpolation.
2018, 27(9):176-181. DOI: 10.15888/j.cnki.csa.006546
Abstract:As the computing capacity of computers increases, in order to improve the efficiency of emptiness checking for timed automata and make better use of multi-core processors, we rebuild CTAV with a multi-core emptiness checking algorithm for timed Büchi automata, turning it into a multi-core model checker for linear temporal logic and improving the efficiency of model checking. Because equivalence and inclusion relationships exist between symbolic states, a model checker that exploits them can find an accepting path faster and avoid unnecessary state exploration, so checking takes less time. Finally, the effectiveness of the presented method is demonstrated by a case study.
YU Bo , YANG Hong-Li , LENG Miao
2018, 27(9):182-187. DOI: 10.15888/j.cnki.csa.006496
Abstract:Although the traditional collaborative filtering recommendation algorithm can easily discover users' potential interests, it still suffers from the cold-start problem and the sparsity problem. To solve these problems, a new hybrid recommendation algorithm is proposed. Firstly, a topic distribution matrix is built with the LDA topic model, and a user interest matrix is created from the topic distribution matrix. Secondly, the user interest model is obtained by combining the user's historical behavior information and the user's content information. Finally, the Top-N recommendation list is output after calculating the similarity between the user and candidate movies. Experiments on the Douban Movies dataset show that the results of the improved recommendation algorithm are clearly better than those of the traditional recommendation algorithm, and that it deals better with sparse data and the cold-start problem.
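The final step of the abstract above, ranking candidate movies by their similarity to a user, can be sketched as below. The abstract does not name the similarity measure; cosine similarity over topic distributions is assumed here, and the names `cosine` and `top_n` as well as the sample vectors are hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length topic-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector carries no preference information
    return dot / (norm_u * norm_v)


def top_n(user_interest, movie_topics, n):
    """Return the n movie ids whose topic vectors best match the user."""
    ranked = sorted(
        movie_topics,
        key=lambda movie_id: cosine(user_interest, movie_topics[movie_id]),
        reverse=True,
    )
    return ranked[:n]


# Usage with made-up three-topic distributions.
movies = {"A": [0.9, 0.05, 0.05], "B": [0.1, 0.8, 0.1], "C": [0.4, 0.3, 0.3]}
recommended = top_n([0.8, 0.1, 0.1], movies, 2)
```

Because the ranking needs only the movie's topic vector, a movie with no rating history can still be scored, which is how a content-based component helps with cold start.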
2018, 27(9):188-192. DOI: 10.15888/j.cnki.csa.006529
Abstract:Refactoring is a common way of improving software maintainability and quality in modern software development and maintenance. In daily revisions, refactoring is usually mixed with code changes that accomplish other tasks, such as bug fixing and feature addition, which makes the changes very hard to understand. Identifying refactoring patterns, which isolates refactoring from other types of code changes, facilitates change understanding. To the best of our knowledge, no method or tool has been proposed that identifies code refactoring by code change types and similarity comparison. We propose an algorithm for identifying refactoring patterns based on fine-grained change types and text similarity. The algorithm is applied to the extract class refactoring pattern. It has been tested on 4 open source projects, with an average accuracy of 82.6%.
ZHANG Zhi-Yu , JI Yuan-Yuan , MAN Wei-Shi
2018, 27(9):193-198. DOI: 10.15888/j.cnki.csa.006537
Abstract:An improved random forest node splitting algorithm is proposed in this study to improve the accuracy of image classification. The independent splitting methods ID3 and CART are recombined, and new splitting rules are obtained by adaptive parameter selection. On the basis of the bag-of-words model, the spatial pyramid model is introduced to extract image features. After the image is divided into different grids, the k-means algorithm is used to cluster the features. Finally, the algorithm is verified on a large number of images on Spark. The results show that the algorithm can be applied to distributed systems and can greatly improve the classification accuracy while maintaining efficiency.
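One plausible reading of "recombining" ID3 and CART is a node-splitting score that mixes ID3's information gain with CART's Gini gain. The sketch below assumes a simple linear mix with a fixed weight `alpha`; the abstract does not state the combination rule or how the parameter is adapted, so both the formula and the name `split_score` are assumptions made purely for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy of a label list (the impurity used by ID3)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a label list (the impurity used by CART)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_score(parent, left, right, alpha=0.5):
    """Weighted mix of information gain and Gini gain for a binary split.

    alpha is a hypothetical mixing weight; the paper selects its
    parameters adaptively, which is not reproduced here.
    """
    n = len(parent)
    w_left, w_right = len(left) / n, len(right) / n
    info_gain = entropy(parent) - w_left * entropy(left) - w_right * entropy(right)
    gini_gain = gini(parent) - w_left * gini(left) - w_right * gini(right)
    return alpha * info_gain + (1 - alpha) * gini_gain
```

With `alpha=1.0` the score reduces to ID3's criterion and with `alpha=0.0` to CART's, so the weight interpolates between the two rules.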
LIU Mei , GAO Cen , TIAN Yue , WANG Song , LIU Lu
2018, 27(9):199-204. DOI: 10.15888/j.cnki.csa.006544
Abstract:Swarm is a cluster management tool for Docker images and containers. When it calculates node weights, several nodes may end up with identical weights, and the existing Swarm scheduling strategy simply picks among these nodes at random. Because nodes with the same weight carry different resource loads, this causes load imbalance among the nodes. To solve this problem, this study proposes a dynamic scheduling algorithm to optimize the Swarm scheduling strategy. Experiments show that the dynamic scheduling algorithm makes the node load in the cluster more balanced and improves the overall resource utilization of the cluster.
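The tie-breaking problem in the Swarm abstract can be illustrated with a small sketch that replaces the random pick with a load-aware one. It is not Swarm's API or the authors' algorithm: the node dictionaries, the field names `weight`, `cpu`, and `mem`, the convention that a lower weight is preferred, and the equal weighting of CPU and memory load are all assumptions.

```python
def pick_node(nodes):
    """Choose a node, breaking weight ties by current resource load.

    Each node is a dict with hypothetical keys: 'name', 'weight' (lower is
    assumed to be preferred), and 'cpu' / 'mem' utilization in [0, 1].
    """
    if not nodes:
        raise ValueError("no candidate nodes")
    best_weight = min(node["weight"] for node in nodes)
    tied = [node for node in nodes if node["weight"] == best_weight]
    # Instead of a random pick among tied nodes, prefer the least loaded one.
    return min(tied, key=lambda node: 0.5 * node["cpu"] + 0.5 * node["mem"])["name"]


# Usage: node-a and node-b tie on weight, but node-b is less loaded.
cluster = [
    {"name": "node-a", "weight": 2, "cpu": 0.9, "mem": 0.5},
    {"name": "node-b", "weight": 2, "cpu": 0.3, "mem": 0.4},
    {"name": "node-c", "weight": 5, "cpu": 0.1, "mem": 0.1},
]
chosen = pick_node(cluster)
```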
2018, 27(9):205-209. DOI: 10.15888/j.cnki.csa.006570
Abstract:To solve the problem of unstable sparseness of Non-negative Matrix Factorization (NMF), an improved NMF on orthogonal subspace with L1 norm constraints is proposed. An L1 norm constraint is introduced into the objective function of NMF on Orthogonal Subspace (NMFOS), which enhances the sparsity of the decomposition results. The multiplicative update procedure is also given. Experiments on UCI, ORL, and Yale show that this algorithm is superior to other algorithms in clustering and sparse representation.
MA Cun , GUO Rui-Feng , GAO Cen , SUN Yong
2018, 27(9):210-214. DOI: 10.15888/j.cnki.csa.006554
Abstract:Short text research has been a hot topic in the field of natural language processing. Because short texts are sparse and highly colloquial, their clustering models suffer from high dimensionality, poorly focused topics, and unclear semantic information. To address these problems, this study proposes a short text clustering algorithm with improved feature weights. First, the rules of multi-factor weighting are defined, a comprehensive evaluation function is constructed based on part-of-speech and symbolic sentiment analysis, and the feature words are selected according to the relevance between each term and the text content. Then, a continuous skip-gram model is trained on a large-scale corpus to obtain word vectors representing the semantics of the feature words. Finally, the RWMD algorithm is used to calculate the similarity between short texts, and the K-means algorithm is used to cluster them. The clustering results on three test sets show that the algorithm effectively improves the accuracy of short text clustering.
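For readers unfamiliar with RWMD (Relaxed Word Mover's Distance), the sketch below shows the standard relaxation: each word in one text is matched to its nearest word in the other, and the larger of the two one-sided costs is taken. The plain term-frequency weights, the Euclidean distance, and the toy vectors in the test are assumptions; the paper's own feature weights would replace the term frequencies.

```python
import math
from collections import Counter


def _one_sided(doc_a, doc_b, vectors):
    """Cost of moving every word of doc_a to its nearest word in doc_b."""
    weights = Counter(doc_a)
    total = sum(weights.values())
    targets = set(doc_b)
    return sum(
        (count / total) * min(math.dist(vectors[w], vectors[t]) for t in targets)
        for w, count in weights.items()
    )


def rwmd(doc_a, doc_b, vectors):
    """Relaxed Word Mover's Distance between two tokenized, non-empty texts.

    vectors maps each word to a tuple of floats (for example, skip-gram
    embeddings); every token must have an entry.
    """
    return max(_one_sided(doc_a, doc_b, vectors),
               _one_sided(doc_b, doc_a, vectors))
```

The resulting pairwise distances could then feed a clustering step. Note that K-means in its usual form works on vectors rather than a distance matrix, so how the paper couples the two is not shown here.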
LI Jie-Cheng , TAO Yao-Dong , SUN Yong , GAO Cen
2018, 27(9):215-219. DOI: 10.15888/j.cnki.csa.006564
Abstract:The location of logistics distribution facilities has a great impact on logistics costs and delivery time. Its features include the interaction between facility location and delivery route planning, multi-level location, and the balance of shipment quantities. Based on an analysis of these characteristics, a BIRCH-based logistics distribution facility location algorithm is designed, which combines the BIRCH clustering algorithm with a Dijkstra-based gravity center method, to provide better locations and reduce long-term operating costs.
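The classical gravity center method places a facility at the demand-weighted average of its customers' coordinates. The sketch below shows only that basic, straight-line version for one cluster of customers; the paper's variant uses Dijkstra-based (road network) distances, which is not reproduced here, and the `(x, y, demand)` tuple layout is an assumption.

```python
def gravity_center(points):
    """Demand-weighted centroid of customer locations.

    points: iterable of (x, y, demand) tuples, for example the members
    of one cluster produced by a clustering step such as BIRCH.
    """
    pts = list(points)
    total = sum(demand for _, _, demand in pts)
    if total <= 0:
        raise ValueError("total demand must be positive")
    cx = sum(x * demand for x, _, demand in pts) / total
    cy = sum(y * demand for _, y, demand in pts) / total
    return cx, cy
```

For example, a customer at (0, 0) with demand 1 and a customer at (4, 0) with demand 3 pull the facility to (3.0, 0.0), three quarters of the way toward the heavier customer.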
CHEN Da-Cai , LYU Li , GAO Cen , SUN Yong
2018, 27(9):220-223. DOI: 10.15888/j.cnki.csa.006521
Abstract:As business volume and the number of users grow, improving the efficiency of server clusters becomes increasingly important. In this study, a machine learning algorithm is trained on historical data to predict the response time of new requests. Based on the estimated response time of each server node, each request is allocated to the node with the shortest predicted response time. The strategy improves the balance of request allocation in a cluster and thus the efficiency of the cluster. Experiments with three machine learning algorithms show that this strategy can reduce the average response time of the system in small-scale, high-concurrency clusters.
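The dispatch rule itself is simple once a predictor exists. In the sketch below a moving average of recent response times stands in for the trained model; that stand-in, the class name, and the window size are assumptions, since the abstract does not name the three algorithms or their features.

```python
from collections import deque


class LeastPredictedTimeBalancer:
    """Route each request to the server with the lowest predicted response time."""

    def __init__(self, servers, window=5):
        # Keep only the most recent `window` observations per server.
        self.history = {s: deque(maxlen=window) for s in servers}

    def record(self, server, response_time):
        """Store an observed response time (in seconds) for a server."""
        self.history[server].append(response_time)

    def predict(self, server):
        """Stand-in predictor: mean of recent observations, 0.0 if none yet."""
        observed = self.history[server]
        return sum(observed) / len(observed) if observed else 0.0

    def choose(self):
        """Return the server with the smallest predicted response time."""
        return min(self.history, key=self.predict)
```

Swapping `predict` for a call into a trained regression model would keep the rest of the dispatcher unchanged.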
XU Hao-Hao , LIAN Liang , YAO Hao-Li
2018, 27(9):224-228. DOI: 10.15888/j.cnki.csa.006541
Abstract:To address data deficiencies that arise when meteorological websites and other applications migrate to the local government cloud, ETL tools were chosen by considering functionality, development cost, and flexibility. A job scheduling system based on the Quartz framework was developed to automate the meteorological data ETL processes, which are modeled with Kettle, and an SQL Server database cluster was built; together these establish a meteorological data warehouse in the government cloud. The data warehouse synchronizes and stores multi-source meteorological data to the government cloud in real time, provides data support for meteorological application systems that have been or will be deployed in the government cloud, and lays the foundation for the meteorological department's participation in E-government data sharing and exchange.
LI Zhen-Tao , REN Yong-Mao , ZHOU Xu , ZHOU Ya-Qiu
2018, 27(9):229-235. DOI: 10.15888/j.cnki.csa.006515
Abstract:With the development of network technology, loss-based congestion control mechanisms show drawbacks such as low bandwidth utilization and the BufferBloat problem. BBR, a congestion-based congestion control mechanism, has been adopted by Google and has attracted wide attention. In this study, we evaluate the performance of BBR experimentally, including the protocol's transmission efficiency, convergence, and fairness, and suggest directions for future improvement of the BBR protocol.
LI Xiao-Yan , ZHANG Yong , CHEN Lei
2018, 27(9):236-242. DOI: 10.15888/j.cnki.csa.006478
Abstract:Following certain rules, Cognitive Radio Networks (CRNs) can be divided into several clusters, each of which has a Common Control Channel (CCC) for exchanging control information. The cluster-based CCC is one solution to the problem of spectrum sharing in MAC protocols for CRNs. To validate the clustering structure, we propose a MAC protocol based on clustering. In our protocol, channel access time is divided into a sequence of superframes, and each period in a superframe corresponds to an operation of the cluster nodes. This mechanism makes the cluster structure more robust to primary users' activities. Simulation results show that the proposed MAC protocol achieves higher throughput and lower delay under high traffic conditions.
2018, 27(9):243-248. DOI: 10.15888/j.cnki.csa.006509
Abstract:This study focuses on decentralized network access authentication. Based on non-interactive zero-knowledge proofs and blockchain technologies, we improve practical Byzantine fault tolerance and design a scheme in which hosts already connected to the network verify a host applying for access by certifying its ownership of the public key. Based on this scheme, BchainNAC is designed and implemented in an SDN network.
YAN Kai , SUN Jun-Mei , LIU Xue-Jiao , ZHU Min
2018, 27(9):249-255. DOI: 10.15888/j.cnki.csa.006505
Abstract:Smart phones and tablets are gaining popularity thanks to their rich and powerful input capabilities, but these input features add to the complexity of testing. Existing GUI-based record-and-replay tools are inadequate for capturing input from sensor devices and GUI gestures, which have precise timing requirements. This study designs and implements a tool named RARA. RARA can record and replay both GUI events and sensor events, with microsecond replay accuracy. Experiments verify that: (1) RARA is effective; (2) the replay time overhead is only about 1% and does not affect the performance of the host app; (3) application bugs can be reproduced with RARA.
2018, 27(9):256-261. DOI: 10.15888/j.cnki.csa.006543
Abstract:In this study, an embedded character recognition platform based on ARM Cortex-A9 is set up. Methods for cross-compiling OpenCV, Qt, and the BootLoader under Linux and porting the related drivers to the embedded platform are studied. Based on classical algorithms, image character segmentation and recognition are implemented using OpenCV library functions. Finally, the developed program is ported to the embedded platform and experiments are carried out. The results show that the characters in images can be segmented and recognized well.
2018, 27(9):262-267. DOI: 10.15888/j.cnki.csa.006547
Abstract:The Android system provides various mechanisms for interactions between apps. Among them, an exported activity is an activity that can be launched by other apps at runtime without complex inter-process communication. Most existing work on testing Android apps focuses on the functionalities bound to the GUI components in the app, while the app often does not include GUI callbacks that activate its exported activities. This study proposes a method to systematically test exported activities by generating a set of agent apps as test drivers to launch them. It first statically analyzes the APK file to identify the exported activities and extract the keys and types of their required data items, and then fills these data into a preset template to build the test drivers. The proposed techniques are implemented in a prototype tool called EASTER. Preliminary experiments on several real-world apps show that, without comprehensive testing, some exported activities are vulnerable to launches by various external apps.
LIANG Jian-Fei , WANG Jun-Yuan , LIU Min
2018, 27(9):268-272. DOI: 10.15888/j.cnki.csa.006563
Abstract:To obtain transliteration results quickly, a rule-based multilingual transliteration software is designed in this study on the basis of manual processing. Because the algorithm and the rules are designed separately, the software can meet the transliteration needs of various languages. The complete transliteration process includes four steps: word preprocessing, letter recognition and segmentation, letter recombination and localization, and rule table lookup. For letter recombination, this study proposes a method for determining the best syllable division, which effectively reduces the error rate of syllable division and improves the quality of the final transliteration results. Experiments on English, Roman, and Russian show that the transliteration accuracy can reach 95% or more.
SUN Jian-Wei , CHEN Li , WANG Wei
2018, 27(9):273-277. DOI: 10.15888/j.cnki.csa.006519
Abstract:In recent years, as users' requirements for audio and video communication quality have risen, WebRTC has been widely used for its powerful multimedia processing capabilities. WebRTC provides only a weak signaling mechanism, JSEP, but enterprise-class converged communications applications must combine WebRTC with an actual signaling protocol. SIP is the core technology of IMS and plays a very important role in the control of multimedia sessions. This paper introduces the existing schemes for integrating WebRTC and SIP, studies the problems of this integration, and presents a client-based converged communications scheme that combines WebRTC PeerConnection with SIP. This study also compares the advantages and disadvantages of this scheme with those of other schemes.
CHEN Ruo-Yu , LIN Lei , HU Zhong-Yu
2018, 27(9):278-282. DOI: 10.15888/j.cnki.csa.006432
Abstract:With the development of intelligent mobile devices and social media, more and more end-user-oriented applications have emerged. How to fully understand the needs of end users and reduce the risk of software projects has become an urgent problem. Among all kinds of software projects, the development of open source software projects is special, as reflected in the breadth and multi-level nature of participants, as well as the multi-faceted and unstable nature of user needs. Based on the assessment of project maturity, a setup and assessment model for open source software projects is proposed, and the method of maturity evaluation and the maturity-based process of open source software project setup are introduced in detail. The setup and assessment process is demonstrated with a concrete project scenario.
CHENG Xu , ZHANG Bin , LIU Yi-Tian
2018, 27(9):283-287. DOI: 10.15888/j.cnki.csa.006434
Abstract:Traditional business processes are usually designed with two kinds of process elements: activities and transitions. However, when a process becomes complicated, for example with dozens of activities that can jump to one another in a single complex workflow, process modeling becomes harder and system performance is likely to degrade. To solve this problem, this paper presents a workflow state machine cloud system based on cloud computing and the finite state machine mechanism, which simplifies activities and transitions by coding the flow process to achieve high performance, and elaborates the mechanism of the state machine framework in the cloud. Finally, the method for developing business applications on the workflow state machine cloud is introduced, and performance results measured in containers are given: the time consumed by a one-step flow between activities running in two containers is very short. This means that the Docker-based design can achieve high performance while maintaining scalability.
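Coding a workflow as a finite state machine usually means replacing a graph of activity-to-activity links with a single transition table. The sketch below illustrates that general idea only; the class, the state names, and the events are hypothetical, and the paper's cloud and container framework is not modeled.

```python
class WorkflowStateMachine:
    """A workflow driven by a (state, event) -> next state table."""

    def __init__(self, transitions, initial):
        self.transitions = transitions
        self.state = initial

    def fire(self, event):
        """Apply an event; reject it if the current state does not allow it."""
        key = (self.state, event)
        if key not in self.transitions:
            raise ValueError(f"event {event!r} not allowed in state {self.state!r}")
        self.state = self.transitions[key]
        return self.state


# Hypothetical approval workflow: "jumping back" is just another table row.
APPROVAL = {
    ("draft", "submit"): "review",
    ("review", "approve"): "done",
    ("review", "reject"): "draft",
}
```

Because every allowed jump is one table entry, adding a new path between activities does not require rewiring the other activities.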