Abstract: Traditional log analysis techniques suffer from low efficiency, limited functionality, and poor scalability when processing large-scale logs. To address these problems, a large-scale log collection and analysis system based on Docker is designed. The system consists of five layers: data collection, data caching, data forwarding, data storage, and data retrieval and display. It ingests log files of any type from different data sources, provides reliable data transmission through a Kafka message queue, uses Elasticsearch for distributed storage and retrieval of data, and supports log analysis through visualization. In addition, Docker container technology enables rapid deployment and version control of the system. The system offers real-time processing, scalability, and easy deployment. Experimental results show that the system is feasible and effective and has good practical value.
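
The layered data flow described in the abstract can be illustrated with a minimal, purely in-memory sketch. It is not the system's implementation: the `Cache` class is only a stand-in for a Kafka topic, the `Store` class is only a stand-in for an Elasticsearch index, and all names (`collect`, `forward`, `run_pipeline`, the sample log lines) are hypothetical and chosen for illustration.

```python
from collections import defaultdict, deque


def collect(lines, source):
    """Collection layer: tag each non-empty raw log line with its source."""
    return [{"source": source, "message": line.strip()}
            for line in lines if line.strip()]


class Cache:
    """Cache layer: FIFO buffer (in-memory stand-in for a Kafka topic)."""

    def __init__(self):
        self._queue = deque()

    def publish(self, event):
        self._queue.append(event)

    def drain(self):
        # Yield buffered events in arrival order until the buffer is empty.
        while self._queue:
            yield self._queue.popleft()


class Store:
    """Storage layer: tiny inverted index (stand-in for Elasticsearch)."""

    def __init__(self):
        self.docs = []
        self.index = defaultdict(set)

    def add(self, event):
        doc_id = len(self.docs)
        self.docs.append(event)
        for token in event["message"].lower().split():
            self.index[token].add(doc_id)

    def search(self, term):
        """Retrieval layer: return documents containing the exact token."""
        return [self.docs[i] for i in sorted(self.index.get(term.lower(), ()))]


def forward(cache, store):
    """Forwarding layer: move buffered events from the cache into storage."""
    count = 0
    for event in cache.drain():
        store.add(event)
        count += 1
    return count


def run_pipeline(sources):
    """Run collection -> cache -> forwarding -> storage for all sources."""
    cache, store = Cache(), Store()
    for source, lines in sources.items():
        for event in collect(lines, source):
            cache.publish(event)
    forward(cache, store)
    return store


if __name__ == "__main__":
    # Hypothetical sample logs from two sources.
    store = run_pipeline({
        "nginx": ["GET /index 200", "GET /missing 404"],
        "app": ["ERROR timeout on GET"],
    })
    for hit in store.search("404"):
        print(hit["source"], "|", hit["message"])
```

Placing a buffer between collection and storage is what lets the two sides run at different speeds; in the sketch this is only a `deque`, so it shows the ordering of the layers rather than the reliability guarantees that a real message queue would provide.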