Abstract: In a smart city environment, a wide variety of data is collected from sensors and devices to provide value-added services. In this paper, we focus on data collected from smart houses in the smart city and propose a platform, called Scallop4SC, that stores and processes large-scale house data. House data is classified as either log data or configuration data. Since the volume of log data is extremely large, we adopt Hadoop/MapReduce on a multi-node cluster. On top of this, we use the HBase key-value store to manage heterogeneous log data in a schemaless manner. To manage the configuration data, in contrast, we choose MySQL so that various queries on the house data can be processed efficiently. We propose practical data models for the log data and the configuration data on HBase and MySQL, respectively. We then show how Scallop4SC works as an efficient data platform for smart city services. We conduct an experimental evaluation that calculates device-wise energy consumption from actual house logs recorded over one year in our smart house. Based on the results, we discuss the applicability of Scallop4SC to city-scale data processing.