Abstract:The construction method of field relevance is a difficult problem in the Web data generation. A new algorithm for fields' priority relevance based on maximal information coefficient is proposed. The algorithm is completely different from the existing method. Firstly, the maximal information coefficient between the appropriate fields needs to be extracted from real Web log data. Then, combined with the field of heavy tailed characteristics, the field is modeled by stretched exponential distribution. Finally, real data's field dependence is simulated by the fields' relevance model, so as to generate a realistic target data set. The experiments show that the generated data sets can maintain a reasonable balance between the fields and the similarity between the nodes.