Abstract
The design dataset is the backbone of data-driven design. Ideally, the dataset should be fairly distributed in both shape and property spaces to efficiently explore the underlying relationship. However, the classical experimental design focuses on shape diversity and thus yields biased exploration in the property space. Recently developed methods either conduct subset selection from a large dataset or employ assumptions with severe limitations. In this paper, fairness- and uncertainty-aware data generation (FairGen) is proposed to actively detect and generate missing properties starting from a small dataset. At each iteration, its coverage module computes the data coverage to guide the selection of the target properties. The uncertainty module ensures that the generative model can make certain and thus accurate shape predictions. Integrating the two modules, Bayesian optimization determines the target properties, which are thereafter fed into the generative model to predict the associated shapes. The new designs, whose properties are analyzed by simulation, are added to the design dataset. This constructs an active learning mechanism that iteratively samples new data to improve data representativeness and machine learning model performance. An S-slot design dataset case study was implemented to demonstrate the efficiency of FairGen in auxetic structural design. Compared with grid and randomized sampling, FairGen increased the coverage score at twice the speed and significantly expanded the sampled region in the property space. As a result, the generative models trained with FairGen-generated datasets showed consistent and significant reductions in mean absolute errors.