This paper presents a data management scheme for the Pecan Street smart grid demonstration project in Austin, Texas. The project collects highly granular data, at 15-second resolution, on resource generation and consumption for more than 100 homes, including total consumption of electricity, water, and natural gas as well as solar generation. Furthermore, this testbed of homes (see Figure 1) represents the nation’s highest density of rooftop solar PV and electric vehicles, and includes a substantial subset of homes that are highly instrumented, with meters on up to 6 sub-circuits in addition to the whole-home meter. Consequently, this demonstration project generates a one-of-a-kind dataset with excellent temporal and geographic fidelity.
One consequence of this extensive dataset is that hundreds of parallel data streams must be remotely (wirelessly) collected, filtered, processed, managed, stored, and analyzed before they are useful to researchers. Cumulatively, these streams amount to hundreds of gigabytes of data after just a few months of collection, a formidable barrier to conducting research.
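To give a sense of scale, the number of records implied by a 15-second sampling interval can be sketched as below. This is a back-of-the-envelope illustration, not the project's own accounting: the stream count of 500 and the 90-day window are hypothetical stand-ins for "hundreds of streams" and "a few months". The storage footprint per record is not stated here, so it is left as a caller-supplied parameter.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SAMPLE_INTERVAL_S = 15  # sampling resolution stated in the text


def samples_per_day(interval_s: int = SAMPLE_INTERVAL_S) -> int:
    """Number of readings one stream produces per day."""
    return SECONDS_PER_DAY // interval_s


def rows_per_day(n_streams: int, interval_s: int = SAMPLE_INTERVAL_S) -> int:
    """Readings per day across all parallel streams."""
    return n_streams * samples_per_day(interval_s)


def total_rows(n_streams: int, days: int,
               interval_s: int = SAMPLE_INTERVAL_S) -> int:
    """Readings accumulated over a collection window."""
    return rows_per_day(n_streams, interval_s) * days


def raw_bytes(n_streams: int, days: int, bytes_per_row: int,
              interval_s: int = SAMPLE_INTERVAL_S) -> int:
    """Storage estimate; bytes_per_row is an assumption the caller supplies
    (it depends on schema, indexes, and storage engine)."""
    return total_rows(n_streams, days, interval_s) * bytes_per_row


if __name__ == "__main__":
    N_STREAMS = 500  # hypothetical value for "hundreds of streams"
    DAYS = 90        # hypothetical value for "a few months"
    print("samples per stream per day:", samples_per_day())
    print("rows per day:", rows_per_day(N_STREAMS))
    print("rows in window:", total_rows(N_STREAMS, DAYS))
```

Under these assumed values, each stream yields 5,760 readings per day, and the collection as a whole grows by millions of rows per day. Row counts of this size are one reason the choice of storage layout matters for query performance.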
In partnership with the Texas Advanced Computing Center (TACC), an NSF-sponsored cluster of supercomputers at UT-Austin, we have developed a data collection and management scheme. To store the data, we have built a single column-oriented database that has so far shown tremendous performance benefits. This paper presents the data schema, an example MySQL query, and a program developed for rapid and automated data extraction, analysis, and display. We expect the findings of this work to be beneficial to researchers interested in grid-scale data management.