Why We Should Treat Public Data Like Water

The revolutionary potential of the internet means that we can do more than simply build a more beautiful user interface for antiquated government operations.


“Citizens of the 21st century need public technologists like citizens of the 19th century needed municipal engineers to build the drains and clean water supplies.”

—Tom Steinberg, A manifesto for public technology

For three years, Advanced Research in Government Operations Labs (ARGO Labs) has envisioned, planned, and deployed public data infrastructure for integrated urban water use data across California. After Governor Brown’s historic executive order in May 2016 to “make water conservation a way of life,” our team calculated how much water California’s retail water utilities should reasonably expect their residential customers to need, and built a data visualization on those estimates so that users can analyze scenarios they select. Our work leveraged freely available imagery and academic research partnerships to deliver those statewide estimates of reasonable water use for just 5 percent of the $3 million the state originally budgeted for the project.

This increased focus on efficiency highlights the transformational changes underway in the California water industry. Water utilities were originally created to supply and sell clean water. Today, fresh from California’s worst drought in 600 years and faced with future water supply uncertainty, we need to modernize that business model to maximize the benefit of an increasingly scarce resource. Yet most industries are not in the business of getting customers to buy less of their product. So with local utility partners, we have developed and deployed open source analytics to power an integrated vision for the future of water resource planning, rate setting, and water efficiency program development so that utilities can both incentivize more efficient water use and pay for fixed infrastructure costs.

Screenshot courtesy Patrick Atwater

Since embarking on this work, I have begun to wonder: Why not manage public data like water — a public resource required for all life? We too quickly forget that “legacy” institutions like public water utilities were radical innovations for their time. For most of the 19th century, clean water was a luxury. The concept of a public utility enabled (near) ubiquitous access to clean drinking water and the creation of infrastructure that safeguarded water supplies for future generations. California’s water industry in particular has a long history of pioneering everything from man-made aqueducts that can be seen from space to advanced recycled water technologies. Perhaps most importantly, public water utilities provide the institutional structure to ensure that a vital public resource is stewarded for the benefit of everyone.

So as we look to answer the call in Tom Steinberg’s manifesto, what might the public technology movement learn from the public utility model of the California water industry? First off, public data should be managed professionally, and local governments should prioritize funding public technologists who can build the digital equivalent of the physical drains and pipes we take for granted. The California State Water Project, a transformational aqueduct supplying water to twenty-five million people, wasn’t built on nights and weekends. Digital infrastructure shouldn’t be either. Building modern public data infrastructure requires seed capital to catalyze change and reallocate existing funds.

For example, this summer ARGO’s data team conducted an IT audit of 235 systems from 14 water utilities. A single Oracle data system from one of those utilities costs more than 200x the amount invested in ARGO’s water data work. Rather than powering modern open source analytics, that system requires utility staff to work with multiple consultant-created query tools to actually access their own data.

Vendor capture of the government IT market and an Excel-for-everything mindset not only cost more but also preclude new approaches to tackling civic challenges by scattering key information across local machines. Instead, public data ought to be piped through integrated infrastructure, flowing seamlessly and securely across municipal jurisdictions. This integration should be purpose-driven and build upon collaboratively developed data standards to ensure the resulting infrastructure supports meaningful measurement of important public outcomes.

While public data should be open by default, sensitive, personally identifiable data could be securely shared with academic researchers and other qualified analysts who can use it for public benefit, following proven models for sharing sensitive data and conducting independent research that measures what works. By streamlining data sharing, measuring impact through “little speedometers” could become as regular and routine as professional budgeting. NYU GovLab estimates that only $1 out of every $100 in U.S. spending is actually backed by rigorous evidence, and treating public data like a utility could help address that massive impact measurement gap.

Further, this outcome-oriented approach can enable more creativity and a new standard of excellence in public service. For example, optimizing inspections such as fire safety checks, using best practices like New York City’s model of fire inspections, should be routine and expected. That would help make tragedies like the Oakland Ghost Ship fire, which killed 36 people, a thing of the past. In that vein, cities everywhere need integrated public data infrastructure in the spirit of the excellent U.K. Administrative Data Research Network, which provides a secure repository of public data across the U.K. to streamline social science research.

While it is unrealistic to expect all of America’s tens of thousands of municipalities to hire Chief Data Officers and in-house public technologists, integrating public data across cities and sharing the cost of technical talent can make such pioneering practices the new normal in city government operations. Here again the visionary technological achievements of the California water industry and the institutional structures that enabled that success highlight profound opportunities for the public technology movement. The Metropolitan Water District of Southern California was created in 1928 by thirteen cities that came together to build an aqueduct to the Colorado River and ensure a reliable water supply for the region into the future. Why not do the same for data infrastructure?

The exciting explosion in data collaboratives (organizations coming together to pool data to create public value) calls for integration and the development of common standards to ensure interoperable infrastructure. The bipartisan Commission on Evidence-Based Policymaking recently called for a new National Secure Data Service to modernize the federal government’s data infrastructure and streamline data sharing for statistical research to support evidence-based policymaking. Those goals are good, though a broader vision is necessary to link the many emergent urban and issue-specific data collaboratives across America and the explosive growth in data collaboratives across the globe. The U.S. federal government is far from the only important government actor, both internally, given our fragmented federal system of state and local governments, and globally, in an increasingly multipolar world.

That polycentric landscape mirrors Southern California’s fragmented water industry and could learn from its success in groundwater management. Rather than relying on top-down mandates or “thou shalt” dictates, groundwater management succeeded in adjudicating basins by creating a compelling business logic for each of its members, using data on the aquifers to make recommendations on sustainable water policies. More than just a single data service to rule them all, there is a key need for increased connectivity across data collaboratives and, most importantly, a value proposition so that data is not simply being integrated for the sake of data integration.

We should approach this frontier with great humility. Change is hard. Yet history shows that big systemic change — like the shift to (near) ubiquitous clean drinking water — is possible.

Like the internet itself, the proper scope of this integration is ultimately global. And as in the development of the internet, California can play a leading role. For example, building from our model of integrating metered water use data pioneered in California, every member of the Under2 MOU, an agreement between subnational governments around the world to limit carbon emissions, could band together to share actual metered water, electricity, and gas data on a common platform. Measuring actual natural resource consumption would enable creative, bottom-up collaborations to model price changes and target cost-effective opportunities to save water and energy. Those efficiencies save both money and natural resources and are common sense no matter your view on climate change.

That global ambition also highlights the opportunity for the public technology movement to mature and increase its aspirations for impact. The revolutionary potential of the internet means that we can do more than simply build a more beautiful user interface for antiquated government operations. Two decades after govWorks, the canonical web 1.0 e-government startup, and four years after HealthCare.gov, all the pieces are in place today to truly transform government operations. Flowing like water, public data can power a common set of facts and deliberative discourse about what actually works to address specific issues, enabling us to better tackle the massive public challenges we face in the twenty-first century.

Patrick Atwater is the water data project manager at ARGO Labs. He thanks friends at Code for America, CUSP alumni, and the many others who provided invaluable feedback on and revisions to the ideas about the near future of government operations that evolved into this post.