Building a Data-Ready Ecosystem for Public Health Response

June 27, 2025

As part of the INSPIRE: Readiness portfolio, the Building a Data-Ready Ecosystem for Public Health Response webinar focuses on the power and potential of regional data ecosystems to support public health decision-making. Featuring experts from the University of North Carolina, University of Utah, University of Washington, and Utah Population Database, this webinar discusses regional ecosystem approaches and explains the value of combining data from various sources into comprehensive, actionable data that informs public health efforts. Discover the benefits of integrated data during emergencies and the future of data-driven public health.

Key Topics

  • Development of regional data models
  • Benefits of integrated data during an emergency

Speaker

  • Kimberley Shoaf, Professor and Executive Director, Rocky Mountains & High Plains Center for Emergency Public Health, University of Utah
  • Mike Newman, Associate Director for Data Science and Management, Utah Population Database (UPDB), University of Utah
  • Janet Baseman, Professor, Department of Epidemiology, University of Washington School of Public Health
  • Vincent Toups, Senior Data Scientist, Collaborative Studies Coordinating Center, UNC Chapel Hill

Transcript

This text is based on live transcription. Communication Access Realtime Translation (CART), captioning, and/or live transcription are provided to facilitate communication accessibility and may not be a totally verbatim record of the proceedings. This text is not to be distributed or used in any way that may violate copyright law.

ALYSSA BOYEA
Hello, everyone, and welcome. Thank you for joining us today for our INSPIRE: Readiness webinar, entitled Building a Data-Ready Ecosystem for Public Health Response. My name is Alyssa Boyea, and I'm the director of infectious disease preparedness at ASTHO, and I will be your moderator today.

For those who may not be familiar, ASTHO's INSPIRE: Readiness is a platform for public health professionals to really share insights and solutions, enhancing response to communicable diseases, outbreaks, and disasters while advancing public health preparedness.

We invite you to leverage our library of stories and resources to enhance your agency's ability to tackle public health challenges with innovative approaches. A link to the platform will be provided in the chat shortly.

As we've seen time and time again, rapid access to accurate, actionable data is crucial during public health emergencies. But collecting, sharing, and using that data, especially across diverse communities and systems, is not always straightforward.

This is where regional data ecosystems can play a crucial role. Our discussion today will explore how these data ecosystems are being designed to build networks of partners, streamline data processes at a regional scale, and tackle common challenges such as data interoperability, security, logistics, and much more.

This work has broad and meaningful implications, not only for public health researchers and emergency response teams, but for state, territorial, local, and tribal health officials, informatics and data specialists, policymakers, and so many more.

We're really excited to be joined by thought leaders from the University of Utah, the University of North Carolina at Chapel Hill, and the University of Washington, who have been developing and refining these innovative models.

We hope today's discussion gives you some practical insights, useful strategies, and new connections to support your work and help strengthen how we all prepare for and respond to public health emergencies.

With that, let me introduce you to all of our panelists. First, we have Kimberley Shoaf, Professor and Executive Director of the Rocky Mountains and High Plains Center for Emergency Public Health at the University of Utah. Next, we have Mike Newman, the Associate Director for Data Science and Management at the Utah Population Database. Next, we have Janet Baseman, Professor within the Department of Epidemiology at the University of Washington School of Public Health. Finally, we have Vincent Toups, Senior Data Scientist in the Collaborative Studies Coordinating Center at UNC Chapel Hill.

Before we begin, I want to remind folks that this session is being recorded and will be distributed to participants in the coming weeks. The Q&A box is also available; if you would like to add any specific questions, we will do our absolute best to allow time to answer them at the end of today's session. Any resources will also be dropped in the chat during the session, so take a look at those as well.

Without further ado, I'm going to turn the mic over to Mike, who's going to provide a quick overview and foundation of a data ecosystem.

MIKE NEWMAN
Hi there! My name is Mike Newman from the Utah Population Database here at the University of Utah. I have a background in computer science and medical informatics. I started off my life as a software engineer, moved into informatics, and ultimately finished with a PhD in public health not long ago. I'm going to talk about the data ecosystem that we have here called the Utah Population Database. This resource was created more than 40 years ago here in Utah, and so we have these data use agreements and these relationships that we've built with the Department of Health and Human Services and other data providers here in Utah over the course of this time. We are primarily used for research-based studies here at the University of Utah; we're not operationally focused. But this is a pretty good example of a data ecosystem that we can talk about.

So we were created with a mandate to reduce morbidity and mortality and look at cancer incidence and outcomes, and we have a fee-for-service model. There are lots of confidentiality and privacy protections built into our system, as there need to be with all kinds of PHI aggregation resources. There are two things to be aware of here on this slide. The RGE is the Resource for Genetic and Epidemiological Research. They're the regulatory body that governs the data use agreements with our providers and, ultimately, what investigators at the University of Utah receive in analytic data sets.

And so we have our usual IRB process, of course. But this is just an extra regulatory layer, because the data we have in UPDB is so sensitive and it's all linked together. We have something like a dozen different data sources. And we have a group of linkers—five full-time linkers—that load and link these data sets together so that we can aggregate data on an individual person level.

So it's very productive and very confidential. And then on the UPDB side, we do the actual data set construction. And we store the data. It's important to note that the data that we receive from our data sources—we don't own—we merely steward the data. And so through these data use agreements, that gives the power to the data providers to revoke those agreements if they ever see fit to do that. So that also incentivizes us to be very responsible with the data that we steward.

So here's some of our major data contributors in Utah. We have the Department of Health and Human Services, which provides much of our data around birth certificates and death certificates and the All Payer Claims Database. I'm sure other states that folks are in that are on this call also have APCDs, things like that. We have the Utah Cancer Registry, which is a SEER cancer registry that has provided data since the 1960s and then we have Intermountain Healthcare and the University of Utah Health. We have links to those databases I'll describe in just a moment.

We like to say we're population-based but person-centered at the Utah Population Database. We have over eleven million unique individuals in the database, while the current population of Utah is about three and a half million. So we cover most of everyone living in Utah now, but also, going back historically into the 1800s, we have genealogical data, so we have people in the database from that time period.

So lots of different records from multiple sources. And we have a lot of administrative-type data—demographics, medical histories. So we have a lot of the All Payer Claims data. We have encounter data. What we don't have is that clinical data that exists within the University of Utah Health and Intermountain Healthcare systems. So we provide links to those systems from UPDB. This slide is just demonstrating that we're taking all this data from all these different data providers and linking it together at the person level.

And so we get updates to these different data sources typically every year, sometimes more frequently, sometimes less frequently. Again, vital records are the birth and death certificates. We have driver's license data that allows us to see healthy individuals who aren't contacting healthcare facilities.

We have the Utah Cancer Registry (UCR) diagnosis data. And then, of course, we have these links to the University of Utah Health and Intermountain Healthcare systems. So we don't contain that clinical data like lab values or radiology reports or anything like that within UPDB. But we can help investigators link to both of those systems, bring that clinical data back into UPDB, collect it all together in one data set, and be able to deduplicate the individuals and have it all be under one individual. And then one of the other things that we do is that we have these genealogy records, and through birth and death certificates we've enhanced those to where we have more than four million people in UPDB that have three or more generations of relational data. So we're able to do things like gene discovery studies and things like that.

This is a more detailed view of the data that we have. So there's a lot going on here. I'm not going to take time to go into this in detail. But this is on our website, which there's going to be a link to on the end slide. Or if you just Google Utah Population Database, it's typically the first link.

But again, we have this genealogy data that goes clear back into the 1800s. We've layered birth certificates on top of that, built these giant pedigrees, and then linked these kinds of claims, encounter, administrative-type data, and Utah Cancer Registry data that goes back to 1966.

And in addition to the kind of administrative and claims and diagnosis data that we get, we also get location data from almost every single one of these data sources. So when a record comes in for a person, we get that person at a place at a point in time—a kind of triple combination of data—so that we can locate people over time. And since we have all this longitudinal data for people that stay in Utah, we can look at where they've been and where they've moved around Utah. So that, of course, enables us to do things like calculate environmental exposures or look at social determinants of health data like distance to care, or if you're in a food desert, or whatever you want to look at.
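As a rough illustration of the person-place-time "triple" Mike describes, here is a minimal sketch in Python. The field names and record structure are hypothetical illustrations, not UPDB's actual schema:

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical illustration of a person-place-time "triple"; UPDB's
# actual schema is not shown in the talk and will differ.
@dataclass(frozen=True)
class LocatedRecord:
    person_id: str   # deduplicated individual identifier
    place: str       # e.g. a ZIP code or census tract
    when: date       # date the source record was generated

def residence_history(records):
    """Order one person's records in time to trace where they have lived."""
    return sorted(records, key=lambda r: r.when)

history = residence_history([
    LocatedRecord("p1", "84112", date(2015, 6, 1)),
    LocatedRecord("p1", "84111", date(2009, 3, 15)),
])
places = [r.place for r in history]  # earliest residence first
```

Ordering one person's located records in time is the basic operation behind the longitudinal uses mentioned above, such as estimating environmental exposures or distance to care over a person's residential history.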

Here, this just shows the breakdown of the studies that we do at UPDB. We do about half cancer studies. We are housed within the Huntsman Cancer Institute here at the University of Utah—about half cancer and half non-cancer. So a lot of school of medicine type studies, and even outside of medicine, like social work and some other departments have used us.

So we do a lot of work. We support about 25 external funding submissions a year. We have lots of active projects, many IRBs, and we support something like 70 manuscripts using UPDB data every year.

So this is my name and email. If you have any questions about UPDB, please feel free to reach out, and I'll be happy to respond. Thank you.

ALYSSA BOYEA
Thank you, Mike. Next I'm going to turn the mic over to Kim to share a bit about the regional center's project work and collaboration. Kim.

KIMBERLEY SHOAF
Thank you very much, Alyssa. Mike just did a great overview of what an example of a data ecosystem looks like, and I will admit—and many other people might say—that when we first started working on this, I kept thinking, what is a regional data ecosystem? And what does that mean? Let me talk a little bit about what we did and where that came from.

The overall goal for these regional data ecosystems was to facilitate the exchange of data across a region to enhance coordination, responsiveness, and resiliency for emergency preparedness and response efforts. This was initially funded by the CDC as part of the planning for regional data ecosystems. Contracts were let across the HHS regions for planning for public health preparedness and emergency response centers, and in three of those regions, the CDC also funded, in addition to those broader public health emergency preparedness and response assessments, the ability to look at planning for regional data ecosystems. The three regions that were funded were Region 4 (UNC Chapel Hill), Region 8 (here at the University of Utah), and Region 10 (the University of Washington). Each of us, under that funding model (it was a one-year contract), was, as I said, part of this broader planning for public health preparedness. We were to create communities of practice, develop a model for an ecosystem for our region, and then put out that model as well as the lessons that were learned along the way. As we looked at it, I realized that all three centers that worked on this took different strategies to pull together our communities of practice, and we came up with slightly different models for ecosystems. So there are a lot of different ways that we can do this.

But I think we learned a lot of the same lessons across the three centers. And that's not surprising, given the types of things that we know in public health—the difficulties that we have in sharing data. So this is Region 8. And this is who we had in our community of practice. We had communicable disease epi supervisors, the Director of Public Health Preparedness, a health officer, chief data officers, chief information officers, some state and local health directors, informatics directors, we had state epidemiologists, emergency response coordinators, and a number of academics. This represented both healthcare emergency management and public health. So we had a broad range of people engaged in these discussions.

And as we looked at it, one of the biggest information questions that we had for them was: what were their data needs? And this is just some of the data needs that they talked about. There were some hazard-specific data needs—what do we know about the hazards that exist, and how do we get information about those hazards into a data system? Outcome data like patient outcomes, population health, what kind of exposures were people going to have, what kind of epidemiologic data, disease surveillance data. We talked a lot about some of the same things that Mike just talked about in UPDB—demographic data, vital statistics data, information about populations, vulnerable populations that comes from the census, and also other types of privately available data like EMS transports or environmental data. Being able to link all of these different pieces of information together was what we were looking at.

Our team came together, and we have an example of what perhaps a dashboard from this—if we pulled all this data together—could look like for a health department. This is an example of a dashboard that would be on a normal day in our region. Some of the information that could be flowing includes things like school absenteeism data, environmental data, weather data, CDC sentinel data. So, looking at a number of different types of data sets.

And then on a day where we did have a disaster—so the next day, if there was a wildfire—how that could then be narrowed down into a more concentrated geographic space.

As I said, the models that we have all looked quite different, but I think there were some really important lessons that we learned across all three of the centers working on these data ecosystems.

One was that there just isn't—certainly not in our region and not in the other regions—any cohesive infrastructure for data exchange. Current platforms either rely on very basic one-to-one data sharing options, or they're using spreadsheets or very simple ways of sharing data.

There's no standardized approach to data sharing policies within states or across states. Similarly, we have data silos between public health, healthcare entities, and emergency management entities.

There's no clear assignment of responsibility for maintaining, training, or funding a data infrastructure. Funding is certainly a big issue. Regular maintenance and comprehensive training programs are essential for potential users of the system. So this can't be something that's just thrown out there—we really need to think about how to maintain something like this.

So those are our lessons. I do want to say that all three of these projects were funded by CDC contracts under the Public Health Emergency Preparedness and Response efforts to develop the centers that currently exist.

All of this work was done primarily on the backs of our participants—our community of practice participants. They were the ones who told us what they needed and wanted. And that's really important—anything like this needs to have that community engaged in developing what the model would look like.

So let me turn this now back to Alyssa and to our other panelists.

ALYSSA BOYEA
Thank you so much, Kim. Next we're going to hear from Janet, who will provide information on the University of Washington's approach. Janet, the floor is yours.

JANET BASEMAN
Thank you so much, Alyssa. My name is Janet Baseman. I am an epidemiology professor at the University of Washington School of Public Health, where I've spent my whole career working on public health surveillance, public health informatics, and public health emergency preparedness and response systems—always in collaboration with public health practice partners to hopefully improve public health systems and services. I'm really happy to be here representing an entire team of people—whose names you can see here on this slide—who came together to work on our Region 10 project.

For orientation purposes, Region 10 includes Alaska, Idaho, Oregon, and the State of Washington, as well as the greatest number of federally recognized tribes in all of the HHS regions. We have urban, rural, and frontier areas to consider in our PHEPR work, as well as the uniqueness of each tribe.

Region 10 also faces a range of environmental hazards, including wildfires, earthquakes, floods, and extreme heat and cold.

The foundational vision for our data ecosystem planning task was to build collaborative relationships between researchers, practitioners, and policymakers before emergencies happen—so that researchers can support practitioners and decision-makers before, during, and after emergencies.

Fundamentally, the data ecosystem is envisioned to create synergies that empower practitioners and researchers in order to improve emergency preparedness and response systems.

To add a bit more detail, we envisioned an ecosystem that includes people, processes, and platforms. The primary goal of this data ecosystem approach is building and enhancing connections between people who can more collaboratively use and improve existing processes and platforms, and potentially create new ones in order to lead to better data, more evidence-based strategies and interventions, and eventually better PHEPR responses.

As Kim mentioned, in order to ensure that the people-driven ecosystem would meet the PHEPR needs of our region—and the other regions did similar work—we adopted a user-centered design approach for our planning work and recruited a 19-member CoP composed of state, tribal, and local public health professionals, as well as regional healthcare coalition members and researchers. In Region 10, we engaged regularly with our CoP partners.

We engaged with them synchronously online, asynchronously through various activities, and also had a culminating in-person experience with our CoP and our University of Washington team over the course of our planning grant year.

We worked with our CoP to identify common PHEPR tasks that they want to be supported by this envisioned regional data ecosystem. Our CoP helped us identify four use cases for our ecosystem: hazard tracking, health outcome tracking, population characteristics tracking, and facility capacity tracking. Once we had these in place, we focused on a single specific use case applicable across all of Region 10 to do a deep dive into regional data needs and challenges for that hazard.

The single use case we selected for Region 10 was wildfire tracking and wildfire smoke tracking.

To find ways to address the priority data needs for our region, we looked at technical aspects of the data ecosystem and conducted an environmental scan to do a SWOT analysis. We identified four strengths, thirteen weaknesses, eleven opportunities, and eight threats. Some examples are included here on this slide. For example, an existing strength in Region 10 was a culture of collaboration across organizations and jurisdictions.

A weakness was insufficient technical capacity and training resources in many jurisdictions, which makes the use of technical processes and platforms difficult—especially when coupled with high workforce turnover, which was identified as a major threat.

As for non-technical considerations for our data ecosystem development work, we primarily considered these three aspects: governance, policy, and partnership.

We learned many lessons in our planning work and documented them in a technical report for Region 10. We came up with 86 recommendations and vetted those with our community of practice to guide the regional data ecosystem development. A few high-level ideas are captured here.

For example, use cases such as hazard tracking should drive ecosystem development to ensure its utility for PHEPR professionals. Another priority is to invest in documentation, especially to help minimize duplicate efforts in highly resource-constrained practice sites.

We also have some recommendations to support standardizing data collection and reporting so that such standardization can help downstream data integration and sharing work. Most of all, we included concrete recommendations to foster partnerships and trust-building for the data ecosystem development. It’s so foundational in our findings that we have it envisioned across the bottom of the slide.

Some of our lessons learned were most relevant to the processes and platforms needed for each stage of PHEPR data pipelines—from data collection to data sharing, and through the process of storage, preparation, and analysis.

While some of our other lessons learned were related to people.

Our community of practice in Region 10 highlighted the value and importance of using the same processes and platforms before and during emergencies. PHEPR practitioners and researchers really needed time before emergencies to develop the necessary trust, agreements, and technical familiarity to use the processes and platforms for collaborative response during emergencies.

So in case it hasn't been clear yet, our work suggested that trust is the linchpin that will make regional data ecosystems work, which is why we have featured it at the center of this figure.

In any future implementation of a data ecosystem, we believe that the first pilot effort should really focus on intentional activities to foster trust among regional PHEPR data partners. Our community of practice suggested some activities, including regular meetings and creating an easy-to-use group communication platform, which could be as straightforward as Microsoft Teams or a Slack workspace.

Lastly, we saw many opportunities for synergy between our envisioned regional data ecosystem and some of the foundational data modernization efforts launched over the past few years.

The data ecosystem work has an anchor within Data Modernization Initiative (DMI) activities. Efforts to build collaborative data ecosystems—and the people, processes, and platforms needed to help those be successful—will hopefully progress in the coming years.

And that is it for me. Thank you so much.

ALYSSA BOYEA
Thank you, Janet. We really appreciate it. Finally, we will hear from Vincent on UNC's approach. Vincent, the floor is yours.

VINCENT TOUPS
Hi, I'm Vincent Toups. I worked at the Collaborative Studies Coordinating Center on the CDC project for Region 4, which includes Alabama, Florida, Georgia, Kentucky, Mississippi, North Carolina, South Carolina, Tennessee, and tribal nations.

And as we'll see in our example scenario—let’s see if I have slide control here—yes, there we go.

Our scenario was to look at the impact of a hurricane. This was very soon after Hurricane Helene.

And you'll see that as we get into the goals of our particular data ecosystem design, this was a huge influence in the way we thought about the problem.

Our goal was highly influenced by this idea of a disaster where things like the internet and power systems might not be available. In disaster responses, there's this counterforce where you want to have data to make decisions at exactly the time when data may no longer be available. So we wanted to come up with a voluntary system that would encourage people—with a very low cost of entry—to begin sharing data with one another. Our goals were to reduce morbidity and mortality and to build upon systems that might already be in place, that people in the ecosystem or area might already be using to manage data.

Our CoP members were state health departments, emergency preparedness management programs, departments of public safety, computing and applied sciences programs, centers for health informatics, tribal communities, and others.

Our key goals were covered by two basic ideas. One was to think of an ecosystem as a distributed system of individual actors who come together using some shared format of data to produce an ecosystem. Because we were interested in disaster resilience, we wanted to ensure the system was sufficiently distributed—not necessarily totally centralized—so that the system could continue to function in pieces or altogether.

To do that, we wanted to make a system that was easy to contribute to, so that people already generating data could quickly become part of the system in terms of posting data, but also grab data from the system relatively easily. We wanted to do some centralized material like dashboards and real-time activity, but we wanted that to be designed in such a way that it was optional to the functioning of the system in a time of emergency.

So we proposed a fairly technical design called GRIDS, which is just one of those silly acronyms, but it stands for the Good Enough Resilient Infrastructure for Data Sharing.

This is a model we put together that is meant to be fairly distributed. In this picture of GRIDS, we envisioned data sources—of which we've already seen many examples from other regions in this presentation—generating data in real time.

For these producers, we wanted to provide a technical set of specifications and tools to allow people to easily build an internet server that broadcasts data as it comes in. For example, if this was school absentee data, it might be updated once a day. Whoever manages that data system would follow a standard messaging protocol, broadcasting that information to a centralized cloud data storage system.

The goal was to make this as easy as possible. In our design, we first imagined describing a protocol—a bare protocol that anyone could broadcast to—but also building tools like standardized notebooks, Docker containers, Python libraries, and other technical platforms to make it extremely easy for someone to start broadcasting data once a day, once an hour, or whatever is appropriate.
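To make the producer side concrete, here is a minimal sketch of what a GRIDS-style broadcast message might look like. GRIDS was only a design proposal, so the envelope fields and names here are purely illustrative assumptions, not a published specification:

```python
import json
from datetime import datetime, timezone

# Hypothetical sketch of a GRIDS-style producer message. The real
# protocol was never finalized; every field name here is illustrative.
def make_broadcast(source_id, data_type, rows):
    """Wrap a batch of rows in a simple envelope that a producer could
    POST to the centralized cloud store once a day, once an hour, etc."""
    return json.dumps({
        "source": source_id,
        "type": data_type,
        "sent_at": datetime.now(timezone.utc).isoformat(),
        "rows": rows,
    })

# e.g. a school district broadcasting daily absentee data
msg = make_broadcast(
    "district-12-schools", "school_absenteeism",
    [{"school": "Lincoln Elementary", "absent_pct": 7.5}],
)
parsed = json.loads(msg)
```

The tooling described above (notebooks, Docker containers, Python libraries) would wrap something like this so that a data manager only has to point it at their existing data source and a schedule.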

This data would go into the centralized cloud platform, where a centralized effort would build dashboards and other systems to help people understand what kind of data is moving through the ecosystem, and also to upload processing steps that are formalized in the same language as production steps to transform data as it comes in.

Then there are data consumers—distributed individuals or organizations interested in specific pieces of data that others in the state or region might be producing.

One of the inspirations for this design came from my experience as a software engineer and data scientist. It's often difficult to know ahead of time exactly what data we want and in what format.

So rather than try to specify an elaborate set of standards upfront, we wanted to create a system where it was easy for people to move data into the cloud and start working with it. Then individual consumers could write their own processes to fetch that data.

In this diagram, we imagined that data consumers are running a local system that subscribes to the data ecosystem and makes a local copy of the data they’re interested in. That way, if the power or internet goes out and the central system disappears, each node still has the actionable data it needs.
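The consumer side described here can be sketched just as simply: a node subscribes to the message types it cares about and mirrors them locally, so losing the central system doesn't mean losing the data. Again, all names are hypothetical, assuming the same illustrative envelope as on the producer side:

```python
import json

# Illustrative consumer node: subscribe only to the feed types you care
# about and keep a local copy that survives loss of the central system.
class LocalMirror:
    def __init__(self, wanted_types):
        self.wanted = set(wanted_types)
        self.store = {}  # data_type -> list of received row batches

    def on_message(self, raw):
        """Called for each broadcast seen; keep only subscribed types."""
        msg = json.loads(raw)
        if msg["type"] in self.wanted:
            self.store.setdefault(msg["type"], []).append(msg["rows"])

mirror = LocalMirror(["school_absenteeism"])
mirror.on_message(json.dumps({"type": "school_absenteeism",
                              "rows": [{"absent_pct": 7.5}]}))
mirror.on_message(json.dumps({"type": "weather", "rows": []}))  # ignored
```

If power or internet fails and the central dashboards disappear, each node still holds the subset of the ecosystem's data it had subscribed to.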

Furthermore, we thought about how to design such an ecosystem. One of our inspirations was successful, arguably decentralized data ecosystems like Wikipedia. One of our success metrics was: would this be something people want to participate in? People volunteer their time to edit Wikipedia because they get value from it.

So in addition to the technical component, we also wanted to think about how to build a community. What kind of communication system would be in the centralized dashboard to make people feel like this was worth their time and encourage participation?

We identified a number of technical challenges. First and foremost is engagement. It’s easy to describe a set of standard messaging protocols, and we did specify what kind of message protocol would be used. But if people don’t use it, the system isn’t useful. That’s one of the big challenges with bootstrapping a distributed system—you need initial buy-in.

One possibility is that in the early days of such a system, there would need to be a strong push from funded agencies or groups to begin populating the system with useful data.

This is closely related to the notion of collaboration.

Ideally, we want to create a community where people voluntarily ask themselves, “What kind of data do I have that might be useful to others in the region?” That’s something you need to build a community around.

A more technical challenge we discussed is the question of authentication and authorization. Other members of the panel have talked about data that you can't just share with everyone: you might need IRB approval, or the data might be personally identifiable. In that case, this notion of a fully distributed data ecosystem presents some serious challenges, because in the original design you're copying data every time you see it. And of course, if you have personally identifiable data, then you want to control the degree to which the data spreads.

And so there were some ideas about how to handle this. One way to think about it is to have an out-of-band system for distributing encryption keys. For data that we want to share on the network but do not want to be widely available, we can distribute encryption keys outside of the system, so that people can be logging the data at their local machine, but if they don't have the correct key to understand what that data content is, then it's not useful for them to look at it.
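As a toy illustration only of this out-of-band key idea: every node can log the sealed payload, but only holders of the shared key can recover the content. The keystream construction below is purely for demonstration; a real deployment would use a vetted authenticated-encryption scheme (for example, AES-GCM or Fernet from the `cryptography` library):

```python
import hashlib

# Toy illustration of the out-of-band key idea, NOT production crypto.
# Nodes may log every sealed message; only key holders can read it.
def keystream(key: bytes, n: int) -> bytes:
    """Derive n pseudorandom bytes from the shared key (counter mode)."""
    out = b""
    block = 0
    while len(out) < n:
        out += hashlib.sha256(key + block.to_bytes(8, "big")).digest()
        block += 1
    return out[:n]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with the keystream; the same call seals and opens a payload."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

key = b"shared-out-of-band"            # distributed outside the network
sealed = xor_cipher(key, b"case count: 42")
opened = xor_cipher(key, sealed)       # only key holders can do this
```

The design point is that key distribution happens on a separate channel from data distribution, so the broadcast network itself never needs to gatekeep who sees which bytes.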

And then another technical question that was important to us: we have some data sets that contain a lot of data, and huge data sets don't necessarily lend themselves to incremental updates over time. So we thought very much about this idea of a decentralized, real-time data system.

But then the question of how you would distribute large data sets in such a system came up, and in that case you have to make compromises with the notion of fully distributed data.

Of course, storage is inexpensive.

I don't know if you're familiar with this, but Netflix actually distributes physical boxes to different locations around the country, so that the data they want to distribute is close to every customer. That kind of idea, a smaller number of large data stores that is still distributed, was something we talked about for that kind of information.

So our lessons learned were that we need to begin conversations early and anticipate growth. All of this is easy to talk about from a technical point of view, but unless participants are actually engaged, the system just isn't going to function. So maybe it was a weakness of our design, in some sense, that we thought mostly about the technical problems.

We also wanted to keep it simple and easy to use, and there are multiple definitions of simplicity. For me, as a technical person, my idea of simplicity is: you just install a Python library and then you run it. But of course you have to think about that design from the community's point of view. What mechanisms for participating in a data ecosystem are optimal for the people who are actually in the places where the data is?

Another question, of course, is related to security. We want to know early on what kind of data needs to be secured and what kind of data can be freely distributed. And of course, we want to maintain transparency about how the data ecosystem works.

And also, and I think we did a good job with this one, anticipating atypical data sources and users. Our system is really designed without a specific idea of what kind of data might be integrated into it. So we designed a protocol, but the idea was to make participation as simple as broadcasting a message to the system saying that you have a particular type of data, and here is a row.
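A broadcast like the one described might look like the following minimal message. The field names are assumptions for illustration, not part of any published standard from the project.

```python
import json
import time

def make_message(sender: str, data_type: str, row: dict) -> str:
    """Announce that you hold a particular type of data, and attach one row."""
    return json.dumps({
        "sender": sender,        # who is sharing
        "data_type": data_type,  # a free-form label; no fixed schema is imposed
        "timestamp": time.time(),
        "row": row,              # the payload itself, passed along as-is
    })
```

Because `data_type` is just a label, an atypical source (a school district, a utility) can start broadcasting without any upfront schema negotiation.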

I just want, of course, to acknowledge everybody else on the team: everybody who was part of the regional community of practice, Mimi Kothari and the health scientists, and the CDC team. Thank you.

ALYSSA BOYEA
Thank you so much, Vincent, and to all our speakers, for sharing your work and insights with us. As a reminder, we do encourage folks to add your questions to the Q&A box. I do see a few questions, so I'm going to go ahead and kick us off.

So our first question is around evaluation. One of our attendees would like to know: how have you, or would you, evaluate and show the return on investment of an ecosystem for PHR?

KIMBERLEY SHOAF
I'll take that one. Part of the problem with evaluation and return on investment, on anything where we're looking at public health preparedness and response, is that a lot of the value comes in when something actually happens, right? Are we able to do things quicker, more effectively, more efficiently in a response if we've got data that covers different jurisdictions? And so that's always a problem when we're evaluating.

Well, when something happens, we'll look at that. But in the meantime, if something doesn't happen, is it a wasted investment? The answer to that is building these things so that they're really dual use, so that we can then look at the daily use of sharing data. So that's one of the things that I think about when I think about evaluating and return on investment for something like this.

ALYSSA BOYEA
Thank you. Vincent, I saw you come off of mute. Would you like to add your thoughts?

VINCENT TOUPS
Oh, I'm actually very interested in Erica's question, so I was coming off mute to talk about it. But I don't want to derail the discussion.

ALYSSA BOYEA
We can jump to that. Thank you so much, Kim. So, yes, Erica's question in the Q&A was around tribal data sovereignty. She wanted to learn a little bit more about how some of these projects may have considered that, as she recently learned about this data governance principle, which essentially points out that modern mentalities around data modernization are in direct conflict with data sovereignty. So, Vincent, I know you were eager to take that one, so I'll turn it over to you.

VINCENT TOUPS
So I just think this is an extremely important question. In fact, my life story is that I got out of private-sector data science because I was concerned about the sort of feeding frenzy in the private sector for grabbing as much data as can be grabbed, and then literally just saying, "we own this data," without any kind of consent from the people the data is gathered about.

And I think honestly, I can't fully answer a question like this. I think there are some genuine ethical questions associated with the production and the processing of data.

However, I do think that one of our goals was to produce a system that was decentralized and easy for people to connect to, so that you wouldn't need a lot of money or resources to participate. In a situation like that, the power over how to share data, and which data to share, is more in the hands of the individual people deciding to share it, rather than in centralized systems, which have an almost necessary tendency to smooth out whatever feelings people might have about their specific data and how it is used. So there is a benefit to a distributed data system like that: there isn't as much of a central tendency to extract value from the data.

JANET BASEMAN
Could I also jump in there, Alyssa?

ALYSSA BOYEA
Of course.

JANET BASEMAN
Yeah. So this was really a topic of a lot of discussion within Region 10. We had several tribal partners who were very passionate about this topic in our community of practice, of course.

And some of our key recommendations—not the full 86, like the distilled version that we shared—some of these were highlighted, including the CARE principles. CARE standing for Collective benefit, Authority to control, Responsibility, and Ethics.

Prioritizing that for any data integration and data governance planning, followed by a set of FAIR principles with FAIR standing for Findable, Accessible, Interoperable, and Reusable with simplified data standards for nontraditional data streams.

And both of those frameworks we envisioned as being an essential part of regular data governance discussions to ensure we were fostering trust which, remember, was so foundational to so much of our work and value of the ecosystem, and would allow us to respect the autonomy of the Indigenous peoples and our tribal partners to share their data as they saw fit, and also ensure that they were benefiting.

So there is this kind of bi-directional benefit idea as well. That was just alluded to. That was really important to us as well. Thank you.

MIKE NEWMAN
I can add one more comment here, too. At the Utah Population Database, we're hearing about this tension on this call: there's a need for speed and timeliness in public health response data, but also a need to respect the data providers and their willingness to share the data. And when you're receiving data, aggregating it, and resharing it out, that can take some time, right? So here at the Utah Population Database, we've erred on the side of the data providers themselves. It's a very lengthy process for investigators to come and ask for data, to go through the regulatory processes, and to receive it. But ultimately it's those data use agreements that are so important, to outline how the data is going to be used and to try to find that balance between timeliness and respect for the providers.

ALYSSA BOYEA
Absolutely. Thank you so much for all of your comments.

We do have a few more questions in the chat, so I'll read the next one. This question is about data standards: how do you envision new data standards like FHIR and new networks like TEFCA playing a role in data-ready ecosystems for PHR?

VINCENT TOUPS
I can say something about this very briefly.

One of the things I have learned working at startups and other situations is that it's very, very difficult to anticipate ahead of time all of the different ways that you're going to want to represent data and to formalize that ahead of time. It's just a very difficult thing to do.

And that's because you just never know what information is going to be important or not.

And so one of the guiding principles, as we thought about our proposal, was just encouraging people to share data they were comfortable with, and, it sounds a little crazy to say, without a plan.

Because it may be the case that there's some kind of hurricane, a disaster, and then you suddenly realize that you need the data. And I think if people are inhibited by complicated standardization processes before then, the data just may not even be there.

And so we wanted to build a system where we just said: here's how you communicate data to one another, just send this kind of message. But we did not specify the specific format of the data inside.

If we were to look at the slide, there's a section about data processors. The idea is that as people identify useful ways to represent data for whatever problems they're solving, they can attach a processor to the system that automatically transforms incoming data into a more coherent or more schematized form, so it can be used that way. But we want to encourage the data to get there first, before we try to figure out exactly how that would all work together.
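One way to sketch that data-processor idea (names are illustrative, not taken from the actual proposal): raw rows arrive tagged only with a type label, and any processor attached to that type normalizes them after the fact.

```python
from collections import defaultdict

class Bus:
    """Minimal hub: rows flow in unplanned; processors schematize them later."""
    def __init__(self):
        self.processors = defaultdict(list)

    def attach(self, data_type, processor):
        """Register a function that maps a raw row to a schematized row."""
        self.processors[data_type].append(processor)

    def publish(self, data_type, row):
        """Run every processor attached to this data type on the raw row."""
        return [p(row) for p in self.processors[data_type]]
```

For example, `bus.attach("absenteeism", lambda row: {k.lower(): v for k, v in row.items()})` lets messy field names flow in now and be cleaned up later, which matches the "data first, schema second" principle above.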

JANET BASEMAN
I'll just add that several of our more informatics—so public health informatics—focused CoP members really felt like these standards would be pretty critical moving forward. So there's that.

And I see there's a question about the acronyms.

So FHIR is pronounced "fire," which is pretty cute, and it stands for Fast Healthcare Interoperability Resources. It's an HL7 standard for sharing health information. And then TEFCA is the Trusted Exchange Framework and... what does the CA stand for?

Common Agreement. Did I read your lips right, Kim? Okay, thank you. So we talked a lot about standards; we had a lot of informatics folks in our group.

KIMBERLEY SHOAF
Yeah. And we had an informaticist who's been working with APHL on the whole FHIR effort. So I'm like, yeah, you talk about those things; I don't talk about those things. The informaticists really managed those kinds of issues. And certainly we took those into consideration, right? Because when you're sharing health data, it is really important that we maintain those standards.

ALYSSA BOYEA
Wonderful. Thank you so much for your thoughts on that.

We do have another few questions. The last one that came in was: especially since collaboration and perspective seem to be a valuable aspect of how these projects have been conducted, what type of processes do you undergo when trying to select external vendors specializing in data integration?

KIMBERLEY SHOAF
I will say that in our region, we've actually thought a lot about this before, because there is that issue of: who do you trust, and how do you purchase something? Is something that you purchase off the shelf really going to be responsive to the region's needs? We really looked at the academic institutions as being those sort of trusted, well, I don't know that I like the word vendor, but trusted agents, right? For being able to coordinate and manage this. And part of that was looking at UPDB as an example of that: how, as a trusted agent, others could come and have access to that. So that was where we came from. I'm not sure how the other centers approached that.

JANET BASEMAN
Well, because this was planning work in Region 10, we weren't really, you know, thinking about vendors too much. When our practice partners do bits of this work, they've worked with a range of different vendors. I can't really speak to what processes they're using, but I think it sometimes has to do with relationships and cost, of course, and ability to execute.

Our practice partners have partnered with us, as Kim was saying, on the academic side, as well as with more traditional technical vendors.

ALYSSA BOYEA
Thank you so much. So one last question in the Q&A: Did any of you look at or use the Patient Unified Lookup System for Emergencies, or PULSE, as part of your designed ecosystem?

KIMBERLEY SHOAF
We did not. And that's actually the first I'm hearing of it, so I'm going to go look that up and follow up on it. Part of that is that we didn't necessarily look at very specific pieces, right? We had some broad ideas, but we didn't really look at very specific data sets and data sources for this.

ALYSSA BOYEA
All right.

Well, I'm going to ask one last question, since I don't see any more in the Q&A. I have one last question for our panelists.

So, thinking about the future: what are you most excited about when you think about the future of data ecosystems and ecosystem frameworks? What gets you really excited?

JANET BASEMAN
I could go first.

What we learned through this work is that the opportunity space for this is huge. So I would be excited about collaboratively creating something: we did the planning grant, and now we could actually build something. You know, our first question was about evaluation; if we get to create something, then we get to figure out how to make it better. So I'm excited about being able to collaboratively, along with the practice partners and the academic partners, bring those decision makers, practitioners, and academics together, based on the foundational planning work we just completed over the last year, because there's so much that needs to be done, and so many places where I think we could really improve things.

VINCENT TOUPS
I can answer something.

I think I'm actually quite skeptical about AI systems and their utility in a variety of things.

But I do think we're entering into a period where there's an unprecedented opportunity to take unstructured data and produce structured data out of it.

And so there's so much data floating around because of the way we've constructed the world with computers connecting everything. And if you've ever tried—I'm sure people in the meeting have tried—combining data sets from disparate sources and realized how hairy that problem can be.

Just because it's very difficult and subjective and complicated to figure out how things connect up.

But I think that we're reaching the stage where some of that work can be automated in a way which we could never have done before, and I think there's an opportunity there. There are risks there, of course—also serious risks—but I do think it's a very exciting time to be in a world where a lot of data that was basically not analyzable is in reach for those kinds of systems.

KIMBERLEY SHOAF
Let me add to that. For me, the most important piece of this, and the thing that really excites me, is the opportunity it provides our state, and particularly our local health department partners, to manage disasters as they're coming at them in a way that is so much easier. A number of years ago we worked with one of our local health departments to create an automated student absenteeism system. Prior to that, they had been manually downloading all the absenteeism data from their one school district once a week and then manually analyzing it, and by the time they did that, anything that was going to come up was already too late. We worked with them to create a system where every single day the school district just uploads the data, and the system analyzes it and looks at the trends over time. It went from a full day of an epidemiologist's time to, every morning, just turning on their computer and seeing what happened, and how much more time they had to respond.
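A daily trend check like the one described might be sketched as follows. The 14-day window and the threshold are assumptions for illustration, not the health department's actual parameters.

```python
from statistics import mean, stdev

def flag_spikes(daily_counts, window=14, z=2.0):
    """Flag days whose absentee count sits well above the recent baseline."""
    flags = []
    for i in range(window, len(daily_counts)):
        baseline = daily_counts[i - window:i]
        m, s = mean(baseline), stdev(baseline)
        if s > 0 and (daily_counts[i] - m) / s > z:
            flags.append(i)   # day i looks anomalous relative to its window
    return flags
```

The point of the automation is exactly what the speaker describes: the district uploads counts daily, and a check like this surfaces anomalies the same morning instead of a week later.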

And that really is what excites me about this: that we have so much data, and the ability to bring those data sets together, to bring all those points together, and to utilize things like AI to manage it and make it actually useful and helpful for our public health partners. That to me is what really excites me.

ALYSSA BOYEA
Wonderful. Thank you all so much again for your thoughts and feedback. I would be remiss if I didn't ask folks to take a moment to please evaluate our session. We're going to put up the QR code in just a second, and the eval link is in the chat. Your insights and feedback are very valuable in helping us make sure that we are planning helpful and beneficial events.

And again, I just want to say a big thank you to all of our speakers today. This was such a wonderful conversation. And again, thank you all for joining us, and we hope you have a wonderful rest of your day.

Take care.

Thank you.