
Accessing National Water Model (NWM) Data via Google Cloud BigQuery API

· 3 min read
GCP architecture diagram. Image Source: https://github.com/BYU-Hydroinformatics/api-nwm-gcp

Several important historical and ongoing National Water Model (NWM) datasets are now available on Google Cloud BigQuery, which makes them queryable with SQL from the Google Cloud console. Some of these datasets are also accessible through an API (e.g., from Python). The datasets and their current status are as follows:

Product | Cloud Console SQL | CIROH API | Historical | Daily Updates
Medium-range forecasts | X | X | X | X
Long-range forecasts | X | X | X | X
Analysis and Assimilation | X | X | X | X
Retrospective Data (NWM v3) | XX
Return Periods | XX

To gain access and run queries on BigQuery:

This service offers a convenient and efficient way to interact with NWM data for your research and analysis needs.
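
As a rough illustration of the SQL access described above, the sketch below builds a parameterized query in Python and, optionally, submits it with the google-cloud-bigquery client. The table path and the column names (`feature_id`, `reference_time`, `time`, `streamflow`) are placeholders assumed for illustration, not the confirmed schema; check the dataset in the Cloud console for the real names.

```python
from datetime import date

# Hypothetical table path; replace with the real NWM table shown in the Cloud console.
TABLE = "my-project.nwm_dataset.medium_range_forecasts"


def build_forecast_query(table: str = TABLE) -> str:
    """Return a parameterized SQL string (column names are assumed, not confirmed)."""
    return (
        "SELECT reference_time, time, streamflow\n"
        f"FROM `{table}`\n"
        "WHERE feature_id = @feature_id\n"
        "  AND DATE(reference_time) = @ref_date\n"
        "ORDER BY time"
    )


def run_forecast_query(feature_id: int, ref_date: date, table: str = TABLE):
    """Submit the query; needs google-cloud-bigquery installed and valid credentials."""
    # Imported lazily so the query builder above works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("feature_id", "INT64", feature_id),
            bigquery.ScalarQueryParameter("ref_date", "DATE", ref_date),
        ]
    )
    job = client.query(build_forecast_query(table), job_config=job_config)
    return list(job.result())


if __name__ == "__main__":
    # Print the SQL so it can also be pasted into the Cloud console editor.
    print(build_forecast_query())
```

Passing values as query parameters rather than formatting them into the SQL string keeps the same query text usable in both the console and the client.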

Public-Private Partnership: Advancing Water Resource Management

The National Water Model (NWM) BigQuery project exemplifies a successful collaboration between public and private sectors, uniting government-generated data with modern, cutting-edge cloud technology. This collaboration addresses several key aspects:

  • Improved Data Access: By leveraging Google Cloud BigQuery, a private sector platform, the project dramatically improves access to public NWM data. This partnership makes valuable water resource information more readily available to researchers, policymakers, and the public.
  • Technological Innovation: The integration of NWM data with BigQuery showcases how private sector technology can enhance the utility of public sector data. This synergy promotes innovation in data analysis and visualization techniques.
  • Cost-Effective Solutions: The CIROH DevOps team's commitment to covering query costs demonstrates how public funding can be strategically used to make private sector tools accessible to a wider audience, particularly in the academic and research communities.
  • Capacity Building: This initiative helps build capacity across sectors by providing researchers and organizations with powerful tools to analyze water resource data, potentially leading to better-informed decision-making in water management.
  • Scalability and Efficiency: By utilizing Google's cloud infrastructure, the project ensures that the growing volume of NWM data can be efficiently stored, accessed, and analyzed, addressing the scalability challenges often faced by public sector IT resources.
  • Cross-Sector Collaboration: This project fosters collaboration between government agencies, academic institutions, and private technology companies, creating a model for future partnerships in environmental and resource management.
  • Open Science Promotion: By making NWM data more accessible, this partnership supports the principles of open science, encouraging transparency and reproducibility in water resource research.

This public-private partnership not only enhances the value of the National Water Model but also sets a precedent for future collaborations that can drive innovation in environmental data management and analysis.

CIROH Cloud User Success Story

· 3 min read

This month, we are excited to showcase two case studies that used our cyberinfrastructure tools and services. They demonstrate how CIROH's cyberinfrastructure supports hydrological research and operational advancements.

1. ngen-datastream and NGIAB


Overview:

CIROH’s cloud computing resources have allowed for the development of ngen-datastream, which automates the process of collecting and formatting input data for NextGen, orchestrating the NextGen run through NextGen In a Box (NGIAB), and handling outputs. This software allows users to run NextGen in an efficient, relatively painless, and reproducible fashion, increasing community access to the NextGen framework. ngen-datastream is already accessible to the community (https://github.com/CIROH-UA/ngen-datastream/tree/main) and is making an impact on research. A major component of this software is the Amazon Web Services (AWS) cloud-based research datastream (https://github.com/CIROH-UA/ngen-datastream/tree/main/research_datastream), a CONUS-wide recurring NextGen simulation configured by the community. The Terraform configuration that builds the AWS infrastructure lives in the ngen-datastream repository, and current development focuses on CI/CD and on enabling community contributions to the research datastream via edits to the NextGen configuration. Ultimately, these tools help distribute access to cutting-edge hydrologic research throughout the community, speeding the transition from research to operations in hydrology.

Contribution to CIROH:

  • Automation: It automates the process of collecting, formatting, and validating input data for NextGen, streamlining model preparation.
  • Flexibility: It allows users to provide their own input files to run NextGen.
  • Scalable Infrastructure: It uses an AWS Step Functions state machine to provide access to high-performance computing (HPC) resources.

Infrastructure Utilized:

  • Elastic Compute Cloud (EC2)
  • Simple Storage Service (S3)
  • AWS Lambda and Step Functions

2. TEEHR

  • PI: Katie van Werkhoven
  • Co-PI: Matt Denno (Development Lead)
  • Developer: Sam Lamont

Project Overview:

The goal of this project is to investigate, design, and build a prototype hydrologic model/forecast evaluation system (TEEHR) that will significantly improve our ability to evaluate continental-scale datasets and will provide a robust and consistent evaluation tool for CIROH and OWP research. Design priorities include easy integration into common research workflows, rapid execution of large-scale evaluations, simplified exploration of performance trends and drivers, inclusion of common and emergent evaluation methods, efficient data structures, open-source and community development, and easy extensibility.


Contribution to CIROH:

  • TEEHR-HUB: It is a JupyterHub environment, running the TEEHR image, with AWS services (EFS and S3) to provide a scalable platform for hydrologic research.
  • Data Processing: TEEHR-HUB has successfully processed the AORC (v3.0 retrospective) gridded precipitation data to the MERIT basins, as well as the CONUS 40-year retrospective (v3.0 and USGS).
  • Testbed Integration: TEEHR-HUB’s compatibility with various testbeds allows researchers to experiment with different hydrologic models and datasets.
  • Evaluation: TEEHR is being used (or is planned for use) by several CIROH research teams to evaluate large-scale model results.

Infrastructure Utilized:

  • Elastic Kubernetes Service (EKS) (including supporting AWS services) - Scalable computing resources to host JupyterHub, Dask, and Spark
  • Elastic File System (EFS) - Shared data drive for cached data and shared documents (notebooks, etc.)
  • Simple Storage Service (S3) - Bucket storage for large public and private datasets

CIROH Research CyberInfrastructure Update

· 2 min read

We're excited to share some recent developments and updates from CIROH's Research CyberInfrastructure team:

Cloud Infrastructure

  • CIROH's Google Cloud Account is now fully operational and managed by our team. You can find more information here.
  • We're in the process of migrating our 2i2c JupyterHub to CIROH's Google Cloud account.
  • We've successfully deployed the Google BigQuery API (developed by BYU and Google) for NWM data in our cloud. To access this API, please contact us at ciroh-it-admin@ua.edu. Please refer to NWM BigQuery API to learn more.

Support and Services

  • Monthly AWS office hours are ongoing. For more details on how to join, email us at ciroh-it-admin@ua.edu.
  • We provided IT support for the Summer Institute 2024, REU students, and team leads this summer.

Security Enhancements

  • New security features have been added to our CIROH-UA GitHub repository to prevent commits containing sensitive information.
  • We've updated our AWS best practices, particularly regarding key management. If your project uses CIROH AWS resources, please review these updates at AWS Best Practices.

Resources and Access

  • For external IT resources needed for your projects, check out NSF Access Allocations here.
  • GPU allocation is now available on CIROH's 2i2c JupyterHub. To request access, please fill out this form.

For more information on our services, please refer to our services page.

We're continually working to improve our IT infrastructure and support. If you have any questions or need assistance, don't hesitate to reach out to us at ciroh-it-admin@ua.edu.

CIROH Developers Conference 2024

· 2 min read

The CIROH team recently participated in the 2nd Annual CIROH Developers Conference (DevCon24), held from May 29th to June 1st, 2024. The conference brought together a diverse group of water professionals to exchange knowledge and explore cutting-edge research in the field of hydrological forecasting.

Reflecting CIROH's current research focus, the conference explored topics including hydrological modeling (NextGen), flood inundation mapping, hydroinformatics, social science, and community engagement. Attendees had the opportunity to delve deeper into specific areas through the conference's structured training tracks. This year the tracks were:

  • NextGen
  • Flood Inundation Mapping (FIM)
  • Hydrological Applications of Machine Learning (ML)
  • Hydroinformatics
  • Cross-cutting

This year, various workshops leveraged cloud technologies. Notably, we provided access to the 2i2c JupyterHub environment, a cloud-based platform for interactive computing, for ten workshops. This facilitated seamless access to powerful computing resources for participants. Additionally, we provided AWS instances to support four workshops.


Presentation Slides:

You can find the presentation slides here. To learn more about CIROH's work or connect with the team, visit our website at CIROH-website.

Conference Website:

Learn More


AWRA 2024 Spring Conference

· 2 min read

The CIROH CyberInfrastructure team recently participated in the AWRA 2024 Spring Conference, co-hosted by the Alabama Water Institute at the University of Alabama.

Themed "Water Risk and Resilience: Research and Sustainable Solutions," the conference brought together a diverse group of water professionals to exchange knowledge and explore cutting-edge research in the field.

The CIROH CyberInfrastructure team presented on these topics:

  • Accelerating Community Contribution to the Next Generation Water Resources Modeling Framework
  • Creating a community dataset for high-speed national water model data access
  • Model structure selection for the flood and drought predictions using the NextGen Framework based on the extreme event simulations

CIROH team member James Halgren presented the work on "Accelerating Community Contribution to the Next Generation Water Resources Modeling Framework." The presentation focused on building and sharing a continuous research data stream using the NextGen Water Resources Modeling Framework with NextGen In A Box (NGIAB). This project, a collaboration with Lynker members, showcases the potential for open-source tools and community-driven efforts to advance water resources modeling and research.


CIROH team member Sepehr Karimi presented the work on "Creating a community dataset for high-speed national water model data access."


CIROH team member Shahab Alam presented the work on "Model structure selection for the flood and drought predictions using the NextGen Framework based on the extreme event simulations."


These presentations showcased CIROH's expertise in open-source tools, community-driven efforts, and water resources modeling. The team's contributions sparked insightful discussions and potential collaborations for future projects.

Call to Action:

To learn more about CIROH's work or connect with the team, visit our website at CIROH-website.

Conference Website:

AWRA 2024 Spring Conference website link

Google Cloud Next '24: A Flood of Innovation and Inspiration

· 4 min read

Hello everyone, and thanks for stopping by!

I recently had the incredible opportunity to attend Google Cloud Next 2024 in person for the first time, and it was truly an amazing experience. From insightful keynote presentations and workshops to vibrant booths buzzing with connections, the event was a whirlwind of innovation and inspiration.

One of the highlights was undoubtedly the abundance of AI announcements and advancements. Google continues to push the boundaries of what's possible, and it was exciting to witness the future of technology unfold.

Among the many highlights, CIROH achieved a significant milestone with its first-ever session at Google Cloud Next. The presentation, titled "Channel the Flood Data Deluge: Unlocking the American National Water Model," led by Kel Markert (Google), Dr. Dan Ames (BYU), and Michael Ames (SADA), was a resounding success. The session shed light on the immense potential of the National Water Model and its ability to revolutionize water resource management.


The conference was a truly enjoyable experience, especially collaborating with Dan, Kel, Michael, and others. We had a great time together sharing our insights.


The energy and enthusiasm throughout the event were contagious, and I left feeling incredibly motivated and inspired. I connected with numerous individuals from diverse backgrounds, fostering new collaborations and sparking exciting ideas for the future of water research and technology.

If you're curious to see more about my Google Cloud Next experience, head over to my LinkedIn post, where I've shared pictures from all three days.

Thank you for reading and stay tuned for more updates on the exciting advancements in water research and technology!

Want to delve deeper into the insights and announcements from Google Cloud Next? Check out these valuable resources:

SADA Live: Recap Key Cloud Technology Insights from Google Cloud Next '24: This LinkedIn event offers a comprehensive overview of the key takeaways and technological advancements unveiled at the conference.

Day 2 Google Blog Recap: Dive into the specifics of Day 2 at Google Cloud Next with this insightful blog post, covering topics ranging from AI and data analytics to infrastructure and security.

AI Takes Center Stage:

  • Gemini for Google Cloud: The introduction of Gemini 1.5 Pro, integrated with various Google Cloud services, promises enhanced functionality, security, and AI performance across diverse applications.
  • AI Infrastructure Advancements: The AI Hypercomputer provides exceptional computational power for complex AI tasks, while Gemini API now offers models tailored for various scales, enriching the development environment.
  • Vertex AI Enhancements: New tools for low-latency applications and improved Gemini integration empower developers to build more efficient and sophisticated AI-driven applications.
  • Secure AI Framework (SAIF): Establishes rigorous security standards for AI implementations, ensuring secure and responsible AI integrations.
  • AI Database Assistant: Leverages Gemini to simplify complex queries and deepen AI integration into database management.
  • Google Vids: This innovative Workspace feature utilizes Gemini and Vertex AI to enhance digital storytelling and collaboration, revolutionizing workplace communication.

Infrastructure and Development:

  • Google Axion Processor: This cutting-edge processor boasts significant performance and energy efficiency improvements compared to traditional x86 instances, setting a new standard for computational efficiency.

  • Google Distributed Cloud (GDC) Sandbox: Enables developers to build and test services for GDC within a Google Cloud environment, simplifying the development process.

  • Migrate to Containers (M2C) CLI: This new tool facilitates seamless migration of applications to containers, supporting deployment on GKE or Cloud Run.

Security and Data Analytics:

  • AI Cyber Defense Initiative: Revolutionizes cybersecurity by leveraging AI for innovative solutions against cyber threats.
  • BigQuery as a Unified Platform: Transforms BigQuery into a comprehensive platform for managing multimodal data and executing AI tasks, seamlessly integrated with Gemini.
    Check out all the announcements: link

Monthly News Update - March 2024

· 2 min read
Accelerating Innovation: CIROH's March 2024 Update

The CIROH team has been diligently accelerating research cyberinfrastructure capabilities this month. We're thrilled to share key milestones achieved in enhancing the Community NextGen project and our cloud/on-premises platforms.

A significant highlight was the successful launch of our new, fully operational on-premises infrastructure. Comprehensive documentation is now available here, ensuring seamless access and utilization. Additionally, we've fortified the NextGen in a Box (NGIAB) ecosystem with bug fixes and repository enhancements, and have begun automating the CI pipeline for the Singularity repo.

Empowering our community remains a top priority. We've expanded the DocuHub knowledge base with dedicated sections on on-premises access guidelines, as well as policies and best practices for optimized infrastructure usage here. Furthermore, our team represented CIROH at the AWRA Geospatial Water Technology Conference in Orlando, sharing insights on leveraging geospatial data for water research. More details are available here.

As we continue driving advancements, we extend our gratitude for your unwavering support of the Community NextGen project and CIROH's cyberinfrastructure endeavors. Be on the lookout for more exciting updates next month as we strive to unlock new frontiers in water science through robust computing capabilities.

Click Here to Visit Community NextGen and NGIAB News from March 2024

Monthly News Update - February 2024

· One min read

Welcome to the February edition of the CIROH DocuHub blog, where we bring you the latest updates and news about the Community NextGen project and CIROH's cloud and on-premises infrastructure.

Our team has been hard at work enhancing CIROH's Infrastructure and Community NextGen tools. Here are some highlights from February 2024:

  1. We successfully launched our new On-premises Infrastructure, which is now fully operational. You can find documentation for it here.

  2. For NGIAB, we've made improvements to the CI pipeline for pull requests submitted with forked repositories. Now, we automatically build and test these submissions using the CI pipeline.

  3. We've added documentation for the NWMURL python package, which offers utility functions tailored for accessing National Water Model (NWM) data URLs. This library streamlines the process of accessing NWM data for various purposes, including analysis, modeling, and visualization. You can explore the documentation here.

  4. We're thrilled to announce the NextGen Track for DevCon24. The schedule is now available at: DevCon24 Schedule.
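
To give a sense of what item 3 means by NWM data URLs, here is a minimal sketch that assembles one by hand. It does not use the NWMURL package or reproduce its API; the function name, the base URL, and the file-naming pattern are assumptions based on the commonly seen layout of operational NWM output, and should be checked against the package documentation.

```python
from datetime import date

# Assumed base URL of a public NWM archive; verify before relying on it.
BASE_URL = "https://storage.googleapis.com/national-water-model"


def nwm_channel_url(day: date, cycle_hour: int, lead_hour: int,
                    run: str = "short_range", base_url: str = BASE_URL) -> str:
    """Build the URL of one channel-routing output file (naming pattern is assumed)."""
    if not 0 <= cycle_hour <= 23:
        raise ValueError("cycle_hour must be between 0 and 23")
    if lead_hour < 1:
        raise ValueError("lead_hour must be at least 1")
    # Example file name: nwm.t00z.short_range.channel_rt.f001.conus.nc
    filename = f"nwm.t{cycle_hour:02d}z.{run}.channel_rt.f{lead_hour:03d}.conus.nc"
    return f"{base_url}/nwm.{day:%Y%m%d}/{run}/{filename}"


if __name__ == "__main__":
    print(nwm_channel_url(date(2024, 2, 1), cycle_hour=0, lead_hour=1))
```

A dedicated package is useful precisely because it handles many run types, variables, and date ranges that a hand-written helper like this one would have to special-case.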

Thank you for your ongoing interest and support in the Community NextGen project. Stay tuned for more exciting updates and developments next month. 😊

Click Here to Visit Community NextGen and NGIAB News from Feb 2024

NextGen Monthly News Update - January 2024

· 2 min read

Welcome to the January edition of the CIROH DocuHub blog, where we share monthly updates and news about the Community NextGen project. NextGen is a cutting-edge hydrologic modeling framework that aims to advance the science and practice of hydrology and water resources management. In this month's blog, we will highlight some of the recent achievements and developments of the Community NextGen team.

First, we are excited to announce that NextGen In A Box (NGIAB) is now available with Singularity support. This means that you can run NGIAB on any HPC system that does not support Docker, using Singularity containers. Singularity is a popular tool for creating and running portable and reproducible computational environments. To learn how to use NGIAB with Singularity, please visit our GitHub repository: Ngen-Singularity.

Second, we have made several improvements and enhancements to NGIAB, such as updating the sample input data, upgrading the Boost library, adding auto mode run, and supporting geopackage format. You can find more details about these updates on our GitHub repository: NGIAB-CloudInfra.

Third, we would like to share the development of NextGen Datastream, a tool that automates the process of collecting and formatting input data for NextGen, orchestrating the NextGen run through NextGen In a Box (NGIAB), and handling outputs. The NextGen Datastream is a shell script that orchestrates each step in the process, using a configuration file that specifies the data sources, parameters, and options for the NextGen run. The NextGen Datastream can also generate its own internal configs and modify the configuration file as needed. You can find more details and instructions on how to use the NextGen Datastream on our GitHub repository: ngen-datastream.

We hope you enjoyed this blog and found it informative and useful. If you have any questions, comments, or feedback, please feel free to contact us at ciroh-it-admin@ua.edu. Thank you for your interest and support in the Community NextGen project. Stay tuned for more exciting news and developments in the next month. 😊

Visit NGIAB News

NextGen Framework Forcings

· One min read

A new forcing processor tool has been made public. This tool converts any National Water Model based forcing files into ngen forcing files. The conversion can be intensive in compute, memory, and I/O, so this tool simplifies generating ngen input and ultimately makes running ngen more accessible.

Read more

Visit Github

Welcome DocuHub's Blog

· 2 min read
Arpita Patel
Creator/Maintainer of DocuHub

Adding posts

What file name to use?

DocuHub will extract a YYYY-MM-DD date from many patterns such as YYYY-MM-DD-my-blog-post-title.md or YYYY/MM/DD/my-blog-post-title.md. This enables you to easily group blog posts by year, by month, or to use a flat structure.
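
The two filename patterns named above can be illustrated with a short sketch. This is not DocuHub's actual implementation; the regular expression and the function name `extract_post_date` are assumptions that cover only the two example patterns, for paths given relative to the blog directory.

```python
import re
from datetime import date
from typing import Optional

# Matches a leading YYYY-MM-DD- or YYYY/MM/DD/ prefix (covers only the two example patterns).
DATE_PATTERN = re.compile(r"^(\d{4})[-/](\d{2})[-/](\d{2})[-/]")


def extract_post_date(relative_path: str) -> Optional[date]:
    """Return the date encoded at the start of a blog post path, or None if absent."""
    match = DATE_PATTERN.match(relative_path)
    if match is None:
        return None
    year, month, day = (int(part) for part in match.groups())
    try:
        return date(year, month, day)
    except ValueError:
        # Digits were present but do not form a real calendar date.
        return None


if __name__ == "__main__":
    for path in ("2019-09-05-hello-docuhub.md", "2019/09/05/my-blog-post-title.md"):
        print(path, extract_post_date(path))
```

Either layout yields the same date, which is why posts can be kept flat or grouped into year and month folders.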

Example (with Metadata/Front matter)

To publish in the blog, create a Markdown file within the blog directory. For example, create a file at /blog/2019-09-05-hello-docuhub.md.

e.g.

---
title: Welcome DocuHub
description: This is my first post on DocuHub.
slug: welcome-DocuHub
authors:
  - name: John Doe
    title: Co-creator of Product 1
    url: <Your GitHub product or external article link>
    image_url: <Author pic url>
  - name: Jane Doe
    title: Co-creator of Product 2
    url: <Your GitHub product or external article link>
    image_url: <Author pic url>
tags: [hello, docuhub, nextgen]
hide_table_of_contents: false
---

Welcome to this blog. This blog is created with [**DocuHub 2**](https://docs.ciroh.org/).