NWM Data Access
Within the CIROH projects, we encounter a wide range of data resources and data access inquiries. One of the most frequently asked questions is, "How can I obtain access to xyz-resource?". To help answer that question, we have documented some of the most common data access methods and resources here, with links to additional sites to dive deeper.
Input and Output Data of the National Water Model
Here, you will find resources that grant access to the input data used and the output data produced by the operational national water model.
Official NOMADS Resource
The official NWM meteorological inputs and hydrology and routing outputs are accessible through both HTTP and FTP. These resources are provided by the National Centers for Environmental Prediction (NCEP) at the following locations:
As of October 24, 2023, these resources include the following directories:
- para_post-processed/ (modified 22-Sep-2023)
- post-processed/ (modified 02-Nov-2020)
- prod/ (modified 24-Oct-2023)
- v3.0/ (modified 24-Oct-2023)
The para_post-processed directory lacks specific documentation, although the "para" designation suggests a "parallel" execution, i.e., a candidate production run being tested for operational use. The post-processed dataset contains the following subdirectories:
- RFC: Outputs filtered down to RFC locations.
- WMS: Contains re-indexed/reformatted outputs in per-forecast netCDFs suitable for rapid querying and responsive graph visualization on the water.noaa.gov/map site.
- IMAGES: .png-formatted renderings of NWM output for various domains and variables.
- logs: Logs. :)
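For scripted downloads from NOMADS, file URLs can be assembled from the run date, model cycle, and forecast hour. The sketch below shows one way to build such a URL; the nwm.YYYYMMDD/short_range/ layout and the channel_rt filename pattern are assumptions drawn from the prod/ listing, so verify them against the live directory index before relying on them.

```python
# Hedged sketch: assemble the URL of one operational NWM file on NOMADS.
# The directory layout and filename pattern are assumptions; check the
# live prod/ index before use.
from datetime import date

NOMADS_BASE = "https://nomads.ncep.noaa.gov/pub/data/nccf/com/nwm/prod"

def nomads_channel_url(run_date: date, cycle_hour: int, forecast_hour: int) -> str:
    """Assumed URL of a short-range channel-routing (channel_rt) file."""
    return (
        f"{NOMADS_BASE}/nwm.{run_date:%Y%m%d}/short_range/"
        f"nwm.t{cycle_hour:02d}z.short_range.channel_rt.f{forecast_hour:03d}.conus.nc"
    )

# The resulting URL can be fetched with any HTTP client, e.g. urllib or requests.
print(nomads_channel_url(date(2023, 10, 24), 0, 1))
```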
NODD - NOAA Open Data Dissemination Program
"The NOAA Open Data Dissemination (NODD) Program provides public access to NOAA's open data on commercial cloud platforms through public-private partnerships. These partnerships remove obstacles to public use of NOAA data, help avoid costs and risks associated with federal data access services, and leverage operational public-private partnerships with the cloud computing and information services industries." (For more information, visit NODD)
The NODD datasets made available through several public cloud vendors are an incredible resource for accessing NWM data for research and evaluative purposes. The NWS NODD datasets are listed on this page and include the following:
AWS
AWS hosts two repositories as part of its Sustainability Data Initiative.
The first repository contains the operational data. It now hosts a four-week rolling collection of all output; it previously held only the short-range forecasts, and the registry entry's description still covers only the short_range data (here). The same resource is also described on the sustainability initiative page (here).
- The catalog of AWS-hosted operational NWM data can be browsed here.
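In addition to the browsable catalog, the operational bucket can be explored programmatically. The sketch below assumes the bucket name noaa-nwm-pds from the open data registry entry and a per-day nwm.YYYYMMDD/ key prefix; both should be confirmed against the catalog before use.

```python
# Hedged sketch: browse the AWS-hosted operational NWM bucket anonymously.
# The bucket name and per-day prefix layout are assumptions; confirm them
# against the open data registry catalog.
from datetime import date

BUCKET = "noaa-nwm-pds"  # assumed operational bucket name

def day_prefix(d: date) -> str:
    """Assumed per-day key prefix inside the operational bucket."""
    return f"{BUCKET}/nwm.{d:%Y%m%d}/"

def list_day(d: date):
    """List one day's objects (requires network access and s3fs installed)."""
    import s3fs  # third-party: pip install s3fs
    fs = s3fs.S3FileSystem(anon=True)  # public bucket, no credentials needed
    return fs.ls(day_prefix(d))
```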
The second (and more useful) AWS repository contains several versions of the retrospective dataset, each described on the main open data registry page here. (The same information also appears on the AWS sustainability initiative webpage here.)
The catalogs of the (currently) five versions of that resource are linked below:
- Two versions of NWM v2.1 retrospective
- Two versions of NWM v2.0 retrospective
- NWM v1.2 retrospective data
- netCDF, here
The AWS retrospective resource is the primary publicly available source for version 1.0 of the Analysis of Record for Calibration (AORC) dataset, a 40-year best-available estimate of the most common meteorological parameters required for hydrologic modeling. Version 1.1 of the dataset is expected to accompany the release of the NWM version 3.0 retrospective (which may instead be designated v2.2), hopefully in the next few weeks.
Jupyter notebook instructions for processing NWM Zarr and netCDF output formats are available here.
An example of pulling data from the v2.1 channel output Zarr archive and writing the results to CSV follows:
'''
# Install these libraries if they aren't already installed
!pip install zarr
!pip install xarray
!pip install s3fs
!pip install numpy
'''
# Import needed libraries
import xarray as xr
import s3fs
from datetime import datetime

# Open the Zarr store (anonymous access; no AWS credentials required)
url = "s3://noaa-nwm-retrospective-2-1-zarr-pds/chrtout.zarr"
fs = s3fs.S3FileSystem(anon=True)
store = xr.open_zarr(s3fs.S3Map(url, s3=fs))

# Get xarray arrays of the variables of interest
time_array = store['time']
feature_id_array = store['feature_id']
streamflow_array = store['streamflow']

# Function to get the time series for a specified reach id and time range,
# then write it out to a CSV file.
def GetAndWriteTimeSeriesAtReach(reach_id, start_time_index, end_time_index):
    flows = streamflow_array.where(feature_id_array == reach_id, drop=True)
    df_flows = flows[start_time_index:end_time_index].to_dataframe()
    df_flows.to_csv(f'flows_{reach_id}.csv')

# Define the feature IDs to check for
feature_ids = [5781221, 5781223, 5781703]

# Specify the start and end times of interest
start_time = datetime(2015, 5, 23, 0, 0, 0)
end_time = datetime(2015, 6, 24, 0, 0, 0)

# Convert the dates to integer indices on the hourly time axis,
# which begins 1979-02-01 00:00 for the v2.1 retrospective
zero_start_time = datetime(1979, 2, 1, 0, 0, 0)
start_time_index = int((start_time - zero_start_time).total_seconds() / 3600)
end_time_index = int((end_time - zero_start_time).total_seconds() / 3600)

for reach_id in feature_ids:
    GetAndWriteTimeSeriesAtReach(reach_id, start_time_index, end_time_index)
'''
Simple Script for Retrieving Retrospective NWM Data from AWS Store
Dan Ames, 11/17/2023
dan.ames@byu.edu
'''
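The integer slicing in the script works because the v2.1 retrospective time axis is hourly and starts at 1979-02-01 00:00. A small helper (pure Python, no download required) makes that mapping explicit. Note that xarray's label-based selection, e.g. store['streamflow'].sel(time=slice(start_time, end_time)), is an alternative that avoids the index arithmetic entirely.

```python
# Helper illustrating the hour-index arithmetic used in the script above.
# The epoch below is the assumed first timestep of the v2.1 retrospective axis.
from datetime import datetime

RETRO_EPOCH = datetime(1979, 2, 1)

def hourly_index(t: datetime) -> int:
    """Integer position of timestamp t along the hourly retrospective time axis."""
    return int((t - RETRO_EPOCH).total_seconds() // 3600)

print(hourly_index(datetime(2015, 5, 23)))  # start index used in the script
```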