LROSE Workshop # 1 - 2017

First LROSE Project Workshop, NCAR, April 2017

The initial (kick-off) workshop for the LROSE project was held at the NCAR Foothills Laboratory, in Boulder, 2017/04/11 - 2017/04/12.

The aim of this workshop was to discuss with the user community ideas on how best to proceed with the project, including setting goals and priorities.

Agenda day 1 - 2017/04/11

Time Agenda item
08:00 Registration / Coffee
08:30 Welcome address and housekeeping (Michael Bell)
08:40 LROSE Overview and Workshop Goals (Michael Bell)
09:00 Current state of the software (Mike Dixon)
09:30 The Python ARM Radar Toolkit: a community-based architecture for interacting with weather radar data (Scott Collis)
10:00 Coffee break
10:15 WMO and international collaboration (Daniel Michelson)
10:30 Compute pipelines using Pegasus Workflows - an introduction (Karan Vahi)
11:00 Breakout session # 1
12:00 Breakout sessions report back
12:20 Lunch - NCAR Cafeteria
13:30 Defining the spatial properties of precipitation features using data from the WSR-88D network (Corene Matyas)
14:00 Breakout session # 2
15:00 Coffee break
15:15 Breakout sessions report back, and plenary discussion
16:45 Day 1 adjourn


Agenda day 2 - 2017/04/12

Time Agenda item
07:30 Breakfast provided
08:30 Private industry collaboration and Artview (Nick Guy)
09:00 Breakout session # 3
10:00 Breakout sessions report back
10:30 Coffee break
10:50 Plenary session and wrap-up
12:20 Adjourn


Tentative Breakout Session Discussion Topics

Breakout # 1

  1. Working together: LROSE, Py-ART, BALTRAD
  2. Defining workflows: common tasks for common goals and reproducibility
  3. TBD

Breakout # 2

  1. LROSE internals and design principles
  2. Science priorities: meeting the needs of scientists across disciplines
  3. TBD

Breakout # 3

  1. Building community: mechanisms for including externally-maintained code
  2. Big ideas: leveraging LROSE for high-risk, high-reward science
  3. TBD

Presentation abstracts

Scott Collis
Argonne National Laboratory
Title: The Python ARM Radar Toolkit: A community-based architecture for interacting with weather radar data

Py-ART is a midsize (100+ users), community-developed (19 contributors) Python toolkit for interacting with data produced by meteorological radars. The original aim was to facilitate the dissemination of algorithm research funded by the Department of Energy's ARM program. Since its public release approximately four years ago, careful package management has allowed Py-ART to grow in use while keeping a narrow, maintainable scope. This presentation will outline the philosophy of the package, the techniques used to keep the project tractable, and a new five-year roadmap for the future of the project.

Karan Vahi
USC Information Sciences Institute - Pegasus Team
Title: Compute Pipelines using Pegasus Workflows: An Introduction

Workflows are a key technology for enabling complex scientific applications. They capture the interdependencies between processing steps in data analysis and simulation pipelines, as well as the mechanisms to execute those steps reliably and efficiently in a distributed computing environment. They also enable scientists to capture complex processes to promote sharing and reuse, and provide the provenance information necessary for the verification of scientific results and scientific reproducibility. The talk will give an introductory overview of the Pegasus Workflow Management System (Pegasus WMS). Pegasus allows users to design workflows at a high level of abstraction, independent of the resources available to execute them and of the location of data and executables. It compiles these abstract workflows into executable workflows that can be deployed onto distributed resources such as local campus clusters, computational clouds, and grids such as XSEDE and Open Science Grid. During the compilation process, Pegasus performs data discovery, whereby it determines the locations of input data files and executables. Data transfer tasks are added to the executable workflow; these stage the input files in to the cluster and stage the generated output files out to a user-specified location. In addition to the data transfer tasks, data cleanup tasks (which remove data that is no longer required) and data registration tasks are also added. Pegasus also captures all the provenance of the pipeline lifecycle from the planning stage, through execution, to the final output data, helping scientists to accurately measure the performance of their pipelines and reconstruct the history of data products. Pegasus provides both command-line tools and a web dashboard for debugging and monitoring that allow users to easily detect and debug failures in their pipelines.
Pegasus has been used in a number of scientific domains including astronomy, bioinformatics, earthquake science, gravitational wave physics, ocean science, limnology, and others. Pegasus workflows are also used for automatic quality control analysis of phenotypic data submissions to NRGR, a large NIH-funded repository.
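
The planning step described in this abstract can be illustrated with a minimal sketch. It does not use the Pegasus API; it is a standard-library Python illustration (Python 3.9 or later, for `graphlib`) of the general idea of adding stage-in, stage-out, and cleanup tasks to an abstract workflow and then ordering everything by dependency. The function name `plan`, the task names, and the file names are all hypothetical.

```python
from graphlib import TopologicalSorter


def plan(abstract_workflow, external_inputs, final_outputs):
    """Return one valid execution order for an abstract workflow.

    abstract_workflow maps each compute task name to a pair
    (list of input files, list of output files). This is an
    illustrative data layout, not the Pegasus workflow format.
    """
    # Record which task produces each file and which tasks consume it.
    producer, consumers = {}, {}
    for task, (inputs, outputs) in abstract_workflow.items():
        for f in outputs:
            producer[f] = task
        for f in inputs:
            consumers.setdefault(f, set()).add(task)

    deps = {}  # task name -> set of tasks that must finish first
    for task, (inputs, _) in abstract_workflow.items():
        deps[task] = set()
        for f in inputs:
            if f in external_inputs:
                # Input lives outside the cluster: add a stage-in task.
                deps[task].add(f"stage_in:{f}")
                deps.setdefault(f"stage_in:{f}", set())
            else:
                # Input is produced by another compute task.
                deps[task].add(producer[f])

    # Final products are staged out once their producer has finished.
    for f in final_outputs:
        deps[f"stage_out:{f}"] = {producer[f]}

    # Intermediate files are cleaned up after every consumer has run.
    for f in producer:
        if f not in final_outputs and f in consumers:
            deps[f"cleanup:{f}"] = set(consumers[f])

    return list(TopologicalSorter(deps).static_order())


# Hypothetical two-step radar pipeline: convert raw data, then grid it.
workflow = {
    "convert": (["raw.dat"], ["volume.nc"]),
    "grid": (["volume.nc"], ["grid.nc"]),
}
for step in plan(workflow, {"raw.dat"}, ["grid.nc"]):
    print(step)
```

The user only describes the two compute tasks; the transfer and cleanup tasks are derived from the file dependencies, which is the separation between abstract and executable workflows that the abstract describes.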

Corene Matyas and Jingyin Tang
University of Florida
Title: Defining the spatial properties of precipitation features using data from the WSR-88D network

Geographers specialize in the analysis of spatial patterns. To examine rainfall patterns, climatologists may rely on data interpolated from rain gauges. However, these data do not permit the exploration of features such as convective cells inside the rainband of a tropical cyclone. These features can be resolved through an analysis of the high spatial and temporal resolution data produced by the WSR-88D network. Although the capabilities of Geographical Information Systems (GIS) for analyzing temporal data are somewhat limited, their use by non-geographers, including meteorologists, is growing. This suggests that collaboration between geographers specializing in geospatial techniques and researchers in climatology could develop new GIS-based methods for the spatial analysis of radar data that facilitate climate-scale research of precipitation features. This presentation features techniques our research group has developed to quantify the spatial patterns of radar reflectivity values through the calculation of shape metrics for the rain fields of landfalling tropical cyclones.
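
The abstract does not say which shape metrics the group calculates. As an illustration only, the sketch below computes one widely used metric, compactness (4πA/P², equal to 1 for a circle and approaching 0 for elongated shapes), for a polygon that could outline a region of radar reflectivity. The function names and the example polygons are hypothetical and are not taken from the authors' work.

```python
import math


def polygon_area_perimeter(vertices):
    """Area (shoelace formula) and perimeter of a simple polygon.

    vertices is a list of (x, y) pairs in order around the outline;
    the polygon is closed implicitly from the last vertex to the first.
    """
    twice_area = 0.0
    perimeter = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0, perimeter


def compactness(vertices):
    """Compactness 4*pi*A / P**2: 1 for a circle, smaller for elongated shapes."""
    area, perimeter = polygon_area_perimeter(vertices)
    return 4.0 * math.pi * area / perimeter ** 2


# Hypothetical outlines, in arbitrary distance units.
cell = [(0, 0), (1, 0), (1, 1), (0, 1)]    # compact, cell-like region
band = [(0, 0), (10, 0), (10, 1), (0, 1)]  # elongated, band-like region

print(f"cell compactness: {compactness(cell):.3f}")
print(f"band compactness: {compactness(band):.3f}")
```

A metric of this kind gives a single number per rain-field polygon, so it can be tracked across many radar volumes or storms.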

Notes from Breakout Sessions

Breakout #1: LROSE, Py-ART and BALTRAD working together as a community.

  • Common file formats are a necessary 'glue' for communication between packages.
  • We need a validator app for each format, so users can check if they are in compliance.
  • Some duplication of effort can be good, to test out different approaches to a problem.
  • A common scripting layer (e.g. Pegasus), on top of each system, can provide a uniform approach to running apps in each system.
  • LROSE needs a data discovery / cataloging component.

Breakout #2: Tech Talk

  • Q: will LROSE work for operational as well as archive purposes? A: Yes.
  • Documentation: recommendation to use docstrings in code.
  • Module size: small enough to be manageable, large enough to keep the module count reasonable.
  • Request: to expose LROSE apps as Python modules.

Breakout #3: How to import externally-developed code.

  • For the most part, external modules should be apps, not part of the core library.
  • Use git workflows to manage external work.
  • Integrate and release frequent small changes, rather than waiting for larger modifications.
  • Provide a roadmap to show our plan of where we want LROSE to go.
  • Provide external developer guidelines early on - don't wait.
  • Good testing of external modules is essential. Nothing accepted without adequate testing.
  • Good documentation is essential. Nothing accepted without good docs.
  • Provide 'Gatekeeper' function - an SE who vets the code before accepting.

Notes from Plenary Sessions

Plenary # 1

  • Provide a central location for documentation.
  • Make it easier to assemble the data sets needed for CIDD and Jazz.
  • Make legacy data sets available on-line.
  • Try to make it easy (a low bar) for new users to join the collaboration.
  • Student input: provide good centralized documentation and an FAQ showing how people have previously solved problems. Try to make things easy to use. Have docs on a wiki for basic operations. Support IDL. Make code easy to install. Have a catalog of tools. Making things easier helps students with time management. Improve editing.
  • To students: if you succeed in an LROSE task, please document your experience so that others can learn from it.
  • Good visualization tools are essential.
  • Try to get OU involved.
  • Keep the wind profiler community involved.
  • Training - keep 'software carpentry' community in mind, follow this model.
  • Give I/O and storage serious consideration. Consider storing data in the cloud.
  • Need multi-Doppler synthesis.
  • Support for airborne platforms.
  • Minimize a user's time on QC, maximize research.
  • Put docs on the wiki in GitHub.
  • Document a procedure for submitting bug fixes - including the testing requirements etc.
  • If users submit a bug, they should also provide data and config files so that the LROSE team can reproduce the bug.
  • Scope and expectations should be managed. Do less, but do it well. Project management is important.
  • Provide Docker and VM support.
  • Use the wiki and issue tracker in GitHub.
  • Try to let development be somewhat organic. We do not want to prescribe how people will use the system.
  • Make visualization quick-look available in Jazz.
  • Provide QC tools for operational purposes.
  • Choose a few things to do, and do them well. Scope limits are important.
  • Ease of use is important. Some legacy apps no longer compile.
  • (Private conversation: we need to let some legacy software go, we cannot continue to support it all.)
  • Consider making sub-groups of users - e.g. model assimilation, airborne.
  • Important for LROSE to embrace good data exchange.
  • Ask the following: at the end of 4 years, how will things be different? How will our work make things better?
  • Provide format converter in Docker container.
  • Perform continuous integration testing.

Plenary # 2

  • Assemble white paper.
  • Make sure the LROSE users list is up to date.
  • Consider short courses to spread the word. (This will need to wait until we have some good examples to present.)
  • Opportunities: AMS radar conference, Chicago - submit abstract on the workshop. There is a 'radar software' line item in the conference.
  • Building identity is important - Make Solo Great Again. Branding.
  • Consider Facebook and Twitter.
  • Revisit the Open Radar Wiki.
  • Keep barrier to entry low, encourage student users to contribute to the forum. What are the barriers?
  • Encourage people to contribute earlier rather than later - smaller changes more often.
  • Build a sense of community.
  • Before next workshop, perform outreach to the modeling community. EOL has an ongoing collaboration with RAL and MMM on data assimilation from radar. Develop uncertainty metrics for radar fields - this is ongoing.

Goals resulting from workshop

  • Primary goal - enable science.
  • Secondary goal - grow number of users and contributors.

List of attendees

Name                     Organization
Angela Rowe              University of Washington
Anthony Didlake          Penn State University
Arthur Eiserloh          San Jose State University
Brad Schoenrock          NCAR/EOL
Brenda Dolan             Colorado State University
Brianna Lund             University of Alabama in Huntsville
Bruno Melli              Colorado State University
Corene Matyas            University of Florida
Courtney Laughlin        Center for Severe Weather Research
Dan Stechman             University of Illinois
David Kingsmill          University of Colorado
David Plummer            University of Wyoming
David Yates              NCAR/RAL
Eleanor Delap            Colorado State University
Erik Johnson             NCAR/EOL
Frank Hage               Private consultant
Frank Marks              NOAA/AOML/HRD
Frederick Iat Hin-Tam    National Taiwan University
Gary Cunning             NCAR/RAL
Hannah Barnes            Pacific Northwest National Lab
Haonan Chen              Colorado State University
Ivan Arias Hernandez     Colorado State University
James Marquis            CSWR/UC Boulder
Jennifer DeHart          University of Washington
Jennifer L Davison       Lower Atmosphere Research Group
Jim Wilson               NCAR/EOL
Jingyin Tang             University of Florida
John Gamache             NOAA/AOML/HRD
Jon Martinez             Colorado State University
Josh Aikins              University of Colorado/NOAA PSD
Josh Carnes              NCAR/EOL
Karan Vahi               USC Information Sciences Inst. Pega
Karen Kosiba             Center for Severe Weather Research
Kristen Rasmussen        Colorado State University
Larry Oolman             University of Wyoming
Michael Bell             Colorado State University
Mike Dixon               NCAR/EOL
Nancy Rehak              Global Weather Corp
Naufal Razin             Colorado State University
Nick Guy                 The Climate Corporation
Paul Robinson            Center for Severe Weather Research
Peter Dodge              NOAA/AOML/HRD
Roelof Burger            North West University South Africa
Ryan Gooch               Colorado State University
Samuel Haimov            University of Wyoming
Scott Collis             Argonne National Laboratory
Scott Pearse             NCAR/CISL
Sounak Biswas            Colorado State University
Stacy Brodzik            University of Washington
Stephen Herbener         Colorado State University
Tammy Weckwerth          NCAR/EOL
Ting Yu Cha              Colorado State University
Trevor White             Center for Severe Weather Research
Ulrike Romatschke        NCAR/EOL
Wen-Chau Lee             NCAR/EOL