LROSE Workshop # 1 - 2017


First LROSE Project Workshop, NCAR, April 2017

The initial (kick-off) workshop for the LROSE project was held at the NCAR Foothills Laboratory, in Boulder, 2017/04/11 - 2017/04/12.

The aim of this workshop was to discuss with the user community ideas on how best to proceed with the project, including setting goals and priorities.

Agenda day 1 - 2017/04/11

Time Agenda item
08:00 Registration / Coffee
08:30 Welcome address and housekeeping (Michael Bell)
08:40 LROSE Overview and Workshop Goals (Michael Bell)
09:00 Current state of the software (Mike Dixon)
09:30 The Python ARM Radar Toolkit: a community-based architecture for interacting with weather radar data (Scott Collis)
10:00 Coffee break
10:15 WMO and international collaboration (Daniel Michelson)
10:30 Compute pipelines using Pegasus Workflows - an introduction (Karan Vahi)
11:00 Breakout session # 1
12:00 Breakout sessions report back
12:20 Lunch - NCAR Cafeteria
13:30 Defining the spatial properties of precipitation features using data from the WSR-88D network (Corene Matyas)
14:00 Breakout session # 2
15:00 Coffee break
15:15 Breakout sessions report back, and plenary discussion
16:45 Day 1 adjourn

Agenda day 2 - 2017/04/12

Time Agenda item
07:30 Breakfast provided
08:30 Private industry collaboration and Artview (Nick Guy)
09:00 Breakout session # 3
10:00 Breakout sessions report back
10:30 Coffee break
10:50 Plenary session and wrap-up
12:20 Adjourn

Tentative Breakout Session Discussion Topics

Breakout # 1

  1. Working together: LROSE, Py-ART, BALTRAD
  2. Defining workflows: common tasks for common goals and reproducibility
  3. TBD

Breakout # 2

  1. LROSE internals and design principles
  2. Science priorities: meeting the needs of scientists across disciplines
  3. TBD

Breakout # 3

  1. Building community: mechanisms for including externally-maintained code
  2. Big ideas: leveraging LROSE for high-risk, high-reward science
  3. TBD

Presentation abstracts

Scott Collis
Argonne National Laboratory
Title: The Python ARM Radar Toolkit: A community-based architecture for interacting with weather radar data

Py-ART is a midsize community Python toolkit (100+ users, 19 contributors) for interacting with data produced by meteorological radars. The original aim was to facilitate the dissemination of algorithm research funded by the Department of Energy's ARM program. Since its public release approximately four years ago, Py-ART has grown in use through careful package management while maintaining a narrow, maintainable scope. This presentation will outline the philosophy of the package, various techniques to keep the project tractable, and a new five-year roadmap for the future of the project.

Karan Vahi
USC Information Sciences Institute - Pegasus Team
Title: Compute Pipelines using Pegasus Workflows: An Introduction

Workflows are a key technology for enabling complex scientific applications. They capture the interdependencies between processing steps in data analysis and simulation pipelines, as well as the mechanisms to execute those steps reliably and efficiently in a distributed computing environment. They also enable scientists to capture complex processes to promote sharing and reuse, and provide provenance information necessary for the verification of scientific results and scientific reproducibility. The talk will give an introductory overview of the Pegasus Workflow Management System (Pegasus WMS, http://pegasus.isi.edu). Pegasus allows users to design workflows at a high level of abstraction, independent of the resources available to execute them and of the location of data and executables. It compiles these abstract workflows to executable workflows that can be deployed onto distributed resources such as local campus clusters, computational clouds, and grids such as XSEDE and Open Science Grid. During the compilation process, Pegasus does data discovery, whereby it determines the locations of input data files and executables. Data transfer tasks are added to the executable workflow; these are responsible for staging the input files in to the cluster and the generated output files back to a user-specified location. In addition to the data transfer tasks, data cleanup tasks (which remove data that is no longer required) and data registration tasks are also added. Pegasus also captures all the provenance of the pipeline lifecycle from the planning stage, through execution, to the final output data, helping scientists to accurately measure the performance of their pipelines and reconstruct the history of data products. Pegasus provides both command line tools and a web dashboard for debugging and monitoring that allow users to easily detect and debug failures in their pipelines.
Pegasus has been used in a number of scientific domains including astronomy, bioinformatics, earthquake science, gravitational wave physics, ocean science, limnology, and others. Pegasus workflows are also used for automatic quality control analysis of phenotypic data submissions to NRGR, a large NIH-funded repository.
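
The idea of compiling an abstract workflow into an executable one can be illustrated with a minimal sketch. This does not use the Pegasus API; it relies only on the Python standard library (graphlib, Python 3.9 or later). The task names, the plan function, and the stage_in / stage_out steps are hypothetical and exist only to show how dependencies determine execution order and how data transfer steps might be added during planning.

```python
from graphlib import TopologicalSorter

# Hypothetical abstract workflow: each task maps to the set of tasks it depends on.
# It says nothing about where the data lives or where the tasks will run.
abstract_workflow = {
    "convert_radar_files": set(),
    "quality_control": {"convert_radar_files"},
    "grid_data": {"quality_control"},
    "make_plots": {"grid_data"},
}


def plan(workflow):
    """Return an execution order with illustrative data transfer tasks added.

    A stage_in task is placed before every task that has no dependencies,
    and a stage_out task is placed after every task that nothing depends on.
    """
    executable = {task: set(deps) for task, deps in workflow.items()}

    # Tasks with no dependencies need their input data staged in first.
    roots = [task for task, deps in workflow.items() if not deps]
    # Tasks that no other task depends on produce the final outputs.
    depended_on = set().union(*workflow.values())
    leaves = [task for task in workflow if task not in depended_on]

    executable["stage_in"] = set()
    for task in roots:
        executable[task].add("stage_in")
    executable["stage_out"] = set(leaves)

    # A topological sort yields an order that respects every dependency.
    return list(TopologicalSorter(executable).static_order())


order = plan(abstract_workflow)
print(order)
```

A real workflow system would additionally choose execution sites, run independent tasks in parallel, and record provenance; the sketch only covers the ordering step.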

Corene Matyas and Jingyin Tang
University of Florida
Title: Defining the spatial properties of precipitation features using data from the WSR-88D network

Geographers specialize in the analysis of spatial patterns. To examine rainfall patterns, climatologists may rely on data interpolated from rain gauges. However, these data do not permit the exploration of features such as convective cells inside the rainband of a tropical cyclone. These features can be resolved through an analysis of the high spatial and temporal resolution data produced by the WSR-88D network. Although their capabilities for analyzing temporal data are somewhat limited, Geographical Information Systems (GIS) are increasingly used by non-geographers, including meteorologists. This suggests that collaboration between geographers specializing in geospatial techniques and those pursuing research in climatology could develop new GIS-based methods for the spatial analysis of radar data that facilitate climate-scale research of precipitation features. This presentation features techniques our research group has developed to quantify the spatial patterns of radar reflectivity values through the calculation of shape metrics for the rain fields of landfalling tropical cyclones.

Notes from Breakout Sessions

Breakout #1 - LROSE, Py-ART and BALTRAD working together as a community.

  • Common file formats are a necessary 'glue' for communication between packages.
  • We need a validator app for each format, so users can check if they are in compliance.
  • Some duplication of effort can be good, to test out different approaches to a problem.
  • A common scripting layer (e.g. Pegasus), on top of each system, can provide a uniform approach to running apps in each system.
  • LROSE needs a data discovery / cataloging component.

Breakout #2: Tech Talk

  • Q: Will LROSE work for operational as well as archive purposes? A: Yes.
  • Documentation: recommendation to use doc-strings in code.
  • Module size: small enough to be manageable, large enough to keep the module count reasonable.
  • Request: expose LROSE apps as Python modules.
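
Two of the points above, doc-strings in code and exposing command-line apps to Python, can be sketched together. This is an illustration under stated assumptions, not an LROSE interface: the run_app function is hypothetical, and the example call runs the Python interpreter itself as a stand-in for a real app so that the snippet works on any machine.

```python
import subprocess
import sys


def run_app(command):
    """Run a command-line application and return its standard output.

    Args:
        command: list of str, the program name followed by its arguments.

    Returns:
        The text the application wrote to standard output.

    Raises:
        subprocess.CalledProcessError: if the application exits with a
            non-zero status.
    """
    result = subprocess.run(
        command,
        capture_output=True,  # collect stdout and stderr instead of printing
        text=True,            # decode bytes to str
        check=True,           # raise if the exit status is non-zero
    )
    return result.stdout


# Stand-in for a real app: ask the Python interpreter to print a message.
output = run_app([sys.executable, "-c", "print('ok')"])
print(output.strip())

# The doc-string is available to users at run time via help(run_app).
```

A thin wrapper like this keeps the app itself unchanged while letting scripts and notebooks call it and handle failures as Python exceptions.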

Breakout #3: How to import externally-developed code.

  • For the most part, external modules should be apps, not part of the core library.
  • Use git workflows to manage external work.
  • Integrate and release frequent small changes, rather than waiting for larger modifications.
  • Provide a RoadMap to show our plan of where we want LROSE to go.
  • Provide external developer guidelines early on - don't wait.
  • Good testing of external modules is essential. Nothing accepted without adequate testing.
  • Good documentation is essential. Nothing accepted without good docs.
  • Provide 'Gatekeeper' function - an SE who vets the code before accepting.

Notes from Plenary Sessions

Plenary # 1

  • Provide a central location for documentation.
  • Make it easier to assemble the data sets needed for CIDD and Jazz.
  • Make legacy data sets available on-line.
  • Try to make it easy (a low bar) for new users to join the collaboration.
  • Student input: provide good centralized documentation and an FAQ covering how people have previously solved problems. Try to make things easy to use. Have docs on a wiki for basic operations. Support IDL. Make code easy to install. Have a catalog of tools. Making things easier helps students with time management. Improve editing.
  • To students: if you succeed in an LROSE task, please document your experience so that others can learn from it.
  • Good visualization tools are essential.
  • Try to get OU involved.
  • Keep the wind profiler community involved.
  • Training - keep 'software carpentry' community in mind, follow this model.
  • Give I/O and storage serious consideration. Consider storing data in the cloud.
  • Need multi-Doppler synthesis.
  • Support for airborne platforms.
  • Minimize a user's time on QC, maximize research.
  • Put docs on the wiki in GitHub.
  • Document a procedure for submitting bug fixes - including the testing requirements etc.
  • If users submit a bug, they should also provide data and config files so that the LROSE team can reproduce the bug.
  • Scope and expectations should be managed. Do less, but do it well. Project management is important.
  • Provide Docker and VM support.
  • Use the wiki and issue tracker in GitHub.
  • Try to let development be somewhat organic. We do not want to prescribe how people will use the system.
  • Make visualization quick-look available in Jazz.
  • Provide QC tools for operational purposes.
  • Choose a few things to do, and do them well. Scope limits are important.
  • Ease of use is important. Some legacy apps no longer compile.
  • (Private conversation: we need to let some legacy software go, we cannot continue to support it all.)
  • Consider making sub-groups of users - e.g. model assimilation, airborne.
  • Important for LROSE to embrace good data exchange.
  • Ask the following: at the end of 4 years, how will things be different? How will our work make things better?
  • Provide format converter in Docker container.
  • Perform continuous integration testing.

Plenary # 2

  • Assemble white paper.
  • Make sure the LROSE users list is up to date.
  • Consider short courses to spread the word (this will need to wait until we have some good examples to present).
  • Opportunities: AMS radar conference, Chicago - submit abstract on the workshop. There is a 'radar software' line item in the conference.
  • Building identity is important - Make Solo Great Again. Branding.
  • Consider Facebook and Twitter.
  • Revisit the Open Radar Wiki.
  • Keep barrier to entry low, encourage student users to contribute to the forum. What are the barriers?
  • Encourage people to contribute earlier rather than later - smaller changes more often.
  • Build a sense of community.
  • Before next workshop, perform outreach to the modeling community. EOL has an ongoing collaboration with RAL and MMM on data assimilation from radar. Develop uncertainty metrics for radar fields - this is ongoing.

Goals resulting from workshop

  • Primary goal - enable science.
  • Secondary goal - grow number of users and contributors.

List of attendees

Name                     Email                                Organization
Angela Rowe              akrowe@atmos.uw.edu                  University of Washington
Anthony Didlake          didlake@psu.edu                      Penn State University
Arthur Eiserloh          arthur.eiserloh@sjsu.edu             San Jose State University
Brad Schoenrock          brads@ucar.edu                       NCAR/EOL
Brenda Dolan             bdolan@atmos.colostate.edu           Colorado State University
Brianna Lund             bl0027@uah.edu                       University of Alabama in Huntsville
Bruno Melli              bpmelli@colostate.edu                Colorado State University
Corene Matyas            matyas@ufl.edu                       University of Florida
Courtney Laughlin        claughlin@cswr.org                   Center for Severe Weather Research
Dan Stechman             stechma2@illinois.edu                University of Illinois
David Kingsmill          david.kingsmill@colorado.edu         University of Colorado
David Plummer            dplumme1@uwyo.edu                    University of Wyoming
David Yates              yates@ucar.edu                       NCAR/RAL
Eleanor Delap            eleanor.delap@colostate.edu          Colorado State University
Erik Johnson             ej@ucar.edu                          NCAR/EOL
Frank Hage               fwhage@gmail.com                     Private consultant
Frank Marks              frank.marks@noaa.gov                 NOAA/AOML/HRD
Frederick Iat Hin-Tam    ft21894@gmail.com                    National Taiwan University
Gary Cunning             cunning@ucar.edu                     NCAR/RAL
Hannah Barnes            hannah.barnes@pnnl.gov               Pacific Northwest National Lab
Haonan Chen              haonan.chen@colostate.edu            Colorado State University
Ivan Arias Hernandez     idariash@colostate.edu               Colorado State University
James Marquis            jmarquis@cswr.org                    CSWR/UC Boulder
Jennifer DeHart          jcdehart@uw.edu                      University of Washington
Jennifer L Davison       jldavison@me.com                     Lower Atmosphere Research Group
Jim Wilson               jwilson@ucar.edu                     NCAR/EOL
Jingyin Tang             jtang8756@ufl.edu                    University of Florida
John Gamache             john.gamache@noaa.gov                NOAA/AOML/HRD
Jon Martinez             jon.martinez@colostate.edu           Colorado State University
Josh Aikins              joshua.aikins@colorado.edu           University of Colorado/NOAA PSD
Josh Carnes              jcarnes@ucar.edu                     NCAR/EOL
Karan Vahi               vahi@isi.edu                         USC Information Sciences Inst. Pega
Karen Kosiba             kakosiba@cswr.org                    Center for Severe Weather Research
Kristen Rasmussen        kristenr@rams.colostate.edu          Colorado State University
Larry Oolman             ldoolman@uwyo.edu                    University of Wyoming
Michael Bell             mmbell@colostate.edu                 Colorado State University
Mike Dixon               dixon@ucar.edu                       NCAR/EOL
Nancy Rehak              nrehak@globalweathercorp.com         Global Weather Corp
Naufal Razin             naufal@colostate.edu                 Colorado State University
Nick Guy                 nick.guy@climate.com                 The Climate Corporation
Paul Robinson            robinsonp@cswr.org                   Center for Severe Weather Research
Peter Dodge              peter.dodge@noaa.gov                 NOAA/AOML/HRD
Roelof Burger            roelof.burger@nwu.ac.za              North West University South Africa
Ryan Gooch               s.ryan.gooch@gmail.com               Colorado State University
Samuel Haimov            haimov@uwyo.edu                      University of Wyoming
Scott Collis             scollis@anl.gov                      Argonne National Laboratory
Scott Pearse             pearse@ucar.edu                      NCAR/CISL
Sounak Biswas            sounak.biswas@colostate.edu          Colorado State University
Stacy Brodzik            brodzik@uw.edu                       University of Washington
Stephen Herbener         stephen.herbener@colostate.edu       Colorado State University
Tammy Weckwerth          tammy@ucar.edu                       NCAR/EOL
Ting Yu Cha              tingyu@rams.colostate.edu            Colorado State University
Trevor White             twhite@cswr.org                      Center for Severe Weather Research
Ulrike Romatschke        romatsch@ucar.edu                    NCAR/EOL
Wen-Chau Lee             wenchau@ucar.edu                     NCAR/EOL