First LROSE Project Workshop, NCAR, April 2017
The initial (kick-off) workshop for the LROSE project was held at the NCAR Foothills Laboratory in Boulder on 2017/04/11 - 2017/04/12.
The aim of this workshop was to discuss with the user community how best to proceed with the project, including setting goals and priorities.
Agenda day 1 - 2017/04/11
|08:00||Registration / Coffee|
|08:30||Welcome address and housekeeping (Michael Bell)|
|08:40||LROSE Overview and Workshop Goals (Michael Bell)|
|09:00||Current state of the software (Mike Dixon)|
|09:30||The Python ARM Radar Toolkit: a community-based architecture for interacting with weather radar data (Scott Collis)|
|10:15||WMO and international collaboration (Daniel Michelson)|
|10:30||Compute pipelines using Pegasus Workflows - an introduction (Karan Vahi)|
|11:00||Breakout session # 1|
|12:00||Breakout sessions report back|
|12:20||Lunch - NCAR Cafeteria|
|13:30||Defining the spatial properties of precipitation features using data from the WSR-88D network (Corene Matyas)|
|14:00||Breakout session # 2|
|15:15||Breakout sessions report back, and plenary discussion|
|16:45||Day 1 adjourn|
Agenda day 2 - 2017/04/12
|08:30||Private industry collaboration and Artview (Nick Guy)|
|09:00||Breakout session # 3|
|10:00||Breakout sessions report back|
|10:50||Plenary session and wrap-up|
Tentative Breakout Session Discussion Topics
Breakout # 1
- Working together: LROSE, Py-ART, BALTRAD
- Defining workflows: common tasks for common goals and reproducibility
Breakout # 2
- LROSE internals and design principles
- Science priorities: meeting the needs of scientists across disciplines
Breakout # 3
- Building community: mechanisms for including externally-maintained code
- Big ideas: leveraging LROSE for high-risk, high-reward science
Scott Collis
Argonne National Laboratory
Title: The Python ARM Radar Toolkit: A community-based architecture for interacting with weather radar data
Py-ART is a midsize (100+ users), community-developed (19 contributors) Python toolkit for interacting with data produced by meteorological radars. The original aim was to facilitate the dissemination of algorithm research funded by the Department of Energy's ARM program. Since its public release approximately four years ago, Py-ART has grown in use through careful package management while keeping a narrow, maintainable scope. This presentation will outline the philosophy of the package, various techniques for keeping the project tractable, and a new five-year roadmap for the future of the project.
Karan Vahi
USC Information Sciences Institute - Pegasus Team
Title: Compute Pipelines using Pegasus Workflows: An Introduction
Workflows are a key technology for enabling complex scientific applications. They capture the interdependencies between processing steps in data analysis and simulation pipelines, as well as the mechanisms to execute those steps reliably and efficiently in a distributed computing environment. They also enable scientists to capture complex processes to promote sharing and reuse, and provide the provenance information necessary for the verification of scientific results and scientific reproducibility. The talk will give an introductory overview of the Pegasus Workflow Management System (Pegasus WMS, http://pegasus.isi.edu). Pegasus allows users to design workflows at a high level of abstraction, independent of the resources available to execute them and of the location of data and executables. It compiles these abstract workflows to executable workflows that can be deployed onto distributed resources such as local campus clusters, computational clouds, and grids such as XSEDE and Open Science Grid. During the compilation process, Pegasus does data discovery, whereby it determines the locations of input data files and executables. Data transfer tasks are added to the executable workflow; these stage the input files in to the cluster and stage the generated output files back out to a user-specified location. In addition to the data transfer tasks, data cleanup tasks (which remove data that is no longer required) and data registration tasks are also added. Pegasus also captures all the provenance of the pipeline lifecycle from the planning stage, through execution, to the final output data, helping scientists to accurately measure the performance of their pipelines and reconstruct the history of data products. Pegasus provides both command line tools and a web dashboard for debugging and monitoring that allow users to easily detect and debug failures in their pipelines.
Pegasus has been used in a number of scientific domains including astronomy, bioinformatics, earthquake science, gravitational wave physics, ocean science, limnology, and others. Pegasus workflows are also used for automatic quality control analysis of phenotypic data submissions to NRGR, a large NIH-funded repository.
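The planning step described in this abstract can be illustrated with a minimal sketch. It does not use the Pegasus API: it is a plain-Python stand-in, assuming Python 3.9+ for the standard-library graphlib module. The job names, the file names, and the plan function are hypothetical, chosen only for illustration. The sketch derives job dependencies from the files each job reads and writes, adds stage-in tasks for inputs that no job produces and stage-out tasks for final products, and then orders all tasks so that each runs after the tasks it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical abstract workflow: each job lists the files it reads and writes.
# Nothing here says where the files live or where the jobs will run.
ABSTRACT_WORKFLOW = {
    "convert": {"inputs": ["raw_volume.dat"], "outputs": ["volume.nc"]},
    "quality_control": {"inputs": ["volume.nc"], "outputs": ["volume_qc.nc"]},
    "grid": {"inputs": ["volume_qc.nc"], "outputs": ["grid.nc"]},
    "plot": {"inputs": ["grid.nc"], "outputs": ["grid.png"]},
}


def plan(workflow):
    """Return an ordered task list with stage-in and stage-out tasks added."""
    # Map each file to the job that produces it.
    producer = {f: job for job, spec in workflow.items() for f in spec["outputs"]}
    # Collect every file that some job reads.
    consumed = {f for spec in workflow.values() for f in spec["inputs"]}

    # Dependency graph: task name -> set of tasks that must finish first.
    graph = {}
    for job, spec in workflow.items():
        deps = set()
        for f in spec["inputs"]:
            if f in producer:
                # Another job writes this file, so depend on that job.
                deps.add(producer[f])
            else:
                # No job writes this file, so it must be staged in first.
                stage_in = f"stage_in:{f}"
                graph.setdefault(stage_in, set())
                deps.add(stage_in)
        graph[job] = deps

    # Files that no job reads are final products and are staged out.
    for f, job in producer.items():
        if f not in consumed:
            graph[f"stage_out:{f}"] = {job}

    return list(TopologicalSorter(graph).static_order())


ORDER = plan(ABSTRACT_WORKFLOW)
print(ORDER)
```

Because this example workflow is a simple chain, the resulting order is unique: the stage-in task, the four jobs in sequence, then the stage-out task. A real workflow system also handles site selection, cleanup, registration, retries, and provenance, none of which this sketch attempts.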
Corene Matyas and Jingyin Tang
University of Florida
Title: Defining the spatial properties of precipitation features using data from the WSR-88D network
Geographers specialize in the analysis of spatial patterns. To examine rainfall patterns, climatologists may rely on data interpolated from rain gauges. However, these data do not permit the exploration of features such as convective cells inside the rainband of a tropical cyclone. These features can be resolved through an analysis of the high spatial and temporal resolution data produced by the WSR-88D network. Although the capabilities of Geographical Information Systems (GIS) to analyze temporal data are somewhat limited, their use by non-geographers, including meteorologists, is growing. This suggests that collaboration between geographers specializing in geospatial techniques and those pursuing research in climatology could develop new GIS-based methods for the spatial analysis of radar data that facilitate climate-scale research on precipitation features. This presentation features techniques our research group has developed to quantify the spatial patterns of radar reflectivity values through the calculation of shape metrics for the rain fields of landfalling tropical cyclones.
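As a rough illustration of what a shape metric for a rain field can look like, the sketch below computes area, perimeter, and a compactness ratio from a gridded reflectivity field. The abstract does not say which metrics the group uses, so the choice of metrics, the 20 dBZ threshold, the 1 km cell size, and the function name shape_metrics are all assumptions made for this example. For simplicity, every cell at or above the threshold is treated as part of one rain field, with no separation into individual connected features.

```python
import math


def shape_metrics(reflectivity, threshold_dbz=20.0, cell_size_km=1.0):
    """Area, perimeter, and compactness of cells at or above a threshold.

    reflectivity is a 2D list of dBZ values on a regular grid.
    The threshold and cell size defaults are illustrative assumptions.
    """
    rows = len(reflectivity)
    cols = len(reflectivity[0]) if rows else 0

    def wet(r, c):
        # Cells outside the grid count as dry.
        return 0 <= r < rows and 0 <= c < cols and reflectivity[r][c] >= threshold_dbz

    cells = 0
    exposed_edges = 0
    for r in range(rows):
        for c in range(cols):
            if not wet(r, c):
                continue
            cells += 1
            # Each side that borders a dry cell contributes to the perimeter.
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                if not wet(r + dr, c + dc):
                    exposed_edges += 1

    area = cells * cell_size_km ** 2
    perimeter = exposed_edges * cell_size_km
    # Isoperimetric ratio 4*pi*A / P^2: larger for rounder shapes,
    # smaller for elongated or ragged ones.
    compactness = 4 * math.pi * area / perimeter ** 2 if perimeter else 0.0
    return {"area_km2": area, "perimeter_km": perimeter, "compactness": compactness}


# Tiny synthetic field: a 2 x 2 block of 35 dBZ inside a 4 x 4 grid.
GRID = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 35.0, 35.0, 0.0],
    [0.0, 35.0, 35.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
METRICS = shape_metrics(GRID)
print(METRICS)
```

With a cell-edge perimeter, a square block yields a compactness of pi/4 rather than 1, so values from this sketch are best compared between fields on the same grid, not read as absolute roundness.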
Notes from Breakout Sessions
Breakout #1: LROSE, Py-ART and BALTRAD working together as a community.
- Common file formats are a necessary 'glue' for communication between packages.
- We need a validator app for each format, so users can check if they are in compliance.
- Some duplication of effort can be good, to test out different approaches to a problem.
- A common scripting layer (e.g. Pegasus), on top of each system, can provide a uniform approach to running apps in each system.
- LROSE needs a data discovery / cataloging component.
Breakout #2: Tech Talk
- Q: will LROSE work for operational as well as archive purposes? A: Yes.
- Documentation: recommendation to use doc-strings in code.
- Module size: small enough to be manageable, large enough to keep the module count reasonable.
- Request: to expose LROSE apps as Python modules.
Breakout #3: How to import externally-developed code.
- For the most part, external modules should be apps, not part of the core library.
- Use git workflows to manage external work.
- Integrate and release frequent small changes, rather than waiting for larger modifications.
- Provide a RoadMap to show our plan of where we want LROSE to go.
- Provide external developer guidelines early on - don't wait.
- Good testing of external modules is essential. Nothing accepted without adequate testing.
- Good documentation is essential. Nothing accepted without good docs.
- Provide 'Gatekeeper' function - an SE who vets the code before accepting.
Notes from Plenary Sessions
Plenary # 1
- Provide a central location for documentation.
- Make it easier to assemble the data sets needed for CIDD and Jazz.
- Make legacy data sets available on-line.
- Try to make it easy (a low bar) for new users to join the collaboration.
- Student input: provide good centralized documentation, FAQ, how have people previously solved things? Try to make things easy to use. Have docs on a wiki for basic operations. Support IDL. Make code easy to install. Have a catalog of tools. Making things easier helps students with time management. Improve editing.
- To students: if you succeed in an LROSE task, please document your experience so that others can learn from it.
- Good visualization tools are essential.
- Try to get OU involved.
- Keep the wind profiler community involved.
- Training - keep 'software carpentry' community in mind, follow this model.
- Give I/O and storage serious consideration. Consider storing data in the cloud.
- Need multi-Doppler synthesis.
- Support for airborne platforms.
- Minimize a user's time on QC, maximize research.
- Put docs on the wiki in GitHub.
- Document a procedure for submitting bug fixes - including the testing requirements etc.
- If users submit a bug, they should also provide data and config files so that the LROSE team can reproduce the bug.
- Scope and expectations should be managed. Do less, but do it well. Project management is important.
- Provide Docker and VM support.
- Use the wiki and issue tracker in GitHub.
- Try to let development be somewhat organic. We do not want to prescribe how people will use the system.
- Make visualization quick-look available in Jazz.
- Provide QC tools for operational purposes.
- Choose a few things to do, and do them well. Scope limits are important.
- Ease of use is important. Some legacy apps no longer compile.
- (Private conversation: we need to let some legacy software go, we cannot continue to support it all.)
- Consider making sub-groups of users - e.g. model assimilation, airborne.
- Important for LROSE to embrace good data exchange.
- Ask the following: at the end of 4 years, how will things be different? How will our work make things better?
- Provide format converter in Docker container.
- Perform continuous integration testing.
Plenary # 2
- Assemble white paper.
- Make sure the LROSE users list is up to date.
- Consider short courses to spread the word. (This will need to wait until we have some good examples to present.)
- Opportunities: AMS radar conference, Chicago - submit abstract on the workshop. There is a 'radar software' line item in the conference.
- Building identity is important - Make Solo Great Again. Branding.
- Consider Facebook and Twitter.
- Revisit the Open Radar Wiki.
- Keep barrier to entry low, encourage student users to contribute to the forum. What are the barriers?
- Encourage people to contribute earlier rather than later - smaller changes more often.
- Build a sense of community.
- Before next workshop, perform outreach to the modeling community. EOL has an ongoing collaboration with RAL and MMM on data assimilation from radar. Develop uncertainty metrics for radar fields - this is ongoing.
Goals resulting from workshop
- Primary goal - enable science.
- Secondary goal - grow number of users and contributors.
List of attendees
Name Email Organization
Angela Rowe firstname.lastname@example.org University of Washington
Anthony Didlake email@example.com Penn State University
Arthur Eiserloh firstname.lastname@example.org San Jose State University
Brad Schoenrock email@example.com NCAR/EOL
Brenda Dolan firstname.lastname@example.org Colorado State University
Brianna Lund email@example.com University of Alabama in Huntsville
Bruno Melli firstname.lastname@example.org Colorado State University
Corene Matyas email@example.com University of Florida
Courtney Laughlin firstname.lastname@example.org Center for Severe Weather Research
Dan Stechman email@example.com University of Illinois
David Kingsmill firstname.lastname@example.org University of Colorado
David Plummer email@example.com University of Wyoming
David Yates firstname.lastname@example.org NCAR/RAL
Eleanor Delap email@example.com Colorado State University
Erik Johnson firstname.lastname@example.org NCAR/EOL
Frank Hage email@example.com Private consultant
Frank Marks firstname.lastname@example.org NOAA/AOML/HRD
Frederick Iat Hin-Tam email@example.com National Taiwan University
Gary Cunning firstname.lastname@example.org NCAR/RAL
Hannah Barnes email@example.com Pacific Northwest National Lab
Haonan Chen firstname.lastname@example.org Colorado State University
Ivan Arias Hernandez email@example.com Colorado State University
James Marquis firstname.lastname@example.org CSWR/UC Boulder
Jennifer DeHart email@example.com University of Washington
Jennifer L Davison firstname.lastname@example.org Lower Atmosphere Research Group
Jim Wilson email@example.com NCAR/EOL
Jingyin Tang firstname.lastname@example.org University of Florida
John Gamache email@example.com NOAA/AOML/HRD
Jon Martinez firstname.lastname@example.org Colorado State University
Josh Aikins email@example.com University of Colorado/NOAA PSD
Josh Carnes firstname.lastname@example.org NCAR/EOL
Karan Vahi email@example.com USC Information Sciences Inst. Pega
Karen Kosiba firstname.lastname@example.org Center for Severe Weather Research
Kristen Rasmussen email@example.com Colorado State University
Larry Oolman firstname.lastname@example.org University of Wyoming
Michael Bell email@example.com Colorado State University
Mike Dixon firstname.lastname@example.org NCAR/EOL
Nancy Rehak email@example.com Global Weather Corp
Naufal Razin firstname.lastname@example.org Colorado State University
Nick Guy email@example.com The Climate Corporation
Paul Robinson firstname.lastname@example.org Center for Severe Weather Research
Peter Dodge email@example.com NOAA/AOML/HRD
Roelof Burger firstname.lastname@example.org North West University South Africa
Ryan Gooch email@example.com Colorado State University
Samuel Haimov firstname.lastname@example.org University of Wyoming
Scott Collis email@example.com Argonne National Laboratory
Scott Pearse firstname.lastname@example.org NCAR/CISL
Sounak Biswas email@example.com Colorado State University
Stacy Brodzik firstname.lastname@example.org University of Washington
Stephen Herbener email@example.com Colorado State University
Tammy Weckwerth firstname.lastname@example.org NCAR/EOL
Ting Yu Cha email@example.com Colorado State University
Trevor White firstname.lastname@example.org Center for Severe Weather Research
Ulrike Romatschke email@example.com NCAR/EOL
Wen-Chau Lee firstname.lastname@example.org NCAR/EOL