First LROSE Project Workshop, NCAR, April 2017
The initial (kick-off) workshop for the LROSE project was held at the NCAR Foothills Laboratory in Boulder, 2017/04/11 - 2017/04/12.
The aim of this workshop was to discuss with the user community ideas on how best to proceed with the project, including setting goals and priorities.
Agenda day 1 - 2017/04/11
Time | Agenda item |
---|---|
08:00 | Registration / Coffee |
08:30 | Welcome address and housekeeping (Michael Bell) |
08:40 | LROSE Overview and Workshop Goals (Michael Bell) |
09:00 | Current state of the software (Mike Dixon) |
09:30 | The Python ARM Radar Toolkit: a community-based architecture for interacting with weather radar data (Scott Collis) |
10:00 | Coffee break |
10:15 | WMO and international collaboration (Daniel Michelson) |
10:30 | Compute pipelines using Pegasus Workflows - an introduction (Karan Vahi) |
11:00 | Breakout session # 1 |
12:00 | Breakout sessions report back |
12:20 | Lunch - NCAR Cafeteria |
13:30 | Defining the spatial properties of precipitation features using data from the WSR-88D network (Corene Matyas) |
14:00 | Breakout session # 2 |
15:00 | Coffee break |
15:15 | Breakout sessions report back, and plenary discussion |
16:45 | Day 1 adjourn |
Agenda day 2 - 2017/04/12
Time | Agenda item |
---|---|
07:30 | Breakfast provided |
08:30 | Private industry collaboration and Artview (Nick Guy) |
09:00 | Breakout session # 3 |
10:00 | Breakout sessions report back |
10:30 | Coffee break |
10:50 | Plenary session and wrap-up |
12:20 | Adjourn |
Tentative Breakout Session Discussion Topics
Breakout # 1
- Working together: LROSE, Py-ART, BALTRAD
- Defining workflows: common tasks for common goals and reproducibility
- TBD
Breakout # 2
- LROSE internals and design principles
- Science priorities: meeting the needs of scientists across disciplines
- TBD
Breakout # 3
- Building community: mechanisms for including externally-maintained code
- Big ideas: leveraging LROSE for high-risk, high-reward science
- TBD
Presentation abstracts
Scott Collis
Argonne National Laboratory
Title: The Python ARM Radar Toolkit: A community-based architecture for interacting with weather radar data.
Py-ART is a midsize (100+ users) community (19 contributors) Python toolkit for interacting with data produced by meteorological radars. The original aim was to facilitate the dissemination of algorithm research funded by the Department of Energy's ARM program. Since its public release approximately four years ago, careful package management has allowed Py-ART to grow in use while keeping a narrow, maintainable scope. This presentation will outline the philosophy of the package, various techniques for keeping the project tractable, and a new five-year roadmap for the future of the project.
Karan Vahi
USC Information Sciences Institute - Pegasus Team
Title: Compute Pipelines using Pegasus Workflows: An Introduction
Workflows are a key technology for enabling complex scientific applications. They capture the interdependencies between processing steps in data analysis and simulation pipelines, as well as the mechanisms to execute those steps reliably and efficiently in a distributed computing environment. They also enable scientists to capture complex processes to promote sharing and reuse, and provide the provenance information necessary for the verification of scientific results and scientific reproducibility. The talk will give an introductory overview of the Pegasus Workflow Management System (Pegasus WMS, http://pegasus.isi.edu). Pegasus allows users to design workflows at a high level of abstraction, independent of the resources available to execute them and of the location of data and executables. It compiles these abstract workflows into executable workflows that can be deployed onto distributed resources such as local campus clusters, computational clouds, and grids such as XSEDE and Open Science Grid. During the compilation process, Pegasus performs data discovery, whereby it determines the locations of input data files and executables. Data transfer tasks are added to the executable workflow; these are responsible for staging the input files in to the cluster and staging the generated output files back to a user-specified location. In addition to the data transfer tasks, data cleanup tasks (which remove data that is no longer required) and data registration tasks are also added. Pegasus also captures all the provenance of the pipeline lifecycle, from the planning stage, through execution, to the final output data, helping scientists to accurately measure the performance of their pipelines and reconstruct the history of data products. Pegasus provides both command-line tools and a web dashboard for debugging and monitoring, which allow users to easily detect and debug failures in their pipelines.
Pegasus has been used in a number of scientific domains including astronomy, bioinformatics, earthquake science, gravitational wave physics, ocean science, limnology, and others. Pegasus workflows are also used for automatic quality control analysis of phenotypic data submissions to NRGR, a large NIH-funded repository.
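As a rough illustration of the planning step described in this abstract, the sketch below turns an abstract dependency graph into an executable one by adding stage-in, stage-out, and cleanup tasks, then derives a valid execution order. It uses only the Python standard library and is not the Pegasus API; the job names and the `plan` and `execution_order` helpers are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical abstract workflow: each job maps to the set of jobs it depends on.
# The job names are illustrative only; they are not Pegasus or LROSE identifiers.
ABSTRACT_WORKFLOW = {
    "convert_format": set(),
    "quality_control": {"convert_format"},
    "grid_data": {"quality_control"},
}


def plan(abstract):
    """Return an executable graph with auxiliary data tasks added (assumed names)."""
    executable = {job: set(deps) for job, deps in abstract.items()}

    # Root jobs have no dependencies; leaf jobs are not required by any other job.
    roots = [job for job, deps in abstract.items() if not deps]
    depended_on = set().union(*abstract.values())
    leaves = [job for job in abstract if job not in depended_on]

    # Stage input data in before the root jobs run.
    executable["stage_in"] = set()
    for job in roots:
        executable[job].add("stage_in")

    # Stage outputs back once the leaf jobs finish, then clean up scratch data.
    executable["stage_out"] = set(leaves)
    executable["cleanup"] = {"stage_out"}
    return executable


def execution_order(graph):
    """Return one ordering in which every job runs after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for step in execution_order(plan(ABSTRACT_WORKFLOW)):
        print(step)
```

A real workflow system also has to handle resource selection, retries, and provenance capture, none of which this sketch attempts.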
Corene Matyas and Jingyin Tang
University of Florida
Title: Defining the spatial properties of precipitation features using data from the WSR-88D network
Geographers specialize in the analysis of spatial patterns. To examine rainfall patterns, climatologists may rely on data interpolated from rain gauges. However, these data do not permit the exploration of features such as convective cells inside the rainband of a tropical cyclone. These features can be resolved through an analysis of the high spatial and temporal resolution data produced by the WSR-88D network. Although their capabilities for analyzing temporal data are somewhat limited, Geographical Information Systems (GIS) are increasingly used by non-geographers, including meteorologists. This suggests that collaboration between geographers specializing in geospatial techniques and those pursuing research in climatology could yield new GIS-based methods for the spatial analysis of radar data that facilitate climate-scale research on precipitation features. This presentation features techniques our research group has developed to quantify the spatial patterns of radar reflectivity values through the calculation of shape metrics for the rain fields of landfalling tropical cyclones.
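The abstract does not say which shape metrics the group uses. As one possible example, the sketch below computes the area, perimeter, and compactness (4πA/P²) of a rain field on a gridded reflectivity array. The 20 dBZ threshold, the 1 km cell size, and all function names are assumptions made for illustration, not the authors' method.

```python
import math


def rain_field_mask(reflectivity, threshold_dbz=20.0):
    """Mark grid cells at or above an assumed reflectivity threshold."""
    return [[value >= threshold_dbz for value in row] for row in reflectivity]


def area_and_perimeter(mask, cell_size_km=1.0):
    """Count masked cells (area) and their exposed cell edges (perimeter)."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    cells = 0
    edges = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            cells += 1
            # An edge is exposed if the neighbor is off-grid or not in the mask.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if not inside or not mask[nr][nc]:
                    edges += 1
    return cells * cell_size_km ** 2, edges * cell_size_km


def compactness(area, perimeter):
    """Return 4*pi*A/P^2: 1 for a circle, smaller for less compact shapes."""
    if perimeter == 0:
        return 0.0
    return 4.0 * math.pi * area / perimeter ** 2


if __name__ == "__main__":
    # Synthetic 4 x 4 grid (dBZ) with a 2 x 2 block of stronger echoes.
    grid = [
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 30.0, 30.0, 0.0],
        [0.0, 30.0, 30.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
    ]
    area, perimeter = area_and_perimeter(rain_field_mask(grid))
    print(area, perimeter, compactness(area, perimeter))
```

Counting cell edges overestimates the perimeter of curved features, so a GIS-based analysis would more likely work from polygon outlines; the grid version is used here only to keep the example self-contained.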
Notes from Breakout Sessions
Breakout #1 - LROSE, Py-ART and BALTRAD working together as a community.
- Common file formats are a necessary 'glue' for communication between packages.
- We need a validator app for each format, so users can check whether their files comply.
- Some duplication of effort can be good, to test out different approaches to a problem.
- A common scripting layer (e.g. Pegasus), on top of each system, can provide a uniform approach to running apps in each system.
- LROSE needs a data discovery / cataloging component.
Breakout #2: Tech Talk
- Q: Will LROSE work for operational as well as archive purposes? A: Yes.
- Documentation: recommendation to use doc-strings in code.
- Module size: small enough to be manageable, large enough to keep the module count reasonable.
- Request: expose LROSE apps as Python modules.
Breakout #3: How to import externally-developed code.
- For the most part, external modules should be apps, not part of the core library.
- Use git workflows to manage external work.
- Integrate and release frequent small changes, rather than waiting for larger modifications.
- Provide a roadmap to show our plan of where we want LROSE to go.
- Provide external developer guidelines early on - don't wait.
- Good testing of external modules is essential. Nothing accepted without adequate testing.
- Good documentation is essential. Nothing accepted without good docs.
- Provide a 'Gatekeeper' function - an SE who vets the code before it is accepted.
Notes from Plenary Sessions
Plenary # 1
- Provide a central location for documentation.
- Make it easier to assemble the data sets needed for CIDD and Jazz.
- Make legacy data sets available on-line.
- Try to make it easy (a low bar) for new users to join the collaboration.
- Student input: provide good centralized documentation and an FAQ, including how people have previously solved problems. Try to make things easy to use. Have docs on a wiki for basic operations. Support IDL. Make code easy to install. Have a catalog of tools. Making things easier helps students with time management. Improve editing.
- To students: if you succeed in an LROSE task, please document your experience so that others can learn from it.
- Good visualization tools are essential.
- Try to get OU involved.
- Keep the wind profiler community involved.
- Training - keep the 'Software Carpentry' community in mind, and follow this model.
- Give I/O and storage serious consideration. Consider storing data in the cloud.
- Need multi-Doppler synthesis.
- Support for airborne platforms.
- Minimize a user's time on QC, maximize research.
- Put docs on the wiki in GitHub.
- Document a procedure for submitting bug fixes - including the testing requirements etc.
- If users submit a bug, they should also provide data and config files so that the LROSE team can reproduce the bug.
- Scope and expectations should be managed. Do less, but do it well. Project management is important.
- Provide Docker and VM support.
- Use the wiki and issue tracker in GitHub.
- Try to let development be somewhat organic. We do not want to prescribe how people will use the system.
- Make visualization quick-look available in Jazz.
- Provide QC tools for operational purposes.
- Choose a few things to do, and do them well. Scope limits are important.
- Ease of use is important. Some legacy apps no longer compile.
- (Private conversation: we need to let some legacy software go, we cannot continue to support it all.)
- Consider making sub-groups of users - e.g. model assimilation, airborne.
- Important for LROSE to embrace good data exchange.
- Ask the following: at the end of 4 years, how will things be different? How will our work make things better?
- Provide format converter in Docker container.
- Perform continuous integration testing.
Plenary # 2
- Assemble white paper.
- Make sure the LROSE users list is up to date.
- Consider short courses to spread the word (this will need to wait until we have some good examples to present).
- Opportunities: AMS radar conference, Chicago - submit abstract on the workshop. There is a 'radar software' line item in the conference.
- Building identity is important - Make Solo Great Again. Branding.
- Consider Facebook and Twitter.
- Revisit the Open Radar Wiki.
- Keep barrier to entry low, encourage student users to contribute to the forum. What are the barriers?
- Encourage people to contribute earlier rather than later - smaller changes more often.
- Build a sense of community.
- Before next workshop, perform outreach to the modeling community. EOL has an ongoing collaboration with RAL and MMM on data assimilation from radar. Develop uncertainty metrics for radar fields - this is ongoing.
Goals resulting from workshop
- Primary goal - enable science.
- Secondary goal - grow number of users and contributors.
List of attendees
Name | Email | Organization |
---|---|---|
Angela Rowe | akrowe@atmos.uw.edu | University of Washington |
Anthony Didlake | didlake@psu.edu | Penn State University |
Arthur Eiserloh | arthur.eiserloh@sjsu.edu | San Jose State University |
Brad Schoenrock | brads@ucar.edu | NCAR/EOL |
Brenda Dolan | bdolan@atmos.colostate.edu | Colorado State University |
Brianna Lund | bl0027@uah.edu | University of Alabama in Huntsville |
Bruno Melli | bpmelli@colostate.edu | Colorado State University |
Corene Matyas | matyas@ufl.edu | University of Florida |
Courtney Laughlin | claughlin@cswr.org | Center for Severe Weather Research |
Dan Stechman | stechma2@illinois.edu | University of Illinois |
David Kingsmill | david.kingsmill@colorado.edu | University of Colorado |
David Plummer | dplumme1@uwyo.edu | University of Wyoming |
David Yates | yates@ucar.edu | NCAR/RAL |
Eleanor Delap | eleanor.delap@colostate.edu | Colorado State University |
Erik Johnson | ej@ucar.edu | NCAR/EOL |
Frank Hage | fwhage@gmail.com | Private consultant |
Frank Marks | frank.marks@noaa.gov | NOAA/AOML/HRD |
Frederick Iat Hin-Tam | ft21894@gmail.com | National Taiwan University |
Gary Cunning | cunning@ucar.edu | NCAR/RAL |
Hannah Barnes | hannah.barnes@pnnl.gov | Pacific Northwest National Lab |
Haonan Chen | haonan.chen@colostate.edu | Colorado State University |
Ivan Arias Hernandez | idariash@colostate.edu | Colorado State University |
James Marquis | jmarquis@cswr.org | CSWR/UC Boulder |
Jennifer DeHart | jcdehart@uw.edu | University of Washington |
Jennifer L Davison | jldavison@me.com | Lower Atmosphere Research Group |
Jim Wilson | jwilson@ucar.edu | NCAR/EOL |
Jingyin Tang | jtang8756@ufl.edu | University of Florida |
John Gamache | john.gamache@noaa.gov | NOAA/AOML/HRD |
Jon Martinez | jon.martinez@colostate.edu | Colorado State University |
Josh Aikins | joshua.aikins@colorado.edu | University of Colorado/NOAA PSD |
Josh Carnes | jcarnes@ucar.edu | NCAR/EOL |
Karan Vahi | vahi@isi.edu | USC Information Sciences Inst. Pega |
Karen Kosiba | kakosiba@cswr.org | Center for Severe Weather Research |
Kristen Rasmussen | kristenr@rams.colostate.edu | Colorado State University |
Larry Oolman | ldoolman@uwyo.edu | University of Wyoming |
Michael Bell | mmbell@colostate.edu | Colorado State University |
Mike Dixon | dixon@ucar.edu | NCAR/EOL |
Nancy Rehak | nrehak@globalweathercorp.com | Global Weather Corp |
Naufal Razin | naufal@colostate.edu | Colorado State University |
Nick Guy | nick.guy@climate.com | The Climate Corporation |
Paul Robinson | robinsonp@cswr.org | Center for Severe Weather Research |
Peter Dodge | peter.dodge@noaa.gov | NOAA/AOML/HRD |
Roelof Burger | roelof.burger@nwu.ac.za | North West University South Africa |
Ryan Gooch | s.ryan.gooch@gmail.com | Colorado State University |
Samuel Haimov | haimov@uwyo.edu | University of Wyoming |
Scott Collis | scollis@anl.gov | Argonne National Laboratory |
Scott Pearse | pearse@ucar.edu | NCAR/CISL |
Sounak Biswas | sounak.biswas@colostate.edu | Colorado State University |
Stacy Brodzik | brodzik@uw.edu | University of Washington |
Stephen Herbener | stephen.herbener@colostate.edu | Colorado State University |
Tammy Weckwerth | tammy@ucar.edu | NCAR/EOL |
Ting Yu Cha | tingyu@rams.colostate.edu | Colorado State University |
Trevor White | twhite@cswr.org | Center for Severe Weather Research |
Ulrike Romatschke | romatsch@ucar.edu | NCAR/EOL |
Wen-Chau Lee | wenchau@ucar.edu | NCAR/EOL |