US JGOFS DATA - PROGRESS & WORK PENDING

Below is the start of an outline of where we are in terms of data management, and what we need to do over the remaining 2-3 years of US JGOFS.

This text file is available for editing:
Various comments to original version are posted in red
last update: 01 July 02

I. PROGRESS TO DATE:
   A. US JGOFS data - J-LAS (mostly non-SMP)
   B. SMP data
II. IDENTIFY ANY PROBLEMS WITH FUTURE WORK OR WORK STILL UNDONE
III. CONSTRUCT REALISTIC TIME TABLE FOR REMAINING TASKS (WITH ASSIGNED PERSONNEL?)
IV. IDENTIFY SHOWCASE ITEMS FOR INTERNATIONAL JGOFS MEETING (May 2003)

------------------------------------------------------------------------

J-LAS PRODUCTS PROMISED OR DESIRED

As a reminder of what was promised, the following text on Data Management is excerpted from the NSF Proposal "Coordination and Management of the US JGOFS Synthesis and Modeling Project: The Second and Final Phase":

The SMP-LAS will be configured to support non-gridded data with the same functionality as currently available for gridded products (e.g. plotting, sub-sampling). The new SMP-LAS will also have improved capabilities for merging and comparing multiple data sets whether gridded or non-gridded. The end product will be a web-based system allowing scientists to draw from the wide array of information being generated as a part of SMP.

The specific new features will include:

* metadata searching capabilities to assist the user with locating the desired parameters from a large and complex collection
* user interface changes to support the identification and selection of data subsets within in-situ collections
* internal data structure enhancements to support cruise tracks, time series, point observations, etc.
  and arbitrary subsetting of them;
* back-end and scripting enhancements to support graphics and file generation for the new data types;
* extension of its data fusion capabilities to include in-situ to gridded data comparison (essentially by interpolating within gridded fields at the space-time points at which observations exist).

The SMP data management work will nicely complement the DMO efforts to build merged data objects and searching tools for the process data. The scientific value of JGOFS data sets and SMP derived synthesis products will extend for years beyond the completion of the program. An improved, user-friendly SMP-LAS interface will help make these results widely accessible to scientific and public audiences beyond just the current JGOFS investigators. The SMP management team will work with the DMO to package platform-independent versions of the SMP-LAS system with the final archived JGOFS data on permanent media and/or web-based systems.

PROGRESS

* The beta version of the J-LAS system to display US JGOFS Process Study data has received very positive feedback
* Arabian Sea and Southern Ocean results are currently posted
* Equatorial Pacific and NABE data are currently being processed.
* A CD-ROM (or two) of Process Study data, complete with platform-independent software to view it, will be cut in Dec 2002.

Cyndy has also mentioned that Steve Hankin et al. hope to address the following key JGOFS issues in J-LAS by July:

* support in browsers where Java/JavaScript is not stable
  [CC: I don't know what this line means. I know there are some platform/browser version issues with LAS, but I don't remember hearing that PMEL would be fixing them]
* ability to impose metadata constraints (e.g.
  select data by cruise_ID)
* ability to present a nicer interface for the selection of variables (thematically grouped)
* ability to select groups of variables for download as a unit
* ability to select two (or more) variables to make property-property plots
* and ultimately (not yet): gridded/in-situ comparison

------------------------------------------------------------------------

SMP DATA & SMP LAS PRODUCTS PROMISED OR DESIRED

... also from the NSF proposal ...

Desired Features for an SMP data management system:

* Centralized access to any SMP data set (but with possibly decentralized distribution)
* Subsetting capability
* Ability to reformat data to one of several different formats
* Graphical display of data

PROGRESS

* Centralized access is provided via the main SMP Data Page. Some of these data sets are displayable via the LAS.
* SMP-LAS (for gridded data, satisfies the features listed above)
  o some of the gridded data sets submitted are now available via the SMP-LAS
  o some still have problems and require PMEL expertise: e.g. McGillicuddy's curvilinear grid
  o Model data (e.g., Christian/McClain, Hofmann, Arrigo) have not been tackled to any great extent yet, but will be soon, now that many LAS problems have been resolved
* Data Inventory - data sets are slowly being submitted and posted on the SMP data system as they come in.

------------------------------------------------------------------------

LONG TERM ISSUES

Post-US JGOFS System Maintenance:

We should start thinking about plans to keep the US JGOFS data online. The US JGOFS DMO at Woods Hole is the obvious site to maintain this. What are the mechanisms for obtaining funding to keep the system alive for x number of years? How many years would seem appropriate?

Archiving

* funding for the U.S. JGOFS SMP does not include long-term archival costs, or at least the cost of maintaining data on a server, beyond the lifetime of the SMP.
  NODC will handle long-term archival, but will any program step up and continue to serve the data along with data management and analysis tools?
* long-term archiving requires different standards than those for short-term archiving. Currently, most programs provide CD-ROMs of their data, with software to access the data. For example, there was an explicit call at the 2001 summer SMP Principal Investigator meeting to provide the US JGOFS field-collected data in the same format and with the same access software as is now provided via the web. In terms of SMP data, this is possible, but may not be practical.
  [CC: If I had been around when this was mentioned I would have let people know that this is impossible. The LAS software cannot be used for CD-ROM data access, and even the jgofs software would require a lot of programming effort to make it available for multi-platform installation. Either of these software packages requires much more time and effort to install and configure than the average user is willing or able to invest. We are trying to make a new Java application that could be easily installed to provide multi-platform support for CD-ROM data access.]
  Because many of the SMP data will be submitted in a wide variety of formats (ASCII, NetCDF, HDF?, Excel Spreadsheets, binary?), it may be impractical to provide multiple access routines to the data. Long-term storage may better serve the user if the data are saved in some common format that is likely to remain easily readable far into the future (e.g. ASCII).

Referencing:

Will there be a publication to accompany the US JGOFS CD-ROMs?
[CC: What do you mean by this? Are you thinking about a publication to allow authors to reference the CD-ROM via the printed publication? The answer is that no one at the DMO has considered this till today. Dave Glover said he could imagine writing something with input from the rest of the team, but ... this is not currently on the to do list.]

------------------------------------------------------------------------

TIME TABLE

2002
  ongoing  SMP: obtain and post data sets from completed SMP Projects (inventory available and updated on the web)
  08/02    [CC: EqPac bottle and CTD]
  10/02    [CC: NABE bottle and CTD]
  12/02    US JGOFS: CD-ROMs of U.S. JGOFS process study data cut
  ????     US JGOFS: non-beta release of J-LAS system

2003
  ongoing  SMP: obtain and post data sets from completed SMP Projects
  ????     US JGOFS & SMP: merging of J-LAS and SMP-LAS
  05/03    International JGOFS Meeting

2004
  ongoing  SMP: obtain and post data sets from completed SMP Projects
  12/04    US JGOFS & SMP: Program completed

2005?
  ????     SMP: CD-ROMs of U.S. JGOFS SMP Project
  ????     US JGOFS & SMP: Archiving of U.S. JGOFS data at NODC
  05/02    [CC: end date of US JGOFS DMO funding?] [JK: is this correct and already beyond the date? or is it 02/05?]

------------------------------------------------------------------------
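
NOTE ON GRIDDED/IN-SITU COMPARISON

The proposal excerpt under J-LAS PRODUCTS describes the gridded/in-situ comparison as interpolating within gridded fields at the space-time points at which observations exist. The following is only a minimal sketch of that idea, not the LAS implementation. It assumes a regular rectilinear grid stored as field[time][lat][lon] with sorted axes of at least two points each, and simple linear interpolation in all three dimensions. The function names (interp_grid, compare) and the data layout are hypothetical and chosen for illustration.

```python
from bisect import bisect_right


def _bracket(axis, value):
    """Return (lower index, weight of the upper neighbour) on a sorted axis."""
    if not axis[0] <= value <= axis[-1]:
        raise ValueError(f"{value} lies outside the grid axis")
    # Clamp so that value == axis[-1] still falls in the last interval.
    i = min(bisect_right(axis, value) - 1, len(axis) - 2)
    w = (value - axis[i]) / (axis[i + 1] - axis[i])
    return i, w


def interp_grid(times, lats, lons, field, t, lat, lon):
    """Linearly interpolate field[time][lat][lon] at one observation point."""
    it, wt = _bracket(times, t)
    iy, wy = _bracket(lats, lat)
    ix, wx = _bracket(lons, lon)
    total = 0.0
    # Weighted sum over the 8 grid points surrounding the observation.
    for dt, ft in ((0, 1 - wt), (1, wt)):
        for dy, fy in ((0, 1 - wy), (1, wy)):
            for dx, fx in ((0, 1 - wx), (1, wx)):
                total += ft * fy * fx * field[it + dt][iy + dy][ix + dx]
    return total


def compare(times, lats, lons, field, observations):
    """Return (observed, gridded, observed - gridded) for each in-situ point.

    Each observation is a (time, lat, lon, value) tuple.
    """
    rows = []
    for t, lat, lon, value in observations:
        gridded = interp_grid(times, lats, lons, field, t, lat, lon)
        rows.append((value, gridded, value - gridded))
    return rows
```

Observations that fall outside the grid raise a ValueError here rather than being extrapolated. A real system would also need to handle missing values, depth as a fourth axis, and curvilinear grids such as the one mentioned under SMP-LAS progress.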