US JGOFS DATA - PROGRESS & WORK PENDING
Below is the start of an outline of where we stand in terms of data
management, and what we need to do over the remaining 2-3 years of
US JGOFS. This text file is available for editing:
Various comments on the original version are posted in red.
Last update: 01 July 02
I. PROGRESS TO DATE:
A. US JGOFS data - J-LAS (mostly non-SMP)
B. SMP data
II. IDENTIFY ANY PROBLEMS WITH FUTURE WORK OR WORK STILL UNDONE
III. CONSTRUCT REALISTIC TIME TABLE FOR REMAINING TASKS (WITH ASSIGNED
PERSONNEL?)
IV. IDENTIFY SHOWCASE ITEMS FOR INTERNATIONAL JGOFS MEETING (May
2003)
J-LAS
PRODUCTS PROMISED OR DESIRED
As a reminder of what was promised, the following text on data management
is excerpted from the NSF Proposal "Coordination and Management of
the US JGOFS Synthesis and Modeling Project: The Second and Final Phase":
The SMP-LAS will be configured to support non-gridded data with
the same functionality as currently available for gridded products (e.g.
plotting, sub-sampling). The new SMP-LAS will also have improved capabilities
for merging and comparing multiple data sets whether gridded or non-gridded.
The end product will be a web-based system allowing scientists to draw
from the wide array of information being generated as a part of SMP. The
specific new features will include:
- metadata searching capabilities to assist the user with locating
  the desired parameters from a large and complex collection;
- user interface changes to support the identification and selection
  of data subsets within in-situ collections;
- internal data structure enhancements to support cruise tracks, time
  series, point observations, etc. and arbitrary subsetting of them;
- back-end and scripting enhancements to support graphics and file
  generation for the new data types;
- extension of its data fusion capabilities to include in-situ to
  gridded data comparison (essentially by interpolating within gridded fields
  at the space-time points at which observations exist).
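As an illustration only (this is not part of the proposal text, and it is not how LAS implements the feature), the in-situ to gridded comparison described in the last item can be sketched in a few lines of Python. The sketch assumes a regular longitude/latitude grid at a single time and depth, uses bilinear interpolation, and skips observations that fall outside the grid. The function names `bilinear` and `compare` and the sample numbers are hypothetical.

```python
from bisect import bisect_right


def bilinear(lons, lats, field, lon, lat):
    """Interpolate field[j][i] (j = latitude index, i = longitude index)
    at the point (lon, lat). lons and lats must be sorted ascending.
    Returns None when the point lies outside the grid."""
    if not (lons[0] <= lon <= lons[-1] and lats[0] <= lat <= lats[-1]):
        return None
    # Index of the grid cell containing the point (clamped at the upper edge).
    i = min(bisect_right(lons, lon) - 1, len(lons) - 2)
    j = min(bisect_right(lats, lat) - 1, len(lats) - 2)
    # Fractional position of the point inside that cell.
    tx = (lon - lons[i]) / (lons[i + 1] - lons[i])
    ty = (lat - lats[j]) / (lats[j + 1] - lats[j])
    south = field[j][i] * (1 - tx) + field[j][i + 1] * tx
    north = field[j + 1][i] * (1 - tx) + field[j + 1][i + 1] * tx
    return south * (1 - ty) + north * ty


def compare(lons, lats, field, observations):
    """For each (lon, lat, value) observation inside the grid, return a
    tuple (observed, gridded, observed - gridded)."""
    rows = []
    for lon, lat, value in observations:
        gridded = bilinear(lons, lats, field, lon, lat)
        if gridded is not None:
            rows.append((value, gridded, value - gridded))
    return rows


if __name__ == "__main__":
    # Made-up 2 x 2 grid and two made-up observations (one off the grid).
    lons, lats = [0.0, 10.0], [0.0, 10.0]
    field = [[0.0, 10.0], [20.0, 30.0]]
    obs = [(5.0, 5.0, 16.0), (20.0, 5.0, 1.0)]
    for observed, gridded, diff in compare(lons, lats, field, obs):
        print(observed, gridded, diff)
```

A full space-time comparison would add the same linear weighting along the depth and time axes; a curvilinear grid would need a different cell search.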
The SMP data management work will nicely complement the DMO efforts
to build merged data objects and searching tools for the process data.
The scientific value of JGOFS data sets and SMP derived synthesis products
will extend for years beyond the completion of the program. An improved,
user-friendly SMP-LAS interface will help make these results widely accessible
to scientific and public audiences beyond just the current JGOFS investigators.
The SMP management team will work with the DMO to package platform-independent
versions of the SMP-LAS system with the final archived JGOFS data on permanent
media and/or web-based systems.
PROGRESS
- The beta-version of the J-LAS system to display US JGOFS Process Study
  data has received very positive feedback
- Arabian Sea and Southern Ocean results are currently posted
- Equatorial Pacific and NABE data are currently being processed.
- A CD-ROM (or two) of Process Study data, complete with platform-independent
  software to view it, will be cut in Dec 2002.
Cyndy has also mentioned that Steve Hankin et al. hope to address the
following key JGOFS issues in J-LAS by July:
- support in browsers where Java/JavaScript is not stable
  [CC: I don't know what this line means. I know there are some platform/browser version issues with LAS, but I don't remember hearing that PMEL would be fixing them]
- ability to impose metadata constraints (e.g. select data by cruise_ID)
- ability to present a nicer interface for the selection of variables
  (thematically grouped)
- ability to select groups of variables for download as a unit
- ability to select two (or more) variables to make property-property plots
- and ultimately (not yet): gridded/in-situ comparison
SMP DATA & SMP LAS
PRODUCTS PROMISED OR DESIRED
... also from the NSF proposal ...
Desired Features for an SMP data management system:
- Centralized access to any SMP data set (but with possibly decentralized
  distribution)
- Subsetting capability
- Ability to reformat data to one of several different formats
- Graphical display of data
PROGRESS
- Centralized access is provided via the main SMP Data Page. Some of these
  data sets can be displayed via the LAS.
- SMP-LAS (for gridded data; satisfies the features listed above)
  - some of the gridded data sets submitted above are now available via the
    SMP-LAS
  - some still have problems and require PMEL expertise, e.g. McGillicuddy's
    curvilinear grid
  - model data (e.g., Christian/McClain, Hofmann, Arrigo) have not been
    tackled to any great extent yet, but will be soon, now that many LAS
    problems have been resolved
- Data Inventory: data sets are slowly being submitted and posted on the
  SMP data system as they come in.
LONG TERM ISSUES
Post-US JGOFS System Maintenance: We should start thinking
about plans to keep the US JGOFS data online. The US JGOFS DMO at Woods
Hole is the obvious site to maintain this. What are the mechanisms for
obtaining funding to keep the system alive for some number of years, and
how many years would seem appropriate?
Archiving
- Funding for the U.S. JGOFS SMP does not cover long-term archival costs
  (at a minimum, maintaining data on a server) beyond the lifetime of the SMP.
  NODC will handle long-term archival, but will any program step up and
  continue to serve the data along with data management and analysis tools?
- Long-term archiving requires different standards than short-term
  archiving. Currently, most programs provide CD-ROMs of their data, with
  software to access the data. For example, there was an explicit call at
  the 2001 summer SMP Principal Investigator meeting to provide the US JGOFS
  field-collected data in the same format and with the same access software
  as is now provided via the web. In terms of SMP data, this is possible,
  but may not be practical.
[CC: If I had been around when this was mentioned I would have let people know
that this is impossible. The LAS software cannot be used for CDROM
data access, and even the jgofs software would require a lot of
programming effort to make it available for multi-platform
installation. Either of these software packages requires much more time
and effort to install and configure than the average user is willing or
able to invest. We are trying to make a new Java application that
could be easily installed to provide multi-platform support for CDROM
data access.]
Because many of the SMP data will be submitted
in a wide variety of formats (ASCII, NetCDF, HDF?, Excel Spreadsheets,
binary?) it may be impractical to provide multiple access routines to the
data. Long term storage of data may better serve the user if the data are
saved in some common format that is likely to remain easily readable for
the long-term future (e.g. ASCII).
Referencing: Will there be a publication to accompany the US JGOFS
CD-ROMs?
[CC: What do you mean by this? Are you thinking about a publication to allow
authors to reference the CDROM via the printed publication? The answer is that
no one at the DMO has considered this till today. Dave Glover said he could
imagine writing something with input from the rest of the team, but ... this is
not currently on the to do list.]
TIME TABLE
2002
  ongoing | SMP: obtain and post data sets from completed SMP Projects (inventory available and updated on the web)
  08/02   | [CC: EqPac bottle and CTD]
  10/02   | [CC: NABE bottle and CTD]
  12/02   | US JGOFS: CD-ROMs of U.S. JGOFS process study data cut
  ????    | US JGOFS: non-beta release of J-LAS system
2003
  ongoing | SMP: obtain and post data sets from completed SMP Projects
  ????    | US JGOFS & SMP: merging of J-LAS and SMP-LAS
  05/03   | International JGOFS Meeting
2004
  ongoing | SMP: obtain and post data sets from completed SMP Projects
  12/04   | US JGOFS & SMP: Program completed
2005
  ????    | SMP: CD-ROMs of U.S. JGOFS SMP Project
  ????    | US JGOFS & SMP: Archiving of U.S. JGOFS data at NODC
  05/05   | [CC: end date of US JGOFS DMO funding?]