Tuesday, September 29, 2015

Raleigh Martin at EC3 Workshop 2015

Recently I participated in the EarthCube-funded EC3 (Earth-Centered Communication for Cyberinfrastructure) workshop at Yosemite National Park and Owens Valley, California.  The workshop brought together a mix of geoscientists and computer scientists to address challenges in field data collection and to brainstorm cyberinfrastructure solutions that make field data collection easier, more efficient, and more likely to result in useful long-term data preservation.

My own work encompasses both laboratory experiments and fieldwork on active sediment transport processes.  Through my engagement with SEN (Sediment Experimentalist Network), I have already thought substantially about laboratory issues, so participation in the EC3 trip gave me a chance to think more about field data.  Somewhat to my surprise, the idea of “fieldwork” varies vastly among domains.  Whereas fieldwork for me primarily means collecting instrumental time series records, the focus during the EC3 trip was on mapping geological structures and stratigraphy.

Despite my somewhat outsider status, I learned several lessons from the EC3 field trip, which I hope to share with the SEN community:

1)   The most effective development of geoscience cyberinfrastructure occurs when software developers and geoscientists are tied together at every step of the development process.  Otherwise, there is a danger that computer tools will not be compatible with the way that scientists actually do their work.  For example, tablet-based apps might one day replace the field notebook, but only if they accommodate the free-form sketches that don’t fit neatly into metadata categories.
2)   Research progresses in an unpredictable, heterogeneous, iterative, and “messy” way that makes the adoption of uniform, comprehensive cyberinfrastructure and database tools impossible.  I could see this in how much my concept of “fieldwork” differed from that of other workshop participants.  Rather than seeking a grand solution to all of our data problems, we’re better off building smaller-scale solutions for specific applications, then linking these applications through semantics, i.e., clear, machine-readable assignments of meaning that allow computers to link together heterogeneous databases into shared resources.
3)   Computer scientists actually enjoy our data problems and view them as research challenges!  They are not simply contractors for hire to build specific pieces of software.  As geoscientists, we can view work with computer scientists as research collaboration, which includes applying for grants together and writing papers together.  This will also make the development of cyberinfrastructure feel more like fun and less like a chore.  The EARTHTIME project is one great example of the synergies to be found between geoscientists and computer scientists.
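
To make the “semantics” idea in lesson 2 a bit more concrete, here is a minimal sketch in Python of how two small, independently built datasets might be linked through a shared vocabulary.  Everything in it is made up for illustration: the shared terms, the local field names (d50_mm, D50_um, u_cm_s), and the unit conversions are my own assumptions, not part of any EC3 or SEN tool.

```python
# Hypothetical shared vocabulary: each term has one agreed meaning and unit.
SHARED_TERMS = {
    "grain_size_m": "median grain diameter, in meters",
    "wind_speed_m_s": "wind speed, in meters per second",
}

# Hypothetical mappings, one per database:
# local field name -> (shared term, factor converting to the shared unit).
LAB_MAPPING = {"d50_mm": ("grain_size_m", 1e-3)}
FIELD_MAPPING = {
    "D50_um": ("grain_size_m", 1e-6),
    "u_cm_s": ("wind_speed_m_s", 1e-2),
}


def to_shared(record, mapping):
    """Translate one local record into shared terms; unmapped fields are skipped."""
    shared = {}
    for name, value in record.items():
        if name not in mapping:
            continue  # e.g. free-form notes that have no shared meaning yet
        term, factor = mapping[name]
        if term not in SHARED_TERMS:
            raise KeyError(f"mapping points to unknown shared term: {term}")
        shared[term] = value * factor
    return shared


# Two records that describe the same kind of quantity in different ways.
lab_record = {"d50_mm": 0.25, "operator": "A"}
field_record = {"D50_um": 250.0, "u_cm_s": 800.0}

combined = [
    to_shared(lab_record, LAB_MAPPING),
    to_shared(field_record, FIELD_MAPPING),
]
```

Neither dataset has to change its own conventions here.  Each one only needs a small mapping onto the shared terms, which is the kind of smaller-scale, linkable solution I have in mind.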

These lessons are my own personal opinions, and I’m open to debate with those who might disagree!  I encourage comments on these ideas and perhaps even further blog posts by members of the Sediment Experimentalist Network on this topic of development of cyberinfrastructure for the geosciences.
