Thursday, November 19, 2015

November Newsletter

SEN November Newsletter

Dear Experimentalists,

We hope everyone is having an excellent fall.  We have a lot of exciting news and upcoming events to share with you in this newsletter.

This issue contains the following:
  1. New NSF Data Policy
  2. Open Science Framework
  3. Travel Grant Winners Announced
  4. SEN at AGU Fall Meeting
  5. EarthCube Science Committee Webinar Series
  6. EarthCube Early Career Travel Grants
  7. Gilbert Club 2015
  8. EarthCube needs your help to gather examples of technical obstacles in geoscience research

New NSF Data Policy 
A new NSF policy on public access takes effect on January 25, 2016.  Compliance will be required for all research funded by NSF, and the policy should make it easier for the public to discover data and results from funded projects.  An FAQ on the new policy can be found here.

Open Science Framework
Raleigh Martin recently added a blog entry describing the Open Science Framework (OSF), a useful tool for data and workflow documentation through the research lifecycle.  This blog entry is based on a recent webinar on the OSF hosted by DataOne.

Travel Grant Winners Announced
Thank you to all the participants of the SEN Graduate Student and Early Career Travel Grant Contest.  We received many great applicants with excellent entries on www.sedexp.net.  The winners are:


Please be sure to check out all the great entries on the SEN wiki at http://www.sedexp.net.

SEN at AGU Fall Meeting
For all those who will be attending the AGU Fall Meeting this December in San Francisco, please be sure to check out the following SEN-related events.  During these events, SEN will be handing out buttons and stickers so you can show your support!

  • SEN team members will be available at the EarthCube booth (#116) in the exhibit hall.  The booth will be open from 10 AM – 5 PM on Tuesday, Wednesday, and Thursday.  Raleigh Martin will be there from 10-11 on Tuesday, and Kim Miller will be there from 10-11 on Thursday.
  • Check out the oral and poster sessions entitled “Experimental Studies in Surface Processes”.  Oral sessions are being held Monday, December 14th from 1:40pm to 3:40pm and 4:00pm to 6:00pm in Moscone West 2005. The poster session is Tuesday, December 15th from 8:00am to 12:20pm.
  • We would also like to point out several other AGU events of potential interest to the SEN community:
    • AGU will be hosting a “Data Fair” at the Fall Meeting, which will include daily panel discussions on Geoscience Data Publishing and Repositories.  More information is available here.
    • EarthCube will be hosting a Town Hall (TH15D) on Monday (12/14) from 6:15-7:15 PM in Moscone West 2008 on “EarthCube Science Drivers and Implementation Roadmap”.
    • A technical session entitled “Data Ought Not Be in the Darkness: They Should Be Open, Accessible, Transparent, and Reproducible” will be held on Monday and Tuesday afternoons.  Search for session numbers IN23E and IN33A for more information.

EarthCube Science Committee Webinar Series
The EarthCube Science Committee will host the first in a series of webinars on "Doing Geoscience with EarthCube Tools" aimed at connecting geoscientists with products generated by EarthCube-funded projects.  The first webinar will be held Friday, November 20, at 2 PM EST, with Kerstin Lehnert and Megan Carter (Lamont-Doherty Earth Observatory) presenting together on the iSamples Research Coordination Network, which seeks to improve discovery, access, sharing, analysis, and curation of physical samples and their associated data in the geosciences. Call-in and event details are available here.  More information on the webinar series is available here.

EarthCube Early Career Travel Grants
Two types of travel grants are now available to support early career scientists in travel to AGU and other conferences.
  1. The EarthCube Science Committee is offering six (6) travel grants of $500 each specifically for the upcoming AGU Fall Meeting.  Application deadline is November 30, 2015.  More information is available here.
  2. The EarthCube Engagement Team offers travel grants on a rolling basis. More information is available here.
If you are an early career researcher who has participated in SEN activities, or are planning to participate in more SEN activities, let us know if you have any questions about these travel grants at sedimentexp@gmail.com.

Gilbert Club 2015
The annual Gilbert Club meeting takes place at the Berkeley Hall of Science the Saturday following AGU and features talks and discussions from prominent scientists in the field of geomorphology.  This year’s meeting features presentations from Sue Brantley, Eric Lajeunesse, and Sean Willet.  For more information, including schedule, registration, and transportation, please check out their website.

EarthCube needs your help to gather examples of technical obstacles in geoscience research
Is your geoscience research limited by computing or data management issues?  The goal of EarthCube is to break up these logjams that limit scientific productivity.  To address technical limitations in the geosciences, EarthCube is compiling a list of "Use Cases" (real-world examples of computing challenges) that will inform creation of cyberinfrastructure for the geosciences. Please fill out this form to describe your Use Case.  A member of EarthCube will then follow up with you to conduct a one-hour interview to gather more details about your Use Case.

For up-to-date information about SEN, please check out our blog at http://sedimentexperiments.blogspot.com/ and follow us on Twitter (@sedimentexp).

Happy experimenting,
The Sediment Experimentalist Network
http://workspace.earthcube.org/sen

Thursday, November 12, 2015

Open Science Framework (OSF): A useful free tool for data and workflow management for scientific reproducibility

On October 13, 2015, DataOne hosted a webinar led by Courtney Soderberg from the Center for Open Science.

The webinar had two goals: (1) To outline the issues with existing scientific workflows that can lead to bias and results that are not reproducible, and (2) To introduce the Open Science Framework (OSF) as a tool to overcome these biases and increase the reproducibility of science.

Regarding issues of reproducibility, most scientists are probably aware of the narrow issue of computational reproducibility, i.e., the ability to take the data collected by a team of researchers, perform the same analyses, and reach the same conclusions.  Ms. Soderberg described this issue in her talk, but she also described more subtle biases and issues with reproducibility.  One issue is publication bias: analyses often change through the course of a project, and only the final (successful) analyses and results are documented, while negative results or dead-end analyses are never captured.  Related to publication bias is Hypothesizing After Results are Known, or HARKing.  To present a succinct story, publications often present hypotheses as a priori, whereas hypotheses may in fact have been generated after researchers spent significant time poring over the data.  In what is known as researcher degrees of freedom, data processing and analytical decisions are often made after seeing and interacting with data, severely increasing the potential for false positive outcomes, often outside of the awareness of a researcher.  (For further discussion of reproducibility problems, I suggest the enlightening recent special issue of Nature on this topic.)

In response to these various potential sources of bias, the OSF, a free web-based resource for data and workflow management, builds in mechanisms to reduce (or at least document) potential sources of research bias.  The OSF is meant to be used through the whole research life cycle, from project conception to final paper and data publication, and all actions taken, wiki entries written, and files uploaded on the OSF are timestamped and version controlled.  For example, it is possible to document a timestamped hypothesis prior to data collection and analysis to avoid HARKing.  More details on the OSF can be learned by viewing Ms. Soderberg's excellent presentation in full; below, I provide a few highlights:

  • OSF pages can be public or private, and there is granular control over access to individual pages and sections for collaborators or the general public.  Public projects are fully searchable.
  • Built-in tools smooth the collaboration process.  One can create templates for common file types, and projects can be "forked" to create copies of files/folders with original content intact.
  • Third-party software such as GitHub, Google Drive, and FigShare can be seamlessly integrated through add-ons.  This is especially useful for large files that exceed the current 128 MB limit for individual files stored with OSF (no total storage limit across all files).  The one catch is that, while all file versions uploaded directly to OSF are stored permanently, linked third-party content remains stored with third parties subject to their version control/storage policies.  Nonetheless, OSF does keep track of all version changes (even if it does not keep the original files).
  • Permanent identifiers (GUIDs) are assigned to projects created on OSF.  Other unique identifiers (e.g., DOIs, ORCID, LinkedIn) can be assigned to projects and/or researchers.
  • Versions of a project can be "registered" at a fixed point in time, such as when submitting an article for publication.  Registered versions become read-only and fully include all linked (third-party) content, so a registered project can provide a stable data/workflow accompaniment to a published journal article.  Registered versions can remain private for an embargo period of up to four years.  Once public, registered projects can be assigned a DOI.
  • Data sustainability is extremely important to OSF.  In case the Center for Open Science disappears, a "sustainability fund" has been established to maintain existing data in a read-only format indefinitely.
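
As a practical aside on the 128 MB per-file limit mentioned above, here is a minimal Python sketch for checking which files in a local project folder could be uploaded directly to OSF and which would need to be linked through a third-party add-on.  This is only an illustration, not part of OSF itself: the function names are hypothetical, and it assumes that "128 MB" means 128 × 1024 × 1024 bytes and that a file exactly at the limit is still accepted.

```python
from pathlib import Path

# Assumption: the 128 MB per-file limit is read as 128 * 1024**2 bytes.
OSF_FILE_LIMIT_BYTES = 128 * 1024 * 1024


def split_by_osf_limit(sizes, limit=OSF_FILE_LIMIT_BYTES):
    """Split a {name: size_in_bytes} mapping into (direct, needs_addon) lists.

    Assumption: a file exactly at the limit counts as uploadable.
    """
    direct = sorted(name for name, size in sizes.items() if size <= limit)
    needs_addon = sorted(name for name, size in sizes.items() if size > limit)
    return direct, needs_addon


def scan_folder(folder):
    """Collect file sizes (in bytes) under a local folder, keyed by relative path."""
    root = Path(folder)
    return {
        str(p.relative_to(root)): p.stat().st_size
        for p in root.rglob("*")
        if p.is_file()
    }


if __name__ == "__main__":
    # Hypothetical file names and sizes, for illustration only.
    example = {"notes.txt": 2_000, "flume_video.mp4": 500 * 1024 * 1024}
    direct, needs_addon = split_by_osf_limit(example)
    print("Upload directly:", direct)
    print("Link via add-on:", needs_addon)
```

To check a real project, pass the result of `scan_folder("path/to/project")` to `split_by_osf_limit` instead of the example dictionary.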

I strongly encourage all scientists to investigate OSF as an option for workflow and data management.  The advantage of OSF is that it provides a flexible, robust architecture for many data management challenges.  The disadvantage is that it may not fulfill discipline-specific needs of sediment experimentalists.  As we continue to develop the SEN Knowledge Base, we will closely follow developments of OSF and other data management platforms.

Raleigh L. Martin
UCLA Dept. of Atmospheric and Oceanic Sciences

---

Click here to view the webinar on the DataOne website.