Just like organisms, datasets get born, grow up, reproduce, and die. GloBI's mission is to help increase the productivity (or reuse) and the lifespan of datasets before they meet their maker.
To monitor the lifestage of a dataset, the GloBI "status" page was introduced in January 2017. This page shows the state of the federated collection of species interaction datasets that make up GloBI. Rather than treating datasets like static entities, GloBI takes a dynamic approach and revisits datasets regularly to incorporate changes and refresh the links to naming authorities (e.g. itis.gov), gazetteers (e.g. geonames.org) and other relevant external data services. Several monitors can be found on the status page that indicate whether the dataset can be read, searched or cited. In addition, some statistics are provided to point to known issues and the properties of the datasets (e.g. number of interactions, number of names, match rate of names). With this, a de-facto publication process is outlined along with quality control measures that show the lifestage of a dataset.
At time of writing, the status page provides a wealth of information. For instance, the state of the dataset provided by http://kelpforest.ucsc.edu (Beas-Luna et al. 2014) indicates that the dataset is close to dying or already dead: nearly all indicators are red and unresolved issues exist. In contrast, the chance of survival of a dataset like the Africa Tree Database (Selzer et al. 2015) looks promising: indicators are green and the name match rate is 94% across 1.9k names and 7.7k interactions. In addition, the Africa Tree Database has been deposited with Zenodo, a service that is designed to provide "permanent" data availability through digital object identifiers.
Over the last couple of weeks, I've used the page to discover and resolve various dataset issues. For example, an obscure GloBI bug that prevented the integration of some interaction records provided by the Avian Diet Database was found and resolved (see hurlbertlab/dietdatabase#40 and jhpoelen/eol-globi-data#276). I am curious to see how the status page will evolve in the coming months.
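To make the "match rate of names" statistic a bit more concrete, here is a minimal sketch of how such a rate could be computed. It is not GloBI's actual implementation: the function name, the example names and the identifiers are hypothetical, and it simply assumes that each name either resolved to an external identifier or did not.

```python
from typing import Optional, Sequence


def name_match_rate(resolved_ids: Sequence[Optional[str]]) -> float:
    """Return the fraction of names that resolved to an external identifier.

    Each entry is the identifier returned by a naming authority,
    or None when no match was found (an assumption for this sketch).
    """
    if not resolved_ids:
        return 0.0
    matched = sum(1 for taxon_id in resolved_ids if taxon_id)
    return matched / len(resolved_ids)


# Hypothetical lookup results; the identifiers are placeholders, not real ones.
lookup_results = {
    "name one": "EXAMPLE:1",
    "name two": "EXAMPLE:2",
    "name thre": None,  # a misspelled name that no authority recognized
    "name four": "EXAMPLE:4",
}

rate = name_match_rate(list(lookup_results.values()))
print(f"name match rate: {rate:.0%}")
```

A low rate computed this way would point at names that need attention (typos, outdated synonyms) before the dataset can be linked reliably.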
Poelen, Jorrit; Schulz, Katja; Hammock, Jennifer (2016): Pragmatic, scalable aggregation of organismal interaction data. figshare. doi:10.6084/m9.figshare.4495130.v1
The increased availability of digital records on where organisms hang out (or occur) has been facilitated by Darwin Core, a community standard for exchanging biodiversity information. TDWG (pronounced tahd-wick), the association that helped nurture this standard into maturity, held this year's annual meeting in Costa Rica. While the primary focus of Darwin Core is on species occurrence, an extension mechanism is provided to help capture other aspects of biodiversity data. While some methods exist to help describe species interactions in DwC (e.g. associatedTaxa, associatedOccurrences), various projects use "unofficial" extensions to express granular species interaction data. GloBI makes an effort to support all of these methods in the hope that, over time, we'll continue to discuss and collaborate on better ways to share interaction data.
A longtime GloBI contributor, Katja Schulz, presented a poster on a pragmatic approach to integrating species interaction data at the conference. Rather than relying on a single data exchange format, GloBI will take any existing (semi-)structured digital format to make existing datasets easier to access. With this approach, the burden of standardization of existing datasets is reduced. In addition, a continuous integration approach is taken: the datasets are continuously refreshed and linked against taxonomies, ontologies and other biodiversity data services.
Also, she and some of her colleagues joined me for an impromptu lunch to talk about species interaction data. I hope that the outcome of this lunch will contribute to the ongoing development of ways to easily exchange evidence records on how organisms rely on each other.
In his landmark publication “Animal Ecology,” Charles Elton points out that only after we organize knowledge across a “colossal store of facts” published in books and journals can we understand ecology beyond “clearcut niches which happened to have been worked out” (Fig 1). He wrote about this vision of natural-history fact synthesis as early as 1927. Since then, we've made some progress in providing access to observations of organisms recorded by generations of naturalists. For instance, Global Biodiversity Information Facility (GBIF) indexes over 600 million records of species occurrences (as of Feb 2016) from a wealth of data sources from all over the globe. GBIF's assemblage includes digitized museum specimen collections and digital-native citizen science projects such as iSpot, eBird, and iNaturalist. However, major “shortfalls” exist, especially related to species interactions (“Eltonian shortfall,” Hortal et al. 2015).
Fig 2. GloBI sits in a web of interconnected online services, including evidence literature (red), taxonomies (yellow), ontologies (purple), data providers (green) and search indexes/aggregators (pale blue).
Now that we have a wealth of data describing where organisms hang out, a next logical step is to increase focus on records that describe how species rely on each other. Global Biotic Interactions (GloBI) takes a federated approach to integrating evidence data: rather than serving as a giant data store, GloBI continuously discovers and links facts about species interactions across existing ecological data providers, online taxonomies, citation indexes, and data applications. In a way, GloBI is an actor in a web of mutualistic relationships between journals, online data tools, and citizen-science and research communities (Fig 2). With its finely tuned mission of making interaction data more accessible, GloBI acts as a catalyst for connecting previously siloed communities through the data they share. This linked data makes it easier to discover evidence about specific species interactions (e.g., What do sea otters (Enhydra lutris) eat?) and forges cross-disciplinary collaborations, such as data peer reviews and food-web reconstructions from previously disjointed evidence data (see Poisot et al. 2015).
Fig 3. Linkages related to an iNaturalist observation describing a Great White Shark (Carcharodon carcharias) eating a California Sea Lion (Zalophus californianus).
By linking communities through their shared data (Fig 3), focused projects like GloBI can help to improve the resilience, quality, and diversity of the data ecosystem that exists today. And by building an increasingly diverse and healthy data ecosystem, we continue to work toward realizing Charles Elton’s vision and better our understanding of ecology at a global scale.
On Saturday 17 October 2015, a herd of nature nerds gathered for the Nerds for Nature Summit 2015 at Manylabs in San Francisco, California. The format of the summit was a scaled-up version of the collaborative Nerds for Nature meetup format: a round of introductions and short project pitches, followed by project-specific breakout discussion groups. At this year's summit, Global Biotic Interactions was featured as one of the one-minute pitches.
Nerds for Nature and Manylabs facilitate invaluable interactions between folks from various disciplines (e.g. software, hardware, education, design) to develop and mature projects centered around open data and open (citizen) science. The many outcomes of my (and therefore GloBI's) connection to Nerds for Nature and Manylabs include getting access to a stimulating work environment at Manylabs, planting the seeds for a project called EDAM (Ecological Data Access Monitor), and an increased cross-pollination of open science/data ideas with individuals and institutions (e.g. Manylabs residents and their guests, Concord Consortium, Hacking Measurement at UC Berkeley, Moore Foundation).
Going forward, Global Biotic Interactions will continue to foster mutualistic relationships within a growing, diverse and social open data/education/science community.
Big thanks to Nerds for Nature for organizing the summit and to Manylabs for hosting the event.
Baron, Daniela; Caragol, Ri; Furrer, Stefan; Macmurchy, Peter; Stark, Adam (2015): GloBI Explorer Interactive Ecosystem Explorer. figshare. http://dx.doi.org/10.6084/m9.figshare.1414252 Retrieved on May 23, 2015.
MOOCs are all the rage. An enormous number of courses are now available online for anyone who has time and a networked device. When Battushig Myanganbayar, a 15-year-old high school student from Mongolia, aced MIT's Circuits and Electronics MOOC, he found himself in the spotlight and got invited to study at the prestigious institution.
The folks at Indiana University take the idea to the next level: the Information Visualization MOOC organized by Katy Börner et al. not only provides free education, it also gets students to collaborate with real projects that give access to open data. I was excited that our project, Global Biotic Interactions (GloBI), was invited to participate in this unique course for a second year (see IVMOOC 2014 project).
Baron, Daniela; Caragol, Ri; Furrer, Stefan; Macmurchy, Peter; Stark, Adam (2015): GloBI Explorer: Interactive Ecosystem Explorer. figshare. http://dx.doi.org/10.6084/m9.figshare.1414253 Retrieved May 22, 2015.
The IVMOOC-GloBI challenge for this year was to create an engaging experience for high school students to explore food webs in and outside of the classroom. From the start, Daniela Baron, Ri Caragol, Stefan Furrer, Peter MacMurchy, and Adam Stark were eager to learn more about the dataset, provide improvement suggestions, and respond to feedback provided by Jeff Holmes, Marie Studer, and Jen Hammock of the Encyclopedia of Life. I was impressed by what they were able to create in only a short amount of time: a web application, a paper, and a show-it-all poster.
I think that this year's IVMOOC project demonstrates the benefits of open data: the use of openly accessible data provides the ability for anyone with an idea and an internet connection to help us better understand the world around us. Not only that, it helps to create Massive Open Online Data (MOOD) communities of citizen scientists and engineers from all over the world to help make the data, and the tools used to access it, increasingly more useful. For example, Sergey Slyusarev, an IVMOOC 2014 alumnus, has identified data issues and is coauthor of rglobi, an R library for accessing GloBI.
Thanks to the IVMOOC class of 2015 (and their organizers) for making this happen!
On Feb 5, 2015, Global Biotic Interactions (GloBI) was the topic of an afternoon Tea talk at the Berkeley Institute for Data Science (BIDS). Located in the historic Doe Memorial Library, BIDS helps to advance data-intensive science across the UC Berkeley campus. Part of their many activities is a twice-a-week afternoon tea series where projects are presented and discussed.
After a short introduction by BIDS fellow Falk Schuetzenmeister, I introduced GloBI to the 20-30 audience members. Then the more interesting part of the afternoon started: an open discussion! I learned a few things from the discussion. First, the audience considered the immediate utility of GloBI to be facilitating data discovery and literature research to help come up with original research questions. In line with this topic, Iryna Dronova suggested creating a real-time data source tracker to help visualize which data sources are available through GloBI.
One of the other discussion topics was how to stimulate ecologists to share data. David Ackerly mentioned that a critical part of the success of GenBank was that editorial boards of genomics journals collaborated and mandated that data be deposited in this public resource prior to accepting manuscripts for publication. Also, in-person, multi-day workshops were mentioned as a promising method to get cross-disciplinary researchers together, share data and use new tools to help answer meaningful research questions.
Sea otters and their lunch were a topic of discussion at BIDS Tea Feb 5, 2015 following a presentation.
Finally, an audience member asked a question: 'Do otters really eat beavers?'. This told me that my short demo using available GloBI tools assisted in data review: dubious data was identified quickly during a short data excursion in the world of species interactions. My answer to this question was: please look up the source reference and ask the data contributors. After a visit to the EOL Enhydra data tab, I discovered that this interaction was recorded by Joel Sachs et al. (2006).
Now the question remains: Do sea otters (Enhydra) really eat American beavers (Castor canadensis)? Are sea otters that vicious? I hope we'll find out... I'll make a point to share this blog post with the authors, in the hope that they can shed some light on the topic.
Update Feb 19, 2015 - After an open discussion with data contributors, the trophic interaction between Sea Otters (Enhydra) and American beavers (Castor canadensis) has been removed. The changes will propagate into GloBI and EOL with some delay.
Jorrit H. Poelen, James D. Simons and Chris J. Mungall. (2014). Global Biotic Interactions: An open infrastructure to share and analyze species-interaction datasets. Ecological Informatics. http://dx.doi.org/10.1016/j.ecoinf.2014.08.005 .
"The proof of the pudding is in the eating", is a phrase that stuck out in detailed comments from Jan Willem Henfling on our recent paper (Poelen et al., 2014) in Ecological Informatics. With this, he pointed out that it is important to get the species interaction data into the hands of researchers and educators.
I was happy to read his comments, because it told me that our investment in writing and publishing an open-access paper (at a seemingly hefty price of $2500) is starting to pay off. Also, it highlighted that getting the interaction data out there for anyone to use is not enough: active collaborations are essential to show the use of our project. This is why I wanted to share some recent activities with you.
Participants of NESCent-BHL-EOL Research Sprint on 4-7 February 2014 in Durham, North Carolina. Can you find the author?
After participating in the four-day research sprint organized by NESCent, Biodiversity Heritage Library and Encyclopedia of Life at Durham, North Carolina in February 2014, I have been working with Brian Hayden to use GloBI data to show how dietary niche relates to biodiversity around the globe. Preliminary results are encouraging and a manuscript is in the works. Also, I have continued to work with Jen Hammock (Encyclopedia of Life, Smithsonian Institution) and Jim Simons (Gulf of Mexico Species Interactions, Texas A&M Corpus Christi) to put GloBI data to public use.
Tree-for-All hackathon participants gather to hear a progress report.
In September 2014, I participated in the week-long Tree-for-All hackathon hosted at the University of Michigan and organized by Arbor Workflows and Open Tree of Life. Among many other things, this collaborative event helped create a method to retrieve phylogenetic trees related to species interactions (e.g. Pocket Gophers and Their Parasitic Chewing Lice) using the rglobi (part of rOpenSci) and rotl R libraries.
In the time to come, I am looking forward to continuing to help others eat (or make!) more of that delicious GloBI data pudding! Pudding anyone?
A spatially integrated food web of the world derived from hundreds of thousands of interactions, across tens of thousands of species, and thousands o..
Sergey Slyusarev, Dimitrios-Georgios Kontopoulos, William Taysom, Adrian Guzman, and Bimlesh Wadhwa used GloBI data to create a food-web map as part of the Information Visualization MOOC class of 2014 at Indiana University. The map was created by combining interaction data from GloBI's Darwin Core Archive with terrestrial and marine ecoregions of the world and various openly available taxonomies (e.g., ITIS, NCBI, WoRMS). After eliminating taxa with few recorded interactions, species with similar predator-prey characteristics were grouped by a custom algorithm that was inspired by the Jaccard index, a similarity measure, and based on Infomap, a community-detection algorithm. The resulting interconnected taxa communities were then used to make an information-packed (gorgeous!) food-web visualization. The map was generated with a combination of custom R scripts, existing libraries (e.g., igraph, Reol, rgdal), Cytoscape, and Adobe Illustrator.
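For readers unfamiliar with the Jaccard index, here is a minimal sketch of the plain textbook measure applied to prey sets. This is not the custom algorithm used by Slyusarev et al., which was only inspired by it; the predator and prey labels are hypothetical placeholders, and returning 0.0 for two empty sets is a convention chosen for this sketch.

```python
def jaccard_index(a: set, b: set) -> float:
    """Jaccard index: size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0  # convention chosen here; the index is undefined for two empty sets
    return len(a & b) / len(a | b)


# Hypothetical prey sets keyed by predator; the labels are placeholders.
prey_of = {
    "predator A": {"prey 1", "prey 2", "prey 3"},
    "predator B": {"prey 1", "prey 2", "prey 4"},
    "predator C": {"prey 5"},
}

similar = jaccard_index(prey_of["predator A"], prey_of["predator B"])
dissimilar = jaccard_index(prey_of["predator A"], prey_of["predator C"])
print(similar, dissimilar)
```

Predators A and B share two of four distinct prey items, so they score higher than A and C, which share none. A grouping step could then cluster taxa whose scores are high.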
Explanation of how color, line width, and node size are used to encode spatial food-web information.
I find the integration of spatial information (e.g., marine, terrestrial) in this graph useful because I can quickly relate specific interactions to regions in the world. For instance, I can easily spot a coastal interaction as a filled node that also has a colored border. In addition, the directionality of the interactions is easy to understand thanks to color coding: predator is orange, prey is blue. Opening the high-resolution image in a run-of-the-mill image viewer, I can easily browse the map by zooming and moving with touch-pad gestures. With the help of this visualization, data anomalies in GloBI's complex data collection were detected, reported through GloBI's issue list, and corrected. This alone tells me that the visualization by Slyusarev et al. is a useful research tool.
Special thanks to all GloBI data contributors, Sergey for his suggestions for improving GloBI, and Scott Weingart of Indiana University for inviting GloBI as a client project of IVMOOC 2014. Can't wait to work with the IVMOOC class of 2015!
 Slyusarev, Sergey; Kontopoulos, Dimitrios-Georgios; Taysom, William; Guzman, Adrian; Wadhwa, Bimlesh (2015): Global Biotic Interactions food web map. figshare. http://dx.doi.org/10.6084/m9.figshare.1297762 Retrieved 03:26, Feb 07, 2015 (GMT)
Area selection tool in the GloBI's Interaction Browser provides access to raw data files in addition to a share link. The "show" link updates the visualizations in other parts of the page.
With Ben's data and Göran's tool, we can now "dial-up food webs" (phrase coined by Peter Roopnarine) all across the Antarctic and discover that many species feast on Eurythenes gryllus and its cousin Eurythenes obesus. Not only are the interactions visualized on-the-fly using dependency wheels, you can also access the raw csv, json or dot files to do offline analysis. In addition, you can share the selected area with others using a provided Interaction Browser "share" link.
This particular experience tells me that simply collecting and aggregating data is not enough. Only after locating and illuminating data with search and visualization tools can I start to analyze and perhaps understand the biological mechanisms behind the data hidden inside GloBI...
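To give an idea of what offline analysis of such raw files might look like, here is a minimal sketch that turns interaction rows from a CSV into a Graphviz DOT graph. The column names (source_taxon, interaction_type, target_taxon) and the sample rows are assumptions made for illustration; the files served by the Interaction Browser may use different headers.

```python
import csv
import io

# Hypothetical sample data; the column names and taxa are placeholders.
SAMPLE_CSV = """source_taxon,interaction_type,target_taxon
taxon A,eats,taxon B
taxon A,eats,taxon C
taxon B,eats,taxon C
"""


def csv_to_dot(csv_text: str) -> str:
    """Convert interaction rows into a directed graph in DOT notation."""
    lines = ["digraph interactions {"]
    for row in csv.DictReader(io.StringIO(csv_text)):
        source = row["source_taxon"]
        target = row["target_taxon"]
        label = row["interaction_type"]
        # Names are quoted because taxon names contain spaces;
        # embedded double quotes are not escaped in this sketch.
        lines.append(f'  "{source}" -> "{target}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


dot = csv_to_dot(SAMPLE_CSV)
print(dot)
```

The resulting text can be saved to a file and rendered with any tool that reads DOT.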
Screenshots of circular diagrams that highlight predatory interactions for Eurythenes gryllus around the Antarctic peninsula. The left diagram indicates the number of interactions by the width of the arc on the outside of the circle. The right diagram bundles the interactions to help detect highly interacting taxa. In the right diagram red indicates incoming interactions of selected taxon (e.g. prey), whereas green indicates outgoing interactions (e.g. predator).
In the spring of 2013, a friend of mine pointed me to an article in the National Geographic about tongue-eating fish parasites. After suppressing my gag reflex upon seeing a picture of a parasite acting as a tongue of an Atlantic croaker (yes, the fish was still alive), I decided to request data from Colt W. Cook, author of a master's thesis titled "The Early Life History and Reproductive Biology of Cymothoa excisa, a Marine Isopod Parasitizing Atlantic Croaker, (Micropogonias undulatus), along the Texas Coast." Colt was kind enough to give me permission to add his dataset to GloBI.
Now that the Encyclopedia of Life has integrated GloBI data into its species pages, the Atlantic croaker page includes dietary habits as well as information about parasites such as Cymothoa excisa. It's a win-win: users of the Encyclopedia of Life gain access to all sorts of structured species-interaction data, and the hardworking scientists who collected the data are attributed for their research.
Screenshot of the reference to Colt W. Cook's thesis on the EOL Atlantic croaker data page.
At time of writing (January 24, 2014), GloBI includes about half a million global interactions with close to four hundred references, spanning over a century of species-interactions data. As GloBI continues to aggregate existing datasets, we lower the barrier to accessing important data and put the scientists who've made contributions to the field of biology in the spotlight.
Hoping to hear from you! Thank you for reading this post!
Taxonomic query for king mackerel (Scomberomorus cavalla) provides a diet breakdown, a list of interaction observations, and a link to download raw CSV data.
1) a taxonomic query tool that lists species-interaction observations with information, 2) a spatial query tool for finding species-interaction observations within a specific geographic area, and 3) a food-web explorer tool that helps to navigate visually through a food web.
Food-web explorer tool: species of interest in blue, predators in red, prey in green. You can click on predator or prey items to further explore the food web.
GoMexSI not only provides a valuable educational and research tool to study the food webs in the Gulf of Mexico, but also gives a great example of how GloBI web services can be used to recycle, repurpose, and liberate existing species-interaction datasets in our educational and scientific communities. The GoMexSI 1.0-beta version was released to the general public on September 3, 2013.
The GoMexSI website is developed by Reed Hewett and Michael Casavecchia under the guidance of James Simons of Texas A&M Corpus Christi and Jorrit Poelen (GloBI developer and author of this post). To learn more about the many other people and institutions that helped to get GoMexSI where it is today, visit http://gomexsi.tamucc.edu.
The continued success of GloBI and GoMexSI depends on data contributions of ecologists around the world. References to contributed datasets can be found at http://globalbioticinteractions.org/references.html . If you'd like to share your species-interaction dataset, please open an issue at http://github.com/jhpoelen/eol-globi-data or send a message using http://gomexsi.tamucc.edu/feedback/.
The graph above shows GloBI species interaction data in action, and answers a question you might ask yourself when the flies start nipping on your holiday in Puerto Rico.
The graph shows the interaction between source (top circles) and target (bottom circles) taxa or species. The lines between the taxa represent an observed interaction between the source (e.g., predator) and target (e.g., prey) organisms. In the graph, the top source taxon for true flies (Diptera) is selected, and, as you can see, they feast on all sorts of organisms, including us poor mammals (represented by the bottom light blue circles). The source and target circles are ordered in the same way and grouped according to their location in the tree of life or taxonomy. The colors of the circles represent the higher-order taxonomic rank (e.g., kingdom, phylum) of the taxon. Some people might call this interaction graph a bipartite graph, where others might see it as a simple form of a hive plot.
So by now, you should be able to answer the question you might ask when you visit Puerto Rico: "Am I the only one who's getting eaten by flies?" This example shows how EOL's GloBI enables easy access to detailed information about species interactions and helps answer very specific ecological questions.
Attendees from left to right: Margarita Gallegos Martínez*; Héctor Ocampo Alvarez*; Ana L. Ibañez*; Francisco Javier Ordoñez Gazca*; James D. Simons, Texas A&M Corpus Christi, TX, USA; Luis Gerardo Abarca-Arenas**; Iliana Pérez Espinosa*; María Trinidad Sosa Medina, Laboratorio de Patologia Acuatica, Departamento de Recursos del Mar, CINVESTAV-IPN, Unidad Mérida, Mexico; Porfírio Álvarez Torres, Gulf of Mexico Proyecto, Mexico; Silvia Díaz Ruiz*; Francisco Arreguin Sánchez, Centro Interdisciplinario de Ciencias Marinas, CICIMAR; Arturo Aguirre León, UAM-Xochimilco, Mexico City, Mexico; Jorrit Poelen, EOL's GloBI, Oakland, CA, USA; Jonathan Franco López, Laboratorio de Ecologia, Facultad de Estudios Superiores de Iztacala, UNAM; Enrique del Callejo Canal**
*UAM-Iztapalapa, Depto. Hidrobiología, Mexico City, Mexico
**Universidad Veracruzana, Instituto de Investigaciones Biologicas, Mexico
On May 23, 2013, Jorrit Poelen was invited to give a presentation about EOL's GloBI project at a marine biology meeting organized by James D. Simons and Ana L. Ibañez at UAM-Iztapalapa in Mexico City. The presentations by James D. Simons (GoMexSI: A Species Interaction Database), Jorrit Poelen (Encyclopedia of Life's Global Biotic Interactions—Unleashing EOL's Interaction Datasets), Francisco Arreguin Sánchez (Using Trophic Information to Identify Ecosystem Evolution and “Ecosystem Reference Levels” for Management of Living Marine Resources: Challenging the Conventional Paradigm), Silvia Díaz Ruiz (Food-Web Structure under Two Time Scales) and Enrique del Callejo Canal (Alternative Stochastic Processes in Food Webs) provided valuable insights into the structure of marine trophic data and how analysis of aggregated trophic datasets helps deepen our understanding of the Gulf of Mexico marine ecosystems. This understanding is critical to the preservation of marine life. The interdisciplinary meeting at UAM-Iztapalapa helped stimulate research collaborations by providing access to a vast library of existing species-interaction datasets. And with input from the citizen-science community, EOL's GloBI can play a critical role by continuing to facilitate data sharing and making aggregated data accessible.
Big thanks to Jim and Ana for organizing this inspiring meeting!
Cyndy Parr, chief scientist at Encyclopedia of Life, mentioned EOL's Global Biotic Interactions (GloBI) efforts at GSC15 (April 22–24, 2013). Thanks to our species-interaction data contributors and a compelling Gulf of Mexico food-web visualization, Cyndy had a great story to tell about species-interaction data as part of her presentation: Encyclopedia of Life—Applying Concepts from Amazon.com and Lego to Biodiversity Informatics.
As part of the Information Visualization MOOC (Massive Open Online Course) led by Dr. Katy Börner of Indiana University, students Twy Bethard (United States), Andrew Miles (United Kingdom), Edward Kok (Netherlands), and Mattia Della Libera (Italy) used GloBI data to create an insightful visualization of spatial marine food webs in the Gulf of Mexico. The full PDF version of the visualization can be found here. Jim Simons, GloBI datasets contributor and lead for GoMexSI, provided an early version of the normalized GloBI data to Twy Bethard in an easy-to-use CSV file format.
If you are interested in getting access to a massive amount of global species-interaction data, please visit our EOL GloBI Data wiki or contact us through this blog.
Yesterday (Sunday, March 3), Robert Reiz, Chris Mungall, and I got together in Dolores Park, San Francisco, to talk about software architecture, deployment, and data models while enjoying a windy but sunny winter day. Our discussions were really fun, and I wanted to share some of the sketches we made with y'all. Doesn't a good idea always start with a bunch of drawings on the back of a napkin?
For those tech-savvy people out there, we've created some open-source repositories to host our software and datasets: eol-globi-data for normalizing, enriching, and importing species-interaction datasets; eol-globi-rest for implementing an API that is easy to embed in web pages; eol-globi-service for hosting the normalized datasets; and eol-globi-web, a Ruby on Rails web app that serves as an example of how to use the EOL-GloBI normalized species-interaction datasets and services. If you have any feedback or have the urge to contribute datasets or coding skills, don't hesitate to contact me.
I am happy to share that our proposal "Unleashing EOL's Species Interaction Datasets—Integration, Visualization, and Analysis" was selected for the Rubenstein Fellows Program 2013. I am looking forward to working with James D. Simons, GoMexSI curator; Chris Mungall, a bioinformaticist; Robert Reiz, a software engineer; and the EOL community to provide easy access to the massive amounts of species-interaction data that have already been collected. By combining these existing data collections, we are hoping to make it easier for scientists and enthusiasts to better understand species interactions as they happen over time and space.