Customer Story

Big data and the race to save coral reefs

With climate change rapidly driving coral bleaching on a global scale, marine biologists are using predictive modeling to help prioritize conservation efforts.

Khaled bin Sultan Living Oceans Foundation

Challenge: With the rapid advance of climate change, coral reef health is in decline globally. Scientists are working against the clock to optimize reef preservation efforts and prevent further degradation and bleaching.
Solution: By increasing the level of statistical sophistication in their research, marine biologists can draw more substantive conclusions from coral reef data sets. JMP® makes robust statistical modeling less daunting for non-statisticians.
Results: Through enhanced collaboration and data transparency, scientists are closer to developing new approaches aimed at best driving future coral reef conservation efforts.

Though coral reefs constitute less than 2% of the Earth’s oceans, they play a critical part in ocean health and are among the most diverse ecosystems on the planet. Not only are they home to scores of marine species, but their immense impact extends to local economies, supporting fisheries, encouraging tourism, and protecting shorelines from erosion – an impact valued at billions of dollars each year. Given their outsized role in both environmental and economic success, protecting these fragile reefs should be a global priority, yet coral reef health around the world continues to decline. Climate change, in particular, has accelerated the degradation of reefs, and scientists now estimate that more than half of the world’s coral reefs have been lost over the past four decades due to climate change and other human-associated factors. Although scientists broadly understand the causes of coral reef bleaching and subsequent death, predicting specifically which reefs are most at risk, which are most resilient, and which should be prioritized for conservation continues to be among the field’s foremost challenges.

Organizations like the Khaled bin Sultan Living Oceans Foundation seek answers to these questions, and marine scientists the world over are investing in international coral reef research, conservation, and restoration efforts. With their support, Anderson Mayfield, PhD, has devoted his career to the study of reef coral physiology, arguing that scientists need a more rigorous, statistically driven approach if they are to understand – and mitigate – the breakdown of corals in response to environmental shifts before it is too late.

Statistical approaches speed research in a field where scientists are fighting against time

In the field of marine biology – where so much is hidden from casual view – scientists’ first challenge is to observe and understand how many corals are out there, where they’re located, how healthy they are, and the risk factors they face. Documenting changes in reef health over time is paramount in efforts to understand how this fragile ecosystem is being affected by climate change-associated factors like increases in seawater temperature and acidity.

But with the immediacy of the threat now facing coral reef ecosystems around the world, Dr. Mayfield – an assistant scientist at the National Oceanic and Atmospheric Administration’s Atlantic Oceanographic and Meteorological Laboratory in Miami – believes there is a need to be more proactive about marine science, and especially marine conservation efforts. That’s why he and his collaborators are making the case for a statistical approach. Predictive modeling, he says, can help to determine which reefs are more susceptible to environmental stress – and which are most resilient – thereby enabling conservation funding agencies to prioritize targeted mitigation efforts.

Healthy coral polyps rely on an endosymbiotic relationship with microscopic algae that live in their tissue. These algae give the coral their color and serve as their primary food source. When water temperatures rise, however, the symbiotic relationship comes under stress and the algae begin to leave, causing the coral to turn white or “bleach” and starve.

“Some corals will bleach regardless,” Dr. Mayfield explains. “But there are other reefs where if we alleviate some of the local pressures, they're going to be more likely to survive.” Although it’s difficult to decide which reefs are most worth protecting, some, like a series of reefs in Southern Taiwan, have shown high resiliency in the face of increasing environmental pressures. “This seems to me like a good candidate for a reef we should prioritize for conservation and research [to determine] why it’s so resilient,” he says.

Identifying critical survival factors may allow researchers like Dr. Mayfield to better predict which reefs are more prone to stress and to thwart future bleaching elsewhere.


  • Healthy corals at the base of an active volcano, Banda Islands, Maluku, Indonesia.

Boosting the statistical power of scientific conclusions

Dr. Mayfield’s statistical models represent a significant departure from his previous reliance on experiments with coral conspecifics grown in laboratory tanks, where, he says, the results lacked the statistical power to account for natural species variation. By contrast, “big data” statistical modeling enables him to make use of worldwide data surveys (such as those undertaken on the recently completed Living Oceans Foundation “Global Reef Expedition,” in which he was a participant) that are both deeper and more geographically expansive than the data amassed by any individual scientist.

For example, in recent papers published in Platax (2018) and Journal of Sea Research (2019), Dr. Mayfield and his co-authors looked at data sets from understudied regions of Fiji’s Lau Archipelago and the deep South Pacific (Austral Islands of French Polynesia and the Cook Islands), respectively, using a combination of univariate and multivariate methods. Critically, the researchers explored 12 environmental factors (e.g., temperature, reef structure, fish biomass) expected to influence coral physiology. Best-fit models produced by stepwise regression and partial least squares showed that only a subset of these routinely assessed environmental parameters was needed to explain a significant portion of the variation in physiological response. Though several of the models in the studies had relatively low predictive capacity, Dr. Mayfield maintains that the method is nonetheless a proof of concept, showing that it is worth attempting to use previously collected data to make predictions about marine animal health.
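
To make the idea of a best-fit subset concrete, here is a minimal sketch of forward stepwise selection in Python. It is not the analysis from the cited papers, nor JMP's implementation. The data are synthetic, the factor names beyond the three mentioned above are placeholders, and the stopping rule (a minimum relative drop in the residual sum of squares) is a simplification assumed here in place of the p-value or information-criterion rules that stepwise procedures typically use.

```python
import random


def fit_rss(columns, y):
    """Least-squares fit of y on an intercept plus `columns`; return the residual sum of squares."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    p = len(design[0])
    # Build the normal equations (X'X) b = X'y as an augmented matrix.
    a = [[sum(row[j] * row[k] for row in design) for k in range(p)]
         + [sum(row[j] * yi for row, yi in zip(design, y))] for j in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(p):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [x - f * z for x, z in zip(a[r], a[c])]
    beta = [a[j][p] / a[j][j] for j in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def forward_select(predictors, y, min_gain=0.10):
    """Greedily add the predictor that most reduces RSS; stop when the relative gain is small.

    `min_gain` is an assumed, illustrative threshold, not a standard default.
    """
    chosen = []
    rss = fit_rss([], y)
    while len(chosen) < len(predictors):
        candidates = {name: fit_rss([predictors[c] for c in chosen + [name]], y)
                      for name in predictors if name not in chosen}
        best = min(candidates, key=candidates.get)
        if (rss - candidates[best]) / rss < min_gain:
            break
        chosen.append(best)
        rss = candidates[best]
    return chosen


# Synthetic data: 12 candidate factors, of which only two drive the response.
rng = random.Random(0)
n = 300
names = ["temperature", "reef_structure", "fish_biomass"] + [f"factor_{i}" for i in range(4, 13)]
predictors = {name: [rng.gauss(0, 1) for _ in range(n)] for name in names}
response = [2.0 * t - 1.5 * f + rng.gauss(0, 0.5)
            for t, f in zip(predictors["temperature"], predictors["fish_biomass"])]

selected = forward_select(predictors, response)
print(selected)
```

In this synthetic setup the greedy search should stop once adding another factor no longer improves the fit meaningfully, leaving a small subset of the 12 candidates in the model.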


“We might already have all the data we need to address certain coral reef issues but we just aren’t analyzing them the right way.”

– Anderson Mayfield, PhD,
Assistant Scientist at the National Oceanic and Atmospheric Administration


From a graphing calculator to JMP®

When Dr. Mayfield began his training in marine biology first at Duke University and then at the University of Hawaii, Manoa, he was far from the outspoken proponent of statistical methods he is today. Having not yet been exposed to formal multivariate statistics in any real way – and because the variables in the tank experiments of his early career were largely known and tightly controlled – Mayfield jokes that his analysis was limited to what could be done on a graphing calculator (e.g., one-way ANOVA).

Dr. Mayfield’s reliance on rudimentary tools like calculators and Microsoft Excel, however, had to change as his research expanded into more field-based projects for which a new level of statistical sophistication was necessary. Fortunately, it was around that time that Mayfield also got started with JMP statistical discovery software.

Now he won’t use anything else.

“At first, I just used JMP to look at distributions and to carry out simple comparative analyses: t-tests, ANOVAs, linear regression and such. I probably took advantage of 5% or less of what JMP could do,” he recalls. Eventually, however, Dr. Mayfield resolved to pursue the other 95% – a move he credits with not only changing the way he analyzes data but also how those data are collected.

“Even when I'm out on the reef, I'm seeing JMP tables and figures, thinking about how I can get the most information out of a sample,” he says. Knowing the statistics and the capabilities available to him through JMP, Dr. Mayfield says he can make strategic decisions about where to focus his attention and which samples to collect while he’s underwater. “These are the kinds of things that I can directly test with JMP ahead of time,” he says.

Furthermore, Dr. Mayfield contends that with JMP, “even with the same amount of data, the amount [of information] it tells us has expanded exponentially.” Bringing the statistics in-house also limits his reliance on external statistical resources, which inevitably add more time to the research process. And time is a precious resource in light of the growing crisis of coral bleaching. “We don't really have the luxury to sit around for five years and think about how to analyze the data,” he explains.


  • Mass coral bleaching in the Peros Banhos region of the Chagos Archipelago in the Indian Ocean, summer 2015.

“It never occurred to me that I could actually use some of these ideas with coral”

Having taken the time to deepen his modeling skills within JMP, Dr. Mayfield is empowered to be creative with his data sets and devise a statistical approach that extracts previously hidden information. For ideas, he now looks to how other companies and institutions handle their large data sets. Analysts in other fields routinely use behavioral data to develop predictive algorithms about, for example, shopping preferences or maintenance schedules – why couldn’t the same principles be applied to coral reef behaviors and their survival?

“We may not be able to predict with 100% certainty whether a coral is going to die,” Dr. Mayfield theorizes, but “if we have enough data already about how corals behaved in the past ... we might say based on our past data that a one-degree [seawater] temperature increase results in a 30% change in coral growth.”

Potentially, he and his collaborators can use the worldwide data survey developed as part of the Living Oceans Foundation’s Global Reef Expedition to make the transition from explanation to prediction. “These are the kinds of things I want to explore now that I know more about what JMP can offer,” Dr. Mayfield says. He is quick, however, to echo a key message from the pioneering work of Galit Shmueli (2011): just because a data set excels at explaining past observations does not mean that it will necessarily support predictions about the future behavior of the target animals. This is not to say that explanatory and predictive capacity are never in sync, only that the relationship must be validated empirically.
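
One common way to check that relationship empirically is to fit a model on part of the data and score it on samples it has never seen. The sketch below illustrates this with a single-predictor linear fit in Python. The temperature and growth values are synthetic, the coefficients are invented for illustration, and the 60/40 split is an arbitrary assumption; none of it comes from Dr. Mayfield's data or from JMP.

```python
import random


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return slope, my - slope * mx


def r_squared(x, y, slope, intercept):
    """Share of the variance in y accounted for by the fitted line on this sample."""
    my = sum(y) / len(y)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return 1.0 - ss_res / ss_tot


# Synthetic records: seawater temperature (deg C) and a made-up growth measure.
rng = random.Random(1)
temps = [rng.uniform(26.0, 31.0) for _ in range(100)]
growth = [10.0 - 1.5 * (t - 26.0) + rng.gauss(0.0, 1.0) for t in temps]

# Fit on the first 60 records; hold out the remaining 40 for validation.
train_x, train_y = temps[:60], growth[:60]
test_x, test_y = temps[60:], growth[60:]

slope, intercept = fit_line(train_x, train_y)
r2_train = r_squared(train_x, train_y, slope, intercept)
r2_holdout = r_squared(test_x, test_y, slope, intercept)

print(f"explanatory R^2 (fitting sample): {r2_train:.2f}")
print(f"predictive R^2 (held-out sample): {r2_holdout:.2f}")
```

A large gap between the two scores would suggest that the model explains the fitting sample better than it predicts new observations.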

Though it's still a nascent idea, Dr. Mayfield is excited to explore the opportunities big data affords in the field of reef coral physiology. With new models developed in JMP, he can test their predictive power against an extensive field data set. “We won't know until we test [a model],” he says. Fortunately, JMP has built-in platforms that support this type of model accuracy testing: when a model is built, JMP reports whether it worked for each sample. “JMP’s predictive modeling platforms already have the pieces of the puzzle in place, they’re just waiting for this kind of data to come so that we can play around with it.”

Dr. Mayfield says he’s taken advantage of not only the modeling features of JMP, but also the neural network and outlier analysis platforms. Regarding the latter, rather than throw out data from corals displaying aberrant behavior (i.e., the outliers), he argues that, in many cases, these oddly functioning individuals – which tend to be the most stress-resistant in his data sets – might actually be those of most interest to physiologists. In other words, such features of JMP allow Dr. Mayfield to identify the coral biopsies he should prioritize under the real-world limitation that, in many cases, there are insufficient funds to analyze every biopsy to its fullest capacity.
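
The idea of flagging outliers for follow-up rather than discarding them can be sketched in a few lines of Python. This is not JMP's outlier analysis platform. It uses one common robust rule, the modified z-score based on the median and the median absolute deviation, with the conventional 3.5 cutoff. The biopsy IDs and measurements are hypothetical values invented for illustration.

```python
from statistics import median


def flag_outliers(samples, threshold=3.5):
    """Return the IDs of samples whose modified z-score exceeds the threshold."""
    values = list(samples.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        return []  # no spread around the median, so nothing can be scored
    return [sid for sid, v in samples.items()
            if 0.6745 * abs(v - med) / mad > threshold]


# Hypothetical measurements for ten biopsies (arbitrary units).
samples = {
    "biopsy_01": 1.0, "biopsy_02": 1.2, "biopsy_03": 0.9, "biopsy_04": 1.1,
    "biopsy_05": 1.0, "biopsy_06": 4.8, "biopsy_07": 1.3, "biopsy_08": 0.8,
    "biopsy_09": 1.1, "biopsy_10": 0.7,
}

flagged = flag_outliers(samples)
print("prioritize for follow-up:", flagged)  # prints: prioritize for follow-up: ['biopsy_06']
```

Here the flagged sample is kept and moved to the front of the queue for deeper analysis instead of being dropped from the data set.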


  • Acroporid corals off Makassar, Sulawesi, Indonesia.


  • Acroporid corals off Raja Ampat, Papua, Indonesia.


  • Acroporid corals off Komodo National Park, Nusa Tenggara, Indonesia.

Accelerating research through data transparency

As Dr. Mayfield is painfully aware, copious data languish unused by researchers who lack either the time or the statistical knowledge to analyze them. Moreover, given the pressure to publish and the competition between scientists, most are reluctant to share their data (even data they have already used in published works). With a problem as time-sensitive as coral reef bleaching, though, data sharing could be a key factor in advancing the state of the science and determining the most effective conservation strategies. “Especially in our field, [scientists] should be more open about how they analyze the data and make real efforts to get it out there. It's not doing any good just sitting on a computer,” he says.

And Dr. Mayfield leads by example – JMP gives him the tools for promoting transparency in his own research, such as interactive HTML features that allow for real-time data analyses. “JMP has given me so many ideas – new ways of making my data more interactive and not just in terms of visualization but showing people the thought process behind how I analyzed it,” he says. In addition to posting data sets to his personal website coralreefdiagnostics.com, Dr. Mayfield now uses JMP Public as a means of disseminating dynamic visualizations. The software’s data filtering capacity, he says, permits his collaborators and other interested individuals to “find” corals demonstrating certain characteristics in the data set – a bleached coral from Fiji that was over-expressing fluorescent proteins, for example. This feature in JMP Public will allow both present and future collaborators to quickly identify samples they may wish to analyze in more detail, be they archived biopsies or the GPS coordinates of the originally sampled coral colonies themselves.


“Most people will fall asleep if you give them a one-hour lecture on statistics, but maybe if they could just see how you went from this data table to this figure ... viewers would then be more confident in your data.” Dr. Mayfield hopes that by disclosing the details of his own research, and by boosting the reproducibility of his methods, he can encourage colleagues toward similar transparency, bringing the field together in pursuit of tangible solutions to coral reef decline.

“We might already have all the data we need to address certain coral reef issues,” says Dr. Mayfield. “We just aren’t analyzing them the right way.” JMP could change that. And now, Dr. Mayfield’s research is only limited by the questions he asks.

References

Mayfield AB, Dempsey AC, Inamdar J, Chen CS (2018), A statistical platform for assessing coral health in an era of changing global climate-I: A case study from Fiji’s Lau Archipelago. Platax 15, 1-35.

Mayfield AB, Chen CS, Dempsey AC (2019), Modeling environmentally mediated variation in reef coral physiology. Journal of Sea Research 145, 44-54.

Shmueli G (2011), To explain or to predict? Statistical Science 25(3), 289-310.
