PLOS clarification confuses me more
Posted: March 3, 2014
…the Data Policy states the ‘minimal dataset’ consists “of the dataset used to reach the conclusions drawn in the manuscript with related metadata and methods, and any additional data required to replicate the reported study findings in their entirety. This does not mean that authors must submit all data collected as part of the research, but that they must provide the data that are relevant to the specific analysis presented in the paper.”
I have been trying to parse this for the last couple of days. I cannot see how “the dataset used to reach the conclusions” could somehow be something different from “all data collected as part of the research.” Trivially, you could mean “experiments whose results are not in the paper.” But that’s dumb; who would even consider including those?
So what IS the distinction PLOS is trying to make? Are the videos of monkeys that Marc Hauser used the “minimal dataset,” or do they fall into the vague “all data collected as part of the research” category you don’t need to make accessible? If you had 30 DVDs of videos, how would you make them accessible with metadata and DOIs if you wanted to? I can’t see how those videos could be anything other than “the dataset used to reach the conclusions.” The impression I’ve gotten from others (and from the lengthy PLOS clarifications and FAQs), however, is that the output file from the manual coding of these videos would be the expected “available” data under the PLOS policy.
This illustrates my point, and what PLOS seems unwilling to address. Making the video accessible in the way they demand is insanely burdensome. Making the data file that is the result of coding the video accessible is easy but pointless. Everything important about the analysis occurred between the video and the coded file.
This isn’t just true of behavior videos; it’s true of nearly anything where you perform measurements that attempt to extract “important” features from complex, continuous phenomena. In other words, a LOT of experimental science. When you have large video or physiology datasets, there is usually no agreed-upon standard for how you convert them into a manageable set of measurements that are amenable to quantitative analysis, or necessarily even agreement on what features are important to measure. Some of the data I collect, for example, are analyzed purely by code. That’s great for resting easy about bias, but code often makes terrible, systematic mistakes that have to be manually checked, because code is so stupid. Often there is no “correct” threshold or segmentation criterion that is going to work in every case, so you either have to have exclusion criteria or a partial human-judgment step or some other way of filtering your data (see the sketch below). At the other end of the spectrum are totally subjectively scored events (like Hauser’s monkeys). Because humans are humans, you do your best to blind the experimenter to whatever independent variables you are interested in. That’s not always possible, so I try to avoid experiments that rely solely on “eyeballing” something.
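To make the “no correct threshold” problem concrete, here is a minimal sketch in Python. Everything in it is invented for illustration: the synthetic trace, the threshold, and the minimum-duration exclusion criterion are stand-ins, not anyone’s actual analysis pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "physiology" trace: noisy baseline with two embedded events.
trace = rng.normal(0.0, 1.0, 5000)
trace[1200:1220] += 8.0  # an unambiguous event
trace[3000:3004] += 5.0  # a brief, borderline event

def detect_events(trace, threshold):
    """Group supra-threshold samples into (start, end) runs."""
    above = np.concatenate(([False], trace > threshold, [False]))
    edges = np.diff(above.astype(int))
    starts = np.where(edges == 1)[0]
    ends = np.where(edges == -1)[0]
    return list(zip(starts, ends))

# The threshold is a judgment call: raise it and the borderline event
# vanishes; lower it and baseline noise starts qualifying.
events = detect_events(trace, threshold=3.0)

# So is the exclusion criterion: short runs (the borderline event, plus
# a few single-sample noise crossings) get flagged for a human to look
# at, rather than being silently kept or silently dropped.
MIN_DURATION = 10  # samples; a number somebody had to choose
accepted = [(s, e) for (s, e) in events if e - s >= MIN_DURATION]
to_review = [(s, e) for (s, e) in events if e - s < MIN_DURATION]

print(f"auto-accepted: {accepted}")
print(f"flagged for manual review: {to_review}")
```

Someone chose 3.0 and 10. Change either, and the tidy “data file that is the result of coding” changes with them.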
Most behavioral or physiological analysis sits somewhere between “pure code” analysis and “eyeball” analysis. It happens over several stages of acquisition, segmenting, filtering, thresholding, transforming, and converting into a final numeric representation that is amenable to statistical testing or clear presentation. Some of these steps are moving numbers from column A to row C and dividing by x. Others require judgment. That’s just how it is. There is no right answer or ideal measurement, just the “best” (at the moment, with available methods) way to usefully reduce something intractable to something tractable.
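Here is a toy version of such a pipeline, again in Python and again entirely hypothetical: every stage name, parameter, and cutoff below is a stand-in. The point is how the purely mechanical steps and the judgment-laden ones are interleaved.

```python
import numpy as np

rng = np.random.default_rng(1)
raw = rng.normal(0.0, 1.0, 2000)  # stand-in for an acquired trace

def segment(trace, width=200):
    """Purely mechanical: chop the trace into fixed-width windows."""
    return [trace[i:i + width] for i in range(0, len(trace), width)]

def passes_exclusion(seg, max_std=1.5):
    """Judgment encoded as a number: someone decided what 'too noisy' means."""
    return seg.std() <= max_std

def event_fraction(seg, threshold=2.5):
    """More judgment: the threshold defines what counts as an 'event'."""
    return float((seg > threshold).mean())

segments = segment(raw)                              # acquisition/segmenting
kept = [s for s in segments if passes_exclusion(s)]  # filtering
fractions = [event_fraction(s) for s in kept]        # thresholding/transforming

# The final numeric representation: one number per segment, ready for stats.
print(f"kept {len(kept)}/{len(segments)} segments; "
      f"mean event fraction {np.mean(fractions):.4f}")
```

A reader handed only `fractions` can’t interrogate any of those choices; a reader handed `raw` has to redo all of them.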
It is interrogating or revisiting this judgment that is the only purpose I can see for “open data” in these kinds of experiments, and that means you need every TB of raw data for the exercise to be useful. So, again, is the PLOS policy going to make this happen? If so, how? If not, it is pointless. That’s why I think, for these kinds of data, in the absence of a massive investment in a repository (on the scale of GenBank) for enormous video files and raw time series data (in what formats? annotated how? paid for by whom?), the PLOS policy is either impossibly burdensome or pointless. Even then, given the large set of uncontrollable and unknown variables that affect the collection of experimental data like this, pooling or comparing such datasets would be very bad science indeed.
I’m trying to think of more ways to frame this. Here’s another try:
There is no answer to the question “what is rat behavior in response to X?” in the sense that there is an answer to “what is the structure of rat myoglobin?” There is only what some particular group of rats did when some version of X (as understood and applied by a given experimenter under lab conditions that they can only partially control) happened to them. The “data” are everything those rats did that the experimenter chose to observe/measure after they did X to them. (You’ve already made important choices in deciding what to measure.) Many information-reducing and analytical steps later, there is a summary of what you decided was important and quantifiable about what the rats did.
This approximate, mediated, interpreted, judgment-based, tentative kind of conclusion is what neuroscience (and many sciences) lives with. We never get the discrete right/wrong answers you get about sequences, or phosphorylation sites, or the number of planets around a particular star, or the mass of a subatomic particle. We aren’t measuring facts of nature; we are asking “what kinds of things usually happen when…?”
If you think that’s frustrating to the “standardize your formats so all your data can be pooled/analyzed by others” set, imagine how frustrating it is to those of us who are doing the experiments. I wish there were a sequence of numbers I could pull out of a behavioral dataset that is the True Result of That Experiment, let alone The True Facts of That Behavior. There is not and never will be; there is only the raw video/trace of some stuff that happened one time.
The gene jockeys keep saying “we have these standards/repositories because the community demanded them.” It might be worth reflecting on why other communities haven’t demanded them, beyond their being bad scientists, old-fashioned, selfish, and uncollaborative.
I will also note that an ongoing issue at PLOS ONE is the extent to which a given editor buys into the PLOS ONE mission. That mission is as clearly articulated a policy as one could hope for (and one I deeply believe in), and the majority of editors understand it and follow it even when reviewers don’t. A significant portion of editors, however, don’t get the mission and don’t follow it, happily rejecting papers based on reviews that say they aren’t “novel” or “exciting” enough. A smaller proportion don’t even seem to know that P1 has a mission that differentiates it from other journals, and go on about how P1 should be working to increase its impact factor. Given this editorial variance, I can’t wait to see the wildly differing experiences people have navigating this ball of confusion about what a “minimal dataset” is.