17 - Keys to successful collaborations, part 2: Agree on the format of data & samples

Understand how initial data & samples are produced; share all publishable data in interpretable form

Reading time: 6 mins

In the previous post, we saw that the #1 source of collaboration disputes was not having agreed on authorship in advance. Now we will learn about the second main source of disputes: not having agreed on the precise format of the data and samples to be shared during the collaboration.

I can hear you asking, “Is the format of shared data and samples really so important that it is one of the main sources of problems in collaborations?” I understand your surprise, and yes, as often happens, the devil is in the details.

There are challenges associated with data and samples at two key moments during the collaboration:

the initial sharing of data and samples;
the interpretation of data and samples generated by the collaboration.

Let’s see how to prevent problems at each of these stages.

1) Understand how initial data & samples are produced and validated

The first misunderstandings occur when collaborators share data or samples at the beginning of the collaboration. They may phrase their requests in terms they believe are precise, but which can be misunderstood by researchers from other fields.

For example, when a biologist says they will provide a ‘pure’ sample of protein, they might mean ‘95% pure’, while a chemist might assume it’s 99.99% pure. The chemist could understandably become quite annoyed upon realizing this isn’t the case. In addition, you’ll both have lost precious time and goodwill.

You can see why a precise mutual understanding of the format of data and samples is crucial.

Tip: when you discuss initial data and samples, ask how they'll be produced and validated. This will give you a clearer understanding of the format in which they will be provided.

A horror story with a happy ending

A major misunderstanding about data format occurred in one of my own collaborations: my collaborator and I argued for months about whether I had provided the data he requested (I insisted I had; he insisted I hadn’t). This lasted until I realized that it would be best to meet in person. I invited my collaborator over, and only then did we discover a misunderstanding about the format of the data he was requesting, which was very easy to fix.

Two researchers chatting amicably

Discussing in person rather than by email solves most misunderstandings

Now, let’s look at the format in which the data generated during the collaboration should be shared.

2) Ensure publishable data is shared by all collaborators in interpretable form

By definition, collaborations produce data that will form the basis of joint papers. A commonly accepted rule is that all collaborators should have access to these data in an interpretable form. This point is crucial, as raw data are rarely straightforward to interpret for anyone but experts (and you’re probably not an expert in your collaborator’s field, or you wouldn’t need a collaboration).

For example, you won’t be able to interpret your collaborators’ data if doing so requires expensive or proprietary image software, or if your collaborators dump enormous amounts of raw data for you to parse, knowing that you can’t program or don’t have the time. Unfortunately, this happens often.

Can't interpret these raw data?

That's why you should always ask collaborators to provide data in interpretable form

A horror story with a bittersweet ending

As an example, here’s my own horror story. I was preparing a paper with a collaborator and had doubts about the results he drew from his data (for more on the relationship between data and results, check out this post: The OHEDR format for crystal-clear results paragraphs). I had a feeling that the results were overstating the data, yet I couldn’t be sure, so I asked him for access to the raw data.

My collaborator refused, because preparing the raw data would have taken him a month of work and he was nearing the end of his postdoc. I had to accept that I wouldn’t be able to examine the raw data and focused instead on toning down the paper’s conclusion as much as possible. The paper was accepted, but we both felt like we had been shortchanged…

If instead we’d had this discussion before embarking on a collaboration – that is, if I had requested that my prospective collaborator provide me with the data he generated in interpretable form – he would have budgeted the time. Had he refused, I wouldn’t have collaborated with him.

Publishing an article without having had access to my collaborator's data in interpretable form

“Do I always need to check the interpretation of the data produced by my collaborators? Can’t I just engage in a ‘service’ collaboration?”

You’re right; you may choose to engage in simple ‘service’ collaborations, in which you rely on a researcher for their technical expertise and trust the data they generate as well as their interpretation. You can read more about this type of collaboration here: Guidelines for Negotiating Scientific Collaboration.

But you should be extra careful before engaging in such collaborations, precisely because you won’t be checking the data. So, do your due diligence. Discuss with your collaborators the limitations of their techniques. They should be able to speak for an hour about the methodological mistakes in their field, how they avoid them, and how they validate the quality of their data. Otherwise, it’s best not to collaborate with them.

In summary

To determine the precise format in which to share data and samples, you need to understand:

how the initial data and samples are prepared and validated;
which type of data and samples will be generated during the collaboration;
how the generated data will be interpreted;
the limitations of techniques used at each step.

Tip: include an initial quality check in your collaboration agreement. Indeed, it is not uncommon for the samples provided to be different from what you expected, or for their quality to be subpar. A colleague of mine once received a piece of DNA that was supposed to contain a gene of interest – but it was not complete, so their experiments would never have worked! I’m sure that many of you will nod your heads in recognition when reading this :-).

Remember, as we saw in the previous post: "Never A.S.S.U.M.E. because it makes an ASS of U and ME!"

And that’s it for the format of data and samples, one of the two most important things to agree upon before embarking on a collaboration :-). If you want to know more, don’t hesitate to contact me about my one-day course, The keys to successful collaborations, in which I present the other common sources of collaboration disputes and how to avoid them.

Final note: if you cannot agree on authorship or on the data & samples format, don't collaborate

Of course, if you and your prospective collaborator cannot agree upon authorship or the format of the data and samples to be shared, don’t engage in the collaboration, because anything you haven’t agreed on will come back to bite you.

Did you already know these tips? If not, do they resonate with your experience? Do you have collaboration horror stories to share? Do you have questions? Write to me at david_at_moretime4research.com.

Have a nice day and fruitful research.

David

PS: If this post is useful, consider linking it to your website, and let me know at david_at_moretime4research.com. Thanks!
