What happens if my pragmatic approach to data analysis gets reviewed by a card-carrying statistician? “The absence of competent statistical guidance in this MS is becoming painfully obvious.” Oh my!
It’s been almost a year since Trinh et al, “Practical and Robust Identification of Molecular Subtypes in Colorectal Cancer by Immunohistochemistry” came out in Clinical Cancer Research. Let me tell you a bit about the reviews we got back in summer 2016.
One reviewer focused exclusively on our data analysis, so I assume this was a card-carrying statistician invited for exactly this purpose. I definitely have no problem with that. It is a very good thing that submissions to clinical journals get reviewed by statisticians. It’s just that the one we got was a particularly narrow-minded specimen.
The reviewer claimed to have spotted major statistical flaws in our analysis and made condescending remarks like “The absence of competent statistical guidance in this MS is becoming painfully obvious.”
Let me give you a run-through of our crimes:
First, we wrote a sentence along the lines of “cases in this subtype are enriched in late-stage disease” — the wording could be smoother, I agree, but that’s not what upset the reviewer:
I have no idea what this means – I am not an oncologist, but “enriched” – is this a real word? Does it have a meaning in the context of “late-stage disease”?
Yes, ‘enriched’ is a real word and refers to an increased proportion. And it is a commonly used term in genomics. Just think of Gene Set Enrichment Analysis. This example alone gives you an idea of how literal-minded and stubborn this reviewer was.
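To make “increased proportion” concrete, here is a minimal sketch of how one might check whether a subtype is enriched in late-stage cases. The counts are made up for illustration and are not from the paper, and the helper name `enrichment_p_value` is my own. It compares the late-stage proportion inside the subtype with the overall proportion and computes a one-sided hypergeometric p-value (the same tail probability a one-sided Fisher's exact test uses).

```python
from math import comb


def enrichment_p_value(k, n, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    N: all cases, K: late-stage cases among them,
    n: cases in the subtype, k: late-stage cases in the subtype.
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts, for illustration only (not data from the paper).
N = 100  # all cases in the cohort
K = 30   # late-stage cases in the whole cohort
n = 20   # cases assigned to the subtype of interest
k = 12   # late-stage cases within that subtype

prop_subtype = k / n                # late-stage proportion in the subtype
prop_overall = K / N                # late-stage proportion overall
fold = prop_subtype / prop_overall  # > 1 means "enriched"
p = enrichment_p_value(k, n, K, N)

print(f"subtype: {prop_subtype:.2f}, overall: {prop_overall:.2f}, fold: {fold:.2f}")
print(f"one-sided p-value: {p:.4g}")
```

A fold value above 1 with a small p-value is what “enriched in late-stage disease” is shorthand for.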
Next, we used the words “normalization was applied”.
This is absolutely meaningless. The term [normalization] itself is extremely variable – it means “make a distribution more like a normal.” Probably what is meant is “standardization”. But this can again mean a lot of different things.
Yes, I guess, ‘make a distribution more like a normal’ is the literal meaning of ‘normalization’. In genomics it means “removal of unwanted variation”.
“Principal component analysis of the normalized data illustrated… ” This is meaninglessness squared. A principal components analysis obtains scores. But of what variables? This is not stated. The analysis details are important. But since the standardized analysis is unclear, the use of a principal components analysis of unknown variables is doubly unclear. [my emphasis]
Well, for a statistically challenged researcher like myself, principal components analysis obtains 2D scatter plots, which give you an overview of how your data are organised (and yes, these plots are based on scores and loadings and bla bla bla). I had never before met a person who claimed not to understand what “a PCA of the data” meant.
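For readers who do want the scores and loadings spelled out, here is a rough sketch of what “a PCA of the normalized data” usually boils down to. It uses random toy data, not our data, and it takes simple per-variable z-scoring as a stand-in for normalization, which is an assumption for illustration and not the procedure from the paper. The variables going into the PCA are the columns of the matrix; the scores are the sample coordinates you put on the 2D scatter plot.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 50 samples (rows) x 6 measured variables (columns), with very
# different scales per variable. Entirely made up for illustration.
X = rng.normal(size=(50, 6)) * np.array([1, 5, 10, 0.5, 2, 100])

# Stand-in "normalization": z-score each variable (column) so that no
# variable dominates just because of its units.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# PCA via singular value decomposition of the centered, scaled matrix.
U, S, Vt = np.linalg.svd(Z, full_matrices=False)
scores = U * S                       # sample coordinates on each component
loadings = Vt.T                      # weight of each variable per component
explained = S**2 / np.sum(S**2)      # fraction of variance per component

# The familiar 2D overview plot uses the first two columns of the scores.
pc12 = scores[:, :2]
print(pc12[:3])
print(explained[:2])
```

Plotting `pc12[:, 0]` against `pc12[:, 1]` gives the overview scatter plot; which variables went in and how they were scaled is exactly the detail the reviewer was asking for.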
Well, we got the paper accepted in the end. How did we do it?
Step 1: Sleep on it, don’t respond to reviewers immediately.
Step 2: Swallow your ego.
Step 3: No matter how stupid the reviewer, no matter how condescending and irrelevant the remarks, stay friendly and constructive in your response.
That’s not the same as brown-nosing: we explained that these terms might be used differently in different communities and we did not apologize, since everything we had said was correct.
But in my experience it is very hard to just talk your way out of negative reviewer comments (unless, of course, the reviewer made an obvious factual mistake), so we changed the text a little bit here and there just to show our goodwill.
And for me as a reviewer: I am often asked to comment on data analyses, and experiences like the one above have taught me to be more forgiving when authors use terminology differently from how I am used to. It is much more constructive to write in a report “I would describe this differently” than “You obviously don’t know any statistics”.