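How to Write English Papers, No. 79: "Analyzing your data (part 3): presenting your data" | World Translation Services (ワールド翻訳サービス), paper translation and English proofreading for researchers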

March 11, 2021, 14:43

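No. 78 covered "Analyzing your data (part 2): statistical analysis."

This installment, No. 79, covers "Analyzing your data (part 3): presenting your data."

Parts 1 and 2 of this three-part article described how to explore your data to see what you have discovered and how to analyze the data rigorously to confirm your preliminary interpretations. This concluding part explains how to present your data so that readers understand what you discovered, and how to convince them that your interpretation is correct.

The primary goal here is to present data that support your conclusions and to choose a sequence of results that builds a compelling argument for those conclusions. Think of this process as organizing your thoughts and your argument persuasively, not as describing the mathematical and statistical methods you used (those details belong in the Methods section).

The article follows.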

Analyzing your data (part 3 of 3): presenting your data

By Geoffrey Hart

In parts 1 and 2 of this three-part article, I described how to explore your data to see what you’ve discovered and how to rigorously analyze the data to confirm your preliminary interpretations. In this concluding part, I’ll discuss how to present your data to show your readers what you’ve discovered and convince them your interpretation is correct. Here, the primary goal is to present data that support your conclusions, and to choose a sequence of results that creates a compelling argument in favor of your conclusions. Think of this as organizing your thoughts and your argument in a persuasive way, not as describing the mathematical and statistical methods used in your analysis; those details will be present in the Methods section.

Standardizing data

Note: Although terminology varies, normalization usually refers to transformations that are intended to produce a normal distribution, whereas standardization is intended to account for different initial values in different treatments, regardless of their statistical distribution.
In part 2 of this article, I discussed some problems that result from transforming your data. A less problematic form of transformation is to express results as a proportion of some base value, such as the value in a control or the initial value in a time series, rather than examining only the raw data. The analysis then changes from a comparison of sample means to a comparison of changes in those means. The changes may be based on a difference (i.e., you calculate the final value minus the original value), a proportion (you divide all values by the original value), or a proportional change (you subtract the original value from the current value and divide that difference by the original value). One popular technique is to use z-scores, which express each value as the number of standard deviations it lies from the mean. Other standardizations include expressing values per unit area, per unit mass, or per capita.
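As a rough illustration of the standardizations described above, the following Python sketch computes the difference, the proportion, the proportional change, and z-scores for a short series. The function names and the numbers are hypothetical, and the z-scores assume the sample standard deviation (n - 1 denominator); your field may prefer the population version.

```python
from statistics import mean, stdev


def difference(values, baseline):
    """Change from baseline: each value minus the original value."""
    return [v - baseline for v in values]


def proportion(values, baseline):
    """Each value expressed as a proportion of the original value."""
    return [v / baseline for v in values]


def proportional_change(values, baseline):
    """(value - original) / original for each value."""
    return [(v - baseline) / baseline for v in values]


def z_scores(values):
    """Number of sample standard deviations each value lies from the mean."""
    m = mean(values)
    s = stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(v - m) / s for v in values]


# Hypothetical time series; the first value is treated as the base value.
series = [20.0, 25.0, 30.0, 40.0]
baseline = series[0]

print("difference:         ", difference(series, baseline))
print("proportion:         ", proportion(series, baseline))
print("proportional change:", proportional_change(series, baseline))
print("z-scores:           ", [round(z, 3) for z in z_scores(series)])
```

Each function answers a different question about the same numbers, so state which one you used when you report the results.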
Note: Always provide the standard deviation or standard error with every mean, or a box plot that shows the variation around a median, so that readers will understand the magnitude of the variation in your results. Present the sample size to provide additional insights into that variation.
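A minimal sketch of the summary recommended in this note, assuming a simple list of replicate measurements. The `summarize` helper and the sample values are hypothetical, and the standard error is taken as the sample standard deviation divided by the square root of n.

```python
from math import sqrt
from statistics import mean, stdev


def summarize(values):
    """Return n, mean, sample SD, and standard error of the mean."""
    n = len(values)
    sd = stdev(values)  # sample SD (n - 1 denominator)
    return {"n": n, "mean": mean(values), "sd": sd, "se": sd / sqrt(n)}


# Hypothetical replicate measurements for one treatment.
replicates = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
summary = summarize(replicates)
print(
    f"mean = {summary['mean']:.2f}, SD = {summary['sd']:.2f}, "
    f"SE = {summary['se']:.2f}, n = {summary['n']}"
)
```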
Standardization is a powerful way to clarify changes in a data series and differences between treatments because it accounts for factors that might bias your interpretation of those changes, such as differences in the initial value. However, as is the case in any transformation of data, you must remember to account for the consequences of the transformation. The raw values and standardized values have different meanings. For example, if only a small percentage of a region’s farmland becomes degraded due to an unsustainable agricultural practice, the percentage suggests that the impacts are not serious. The proportion is, after all, small. But if this degradation occurs over a very large agricultural area, the total degraded area is large and important. Conversely, what seems like a large proportional change based on the transformed data may prove to be unimportant in practice. For example, if the survival rate of plants affected by a disease increases from 1 in 1000 to 2 in 1000, the proportional increase is (2 – 1)/1 = 1.00 = 100%. However, that increase has little practical significance for farmers, and is as likely to result from random chance as from a successful treatment. An increase from 100 in 1000 to 200 in 1000 represents exactly the same proportional change (100%), but the survival of 100 additional individuals is more likely to be important.
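The survival example can be checked numerically. The sketch below is only an illustration: it assumes two independent groups of 1000 plants each (untreated and treated), and it uses a two-sided Fisher exact test, which the article does not name, as one possible way to ask whether a difference could plausibly arise by chance. The function names are hypothetical.

```python
from math import comb


def proportional_change(before, after):
    """(after - before) / before."""
    return (after - before) / before


def fisher_exact_two_sided(a, n1, b, n2):
    """Two-sided Fisher exact p-value for a/n1 versus b/n2 survivors.

    Sums the hypergeometric probabilities of every table that is no
    more likely than the observed one, with the margins held fixed.
    """
    total = n1 + n2
    survivors = a + b
    denominator = comb(total, survivors)

    def prob(k):
        # Probability that exactly k of the survivors fall in group 1.
        return comb(n1, k) * comb(n2, survivors - k) / denominator

    observed = prob(a)
    lowest = max(0, survivors - n2)
    highest = min(n1, survivors)
    return sum(
        p
        for p in (prob(k) for k in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)  # tolerance for floating-point ties
    )


# Same proportional change in both scenarios (assumed group size: 1000).
for before, after in [(1, 2), (100, 200)]:
    change = proportional_change(before, after)
    p_value = fisher_exact_two_sided(before, 1000, after, 1000)
    print(f"{before} -> {after} per 1000: change = {change:.0%}, "
          f"extra survivors = {after - before}, p = {p_value:.3g}")
```

Both scenarios show a 100% proportional change, but only the second yields a small p-value under these assumptions, which is why the proportional change alone should not be reported without the raw counts.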
Like any transformation, standardizing your data loses some information and changes the nature of the data you’re looking at. Keep those changes in mind as you decide how to present your interpretation.

Evaluating non-significant results

Sometimes a specific experimental design fails to reveal a significant difference between treatments. Ask yourself why. For example, researchers who don’t review the literature to learn the expected magnitude of the variation before they design their experiment often choose a too-small sample size, leading to high variation in the results that obscures the differences. Alternatively, budget and time constraints may force you to use a too-small sample. In that case, you may need to present your study as exploratory, with the goal of increasing understanding of the study system so you can design a better experiment for your subsequent research.
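To make the sample-size point concrete, here is a sketch of one common planning calculation. It is an assumption on my part rather than something prescribed above: a normal-approximation formula for comparing two group means, n = 2 × ((z for alpha + z for power) × SD / difference)², where the SD comes from the literature and the difference is the smallest effect you care about. The function name and default values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(sd, min_difference, alpha=0.05, power=0.80):
    """Approximate n per group for a two-sided comparison of two means.

    Uses the normal approximation; a t-based calculation gives a
    slightly larger n for small samples.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / min_difference) ** 2)


# Hypothetical planning values: expected SD of 1.0 unit, and a smallest
# difference of interest of 0.5 units.
print(sample_size_per_group(sd=1.0, min_difference=0.5))
# Doubling the expected variation roughly quadruples the required n.
print(sample_size_per_group(sd=2.0, min_difference=0.5))
```

With these planning values the formula asks for about 63 samples per group, which shows how quickly the required sample grows when the expected variation is underestimated.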
Researchers prefer positive (i.e., statistically significant) results because journals have a strong bias against reporting negative results (i.e., a lack of statistical significance). However, negative results can be very important, as in the case of a medicine that produces no beneficial effect. If you designed your experiment well, have carefully controlled your selection of the study population, have obtained a large dataset, have validated your data by repeatedly calibrating your instruments against lab standards, and have replicated your results, you can be more confident that the negative result is real. For subjective data, such as the data generated by many sociology and psychology studies, asking a colleague to classify the results to see whether they agree with your classification increases confidence in the classification results. Where interpretations differ, you can discuss the difference and try to design a criterion that makes the classification more objective. With luck, that criterion will help you to agree about the correct classification.
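One way to quantify how well you and a colleague agree is Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The article does not name a specific statistic, so treat this as one option among several; the function name and the category labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who classified the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must classify the same non-empty set of items.")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's category frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


# Hypothetical classifications of six interview responses.
you = ["pos", "pos", "neg", "neg", "pos", "neg"]
colleague = ["pos", "neg", "neg", "neg", "pos", "neg"]
print(round(cohens_kappa(you, colleague), 3))
```

A kappa near 1 indicates strong agreement and a value near 0 indicates agreement no better than chance; items on which the raters disagree are the ones to discuss when refining the classification criterion.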
Additional confidence can be provided using an experimental design based on triangulation. If two methods of measuring the same variable agree, the probability that a negative result is an error rather than a true lack of difference is much lower. For example, you could calculate the area of a leaf using a digital caliper and an empirically derived relationship between length, width, and area, and you could also scan the leaf and use software to calculate its area. Similarly, if analyses of two different aspects of the same process lead to the same conclusion, that also reduces the likelihood that the lack of significance is an error. For example, if you measure the effect of activation of a gene using both the RNA produced by the gene and the proteins produced by translation of that RNA, and both results show no change in the study system in response to that gene expression, you can be more confident that activation of the gene produced no significant effect.
In extreme cases, negative results may even reverse the conclusions you reached in previous research. This can be a very good thing if it improves understanding of your subject. As an example, see Sager (2020). Of course, if you want to replace the prevailing understanding of a phenomenon with a new understanding, you’ll need strong evidence, and lots of it, to convince everyone. Tell readers what additional research will be required to support your proposed new description.

Presenting datasets clearly and consistently

Help your readers follow your description of the data by choosing a criterion for judging a result’s importance. Statistical significance is one obvious criterion, but significant results may not be meaningful in practice, as in the example of proportional changes that I described earlier in this article. Choose an appropriate characteristic of the data you are describing. For example, when you discuss the vectors for the variables in a redundancy analysis or principal-coordinates ordination, you can limit your description to only the vectors that are longer than a certain threshold length and that lie at an angle of <30° from the axis. Other vectors may be significant, but their correlation with the axis will be weaker, and that means you can omit those vectors from your discussion. The criterion you choose tells you which results you should focus on, which is particularly important when you can’t discuss every result (e.g., in a large, multi-variable dataset).
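The ordination example can be expressed as a simple filter. This sketch assumes each variable's vector is given by its coordinates on the first two ordination axes, that the axis of interest is axis 1, and that the angle is measured from the axis line in either direction. The variable names, coordinates, and the length threshold of 0.5 are all hypothetical.

```python
from math import atan2, degrees, hypot


def vectors_to_discuss(vectors, min_length, max_angle_deg=30.0):
    """Keep vectors longer than min_length and within max_angle_deg of axis 1."""
    selected = {}
    for name, (x, y) in vectors.items():
        length = hypot(x, y)
        # Angle from the axis 1 line, regardless of direction along the axis.
        angle = degrees(atan2(abs(y), abs(x)))
        if length > min_length and angle < max_angle_deg:
            selected[name] = (round(length, 3), round(angle, 1))
    return selected


# Hypothetical vector coordinates (axis 1, axis 2).
vectors = {
    "soil_pH": (0.8, 0.2),
    "rainfall": (0.3, 0.7),
    "slope": (0.1, 0.02),
    "clay": (-0.9, 0.3),
}
print(vectors_to_discuss(vectors, min_length=0.5))
```

Stating the threshold values in your Methods section lets readers see exactly why some vectors are discussed and others are not.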
Next, choose an efficient sequence to work through the data in a figure or table.
For example:

In a linear regression analysis, describe the trend for each regression line separately. For example, y increased continuously with increasing x in treatment 1, but decreased continuously in the control. Next, examine the differences between each pair of lines. Treatment 1 may have values less than those in treatment 2 up to a certain point, then achieve higher values subsequently.
In a table that presents multiple variables for each treatment, describe each variable, one at a time, to show how that variable differs among the treatments. Then repeat this process for the next variable and the next one until you reach the end of the variables.

Follow that order rigorously until you have described all data in the figure or table that meet your criterion for what to describe.
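The regression sequence described above (each line's trend first, then the comparison between lines) might be sketched as follows. The ordinary least-squares fit is written out by hand, the data are invented for illustration, and `fit_line` and `crossover` are hypothetical helper names.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def crossover(line_a, line_b):
    """x value at which two fitted lines intersect, or None if parallel."""
    (slope_a, intercept_a), (slope_b, intercept_b) = line_a, line_b
    if slope_a == slope_b:
        return None
    return (intercept_b - intercept_a) / (slope_a - slope_b)


# Invented data for two treatments measured at the same x values.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
treatment_1 = [1.0, 3.0, 5.0, 7.0, 9.0, 11.0]
treatment_2 = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0]

line_1 = fit_line(x, treatment_1)
line_2 = fit_line(x, treatment_2)

# Step 1: describe each trend separately.
for label, (slope, _) in [("treatment 1", line_1), ("treatment 2", line_2)]:
    direction = "increased" if slope > 0 else "decreased"
    print(f"y {direction} with increasing x in {label} (slope = {slope:.2f})")

# Step 2: compare the pair of lines.
print("lines cross at x =", crossover(line_1, line_2))
```

Only report a crossover point that falls inside the range of x values you measured.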

Constraining your presentation

Be cautious about extrapolating beyond the range of your data. Your data often describe only a small portion of the total range of possible values for a variable. If that total range is much larger, extending your interpretation beyond the range of your data is risky. For example, the beneficial response to a drug often increases with increasing dosage, right up to the point that the drug reaches toxic levels in the patient. Even if your intuitive knowledge of the situation suggests the behavior does not change for smaller or larger values, explain why you believe your assumption is valid, and suggest any cautions that are required if someone tries to extrapolate beyond your data.
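If you fit a model in code, one simple safeguard is to refuse predictions outside the range of the observed data unless the caller opts in explicitly. The sketch below assumes a previously fitted straight line; the function name, the dosage values, and the coefficients are hypothetical.

```python
def predict_within_range(slope, intercept, x, x_observed, allow_extrapolation=False):
    """Predict y from a fitted line, but only inside the observed x range."""
    lowest, highest = min(x_observed), max(x_observed)
    if not (lowest <= x <= highest) and not allow_extrapolation:
        raise ValueError(
            f"x = {x} lies outside the observed range [{lowest}, {highest}]; "
            "justify the extrapolation before using this prediction."
        )
    return slope * x + intercept


# Hypothetical dosages (mg) at which the response was measured.
dosages = [5.0, 10.0, 20.0, 40.0]

print(predict_within_range(0.5, 2.0, 25.0, dosages))  # inside the range
try:
    predict_within_range(0.5, 2.0, 80.0, dosages)  # beyond the range
except ValueError as error:
    print("refused:", error)
```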

Acknowledgments

I’m grateful for the reality check on my statistical descriptions provided by Dr. Julian Norghauer (https://www.statsediting.com/about.html). Any errors in this article are my sole responsibility.

Reference

Sager, W.W. 2020. Massif redo. Scientific American May 2020:48-53.
