[How to Write Papers in English] No. 82: "Managing Research Data and Supporting Documentation (Part 1): Why It's Important to Rigorously Document Your Study"

June 18, 2021, 17:11

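Installment 81 covered "Things other than data (Part 2)".

The theme of installment 82 (this one) is "Managing Research Data and Supporting Documentation (Part 1): Why It's Important to Rigorously Document Your Study".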
 
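Researchers are increasingly expected to preserve their data over the long term and to archive it so that future researchers can use it.

In 2018, Kathy R. Berenson published a thorough and excellent overview of this subject*, but it focuses on research in psychology.
(*Berenson, K.R. 2018. Managing Your Research Data and Documentation. American Psychological Association. 105 p. including index.)

This four-part series builds on what she wrote to provide an overview that focuses more on field and laboratory research.

Whatever type of research you perform, it is important to find ways to control the quality of your input data and how it is recorded and analyzed.

We hope you find it useful.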

 

Managing your study data and the supporting documentation
By Geoffrey Hart

Increasingly, researchers are being asked to archive their data so that it remains available in the long term and can be used by future researchers. Kathy R. Berenson (2018) provides a thorough and excellent overview of the subject, but with a focus on psychology research. In this series of four articles, I’ll build upon what she’s written to provide a summary that focuses more on field and laboratory research. Whatever type of research you perform, it’s important to find ways to control the quality of your input data and how it is recorded and analyzed. For example, if you’re studying an ecosystem and want to compare your results with those of another researcher, you need to choose an ecosystem that is either sufficiently similar to the one in their study that you can expect to obtain the same results, or sufficiently different that you can observe a different response and learn new things about the causes of those differences.
Note: Berenson, K.R. 2018. Managing Your Research Data and Documentation. American Psychological Association. 105 p. including index.

Part 1: Why it’s important to rigorously document your study

Replicability is an essential part of science, particularly in fields such as psychology and the social sciences that include human subjects and that directly affect human lives. In these fields, studies have proven particularly difficult to replicate, in large part due to the high variability in any human population and the difficulty of selecting a representative subset of that population. To mitigate this problem, you need to manage your data and document how you obtained and analyzed it well enough that you could give your dataset and instructions to a colleague, and they could reanalyze your data and produce identical results. (This is particularly important to facilitate meta-analyses of large datasets.) Better still, your documentation should be so complete and detailed that another researcher could control their experimental conditions well enough that, if they repeated your study, they would be likely to obtain similar results.
A further complication is that modern research generates large quantities of complex data, and most researchers receive little or no training in how to manage that data. Fortunately, there are logical ways to proceed that make data management easier and more effective. One of the key strategies is to develop objective, clear methods for organizing, processing, and analyzing your data, and to explain what you’ve done so well that another researcher could exactly repeat what you did.
There are also important ethical reasons to manage data so that it can be shared. For example, research that received government funding should be available to all the taxpayers who supported that research. This issue is sufficiently important that the American Psychological Association asks its members to ensure that their data remains available for a minimum of 5 years. A better target might be 10 years, particularly for important and innovative research or research that was difficult and expensive to perform.

Create a hierarchical project structure on your computer

Every research project should have its own folder (directory) on your computer. That folder and all the subordinate folders it contains should be named clearly enough to distinguish them from the folders of the many similar projects that you will subsequently perform during your research career. How to define that name depends on how your research is organized. If you are early in your career and are only conducting one or two studies simultaneously, the folder name may be as simple as your name, a key word related to the subject, and the year. For example, if your research is all stored in a folder named “Research”, your current project might be named “Hart 2020 drought stress [field or lab] study”. If you’re part of a research group, you may need to include the names of the principal investigators or you may need to use your employer’s project naming structure.
Note: If your research group comprises people from multiple institutions, create a document that lists complete names and contact information for all investigators that was valid at the time of the study. If possible, try to update this document periodically to include current contact information.
Avoid nicknames and abbreviations that only your immediate research group will know. After a few years, these shortcuts may become problematic because your colleagues may have moved to another institution or retired, and institutional memory of these shortcuts may have been lost; that is, the remaining researchers may no longer remember what those shortcuts meant or why they were chosen. If it’s necessary to use a complex project naming system that only bureaucrats could love, consider creating a document named “Explanation of project folder names” that provides the necessary explanations.
How you name the folders for each project depends on the nature of your research and how you approach the design and subsequent management of your studies. Berenson (2018) recommends creating the following subfolders:
  • Project files: All of the “paperwork” associated with a project (whether scans of paper documents or electronic copies), such as funding applications and Institutional Review Board approvals.
  • Data files: All of your original data, formatted as “read only” so that it can’t be accidentally changed. Most computer operating systems let you apply this format directly from the file management system (e.g., the File Explorer in Windows, the Macintosh Finder).
    To protect files against accidental modification, change their format to “read only”:
      • Macintosh: Select the file in a Finder window, and then press Command+I to reveal the file information dialog box. Scroll down to the heading “Sharing & Permissions”. For the “everyone” settings, under the heading “Privilege”, select “Read Only”.
      • Windows: Using the Windows File Explorer, right-click the file and select “Properties”. Select the checkbox beside “Read-only”.
  • Working files: All files that represent your work in progress, such as “cleaned” data files (files from which outliers and erroneous data have been removed), but not the original data files.
  • Command files: All data-processing scripts (e.g., for the R statistical software) and the code for any software you developed to analyze your data. (I’ll discuss command files in more detail in Part 3 of this series.) If that software evolves during the project, create a version control system so that you can retain old versions of the software in case you need to return to an older version.
  • Replication files: Files that you can provide to someone who wants to replicate your analysis, using either your data or their own data.
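If you set up projects often, the five subfolders listed above can be created by a short script instead of by hand. The following is only a minimal sketch in Python: the function name create_project and the constant SUBFOLDERS are hypothetical names chosen for this illustration, and the folder names are simply the ones Berenson recommends.

```python
from pathlib import Path

# Subfolder names taken from the list above (Berenson 2018).
SUBFOLDERS = [
    "Project files",
    "Data files",
    "Working files",
    "Command files",
    "Replication files",
]


def create_project(root: Path, project_name: str) -> Path:
    """Create root/project_name and the five standard subfolders inside it.

    Folders that already exist are left untouched, so the function can be
    run again safely on an existing project.
    """
    project = root / project_name
    for name in SUBFOLDERS:
        (project / name).mkdir(parents=True, exist_ok=True)
    return project
```

For example, create_project(Path("Research"), "Hart 2020 drought stress field study") would create the project folder used as an example earlier in this article, with all five subfolders inside it.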
Note that although these file protections are helpful, they are not a substitute for a rigorous backup and archiving strategy. Although your employer’s computer staff should implement such a strategy for you, you can also create your own backups. For some thoughts on how to do this, see <http://geoff-hart.com/articles/2021/backups-1.html>.
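When a project contains many original data files, the read-only protection described above can also be applied with a script rather than one file at a time. The sketch below is one possible approach in Python, not the only one; the function name make_read_only is hypothetical. It removes the write permission from every file in a folder. On Windows, Python's chmod can only switch the read-only flag on or off, which is all this sketch needs.

```python
import stat
from pathlib import Path


def make_read_only(folder: Path) -> list[Path]:
    """Remove write permission from every file under `folder`.

    Returns the list of files that were protected, in sorted order.
    """
    write_bits = stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH
    protected = []
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            current_mode = path.stat().st_mode
            # Keep every existing permission bit except the write bits.
            path.chmod(current_mode & ~write_bits)
            protected.append(path)
    return protected


if __name__ == "__main__":
    # "Data files" is the folder name used in this article; change the
    # path to wherever your original data is actually stored.
    data_folder = Path("Data files")
    if data_folder.is_dir():
        for protected_file in make_read_only(data_folder):
            print("read-only:", protected_file)
```

As with the manual method, this only guards against accidental changes; anyone with sufficient privileges can make the files writable again.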
An alternative project folder structure might be folders named “original data”, “working data”, “method documentation”, and “paperwork”. In this series of articles, I’ll use Berenson’s suggested names so that if you choose to consult her book, you can more easily find details that I don’t have room to discuss here. However, I’ll discuss her categories in an order that seems more logical to me, as it more closely follows the order in which you will perform your research.
Whatever names and structure you choose, create a standard nomenclature for all files in a given category and document that nomenclature so that a year after you developed it, when you begin your next project, you can create names that are consistent with that system of names and equally easy to understand. (This is also useful for variable names, since it will make it easier for future researchers to compare your results from different studies.) For example, your data files could be named using the format Site–Date–Treatment or Subject–Trial number–Data type. This will make it much easier to organize and find files because when you display them in alphabetical order on your computer, related files (those whose names begin with the same site or subject) will be grouped together. Avoid cryptic coded names, even if you document the meanings of those names in a separate document. The ideal system should be easily understood without referring to this document. Modern computers allow long file names, and carefully chosen names reduce the risk of misinterpreting a name and assigning it to the wrong date or treatment.
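One way to keep such a nomenclature consistent is to generate the names rather than type them. The following Python sketch builds a name in the Site–Date–Treatment format. Several details are assumptions made for this illustration and are not part of the original advice: the function name data_file_name is hypothetical, the parts are joined with a double hyphen, and the date is written as year-month-day so that alphabetical order within a site matches chronological order.

```python
from datetime import date

# Assumption for this sketch: parts are joined with a double hyphen.
SEPARATOR = "--"


def data_file_name(site: str, sampled: date, treatment: str,
                   extension: str = "csv") -> str:
    """Build a file name in the form Site--Date--Treatment.extension."""
    parts = [site.strip(), sampled.isoformat(), treatment.strip()]
    for part in parts:
        if not part:
            raise ValueError("every part of the name must be non-empty")
        # Reject text that would make the name ambiguous or invalid.
        if SEPARATOR in part or "/" in part or "\\" in part:
            raise ValueError(f"{part!r} contains a reserved character")
    return SEPARATOR.join(parts) + "." + extension


print(data_file_name("North ridge", date(2021, 3, 2), "drought"))
# prints: North ridge--2021-03-02--drought.csv
```

Because the site comes first, all files for one site sort together, and within a site they sort by date.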
Note: Don’t rely on computer timestamps to create versions of a file; if you open a file and save it, that date will change. Instead, explicitly add the date. For example, “Cleaned data--2 March 2021--outliers removed”.
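The explicit dating described in this note can also be automated. The Python sketch below copies a file and inserts the date and a short description into the name of the copy, following the pattern of the example above. The function name save_dated_copy is hypothetical, and keeping the copy in the same folder as the original is an assumption made for this illustration.

```python
import shutil
from datetime import date
from pathlib import Path
from typing import Optional

MONTHS = ("January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December")


def save_dated_copy(source: Path, description: str,
                    when: Optional[date] = None) -> Path:
    """Copy `source` to a new file whose name includes an explicit date.

    The copy is named "<original name>--<day month year>--<description>"
    and keeps the original extension. An existing file is never overwritten.
    """
    when = when or date.today()
    stamp = f"{when.day} {MONTHS[when.month - 1]} {when.year}"
    target = source.with_name(
        f"{source.stem}--{stamp}--{description}{source.suffix}"
    )
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(source, target)  # copies the contents and file metadata
    return target
```

For example, calling save_dated_copy(Path("Cleaned data.csv"), "outliers removed", date(2021, 3, 2)) would create a copy named “Cleaned data--2 March 2021--outliers removed.csv” beside the original.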
In Part 2 of this series, I’ll discuss how to manage your project files.
 



