Research Made Easy: Efficient literature study

Self-organization is an important skill in many industries. This is especially the case in research, where multiple projects often need to be managed without predetermined tasks. Researching a topic, in particular, can easily become an endless task. To streamline this process, we present some tricks and tools in this article, which we collected by interviewing ML2R researchers about their research process.

Literature study: How to find the right articles

Acquiring and keeping a firm grasp of state-of-the-art research is a big part of every researcher’s job. This is especially true for machine learning, since new research is published not only in journals and at conferences but also on preprint servers such as arxiv.org. Preprint servers, however, do not peer-review the contributions. While this means that preprints have to be taken with a grain of salt, they can still provide a useful glimpse into current research topics and directions.

Due to the large number of potentially relevant articles, it is very important to accurately select the papers that are truly relevant to the research question at hand. Some researchers prefer to first read only the Abstract, Introduction and Conclusion sections of each article. The Abstract summarizes the contents of the paper, the Introduction motivates the topic and states the key contributions, and the Conclusion summarizes and discusses the results. Since most machine learning literature is structured this way, researchers can quickly extract an article’s key contributions and compare them to their own research question. This saves time and increases the chance of selecting truly relevant articles for deeper inspection.

Google Scholar is a popular search engine designed specifically for keyword-based searches of scientific literature. Metadata such as publication date and location, as well as author information, is retrieved together with the publication itself. Google Scholar also collects all known publications by an author on a dedicated author page.

Example result for the search term “Model-agnostic explainability method”. Results can be restricted by metadata such as publication date and type of article. Underlined author names link to that researcher’s author page, where all known publications are listed.

Besides Google Scholar, most high-quality journals and conferences list past publications on their websites. Survey articles are also a good way to get familiar with a new topic: a survey lists and discusses many useful articles and methods for a specific topic or research question. Recorded university lectures and textbooks are very helpful if the topic is quite general and well established. A newer, free tool for finding research related to a given article is Connected Papers. This tool visualizes a graph of related literature, where proximity in the graph represents closeness between the literature items. If two articles tend to reference the same literature, they are assumed to be similar and are therefore placed closer together in the graph.
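The idea that articles citing the same literature are similar can be sketched in a few lines of Python. This is only an illustration under simple assumptions, not the method Connected Papers actually uses: similarity is measured here as the Jaccard overlap of two reference lists, and the article names, reference identifiers and the function name coupling_similarity are all made up for the example.

```python
from itertools import combinations


def coupling_similarity(refs_a: set[str], refs_b: set[str]) -> float:
    """Jaccard overlap of two reference lists (0 = disjoint, 1 = identical)."""
    union = refs_a | refs_b
    if not union:
        return 0.0
    return len(refs_a & refs_b) / len(union)


# Hypothetical articles mapped to the (made-up) identifiers of the works they cite.
references = {
    "article_a": {"ref_1", "ref_2", "ref_3"},
    "article_b": {"ref_2", "ref_3", "ref_4"},
    "article_c": {"ref_5"},
}

# Score every pair of articles; a higher score would mean "closer in the graph".
similarities = {
    (a, b): coupling_similarity(references[a], references[b])
    for a, b in combinations(sorted(references), 2)
}

# Print the pairs from most to least similar.
for pair, score in sorted(similarities.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 2))
```

In this toy data, article_a and article_b share two of four distinct references and would sit close together, while article_c shares none and would end up far away from both.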

© Connected Papers (https://www.connectedpapers.com/)

The graph by Connected Papers visualizes similar articles and colors them according to their publication date. More information about the articles, as well as a list representation of the graph, is shown in the margins. Additionally, literature can be saved for later if a Connected Papers account is created.

Research also means: Finding the tree in the forest

Nowadays, researchers work with digital representations of literature, mostly in the form of PDF documents. While some still prefer to print out the articles they would like to read, there are now tools that help researchers organize their literature library. Two very popular library managers are Mendeley and Zotero. Both work in a similar way: every literature item contains metadata, such as author information, publication date and publication location. Once literature has been added to the manager, the user can search and filter the library by keywords. It is also possible to attach the PDF files of the articles themselves, which can be synchronized across devices such as smartphones and tablets.
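To make the principle behind such library managers concrete, here is a minimal sketch in Python. It does not use the Mendeley or Zotero APIs; the LibraryItem class, the search function and all entries are hypothetical placeholders that only illustrate how metadata enables searching and filtering.

```python
from dataclasses import dataclass, field


@dataclass
class LibraryItem:
    """One literature item with the kind of metadata a library manager stores."""
    title: str
    authors: list[str]
    year: int
    venue: str
    tags: set[str] = field(default_factory=set)


def search(items, keyword=None, year_from=None, tag=None):
    """Return items matching all given criteria; criteria left as None are ignored."""
    results = []
    for item in items:
        if keyword and keyword.lower() not in item.title.lower():
            continue
        if year_from is not None and item.year < year_from:
            continue
        if tag and tag not in item.tags:
            continue
        results.append(item)
    return results


# Placeholder entries, not real publications.
library = [
    LibraryItem("Placeholder survey on explainability", ["A. Author"],
                2019, "Example Journal", {"survey", "xai"}),
    LibraryItem("Placeholder paper on Shapley values", ["B. Author", "C. Author"],
                2021, "Example Conference", {"xai"}),
    LibraryItem("Placeholder paper on time series", ["D. Author"],
                2016, "Example Workshop", {"forecasting"}),
]

# All items tagged "xai" that were published in 2018 or later.
recent_xai = search(library, year_from=2018, tag="xai")
for item in recent_xai:
    print(item.year, item.title)
```

Real library managers offer far more (full-text search, PDF attachments, synchronization), but the core is the same: consistent metadata per item is what makes the library searchable.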

Literature items in Zotero. Top: A sortable and filterable list of items with title, authors and publication date. Bottom: Important metadata about the article.

Some researchers find it helpful to highlight passages in the article, write notes in the margins and add digital comments. Sometimes a dedicated color scheme is used to differentiate between semantically different concepts. For example, important core concepts of the article could be highlighted in yellow, while potentially interesting literature references are highlighted in red. Such a system can be arbitrarily complex. However, it is likely easier to start with a simple system and expand it only when necessary. An often underestimated risk is that a system becomes too complex to stick to over a long period of time. The best organizational system is worthless if the researcher does not have the time or patience to follow it. Here, the old principle known by the acronym KISS can be applied: Keep it simple and straightforward!

Self-organization, time management and note-taking

For new researchers, sifting through a large amount of possibly relevant literature can be daunting. However, the task gets easier with modern tools and rigorous principles. Most important is to stay focused and stick to the systems that have been put in place. The next blog post in this series will discuss tips and tools for time management and for organizing extracted knowledge in the form of notes.

Matthias Jakobs

13 June 2022


Matthias Jakobs

Research Focus: Trustworthy ML

What problems are you currently working on?
Providing guarantees in theory and application for explainability methods using Shapley values
Combining explainability models with Bayesian Neural Networks (BNN)

What are you particularly interested in?
Shining a light into the decision-making process of black-box models, giving users and experts confidence in the decisions produced by Neural Networks
