Is Python used in scientific research?

From the Science Student Council

Using powerful, open-source programming can streamline your research processes.

By Shawn Rhoads

Programming is an essential skill within modern psychological science research. Use of programming allows researchers to quickly and efficiently complete specific tasks that would otherwise demand hours of time. However, barriers such as access to resources and the time it takes to learn can prevent trainees from knowing how to start, understanding what is possible, and identifying an appropriate programming language.

This article aims to introduce Python as a reliable programming language in psychological research to users with minimal programming background. While Python is the focus in this article, it is one of many languages that can help boost research productivity. For instance, R is a similarly popular open-source programming language used in science, and excels in data organization, analysis and visualization.

Why Python?

Python is a powerful general-purpose programming language and is becoming an increasingly popular tool in research. It is intuitive to learn, has a flourishing online community and is open-source (free to you and others). Its popularity partly arises from its easy-to-use, versatile functionality. Python can accomplish most day-to-day research tasks and can be used at multiple steps of the research pipeline (e.g., running experiments with participants, data organization, data processing/manipulation, statistical analysis/modeling and visualization). Because one language can replace the different software programs otherwise needed for different tasks, Python can save researchers a significant amount of time and frustration.

Considering the movement toward open and reproducible science, Python offers an advantage over proprietary software, which often requires expensive licenses and is therefore inaccessible to some researchers. With Python, after you write and share your "code," others can easily access and use it without running into paywalls or licensing issues. This community-driven aspect allows developers to release third-party "packages" (also called "libraries"): easily shareable bundles of code (often including documentation, example data and tutorials) that extend Python's base functionality. Packages save you considerable time. If you have a problem to solve (e.g., statistics, plotting, data filtering), someone has likely already solved it and released a package that is open for anyone to use.

Python is currently available in versions 2.7 and 3.x. Python 3 is cleaner and faster, but note that some older third-party packages still only support 2.7. Current packages are typically written or updated for Python 3 (development of Python 2 has been discontinued). You can download your preferred version from the official Python website, or you may choose to install Python through Anaconda, a distribution that includes an environment manager and a collection of many useful preinstalled open-source packages. Because packages sometimes depend on other packages, Anaconda saves researchers from dealing with many of the compatibility issues that arise when developers update their packages.
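If you are unsure which version is installed on your computer, a minimal sketch like the one below reports it. It relies only on the sys module from Python's standard library; the variable names are illustrative.

```python
import sys

# Read the major and minor version numbers of the running interpreter.
major, minor = sys.version_info[:2]
print("Running Python {}.{}".format(major, minor))

# Most current packages assume Python 3, so flag an older interpreter early.
is_python3 = major >= 3
if not is_python3:
    print("This interpreter is Python 2; consider installing Python 3.")
```

Running this at the top of a script is one simple way to catch a version mismatch before a package import fails with a less obvious error.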

What can I do?

There are many research tasks programming with Python can accomplish. Below are some ways in which Python and its openly available packages can help the research process.

  • Data collection. Using Python for data collection is appealing because programming experiments gives users complete control over every aspect of a psychology task procedure. For example, users can create electronic surveys or behavioral experiments with flexibility over how they present visual or audio stimuli (e.g., shapes, text, images, sounds, animations, movies), record precise timing measurements (e.g., stimuli onsets and durations), and collect behavioral responses (e.g., onset of button presses, reaction times). PsychoPy (Peirce et al., 2019) is a Python package that allows researchers to run a wide range of neuroscience and psychology experiments. You can customize aspects of your experiments using PsychoPy's graphical user interface (Builder view). Alternatively, researchers can write code for the entire experiment from scratch. If you decide to write behavioral experiments using Python code, I recommend starting by finding online tutorials and editing others' code as a template for your experimental design.
  • Data processing and organization. Base Python comes with many tools for cleaning and organizing data, enabling you to iteratively create, move, copy or rename files and folders (directories). The os module allows users to use Python to interface with a computer's underlying operating system (e.g., Windows, Mac, Linux), which is especially useful when working with large amounts of data that are not stored in Excel spreadsheet-like formats. The pandas package is a flexible and intuitive tool that allows researchers to work with all sorts of data such as text data, comma-separated data or Excel-style spreadsheets. Data like these can be stored in "DataFrames," and pandas makes it easy to perform operations on both row- and column-labeled data, such as scoring questionnaire data or merging and reshaping datasets.
  • Data analysis. Python can also perform a wide variety of statistics for data analysis. Using pandas, for example, you can quickly compute pairwise Pearson's r correlations between columns of data (if observations are listed in rows). There are also more formal packages for statistics, including statsmodels, the scipy.stats module in SciPy and Pymer4.
  • Data visualization. Once you get the hang of it, the matplotlib.pyplot module in the Matplotlib package can create all sorts of plots. Seaborn is also a powerful and increasingly popular package based on Matplotlib that creates beautiful statistical graphics.
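To make the data organization and analysis steps above more concrete, here is a minimal sketch using pandas. It assumes pandas is installed (it comes with Anaconda). All participant IDs, column names and values are made up for illustration, and the scoring rule (summing three items, one of them reverse-scored) is a hypothetical example rather than any particular questionnaire.

```python
import pandas as pd

# Hypothetical questionnaire responses: one row per participant,
# one column per item (1-5 Likert scale). All values are made up.
responses = pd.DataFrame({
    "participant": ["p01", "p02", "p03", "p04"],
    "item1": [1, 2, 4, 5],
    "item2": [2, 2, 5, 4],
    "item3_reversed": [5, 4, 2, 1],
})

# Reverse-score the negatively worded item (on a 1-5 scale, 6 - x flips it).
responses["item3"] = 6 - responses["item3_reversed"]

# Score the questionnaire: sum the three items for each participant.
responses["total_score"] = responses[["item1", "item2", "item3"]].sum(axis=1)

# Hypothetical behavioral data stored in a second table.
behavior = pd.DataFrame({
    "participant": ["p01", "p02", "p03", "p04"],
    "mean_rt_ms": [650.0, 610.0, 540.0, 500.0],
})

# Merge the two datasets on the shared participant column.
merged = responses.merge(behavior, on="participant")

# Pairwise Pearson correlations between the two numeric columns of interest.
correlations = merged[["total_score", "mean_rt_ms"]].corr(method="pearson")
print(correlations)
```

The same DataFrame could then be passed to a statistics package such as statsmodels or plotted with Matplotlib or Seaborn, so a single table carries the data through the rest of the pipeline.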

How do I get started?

Learning how to program and learning a new programming language can be very different experiences. For instance, two different programming languages can perform the same task or produce a similar outcome, but the method may vary. This often occurs because languages have different "syntax," or rules for how code must be written. I recommend starting out by using flow charts to outline the tasks you want to achieve with programming. Flow charts can help you visualize your "algorithm," or a set of operations or instructions defined to produce a desired outcome. Thinking algorithmically will be crucial to your success. Once you have your algorithm, try coding it using the language's syntax.
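As a minimal sketch of moving from a flow chart to code, suppose the algorithm is: (1) look at each file in a data folder, (2) decide whether it is a .csv file that still needs a prefix, and (3) if so, rename it. The function name standardize_filenames, the "sub-" prefix and the file names are all hypothetical choices for this example. The sketch uses only Python's standard library and works in a temporary folder so that no real data are touched.

```python
import os
import tempfile


def standardize_filenames(folder, prefix="sub-"):
    """Rename every .csv file in `folder` so its name starts with `prefix`."""
    renamed = []
    # Step 1: look at each file in the folder (sorted for a predictable order).
    for name in sorted(os.listdir(folder)):
        # Step 2: decision point. Skip anything that is not a .csv file
        # or that already carries the prefix.
        if not name.endswith(".csv") or name.startswith(prefix):
            continue
        # Step 3: build the new name and rename the file.
        new_name = prefix + name
        os.rename(os.path.join(folder, name), os.path.join(folder, new_name))
        renamed.append(new_name)
    return renamed


# Try it on a temporary folder filled with empty, made-up files.
with tempfile.TemporaryDirectory() as folder:
    for name in ["01.csv", "02.csv", "notes.txt", "sub-03.csv"]:
        open(os.path.join(folder, name), "w").close()
    renamed_files = standardize_filenames(folder)
    final_listing = sorted(os.listdir(folder))

print(renamed_files)
print(final_listing)
```

Each numbered comment corresponds to one box in the flow chart, which makes it easier to check that the code does what the algorithm says before running it on real files.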

For scientists-in-training, writing your first line of code can be a daunting task. One helpful starting point is hands-on practice with small tasks throughout the week. As you get the hang of it, try moving on to more challenging tasks. An experienced graduate student or postdoctoral researcher in your lab or department might also be able to help you. Another helpful way to start is by editing others' code for your own use. Below are some more ideas to help you get started.

  • Enroll in a course at your institution. Most academic institutions offer courses to introduce students to programming. If your institution doesn’t offer a course, check if your institution is part of a consortium of colleges/universities within your region. If it is, you might find a relevant course nearby.
  • Register for a workshop. Find an academic workshop within your field. Typically, ads for new workshops are sent around on various listservs (try signing up for your APA division listserv). Some university departments organize local workshops to create training opportunities for students. If you are planning to attend an academic conference, check if it will be offering any workshops as well. Some workshops offer travel awards to assist trainees. Most programming workshops will also post their materials online, so be sure to check those if you cannot attend.
  • Find free resources online. There are many openly accessible resources online. A simple Google search such as "how to perform a t-test in python," for instance, can yield extremely helpful results. Online forums such as StackOverflow allow users to ask and answer questions about all kinds of programming languages. A list of helpful resources and tutorials is also available below.

Programming puts a diverse set of tools at your disposal as a researcher in psychological science. Learning how to program early can benefit your career, save you a great deal of time and allow you to begin adopting open and reproducible science practices. Start building your programming toolbox today and learn what you can accomplish in your research and beyond.

References

Peirce, J. W., Gray, J. R., Simpson, S., MacAskill, M. R., Höchenberger, R., Sogo, H., Kastman, E., & Lindeløv, J. (2019). PsychoPy2: Experiments in behavior made easy. Behavior Research Methods, 51(1), 195-203.

More helpful resources (Python and R)

Python

  • Google's Python Class
  • Beginner’s Guide to Python
  • Programming for Psychology in Python
  • 10 Minutes to pandas
  • Python For Beginners: Python's OS Module
  • Descriptive Statistics using Python
  • Programming for Psychologists (although syntax uses Python 2)
  • Codecademy: Learn Python 3
  • Plotly Python Open Source Graphing Library
  • SciPy Tutorial
  • Python for MATLAB Users
  • DartBrains: Introduction to Programming (also includes data analysis methods for fMRI experiments)
  • PsychoPy Resources
  • DataCamp Python Syntax Cheat Sheets (Basics, pandas, Matplotlib, seaborn)

R

  • Learning Statistics with R
  • R Resources for Psychologists
  • A Psychologist's Guide to R
  • Using R for Psychological Research
  • R Cheat Sheet (for language syntax) (PDF, 202KB)
  • ggplot2 Cheat Sheet (for syntax) (PDF, 1.17MB)

About the author

Shawn Rhoads is the social/personality representative to the APA Science Student Council. He is a rising third-year PhD student at Georgetown University. Find him on Twitter: @ShawnRhoads51.


The views expressed in this article are those of the author and do not reflect the opinions or policies of APA.

PSA is the monthly e-newsletter of the APA Science Directorate. It is read by psychologists, students, academic administrators, journalists and policymakers in Congress and federal science agencies.

Is Python used by scientists?

Python is a general-purpose language used by data scientists and developers alike, and its simple syntax makes it easy to collaborate across an organization. One reason people choose Python is that it helps them share and communicate their work with others. Another is its roots in academic research and statistical modeling.

Is Python used in NASA?

Yes. Python is one of the programming languages used by NASA.

Why is Python good for science?

What makes Python so valuable for scientific computation is not only Python's novice-friendly syntax, but also the many packages that allow many common programming tasks to be completed in dozens of lines of code rather than hundreds or thousands in other languages.