Workshop and Discussion Materials

Here are past discussion topics and workshops from BioData Club, sorted by the date they were given. Where possible, we have linked to recordings, presentations, and code.

Start Somewhere: Programming Fundamentals in Python

Presented by: Eric Earl, Marijane White

February 21, 2020

Recording Presentation

In this BioData Club workshop, attendees will gain skills and an understanding of key concepts foundational to computer programming. Instructors Eric Earl and Marijane White will teach with Python, a popular and friendly programming language. Topics covered will include variables, data types, looping, and conditionals. There will also be time for open questions. Join us if you’ve never programmed before but want to get started, or if you’re new to programming and want a refresher on the fundamentals and a chance to experiment with Python.
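For readers who want a taste of the four topics named above, here is a minimal sketch in Python. It is not taken from the workshop materials: the variable names, the sample values, and the `describe_counts` helper are all invented for illustration.

```python
# Hypothetical example values; none of this comes from the workshop itself.

# Variables and data types
sample_name = "control_01"   # str: text
read_count = 1500            # int: whole number
gc_fraction = 0.42           # float: decimal number
passed_qc = True             # bool: True or False


def describe_counts(counts, threshold=1000):
    """Label each count as 'high' or 'low' relative to a threshold."""
    labels = []
    for count in counts:            # looping: visit each item in the list
        if count >= threshold:      # conditional: choose a branch
            labels.append("high")
        else:
            labels.append("low")
    return labels


labels = describe_counts([250, 1500, 1000])
print(sample_name, type(read_count).__name__, labels)
```

Changing `threshold` or the list of counts and rerunning is a quick way to see how the loop and the conditional interact.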

How to Make a Reproducible Paper

Presented by: Aurora Blucher, Ted Laderas

January 10, 2020

Recording Presentation Code

In this workshop, Aurora Blucher will talk about making a publication reproducible. Come and learn about effective data management, building reproducible computing environments using Binder, and using RMarkdown notebooks to make reproducible result reports.

Data Storytelling

Presented by: Ted Laderas

November 15, 2019

Recording Presentation Code

You are making a figure for your paper and want it to be the best it can be. Come and learn techniques for communicating your findings clearly. Learn about the role of color, annotations, and simplifying your figures to communicate effectively. Please see the link for more information and how to RSVP for this event.

Data and Project Documentation with Read the Docs

Presented by: Robin Champieux, Eric Earl

October 18, 2019

Recording Presentation Code

Good data, project, or software documentation makes your work more discoverable, transparent, and reusable. Read the Docs makes documentation easy by automating the building, versioning, and hosting of your docs, plus it’s free and open source! In this hands-on BioData Club workshop, attendees will learn about documentation best practices and publish a documentation website using Read the Docs and Markdown. All OHSU community members and friends are welcome! No prior experience is required. RSVP and more info at the link above.

Python, SQL, and Pandas: Oh My!

Presented by: Lucille Moore, Daniel Yeager

June 21, 2019

Presentation Code

SQL is the most popular programming language for generating and interacting with relational databases and is therefore an extremely useful language to learn. This OHSU Library BioData Club workshop will provide a gentle introduction to relational databases and how to use SQL to make queries. Instructors Luci Moore and Dan Yaeger will guide attendees through using Python to create a database and query data in SQLite. At the end of the workshop, attendees will have access to a Jupyter Notebook and a basic script that they can modify to create and query their own databases.
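The workflow described above (create a database from Python, then query it with SQL) can be sketched with Python's built-in `sqlite3` module. This is an illustrative sketch, not the workshop's script: the `samples` table, its columns, the inserted rows, and both helper functions are assumptions made up for the example.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory SQLite database with a small, made-up table."""
    conn = sqlite3.connect(":memory:")  # nothing is written to disk
    conn.execute(
        "CREATE TABLE samples ("
        "id INTEGER PRIMARY KEY, tissue TEXT NOT NULL, reads INTEGER NOT NULL)"
    )
    # Hypothetical rows; the ? placeholders let sqlite3 quote values safely.
    conn.executemany(
        "INSERT INTO samples (tissue, reads) VALUES (?, ?)",
        [("liver", 1200), ("brain", 800), ("liver", 1500)],
    )
    conn.commit()
    return conn


def reads_by_tissue(conn, min_reads):
    """Sum reads per tissue, counting only rows with at least min_reads."""
    cursor = conn.execute(
        "SELECT tissue, SUM(reads) FROM samples "
        "WHERE reads >= ? GROUP BY tissue ORDER BY tissue",
        (min_reads,),
    )
    return cursor.fetchall()


conn = build_demo_db()
print(reads_by_tissue(conn, 1000))
conn.close()
```

Replacing `":memory:"` with a file path would persist the database between runs, which is closer to what you would want for your own data.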

Logistic Regression Workshop

Presented by: Crista Moreno

June 07, 2019

Presentation Code

In biomedical research, we often wish to classify data into two or more groups (e.g., healthy and diseased) based on a variety of measurement variables, but how do you determine whether the model you’ve selected is good? In this BioData Club workshop, instructor Crista Moreno will discuss the mathematics of logistic regression for binary classification modeling and how to prevent the harms of overfitting with cross-validation in R. Anyone with an interest in building a classification model for biomedical data is encouraged to use these materials! Prior experience with R and RStudio and a basic knowledge of classification modeling and mathematical functions will be helpful but are not required.

Data Exploration and Visualization with Tableau

Presented by: Connor Smith

May 17, 2019

Presentation Code

Tableau is a powerful visualization tool for exploring data and communicating information. It includes a suite of functionality relevant to data visualization novices and skilled programmers alike. This OHSU Library BioData Club workshop will provide a hands-on introduction to creating interactive visualizations with Tableau Public, the free version of the software. Instructor Connor Smith will cover how to load and prepare data, how to navigate the Tableau interface, and how to create basic visualization types and interactive dashboards.

Practical Visualization using R and ggplot2

Presented by: Kevin Watanabe-Smith, Emile Latour

April 12, 2019

Presentation Code

R and ggplot2 are tools for making impactful data visualizations. Some may fear the “code” learning curve, but with ggplot2’s “grammar of graphics” the syntax becomes intuitive and builds on itself. This workshop aims to introduce beginners to practical hands-on examples that can be applied to one’s own data analyses. Experienced users will have the opportunity to test their knowledge and learn new tricks and techniques. Come to this workshop to improve your research exploration and communication skills with instructors Kevin Watanabe-Smith and Emile Latour.

Better BioData with Ontologies and Linked Data

Presented by: Marijane White

February 15, 2019

Have you ever wondered what an ontology is and why anyone would use one? Who’s using ontologies, and how are they using them? How are ontologies created and maintained? How can you use an ontology to make your research data more reusable? Come to the next BioData Club to find out! After a brief introduction to the subject, we’ll walk through a modified version of the OBO Tutorial to get first-hand experience using biomedical ontologies to mark up research data for improved reuse. No previous experience with ontologies or linked data is required. Bring a laptop and your curiosity!

Data Scavenger Hunt

Presented by: Ted Laderas, Jessica Minnier, Thomas Frohwein

January 25, 2019

Presentation Code

Are you interested in data science but don’t know how to get started? Come learn about the power of exploring data at our workshop. We’ll use a publicly available dataset called NHANES (National Health and Nutrition Examination Survey) to learn how to answer questions about diabetes and depression in a friendly group setting. No previous experience is required. Bring a laptop and your curiosity! If you’d like to come, please RSVP by filling out our pre-session survey.

Git for Collaboration

Presented by: Philip Robinson

October 05, 2018

Presentation Code

This is a workshop introducing you to Git and GitHub. Learn the basics of Git by teaming up in groups and sorting panels from Edward Gorey’s Gashlycrumb Tinies.

Educational Resources

R-Bootcamp

This free four-part course introduces visualization, data wrangling, and simple statistics with tidyverse packages.

Discussion Topics

This is a list of discussion topics and talks hosted by BioData Club that we think were successful and may be of interest.

This is a paper by Greg Wilson that outlines how to improve your scientific software.

We had a discussion about Jeff Leek’s book How to be a Modern Scientist and what it means for students and postdocs today.

Alison Presmanes Hill gave a talk for our visualization hacky hour about how she slowly revised and improved a figure. Very funny and very informative.

We at BioData Club are cross-disciplinary by nature. What does it take to be a good cross-disciplinary collaborator?