
Data Analysis BML
Course Data Analysis and Visualization
Source: https://www.deviantart.com/gabimedia/art/3D-Illustration-of-human-brain-nerve-4-977052356, Created using AI tools. Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License
Contents
Links to data sets
Demo DataSaurus Dozen Dataset
Link to example portfolio
- Here you can find instructions for the final assignment with an example portfolio
- Here you can find instructions for the R part for Medical Diagnostics students
Installation of MS Excel, R and RStudio
Plotting in Excel on a MacBook
Schedule and Outline per Lesson
Introduction
The health sciences and biological sciences are undergoing a transformation, driven by the ever-growing flood of data generated from diverse sources. Genomic sequencing, proteomics, pharmacological data, clinical trials and wearable devices all contribute to a vast and complex landscape of information. To navigate this landscape effectively and gain valuable insights, researchers are increasingly turning to the powerful tools of data analysis and visualization.
Data analysis involves the systematic process of collecting, cleaning, processing, and interpreting health and biology related data. This allows researchers to uncover hidden patterns, identify trends, and understand complex relationships within the data. However, raw data can be overwhelming and difficult to interpret. This is where data visualization comes into play.
By transforming data into clear and concise visual representations such as graphs, charts, and maps, data visualization empowers users to:
- Gain deeper understanding: Visualizations make complex relationships and trends readily apparent, facilitating a more intuitive grasp of the data.
- Identify patterns and trends: Visualizations can readily reveal patterns and trends that might be missed in raw data, leading to new discoveries and potential breakthroughs.
- Communicate effectively: Complex data can be difficult to communicate clearly with text alone. Visualizations provide a powerful tool for conveying insights to colleagues, patients, and the public in a way that is both informative and engaging.
Data analysis and visualization are therefore critical tools in modern health sciences and biology. They enable researchers to unlock the potential of health-related data, leading to a better understanding of biological systems and to improved healthcare delivery.
In this module, you will learn to distinguish and analyze different types of data. You will also learn to visualize data in an attractive way for reports. The focus will be on using the spreadsheet program Microsoft Excel and the programming language R (using the Tidyverse framework). The assignments can also be completed with Python, but there will be no explanation of Python on this website.
R and Python are currently the two most used programming languages in the field. In this module, R is used to give you some familiarity with what data analysis in R makes possible, not to teach you all the ins and outs of programming. The assignments are designed to be completed with Excel and R. The module concludes with a final assignment that you will complete independently.
CureQ
This module is part of the CureQ project. CureQ is a consortium of partners that focuses on polyQ diseases, with the aim of enabling polyQ-targeting therapies and better predicting the onset and progression of disease in the different patient groups (early-onset, adult-onset and carriers of intermediate repeats). Hanze University is involved in two work packages of the CureQ project. During this course, we will often use datasets that are linked to various polyQ disorders.
Learning Outcomes
- You can explain the fundamental concepts of data, including data types, data structures, and the importance of data quality, and you can apply these concepts in the context of both Excel and R.
- You can import data from various sources (such as CSV, TXT, Excel files) into both Excel and R, and you are able to select the appropriate methods and functions for different data formats and structures. You can clean and prepare data for analysis in both Excel and R by applying techniques for identifying and handling missing values, outliers, inconsistencies, and duplicates.
- You can apply and interpret various data analysis functions in both Excel and R, including determining minimum and maximum values, calculating percentiles, selecting specific data based on criteria, applying conditional formatting to visually identify patterns, and sorting data to gain insights.
- You can create effective data visualizations in both Excel and R (such as bar charts, line charts, pie charts, and box plots) to communicate patterns, trends, and relationships in data, and you can customize the visualizations for a clear and purpose-driven presentation.
Some text on this web page is copied, adapted and modified from Wikipedia.org.
For some textual parts, AI (GPT) was used. The output was verified in all cases and modified where needed.
This web page is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Creative Commons License: CC BY-SA 4.0.