Evidence-Based Approaches to Public Health: Biostatistics – Descriptive Statistics: Data Visualization (Charts, Graphs, Histograms)
In this tutorial, we will explore the use of data visualization techniques, including charts, graphs, and histograms, to present and interpret data effectively. Visualizing data allows public health professionals to understand patterns, trends, and distributions within datasets and to communicate findings clearly. Understanding these methods is key for the Certified in Public Health (CPH) exam and for presenting public health data.
By the end of this tutorial, you will understand different types of data visualizations, how to create and interpret them, and when to use each type. Practice questions are included to help reinforce your knowledge.
Table of Contents:
- Introduction to Data Visualization
- Bar Charts
  - Definition of Bar Charts
  - When to Use Bar Charts
  - How to Create and Interpret Bar Charts
- Line Graphs
  - Definition of Line Graphs
  - When to Use Line Graphs
  - How to Create and Interpret Line Graphs
- Histograms
  - Definition of Histograms
  - When to Use Histograms
  - How to Create and Interpret Histograms
- Practice Questions
- Conclusion
1. Introduction to Data Visualization
Data visualization is the process of creating graphical representations of data to make complex data easier to understand. Charts, graphs, and histograms are widely used in public health research to display descriptive statistics and to identify trends, distributions, and relationships within datasets.
Effective data visualization helps communicate data insights to diverse audiences, from policymakers to the public. The choice of visualization depends on the type of data and the information you want to convey.
2. Bar Charts
Bar charts are used to display categorical data by representing each category as a bar, with the length of the bar corresponding to the frequency or value of that category. Bar charts are useful for comparing the sizes of different groups or categories.
2.1 Definition of Bar Charts
A bar chart consists of rectangular bars, where the length of each bar represents the magnitude or frequency of the corresponding category. Bars can be arranged vertically or horizontally, and they are typically separated by gaps to emphasize that the data are categorical rather than continuous.
2.2 When to Use Bar Charts
Bar charts are best used when you are comparing discrete categories or groups, such as age groups, gender, or types of diseases. They are especially useful for illustrating the frequency of occurrences or for comparing quantities across different categories.
2.3 How to Create and Interpret Bar Charts
To create a bar chart:
- Step 1: List the categories on the x-axis (for vertical bars) or y-axis (for horizontal bars).
- Step 2: Plot the frequency or value of each category using bars of appropriate lengths.
When interpreting a bar chart, compare the lengths of the bars to compare the values or frequencies across categories. For example, a bar chart might show the number of people vaccinated in different age groups, with the longest bar representing the age group with the most people vaccinated. Note that a count is not a rate: to compare vaccination rates, each bar would need to show the proportion of the age group that was vaccinated.
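The two steps above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the age groups, the counts, and the helper name `text_bar_chart` are all hypothetical, and the bars are drawn as text so the example needs nothing beyond the standard library.

```python
# Hypothetical counts of people vaccinated per age group (illustrative data only).
vaccinated = {"0-17": 120, "18-39": 340, "40-64": 410, "65+": 275}

def text_bar_chart(counts, scale=10):
    """Return one text bar per category; each '#' stands for `scale` units, rounded down."""
    width = max(len(label) for label in counts)
    lines = []
    for label, value in counts.items():
        # Step 1: the category label goes on the axis.
        # Step 2: the bar length is proportional to the category's value.
        bar = "#" * (value // scale)
        lines.append(f"{label:<{width}} | {bar} {value}")
    return lines

chart_lines = text_bar_chart(vaccinated)
print("\n".join(chart_lines))

# Interpretation: the longest bar marks the category with the largest count.
largest_group = max(vaccinated, key=vaccinated.get)
print("Largest group:", largest_group)
```

In practice a plotting library would draw the bars, but the logic is the same: one bar per category, with length proportional to the value.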
3. Line Graphs
Line graphs are used to display how a variable changes over time or across another continuous, ordered variable. A line graph connects individual data points with a line, allowing viewers to follow the change from one point to the next.
3.1 Definition of Line Graphs
A line graph uses points connected by a line to represent data in sequence. It is most commonly used to track changes over time, but it can also show how one variable changes across any continuous, ordered variable. Line graphs are particularly effective for showing trends and patterns over time.
3.2 When to Use Line Graphs
Line graphs are best used when you want to visualize changes over time or trends in continuous data. They are ideal for displaying data such as disease incidence rates over time, temperature changes, or population growth trends.
3.3 How to Create and Interpret Line Graphs
To create a line graph:
- Step 1: Plot time or continuous data on the x-axis.
- Step 2: Plot the corresponding values of the variable of interest on the y-axis.
- Step 3: Connect the data points with a line to visualize the trend.
When interpreting a line graph, look for trends over time or patterns in the data. For example, a line graph might show the rise and fall of flu cases throughout a year, with peaks during the winter months.
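As a rough sketch of these steps, the Python below orders a set of points by time and then reads off the trend that a line graph would show. The monthly flu counts are invented for illustration, and `segment_directions` is a hypothetical helper name rather than part of any library.

```python
# Hypothetical monthly flu case counts for one year (illustrative data only).
flu_cases = {
    1: 420, 2: 380, 3: 250, 4: 140, 5: 80, 6: 50,
    7: 40, 8: 45, 9: 90, 10: 160, 11: 270, 12: 390,
}

# Steps 1 and 2: order the points by time (x) and pair each with its value (y).
points = sorted(flu_cases.items())

def segment_directions(pts):
    """Label each segment between consecutive points as rising, falling, or flat."""
    labels = []
    # Step 3: each pair of neighbouring points is one segment of the line.
    for (_, y0), (_, y1) in zip(pts, pts[1:]):
        if y1 > y0:
            labels.append("rising")
        elif y1 < y0:
            labels.append("falling")
        else:
            labels.append("flat")
    return labels

directions = segment_directions(points)
peak_month, peak_cases = max(points, key=lambda p: p[1])
trough_month, trough_cases = min(points, key=lambda p: p[1])

print("Segments:", directions)
print("Peak: month", peak_month, "with", peak_cases, "cases")
print("Trough: month", trough_month, "with", trough_cases, "cases")
```

With this made-up series the line falls from January to July and rises again toward December, which is the winter-peak pattern described above.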
4. Histograms
Histograms are used to display the distribution of continuous data. They group data into intervals (bins) and display the frequency of data points that fall within each interval. Histograms provide a visual representation of the shape of the data distribution, such as whether the data are skewed or normally distributed.
4.1 Definition of Histograms
A histogram is similar to a bar chart, but it is used to display continuous data rather than categorical data. The x-axis represents the intervals (bins) of the continuous data, and the y-axis represents the frequency of data points within each bin. Unlike bar charts, the bars in a histogram touch each other to indicate that the data are continuous.
4.2 When to Use Histograms
Histograms are best used when you want to visualize the distribution of continuous data. They are particularly useful for showing the frequency of data points within specified ranges and for identifying patterns such as skewness, modality, or the presence of outliers.
4.3 How to Create and Interpret Histograms
To create a histogram:
- Step 1: Divide the range of the data into equal-width, non-overlapping intervals (bins) that together cover the full range.
- Step 2: Count how many data points fall within each interval.
- Step 3: Draw bars to represent the frequency of data points within each bin.
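When interpreting a histogram, look at the shape of the distribution. For example, a histogram might show data points clustered symmetrically around a central value in a bell shape, which suggests an approximately normal distribution, or it might reveal a skewed distribution with a longer tail on one side. Clustering around a central value is not enough on its own to indicate normality, and the apparent shape can change with the choice of bin width.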
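The three steps above can be sketched as follows. This is a minimal example under stated assumptions: the ages are invented, the helper name `histogram_counts` is hypothetical, and the bin limits (10 to 90 in eight bins of width 10) are chosen only to keep the arithmetic easy to follow. Each bin includes its lower edge and excludes its upper edge, except the last bin, which also includes the maximum.

```python
# Hypothetical ages of survey respondents (illustrative data only).
ages = [18, 22, 25, 27, 31, 34, 35, 38, 41, 42, 44, 47, 52, 58, 63, 71, 80]

def histogram_counts(values, low, high, num_bins):
    """Count values in `num_bins` equal-width bins covering low..high."""
    if high <= low or num_bins < 1:
        raise ValueError("need high > low and at least one bin")
    # Step 1: divide the range into equal-width bins.
    bin_width = (high - low) / num_bins
    edges = [low + i * bin_width for i in range(num_bins + 1)]
    counts = [0] * num_bins
    # Step 2: count how many values fall in each bin.
    for v in values:
        if not low <= v <= high:
            raise ValueError(f"value {v} is outside the bin range")
        # A value equal to `high` goes in the last bin, not a new one.
        index = min(int((v - low) / bin_width), num_bins - 1)
        counts[index] += 1
    return edges, counts

edges, counts = histogram_counts(ages, low=10, high=90, num_bins=8)

# Step 3: draw one bar per bin; adjacent bins share an edge, so the bars touch.
for left, right, count in zip(edges, edges[1:], counts):
    bar = "#" * count
    print(f"{int(left):>2}-{int(right):<2} | {bar} {count}")
```

For this invented sample, most ages fall in the 30s and 40s and the counts thin out toward the older bins, so the tail is longer on the right.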
5. Practice Questions
Test your understanding of data visualization with these practice questions. Try answering them before checking the solutions.
Question 1:
A public health researcher wants to compare the number of flu cases in different cities. What type of graph should the researcher use?
Answer 1:
The researcher should use a bar chart to compare the number of flu cases across different cities, as bar charts are best for comparing categorical data.
Question 2:
A study tracks the daily number of new COVID-19 cases over the course of a year. What type of graph is most appropriate to display this data?
Answer 2:
A line graph is most appropriate, as it can display trends in the number of cases over time.
Question 3:
A public health researcher is analyzing the distribution of ages in a population. The ages range from 18 to 80. What type of visualization should the researcher use to show the distribution of ages?
Answer 3:
A histogram should be used, as it can show the frequency of ages within specified intervals, making it easy to see the overall distribution of the data.
6. Conclusion
Data visualization is a critical tool for understanding and communicating public health data. Charts, graphs, and histograms help present complex data in a clear and interpretable way, allowing public health professionals to identify trends, distributions, and relationships in their data. Each type of visualization has its strengths, and choosing the right one depends on the type of data and the message you want to convey.
Remember:
- Bar charts are ideal for comparing categorical data or groups.
- Line graphs are best for showing trends over time or across a continuous variable.
- Histograms are useful for visualizing the distribution of continuous data.
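These rules of thumb can be written down as a small decision helper. The function `suggest_chart` and its arguments are hypothetical and deliberately simplified: real choices also depend on the audience and the message, and a continuous variable calls for a histogram only when the goal is to show its distribution.

```python
def suggest_chart(data_type, over_time=False):
    """Rule-of-thumb chart choice based on the summary above (hypothetical helper)."""
    if over_time:
        # Any variable tracked across time is usually shown as a line graph.
        return "line graph"
    if data_type == "categorical":
        return "bar chart"
    if data_type == "continuous":
        return "histogram"
    raise ValueError(f"unknown data type: {data_type!r}")

# The three practice questions from this tutorial:
q1 = suggest_chart("categorical")            # flu cases compared across cities
q2 = suggest_chart("count", over_time=True)  # daily new COVID-19 cases over a year
q3 = suggest_chart("continuous")             # distribution of ages
print(q1, q2, q3, sep="\n")
```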
Final Tip for the CPH Exam:
Make sure you understand which type of visualization suits which kind of underlying data. A good way to practice is to look at several published studies, note the types of data they use, and examine how those data are displayed. Ask yourself whether each choice was appropriate or whether a better option was available.
Humanities Moment
The featured image for this CPH Focus is Fruit Still Life With Two Parrots (1623) by Balthasar van der Ast (Dutch, 1593-1657). Balthasar van der Ast, a Dutch Golden Age painter born in Middelburg around 1593-1594, was a pioneer of shell still lifes and known for his intricate still life paintings of flowers, fruit, insects, and lizards. Trained by his brother-in-law Ambrosius Bosschaert the Elder, van der Ast influenced the “Bosschaert dynasty” and other prominent painters, eventually settling in Delft, where he joined the Guild of St. Luke and lived until his death in 1657.