Session 2: Practice
- Create a new Quarto document and name it “vocab_data_analysis.qmd”.
- Create three sections with subheaders:
- R setup
- Data
- Initial data analysis
- Load the {car} package and the Vocab data set:
library(car)
data(Vocab)
- Add text to the Quarto file explaining what your code does (e.g. why does the script load the car package?)
- Make sure you assign code to the right section of your notebook (e.g. “R setup” vs. “Data”)
- Inspect the data frame:
str(Vocab)
?Vocab
- Add text that briefly describes the contents of the data set. For information on formatting, refer to the Markdown basics
- Use boldface to refer to the variables (Section “Text formatting”)
- Use bullet points to list the variables (Section “Lists”)
Tables
- Use a table to summarize the data by sex. Use piping to create a summary table that lists the following quantities:
- Number of respondents
- Average number of years of education
- Average number of points on the vocabulary “test”
- Make sure you use short and informative labels for the columns in the summary table. Revise your code if necessary.
- Add text to your quarto notebook to briefly describe the results of your data summary. Are there noticeable differences between male and female respondents?
- Render the document and make sure everything looks OK.
- Use a table to summarize the data by year. Make sure you use copy-and-paste as much as possible.
- Add text that briefly describes the results of your data summary. Can we see trends over time?
- Render the document and make sure everything looks OK.
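One possible way to build the summary by sex is sketched below. It assumes the dplyr package is installed (this sheet itself only names car), and the object name summary_by_sex and the column labels are illustrative choices rather than required ones.

```r
library(car)    # attaches carData, which provides the Vocab data set
library(dplyr)  # assumption: dplyr is installed; it supplies group_by(), summarise() and %>%

data(Vocab)

# Pipe the data frame into a grouped summary: one row per level of sex
summary_by_sex <- Vocab %>%
  group_by(sex) %>%
  summarise(
    n               = n(),                             # number of respondents
    mean_education  = mean(education, na.rm = TRUE),   # average years of education
    mean_vocabulary = mean(vocabulary, na.rm = TRUE)   # average vocabulary score
  )

summary_by_sex
```

Copying this code and replacing group_by(sex) with group_by(year) should give the corresponding summary by year.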
Graphs
- Draw a bar chart showing the number of respondents:
- by year
- by sex
- Add text to describe what the graph shows.
- Render the document and make sure everything looks OK.
- Draw a histogram showing the distribution of:
- years of education
- score on the vocabulary “test”
- Add text to describe what the graph shows.
- Render the document and make sure everything looks OK.
- Draw a boxplot showing the distribution of test scores across time. If you need help with this, refer to the R graphics cookbook (Section 2.5), which is available for free online.
- Make sure year is assigned to the x-axis and vocabulary to the y-axis.
- Add text to describe what the graph shows. Have scores improved over time?
- Render the document and make sure everything looks OK.
- Draw a boxplot showing the distribution of years of education across time. Make sure you copy-and-paste as much as possible.
- Add text to describe what the graph shows. Has the level of education increased over time?
- Render the document and make sure everything looks OK.
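The three chart types in this section can be sketched as follows. This is a minimal example that assumes the ggplot2 package is installed; the object names (p_bar, p_hist, p_box) and the bin width of 1 are illustrative choices. Because year is stored as a number in Vocab, it is wrapped in factor() so that each survey year gets its own bar or box.

```r
library(car)      # attaches carData, which provides the Vocab data set
library(ggplot2)  # assumption: ggplot2 is installed

data(Vocab)

# Bar chart: number of respondents by year (geom_bar() counts rows per category)
p_bar <- ggplot(Vocab, aes(x = factor(year))) +
  geom_bar() +
  labs(x = "Year", y = "Number of respondents")

# Histogram: distribution of years of education, one bin per year of schooling
p_hist <- ggplot(Vocab, aes(x = education)) +
  geom_histogram(binwidth = 1) +
  labs(x = "Years of education", y = "Count")

# Boxplot: vocabulary score (y-axis) for each survey year (x-axis)
p_box <- ggplot(Vocab, aes(x = factor(year), y = vocabulary)) +
  geom_boxplot() +
  labs(x = "Year", y = "Vocabulary score")

p_bar
p_hist
p_box
```

The remaining plots follow by copy-and-paste: swapping factor(year) for sex in the bar chart gives respondents by sex, and swapping the variable in aes() gives the other histogram or boxplot.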
Additional tasks (use the R graphics cookbook for help)
- Create a proportional stacked bar chart to see whether the distribution of male and female participants is balanced across years. (Section 3.8)
- Change to a black-and-white theme:
theme_bw()
- Copy the code you used to draw a histogram showing the distribution of years of education. Now use faceting to draw separate histograms for male and female respondents. (Section 11.1)
- Render the document and make sure everything looks OK.
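A hedged sketch of the additional tasks, again assuming ggplot2 is installed. The object names p_stack and p_facet are illustrative. Here position = "fill" is used on the raw data to scale each bar to a total of 1, which is one of several ways to obtain a proportional stacked bar chart.

```r
library(car)      # attaches carData, which provides the Vocab data set
library(ggplot2)  # assumption: ggplot2 is installed

data(Vocab)

# Proportional stacked bar chart: share of each sex within every survey year
p_stack <- ggplot(Vocab, aes(x = factor(year), fill = sex)) +
  geom_bar(position = "fill") +   # rescale each bar so the stacked segments sum to 1
  labs(x = "Year", y = "Proportion of respondents") +
  theme_bw()                      # black-and-white theme

# Faceted histogram: one panel per level of sex
p_facet <- ggplot(Vocab, aes(x = education)) +
  geom_histogram(binwidth = 1) +
  facet_wrap(~ sex) +
  labs(x = "Years of education", y = "Count") +
  theme_bw()

p_stack
p_facet
```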
Further tasks:
So far we haven’t spent much time on improving the top matter of the Quarto document. Scroll to the top of the Quarto notebook. At the moment, it just contains two entries: title: and format:.
- Add an entry author:, giving your name in quotation marks
- Add an entry date:. To automatically add the date, enter Sys.time() (without quotation marks)
- Render the document and make sure everything looks OK.
- For nicer date formatting, enter format(Sys.time(), "%d %B %Y") (without quotation marks)
- Add an entry abstract:, giving a short description of the purpose of this Quarto notebook (in quotation marks)