Session 2: Exploring dataframes, creating tables and graphs
{dplyr} package
{ggplot2}
PhDPublications dataset
AER package (that’s where the data are)
Load the packages AER and tidyverse:
Load the data and copy it to d: type d <- Ph and hit the Tab key to autocomplete the dataset name.
Explore the dataframe with str() and head().
Variables:
articles: # articles published during last 3 years of PhD
gender
married
kids: # of children less than 6 years old
prestige: prestige of graduate program
mentor: # articles published by mentor
'data.frame': 915 obs. of 6 variables:
$ articles: int 0 0 0 0 0 0 0 0 0 0 ...
$ gender : Factor w/ 2 levels "male","female": 1 2 2 1 2 2 2 1 1 2 ...
$ married : Factor w/ 2 levels "no","yes": 2 1 1 2 1 2 1 2 1 2 ...
$ kids : int 0 0 0 1 0 2 0 2 0 0 ...
$ prestige: num 2.52 2.05 3.75 1.18 3.75 ...
$ mentor : int 7 6 6 3 26 2 3 4 6 0 ...
- attr(*, "datalabel")= chr "Academic Biochemists / S Long"
- attr(*, "time.stamp")= chr "30 Jan 2001 10:49"
- attr(*, "formats")= chr [1:6] "%9.0g" "%9.0g" "%9.0g" "%9.0g" ...
- attr(*, "types")= int [1:6] 98 98 98 98 102 98
- attr(*, "val.labels")= chr [1:6] "" "sexlbl" "marlbl" "" ...
- attr(*, "var.labels")= chr [1:6] "Articles in last 3 yrs of PhD" "Gender: 1=female 0=male" "Married: 1=yes 0=no" "Number of children < 6" ...
- attr(*, "version")= int 6
- attr(*, "label.table")=List of 6
..$ marlbl: Named num [1:2] 0 1
.. ..- attr(*, "names")= chr [1:2] "Single" "Married"
..$ sexlbl: Named num [1:2] 0 1
.. ..- attr(*, "names")= chr [1:2] "Men" "Women"
..$ : NULL
..$ : NULL
..$ : NULL
..$ : NULL
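The loading and inspection steps above can be sketched as a short script. This is a minimal sketch, assuming the AER and tidyverse packages are already installed; the object name d follows the slides.

```r
# Assumes AER and tidyverse are installed,
# e.g. via install.packages(c("AER", "tidyverse")).
library(AER)        # provides the PhDPublications dataset
library(tidyverse)  # loads dplyr, ggplot2 and related packages

data("PhDPublications")  # make the dataset available in the workspace
d <- PhDPublications     # copy it under a shorter name

str(d)   # structure: variable names, types, first values
head(d)  # first six rows
```

Working on the copy d keeps later commands short and leaves the original PhDPublications object untouched.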
### R setup (load packages)
### Data (load data)
### Initial data analysis (tables, graphs)
dplyr
tidyverse
|> (pipe operator)
group_by(): divide the dataset
summarize(): summarize subsets
n(): count how many there are
mean(), median(), sd(), max()
d |>
  group_by(kids) |>
  summarise(
    N = n(),
    mean_pubs = mean(articles),
    median_pubs = median(articles),
    sd_pubs = sd(articles),
    max_pubs = max(articles))
# A tibble: 4 × 6
   kids     N mean_pubs median_pubs sd_pubs max_pubs
  <int> <int>     <dbl>       <dbl>   <dbl>    <int>
1     0   599     1.72            1   1.93        19
2     1   195     1.76            1   2.05        12
3     2   105     1.54            1   1.74        11
4     3    16     0.812           1   0.911        3
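The same group_by() and summarise() pattern extends to more than one grouping variable. The following is a minimal sketch, assuming AER and dplyr are installed and R is version 4.1 or later (for the |> pipe); the name by_gender_married is hypothetical and chosen only for illustration.

```r
library(dplyr)

# Load the dataset directly from the AER package (assumes AER is installed)
data("PhDPublications", package = "AER")
d <- PhDPublications

# One row per combination of gender and marital status
by_gender_married <- d |>
  group_by(gender, married) |>
  summarise(
    N = n(),                      # number of observations in the group
    mean_pubs = mean(articles),   # average number of articles
    sd_pubs = sd(articles),       # spread of articles within the group
    .groups = "drop")             # return an ungrouped tibble

by_gender_married
```

Setting .groups = "drop" returns a plain tibble, which avoids surprises if further summaries are chained on afterwards.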
ggplot2
tidyverse
Time for practice! The tasks are available here (there is also a link on my website).
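As a warm-up for the tasks, here is a minimal ggplot2 sketch of the distribution of published articles. It assumes AER and ggplot2 are installed; the choice of a bar chart and the axis labels are illustrative, not taken from the slides.

```r
library(ggplot2)

# Load the dataset directly from the AER package (assumes AER is installed)
data("PhDPublications", package = "AER")
d <- PhDPublications

# Bar chart: how many observations have 0, 1, 2, ... articles
p <- ggplot(d, aes(x = articles)) +
  geom_bar() +
  labs(
    x = "Articles published during last 3 years of PhD",
    y = "Number of observations")

p  # printing the object draws the plot
```

Storing the plot in p before printing makes it easy to add layers later, for example facet_wrap(~ gender) to compare groups.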