class: center, middle, inverse, title-slide

# Summarizing data with dplyr
### SuffolkEcon

---

# Chaining commands with the pipe

The basic idea behind the package `dplyr` is to chain commands together through the pipe operator (`%>%`):

```
data %>%
  then do something %>%
  then do another thing %>%
  then do something totally different %>%
  then summarize
```

Chains can be modified as needed.

---
class: inverse, center, middle

# 1. Summarize

`summarise()`

---
count: false

### Mean

.panel1-p1-auto[
```r
# Data
*gapminder
```
]

.panel2-p1-auto[
```
# A tibble: 1,704 x 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.
# … with 1,694 more rows
```
]

---
count: false

### Mean

.panel1-p1-auto[
```r
# Data
gapminder %>%
  # average life expectancy over time
* summarise(mean(lifeExp))
```
]

.panel2-p1-auto[
```
# A tibble: 1 x 1
  `mean(lifeExp)`
            <dbl>
1            59.5
```
]

<style>
.panel1-p1-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-p1-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-p1-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% }
</style>

---
count: false

### Mean, median, standard deviation

.panel1-p2-auto[
```r
# Data
*gapminder
```
]

.panel2-p2-auto[
```
# A tibble: 1,704 x 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.
# … with 1,694 more rows
```
]

---
count: false

### Mean, median, standard deviation

.panel1-p2-auto[
```r
# Data
gapminder %>%
  # average/median/sd life expectancy over time
* summarise(
*   mean(lifeExp),
*   median(lifeExp),
*   sd(lifeExp)
* )
```
]

.panel2-p2-auto[
```
# A tibble: 1 x 3
  `mean(lifeExp)` `median(lifeExp)` `sd(lifeExp)`
            <dbl>             <dbl>         <dbl>
1            59.5              60.7          12.9
```
]

<style>
.panel1-p2-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-p2-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-p2-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% }
</style>

---
count: false

### Nicer column names

.panel1-p22-auto[
```r
# Data
*gapminder
```
]

.panel2-p22-auto[
```
# A tibble: 1,704 x 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.
# … with 1,694 more rows
```
]

---
count: false

### Nicer column names

.panel1-p22-auto[
```r
# Data
gapminder %>%
  # average/median/sd life expectancy over time
* summarise(
*   average_life = mean(lifeExp),
*   median_life = median(lifeExp),
*   sd_life = sd(lifeExp)
* )
```
]

.panel2-p22-auto[
```
# A tibble: 1 x 3
  average_life median_life sd_life
         <dbl>       <dbl>   <dbl>
1         59.5        60.7    12.9
```
]

<style>
.panel1-p22-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-p22-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-p22-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% }
</style>

---
class: inverse, center, middle

# 2. Group observations

`group_by()` to `summarise()`

---
count: false

### Mean by continent

.panel1-p3-auto[
```r
# Data
*gapminder
```
]

.panel2-p3-auto[
```
# A tibble: 1,704 x 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.
# … with 1,694 more rows
```
]

---
count: false

### Mean by continent

.panel1-p3-auto[
```r
# Data
gapminder %>%
  # group by continent
* group_by(continent)
```
]

.panel2-p3-auto[
```
# A tibble: 1,704 x 6
# Groups:   continent [5]
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.
# … with 1,694 more rows
```
]

---
count: false

### Mean by continent

.panel1-p3-auto[
```r
# Data
gapminder %>%
  # group by continent
  group_by(continent) %>%
  # average
* summarise(mean(lifeExp))
```
]

.panel2-p3-auto[
```
# A tibble: 5 x 2
  continent `mean(lifeExp)`
  <fct>               <dbl>
1 Africa               48.9
2 Americas             64.7
3 Asia                 60.1
4 Europe               71.9
5 Oceania              74.3
```
]

<style>
.panel1-p3-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-p3-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-p3-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% }
</style>

---
count: false

### Multiple groupings

.panel1-p4-rotate[
```r
# Data
gapminder %>%
  # group by ...
* group_by() %>%
  # average
  summarise(mean(lifeExp))
```
]

.panel2-p4-rotate[
```
# A tibble: 1 x 1
  `mean(lifeExp)`
            <dbl>
1            59.5
```
]

---
count: false

### Multiple groupings

.panel1-p4-rotate[
```r
# Data
gapminder %>%
  # group by ...
* group_by(continent) %>%
  # average
  summarise(mean(lifeExp))
```
]

.panel2-p4-rotate[
```
# A tibble: 5 x 2
  continent `mean(lifeExp)`
  <fct>               <dbl>
1 Africa               48.9
2 Americas             64.7
3 Asia                 60.1
4 Europe               71.9
5 Oceania              74.3
```
]

---
count: false

### Multiple groupings

.panel1-p4-rotate[
```r
# Data
gapminder %>%
  # group by ...
* group_by(continent, year) %>%
  # average
  summarise(mean(lifeExp))
```
]

.panel2-p4-rotate[
```
# A tibble: 60 x 3
# Groups:   continent [5]
   continent  year `mean(lifeExp)`
   <fct>     <int>           <dbl>
 1 Africa     1952            39.1
 2 Africa     1957            41.3
 3 Africa     1962            43.3
 4 Africa     1967            45.3
 5 Africa     1972            47.5
 6 Africa     1977            49.6
 7 Africa     1982            51.6
 8 Africa     1987            53.3
 9 Africa     1992            53.6
10 Africa     1997            53.6
# … with 50 more rows
```
]

<style>
.panel1-p4-rotate { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-p4-rotate { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-p4-rotate { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% }
</style>

---
class: inverse, center, middle

# 3. Filter observations

`filter()` to `group_by()` to `summarise()`

---
count: false

### Filter

.panel1-p5-auto[
```r
# Data
*gapminder
```
]

.panel2-p5-auto[
```
# A tibble: 1,704 x 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.
# … with 1,694 more rows
```
]

---
count: false

### Filter

.panel1-p5-auto[
```r
# Data
gapminder %>%
  # keep just Oceania
* filter(continent == "Oceania")
```
]

.panel2-p5-auto[
```
# A tibble: 24 x 6
   country   continent  year lifeExp      pop gdpPercap
   <fct>     <fct>     <int>   <dbl>    <int>     <dbl>
 1 Australia Oceania    1952    69.1  8691212    10040.
 2 Australia Oceania    1957    70.3  9712569    10950.
 3 Australia Oceania    1962    70.9 10794968    12217.
 4 Australia Oceania    1967    71.1 11872264    14526.
 5 Australia Oceania    1972    71.9 13177000    16789.
 6 Australia Oceania    1977    73.5 14074100    18334.
 7 Australia Oceania    1982    74.7 15184200    19477.
 8 Australia Oceania    1987    76.3 16257249    21889.
 9 Australia Oceania    1992    77.6 17481977    23425.
10 Australia Oceania    1997    78.8 18565243    26998.
# … with 14 more rows
```
]

---
count: false

### Filter

.panel1-p5-auto[
```r
# Data
gapminder %>%
  # keep just Oceania
  filter(continent == "Oceania") %>%
  # group by country
* group_by(country)
```
]

.panel2-p5-auto[
```
# A tibble: 24 x 6
# Groups:   country [2]
   country   continent  year lifeExp      pop gdpPercap
   <fct>     <fct>     <int>   <dbl>    <int>     <dbl>
 1 Australia Oceania    1952    69.1  8691212    10040.
 2 Australia Oceania    1957    70.3  9712569    10950.
 3 Australia Oceania    1962    70.9 10794968    12217.
 4 Australia Oceania    1967    71.1 11872264    14526.
 5 Australia Oceania    1972    71.9 13177000    16789.
 6 Australia Oceania    1977    73.5 14074100    18334.
 7 Australia Oceania    1982    74.7 15184200    19477.
 8 Australia Oceania    1987    76.3 16257249    21889.
 9 Australia Oceania    1992    77.6 17481977    23425.
10 Australia Oceania    1997    78.8 18565243    26998.
# … with 14 more rows
```
]

---
count: false

### Filter

.panel1-p5-auto[
```r
# Data
gapminder %>%
  # keep just Oceania
  filter(continent == "Oceania") %>%
  # group by country
  group_by(country) %>%
  # average
* summarise(mean(lifeExp))
```
]

.panel2-p5-auto[
```
# A tibble: 2 x 2
  country     `mean(lifeExp)`
  <fct>                 <dbl>
1 Australia              74.7
2 New Zealand            74.0
```
]

<style>
.panel1-p5-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-p5-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-p5-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% }
</style>

---
count: false

### Multiple filters

.panel1-p55-auto[
```r
# Data
*gapminder
```
]

.panel2-p55-auto[
```
# A tibble: 1,704 x 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.
# … with 1,694 more rows
```
]

---
count: false

### Multiple filters

.panel1-p55-auto[
```r
# Data
gapminder %>%
  # keep just Oceania after 1994
* filter(continent == "Oceania" & year > 1994)
```
]

.panel2-p55-auto[
```
# A tibble: 6 x 6
  country     continent  year lifeExp      pop gdpPercap
  <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
1 Australia   Oceania    1997    78.8 18565243    26998.
2 Australia   Oceania    2002    80.4 19546792    30688.
3 Australia   Oceania    2007    81.2 20434176    34435.
4 New Zealand Oceania    1997    77.6  3676187    21050.
5 New Zealand Oceania    2002    79.1  3908037    23190.
6 New Zealand Oceania    2007    80.2  4115771    25185.
```
]

---
count: false

### Multiple filters

.panel1-p55-auto[
```r
# Data
gapminder %>%
  # keep just Oceania after 1994
  filter(continent == "Oceania" & year > 1994) %>%
  # group by country and year
* group_by(country, year)
```
]

.panel2-p55-auto[
```
# A tibble: 6 x 6
# Groups:   country, year [6]
  country     continent  year lifeExp      pop gdpPercap
  <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
1 Australia   Oceania    1997    78.8 18565243    26998.
2 Australia   Oceania    2002    80.4 19546792    30688.
3 Australia   Oceania    2007    81.2 20434176    34435.
4 New Zealand Oceania    1997    77.6  3676187    21050.
5 New Zealand Oceania    2002    79.1  3908037    23190.
6 New Zealand Oceania    2007    80.2  4115771    25185.
```
]

---
count: false

### Multiple filters

.panel1-p55-auto[
```r
# Data
gapminder %>%
  # keep just Oceania after 1994
  filter(continent == "Oceania" & year > 1994) %>%
  # group by country and year
  group_by(country, year) %>%
  # average
* summarise(mean(lifeExp))
```
]

.panel2-p55-auto[
```
# A tibble: 6 x 3
# Groups:   country [2]
  country      year `mean(lifeExp)`
  <fct>       <int>           <dbl>
1 Australia    1997            78.8
2 Australia    2002            80.4
3 Australia    2007            81.2
4 New Zealand  1997            77.6
5 New Zealand  2002            79.1
6 New Zealand  2007            80.2
```
]

<style>
.panel1-p55-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-p55-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-p55-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% }
</style>

---
class: inverse, center, middle

# 4. Mutate observations (i.e., create new columns)

`mutate()` to `filter()` to `group_by()` to `summarise()`

---
count: false

### Mutate

.panel1-p66-auto[
```r
# Data
*gapminder
```
]

.panel2-p66-auto[
```
# A tibble: 1,704 x 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.
# … with 1,694 more rows
```
]

---
count: false

### Mutate

.panel1-p66-auto[
```r
# Data
gapminder %>%
  # create new variable: log GDP per capita
* mutate(log_gdpPercap = log(gdpPercap))
```
]

.panel2-p66-auto[
```
# A tibble: 1,704 x 7
   country     continent  year lifeExp      pop gdpPercap log_gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>         <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.          6.66
 2 Afghanistan Asia       1957    30.3  9240934      821.          6.71
 3 Afghanistan Asia       1962    32.0 10267083      853.          6.75
 4 Afghanistan Asia       1967    34.0 11537966      836.          6.73
 5 Afghanistan Asia       1972    36.1 13079460      740.          6.61
 6 Afghanistan Asia       1977    38.4 14880372      786.          6.67
 7 Afghanistan Asia       1982    39.9 12881816      978.          6.89
 8 Afghanistan Asia       1987    40.8 13867957      852.          6.75
 9 Afghanistan Asia       1992    41.7 16317921      649.          6.48
10 Afghanistan Asia       1997    41.8 22227415      635.          6.45
# … with 1,694 more rows
```
]

---
count: false

### Mutate

.panel1-p66-auto[
```r
# Data
gapminder %>%
  # create new variable: log GDP per capita
  mutate(log_gdpPercap = log(gdpPercap)) %>%
  # keep just Oceania after 1994
* filter(continent == "Oceania" & year > 1994)
```
]

.panel2-p66-auto[
```
# A tibble: 6 x 7
  country     continent  year lifeExp      pop gdpPercap log_gdpPercap
  <fct>       <fct>     <int>   <dbl>    <int>     <dbl>         <dbl>
1 Australia   Oceania    1997    78.8 18565243    26998.          10.2
2 Australia   Oceania    2002    80.4 19546792    30688.          10.3
3 Australia   Oceania    2007    81.2 20434176    34435.          10.4
4 New Zealand Oceania    1997    77.6  3676187    21050.          9.95
5 New Zealand Oceania    2002    79.1  3908037    23190.          10.1
6 New Zealand Oceania    2007    80.2  4115771    25185.          10.1
```
]

---
count: false

### Mutate

.panel1-p66-auto[
```r
# Data
gapminder %>%
  # create new variable: log GDP per capita
  mutate(log_gdpPercap = log(gdpPercap)) %>%
  # keep just Oceania after 1994
  filter(continent == "Oceania" & year > 1994) %>%
  # group by country and year
* group_by(country, year)
```
]

.panel2-p66-auto[
```
# A tibble: 6 x 7
# Groups:   country, year [6]
  country     continent  year lifeExp      pop gdpPercap log_gdpPercap
  <fct>       <fct>     <int>   <dbl>    <int>     <dbl>         <dbl>
1 Australia   Oceania    1997    78.8 18565243    26998.          10.2
2 Australia   Oceania    2002    80.4 19546792    30688.          10.3
3 Australia   Oceania    2007    81.2 20434176    34435.          10.4
4 New Zealand Oceania    1997    77.6  3676187    21050.          9.95
5 New Zealand Oceania    2002    79.1  3908037    23190.          10.1
6 New Zealand Oceania    2007    80.2  4115771    25185.          10.1
```
]

---
count: false

### Mutate

.panel1-p66-auto[
```r
# Data
gapminder %>%
  # create new variable: log GDP per capita
  mutate(log_gdpPercap = log(gdpPercap)) %>%
  # keep just Oceania after 1994
  filter(continent == "Oceania" & year > 1994) %>%
  # group by country and year
  group_by(country, year) %>%
  # average
* summarise(mean(log_gdpPercap))
```
]

.panel2-p66-auto[
```
# A tibble: 6 x 3
# Groups:   country [2]
  country      year `mean(log_gdpPercap)`
  <fct>       <int>                 <dbl>
1 Australia    1997                  10.2
2 Australia    2002                  10.3
3 Australia    2007                  10.4
4 New Zealand  1997                  9.95
5 New Zealand  2002                  10.1
6 New Zealand  2007                  10.1
```
]

<style>
.panel1-p66-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-p66-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-p66-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% }
</style>

---
count: false

### The kitchen sink

.panel1-p66-auto[
```r
# Data
*gapminder
```
]

.panel2-p66-auto[
```
# A tibble: 1,704 x 6
   country     continent  year lifeExp      pop gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.
 2 Afghanistan Asia       1957    30.3  9240934      821.
 3 Afghanistan Asia       1962    32.0 10267083      853.
 4 Afghanistan Asia       1967    34.0 11537966      836.
 5 Afghanistan Asia       1972    36.1 13079460      740.
 6 Afghanistan Asia       1977    38.4 14880372      786.
 7 Afghanistan Asia       1982    39.9 12881816      978.
 8 Afghanistan Asia       1987    40.8 13867957      852.
 9 Afghanistan Asia       1992    41.7 16317921      649.
10 Afghanistan Asia       1997    41.8 22227415      635.
# … with 1,694 more rows
```
]

---
count: false

### The kitchen sink

.panel1-p66-auto[
```r
# Data
gapminder %>%
  # create new variable: log GDP per capita
* mutate(log_gdpPercap = log(gdpPercap))
```
]

.panel2-p66-auto[
```
# A tibble: 1,704 x 7
   country     continent  year lifeExp      pop gdpPercap log_gdpPercap
   <fct>       <fct>     <int>   <dbl>    <int>     <dbl>         <dbl>
 1 Afghanistan Asia       1952    28.8  8425333      779.          6.66
 2 Afghanistan Asia       1957    30.3  9240934      821.          6.71
 3 Afghanistan Asia       1962    32.0 10267083      853.          6.75
 4 Afghanistan Asia       1967    34.0 11537966      836.          6.73
 5 Afghanistan Asia       1972    36.1 13079460      740.          6.61
 6 Afghanistan Asia       1977    38.4 14880372      786.          6.67
 7 Afghanistan Asia       1982    39.9 12881816      978.          6.89
 8 Afghanistan Asia       1987    40.8 13867957      852.          6.75
 9 Afghanistan Asia       1992    41.7 16317921      649.          6.48
10 Afghanistan Asia       1997    41.8 22227415      635.          6.45
# … with 1,694 more rows
```
]

---
count: false

### The kitchen sink

.panel1-p66-auto[
```r
# Data
gapminder %>%
  # create new variable: log GDP per capita
  mutate(log_gdpPercap = log(gdpPercap)) %>%
  # keep just Oceania after 1994
* filter(continent == "Oceania" & year > 1994)
```
]

.panel2-p66-auto[
```
# A tibble: 6 x 7
  country     continent  year lifeExp      pop gdpPercap log_gdpPercap
  <fct>       <fct>     <int>   <dbl>    <int>     <dbl>         <dbl>
1 Australia   Oceania    1997    78.8 18565243    26998.          10.2
2 Australia   Oceania    2002    80.4 19546792    30688.          10.3
3 Australia   Oceania    2007    81.2 20434176    34435.          10.4
4 New Zealand Oceania    1997    77.6  3676187    21050.          9.95
5 New Zealand Oceania    2002    79.1  3908037    23190.          10.1
6 New Zealand Oceania    2007    80.2  4115771    25185.          10.1
```
]

---
count: false

### The kitchen sink

.panel1-p66-auto[
```r
# Data
gapminder %>%
  # create new variable: log GDP per capita
  mutate(log_gdpPercap = log(gdpPercap)) %>%
  # keep just Oceania after 1994
  filter(continent == "Oceania" & year > 1994) %>%
  # group by country and year
* group_by(country, year)
```
]

.panel2-p66-auto[
```
# A tibble: 6 x 7
# Groups:   country, year [6]
  country     continent  year lifeExp      pop gdpPercap log_gdpPercap
  <fct>       <fct>     <int>   <dbl>    <int>     <dbl>         <dbl>
1 Australia   Oceania    1997    78.8 18565243    26998.          10.2
2 Australia   Oceania    2002    80.4 19546792    30688.          10.3
3 Australia   Oceania    2007    81.2 20434176    34435.          10.4
4 New Zealand Oceania    1997    77.6  3676187    21050.          9.95
5 New Zealand Oceania    2002    79.1  3908037    23190.          10.1
6 New Zealand Oceania    2007    80.2  4115771    25185.          10.1
```
]

---
count: false

### The kitchen sink

.panel1-p66-auto[
```r
# Data
gapminder %>%
  # create new variable: log GDP per capita
  mutate(log_gdpPercap = log(gdpPercap)) %>%
  # keep just Oceania after 1994
  filter(continent == "Oceania" & year > 1994) %>%
  # group by country and year
  group_by(country, year) %>%
  # average
* summarise(mean(log_gdpPercap))
```
]

.panel2-p66-auto[
```
# A tibble: 6 x 3
# Groups:   country [2]
  country      year `mean(log_gdpPercap)`
  <fct>       <int>                 <dbl>
1 Australia    1997                  10.2
2 Australia    2002                  10.3
3 Australia    2007                  10.4
4 New Zealand  1997                  9.95
5 New Zealand  2002                  10.1
6 New Zealand  2007                  10.1
```
]

<style>
.panel1-p66-auto { color: black; width: 38.6060606060606%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel2-p66-auto { color: black; width: 59.3939393939394%; hight: 32%; float: left; padding-left: 1%; font-size: 80% }
.panel3-p66-auto { color: black; width: NA%; hight: 33%; float: left; padding-left: 1%; font-size: 80% }
</style>

---

# Acknowledgements

These slides were made with the `flipbookr` package by [Gina Reynolds](https://github.com/EvaMaeRey/flipbookr).

Last update: June 2021

<style type="text/css">
.remark-code{line-height: 1.5; font-size: 80%}

@media print {
  .has-continuation {
    display: block;
  }
}

code.r.hljs.remark-code{
  position: relative;
  overflow-x: hidden;
}

code.r.hljs.remark-code:hover{
  overflow-x:visible;
  width: 500px;
  border-style: solid;
}
</style>
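
---

### Sketch: the whole chain on toy data

A minimal, self-contained sketch of the `mutate()` to `filter()` to `group_by()` to `summarise()` chain from these slides, assuming `dplyr` 1.0.0 or later is installed. The `toy` tibble and its columns (`region`, `year`, `value`) are hypothetical stand-ins for `gapminder`, chosen so that some groups hold more than one row. The `.groups = "drop"` argument is an addition that the slides do not use: it asks `summarise()` to return an ungrouped tibble rather than one that is still grouped by the first grouping variable.

```r
library(dplyr)

# Hypothetical toy data (not gapminder), small enough to check by hand
toy <- tibble(
  region = c("A", "A", "A", "A", "B", "B", "B", "B"),
  year   = c(1990, 2000, 2000, 2010, 1990, 2000, 2000, 2010),
  value  = c(10, 20, 40, 30, 5, 60, 80, 70)
)

result <- toy %>%
  # create a new column: log of value
  mutate(log_value = log(value)) %>%
  # keep observations after 1994
  filter(year > 1994) %>%
  # group by region and year
  group_by(region, year) %>%
  # named summaries; .groups = "drop" removes all grouping from the result
  summarise(
    mean_value = mean(value),
    mean_log   = mean(log_value),
    n_obs      = n(),
    .groups    = "drop"
  )

print(result)
```

Here the two year-2000 rows for region `"A"` (values 20 and 40) collapse into a single row with `mean_value` equal to 30, which shows the average doing real work. When every group holds exactly one row, the group mean is simply that row's value.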