Population pyramids updated

Load the relevant packages and dataset. You can find the data on GitHub here.

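library(tidyverse)  # dplyr, tidyr, ggplot2
library(ggthemes)   # theme_economist_white(), scale_fill_economist()
options(scipen = 9)
mydt <- ... %>% filter(iso == 'UGA')  # "..." marks the data-reading call, which was lost from the original post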

The dataset includes population estimates at subnational level for Uganda.

# reformat the dataset using tidyr

newdf <- mydt %>%  # `mydt` as the input is inferred; the original line was garbled
  gather(variable, value, 6:761) %>%
  separate(variable, c('year', 'sex', 'age'), sep = '_') %>%
  mutate(sex = if_else(sex == 'F', 'female', 'male')) %>%
  spread(year, value) %>%
  mutate(age2 = recode(age, '1'='0-4', '4'='0-4', '5'='5-9', '10'='10-14', '15'='15-19', '20'='20-24', '25'='25-29', '30'='30-34', '35'='35-39', '40'='40-44', '45'='45-49', '50'='50-54', '55'='55-59', '60'='60-64', '65'='65-69', '70'='70-74', '75'='75-79', '80'='80+')) %>%
  mutate(age = recode(age, '1' = '0', '4' = '0'))
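
To make the reshaping step easier to follow, here is a minimal sketch of the same gather/separate/spread sequence on a toy table. The data frame `toy`, its column names and its values are invented for illustration; only the `year_sex_age` naming pattern is taken from the code above.

```r
library(dplyr)
library(tidyr)

# Toy wide table: one row per area, one column per year_sex_age combination.
# All names and numbers here are made up.
toy <- tibble(
  adm_id     = c("A", "B"),
  `2000_F_0` = c(10, 20),
  `2000_M_0` = c(11, 21),
  `2005_F_0` = c(12, 22),
  `2005_M_0` = c(13, 23)
)

toy_long <- toy %>%
  gather(variable, value, -adm_id) %>%                       # wide to long
  separate(variable, c("year", "sex", "age"), sep = "_") %>% # split the column name
  mutate(sex = if_else(sex == "F", "female", "male")) %>%
  spread(year, value)                                        # one column per year

print(toy_long)
```

Each row of `toy_long` is one area, sex and age group, with one population column per year, which is the shape the next step gathers back into a single `pop` column.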

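# Text was lost from the original post here. The numeric conversion of age and the
# name newdf3 are reconstructions, not recovered code; the definition of `ageno`
# (used below) is missing and is left as a gap.
newdf$age <- as.numeric(newdf$age)
newdf3 <- newdf %>%
  gather(key = year, value = pop, 10:14) %>%
  # mutate(pop = pop/1e03) %>%
  filter(iso == "UGA" & adm_id == "UGMIS2014452022", year %in% c(2000, 2005, 2010, 2015, 2020))

newdf4 <- newdf3 %>%
  group_by(iso, adm_id, id, year, sex, age, age2, ageno) %>%
  summarise(pop = sum(pop)) %>%
  mutate(ageno = ageno + 1)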

ggplot(data = newdf4, aes(x = age, y = pop/1000, fill = year)) +
  # bars for all but 2100; later years are drawn first so earlier years sit on top
  geom_bar(data = newdf4 %>% filter(sex == "female", year != 2100) %>% arrange(desc(year)),
           stat = "identity",
           position = "identity", width = 4.5) +
  # male bars are mirrored onto the negative side of the axis
  geom_bar(data = newdf4 %>% filter(sex == "male", year != 2100) %>% arrange(desc(year)),
           stat = "identity",
           position = "identity",
           mapping = aes(y = -pop/1000)) +
  coord_flip() +
  scale_y_continuous(labels = abs, breaks = seq(-600, 600, 250)) +
  geom_hline(yintercept = 0) +
  theme_economist_white(horizontal = FALSE) +
  scale_fill_economist() +
  labs(fill = "", x = "", y = "")
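
For readers without the Uganda data, the mirroring idea can be tried on synthetic numbers. This is a sketch under stated assumptions, not the plot above: `toy_pyr` and its values are invented, and it uses a single signed column (`signed_pop`) in one layer rather than the two separate `geom_bar()` layers.

```r
library(dplyr)
library(ggplot2)

# Synthetic population by 5-year age group, sex and year (invented values)
toy_pyr <- expand.grid(
  age  = seq(0, 80, by = 5),
  sex  = c("female", "male"),
  year = c("2000", "2020"),
  stringsAsFactors = FALSE
) %>%
  mutate(
    pop        = round(1000 * exp(-age / 40) * if_else(year == "2020", 1.6, 1)),
    signed_pop = if_else(sex == "male", -pop, pop)  # males go to the negative side
  ) %>%
  arrange(desc(year))                               # later year first in the data

p <- ggplot(toy_pyr, aes(x = age, y = signed_pop, fill = year)) +
  geom_col(position = "identity", width = 4.5) +    # overlapping, not stacked
  coord_flip() +
  scale_y_continuous(labels = abs) +                # show magnitudes on both sides
  geom_hline(yintercept = 0) +
  labs(fill = "", x = "", y = "")

print(p)
```

The `labels = abs` trick is the same as in the original code: the male values are negative only to place them left of zero, and the axis still reads as positive counts.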

[Figure: population pyramid for the selected Ugandan subnational unit, 2000 to 2020]


Composite plots: grid.arrange

I really like composite plots, where there’s a top part that describes a phenomenon and a bottom part with a synthetic time view of the overall process.
I’ve recently discovered this beautiful representation of educational differentials by gender, by Sara Lopus and Margaret Frye, and the beauty of this dataviz is that it tells a story on its own. (Click on the link for the publication)

I used randomly generated data to reproduce the graph in ggplot2, and grid.arrange() from the gridExtra package to bind the two grobs, the top and bottom components.

grid.arrange(top, bottom, heights=c(10,5), widths=c(20), padding=0)
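
A minimal sketch of how two such components could be built and stacked, assuming only ggplot2 and gridExtra. The data frame `toy` and both plots are invented stand-ins and do not reproduce the Lopus and Frye figure.

```r
library(ggplot2)
library(gridExtra)

set.seed(42)
# Invented data: two groups followed over 30 years
toy <- data.frame(
  year  = rep(1990:2019, times = 2),
  group = rep(c("a", "b"), each = 30),
  value = cumsum(rnorm(60))
)
# Yearly average across groups, for the summary panel
yearly <- aggregate(value ~ year, data = toy, FUN = mean)

top    <- ggplot(toy, aes(year, value, colour = group)) + geom_line() + labs(x = NULL)
bottom <- ggplot(yearly, aes(year, value)) + geom_area(alpha = 0.4)

# heights are relative: the top panel gets twice the space of the bottom one
composite <- grid.arrange(top, bottom, heights = c(10, 5))
```

`grid.arrange()` draws the result and also returns it invisibly, so `composite` can be saved later, for example with `ggsave()`.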

I saved the map as a .png file, read it back with the png package, and used rasterGrob() from the grid package to create a raster image graphical object.
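
The round trip from a .png file to a grob can be sketched as follows. The image here is a generated colour gradient written to a temporary file, standing in for the map, which is not available; `img_path` and `map_grob` are illustrative names.

```r
library(png)
library(grid)

# Stand-in image: a 20 x 30 pixel RGB gradient written to a temporary file
img_path <- tempfile(fileext = ".png")
gradient <- array(0, dim = c(20, 30, 3))
gradient[, , 1] <- matrix(seq(0, 1, length.out = 30), nrow = 20, ncol = 30, byrow = TRUE)
gradient[, , 3] <- 0.5
writePNG(gradient, target = img_path)

# Read the file back as an array and wrap it in a raster grob
img      <- readPNG(img_path)
map_grob <- rasterGrob(img, interpolate = TRUE)

grid.newpage()
grid.draw(map_grob)
```

Because `map_grob` is an ordinary grob, it can be passed to `grid.arrange()` alongside ggplot objects.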

[Figure: the reproduced composite plot]

Abandon the rainbows?

Are rainbow palettes really that bad?
If it were up to me, my ideal color palette would be the color of my favorite toy as a kid: fluorescent squishy slime, the kind that glows in the dark. However, not everyone appreciates it, especially scientific journals.
In the end, the use of color should be motivated by what’s being represented: a cluster of lines depicting the TFR trends of every country in the world over the past century would not benefit from a 250-color scale. It’s different for maps and heat maps, where the right palette can sharpen the message by condensing and directing information (see this Lancet paper).
In my opinion, rainbow palettes work rather well in some instances, although they can easily be replaced with less catchy and more printer-friendly colors.

I’ll always be a fan of bright colors, but I see the point of minimalism in plotting. I really enjoyed this article: End of the Rainbow? New Map Scale is More Readable by People Who Are Color Blind

I downloaded the JSON file from here and transformed it into a data frame using the rjson package.
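
The structure of that JSON file is not shown in the post, so the following is only a sketch of the general pattern, assuming the file is an array of flat records. The inline `json_txt` string and its values are invented; with a real file one would call `fromJSON(file = ...)` instead.

```r
library(rjson)

# Invented JSON: an array of flat records, one per day and year
json_txt <- '[
  {"year": 1876, "order": 1, "temp": 21.4},
  {"year": 1876, "order": 2, "temp": 22.0},
  {"year": 1877, "order": 1, "temp": 20.7}
]'

records <- fromJSON(json_txt)

# Turn each record into a one-row data frame, then stack the rows
dt_toy <- do.call(rbind, lapply(records, function(r) as.data.frame(as.list(r))))

print(dt_toy)
```

A nested or irregular file would need extra flattening before the `rbind` step.
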
I used a bunch of color palettes to compare the results. Ever since discovering the viridis palettes, I have been a huge fan of ‘magma’ and ‘inferno’: their darkest color is a deep black, which makes it easier to highlight everything else.


Red-yellow-green palette:

[Figure: heat map, red-yellow-green palette]

Red-purple palette:

[Figure: heat map, red-purple palette]

Rainbow palette:

[Figure: heat map, rainbow palette]

Magma palette:

[Figure: heat map, magma palette]

Inferno palette:

[Figure: heat map, inferno palette]

Plasma palette:

[Figure: heat map, plasma palette]

Viridis palette:

[Figure: heat map, viridis palette]

Cividis palette (from cividis library here):

[Figure: heat map, cividis palette]

Greys palette:

[Figure: heat map, greys palette]

Inverted grey palette:

[Figure: heat map, inverted grey palette]


Here’s the code for the cividis palette plot:

ggplot(dt, aes(order, year)) +
  geom_tile(aes(fill = temp)) +
  scale_fill_cividis(na.value = "transparent") +
  scale_y_reverse(name = '', breaks = c(1876, 1900, 1950, 2000, 2018),
                  labels = c('1876', '1900', '1950', '2000', '2018')) +
  scale_x_continuous(name = '', breaks = c(30-15, 61-15, 92-15, 122-15),
                     labels = c('June', 'July', 'August', 'September')) +
  theme(axis.ticks.x = element_blank()) +
  geom_vline(xintercept = c(30, 61, 92), linetype = "longdash")
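
Swapping palettes only changes the fill scale, so the comparison can be sketched by keeping one base plot and adding a different scale each time. Everything below is an assumption made for illustration: `toy_dt` is synthetic, and the scales are the ones built into ggplot2 (`scale_fill_viridis_c()`, `scale_fill_distiller()`, `scale_fill_gradientn()`), not necessarily the ones used for the screenshots above.

```r
library(ggplot2)

set.seed(2018)
# Synthetic stand-in for the temperature data: day of season by year
toy_dt <- expand.grid(order = 1:122, year = 1990:2018)
toy_dt$temp <- 20 + 8 * sin(pi * toy_dt$order / 122) + rnorm(nrow(toy_dt))

base_plot <- ggplot(toy_dt, aes(order, year)) +
  geom_tile(aes(fill = temp)) +
  scale_y_reverse(name = "")

# One fill scale per palette; the Brewer palette names are assumed equivalents
palettes <- list(
  magma    = scale_fill_viridis_c(option = "magma", na.value = "transparent"),
  inferno  = scale_fill_viridis_c(option = "inferno", na.value = "transparent"),
  red_yl_g = scale_fill_distiller(palette = "RdYlGn", na.value = "transparent"),
  greys    = scale_fill_distiller(palette = "Greys", na.value = "transparent"),
  rainbow  = scale_fill_gradientn(colours = rainbow(7), na.value = "transparent")
)

plots <- lapply(palettes, function(pal) base_plot + pal)

print(plots$magma)
```

Each element of `plots` is a complete ggplot, so the variants can be printed one by one or laid out together with `gridExtra::grid.arrange(grobs = plots)`.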