Population pyramids in ggplot

I have downloaded the data on resident population by age and sex from Demo ISTAT. Here is an example made using 2015 data from Caltignaga, a north-western Italian village of 2587 souls, famous for its Roman aqueduct:

age sex   value
0   males  8
1   males  10
2   males  11
3   males  5
4   males  11
5   males  20


ggplot(data = dt1) +
  # females on the positive side, males mirrored on the negative side
  geom_bar(aes(age, value, group = sex, fill = sex), stat = "identity", data = subset(dt1, sex == "females")) +
  geom_bar(aes(age, -value, group = sex, fill = sex), stat = "identity", data = subset(dt1, sex == "males")) +
  # label both sides of the axis with absolute counts
  scale_y_continuous(breaks = seq(-100, 40, 10), labels = abs(seq(-100, 40, 10))) +

extra annotation for the text and arrow:

  geom_curve(aes(x = 96, y = 2, xend = 96, yend = 10),
             data = dt1, curvature = -0.2,
             arrow = arrow(length = unit(0.03, "npc"))) +
  annotate("text", x = 96, y = 16, label = "This is my grandma!", fontface = "italic")
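
As a self-contained illustration of the mirroring trick used above (negative values for one sex, absolute values as axis labels), here is a minimal sketch. The male counts are the six rows shown in the excerpt; the female counts, the plot_value column and the coord_flip() call are my own assumptions, added only so the example runs on its own.

```r
library(ggplot2)

# Male counts for ages 0-5 come from the excerpt above; the female counts are
# invented placeholders, used only to make the example run.
dt1 <- data.frame(
  age   = rep(0:5, times = 2),
  sex   = rep(c("males", "females"), each = 6),
  value = c(8, 10, 11, 5, 11, 20,   # males (from the excerpt)
            9, 7, 12, 6, 10, 15)    # females (made up)
)

# Mirror the males to the left of zero by flipping the sign of their counts.
dt1$plot_value <- ifelse(dt1$sex == "males", -dt1$value, dt1$value)

# Breaks cover both sides of zero; labels show absolute counts.
brks <- seq(-20, 20, 5)

p <- ggplot(dt1, aes(x = age, y = plot_value, fill = sex)) +
  geom_bar(stat = "identity") +
  scale_y_continuous(breaks = brks, labels = abs(brks)) +
  coord_flip()  # assumption: ages on the vertical axis, as in a classic pyramid
```

Printing p draws the pyramid. Keeping the sign flip in a separate column leaves the original value column untouched for any later tabulation.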

The gap between desired and observed fertility in Europe. Part 2: Childlessness levels.

Involuntary childlessness has gained attention in mainstream media, which place a large part (if not all) of the blame on the postponement of childbearing: women wait too long to have children, they don’t hear their biological clock ticking and bam! no children. Ever.

Delaying childbearing to later ages undoubtedly has repercussions on the biological ability to have children, but it is hardly the only component of the total effect. What the mainstream discussion often misses is that the great majority of children are conceived in unions, hence having children is a couple’s decision. Indeed, being single is an important, if not pivotal, deterrent to motherhood, which is usually delayed until union formation.

This is why it is important to consider factors such as union dissolution risk to appreciate the variation in involuntary childlessness. To better understand the effect of postponement we tried to measure it by calculating the effect of time spent on contraception while in a union by women who want to have children, a ‘conscious’ way to postpone childbearing.

This is a preview of average population childlessness obtained through simulation using 3 variables: celibacy (% of women who end up single and never enter a union), divorce (% of women previously in a union but currently without a partner), and waiting time (the average time spent on contraception at the beginning of a union by a woman who wishes to have children).


ggplot(dt, aes(Age, value, linetype = Variable, col = Variable)) +
  geom_line(size = 1) +
  scale_color_manual(values = c("black", "#666666", "grey", "black", "#666666", "grey"),
                     guide = guide_legend(nrow = 3, byrow = FALSE, title = "Childlessness")) +
  xlab("") +
  scale_linetype_manual(values = c("solid", "solid", "solid", "twodash", "dotted", "dashed"),
                        guide = guide_legend(nrow = 3, byrow = FALSE, title = "Childlessness")) +
  theme(plot.margin = unit(c(1, 4, 1, 1), "cm"), legend.position = "bottom", legend.direction = "vertical")

1. ggplot(dt, aes( Age, value, linetype= Variable, col=Variable))

linetype=Variable and col=Variable set in the aes tell ggplot to draw one line for each level of Variable, each with its own line type and color;

2. scale_color_manual sets the colors of the lines to those listed in values. I was not satisfied with what I got with scale_color_grey so I set my colors manually (_manual!);

3. since I want the legend at the bottom AND in two columns (or 3 rows) AND I have two features specified in the aes I need to add a guide=guide_legend(nrow=3) to each scale_blablabla_manual (that is to say scale_color_manual AND scale_linetype_manual);

4. In guide=guide_legend, byrow=FALSE means that I want the legend entries filled in by column rather than by row;

5. theme(legend.position="bottom") tells ggplot to put the legend below the graph, and legend.direction="vertical" tells it to lay the entries out vertically (which I divide into 3 rows).
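
The five points above can be tried out on toy data. This is only a sketch: the six series and their values are invented, and the shared leg object is my own shorthand for passing the same guide to both scales so that ggplot merges them into a single legend.

```r
library(ggplot2)

# Toy data: six hypothetical series; names and values are invented for illustration.
dt <- expand.grid(Age = 15:50, Variable = paste("series", 1:6))
dt$value <- dt$Age * as.integer(dt$Variable) / 100

# Same title and same layout for both scales, so the two legends are merged.
leg <- guide_legend(nrow = 3, byrow = FALSE, title = "Childlessness")

p <- ggplot(dt, aes(Age, value, linetype = Variable, col = Variable)) +
  geom_line() +
  scale_color_manual(values = rep(c("black", "#666666", "grey"), 2), guide = leg) +
  scale_linetype_manual(values = c("solid", "solid", "solid",
                                   "twodash", "dotted", "dashed"), guide = leg) +
  theme(legend.position = "bottom", legend.direction = "vertical")
```

With six entries, nrow = 3 gives two columns; switching byrow to TRUE fills the legend row by row instead of column by column.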

The GAP between desired and observed fertility in Europe. Part 1.

Using data from the FFS and the Human Fertility Database we have recomputed desired fertility estimates using the Rodríguez and Trussell (1981) method and simulated the Parity Progression Ratios to first births for women in 11 European countries.

Desired vs Observed PPR

Working paper soon to follow.

1887 crude mortality rate in Spain using classInt package

Figure: Crude Mortality Rate in Spain, 1887 Census (maps for the jenks, quantile, bclust and fisher classification styles)

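library(classInt)      # classIntervals(), findColours()
library(RColorBrewer)  # brewer.pal()
nclassint <- 5 # number of colors to be used in the palette
cat <- classIntervals(dt$TBM, nclassint, style = "jenks") # note: this name masks base::cat()
colpal <- brewer.pal(nclassint, "Reds") # sequential palette
color <- findColours(cat, colpal) # one color per observation
bins <- cat$brks # class break points
lb <- length(bins)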

style: jenks
[20.3,25.9] (25.9,30.5] (30.5,34.4] (34.4,38.4] (38.4,58.2]
         68         114         130         115          35
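
The post does not show how the per-style counts were collected, so what follows is only one possible sketch. It assumes the classInt package and uses simulated rates in place of dt$TBM; the tbm and styles names are mine, and only four of the styles are included to keep the example deterministic.

```r
library(classInt)

set.seed(1887)
# Simulated stand-in for dt$TBM: 462 made-up rates, NOT the 1887 census values.
tbm <- runif(462, min = 20, max = 58)

styles <- c("quantile", "equal", "jenks", "fisher")
nclassint <- 5

# For each style, compute the breaks and count how many units fall in each class.
counts <- t(sapply(styles, function(s) {
  ci <- classIntervals(tbm, n = nclassint, style = s)
  tabulate(findCols(ci), nbins = nclassint)
}))
colnames(counts) <- c("first", "second", "third", "fourth", "fifth")

# One row per style, one column per class.
dat <- data.frame(type = styles, counts, row.names = NULL)
```

Each row of dat sums to the number of observations, which is a quick sanity check that every unit was assigned to exactly one class.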

Save the categories into a data.frame (dat)

      type first second third fourth fifth
1 quantile    91     93    92     91    95
2       sd    10    202   244      5     0
3    equal   100    246   113      2     1
4   kmeans    68    115   142    118    19
5    jenks    68    114   130    115    35
6   hclust   100    174   153     34     1
7   bclust    53    120   275     13     1
8   fisher    68    114   130    115    35

and melt it into a long format (required by ggplot):

library(reshape2) # provides melt()
dat1 <- melt(dat, id.vars = "type", value.name = "n.breaks")

geom_bar(stat="identity", position=position_dodge())
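
The geom_bar() fragment above presumably belongs to a fuller call. A minimal sketch, using the counts from the table above and assuming reshape2 for melt(); the aesthetic mapping (type on the x axis, one dodged bar per class) is my guess, not something the post states.

```r
library(reshape2)
library(ggplot2)

# Counts per classification style, copied from the table above.
dat <- data.frame(
  type   = c("quantile", "sd", "equal", "kmeans", "jenks", "hclust", "bclust", "fisher"),
  first  = c(91, 10, 100, 68, 68, 100, 53, 68),
  second = c(93, 202, 246, 115, 114, 174, 120, 114),
  third  = c(92, 244, 113, 142, 130, 153, 275, 130),
  fourth = c(91, 5, 2, 118, 115, 34, 13, 115),
  fifth  = c(95, 0, 1, 19, 35, 1, 1, 35)
)

# Long format: one row per (style, class) pair.
dat1 <- melt(dat, id.vars = "type", value.name = "n.breaks")

# Assumed mapping: styles along the x axis, classes side by side within each style.
p <- ggplot(dat1, aes(x = type, y = n.breaks, fill = variable)) +
  geom_bar(stat = "identity", position = position_dodge())
```

After melting, the class label sits in the variable column, which is why it can be mapped to fill.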