A space-time box plot of Spain’s TFR for 910 comarcas.

The idea behind spatial analysis is that space matters and near things are more similar than distant ones: a variable measured in city A is (ideally) different from the same variable measured in city B. A simple way to get a feel for this hypothesis, and to represent it, is graphical visualization, usually one or more maps.

[Figure: TFRG_all_4years_Spain]

However, when dealing with time series, maps are cumbersome and some information is sometimes lost, such as the national average or path convergence. Box plots are a simple yet very effective way to synthesize a lot of information in one graph. The following plot depicts TFR over a 30-year period for 910 Spanish areas; the thick black line in the middle of each box marks the median across areas, a proxy for the national level.

library(ggplot2)
# dat has one row per area and year, with columns YEAR and TFR
p <- ggplot(dat, aes(x=factor(YEAR), y=TFR))
p <- p + geom_boxplot()
p <- p + scale_y_continuous(limits=c(0,2.5)) + scale_x_discrete("YEAR", breaks=seq(1981,2011,by=5))
p

[Figure: TFRG]
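To see what each box summarises, here is a minimal sketch in base R on simulated data. The data frame dat_toy and its values are made up for illustration and are not the Spanish series. It computes, per year, the quartiles that geom_boxplot draws (the thick line is the median) next to the unweighted mean of the areas, so the two can be compared.

```r
# Simulated stand-in for `dat`: 50 hypothetical areas observed in three years.
set.seed(1)
dat_toy <- data.frame(
  YEAR = rep(c(1981, 1986, 1991), each = 50),
  TFR  = c(rnorm(50, 2.0, 0.30), rnorm(50, 1.6, 0.25), rnorm(50, 1.3, 0.20))
)

# Split the TFR values by year.
by_year <- split(dat_toy$TFR, dat_toy$YEAR)

# Quartiles per year: lower hinge, median (thick line), upper hinge of each box.
box_stats <- t(sapply(by_year, quantile, probs = c(0.25, 0.5, 0.75)))

# Unweighted mean across areas, for comparison with the median.
yearly_mean <- sapply(by_year, mean)

summary_tab <- cbind(box_stats, mean = yearly_mean)
print(round(summary_tab, 2))
```

If the distribution across areas is skewed, the mean and the median drift apart, which is worth checking before reading the middle line as a national average.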

Moran plots in ggplot2

Moran plots are one of the many ways to depict spatial autocorrelation:
moran.test(varofint,listw)
where “varofint” is the variable we are studying, “listw” is a spatial weights list (a listw object describing the neighbourhood relationships), and the function “moran.test”, included in the spdep package, performs Moran’s test (duh!) for spatial autocorrelation; spdep’s moran.plot draws the corresponding plot. The same plot can be done using the ggplot2 library. Provided that we already have our listw object of neighbourhood relationships, we first define the variable and the lagged variable under study, compute their means, and save them into a data frame (there are a lot of datasets implemented in R that you can use: afcon, columbus, syracuse, just to cite a few). The purpose is to obtain something that looks like this (I have used my own *large* set of Spanish data to obtain it):

[Figure: ggplot2.moranplot1]
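For reference, the global statistic behind the test is usually written as below. The notation is introduced here and is not from the original post: x_i is the variable of interest in area i, \bar{x} its mean, w_{ij} the spatial weight between areas i and j, and n the number of areas.

```latex
I = \frac{n}{S_0}\,
    \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}
         {\sum_{i=1}^{n} (x_i-\bar{x})^2},
\qquad
S_0 = \sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}
```

With row-standardised weights every row of the weight matrix sums to one, so S_0 = n and the leading factor drops out.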

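Load your data. Here is Anselin (1995) data on African conflicts, afcon:

library(spdep)
data(afcon)
varofint <- afcon$totcon               # total conflicts per country
listw <- nb2listw(paper.nb)            # assumption: weights built from the paper.nb neighbour list shipped with afcon
varlag <- lag.listw(listw, varofint)   # spatially lagged variable
var.name <- "Total Conflicts"
m.varofint <- mean(varofint)
m.varlag <- mean(varlag)
and compute the local Moran's statistic using localmoran:

lisa <- localmoran(varofint, listw)
and save everything into a dataframe:
df <- data.frame(varofint, varlag, m.varofint, m.varlag)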

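Use these variables to derive the four sectors "High-High" (red), "Low-Low" (blue), "Low-High" (lightblue), "High-Low" (pink):
significance <- 0.05                           # threshold matching the "> 0.05" legend label
vec <- ifelse(lisa[,5] < significance, 1, 0)   # assumption: column 5 of the localmoran output holds the p-values
df$sector <- NA
df$sector[df$varofint>=df$m.varofint & df$varlag>=df$m.varlag] <- 1   # High-High
df$sector[df$varofint<df$m.varofint & df$varlag<df$m.varlag] <- 2     # Low-Low
df$sector[df$varofint<df$m.varofint & df$varlag>=df$m.varlag] <- 3    # Low-High
df$sector[df$varofint>=df$m.varofint & df$varlag<df$m.varlag] <- 4    # High-Low

df$sec.data <- df$sector*vec   # 0 where the local statistic is not significant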

df$sector.col[df$sec.data==1] <- "red"
df$sector.col[df$sec.data==2] <- "blue"
df$sector.col[df$sec.data==3] <- "lightblue"
df$sector.col[df$sec.data==4] <- "pink"
df$sector.col[df$sec.data==0] <- "white"

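# The right-hand sides of the next assignments and the opening of the plot call were lost from the post; this is an assumed reconstruction.
df$sizevar <- abs(lisa[,1])                    # assumption: magnitude of the local Moran statistic
df$sizevar <- factor(df$sizevar > 0.1)
df$FILL <- factor(df$sec.data, levels=0:4)     # significant sector, 0 = not significant
df$BORDER <- factor(df$sector, levels=1:4)     # sector regardless of significance
to get the ggplot graph:
p <- ggplot(df, aes(x=varofint, y=varlag))+
geom_point(aes(fill=FILL), shape=21, colour="black", size=5)+
scale_fill_manual(name="", values=c("white","red","blue","lightblue","pink"), drop=FALSE,
labels=c("> 0.05", "High-High", "Low-Low","Low-High","High-Low"))+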
scale_x_continuous(name=var.name)+
scale_y_continuous(name=paste("Lagged",var.name))+
theme(axis.line=element_line(color="black"),
axis.title.x=element_text(size=20,face="bold",vjust=0.1),
axis.title.y=element_text(size=20,face="bold",vjust=0.1),
axis.text= element_text(colour="black", size=20, angle=0,face = "plain"),
plot.margin=unit(c(0,1.5,0.5,2),"lines"),
panel.background=element_rect(fill="white",colour="black"),
panel.grid=element_line(colour="grey"),
axis.text.x  = element_text(hjust=.5, vjust=.5),
axis.text.y  = element_text(hjust=1, vjust=1),
strip.text.x  = element_text(size = 20, colour ="black", angle = 0),
plot.title= element_text(size=20))+
stat_smooth(method="lm",se=F,colour="black", size=1)+
geom_vline(xintercept=m.varofint,colour="black",linetype="longdash")+
geom_hline(yintercept=m.varlag,colour="black",linetype="longdash")+
theme(legend.background =element_rect("white"))+
theme(legend.key=element_rect("white",colour="white"),
legend.text =element_text(size=20))
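The regression line added with stat_smooth(method="lm") is more than decoration: with row-standardised weights (the default style "W" of nb2listw) and every area having at least one neighbour, its slope equals the global Moran's I. A minimal sketch in base R, using a made-up example of five areas on a line (the values and the neighbour layout are hypothetical):

```r
# Hypothetical values for five areas arranged on a line.
x <- c(2.1, 1.8, 1.5, 1.2, 1.4)
n <- length(x)

# Binary adjacency: each area neighbours the one before and after it.
A <- matrix(0, n, n)
for (i in 1:(n - 1)) {
  A[i, i + 1] <- 1
  A[i + 1, i] <- 1
}

# Row-standardise so that each row of weights sums to one.
W <- A / rowSums(A)

# Spatial lag: weighted average of the neighbours' values.
lag_x <- as.vector(W %*% x)

# Global Moran's I computed directly from its definition.
z <- x - mean(x)
moran_I <- (n / sum(W)) * sum(z * as.vector(W %*% z)) / sum(z^2)

# Slope of the line through the Moran scatterplot (lag on variable).
slope <- unname(coef(lm(lag_x ~ x))[2])

print(c(moran_I = moran_I, slope = slope))
```

The two numbers agree, so the steepness of the line in the plot can be read directly as the strength of spatial autocorrelation.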

Check out the interactive shiny version on pracademic

You can find me at PAA 2015 Poster Session 2 and 3!

You can find me at PAA 2015:

The Relationship between Space and Time: a Spatial Approach, Poster Session 2, slot 15, 10:30 AM – 12:30 PM

A Spatial Econometrics Analysis of Three Decades of Fertility Change in Spain, Poster Session 3, slot 27, Thursday, April 30, 1:00 PM – 3:00 PM

Indigo Ballrooms A-H Level 2

A ggmap of 2015 Israeli elections by city

The recent Israeli elections are a reminder of how Demography and Space play a crucial role in the outcome of the 20th Knesset. For more insight, read the full Demotrends blog post by Ashira Menashe-Oren on the demographics of the Israeli electorate here. The map was made with ggmap and ggplot2, two simple mapping tools I really like. If you are interested, you can find the code and the data below.

To start, load the libraries:

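# rgdal and maptools were retired from CRAN in 2023; sf::st_read() is the current way to read shape files
library(rgdal) # provides readOGR(), which reads the shape file
library(maptools)
library(ggmap) 
library(ggplot2)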

Download the shape file (I normally use the DIVA-GIS website) and read it:

map.ogr<- readOGR(".","ISR_adm1")

Data set:

df <- structure(list(lon = c(35.148529, 35.303546, 34.753934, 34.781768,34.989571, 34.824785, 34.808871, 34.883879, 34.844675, 34.90761, 35.010397, 34.871326, 35.21371, 34.655314, 34.887762, 34.792501, 34.574252, 34.791462, 34.748019, 34.787384, 34.853196, 34.811272, 34.919652, 34.888075, 35.098051, 35.119773, 34.872938, 34.835226, 34.988099, 35.002462), lat = c(32.517127, 32.699635, 31.394548, 32.0853, 32.794046, 32.068424, 32.072176, 32.149961, 32.162413, 32.178195, 31.890267, 32.184781, 31.768319, 31.804381, 32.084041, 31.973001, 31.668789, 31.252973, 32.013186, 32.015833, 32.321458, 31.892773, 32.434046, 31.951014, 33.008536, 32.809144, 31.931566,32.084932, 31.747041, 31.90912), City = structure(c(30L, 19L,24L, 29L, 9L, 25L, 7L, 11L, 10L, 14L, 16L, 23L, 13L, 1L, 21L,28L, 2L, 4L, 3L, 12L, 20L, 27L, 8L, 15L, 18L, 22L, 26L, 6L, 5L, 17L), .Label = c("Ashdod", "Ashkelon", "Bat yam", "Beersheva",  "Beit  Shemesh", "Bnei brak", "Giv'atayim", "Hadera", "Haifa",  "Herzliyya", "Hod HaSharon", "Holon", "Jerusalem", "Kefar Sava",  "Lod", "Modi'in - Makkabbim - Re'ut", "Modi'in Illit", "Nahariyya", "Nazareth ", "Netanya", "Petach Tikva", "Qiryat Atta", "Ra'annana",  "Rahat", "Ramat gan", "Ramla", "Rehovot", "Rishon", "Tel-Aviv",  "Umm Al-Fahm"), class = "factor"), most.votes = c(96.28, 91.41,  87.62, 34.03, 24.98, 30.93, 40.1, 38.77, 34.2, 34.66, 28.95,  32.75, 23.9, 30.96, 27.87, 29.78, 39.31, 37.17, 32.88, 30.86,  33.14, 26.95, 31.77, 32.22, 34.25, 35.01, 39.1, 57.56, 27.89,  71.63), party = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L,  2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L), .Label = c("joint list", "labour", "likud", "yahadut hatora"), class = "factor")), .Names = c("lon", "lat",  "City", "most.votes", "party"), class = "data.frame", row.names = c(NA,  -30L)) 

Get the map using get_map:

gmap <- get_map(location=c(34.2,29.4,36,33.5),zoom=7,source="stamen",maptype="watercolor")

and plot the map:


ggmap(gmap)+ 

geom_polygon(aes(x = long, y = lat, group=id), data = map.ogr, color ="blue", fill ="white", alpha = .8, size = .4)+ 

geom_point(aes(x=lon,y=lat,color=party,size=most.votes),data=df)+ scale_colour_discrete("Coalition", labels = c("Joint List", 
"Labour","Likud","United Torah Judaism"), breaks = c("joint list", 
"labour","likud","yahadut hatora")) +  
scale_size_continuous(range=c(10,15), guide = "none")+ 
theme(axis.text=element_text(size=18), 
plot.title=element_text(size=rel(3)), 
legend.key = element_rect(fill = "white"), 
legend.background =element_rect("white"), 
legend.text = element_text(size = 25), 
legend.title = element_text(size = 25))+ 
guides(colour = guide_legend(override.aes = list(size=8)))+  
labs(x="",y="")

If you want to add city names you can use the “annotate” option, adding the code below after guides(...)+. I have shifted the coordinates to keep the labels from overlapping and colored the names to match the color of the winning party.

annotate("text", x=c(35.14853+ 0.2,35.21371+0.15,35.00246+ 0.15,34.79146+0.15, 34.98957-0.08,34.78177-0.14), 
y=c(32.51713,31.76832,31.90912,31.25297, 32.79405,32.08530),
size=5,fontface="italic", 
label=c("Umm Al-Fahm","Jerusalem","Modi'in Illit",
"Beersheva","Haifa","Tel Aviv"), 
color=c("darkred","blue4","deeppink4", "blue4",
"springgreen4","green4"))+

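The hand-typed label coordinates can also be derived from the data set itself. Here is a minimal sketch in base R for three of the cities, reusing their coordinates from the data set and the horizontal offsets from the annotate call. The names cities, offsets, dx and labels_df are illustrative and not part of the original code.

```r
# Three cities copied from the data set above (lon/lat as listed there).
cities <- data.frame(
  City = c("Haifa", "Tel-Aviv", "Jerusalem"),
  lon  = c(34.989571, 34.781768, 35.21371),
  lat  = c(32.794046, 32.0853, 31.768319)
)

# Horizontal offsets taken from the annotate() call, used to avoid overlaps.
offsets <- data.frame(
  City = c("Haifa", "Tel-Aviv", "Jerusalem"),
  dx   = c(-0.08, -0.14, 0.15)
)

# Join the offsets to the coordinates and compute the label positions.
labels_df <- merge(cities, offsets, by = "City")
labels_df$x <- labels_df$lon + labels_df$dx
labels_df$y <- labels_df$lat

print(labels_df[, c("City", "x", "y")])
```

A data frame like labels_df could then be passed to geom_text(aes(x = x, y = y, label = City), data = labels_df) in place of annotate, so the labels follow the data if a city is added or moved.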
For beginners I highly recommend the ggplot2 mailing list, a great and shame-free place to learn.

A Spatial Analysis of Recent Fertility Patterns in Spain – EPC poster

Here is the winner of the third poster session at EPC Budapest 2014, presenting the main results of our research on fertility differentials in Spain!
EPC Poster: here is the link to the high-resolution PDF, in case you’re interested.

[Figure: EPC Poster_final, Alessandra Carioli]
All the graphics were made in R, using the maptools library for maps and the ggplot2 library for graphs.

The working paper will soon follow.

Location, location, location! Why space matters in demography and why we should care.

Read my first contribution to the Demotrends blog, and don’t forget to like Demotrends on Facebook or Twitter 🙂
Of course, all graphics were made in R (the maptools library and a bunch of others).

Reference managers: is there really an inexpensive solution?

While navigating the ocean of scientific writing over the last few years, two aspects of ‘the art of writing a scientific paper’ have bothered me most: word processors and reference managers. The first issue was solved after I was introduced to the wonders of Scrivener. Reference managers, on the other hand, are still an open wound.

I am reconsidering whether to keep using Papers3, as I have had major issues with it: namely, attachments and notes were lost even after I corrected them manually.

Most importantly, the auto-match function is quite a mess: it sometimes fills in the wrong volumes, articles, and page numbers. Always double check your references once you have completed them.

[under construction…]

I work in the field of Social Sciences, where Statistics and Sociology meet. Sometimes Geography joins the party too. This means that I have a lot of tables, graphs, and words. While navigating the ocean of scientific writing over the last few years, two aspects of ‘the art of writing a scientific paper’ have bothered me most: word processors and reference managers. The first issue was solved after I was introduced to the wonders of Scrivener: I couldn’t ask for anything better for my PhD thesis and for managing projects. Reference management is still an open wound.

I have gone through three stages of reference management so far: anger, illumination, abandonment. I tried out several free reference managers, all of them with issues I could not and would not put up with (starting to pay was not an option for me). The expensive (and obsolete) office-provided reference manager was even worse than the free ones: manually entering most of my articles was torture and an incredible waste of time. Is there a solution?

I mostly use a combination of Scrivener (composition), LaTeX (graphs and tables management) and Word (I don’t like it, but it has a very good system for tracking changes), so my ideal reference manager needs to work with all three of them. Papers 2 did and made me happy: then OS X Yosemite came and inline citation in Scrivener (and TextEdit and TeXShop) stopped working. After writing to Scrivener customer support, which by the way was totally unaware of the problem, I found out that this was a ‘side effect’ of the last Mac update, mostly a Papers 2 problem, and that they are not going to solve it (see support.mekentosj). Duh! Abandonment.

I have found a temporary solution in the Papers 3 trial version, but it has two major problems: lost attachments and wrong matching.

I definitely want to check out more reference managers in the future, time allowing (namely Sente and Bookends).