Upload files from Excel, STATA, SAS, SPSS and text
First set the working directory (or check it)
getwd() # get working directory
[1] "/Users/me/My Folder/"
setwd("./My Subfolder/") # set working directory
1. .csv and .txt files
the read.csv function has many options, some of them are header=T which sets the first line as column names, sep=“,” the field separator character (in this case the semicolon), dec=“.” decimal sep character, skip=2 number of lines to skip (in this case 2).
read.csv2 is identical to read.csv except it assumes commas to be the decimal operators and semicolon as field separator
read.table works similarly to read.csv, but reads text files.
mydata <- read.csv("mydata.csv", header=T)
When importing data in R, if any column’s name is a number, R will add an X to it (as in general it is a very bad idea to have numbers for column names, but can be handy). You can replace column names with:
colnames(mydata) <- c("name1", "name2", "name3", "2017", "2018", "2019")
If you change or add anything to your data and want to save it then ( write.table for txt output):
write.csv(mydata, "mydata.csv", row.names=FALSE)
2. STATA files .dta
library(foreign) write.dta(mydata, "mydata.dta")
3. SPSS files .sav
use.value.labels by default is TRUE and converts value labels into factors. The mydata.txt is the name for data output, while the mydata.sps is the code output.
library(foreign) mydata <- read.spss("mydata", to.data.frame=T, use.value.labels = FALSE) write.foreign(as.data.frame(mydata), "mydata.txt", "mydata.sps", package="SPSS")
4. SAS files .sas
Note that by default it converts value labels into factors
## to read from SAS library(Hmisc) mydata <- sasxport.get("mydata.xpt") ## to save in SAS format library(foreign) write.foreign(as.data.frame(mydata), "mydata.txt", "mydata.sas", package="SAS")
5. Excel spreadsheet
# library(xlsx) mydata <- read.xlsx("c:/myexcel.xlsx", 1) # 1 refers to the first worksheet-page altrenatively... mydata <- read.xlsx("c:/myexcel.xlsx", sheetName="Data input") write.xlsx(mydata, "mydata.xlsx")
# library(readxl) mydata <-system.file("mypath/myexcel.xlsx", package = "readxl") mydata <- read_excel(mydata, 1)
(A few) quick tricks
# head(mydata, n=10) # first 10 rows tail(mydata, n=10) # last 10 rows mydata[1,1:10] # print first row and first 10 columns names(mydata) # variable names length(mydata) nrow(mydata) # number of rows ncol(mydata) # number of columns str(mydata) # list structure of data class(mydata) # class of data view(mydata) # opens viewer window