Biostatistics with R programming
Set knit directory
knitr::opts_knit$set(root.dir = "/Users/doublefire_chen/Downloads") # root.dir is a knit (package) option, so it is set via opts_knit, not opts_chunk
Set working directory
## CHANGE THIS TO YOUR OWN PATH!
Load dataset from csv file
d <- read.delim2("./data_monday-3.csv", sep = ";", dec = ",") # semicolon-separated fields, comma as decimal mark
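As a minimal sketch of how the separator and the decimal mark interact, the following writes a tiny semicolon-separated file to a temporary location and reads it back. The file contents and the names `tmp`, `d_demo` and `d_demo2` are invented for illustration; they are not the course dataset.

```r
# Hypothetical two-row file in the European CSV style:
# ";" separates fields and "," is the decimal mark.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id;weight;sex", "1;70,5;f", "2;82,3;m"), tmp)

# read.delim2() defaults to tab-separated input, so sep must be given explicitly.
d_demo <- read.delim2(tmp, sep = ";", dec = ",")

# read.csv2() already defaults to sep = ";" and dec = ",".
d_demo2 <- read.csv2(tmp)

str(d_demo)                      # weight should be numeric, not character
all.equal(d_demo, d_demo2)       # both calls should give the same data frame
```

If `weight` comes back as character instead of numeric, the decimal mark or separator most likely does not match the file.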
The essential R commands for understanding your dataset
names(d) # column (variable) names
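A few other base R functions are commonly used alongside `names()` for a first look at a dataset. The sketch below uses a small invented data frame, `d_demo`, as a stand-in for `d`; the column names mirror those used later in this document, but the values are made up.

```r
# Toy data frame standing in for d (values are invented).
d_demo <- data.frame(
  weight = c(61.2, 74.8, 58.9, 90.1),
  sex    = c("f", "m", "f", "m"),
  gen    = c("AA", "AG", "GG", "AG")
)

names(d_demo)      # column (variable) names
dim(d_demo)        # number of rows, then number of columns
str(d_demo)        # type and first values of each column
head(d_demo, 2)    # first two rows
summary(d_demo)    # per-column summaries
table(d_demo$sex)  # counts per category
```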
Basic calculations and functions
mean(d$weight) # calculates the mean of a vector
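Real datasets often contain missing values, and most summary functions return `NA` unless told to drop them. A minimal sketch, assuming an invented vector `w` with one missing weight:

```r
w <- c(61.2, 74.8, NA, 90.1)   # invented weights with one missing value

mean(w)                   # NA, because one value is missing
mean(w, na.rm = TRUE)     # mean of the three observed values
median(w, na.rm = TRUE)   # middle observed value
sd(w, na.rm = TRUE)       # sample standard deviation
range(w, na.rm = TRUE)    # minimum and maximum
```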
Histogram
hist(d$weight, breaks = 12, col = "darkorange", border = "darkblue",
     main = "Collected Weight Measurements", xlab = "Weight (kg)", ylab = "Freq",
     density = 30, cex.axis = 1.3, cex.main = 2, labels = TRUE, cex.lab = 1.3,
     xlim = c(40, 100))
# Use ggplot2 to draw a histogram
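One way the ggplot2 version might look is sketched below. It assumes the ggplot2 package is installed and uses simulated weights in a made-up data frame `d_demo` instead of the course data; the bin count and colours simply echo the base R call above.

```r
# Simulated weights; set.seed() only makes the sketch reproducible.
set.seed(1)
d_demo <- data.frame(weight = rnorm(100, mean = 70, sd = 10))

p <- NULL
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Map weight to the x axis and let geom_histogram() do the binning.
  p <- ggplot(d_demo, aes(x = weight)) +
    geom_histogram(bins = 12, fill = "darkorange", colour = "darkblue") +
    labs(title = "Collected Weight Measurements",
         x = "Weight (kg)", y = "Frequency")
  print(p)
}
```

Unlike `hist()`, ggplot2 expects a data frame and builds the plot by adding layers with `+`.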
Plot
plot(d$weight, d$il6r, col = "red", type = "p", main = "Weights", xlab = "Weight (kg)", ylab = "il6r expression")
Density plot
plot(density(d$weight[d$sex == "f"]), col = "red", lwd = 2, xlab = "Weights (kg)", main = "Density plot for weight", xlim = c(35, 105))
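The call above draws the density for one group only. A common next step is to overlay a second group with `lines()`. The sketch below assumes simulated data in a made-up data frame `d_demo`, and the group label `"m"` is an assumption about how the second group is coded.

```r
set.seed(2)
d_demo <- data.frame(
  weight = c(rnorm(50, mean = 62, sd = 8), rnorm(50, mean = 78, sd = 9)),
  sex    = rep(c("f", "m"), each = 50)
)

# Estimate one density per group.
dens_f <- density(d_demo$weight[d_demo$sex == "f"])
dens_m <- density(d_demo$weight[d_demo$sex == "m"])

# Draw the first curve, leaving enough vertical room for both.
plot(dens_f, col = "red", lwd = 2, xlab = "Weight (kg)",
     main = "Density plot for weight", xlim = c(35, 105),
     ylim = range(0, dens_f$y, dens_m$y))
lines(dens_m, col = "blue", lwd = 2)   # add the second curve to the same plot
legend("topright", legend = c("f", "m"), col = c("red", "blue"), lwd = 2)
```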
Boxplot
boxplot(d$weight ~ d$gen, col = c("blue", "red", "pink"), main = "Weights by genotype", xlab = "Genotype", ylab = "Weight (kg)")
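`boxplot()` also returns the numbers it draws, which can be useful for checking the plot against a direct calculation. A minimal sketch with simulated data; the genotype labels in `d_demo` are invented:

```r
set.seed(3)
d_demo <- data.frame(
  weight = rnorm(60, mean = 70, sd = 10),
  gen    = rep(c("AA", "AG", "GG"), each = 20)
)

# The data argument avoids repeating d_demo$ in the formula.
b <- boxplot(weight ~ gen, data = d_demo, col = c("blue", "red", "pink"),
             main = "Weights by genotype", xlab = "Genotype", ylab = "Weight (kg)")

b$stats[3, ]                                  # medians drawn in each box
tapply(d_demo$weight, d_demo$gen, median)     # same medians, computed directly
```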
Test
Normal distribution test
hist(d2$bio3)
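A histogram gives a visual impression only. It is often combined with a Q-Q plot and the Shapiro-Wilk test. The sketch below uses two simulated vectors, `x_norm` and `x_skew` (hypothetical names, not columns of `d2`), to contrast a normal sample with a skewed one.

```r
set.seed(4)
x_norm <- rnorm(80, mean = 5, sd = 1)   # drawn from a normal distribution
x_skew <- rexp(80, rate = 1)            # clearly right-skewed

hist(x_norm)       # visual check
qqnorm(x_norm)     # points near a straight line suggest normality
qqline(x_norm)

# Shapiro-Wilk test: the null hypothesis is that the sample is normal,
# so a small p-value is evidence against normality.
sw_norm <- shapiro.test(x_norm)
sw_skew <- shapiro.test(x_skew)
sw_norm$p.value
sw_skew$p.value
```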
t-test
One sample t-test
t.test(d$bio1, mu = 17, conf.level = 0.99)
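To see what `t.test()` computes in the one-sample case, the statistic, p-value and confidence interval can be reproduced by hand. This is a sketch with a simulated vector `x` standing in for `d$bio1`; the sample size and distribution are assumptions.

```r
set.seed(5)
x   <- rnorm(25, mean = 18, sd = 3)   # invented stand-in for d$bio1
mu0 <- 17                             # hypothesised mean

res <- t.test(x, mu = mu0, conf.level = 0.99)

# Same quantities by hand: t = (mean - mu0) / (sd / sqrt(n)), df = n - 1
n        <- length(x)
se       <- sd(x) / sqrt(n)
t_manual <- (mean(x) - mu0) / se
p_manual <- 2 * pt(-abs(t_manual), df = n - 1)             # two-sided p-value
ci_manual <- mean(x) + c(-1, 1) * qt(0.995, df = n - 1) * se  # 99% interval

c(t_test = unname(res$statistic), by_hand = t_manual)
c(t_test = res$p.value, by_hand = p_manual)
```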
Two sample t-test
t.test(d2$bio3 ~ d2$sex)
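By default, `t.test()` with two groups runs Welch's test, which does not assume equal variances. Setting `var.equal = TRUE` gives the classical pooled-variance test. A minimal sketch comparing the two on a simulated data frame `d_demo` (group sizes and effect are invented):

```r
set.seed(6)
d_demo <- data.frame(
  bio3 = c(rnorm(30, mean = 10, sd = 2), rnorm(30, mean = 11.5, sd = 4)),
  sex  = rep(c("f", "m"), each = 30)
)

welch  <- t.test(bio3 ~ sex, data = d_demo)                    # default: Welch
pooled <- t.test(bio3 ~ sex, data = d_demo, var.equal = TRUE)  # pooled variance

welch$parameter    # Welch-Satterthwaite degrees of freedom (usually not an integer)
pooled$parameter   # n1 + n2 - 2
welch$p.value
pooled$p.value
```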
Two sample paired t-test
t.test(d$marker1[d$drug1 == "1"], d$marker2[d$drug1 == "1"], conf.level = 0.95, paired = TRUE)
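A paired t-test is equivalent to a one-sample t-test on the within-pair differences. The sketch below checks this with simulated `before` and `after` vectors, which are hypothetical stand-ins for `marker1` and `marker2`.

```r
set.seed(7)
before <- rnorm(20, mean = 50, sd = 8)
after  <- before + rnorm(20, mean = 2, sd = 3)   # same subjects, shifted

paired    <- t.test(before, after, paired = TRUE)
diff_test <- t.test(before - after, mu = 0)      # one-sample test on differences

c(paired = unname(paired$statistic), differences = unname(diff_test$statistic))
c(paired = paired$p.value, differences = diff_test$p.value)
```

Both vectors must have the same length and the same subject order, since each element is matched with the element at the same position.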
Non-parametric test
# Wilcoxon signed-rank test
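In base R, both Wilcoxon tests are run with `wilcox.test()`: with `paired = TRUE` it is the signed-rank test (the counterpart of the paired t-test), and without it the rank-sum test, also called the Mann-Whitney U test (the counterpart of the two-sample t-test). A minimal sketch on simulated skewed data; all variable names are invented.

```r
set.seed(8)

# Paired measurements on the same subjects.
before <- rexp(20, rate = 0.1)
after  <- before * runif(20, min = 0.6, max = 1.1)
signed_rank <- wilcox.test(before, after, paired = TRUE)

# Two independent groups.
g1 <- rexp(20, rate = 0.1)
g2 <- rexp(20, rate = 0.05)
rank_sum <- wilcox.test(g1, g2)

signed_rank$method
rank_sum$method
signed_rank$p.value
rank_sum$p.value
```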
Correlation test
# Pearson correlation is the parametric test
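`cor.test()` runs the Pearson test by default, and `method = "spearman"` switches to the rank-based alternative. The sketch below uses two simulated vectors, `x` and `y`, with an invented linear relationship.

```r
set.seed(9)
x <- rnorm(40)
y <- 0.6 * x + rnorm(40, sd = 0.8)   # y depends linearly on x, plus noise

pearson  <- cor.test(x, y, method = "pearson")    # parametric
spearman <- cor.test(x, y, method = "spearman")   # rank-based, non-parametric

pearson$estimate    # correlation coefficient r
pearson$conf.int    # confidence interval for r
spearman$estimate   # Spearman's rho
spearman$p.value
```

Spearman's coefficient is the Pearson correlation of the ranks, so it captures monotonic rather than strictly linear association.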