```{r setup, echo=FALSE, cache=FALSE} # stop scientific notation in numeric output options(scipen = 999) ``` Title of your lab report ======================================================== ## Introduction You are going to write your lab report as an R Markdown document. Markdown is a simple formatting syntax that is used to write plain text documents that can be created and edited in many different programs and easily converted into HTML and PDF. You should open and edit this document in RStudio, which has many useful features for writing markdown. To learn more, click the **Help** toolbar button for more details on using R Markdown. It should take you five minutes to learn all about Markdown that you need to know to write your lab report. If you want to spend more than five minutes, you can read the detailed syntax [here](http://daringfireball.net/projects/markdown/syntax). When you click the **Knit HTML** button in RStudio, a web page will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this: ```{r input-all-data, echo = FALSE, message = FALSE, cache=FALSE} ########### DO THIS FIRST !!! ################################## # this chunk connects to the google sheet and brings the data into # your R environment. All of the chunks below here assume that you've # done this one first. Don't move this chunk from this location. require(au13uwgeoarchlab) my_data <- get_data() ``` ## Background The reason why we are using this exotic format for our lab report is to learn about reproducible research. Reproducible research is an approach to science that connects specific instructions to data analysis and empirical data so that scholarship can be recreated, better understood and verified. In practice, this means that all the results from a publication or report can be reproduced with the code and data available online. The advantages of this approach include: * We can automatically regenerate documents when our code, data, or assumptions change. For example if we do further analysis of our samples we can just add it on to this report. * We minimize or eliminate transposition errors that occur when copying results (eg from Excel) into documents. * We preserve a detailed preserve contextual narrative about why analysis was performed in a certain fashion (ie. we don't have to search on our computer for all the files because everything is one file) * Documentation for the analytic and computational processes from which conclusions are drawn. In case anyone wants to know _all_ the details of our statistical methods, they can see them all in the code blocks here. Our specific tool-kit for reproducible research is a very common one: R + RStudio + Markdown. Although it's common, it has a bit of a steep learning curve. There's a lot of documentation and examples on the web that can ease the frustration. You may never (want to) write another document using this tool-kit ever again, but I think it's important that you have an understanding of this approach to doing science, in the hope that some of the ideas will rub off to improve any kind of empirical work that you might do in the future. This approach to doing science is becoming widespread in computer science and some areas of biology and psychology. I expect it's part of a general shift in the way all sciences will be done, and will come to archaeology eventually. When you write your report you should delete the above text and use this section to write about archaeological sites and environmental records that are relevant to understanding our lab data. ## Methods and Materials This would be a good place to mention the radiocarbon dates... and refer to the excavation report for details about the site and excavation methods. ### Chemical analyses Brief description of how you measured pH, EC, SOM , CaCO~3~... The maximum value of organics was `r max(my_data$mean.Organic)`% at `r my_data[which.max(my_data$mean.Organic),]$Sample.ID`m below the surface The maximum value of carbonates was `r max(my_data$mean.CaCO3)`% at `r my_data[which.max(my_data$mean.CaCO3),]$Sample.ID`m below the surface, approximately `r au13uwgeoarchlab::interpolate_date(my_data, my_data[which.max(my_data$mean.CaCO3),]$Sample.ID)` BP The range in electrical conductivity ranged from `r max(my_data$mean.EC)` to `r min(my_data$mean.EC)` ### Physical analyses Brief description of how you measured colour, magnetic suceptibility and particle size distributions... ## Results Your one or two sentence summary observations of the most striking changes in the sedimentary sequence at the site... ### Chemical analyses Now the details... pH, EC, SOM, CaCO~3~ e.g. The pH values ranged from `r max(my_data$mean.pH)` to `r min(my_data$mean.pH)`... ### Physical analyses Particle size distributions... lots of possible plots here, chose wisely! Magnetic susceptibilty values ranged from `r round(max(na.omit(my_data$mean.MS.LF)),0)` to `r round(min(na.omit(my_data$mean.MS.LF)),0)`... ### Discussion Here's where you describe the implications of your data in terms of: 1. Past human behaviours 2. Past environments 3. Site formation processes You will need to draw on previously published studies to make new connections with your data. Show how your data support or contradict previous claims. You may want to make some connection to a change in our variables at say, sample 3.4 and other data from other sites. You'll want to refer to the age of sample 3.4, which you can compute like this, right in the middle of your sentence: "This substantial change in xxx occurs at about `r au13uwgeoarchlab::interpolate_date(my_data, 0.7)` cal years BP." ### References Use APA style, look on [google scholar](http://scholar.google.com.offcampus.lib.washington.edu/) for the little 'cite' link that will generate nicely formatted references for you. ### Tables and Figures For convienence, let's put all the tables and figures at the end of the text. This is a common convention when preparing manuscripts for submission to journals or books for publication. Do be careful to put the tables and figures in a logical order that reflects the order that you mention things in your text. And don't forget to edit the captions for your tables and figures to be richly detailed. Tables first... Here's what you'd do for a simple table. This style of table is for simple qualitative tables (ie. mostly text in the table, not numbers, read on for making tables of numbers...): ```{r table-simple, echo=FALSE, message=FALSE, warnings=FALSE, results='asis'} require(pander) panderOptions('table.split.table', Inf) set.caption("My great data") my.data <- " Tables | Are | Cool col 3 is | right-aligned | $1600 col 2 is | centered | $12 zebra stripes | are neat | $1" df <- read.delim(textConnection(my.data),header=FALSE,sep="|",strip.white=TRUE,stringsAsFactors=FALSE) names(df) <- unname(as.list(df[1,])) # put headers on df <- df[-1,] # remove first row row.names(df)<-NULL pander(df, style = 'rmarkdown') ``` Or we can make a table from our data, so the table will update when we update our data (a much better approach! Do this for any lab data you want to tabulate). For example: ```{r table-dates, echo=FALSE, message=FALSE, warnings=FALSE, results='asis'} require(pander) panderOptions('table.split.table', Inf) set.caption("My great data") table_data <- na.omit(my_data[, c( 'DirectAMS.code', 'Submitter.ID', 'OxCal.median', 'OxCal.sigma')]) pander(table_data, style = 'rmarkdown') ``` Don't forget to edit the table captions... ```{r table-data, echo=FALSE, message=FALSE, warnings=FALSE, results='asis'} require(pander) panderOptions('table.split.table', Inf) set.caption("My great data") table_data <- my_data[, c('Sample.ID', 'mean.pH', 'mean.EC', 'mean.MS.LF', 'mean.MS.FD', 'mean.Organic', 'mean.CaCO3')] pander(table_data, style = 'rmarkdown') ``` ```{r table-summary, echo=FALSE, message=FALSE, warnings=FALSE, results='asis'} require(pander) panderOptions('table.split.table', Inf) set.caption("My great data") # specify variables that you want to summarise my_vars <- c('mean.pH', 'mean.EC', 'mean.MS.LF', 'mean.MS.FD', 'mean.Organic', 'mean.CaCO3') # apply a function to a subset of my_data that # only includes these variables smry <- data.frame(sapply(my_data[,my_vars], function(i) list(mean(i), range(i)))) row.names(smry) <- c('Min', 'Max') pander(smry, style = 'rmarkdown') ``` You might just want to have a table of sediment munsell colour values... ```{r table-colour, echo=FALSE, message=FALSE, warnings=FALSE, results='asis'} require(pander) panderOptions('table.split.table', Inf) set.caption("My great data") table_data <- my_data[, c('Sample.ID', 'Dry.Color.ID', 'Dry.Color.nomenclature', 'Wet.Color.ID', 'Wet.Color.Nomenclature')] pander(table_data, style = 'rmarkdown') ``` And a table summarising the particle size distributions ```{r table-psd, echo=FALSE, message=FALSE, warnings=FALSE, results='asis'} require(pander) panderOptions('table.split.table', Inf) set.caption("My great data") require(au13uwgeoarchlab) my_psd_stats <- psd_stats(my_data, plot = FALSE) pander(my_psd_stats, style = 'rmarkdown') ``` And now Figures... ```{r dates-plot, echo=FALSE, message=FALSE, warning=FALSE, fig.cap='this is one plot'} require(au13uwgeoarchlab) plot_dates(my_data) ``` ```{r key-variables-stratigraphic-plot, echo=FALSE, message=FALSE, warnings=FALSE, fig.cap='here are the data!'} require(au13uwgeoarchlab) strat_plot_key_variables(my_data, c('Sample.ID', 'mean.pH', 'mean.EC', 'mean.MS.LF', 'mean.MS.FD', 'mean.Organic', 'mean.CaCO3'), cluster = TRUE, n = 3) ``` ```{r finds-stratigraphic-plot2, echo=FALSE, message=FALSE, warnings=FALSE, fig.cap='here are the data!'} require(au13uwgeoarchlab) strat_plot_key_variables(my_data, c('Sample.ID', 'Lithic_mass_g', 'Freshwater_shell_mass_g', 'Bone_teeth_mass_g'), cluster = FALSE) ``` ```{r LF-FD-biplot, echo=FALSE, message=FALSE, warnings=FALSE, fig.cap='this is one plot'} # Use this chunk as an example of how to make a biplot. You may want copy this # chunk and put it elsewhere in your report to make a biplot of other # variables, such as 'mean.Organic' and 'mean.MS.LF', just replace # them the function. To get a list of our variables, type: names(my_data) require(au13uwgeoarchlab) biplot_with_correlation(my_data, 'mean.pH', 'mean.Organic') ``` ```{r MS-lithic-biplot, echo=FALSE, message=FALSE, warnings=FALSE, fig.cap='this is one plot'} # Use this chunk as an example of how to make a biplot. You may want copy this # chunk and put it elsewhere in your report to make a biplot of other # variables, such as 'mean.Organic' and 'mean.MS.LF', just replace # them the function. To get a list of our variables, type: names(my_data) require(au13uwgeoarchlab) biplot_with_correlation(my_data, 'mean.MS.LF', 'Lithic_mass_g') ``` ```{r carbonate-shell-biplot, echo=FALSE, message=FALSE, warnings=FALSE, fig.cap='this is one plot'} # Use this chunk as an example of how to make a biplot. You may want copy this # chunk and put it elsewhere in your report to make a biplot of other # variables, such as 'mean.Organic' and 'mean.MS.LF', just replace # them the function. To get a list of our variables, type: names(my_data) require(au13uwgeoarchlab) biplot_with_correlation(my_data, 'mean.CaCO3', 'Freshwater_shell_mass_g') ``` ```{r sed-shell-biplot, echo=FALSE, message=FALSE, warnings=FALSE, fig.cap='this is one plot'} # Use this chunk as an example of how to make a biplot. You may want copy this # chunk and put it elsewhere in your report to make a biplot of other # variables, such as 'mean.Organic' and 'mean.MS.LF', just replace # them the function. To get a list of our variables, type: names(my_data) require(au13uwgeoarchlab) my_data <- derivative_dates(my_data) # add derivative column biplot_with_correlation(my_data, 'deriv', 'Freshwater_shell_mass_g') ``` ```{r psd-ternary-plot, echo=FALSE, message=FALSE, warning=FALSE, fig.cap='this is one plot'} # draw a triangle plot for all samples require(au13uwgeoarchlab) psd_ternary_plot(my_data) ``` ```{r psd-plot-a-few-samples, echo=FALSE, message=FALSE, fig.cap='this is one plot'} require(au13uwgeoarchlab) require(gridExtra) # package to combine multiple plots on one view plot1 <- psd_plot_one_sample(my_data, 1.5) # change the number to chose a different sample plot2 <- psd_plot_one_sample(my_data, 3.4) # ditto grid.arrange(plot1, plot2, ncol=1) # sets layout of vertical stack of plots ``` ```{r psd-strat-plot, echo=FALSE, message=FALSE, warning=FALSE, fig.cap='this is one plot'} # draw a stratigraphic plot of three classes of particle sizes all samples require(au13uwgeoarchlab) psd_strat_plot(my_data, scale.percent = TRUE) ``` ```{r psd-stats-plot, echo=FALSE, message=FALSE, warning=FALSE, fig.cap='this is one plot'} # draw a stratigraphic plot of some particle size summary stats require(au13uwgeoarchlab) my_psd_stats <- psd_stats(my_data, plot = TRUE) ``` ```{r, code-to-make-PDF, echo=FALSE, message=FALSE, eval=FALSE} # This chunk is to run the code and generate the PDF # it will not appear in the PDF because we've set 'echo=FALSE' and # it will not run when you knit HTML because we've set 'eval=FALSE' so # you'll need to run this block line-by-line to get the PDF (save # the file first!) # For this step you'll need to have two other programs installed on your computer # 1. Pandoc: http://johnmacfarlane.net/pandoc/installing.html # 2. LaTeX: follow the instructions on the Pandoc download page require(au13uwgeoarchlab) rmd_to_pdf("vignette", "C:/Users/marwick/Documents/GitHub/au13uwgeoarchlab/vignettes") # If you get an error about a '.Random.seed' not found', then go to the # directory that contains your Rmd file and delete the folder called # cache and the folder called figure. # If you get an error that something 'had status 1', then shut all the PDF files # that you have open and try again. ```