Create, scan, and correct exams with R
By Edgar Treischl
July 10, 2021
As instructors, we create exams each semester, and we often use multiple-choice questions to assess our students’ knowledge. I don’t know how many times I have corrected an exam by hand, which is why this blog gives a short introduction and shows why you should use the R exams package for single- or multiple-choice exams, even if you are not a heavy R user. The R exams package provides features to automate the entire process, from generating the exam to assigning grades. This blog shows how the package works in order to encourage people with little R experience to use it.
First, why should you consider R for exams? Short answer: the exams package helps you to generate, scan, and assess the exam. It creates a scan sheet that your students fill out during the exam. After the exam, you can use a regular copy machine to scan the sheets. The exams package reads those images, assesses the students’ answers, and provides documents with the results instantly. That is an awesome reason on its own, but there are additional reasons to consider the exams package. It helps you to reduce your own mistakes because students’ answers are no longer corrected manually. Furthermore, it returns an HTML file for each participant that shows the scan sheet, the given and the correct answers, and how many points the person has earned.
Let’s see how it works. In order to use exams, we need to set up a folder that contains all the questions of the exam, we need the images of the scanned sheets to extract the information, and we have to assess the students’ answers. Give it a try and learn how each step works before you set up your own exam: the exams package comes with a nice tutorial and all the demo files necessary to create and scan an exam. The next subsections give you a quick summary of the demonstration from the website, but you can find a step-by-step guide on the R exams website as well.
The demo from the R exams package
Before we can start to create the exam, we need the exercises (questions) of the exam. The package provides a list of exercise files that we can download from the website. Make a new folder where all the files will be stored. Copy the exercises from the website and save them in a new folder named exercises. Next, create a new R script, then install and load the exams package.
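The setup described above might look like the following minimal sketch. It assumes that you install the package from CRAN and that the folder is called exercises, as in the text; adjust both to your own setup.

```r
# Install once (uncomment on first use)
# install.packages("exams")

# Load the package in every new R session
library(exams)

# Folder for the exercise files copied from the website
dir.create("exercises", showWarnings = FALSE)

# Check which files are in the folder so far
list.files("exercises")
```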
As a first step, we need to decide which questions should be included in the exam. Therefore, we create a list that stores the names of the question files. In the last section, we will see how to create our own questions, but as a first step it is fine to use the questions provided by the exams package. As the next code shows, I created a list (exercises_exam) with several example questions:
exercises_exam <- list(
  "tstat2.Rnw",
  "ttest.Rnw",
  "relfreq.Rnw",
  "anova.Rnw",
  c("boxplots.Rnw", "scatterplot.Rnw"),
  "cholesky.Rnw"
)
Second, I use this list to create the exam with the help of the exams2nops() function. It creates a PDF file based on the exercises_exam list and adds the scan sheet as the first page. As the exams package tutorial outlines, you may want to set a seed to reproduce the results; the code below also assigns the result to test_exam:
set.seed(403)
test_exam <- exams2nops(
  exercises_exam,
  dir = "nops_pdf",
  name = "demo",
  date = "2015-07-29",
  points = c(1, 1, 1, 2, 2, 3),
  n = 2
)
There are several options to adjust the exam, and you may check the documentation of the package to adapt the minimal code for your own purposes. As the minimal code shows, we need to provide the directory (dir) where the PDF will be stored (here: nops_pdf), a name, and the date of the exam. With n = 2, I created two different versions of the exam (with randomized order of the questions), but you can create more versions if you want to. In addition to the PDF files, the exams2nops() function saves an .rds file in the directory, which contains all of the metadata about the exam (e.g. the solutions). We will see where this information comes from in the last section.
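One detail of the call is easy to miss: the points vector has one entry per element of the exercise list, not one per file. As far as I understand the package documentation, a character vector inside the list (here the boxplots and scatterplot exercises) is treated as a pool from which one exercise is drawn for each exam version. The following base R sketch only checks this bookkeeping before you generate the PDFs; the variable name pool_sizes is just for illustration.

```r
# Same exercise list and points as in the demo call
exercises_exam <- list(
  "tstat2.Rnw", "ttest.Rnw", "relfreq.Rnw", "anova.Rnw",
  c("boxplots.Rnw", "scatterplot.Rnw"),  # pool of two alternatives
  "cholesky.Rnw"
)
points <- c(1, 1, 1, 2, 2, 3)

# One points entry per list element (not per file)
stopifnot(length(points) == length(exercises_exam))

# Positions in the exam that are pools of alternative exercises
pool_sizes <- lengths(exercises_exam)
which(pool_sizes > 1)
# 5

# Maximum number of points a student can reach
sum(points)
# 10
```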
The package provides everything that is needed for a test run, including test images that we can use for scanning. The next code snippet stores the paths of two example scan files as scan_image. We can use the two files to see how to scan and evaluate an exam.
scan_image <- dir(system.file("nops", package = "exams"), pattern = "nops_scan", full.names = TRUE)
All of the scan images must be stored in one directory. As the minimal example of the exams package shows, we can create a new directory for the scan files with dir.create("nops_scan"); the second line of code then copies the demo files in scan_image into the nops_scan folder with file.copy():
dir.create("nops_scan")
file.copy(scan_image, to = "nops_scan")
After you run this code chunk, two fake scan images appear in your folder, one from Ambi Dexter and another one from Jane Doe, as the next figure illustrates.
Now that we have prepared all the essential steps, we can use the nops_scan() function to scan the images. As the console output of the next call shows, it trims and rotates the files and extracts the information from the PNG files.
nops_scan(dir = "nops_scan")
Have a look in your directory. The nops_scan() function saves the result as a ZIP archive, which includes the PNG files as well as a text file with the extracted information.
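If you prefer to look at the archive from within R, a small base R sketch like the following lists its content without extracting anything. It assumes the archive sits in the nops_scan folder and follows the nops_scan_*.zip naming pattern used in the evaluation call further below.

```r
# Locate the archive(s) written by nops_scan(); empty if none exist yet
zips <- Sys.glob("nops_scan/nops_scan_*.zip")

if (length(zips) > 0) {
  # List the files in the first archive without extracting them
  print(unzip(zips[1], list = TRUE))
} else {
  message("No scan archive found; run nops_scan() first.")
}
```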
Before we can finally evaluate the results, we need a list with the information about our students to match them with the results of the exam. The minimal example also gives you a code snippet to create a CSV file that contains the information about the two fake students from the demo.
write.table(
  data.frame(
    registration = c("1501090", "9901071"),
    name = c("Jane Doe", "Ambi Dexter"),
    id = c("jane_doe", "ambi_dexter")
  ),
  file = "Exam-2015-07-29.csv",
  sep = ";",
  quote = FALSE,
  row.names = FALSE
)
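With a real class list, it can help to read the file back and check it before the evaluation. The sketch below is one way to do that in base R; it writes the demo register to a temporary folder so that it runs on its own. Reading every column as character is my own precaution so that registration numbers are never converted to numbers (which would drop leading zeros), not something the demo requires.

```r
# Write the demo register to a temporary location
register_file <- file.path(tempdir(), "Exam-2015-07-29.csv")
write.table(
  data.frame(
    registration = c("1501090", "9901071"),
    name = c("Jane Doe", "Ambi Dexter"),
    id = c("jane_doe", "ambi_dexter")
  ),
  file = register_file, sep = ";", quote = FALSE, row.names = FALSE
)

# Read it back; colClasses = "character" keeps all columns as text
register <- read.table(
  register_file, header = TRUE, sep = ";",
  colClasses = "character"
)

# The three columns used in the demo must exist,
# and every registration number should appear only once
stopifnot(
  all(c("registration", "name", "id") %in% names(register)),
  anyDuplicated(register$registration) == 0
)

register
```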
Finally, we can use the nops_eval() function to evaluate the scanned images. The register argument points to the students’ matching list, solutions points to the metadata of the exam, scans provides the directory and name of the scan results, eval determines how the results are evaluated (e.g. whether we give partial points), and interactive determines whether errors should be handled interactively or not. Check out the documentation of nops_eval() for more options and information.
exam_results <- nops_eval(
  register = "Exam-2015-07-29.csv",
  solutions = "nops_pdf/demo.rds",
  scans = Sys.glob("nops_scan/nops_scan_*.zip"),
  eval = exams_eval(partial = FALSE, negative = FALSE),
  interactive = TRUE
)
nops_eval() returns a data frame that contains the answers, solutions, and given points for each student!
Check out your folder. The nops_eval() function has already exported the exam results, and it created an archive that contains a short summary document of the exam results for each student. As the next figure shows, the document displays the meta information of your students, an assessment of each question, and the image of the scan sheet used to extract the information. Thus, you are really prepared if your students show up to review the exam.
The exams package takes a lot of the pain out of correcting exams, and I really hope that I have convinced you that you can handle the discussed steps, even if you have only limited experience using R. To boost the popularity of the package, and to convince you that you should stick to R for the next steps as well, the next section shows how you can use R to prepare the data and automate the process of communicating the exam results. Sure, you can use any software to finalize the data, but R gives you some nice features to automate this process, and you can even use R to make a summary document for your students without much effort. Unfortunately, this means that the next section assumes some basic knowledge of R. I try to outline the most important steps and what happens if you run the code, but I skip the details.
R is your exam friend
How do we communicate the results of the exam and how can we automate this process? The data is saved as nops_eval.csv, and we can use R to wrangle the exam data, prepare a final list with the results (a list to enter grades in the educational system), and communicate the results to the students.
The exam data is already loaded in the current session, but we have to import it again if we want to provide a short summary for the participants later or rerun the data management steps. Thus, use the readr package to import the data and the tidyverse for data wrangling. If you are not familiar with loading data in R, import the data with the Import Dataset function in RStudio. It gives you a preview of the data and shows you the corresponding packages and the code to import the data. As the following code snippet shows, you can read a delimited file (including csv and tsv) with the read_delim() function; we have to set the delimiter because our file uses semicolons instead of commas to separate values.
library(readr)
exam_df <- read_delim("nops_eval.csv", delim = ";", escape_double = FALSE, trim_ws = TRUE)
Next, I exclude all variables that are no longer necessary for the report. The dplyr package offers a lot of handy functions to work with data, and it is included in the tidyverse. We can use the select() function to make a narrow data frame that contains only an ID variable (the registration number) and the points variable from the exam data. I generated some fake data to illustrate this process; the code also shows how you can save a new data frame with the selected variables:
library(tidyverse)

exam_df <- tribble(
  ~ID, ~points,
  1, 57,
  2, 60,
  3, 84,
  4, 45,
  5, 82
)

exam_df <- exam_df %>%
  select(ID, points)

exam_df
## # A tibble: 5 × 2
##      ID points
##   <dbl>  <dbl>
## 1     1     57
## 2     2     60
## 3     3     84
## 4     4     45
## 5     5     82
Again, the exams package makes our life very easy since there is not much left to do, which is why I encourage people to use R even if they have little experience with it. As the next code chunk illustrates, we have to generate a new variable that stores the grade depending on the points people have achieved. Use mutate() to extend the data frame; the case_when() function assigns grades according to the points of the exam. I decided that the grade levels go from 100 down to 50 points with a range of 5 points for each grade level, but that is not the important point here. The case_when() function checks the conditions from top to bottom (e.g. points >= 95 ~ 1.0) and assigns the grade of the first condition that is fulfilled. Let’s see how it works:
exam_df <- exam_df %>%
  mutate(
    # Conditions are checked from top to bottom; the first match wins
    grade = case_when(
      points >= 95 ~ 1.0,
      points >= 90 ~ 1.3,
      points >= 85 ~ 1.7,
      points >= 80 ~ 2.0,
      points >= 75 ~ 2.3,
      points >= 70 ~ 2.7,
      points >= 65 ~ 3.0,
      points >= 60 ~ 3.3,
      points >= 55 ~ 3.7,
      points >= 50 ~ 4.0,
      # < 50 instead of <= 49 so that half points (e.g. 49.5) get a grade too
      points < 50 ~ 5.0
    )
  )

exam_df
## # A tibble: 5 × 3
##      ID points grade
##   <dbl>  <dbl> <dbl>
## 1     1     57   3.7
## 2     2     60   3.3
## 3     3     84   2
## 4     4     45   5
## 5     5     82   2
As the output shows, person 1 has 57 points and gets the grade 3.7 (German grading system); person 4 gets the grade 5 because they achieved fewer than 50 points, and so on. Please check each grade level to make sure that there are no mistakes, typos, or other problems.
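One way to do this check is to run the grading rule on the boundary values of each grade level. The sketch below wraps the rule in a small helper called assign_grade(), which is a name I made up for illustration and not part of any package. The last condition is written as points < 50 so that half points such as 49.5 also receive a grade.

```r
library(dplyr)

# Hypothetical helper that wraps the grading rule from the text
assign_grade <- function(points) {
  case_when(
    points >= 95 ~ 1.0,
    points >= 90 ~ 1.3,
    points >= 85 ~ 1.7,
    points >= 80 ~ 2.0,
    points >= 75 ~ 2.3,
    points >= 70 ~ 2.7,
    points >= 65 ~ 3.0,
    points >= 60 ~ 3.3,
    points >= 55 ~ 3.7,
    points >= 50 ~ 4.0,
    points < 50 ~ 5.0
  )
}

# Values directly at and just below each threshold
boundaries <- c(100, 95, 94, 90, 89, 85, 84, 80, 79, 75, 74,
                70, 69, 65, 64, 60, 59, 55, 54, 50, 49.5, 0)

# Inspect the table by eye: each threshold should start a new grade
data.frame(points = boundaries, grade = assign_grade(boundaries))

# A few explicit checks that stop with an error if the rule is wrong
stopifnot(
  assign_grade(95) == 1.0,
  assign_grade(94) == 1.3,
  assign_grade(50) == 4.0,
  assign_grade(49.5) == 5.0
)
```

Because missing points compare as NA, a student without a points value gets NA as grade instead of silently failing the exam.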
The next steps depend on how you have to enter the grades in your university’s system. For instance, at my university I need a sorted list and the grades multiplied by 100. Nothing easier than that: use mutate() again to extend the data frame with an additional grade variable and use arrange() to sort the data.
exam_df %>%
  mutate(grade_system = grade * 100) %>%
  arrange(ID)
## # A tibble: 5 × 4
##      ID points grade grade_system
##   <dbl>  <dbl> <dbl>        <dbl>
## 1     1     57   3.7          370
## 2     2     60   3.3          330
## 3     3     84   2            200
## 4     4     45   5            500
## 5     5     82   2            200
Finally, I need to match the exam list with the list of students who actually registered for the exam, because some of them did not show up. This step also depends on what your institution wants from you, which makes it hard for me to give you any useful advice. You may want to check out how to merge data (in my case I used a left_join()); after merging the data, I save the final results with the readr package, for example as a csv file.
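A minimal sketch of such a merge could look as follows. The registration list, the sixth student, and the file name final_grades.csv are all made up for illustration; your institution’s list will look different.

```r
library(dplyr)
library(readr)

# Hypothetical registration list: six students registered for the exam
registered <- data.frame(ID = c(1, 2, 3, 4, 5, 6))

# Exam results in the shape used above (student 6 did not show up)
exam_df <- data.frame(
  ID = c(1, 2, 3, 4, 5),
  points = c(57, 60, 84, 45, 82),
  grade = c(3.7, 3.3, 2.0, 5.0, 2.0)
)

# Keep every registered student; missing results become NA
final <- registered %>%
  left_join(exam_df, by = "ID") %>%
  arrange(ID)

# Students who registered but have no scan sheet
final %>% filter(is.na(points))

# Save the final list (written to a temporary folder in this sketch)
out_file <- file.path(tempdir(), "final_grades.csv")
write_csv(final, out_file)
```

A left join keeps the registration list as the reference, so nobody who registered drops out of the final list by accident.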
Thus, you can create, scan, and correct exams even if you have only limited knowledge of R, but I know from my own experience that the start can be tricky and that we all sometimes need an incentive.
I guess reducing mistakes when correcting exams is already a huge incentive, but you can also use R to generate a small report for your students. You may want to check out rmarkdown, which lets you easily create different file types (PDF, HTML, Word). I have a standard document for my students that contains a table as well as a histogram that depicts the exam grades. The next code generates such a histogram with the help of the ggplot2 package:
mean_grade <- exam_df %>%
  pull(grade) %>%
  mean() %>%
  round(2)

ggplot(exam_df, aes(x = grade)) +
  geom_histogram(colour = "black", fill = "white", bins = 11) +
  # linewidth replaces size for lines (requires ggplot2 3.4.0 or newer)
  geom_vline(xintercept = mean_grade, linewidth = 1.5, color = "red") +
  # annotate() draws the label once instead of once per row of the data
  annotate("text", x = mean_grade + 0.5, y = 8, label = paste0("Mean\n", mean_grade)) +
  theme_minimal(base_size = 14)
ggplot2, rmarkdown, R exams: maybe you feel a little bit overwhelmed, depending on your background. I just wanted to outline the advantages of doing all steps in the same environment, and R lets you do all essential steps when it comes to exams. Moreover, it is very easy to learn rmarkdown or ggplot2 in case you have never heard of them before. I hope that the code examples give you a start for applying them on your own. You could even have my own R Markdown template, but RStudio comes with several rmarkdown templates, and if you copy the code from above, you have essentially the same as I use.
In my opinion, there is only one thing left for me to show you: you have to create your own exercises before you can think of using R for your exams.
Create your own exercises
The next output shows an example of an exercise. Exercises can be written as R Markdown (.Rmd) files or, like the demo exercises above, as R/LaTeX (.Rnw) files; the example here uses the Markdown format, which is obviously another good reason to learn more about R Markdown. Anyway, even if you are not familiar with either, creating new exercises is very easy and the structure of the exercises is not complicated at all.
Question
========
What is the question?

Answerlist
----------
* A
* B
* C
* D

Solution
========
A and C

Answerlist
----------
* True
* False
* True
* False

Meta-information
================
exname: question1
extype: mchoice
exsolution: 1010
exshuffle: TRUE
I don’t think there is much to say about how to provide a question, an answer list, or the solutions. So, let’s have a look at the meta-information at the end of the exercise. In this section you have to state whether you use a single-choice (schoice) or multiple-choice (mchoice) question; exsolution holds the binary string that encodes the solution: here A and C are right, which leads to 1010. Finally, exshuffle determines whether the answers of the question are shuffled or not.
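If you write many exercises, typing the binary string by hand is an easy place for typos. The following base R sketch derives the string from a logical vector that marks the correct answers; the variable names are my own. The exams package also ships a helper, mchoice2string(), for the same purpose.

```r
# TRUE marks a correct answer, in the order of the answer list (A to D)
correct <- c(A = TRUE, B = FALSE, C = TRUE, D = FALSE)

# Convert TRUE/FALSE to 1/0 and paste the digits together
exsolution <- paste(as.integer(correct), collapse = "")
exsolution
# "1010"

# The string needs one digit per answer in the answer list
stopifnot(nchar(exsolution) == length(correct))
```

For a single-choice (schoice) exercise, exactly one entry of the vector should be TRUE.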
The R exams package has much more to offer than I could possibly show you. I just tried to give a quick summary of how R can be used for exams. Visit the website for tutorials, dynamic exercises, or e-learning tests. But most of all, I hope that I could convince some people that learning how to create, scan, and correct exams with R is no magic at all.