Introduction to zoomclass
zoomclass.Rmd
Functions in {zoomclass} package are divided in 2 categories which aims to analyse input from (mainly) Zoom’s participants report .csv
and Zoom’s chat .txt
file.
Participants Report
In this category, High-level functions are design to analyse Zoom’s participants report (.csv
file), especially for checking student’s attendance in Zoom classroom.
Zoom’s Participant Report
The term I used “Zoom’s participant report”, basically, is a CSV log file that host can downloaded after Zoom meeting has ended. It contains the following columns:
- Name (Original Name): show participant’s current name in Zoom meeting with optional original name in the last balanced parenthesis.
- User Email: an email that participant signed in Zoom account
- Join Time: join time of each participants by each individual sessions
- Leave Time: leave time of each participants by each individual sessions
-
Duration (Minutes): computed by
Leave Time
-Join Time
of each session - Guest: “Yes” means that person is a participants; “No” means he/she is a host
- Recording Consent: “Y” means participant allowed recording in zoom
This package comes with an example data of Zoom’s participant report in .csv
which I will use for demo in this tutorial.
# List all example data in `zoomclass` package
zoomclass_example()
#> [1] "participants_heroes_full.csv" "participants_heroes.csv"
#> [3] "zoom-chat-1.txt" "zoom-chat-2.txt"
# Get a path to specific example data
path_heroes <- zoomclass_example("participants_heroes.csv")
path_heroes
#> [1] "/home/runner/work/_temp/Library/zoomclass/extdata/participants_heroes.csv"
Path to all example files can be obtained by the following command.
system.file("extdata", package = "zoomclass")
#> [1] "/home/runner/work/_temp/Library/zoomclass/extdata"
Read Participants
read_participants()
read the data in using readr::read_csv()
and then performs the followings steps:
- Read Zoom’s participant report into a tibble
- Clean column names, so that we can manipulate more easily in
R
- Apply appropriate time formatting (POSIXct) for relevant columns
- Extract participant’s current (displayed) name and original name into
Name
andName_Original
columns, respectively.
The output is a tibble with zoom_participants
subclass. If metadata regarding Zoom meeting was found, it will be assigned to a meeting_overview
attribute.
pp_heroes <- read_participants(path_heroes)
str(pp_heroes)
#> zoom_participants [31 × 9] (S3: zoom_participants/tbl_df/tbl/data.frame)
#> $ Name (Original Name): chr [1:31] "015_Loki (Loki Laufeyson)" "019_Mockingbird" "014_Clea" "003_Power Girl (Kara Zor-L)" ...
#> $ Name : chr [1:31] "015_Loki" "019_Mockingbird" "014_Clea" "003_Power Girl" ...
#> $ Name_Original : chr [1:31] "Loki Laufeyson" NA NA "Kara Zor-L" ...
#> $ Email : chr [1:31] "loki_asgardian@marvel.com" "mockingbird_human@marvel.com" "clea_unknown@marvel.com" "power-girl_kryptonian@dc.com" ...
#> $ Join_Time : POSIXct[1:31], format: "2021-11-19 09:54:15" "2021-11-19 10:07:44" ...
#> $ Leave_Time : POSIXct[1:31], format: "2021-11-19 12:01:59" "2021-11-19 12:01:58" ...
#> $ Duration_Minutes : num [1:31] 128 115 10 114 52 114 113 113 112 112 ...
#> $ Guest : chr [1:31] "Yes" "Yes" "No" "Yes" ...
#> $ Rec_Consent : chr [1:31] "Y" "Y" NA "Y" ...
As you can see, almost all column names were cleaned with no spaces; Join_Time
and Leave_Time
were formatted as POSIXct object, correctly. Moreover, “015_Loki (Loki Laufeyson)” in original Name (Original Name)
column is extracted into “015_Loki” in Name
and “Loki Laufeyson” in Name_Original
, respectively.
The output tibble pp_heroes
contain sessions information of each participants. If we count names of each participants, we can see that some rows is duplicated.
pp_heroes %>%
count(`Name (Original Name)`, sort = TRUE) %>%
head()
#> # A tibble: 6 × 2
#> `Name (Original Name)` n
#> <chr> <int>
#> 1 003_Power Girl (Kara Zor-L) 2
#> 2 010_Cyborg Superman 2
#> 3 017_Swarm 2
#> 4 022_Dazzler 2
#> 5 001_Magus 1
#> 6 002_She-Thing 1
For example, “Power Girl” has 2 rows because she had 2 sessions in Zoom (join and leave 2 times).
pp_heroes %>%
filter(Name == "003_Power Girl") %>%
select(Name, Email, Join_Time, Leave_Time, Duration_Minutes)
#> # A tibble: 2 × 5
#> Name Email Join_Time Leave_Time Duration_Minutes
#> <chr> <chr> <dttm> <dttm> <dbl>
#> 1 003_Power Girl power… 2021-11-19 10:08:26 2021-11-19 12:01:58 114
#> 2 003_Power Girl power… 2021-11-19 10:11:31 2021-11-19 12:01:58 111
But why Join_Time
and Leave_Time
is overlapped in these 2 sessions? This might be the case that she’s joined Zoom with 2 devices at the same time. As I will show you next, The upcoming functions in this package can detect this scenario.
Classroom in Zoom
Now that we have read Zoom’s participants data in, It’s time to analyse participants in a context of a classroom. From now on, I’ll call a “participant” as “student”.
Next, 3 class_*
functions will be introduced:
-
class_session()
summarizes time information about individual sessions of each students (If student has multiple sessions, output will show ≥ 1 rows per that student) -
class_students()
summarizes time information of each students (1 row per student) -
class_studentsID()
summarizes time information of each student’s ID extracted from student name (1 row per student’s ID)
The first argument of these functions receives input from zoom_participants
tibble as created by read_participants()
.
A typical academic classroom usually has an explicit start and end time. If students arrive to class later than certain cutoff time point, teacher can mark them as late.
In the class_*
functions, you must provide class_start
and class_end
arguments by which they will used to compute 4 time intervals that student spent before, during, and after Zoom class:
-
Before_Class
: represent time interval that student spent in Zoom before class started -
During_Class
: represent time interval that student spent in Zoom during class -
After_Class
: represent time interval that student spent in Zoom after class ended -
Total_Time
: the sum of timeBefore_Class
+During_Class
+After_Class
For class_students()
and class_studentsID()
, you can compute late time period of each student by providing late_cutoff
argument.
All of theses time intervals are Period object from lubridate package. (It can be converted to hours, minutes, or secound as well.)
Furthermore, class_*
functions can tell you whether each students joined Zoom with multiple devices at the same time period. The use case of this might be when live-exam is conducted in Zoom, and you want to check that each students joined with 1 device only (no cheating).
Class Session
Suppose that our Zoom classroom was started at 10:00 and ended at 12:00, I will call class_session()
as follows:
(Input class_start
and class_end
as 24-hours clock time with no AM or PM)
pp_heroes_session <-
class_session(pp_heroes,
class_start = "10:00", # Official class started at 10:00 AM
class_end = "12:00" # Official class ended at 12:00 PM
)
pp_heroes_session
#> # A tibble: 31 × 17
#> `Name (Original Name)` Name Name_Original Email Session Class_Start
#> <chr> <chr> <chr> <chr> <int> <dttm>
#> 1 001_Magus 001_… NA magu… 1 2021-11-19 10:00:00
#> 2 002_She-Thing 002_… NA she-… 1 2021-11-19 10:00:00
#> 3 003_Power Girl (Kara Z… 003_… Kara Zor-L powe… 1 2021-11-19 10:00:00
#> 4 003_Power Girl (Kara Z… 003_… Kara Zor-L powe… 2 2021-11-19 10:00:00
#> 5 004_Angel Salvadore 004_… NA ange… 1 2021-11-19 10:00:00
#> 6 005_Donna Troy 005_… NA donn… 1 2021-11-19 10:00:00
#> 7 006_Phoenix 006_… NA phoe… 1 2021-11-19 10:00:00
#> 8 007_Birdman 007_… NA bird… 1 2021-11-19 10:00:00
#> 9 008_Simon Baz 008_… NA simo… 1 2021-11-19 10:00:00
#> 10 009_Juggernaut (Cain M… 009_… Cain Marko jugg… 1 2021-11-19 10:00:00
#> # … with 21 more rows, and 11 more variables: Class_End <dttm>,
#> # Join_Time <dttm>, Leave_Time <dttm>, Before_Class <Period>,
#> # During_Class <Period>, After_Class <Period>, Total_Time <Period>,
#> # Duration_Minutes <dbl>, Guest <chr>, Rec_Consent <chr>, Multi_Device <lgl>
As previously stated, classroom-related time intervals were computed.
pp_heroes_session %>%
select(Name, ends_with("Class"), Total_Time) %>%
head()
#> # A tibble: 6 × 5
#> Name Before_Class During_Class After_Class Total_Time
#> <chr> <Period> <Period> <Period> <Period>
#> 1 001_Magus NA 1H 15M 57S 1M 59S 1H 17M 56S
#> 2 002_She-Thing NA 22M 59S 1M 58S 24M 57S
#> 3 003_Power Girl NA 1H 51M 34S 1M 58S 1H 53M 32S
#> 4 003_Power Girl NA 1H 48M 29S 1M 58S 1H 50M 27S
#> 5 004_Angel Salvadore NA 1H 44M 49S 1M 58S 1H 46M 47S
#> 6 005_Donna Troy NA 1H 50M 34S 1M 55S 1H 52M 29S
Let’s see who spend time during class all the time
pp_heroes_session %>%
filter(During_Class == lubridate::hours(2)) %>%
select(Name, ends_with("Class"))
#> # A tibble: 1 × 4
#> Name Before_Class During_Class After_Class
#> <chr> <Period> <Period> <Period>
#> 1 015_Loki 5M 45S 2H 0M 0S 1M 59S
Or, may be Loki could cast illusion to attend Zoom till the end :-)
Now, let’s see who attend Zoom using multiple device at the same time by filter Multi_Device = TRUE
.
pp_heroes_session %>%
filter(Multi_Device == TRUE) %>%
select(Name, Session, Join_Time, Leave_Time, Multi_Device)
#> # A tibble: 4 × 5
#> Name Session Join_Time Leave_Time Multi_Device
#> <chr> <int> <dttm> <dttm> <lgl>
#> 1 003_Power Girl 1 2021-11-19 10:08:26 2021-11-19 12:01:58 TRUE
#> 2 003_Power Girl 2 2021-11-19 10:11:31 2021-11-19 12:01:58 TRUE
#> 3 017_Swarm 1 2021-11-19 10:14:30 2021-11-19 10:24:26 TRUE
#> 4 017_Swarm 2 2021-11-19 10:24:12 2021-11-19 12:01:58 TRUE
Session
column displays session number of each student as ranked by Join_Time
. Let’s see who joined Zoom more than 1 time.
As you can see, “Power Girl” and “Swarm” has session number = 2; however, be aware that these two joined Zoom only one time using 2 devices.
pp_heroes_session %>%
filter(Session > 1) %>%
select(Name, Session, Join_Time, Leave_Time, Multi_Device)
#> # A tibble: 4 × 5
#> Name Session Join_Time Leave_Time Multi_Device
#> <chr> <int> <dttm> <dttm> <lgl>
#> 1 003_Power Girl 2 2021-11-19 10:11:31 2021-11-19 12:01:58 TRUE
#> 2 010_Cyborg Super… 2 2021-11-19 11:28:11 2021-11-19 12:01:58 NA
#> 3 017_Swarm 2 2021-11-19 10:24:12 2021-11-19 12:01:58 TRUE
#> 4 022_Dazzler 2 2021-11-19 11:02:46 2021-11-19 11:29:38 NA
Class Student
class_students()
summarizes time information grouped by each students (grouping variables is Name (Original Name)
and Email
). The result will be a tibble with one row per student.
Most columns are the same as output from class_session()
, except for the followings:
-
Session_Count
: show session counts of each student (how many times each student join and leave class) -
First_Join_Time
: if student has multiple sessions, this would choose only the earliest join time. -
Last_Leave_Time
: if student has multiple sessions, this would choose only the latest leave time.
New optional argument is late_cutoff
from which student will be considered late if first joined time is later than this cutoff. The late time period will be shown in Late_Time
column.
pp_heroes_students <-
class_students(pp_heroes,
class_start = "10:00",
class_end = "12:00",
late_cutoff = "10:15" # If student joined later than 10:15 will considered late
)
head(pp_heroes_students)
#> # A tibble: 6 × 16
#> `Name (Original …` Name Name_Original Email Session_Count Class_Start
#> <chr> <chr> <chr> <chr> <int> <dttm>
#> 1 001_Magus 001_… NA magu… 1 2021-11-19 10:00:00
#> 2 002_She-Thing 002_… NA she-… 1 2021-11-19 10:00:00
#> 3 003_Power Girl (K… 003_… Kara Zor-L powe… 2 2021-11-19 10:00:00
#> 4 004_Angel Salvado… 004_… NA ange… 1 2021-11-19 10:00:00
#> 5 005_Donna Troy 005_… NA donn… 1 2021-11-19 10:00:00
#> 6 006_Phoenix 006_… NA phoe… 1 2021-11-19 10:00:00
#> # … with 10 more variables: Class_End <dttm>, First_Join_Time <dttm>,
#> # Last_Leave_Time <dttm>, Before_Class <Period>, During_Class <Period>,
#> # After_Class <Period>, Total_Time <Period>, Duration_Minutes <dbl>,
#> # Multi_Device <lgl>, Late_Time <Period>
Let’s see who joined class late > 10 minutes (later than 10:25).
pp_heroes_students %>%
filter(Late_Time > lubridate::minutes(10)) %>%
select(Name, First_Join_Time, Late_Time)
#> # A tibble: 5 × 3
#> Name First_Join_Time Late_Time
#> <chr> <dttm> <Period>
#> 1 001_Magus 2021-11-19 10:44:03 29M 3S
#> 2 002_She-Thing 2021-11-19 11:37:01 1H 22M 1S
#> 3 012_Shang-Chi 2021-11-19 11:18:16 1H 3M 16S
#> 4 021_Winter Soldier 2021-11-19 11:00:49 45M 49S
#> 5 022_Dazzler 2021-11-19 11:01:02 46M 2S
Class Student ID
Supposed that after you’ve checked class attendance of each student, perhaps you want to merge this data into a database by some key columns which is usually student’s ID.
First, you informed students to put student’s ID in their names, for example: “001_Megus”.
class_studentsID()
will help you summarizes time information grouped by each student’s ID. Then, you can used these IDs to merge into a database.
The internal processes are that class_studentsID()
extracts student’s ID from Name (Original Name)
column using regular expression as provided by id_regex
. Then, time information will be summarized per student’s ID (grouping variable is ID
), and the rest of output columns is similar to class_students()
. Finally, the result will be a tibble with one row per ID
.
pp_heroes_studentsID <-
class_studentsID(pp_heroes,
id_regex = "\\d+", # Extract digits from student name as student's ID
class_start = "10:00",
class_end = "12:00",
late_cutoff = "10:15" # If student joined later than 10:15 will considered late
)
head(pp_heroes_studentsID)
#> # A tibble: 6 × 15
#> ID Name Email Session_Count Class_Start Class_End
#> <chr> <chr> <chr> <int> <dttm> <dttm>
#> 1 001 001_Magus magu… 1 2021-11-19 10:00:00 2021-11-19 12:00:00
#> 2 002 002_She-Thi… she-… 1 2021-11-19 10:00:00 2021-11-19 12:00:00
#> 3 003 003_Power G… powe… 2 2021-11-19 10:00:00 2021-11-19 12:00:00
#> 4 004 004_Angel S… ange… 1 2021-11-19 10:00:00 2021-11-19 12:00:00
#> 5 005 005_Donna T… donn… 1 2021-11-19 10:00:00 2021-11-19 12:00:00
#> 6 006 006_Phoenix phoe… 1 2021-11-19 10:00:00 2021-11-19 12:00:00
#> # … with 9 more variables: First_Join_Time <dttm>, Last_Leave_Time <dttm>,
#> # Before_Class <Period>, During_Class <Period>, After_Class <Period>,
#> # Total_Time <Period>, Duration_Minutes <dbl>, Multi_Device <lgl>,
#> # Late_Time <Period>
(If the same ID
is founded in multiple names (e.g. “001_Magus”, “001_magus”), the Name
column will contain all combinations of names for that particular ID
.)
Merge Data
Here, I will give an example of merging student data to a database.
This package provides heroes_students
data frame that contain names and ID of student who enrolled the course in this semester.
head(heroes_students)
#> ID Name Comment
#> 1 001 Magus <NA>
#> 2 004 Angel Salvadore <NA>
#> 3 005 Donna Troy <NA>
#> 4 006 Phoenix <NA>
#> 5 008 Simon Baz <NA>
#> 6 009 Juggernaut <NA>
You can join heroes_students
with pp_heroes_studentsID
using dplyr::*_join
functions by ID
.
Check Students NOT in Zoom Class
To check whether students joined Zoom class room or not, it can be obtained by using a filtering join function: dplyr::anti_join()
.
These students below are in the heroes_students
data frame, but not in pp_heroes_studentsID
, which means that they didn’t join Zoom classroom.
Check Non-Student in Zoom Class
Likewise, You can also check participants who joined Zoom classroom but not in the heroes_students
data frame (non-students). Again, dplyr::anti_join()
can be used.
pp_heroes_studentsID %>%
anti_join(heroes_students, by = "ID")
#> # A tibble: 5 × 15
#> ID Name Email Session_Count Class_Start Class_End
#> <chr> <chr> <chr> <int> <dttm> <dttm>
#> 1 002 002_She-Thi… she-… 1 2021-11-19 10:00:00 2021-11-19 12:00:00
#> 2 003 003_Power G… powe… 2 2021-11-19 10:00:00 2021-11-19 12:00:00
#> 3 007 007_Birdman bird… 1 2021-11-19 10:00:00 2021-11-19 12:00:00
#> 4 013 013_Thunder… thun… 1 2021-11-19 10:00:00 2021-11-19 12:00:00
#> 5 018 018_Blizzar… bliz… 1 2021-11-19 10:00:00 2021-11-19 12:00:00
#> # … with 9 more variables: First_Join_Time <dttm>, Last_Leave_Time <dttm>,
#> # Before_Class <Period>, During_Class <Period>, After_Class <Period>,
#> # Total_Time <Period>, Duration_Minutes <dbl>, Multi_Device <lgl>,
#> # Late_Time <Period>
Merge All
dplyr::full_join()
is a mutating join function that merge 2 data frame without any rows lost from either one.
In this example, I fully joined 2 data frame pp_heroes_studentsID
and heroes_students
by ID
. Suffix "_from_ID" and "_from_Zoom" represents unmatched rows from the list of students (heroes_students
) and Zoom classroom, respectively.
# Merge to Zoom data by `ID`
heroes_students_joined <- heroes_students %>%
full_join(pp_heroes_studentsID, by = "ID", suffix = c("_from_ID", "_from_Zoom")) %>%
relocate(ID, starts_with("Name"))
head(heroes_students_joined)
#> ID Name_from_ID Name_from_Zoom Comment
#> 1 001 Magus 001_Magus <NA>
#> 2 004 Angel Salvadore 004_Angel Salvadore <NA>
#> 3 005 Donna Troy 005_Donna Troy <NA>
#> 4 006 Phoenix 006_Phoenix <NA>
#> 5 008 Simon Baz 008_Simon Baz <NA>
#> 6 009 Juggernaut 009_Juggernaut (Cain Marko) <NA>
#> Email Session_Count Class_Start
#> 1 magus_unknown@marvel.com 1 2021-11-19 10:00:00
#> 2 angel-salvadore_unknown@marvel.com 1 2021-11-19 10:00:00
#> 3 donna-troy_amazon@dc.com 1 2021-11-19 10:00:00
#> 4 phoenix_mutant@marvel.com 1 2021-11-19 10:00:00
#> 5 simon-baz_human@dc.com 1 2021-11-19 10:00:00
#> 6 juggernaut_human@marvel.com 1 2021-11-19 10:00:00
#> Class_End First_Join_Time Last_Leave_Time Before_Class
#> 1 2021-11-19 12:00:00 2021-11-19 10:44:03 2021-11-19 12:01:59 <NA>
#> 2 2021-11-19 12:00:00 2021-11-19 10:15:11 2021-11-19 12:01:58 <NA>
#> 3 2021-11-19 12:00:00 2021-11-19 10:09:26 2021-11-19 12:01:55 <NA>
#> 4 2021-11-19 12:00:00 2021-11-19 10:19:40 2021-11-19 12:01:54 <NA>
#> 5 2021-11-19 12:00:00 2021-11-19 10:11:26 2021-11-19 12:01:59 <NA>
#> 6 2021-11-19 12:00:00 2021-11-19 10:10:49 2021-11-19 12:01:56 <NA>
#> During_Class After_Class Total_Time Duration_Minutes Multi_Device Late_Time
#> 1 1H 15M 57S 1M 59S 1H 17M 56S 78 NA 29M 3S
#> 2 1H 44M 49S 1M 58S 1H 46M 47S 107 NA 11S
#> 3 1H 50M 34S 1M 55S 1H 52M 29S 113 NA <NA>
#> 4 1H 40M 20S 1M 54S 1H 42M 14S 103 NA 4M 40S
#> 5 1H 48M 34S 1M 59S 1H 50M 33S 111 NA <NA>
#> 6 1H 49M 11S 1M 56S 1H 51M 7S 112 NA <NA>
Student who didn’t joined Zoom classroom will have NA
presented in the Name_from_Zoom
column, whereas participants who joined Zoom classroom but not in the list of students will have NA
presented in the Name_from_ID
column.
heroes_students_joined %>%
filter(if_any(starts_with("Name"), is.na)) %>%
select(ID, starts_with("Name"), Comment)
#> ID Name_from_ID Name_from_Zoom Comment
#> 1 031 Impulse <NA> <NA>
#> 2 032 Marvel Girl <NA> on a space mission
#> 3 033 Giganta <NA> <NA>
#> 4 034 Vision <NA> last seen in Westview
#> 5 036 Silk Spectre <NA> <NA>
#> 6 002 <NA> 002_She-Thing <NA>
#> 7 003 <NA> 003_Power Girl (Kara Zor-L) <NA>
#> 8 007 <NA> 007_Birdman <NA>
#> 9 013 <NA> 013_Thunderbird II <NA>
#> 10 018 <NA> 018_Blizzard II <NA>
Zoom Chat
You might want to parse Zoom chat file from .txt
file to a tibble.
# Path to example Zoom Chat file
path <- zoomclass_example("zoom-chat-1.txt")
See that raw zoom-chat-1.txt
looks like this.
readLines(path) %>%
glue::as_glue()
#> 01:07:16 Tom: Hi everyone
#> 01:09:00 Anny: Hello Tom!
#> 01:10:30 Max: Do you have a good weekend?
#> 01:11:10 Anny: Yes we do.
#> 01:11:34 Jenny: It's great :)
#> 01:17:15 Max: I wish I could be there.
#> 01:18:14 Anny: Come and join us next time.
#> 01:19:42 Max: Sure, I will.
Use read_zoom_chat()
to read it into a data frame.
# Read from Text to a Data Frame
read_zoom_chat(path)
#> # A tibble: 8 × 3
#> Time Name Content
#> <chr> <chr> <chr>
#> 1 01:07:16 Tom Hi everyone
#> 2 01:09:00 Anny Hello Tom!
#> 3 01:10:30 Max Do you have a good weekend?
#> 4 01:11:10 Anny Yes we do.
#> 5 01:11:34 Jenny It's great :)
#> 6 01:17:15 Max I wish I could be there.
#> 7 01:18:14 Anny Come and join us next time.
#> 8 01:19:42 Max Sure, I will.
Last updated: 2022-04-03