You will often have a time component to your biological research and research hypotheses, i.e. you might want to explain variability over time. As mentioned in the Data Collection and Curation section, it is a good idea to record the time of your observation as separate columns of year, month, day, etc. But you will also need to work with data collected by others where date (and sometimes time) information is together in one column (variable).
How you can do it:
Working with dates and times involves two steps:
First, you need to let R know the data should be viewed as a date and time object (formatting the data as a date and time object)
Second, you might want to extract some part of the date and time object (e.g. extract the year)
R has many options for working with dates and times.
In the base package:
In the base package, which is installed along with R, you can use the as.Date() function to format your data as a date:
Consider the data:
myDat <- read.csv("DTData.csv") # load in the data
head(myDat) # examine the first few rows
where the myDat$Time column gives the year, month, day, hour and minutes of an observation given in the myDat$Value column.
You can format the myDat$Time column as a date with the code below. Note that as.Date() keeps only the date, so the hours and minutes are dropped:
myDat$Time <- as.Date(x = myDat$Time, # the date and time column
                      format = "%Y/%m/%d %H:%M") # describing the format of the date and time column
head(myDat) # examine the first few rows
Time Value
1 2024-10-03 36
2 2020-06-09 35
3 2020-10-07 57
4 2021-10-09 43
5 2020-06-04 44
6 2021-12-02 36
You can learn more about the formatting syntax with ?strptime. Note that the code above replaces the myDat$Time column with the new, formatted date information.
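If you need to keep the hours and minutes, base R's as.POSIXct() accepts the same format string. The sketch below uses two made-up text values (assumed to follow the "%Y/%m/%d %H:%M" layout) rather than the DTData.csv file, and the choice of tz = "UTC" is an assumption for illustration:

```r
# Hypothetical example values, assumed to match the "%Y/%m/%d %H:%M" layout
timeText <- c("2024/10/03 10:32", "2020/06/09 00:17")

# as.Date() keeps only the calendar date
asDate <- as.Date(timeText, format = "%Y/%m/%d %H:%M")

# as.POSIXct() keeps both the date and the time (tz = "UTC" is an assumption)
asDateTime <- as.POSIXct(timeText, format = "%Y/%m/%d %H:%M", tz = "UTC")

class(asDate)     # "Date"
class(asDateTime) # "POSIXct" "POSIXt"

format(asDateTime, "%H:%M") # "10:32" "00:17"
```

Which of the two to use depends on whether the time of day matters for your question.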
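You can extract parts of the date column with functions like months() for month names and weekdays() for the day of the week. Base R has no years() function; to extract the year, use format() with the "%Y" code, e.g. format(myDat$Time, "%Y").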
For example:
myDat$Months <- months(myDat$Time) # extract only the months
str(myDat) # structure of the data frame
'data.frame': 15 obs. of 3 variables:
$ Time : Date, format: "2024-10-03" "2020-06-09" ...
$ Value : int 36 35 57 43 44 36 47 39 38 50 ...
$ Months: chr "October" "June" "October" "October" ...
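Other parts of a date can be pulled out in base R with format() and the codes described in ?strptime. The sketch below uses the first three dates from the output above, typed in by hand so it runs without the data file; the object names are only illustrative:

```r
# First three dates from the example output, entered by hand
obsDates <- as.Date(c("2024-10-03", "2020-06-09", "2020-10-07"))

# format() returns text, so convert to numbers where that is more useful
obsYear  <- as.numeric(format(obsDates, "%Y")) # 2024 2020 2020
obsMonth <- as.numeric(format(obsDates, "%m")) # 10 6 10
obsDay   <- as.numeric(format(obsDates, "%d")) # 3 9 7

# Day of the week as text (the language depends on your locale)
obsWeekday <- weekdays(obsDates)
```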
Using the lubridate package:
The lubridate package was created to make working with dates and times easier. There are still two steps to the process. You can repeat the steps above but now with the lubridate package.
Again, consider the data:
myDat <- read.csv("DTData.csv") # load in the data
head(myDat) # examine the first few rows
where the myDat$Time column gives the year, month, day, hour and minutes of an observation given in the myDat$Value column.
You can format the myDat$Time column as a date and time with:
library(lubridate) # load the lubridate package
Attaching package: 'lubridate'
The following objects are masked from 'package:base':
date, intersect, setdiff, union
myDat$Time <- ymd_hm(myDat$Time) # the date and time column
str(myDat) # examine the structure of the data
'data.frame': 15 obs. of 2 variables:
$ Time : POSIXct, format: "2024-10-03 10:32:00" "2020-06-09 00:17:00" ...
$ Value: int 36 35 57 43 44 36 47 39 38 50 ...
You can extract parts of the date and time column with functions like month() for months, year() for years, etc. Notice that the function names are not plural with the lubridate package.
For example:
myDat$Months <- month(myDat$Time) # extract only the months
str(myDat) # examine the structure of the data
'data.frame': 15 obs. of 3 variables:
$ Time : POSIXct, format: "2024-10-03 10:32:00" "2020-06-09 00:17:00" ...
$ Value : int 36 35 57 43 44 36 47 39 38 50 ...
$ Months: num 10 6 10 10 6 12 3 4 11 11 ...
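The same pattern works for other components. The sketch below assumes the lubridate package is installed and uses two made-up text values in place of the data file:

```r
library(lubridate) # assumes lubridate is installed

# Hypothetical example values; ymd_hm() expects year, month, day, hour, minute
obsTime <- ymd_hm(c("2024/10/03 10:32", "2020/06/09 00:17"))

year(obsTime)   # 2024 2020
month(obsTime)  # 10 6
day(obsTime)    # 3 9
hour(obsTime)   # 10 0
minute(obsTime) # 32 17

# label = TRUE returns month names as an ordered factor instead of numbers
month(obsTime, label = TRUE)
```

Returning months as an ordered factor can be handy for plotting, because the months then sort in calendar order rather than alphabetically.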
Much more is available in the lubridate package, including determining durations and dealing with time zones. Check the lubridate package “cheat sheet” for more information.
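As a small taste of durations and time zones, here is a minimal sketch. It assumes lubridate is installed; the two time stamps and the "America/Halifax" time zone are made up for illustration and are not from the example data:

```r
library(lubridate) # assumes lubridate is installed

# Two hypothetical observation times, both recorded in UTC
startTime <- ymd_hm("2020-06-09 00:17", tz = "UTC")
endTime   <- ymd_hm("2020-06-10 06:17", tz = "UTC")

# Elapsed time between the two observations, in hours
elapsed <- difftime(endTime, startTime, units = "hours")
as.numeric(elapsed) # 30

# The same elapsed time as a lubridate duration (stored in seconds)
elapsedDuration <- as.duration(endTime - startTime)

# with_tz() shows the same instant on the clock of another time zone
localTime <- with_tz(startTime, tzone = "America/Halifax")
```

Note that with_tz() changes only how the time is displayed; the underlying instant is unchanged.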