R语言中和日期时间相关的一共有三种格式,分别是日期(date),日期时间(data-times)和时间(time).
一般来说,日期格式需要符合ISO8601
格式,这是一种国际上通用的日期时间格式,包含的内容从大到小包括:年,月,日,小时,分钟和秒.
library(readr)
parse_datetime("2019-11-11T2010")
## [1] "2019-11-11 20:10:00 UTC"
parse_datetime()
是readr
包中的用于解析日期的函数.
1 日期格式转换
使用as.Date()
函数将文本格式日期转换为date格式数据.一共需要两个参数:
x:文本格式日期.
format:文本格式日期的格式.
1.1 比如美国这边常见的月/日/年
.
date <- c("8/6/2014", "3/18/2015")
class(date)
## [1] "character"
date2 <- as.Date(x = date, format = "%m/%d/%Y")
date2
## [1] "2014-08-06" "2015-03-18"
class(date2)
## [1] "Date"
date <- c("08/06/2014", "03/18/2015")
date2 <- as.Date(x = date, format = "%m/%d/%Y")
date2
## [1] "2014-08-06" "2015-03-18"
class(date2)
## [1] "Date"
date <- c("08/06/14", "03/18/15")
date2 <- as.Date(x = date, format = "%m/%d/%y")
date2
## [1] "2014-08-06" "2015-03-18"
class(date2)
## [1] "Date"
注意可以看到,对于年
,有时候大家会省写,比如2015
写为15
.这时候在format
参数中代表年的位置应当由全写的Y
改为小写的y
.对于月
和日
则不存在这个问题.
1.2 国内常见常见的年/月/日
当然对于其他的类似使用/
分隔开日期形式,只要变换format
中的顺序即可.
date <- c("2015/8/6", "3/18/15")
class(date)
## [1] "character"
date2 <- as.Date(x = date, format = c("%Y/%m/%d" ,"%m/%d/%y"))
date2
## [1] "2015-08-06" "2015-03-18"
class(date2)
## [1] "Date"
可以看到对于格式不同的数据,可以将format
设置为向量.
1.3 常见日期各个元素的写法及相应format的写法
1.3.1 年
%Y: 4位完整数字的年.如
2019
%y: 2位数字的年.如
19
1.3.2 月
%m: 两位或者一位的月,如
01
或者10
.%b: 月份的简写,比如
Jan
.%B: 月份的全写,比如
January
.
1.3.3 日
%d: 两位或者一位的日,比如
1
或者28
.%d: 两位或者一位的日,比如
1
或者28
.
1.3.4 时间
%H: 0-23小时的时间格式.
%I: 0-12小时的时间格式.必须和
%p
联合使用.%p:
a.m.
和p.m.
%M: 分钟.
%S: 整数秒.
%OS: 真实秒.
%Z: 时区.必须是一个地名.如
America
或者Chicago
.
## locale-specific version of the date
format(Sys.Date(), "%a %b %d")
## [1] "Thu Jul 01"
x <- c("1jan1960", "2jan1960", "31mar1960", "30jul1960")
z <- as.Date(x, "%d%b%Y")
z
## [1] "1960-01-01" "1960-01-02" "1960-03-31" "1960-07-30"
## read in date/time info in format 'm/d/y'
dates <- c("02/27/92", "02/27/92", "01/14/92", "02/28/92", "02/01/92")
as.Date(dates, "%m/%d/%y")
## [1] "1992-02-27" "1992-02-27" "1992-01-14" "1992-02-28" "1992-02-01"
## date given as number of days since 1900-01-01 (a date in 1989)
as.Date(32768, origin = "1900-01-01")
## [1] "1989-09-19"
## Excel is said to use 1900-01-01 as day 1 (Windows default) or
## 1904-01-01 as day 0 (Mac default), but this is complicated by Excel
## incorrectly treating 1900 as a leap year.
## So for dates (post-1901) from Windows Excel
as.Date(35981, origin = "1899-12-30") # 1998-07-05
## [1] "1998-07-05"
## and Mac Excel
as.Date(34519, origin = "1904-01-01") # 1998-07-05
## [1] "1998-07-05"
## (these values come from http://support.microsoft.com/kb/214330)
## Experiment shows that Matlab's origin is 719529 days before ours,
## (it takes the non-existent 0000-01-01 as day 1)
## so Matlab day 734373 can be imported as
as.Date(734373, origin = "1970-01-01") - 719529 # 2010-08-23
## [1] "2010-08-23"
## (value from
## http://www.mathworks.de/de/help/matlab/matlab_prog/represent-date-and-times-in-MATLAB.html)
## Time zone effect
z <- ISOdate(2010, 04, 13, c(0,12)) # midnight and midday UTC
as.Date(z) # in UTC
## [1] "2010-04-13" "2010-04-13"
## these time zone names are common
as.Date(z, tz = "NZ")
## [1] "2010-04-13" "2010-04-14"
as.Date(z, tz = "HST") # Hawaii
## [1] "2010-04-12" "2010-04-13"