Author: peterwu76 (金冈)
Board: R_Language
Title: Re: [Question] Json to R and Data table/matrix arrange
Date: Wed Jun 7 09:32:12 2017
Thanks to celestialgod for the help.
Attached is my final code for downloading the historical data, so you can also see what I was trying to do.
Easier-to-read version:
https://hackmd.io/KwU2CMDMENQWmAZgEwGM4BZrmXcB2AThDmn2GQBNxKAGDVYYIA==?both
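Notes:
I originally also considered fetching real-time data, but the newest record in the real-time feed has the same timestamp as the newest record in the historical data,
and that timestamp lags the time of the request by roughly 8 hours.
For example, when I run this code at 09:00, the last (newest) record returned is stamped
around 01:00.
So I gave up on fetching "real-time" data and simply pull the historical data instead.
My final code for historical data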
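One possible reading of the 8-hour gap, not confirmed anywhere in this thread, is a time zone offset: if the feed reports its timestamps in UTC, a record stamped 01:00 corresponds to 09:00 on a Taiwan wall clock (UTC+8). A minimal base R sketch of that conversion, using a made-up timestamp:

```r
# Hypothetical timestamp, ASSUMED to be reported by the feed in UTC
latest_utc <- as.POSIXct("2017-06-07 01:00:00", tz = "UTC")

# Same instant, displayed in Taipei local time (UTC+8)
latest_taipei <- format(latest_utc, "%Y-%m-%d %H:%M", tz = "Asia/Taipei")
print(latest_taipei)
```

If the printed local time matches the moment the data was fetched, the feed is probably current and only the display time zone differs.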
# Get data from the JSON link
library(jsonlite)
url <- "https://data.lass-net.org/data/history.php?device_id=74DA38C7D1D2"
x <- fromJSON(url)
# Arrange the feeds into one table
library(data.table)
library(lubridate)
outDT <- rbindlist(x$feeds$AirBox)
# Add metadata columns and parse the timestamps with the lubridate package
outDT[ , `:=`(source = x$source, version = ymd_hms(x$version),
              device_id = x$device_id, timestamp = ymd_hms(timestamp))]
# Keep only the columns of interest, selected by name
headers <- c("timestamp", "s_d0", "s_t0", "s_h0", "date", "time",
             "device_id", "gps_lon", "gps_lat", "version")
sortD <- subset(outDT, select = headers)
# rename column names
colnames(sortD)[which(names(sortD) == "s_d0")] <- "PM2.5"
colnames(sortD)[which(names(sortD) == "s_t0")] <- "Temperature"
colnames(sortD)[which(names(sortD) == "s_h0")] <- "Humidity"
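# Sort data by timestamp
# ymd_hms() already returns POSIXct in UTC, so this conversion is only a safeguard
sortD$timestamp <- as.POSIXct(sortD$timestamp, tz = 'UTC')
class(sortD$timestamp)
Final_data <- sortD[order(sortD$timestamp)]
View(Final_data)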
# Output data (the file name carries the download time)
outname <- sprintf("AirBox_74DA38C7D1D2_%s.csv", format(Sys.time(), "%Y%m%d%H%M"))
outfile <- paste0("D:\\AirBoxTest\\", outname)
write.csv(Final_data, file = outfile)
#################################
# Hourly average and output
#################################
# cut() bins each timestamp into its hour; aggregate() averages within each bin
Final_data_hourly <- aggregate(list(PM2.5 = Final_data$PM2.5,
                                    Humidity = Final_data$Humidity,
                                    Temperature = Final_data$Temperature),
                               list(hourofday = cut(Final_data$timestamp, "1 hour")),
                               mean)
# Output data (hourly averages)
outname <- sprintf("AirBox_74DA38C7D1D2_Hourly_%s.csv",
                   format(Sys.time(), "%Y%m%d%H%M"))
outfile <- paste0("D:\\AirBoxTest\\", outname)
write.csv(Final_data_hourly, file = outfile)
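To see what the cut() plus aggregate() step above does, here is a small base R sketch on made-up readings. The `toy` data frame and its values are purely illustrative and are not taken from the AirBox feed.

```r
# Three made-up readings: two fall in the 00:00 hour, one in the 01:00 hour
ts <- as.POSIXct(c("2017-06-07 00:05:00",
                   "2017-06-07 00:35:00",
                   "2017-06-07 01:10:00"), tz = "UTC")
toy <- data.frame(timestamp = ts, PM2.5 = c(10, 20, 30))

# cut() assigns each timestamp to its hour; aggregate() averages per hour
toy_hourly <- aggregate(list(PM2.5 = toy$PM2.5),
                        list(hourofday = cut(toy$timestamp, "1 hour")),
                        mean)
print(toy_hourly)
```

The result has one row per hour, with the two readings from the first hour averaged together.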
--
※ Origin: PTT (ptt.cc), From: 140.112.1.187
※ Article URL: https://webptt.com/cn.aspx?n=bbs/R_Language/M.1496799136.A.80A.html
1F:→ carl090105: Two suggestions 06/07 12:47
2F:→ carl090105: 1. For the subset you can use: outDT[, c(headers), with=FALSE] 06/07 12:48
3F:→ carl090105: 2. For the rename you can use: data.table::setnames 06/07 12:49
4F:→ peterwu76: Thanks for suggesting a better approach!! 06/07 14:19
5F:→ clansoda: first point could be outDT[, .(headers)] 06/07 16:00
6F:→ clansoda: it could be more concise 06/07 16:00
7F:→ carl090105: I think if header is a string vector then in 06/07 20:26
8F:→ carl090105: data.table v1.10.2 can use DT[, ..header] 06/07 20:26
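9F:→ celestialgod: For the rename, you can actually use match to change all three names at once 06/07 20:29
10F:→ celestialgod: Doing it your way copies the data.frame three times, which is pretty scary 06/07 20:29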
11F:→ peterwu76: Thanks! I will try to revise it! :) 06/10 18:14
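The suggestions in the comments can be sketched roughly as follows. This is a minimal illustration on a made-up table (`toyDT` and its values are hypothetical), and the match-based variant is only one guess at what celestialgod had in mind. As far as I know, `outDT[, .(headers)]` would build a column out of the character vector itself rather than select the named columns, which is why the `with = FALSE` and `..headers` forms are shown instead.

```r
library(data.table)

# Made-up stand-in for outDT; values are for illustration only
toyDT <- data.table(
  timestamp = c("2017-06-07T00:05:00Z", "2017-06-07T00:10:00Z"),
  s_d0 = c(12, 15),
  s_t0 = c(27.1, 27.3),
  s_h0 = c(80, 79),
  app  = c("AirBox", "AirBox")
)
headers <- c("timestamp", "s_d0", "s_t0", "s_h0")

# Select the columns named in a character vector
picked  <- toyDT[, headers, with = FALSE]
# Same selection with the `..` prefix (data.table v1.10.2 or later, per the comment above)
picked2 <- toyDT[, ..headers]

old_names <- c("s_d0", "s_t0", "s_h0")
new_names <- c("PM2.5", "Temperature", "Humidity")

# Rename all three columns in one call; setnames() modifies the table by reference
setnames(picked, old = old_names, new = new_names)

# match-based variant: look up the column positions once, then rename by position
idx <- match(old_names, names(picked2))
setnames(picked2, idx, new_names)

print(names(picked))
```

Both variants rename in place, so the table is not copied once per column as it is with repeated `colnames(sortD)[...] <-` assignments.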