Problem description
The data is at 15-minute intervals:
Time              Value
2010-01-01 00:15  3
2010-01-01 00:30  2
2010-01-01 00:45  4
2010-01-01 01:00  5
2010-01-01 01:15  1
2010-01-01 01:30  3
2010-01-01 01:45  4
2010-01-01 02:00  12
2010-01-01 02:15  13
2010-01-01 02:30  12
2010-01-01 02:45  14
2010-01-01 03:00  15
2010-01-01 03:15  3
2010-01-01 03:30  2
2010-01-01 03:45  3
2010-01-01 04:00  5
..........
..........
..........
2010-01-02 00:00
Typically there will be 96 points.
Looking at the values, those from 00:15 to 01:45 are close to each other, as are those from 02:00 to 03:00 and those from 03:15 to 04:00.
Based on the "close to each other" rule, I want the data to be "grouped" into 3 parts:
- 00:15 to 01:45
- 02:00 to 03:00
- 03:15 to 04:00
Note that the data could be random and may fall into more than 3 parts under the rule above, but there should never be more than 10 parts. The grouping must also honor the time sequence: for example, you cannot put 00:15, 02:30 and 04:45 into one group, because these 3 points are NOT consecutive.
Please share some thoughts on how to implement this in T-SQL.
Update: the values could also be:
Time              Value
2010-01-01 00:15  3
2010-01-01 00:30  2
2010-01-01 00:45  4
2010-01-01 01:00  5
2010-01-01 01:15  1
2010-01-01 01:30  3
2010-01-01 01:45  4
2010-01-01 02:00  12
2010-01-01 02:15  13
2010-01-01 02:30  4   -- suddenly decreased
2010-01-01 02:45  14
2010-01-01 03:00  15
2010-01-01 03:15  3
2010-01-01 03:30  2
2010-01-01 03:45  3
2010-01-01 04:00  5
..........
..........
..........
2010-01-02 00:00
In this kind of situation, 02:30 should not be put in a group of its own: every group must contain at least 3 points, so that point (02:30) goes into the previous group (from 02:00 to 03:00).
Since your question changed so much, here is a new answer to the new question. I have only included the code.
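-- Sample data: one reading every 15 minutes
declare @t table(time datetime, value int)
-- Allowed distance between log(value) of nearby points; widened below if needed
declare @variation float
set @variation = 2
set nocount on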
insert @t values('2010-01-01 00:15',3)
insert @t values('2010-01-01 00:30',2)
insert @t values('2010-01-01 00:45',4)
insert @t values('2010-01-01 01:00',5)
insert @t values('2010-01-01 01:15',1)
insert @t values('2010-01-01 01:30',3)
insert @t values('2010-01-01 01:45',4)
insert @t values('2010-01-01 02:00',52)
insert @t values('2010-01-01 02:15',5)
insert @t values('2010-01-01 02:30',52)
insert @t values('2010-01-01 02:45',54)
insert @t values('2010-01-01 03:00',55)
insert @t values('2010-01-01 03:15',3)
insert @t values('2010-01-01 03:30',2)
insert @t values('2010-01-01 03:45',3)
insert @t values('2010-01-01 04:00',5)
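-- One row per group: first and last timestamp
declare @result table(mintime datetime, maxtime datetime)

a:
delete @result

-- A full day of 96 rows stays under the default recursion limit of 100
;with t as
(
    -- Number the rows of the chosen day in time order and compare values
    -- on a log scale (values must be positive)
    select *, rn = row_number() over(order by time), log(value) lv
    from @t
    where datediff(day, time, '2010-01-01') = 0
), a as
(
    -- Anchor: the first row opens group 0
    select time, lv, rn, 0 grp from t where rn = 1
    union all
    -- Walk the rows in order; a row keeps the current group when any of the
    -- three preceding rows has a log value within @variation of its own,
    -- otherwise it opens a new group
    select t1.time, t1.lv, t1.rn,
        case when exists (select 1 from t t2
                          where t1.rn between t2.rn + 1 and t2.rn + 3
                            and t2.lv between t1.lv - @variation and t1.lv + @variation)
             then grp else grp + 1 end
    from t t1
    join a on t1.rn = a.rn + 1
)
insert @result
select min(time), max(time) from a group by grp

-- More than 10 groups: loosen the tolerance and try again
if @@rowcount > 10
begin
    set @variation = @variation + .5
    goto a
end

select * from @result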
Result:
mintime                  maxtime
2010-01-01 00:15:00.000  2010-01-01 01:45:00.000
2010-01-01 02:00:00.000  2010-01-01 03:00:00.000
2010-01-01 03:15:00.000  2010-01-01 04:00:00.000
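To see why the rows fall into these three groups, it can help to look at the per-row test on its own. The query below is only a sketch for inspecting the data, not part of the solution: it applies the same check as the recursive step (is any of the three preceding rows within @variation on the log scale?) and prints the outcome for each row. The `decision` column and its labels are made up for illustration, and the multi-row `values` and inline `declare ... =` assume SQL Server 2008 or later.

```sql
-- Sketch only: same sample data and tolerance as above
declare @t table(time datetime, value int);
declare @variation float = 2;

insert @t values
    ('2010-01-01 00:15', 3),  ('2010-01-01 00:30', 2),
    ('2010-01-01 00:45', 4),  ('2010-01-01 01:00', 5),
    ('2010-01-01 01:15', 1),  ('2010-01-01 01:30', 3),
    ('2010-01-01 01:45', 4),  ('2010-01-01 02:00', 52),
    ('2010-01-01 02:15', 5),  ('2010-01-01 02:30', 52),
    ('2010-01-01 02:45', 54), ('2010-01-01 03:00', 55),
    ('2010-01-01 03:15', 3),  ('2010-01-01 03:30', 2),
    ('2010-01-01 03:45', 3),  ('2010-01-01 04:00', 5);

;with t as
(
    -- Row numbers in time order, values on a log scale
    select time, value,
           rn = row_number() over(order by time),
           lv = log(value)
    from @t
)
select t1.time, t1.value, round(t1.lv, 2) as lv,
       -- 'decision' is a hypothetical label column, for inspection only
       case when t1.rn = 1 then 'first row'
            when exists (select 1 from t t2
                         where t2.rn between t1.rn - 3 and t1.rn - 1
                           and t2.lv between t1.lv - @variation
                                         and t1.lv + @variation)
            then 'stays in group'
            else 'starts new group'
       end as decision
from t t1
order by t1.time;
```

On this sample only 02:00 and 03:15 should come back as 'starts new group'. The 5 at 02:15 does not open a group because it is still close to the 3 and 4 just before 02:00, and the 52 at 02:30 does not because the 52 at 02:00 is within the three-row look-back.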