
Create a summary description of a schedule given a list of shifts

Question

Assuming I have a list of shifts for an event (in the format start date/time, end date/time), is there some sort of algorithm I could use to create a generalized summary of the schedule? Most of the shifts usually fall into some common recurrence pattern (e.g. Mondays from 9:00 am to 1:00 pm, Tuesdays from 10:00 am to 3:00 pm, etc.). However, there can (and will) be exceptions to this rule (e.g. one of the shifts fell on a holiday and was rescheduled for the next day). It would be fine to exclude those from my "summary", as I'm looking to provide a more general answer to the question of when this event usually occurs.

I guess I'm looking for some sort of statistical method to determine the day and time occurrences and create a description based on the most frequent occurrences found in the list. Is there some sort of general algorithm for something like this? Has anyone created something similar?

Ideally I'm looking for a solution in C# or VB.NET, but don't mind porting from any other language.

Thanks in advance!

Solution

You may use Cluster Analysis.

Clustering is a way to segregate a set of data into similar components (subsets). The "similarity" concept involves some definition of "distance" between points. Many formulas for the distance exist, the most common being the usual Euclidean distance.
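
To make the "distance" idea concrete, here is a minimal Python sketch (not part of the original Mathematica session). It assumes each shift is represented as a point (day, start hour, end hour); the three sample points are made up for illustration.

```python
import math

# Each shift is a point: (day of week, start hour, end hour).
# These three sample shifts are hypothetical.
monday_a = (1, 10.0, 15.0)
monday_b = (1, 10.5, 14.5)   # the same shift with half-hour noise
wednesday = (3, 14.0, 17.0)  # a different shift on another day

def euclidean(p, q):
    """Straight-line distance between two shifts seen as points."""
    return math.dist(p, q)

print(euclidean(monday_a, monday_b))   # small: likely the same recurring shift
print(euclidean(monday_a, wednesday))  # larger: a different shift
```

Two noisy observations of the same shift end up close together, while different shifts are farther apart. That is the property the clustering relies on.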

Practical Case

Before pointing you to the quirks of the trade, let's show a practical case for your problem, so you can decide whether to dig into the algorithms and packages or discard them up front.

For simplicity, I modelled the problem in Mathematica, because Cluster Analysis is included in the software and very straightforward to set up.

First, generate the data. The format is { DAY, START TIME, END TIME }.
The start and end times have a random variable added {+half hour, zero, -half hour} to show the capability of the algorithm to cope with "noise".

There are three days, two shifts per day and one extra (the last one) "anomalous" shift, which starts at 7 AM and ends at 9 AM (poor guys!).

There are 150 events in each "normal" shift and only two in the exceptional one.

As you can see, some shifts are not very far apart from each other.

I include the code in Mathematica, in case you have access to the software. I'm trying to avoid using the functional syntax, to make the code easier to read for "foreigners".

Here is the data generation code:

Rn[] := 0.5 * RandomInteger[{-1, 1}];

monshft1 = Table[{ 1 , 10 + Rn[] , 15 + Rn[] }, {150}];  (* 1: Monday 10-15 *)
monshft2 = Table[{ 1 , 12 + Rn[] , 17 + Rn[] }, {150}];  (* 2: Monday 12-17 *)
wedshft1 = Table[{ 3 , 10 + Rn[] , 15 + Rn[] }, {150}];  (* 3: Wednesday 10-15 *)
wedshft2 = Table[{ 3 , 14 + Rn[] , 17 + Rn[] }, {150}];  (* 4: Wednesday 14-17 *)
frishft1 = Table[{ 5 , 10 + Rn[] , 15 + Rn[] }, {150}];  (* 5: Friday 10-15 *)
frishft2 = Table[{ 5 , 11 + Rn[] , 15 + Rn[] }, {150}];  (* 6: Friday 11-15 *)
monexcp  = Table[{ 1 , 7  + Rn[] , 9  + Rn[] }, {2}];    (* 7: the rare exception *)

Now we join the data, obtaining one big dataset:

data = Join[monshft1, monshft2, wedshft1, wedshft2, frishft1, frishft2, monexcp];
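
If you do not have Mathematica, the same dataset can be sketched in Python. This is an illustrative port rather than the original code: the names `rn`, `make_shift` and `make_data`, and the fixed seed, are assumptions made for the example.

```python
import random

def rn(rng):
    """Half-hour noise: -0.5, 0 or +0.5, mirroring Rn[] above."""
    return 0.5 * rng.randint(-1, 1)

def make_shift(rng, day, start, end, count):
    """`count` noisy observations of one shift as (day, start, end) tuples."""
    return [(day, start + rn(rng), end + rn(rng)) for _ in range(count)]

def make_data(seed=0):
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable runs
    spec = [
        (1, 10, 15, 150),  # Monday 10-15
        (1, 12, 17, 150),  # Monday 12-17
        (3, 10, 15, 150),  # Wednesday 10-15
        (3, 14, 17, 150),  # Wednesday 14-17
        (5, 10, 15, 150),  # Friday 10-15
        (5, 11, 15, 150),  # Friday 11-15
        (1, 7, 9, 2),      # the rare exception
    ]
    data = []
    for day, start, end, count in spec:
        data.extend(make_shift(rng, day, start, end, count))
    return data

data = make_data()
print(len(data))  # 902 = 6 * 150 + 2
```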

Let's run a cluster analysis for the data:

clusters = FindClusters[data, 7, Method->{"Agglomerate","Linkage"->"Complete"}]

"Agglomerate" and "Linkage" -> "Complete" are two fine tuning options of the clustering methods implemented in Mathematica. They just specify we are trying to find very compact clusters.

I asked it to detect 7 clusters. If the right number of shifts is unknown, you can try several reasonable values and compare the results, or let the algorithm select a suitable value.
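
For readers without Mathematica, the idea behind agglomerative clustering with complete linkage can be sketched in plain Python. This is a naive illustration, not the implementation behind `FindClusters`, and it is far too slow for large inputs. The function name `complete_linkage` and the small hand-made sample (three well-separated regular shifts plus a rare exception) are assumptions made for the example.

```python
import math

def complete_linkage(points, k):
    """Agglomerative clustering: repeatedly merge the two closest clusters
    until k remain. With complete linkage, the distance between two clusters
    is the LARGEST distance between any pair of their points."""
    clusters = [[p] for p in points]          # start with one cluster per point
    while len(clusters) > k:
        best = None                           # (distance, i, j) of the closest pair
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(math.dist(p, q)
                        for p in clusters[i] for q in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters[j])       # merge cluster j into cluster i
        del clusters[j]
    return clusters

# Hand-made sample (hypothetical): three regular shifts with half-hour noise
# plus two observations of a rare early-morning exception.
noise = [(-0.5, 0.0), (0.0, 0.5), (0.5, -0.5), (0.0, 0.0)]
regular = [(1, 10, 15), (1, 12, 17), (3, 14, 17)]
points = [(d, s + a, e + b) for d, s, e in regular for a, b in noise]
points += [(1, 7.0, 9.5), (1, 7.5, 9.0)]

clusters = complete_linkage(points, 4)
print(sorted(len(c) for c in clusters))  # [2, 4, 4, 4]
```

In this sample every shift is tighter than the gap to its nearest neighbour, so the four groups are recovered exactly. When two shifts overlap once noise is added, the result depends on the data and should be checked by eye.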

We can get a chart with the results, each cluster in a different color (don't mind the code)

ListPointPlot3D[ clusters, 
           PlotStyle->{{PointSize[Large], Pink},    {PointSize[Large], Green},   
                       {PointSize[Large], Yellow},  {PointSize[Large], Red},  
                       {PointSize[Large], Black},   {PointSize[Large], Blue},   
                       {PointSize[Large], Purple},  {PointSize[Large], Brown}},  
                       AxesLabel -> {"DAY", "START TIME", "END TIME"}]  

In the resulting 3D plot you can see our seven clusters clearly apart.

That solves part of your problem: identifying the data. Now you also want to be able to label it.

So, we'll get each cluster and take means (rounded):

Table[Round[Mean[clusters[[i]]]], {i, 7}]  

The result is:

Day   Start  End
{"1", "10", "15"},
{"1", "12", "17"},
{"3", "10", "15"},
{"3", "14", "17"},
{"5", "10", "15"},
{"5", "11", "15"},
{"1",  "7",  "9"}

And with that you recover your seven classes.
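
The labelling step can be taken one step further towards the text summary asked for in the question. The Python sketch below assumes the clusters are already available as lists of (day, start, end) tuples. The `min_share` threshold used to discard rare clusters, the `DAY_NAMES` mapping (1 = Monday) and the sample clusters are all assumptions made for illustration, not part of the original analysis.

```python
from statistics import mean

# Assumed day numbering, matching the example data: 1 = Monday ... 7 = Sunday.
DAY_NAMES = {1: "Monday", 2: "Tuesday", 3: "Wednesday", 4: "Thursday",
             5: "Friday", 6: "Saturday", 7: "Sunday"}

def label(cluster):
    """Rounded column-wise mean, like Round[Mean[cluster]]."""
    return tuple(round(mean(column)) for column in zip(*cluster))

def summarize(clusters, min_share=0.05):
    """Describe each cluster, skipping those holding less than `min_share`
    of all shifts (a hypothetical threshold for 'exceptions')."""
    total = sum(len(c) for c in clusters)
    lines = []
    for cluster in clusters:
        if len(cluster) / total < min_share:
            continue  # too rare: treat as an exception, not part of the pattern
        day, start, end = label(cluster)
        lines.append(f"{DAY_NAMES[day]}s from {start}:00 to {end}:00")
    return lines

# Hypothetical clusters, as a clustering step might return them.
monday = [(1, 9.5, 15.0), (1, 10.0, 15.5), (1, 10.5, 14.5), (1, 10.0, 15.0)] * 10
wednesday = [(3, 14.0, 17.5), (3, 13.5, 17.0), (3, 14.5, 16.5), (3, 14.0, 17.0)] * 10
exception = [(1, 7.0, 9.5)]

for line in summarize([monday, wednesday, exception]):
    print(line)
# Mondays from 10:00 to 15:00
# Wednesdays from 14:00 to 17:00
```

Filtering by cluster size is what removes the rescheduled or one-off shifts from the summary. The right threshold depends on how many shifts you have and how often exceptions occur.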

Now, perhaps you want to classify the shifts regardless of the day. If the same people do the same task at the same time every day, it is not useful to call it "Monday shift from 10 to 15", because it also happens on Wednesdays and Fridays (as in our example).

Let's analyze the data disregarding the first column:

clusters=
 FindClusters[Take[data, All, -2],Method->{"Agglomerate","Linkage"->"Complete"}];

In this case, we are not selecting the number of clusters to retrieve, leaving the decision to the package.

The result shows that five clusters have been identified.

Let's try to "label" them as before:

Grid[Table[Round[Mean[clusters[[i]]]], {i, 5}]]

The result is:

 START  END
{"10", "15"},
{"12", "17"},
{"14", "17"},
{"11", "15"},
{ "7",  "9"}

Which is exactly what we "suspected": there are repeated events each day at the same time that could be grouped together.

Edit: Overnight Shifts and Normalization

If you have (or plan to have) shifts that start on one day and end on the following day, it's better to model

{Start-Day Start-Hour Length}  // Correct!

than

{Start-Day Start-Hour End-Day End-Hour}  // Incorrect!  

That's because, as with any statistical method, the correlation between the variables must be made explicit, or the method fails miserably. Here the end day and end hour both depend on the start and on the length of the shift, so storing the length directly removes that dependency. The principle could run something like "keep your candidate data normalized". Both concepts are almost the same (the attributes should be independent).
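
Since the question starts from start and end date/times, the conversion to this representation can be sketched as follows in Python. The function name `to_features` and the sample dates are hypothetical. The weekday numbering uses `isoweekday()` (1 = Monday), which matches the day codes used in the example data.

```python
from datetime import datetime

def to_features(start, end):
    """Turn one shift into (start day, start hour, length in hours).
    The day is the ISO weekday: 1 = Monday ... 7 = Sunday."""
    length_hours = (end - start).total_seconds() / 3600
    start_hour = start.hour + start.minute / 60
    return (start.isoweekday(), start_hour, length_hours)

# A day shift and an overnight shift (sample dates chosen for illustration).
day_shift = to_features(datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 15, 0))
night_shift = to_features(datetime(2024, 1, 5, 22, 0), datetime(2024, 1, 6, 6, 30))

print(day_shift)    # (1, 10.0, 5.0)
print(night_shift)  # (5, 22.0, 8.5)
```

The overnight shift needs no special case: crossing midnight only makes the length larger, while the start day and start hour stay comparable with every other shift.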

--- Edit end ---

By now I guess you understand pretty well what kind of things you can do with this kind of analysis.

Some references

  1. Of course, Wikipedia and its "references" and "further reading" sections are a good guide.
  2. A nice video here shows the capabilities of Statsoft, and it can give you many ideas about other things you can do with the algorithm.
  3. Here is a basic explanation of the algorithms involved.
  4. Here you can find the impressive functionality of R for Cluster Analysis (R is a VERY good option).
  5. Finally, here you can find a long list of free and commercial software for statistics in general, including clustering.

HTH!
