Question
I have a DataFrame:
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 982 entries, 2009-10-30 00:00:00 to 2012-12-16 00:00:00
Data columns (total 4 columns):
rain 981 non-null values
temp_max 982 non-null values
temp_min 982 non-null values
temp 982 non-null values
dtypes: float64(4)
To sum per year/month I use:
mdata = data.groupby([lambda x: x.year, lambda x: x.month]).agg([sum])
But I need a seasonal analysis (summer, winter, etc.), so how can I compute the sum over specific months, such as [1, 2, 3], for each year?
Thanks.
Recommended answer
Yes, one solution that seems neat to me is to use a seasons dictionary and then group the data using a function. Any function passed as a group key is called once per index value, and its return values are used as the group names.
import pandas as pd
import numpy as np
from pandas import DataFrame
import datetime
# Create a year's worth of data
base = datetime.date.today() - datetime.timedelta(365)
Datelist = [base + datetime.timedelta(days = x) for x in range(365)]
DF = DataFrame(np.random.rand(365), index = Datelist)
# Create a Seasonal Dictionary that will map months to seasons
SeasonDict = {11: 'Winter', 12: 'Winter', 1: 'Winter', 2: 'Spring', 3: 'Spring', 4: 'Spring', 5: 'Summer', 6: 'Summer', 7: 'Summer',
8: 'Autumn', 9: 'Autumn', 10: 'Autumn'}
# Write a function that will be used to group the data
def GroupFunc(x):
    return SeasonDict[x.month]
# Call the function with the groupby operation.
Grouped = DF.groupby(GroupFunc)
Grouped.sum()
The function takes each index value, looks up its month in the seasons dictionary, and returns the value stored under that month key. This value then becomes the group name.
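To make the per-index-value behaviour concrete, here is a minimal sketch on four made-up observations. The names `season_dict`, `small` and `group_func` are hypothetical, the values are invented for illustration, and the month-to-season mapping simply mirrors the one in the answer.

```python
import pandas as pd

# Hypothetical mapping, mirroring the SeasonDict grouping used in the answer
season_dict = {11: 'Winter', 12: 'Winter', 1: 'Winter',
               2: 'Spring', 3: 'Spring', 4: 'Spring',
               5: 'Summer', 6: 'Summer', 7: 'Summer',
               8: 'Autumn', 9: 'Autumn', 10: 'Autumn'}

# Four invented rows, only to show which rows end up in which group
idx = pd.to_datetime(['2012-01-15', '2012-02-10', '2012-03-05', '2012-12-01'])
small = pd.DataFrame({'rain': [1.0, 2.0, 3.0, 4.0]}, index=idx)

def group_func(ts):
    # ts is a single index value; the returned string becomes the group name
    return season_dict[ts.month]

season_totals = small.groupby(group_func).sum()
print(season_totals)
```

With this mapping the January and December rows share the 'Winter' group, while the February and March rows share 'Spring'.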
Alternatively, you can use a lambda as in your example (which is more concise, but I thought the version above would be easier to understand):
DF.groupby(lambda x: SeasonDict[x.month]).sum()
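Since the question asks for totals per year, the same idea can be extended by grouping on two keys at once. The following is only a sketch under assumptions: it presumes a `DatetimeIndex` like the one in the question, and `season_dict`, `df` and the values are hypothetical. Note that with a plain calendar-year key, November and December are summed together with January of the same year, not with the following January.

```python
import pandas as pd

# Hypothetical mapping, mirroring the SeasonDict grouping used in the answer
season_dict = {11: 'Winter', 12: 'Winter', 1: 'Winter',
               2: 'Spring', 3: 'Spring', 4: 'Spring',
               5: 'Summer', 6: 'Summer', 7: 'Summer',
               8: 'Autumn', 9: 'Autumn', 10: 'Autumn'}

# Invented observations spanning two years
idx = pd.to_datetime(['2011-01-10', '2011-12-20', '2012-01-05',
                      '2012-06-01', '2012-06-15'])
df = pd.DataFrame({'rain': [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)

# Build both group keys from the index without calling a Python function per row
years = df.index.year
seasons = df.index.month.map(season_dict)

# The result has a two-level index: (year, season)
per_year_season = df.groupby([years, seasons]).sum()
print(per_year_season)
```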
Additional code, as per the comments: it seems to me that you would be better off slicing the data. So you could do the following:
# Add a Season column by looking up the month of each index value
DF['Season'] = [SeasonDict[d.month] for d in DF.index]
DFWinter = DF[DF.Season == 'Winter']
Now you have a new data frame containing only the winter data, to play with as you wish. The difference is that groupby operations let you apply the same operation to all of the data, whereas it sounds like you want to investigate different parts of your data set in different ways. For that it is better to slice, in this case using Boolean slicing.
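For the specific case raised in the question (months [1, 2, 3] of each year), Boolean slicing can be combined with a per-year sum. This is a rough sketch assuming a `DatetimeIndex`; the names `df`, `mask`, `q1` and `q1_per_year` and the values are hypothetical.

```python
import pandas as pd

# Invented observations; only the January to March rows should be summed
idx = pd.to_datetime(['2011-01-10', '2011-03-01', '2011-07-04',
                      '2012-02-14', '2012-03-30', '2012-08-08'])
df = pd.DataFrame({'rain': [1.0, 2.0, 10.0, 3.0, 4.0, 20.0]}, index=idx)

# Boolean mask: True where the month of the index value is 1, 2 or 3
mask = df.index.month.isin([1, 2, 3])
q1 = df[mask]

# Sum the selected rows separately for each calendar year
q1_per_year = q1.groupby(q1.index.year).sum()
print(q1_per_year)
```

Changing the list passed to `isin` selects a different set of months without touching the rest of the code.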