Problem description
I recently switched from R to Python and have had some trouble getting used to data frames again after working with R's data.table. I'd like to take a column of strings, check each one against a value, and then count the matches, broken down by user. So I would like to take this data:
   A_id  B        C
1: a1    "up"     100
2: a2    "down"   102
3: a3    "up"     100
3: a3    "up"     250
4: a4    "left"   100
5: a5    "right"  102
And return:
   A_id_grouped  sum_up  sum_down  ...  over_200_up
1: a1            1       0         ...  0
2: a2            0       1         ...  0
3: a3            2       0         ...  1
4: a4            0       0         ...  0
5: a5            0       0         ...  0
Previously I did this in R (using data.table):
> DT[, list(sum_up = sum(B == "up"),
+           sum_down = sum(B == "down"),
+           ...,
+           over_200_up = sum(B == "up" & C > 200)),
+    by = list(A_id_grouped = A_id)]
However, all of my recent attempts in Python have failed:
DT.agg({"D": [np.sum(DT[DT["B"]=="up"]),np.sum(DT[DT["B"]=="up"])], ...
"C": np.sum(DT[(DT["B"]=="up") & (DT["C"]>200)])
})
Thank you in advance! It seems like a simple question, but I couldn't find an answer anywhere.
Recommended answer
To complement unutbu's answer, here's an approach using apply on the groupby object.
>>> df.groupby('A_id').apply(lambda x: pd.Series(dict(
...     sum_up=(x.B == 'up').sum(),
...     sum_down=(x.B == 'down').sum(),
...     over_200_up=((x.B == 'up') & (x.C > 200)).sum()
... )))
      over_200_up  sum_down  sum_up
A_id
a1              0         0       1
a2              0         1       0
a3              1         0       2
a4              0         0       0
a5              0         0       0
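
If you'd rather avoid apply, one possible alternative (a sketch of my own, not taken from unutbu's answer) is to turn each condition into a 0/1 indicator column first and then let groupby sum them. The frame below is rebuilt from the sample data in the question; the names flag_cols and result are just illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "A_id": ["a1", "a2", "a3", "a3", "a4", "a5"],
    "B": ["up", "down", "up", "up", "left", "right"],
    "C": [100, 102, 100, 250, 100, 102],
})

# One 0/1 indicator column per condition; summing them per group
# gives the number of rows that satisfy each condition.
flags = df.assign(
    sum_up=(df["B"] == "up").astype(int),
    sum_down=(df["B"] == "down").astype(int),
    over_200_up=((df["B"] == "up") & (df["C"] > 200)).astype(int),
)

flag_cols = ["sum_up", "sum_down", "over_200_up"]  # illustrative name
result = flags.groupby("A_id")[flag_cols].sum()
print(result)
```

This keeps the per-row comparisons vectorized over the whole frame instead of running a Python lambda once per group, which tends to matter on larger data. The column order follows flag_cols rather than the alphabetical order shown in the output above.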