Problem description
I have a list of lists that looks like:
c = [['470', '4189.0', 'asdfgw', 'fds'],
     ['470', '4189.0', 'qwer', 'fds'],
     ['470', '4189.0', 'qwer', 'dsfs fdv']
     ...]
c has about 30,000 interior lists. What I'd like to do is eliminate duplicates based on the 4th item of each interior list. So the list of lists above would look like:
c = [['470', '4189.0', 'asdfgw', 'fds'],['470', '4189.0', 'qwer', 'dsfs fdv'] ...]
Here is what I have so far:
d = []  # list that will contain condensed c
d.append(c[0])  # append first element, so I can compare lists
for bact in c:  # c is my list of lists with 30,000 interior lists
    for items in d:
        if bact[3] != items[3]:
            d.append(bact)
I think this should work, but it just runs and runs. I let it run for 30 minutes, then killed it. I don't think the program should take so long, so I'm guessing there is something wrong with my logic.
I have a feeling that creating a whole new list of lists is pretty stupid. Any help would be much appreciated, and please feel free to nitpick as I am learning. Also please correct my vocabulary if it is incorrect.
Recommended answer
I would do it like this:
seen = set()
cond = [x for x in c if x[3] not in seen and not seen.add(x[3])]
Explanation:
seen is a set which keeps track of the fourth elements of the sublists encountered so far. cond is the condensed list. If x[3] (where x is a sublist in c) is not in seen, x will be added to cond and x[3] will be added to seen.
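For readers who find the one-liner dense, the same idea can be written as an explicit loop. This is only a sketch: the function name dedupe_by_fourth_item is made up for illustration, and the sample data is the three rows shown in the question (the elided rows are left out).

```python
def dedupe_by_fourth_item(rows):
    """Keep the first row encountered for each distinct value of row[3].

    Hypothetical helper name, not part of the original answer.
    """
    seen = set()   # fourth items encountered so far
    cond = []      # condensed list, in original order
    for x in rows:
        key = x[3]
        if key not in seen:
            seen.add(key)
            cond.append(x)
    return cond


# The three sample rows from the question.
c = [['470', '4189.0', 'asdfgw', 'fds'],
     ['470', '4189.0', 'qwer', 'fds'],
     ['470', '4189.0', 'qwer', 'dsfs fdv']]

cond = dedupe_by_fourth_item(c)
print(cond)
```

Because membership tests on a set take roughly constant time on average, this makes a single pass over c instead of comparing every row against every row already kept.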
seen.add(x[3]) will return None, so not seen.add(x[3]) will always be True, but that part will only be evaluated if x[3] not in seen is True, since Python uses short-circuit evaluation. If the second condition gets evaluated, it will always return True and have the side effect of adding x[3] to seen. Here's another example of what's happening (print returns None and has the "side effect" of printing something):
>>> False and not print('hi')
False
>>> True and not print('hi')
hi
True
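The same short-circuit behaviour can be checked directly with a set. The snippet below is a small illustration, assuming Python 3; the names first and second are made up for the example.

```python
seen = {'fds'}

# 'fds' is already in seen, so the left operand is False and
# seen.add is never called.
first = 'fds' not in seen and not seen.add('fds')

# 'dsfs fdv' is new: the left operand is True, so seen.add runs,
# returns None, and `not None` makes the whole expression True.
second = 'dsfs fdv' not in seen and not seen.add('dsfs fdv')

print(first)         # False
print(second)        # True
print(sorted(seen))  # ['dsfs fdv', 'fds']
```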