Question
UPDATED based on Lennart Regebro's answer
Suppose you iterate through a dictionary, and sometimes need to delete an element. The following is very efficient:
    remove = []
    for k, v in dict_.items():
        if condition(k, v):
            remove.append(k)
            continue
        # do other things you need to do in this loop
    for k in remove:
        del dict_[k]
The only overhead here is building the list of keys to remove; unless it grows large compared to the dictionary size, it's not an issue. However, this approach requires some extra coding, so it's not very popular.
The popular dictionary comprehension approach:
    dict_ = {k: v for k, v in dict_.items() if not condition(k, v)}
    for k, v in dict_.items():
        # do other things you need to do in this loop
        pass
results in a full dictionary copy, and so has the risk of a silly performance hit if dictionaries grow large or the containing function is called often.
A much better approach is to copy only the keys rather than the whole dictionary:
    for k in list(dict_.keys()):
        if condition(k, dict_[k]):
            del dict_[k]
            continue
        # do other things you need to do in this loop
(Note that all code examples are in Python 3, so keys() and items() return views, not copies.)
In most cases, it won't hurt performance that much, since the time to check even the simplest condition (not to mention other stuff you're doing in the loop) is usually greater than the time to add one key to a list.
Still, I am wondering if it's possible to avoid even that with a custom dictionary that allows deletions while iterating:
    for k, v in dict_.items():
        if condition(k, v):
            del dict_[k]
            continue
        # do other things you need to do in this loop
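For reference, a built-in dict rejects this pattern: once the dictionary changes size, the next step of the iteration raises RuntimeError. A minimal demonstration, where the sample data and the `condition` predicate are placeholders chosen for illustration:

```python
def condition(k, v):
    # Hypothetical predicate: drop entries with an even value.
    return v % 2 == 0

dict_ = {"a": 2, "b": 3, "c": 4}
error_message = None

try:
    for k, v in dict_.items():
        if condition(k, v):
            del dict_[k]   # changes the dict's size during iteration
            continue
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)
```

The deletion itself succeeds; it is the following call to the iterator that fails, which is why the workarounds above either defer the deletions or iterate over a copy of the keys.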
Perhaps an iterator could always look ahead, so that when __next__ is called, the iterator knows where to go without even looking at the current element (it would only need to look at the element when it first gets to it). And if there is no next element, the iterator could just set a flag that would cause a StopIteration exception to be raised whenever __next__ is called again.
If the element the iterator tries to advance to turns out to be deleted, it's fine to raise an exception; there is no need to support deletions while multiple iterations are going on simultaneously.
Are there any problems with this approach?
One problem is that I'm not sure it can be done with no material overhead compared to the existing dict; otherwise, it would be faster to use the list(dict_) approach!
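The look-ahead idea could be sketched roughly as follows. This is only an illustration under stated assumptions, not a drop-in dict replacement: `LookaheadDict` and `_Node` are hypothetical names, the mapping keeps its entries in a doubly linked list next to a plain dict, and only the handful of methods needed for the demonstration are implemented. The extra bookkeeping on every insert and delete is exactly the kind of overhead the question worries about.

```python
class _Node:
    """One entry of the (hypothetical) linked list of items."""
    __slots__ = ("key", "value", "prev", "next", "alive")

    def __init__(self, key=None, value=None):
        self.key = key
        self.value = value
        self.prev = self.next = None
        self.alive = True


class LookaheadDict:
    """Sketch of a mapping whose items() iterator pre-fetches the next node."""

    def __init__(self, **kwargs):
        self._nodes = {}                  # key -> node
        self._head = _Node()              # sentinel of a circular list
        self._head.prev = self._head.next = self._head
        for key, value in kwargs.items():
            self[key] = value

    def __setitem__(self, key, value):
        node = self._nodes.get(key)
        if node is not None:
            node.value = value
            return
        node = _Node(key, value)
        last = self._head.prev
        last.next = node
        node.prev = last
        node.next = self._head
        self._head.prev = node
        self._nodes[key] = node

    def __getitem__(self, key):
        return self._nodes[key].value

    def __delitem__(self, key):
        node = self._nodes.pop(key)       # raises KeyError if missing
        node.prev.next = node.next
        node.next.prev = node.prev
        node.alive = False                # the node keeps its own links

    def __len__(self):
        return len(self._nodes)

    def items(self):
        upcoming = self._head.next        # the looked-ahead node
        while upcoming is not self._head:
            if not upcoming.alive:
                raise RuntimeError("next element was deleted during iteration")
            current = upcoming
            upcoming = current.next       # advance before handing out current
            yield current.key, current.value


d = LookaheadDict(a=1, b=2, c=3)
kept = []
for k, v in d.items():
    if v == 2:
        del d[k]                          # deleting the current element is fine
        continue
    kept.append(k)
print(kept, len(d))
```

Because the iterator has already stepped past the current node before yielding it, deleting that node does not disturb the iteration. Deleting the node the iterator has looked ahead to raises RuntimeError, matching the behaviour the question says is acceptable.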
Update:
I tried all the versions. I don't report the timings, since they are clearly very dependent on the exact situation. But it seems safe to say that in many cases, the fastest approach is likely to be list(dict_). After all, if you think about it, the copy is the fastest operation that grows linearly with the size of the list; almost any other overhead, as long as it's also proportional to the list size, is likely to be bigger.
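A comparison of this kind could be set up along these lines with the standard timeit module. The predicate, dictionary size, and repeat count are arbitrary assumptions, and no results are claimed here, since the numbers depend heavily on the data and the condition:

```python
import timeit

def condition(k, v):
    # Hypothetical predicate: remove roughly one entry in ten.
    return v % 10 == 0

def remove_with_list(d):
    remove = []
    for k, v in d.items():
        if condition(k, v):
            remove.append(k)
    for k in remove:
        del d[k]
    return d

def remove_with_key_copy(d):
    for k in list(d):
        if condition(k, d[k]):
            del d[k]
    return d

def remove_with_comprehension(d):
    return {k: v for k, v in d.items() if not condition(k, v)}

source = {i: i for i in range(10_000)}
candidates = (remove_with_list, remove_with_key_copy, remove_with_comprehension)

for fn in candidates:
    # Each run works on a fresh copy; the cost of dict(source) is included
    # in every measurement, so only the differences are meaningful.
    seconds = timeit.timeit(lambda: fn(dict(source)), number=20)
    print(f"{fn.__name__}: {seconds:.4f} s")
```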
I really like all the ideas, but since I have to select only one, I'm accepting the context manager solution since it allows using the dictionary as either normal or "enhanced" with very small code changes.
Recommended answer
As you note, you can store the items to delete somewhere and defer their deletion until later. The problem then becomes when to purge them and how to make sure that the purge method eventually gets called. The answer to this is a context manager which is also a subclass of dict.
    class dd_dict(dict):    # the dd is for "deferred delete"
        _deletes = None     # None means deletes are applied immediately

        def __delitem__(self, key):
            if key not in self:
                raise KeyError(str(key))
            if self._deletes is None:
                dict.__delitem__(self, key)
            else:
                self._deletes.add(key)

        def __enter__(self):
            self._deletes = set()
            return self

        def __exit__(self, type, value, tb):
            # purge everything that was marked for deletion inside the block
            for key in self._deletes:
                try:
                    dict.__delitem__(self, key)
                except KeyError:
                    pass
            self._deletes = None
Usage:
    # make the dict and do whatever to it
    ddd = dd_dict(a=1, b=2, c=3)

    # now iterate over it, deferring deletes
    with ddd:
        for k, v in ddd.items():
            if k == "a":
                del ddd[k]
                print(ddd)    # shows that "a" is still there

    print(ddd)    # shows that "a" has been deleted
If you're not in a with block, of course, deletes are immediate; as this is a dict subclass, it works just like a regular dict outside of a context manager.
You could also implement this as a wrapper class for a dictionary:
    class deferring_delete(object):
        def __init__(self, d):
            self._dict = d

        def __enter__(self):
            self._deletes = set()
            return self

        def __exit__(self, type, value, tb):
            # apply the deferred deletes to the wrapped dict
            for key in self._deletes:
                try:
                    del self._dict[key]
                except KeyError:
                    pass
            del self._deletes

        def __delitem__(self, key):
            if key not in self._dict:
                raise KeyError(str(key))
            self._deletes.add(key)

    d = dict(a=1, b=2, c=3)

    with deferring_delete(d) as dd:
        for k, v in d.items():
            if k == "a":
                del dd[k]    # delete through wrapper

    print(d)
It's even possible to make the wrapper class fully functional as a dictionary, if you want, though that's a fair bit more code.
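One way this might look, as a sketch rather than a finished implementation, is to build the wrapper on collections.abc.MutableMapping, which supplies items(), keys(), get(), update() and the rest once five core methods are defined. The class name `DeferredDeleteMapping` is made up for this example, and falling back to immediate deletes outside a with block is an assumption borrowed from the dict subclass above:

```python
from collections.abc import MutableMapping


class DeferredDeleteMapping(MutableMapping):
    """Hypothetical full-mapping wrapper with deferred deletes."""

    def __init__(self, d):
        self._dict = d
        self._deletes = None          # None: deletes are applied immediately

    def __getitem__(self, key):
        return self._dict[key]

    def __setitem__(self, key, value):
        self._dict[key] = value

    def __delitem__(self, key):
        if key not in self._dict:
            raise KeyError(key)
        if self._deletes is None:
            del self._dict[key]
        else:
            self._deletes.add(key)    # remember it, purge on exit

    def __iter__(self):
        return iter(self._dict)

    def __len__(self):
        return len(self._dict)

    def __enter__(self):
        self._deletes = set()
        return self

    def __exit__(self, exc_type, exc_value, tb):
        for key in self._deletes:
            self._dict.pop(key, None)
        self._deletes = None
        return False                  # do not swallow exceptions


d = dict(a=1, b=2, c=3)
with DeferredDeleteMapping(d) as dd:
    for k, v in dd.items():           # iterate through the wrapper itself
        if k == "a":
            del dd[k]
    still_present = "a" in dd         # the delete has not been applied yet
print(still_present, d)
```

Every access goes through an extra Python-level method call, so this trades some speed for the convenience of using a single object for both iteration and deletion.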
Performance-wise, this is admittedly not such a win, but I like it from a programmer-friendliness standpoint. The second method should be very slightly faster since it's not testing a flag on each delete.