Question
I wanted to understand a bit more about iterators, so please correct me if I'm wrong.
An iterator is an object which has a pointer to the next object and is read as a buffer or stream (i.e. a linked list). They're particularly efficient because all they do is tell you what is next by reference instead of using indexing.
However, I still don't understand why the following behavior is happening:
    In [1]: iter = (i for i in range(5))

    In [2]: for _ in iter:
       ....:     print(_)
       ....:
    0
    1
    2
    3
    4

    In [3]: for _ in iter:
       ....:     print(_)
       ....:

    In [4]:
After a first loop through the iterator (`In [2]`) it's as if it was consumed and left empty, so the second loop (`In [3]`) prints nothing.
However, I never assigned a new value to the `iter` variable.
What is really happening under the hood of the `for` loop?
Recommended answer
Your suspicion is correct: the iterator has been consumed.
In fact, your iterator is a generator, which is an object that can be iterated through only once.
    type((i for i in range(5)))  # reports the type as generator

    def another_generator():
        yield 1  # the yield expression makes this a generator function, not a regular function

    type(another_generator())  # also a generator
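To make "iterated through only once" concrete, here is a minimal sketch. The names `gen`, `first_pass`, `second_pass` and `values` are illustrative only and do not come from the question.

```python
# Illustrative names only: a generator can be walked a single time.
gen = (i for i in range(5))

first_pass = list(gen)   # drains the generator
second_pass = list(gen)  # nothing is left to produce

print(first_pass)   # [0, 1, 2, 3, 4]
print(second_pass)  # []

# To loop more than once, store the values in a list
# (or build a fresh generator for each loop).
values = list(range(5))
print(sum(values), sum(values))  # 10 10
```

If the values are cheap to hold in memory, a list is probably the simplest way to get repeatable loops.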
The reason they are efficient has nothing to do with telling you what is next "by reference." They are efficient because they only generate the next item upon request; the items are not all generated at once. In fact, you can have an infinite generator:
    def my_gen():
        while True:
            yield 1  # again: yield makes this a generator function, not a regular function

    for _ in my_gen():
        print(_)  # press Ctrl+C to stop this infinite loop!
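One way to observe the "only upon request" behavior without an endless loop is to take a few items with `itertools.islice`. This is only a sketch; the `counting` generator and the `produced` list are hypothetical helpers added to record when each value is actually computed.

```python
from itertools import islice

produced = []  # hypothetical log of every value the generator computes

def counting():
    # Infinite generator: each value is computed only when requested.
    n = 0
    while True:
        produced.append(n)
        yield n
        n += 1

first_three = list(islice(counting(), 3))

print(first_three)  # [0, 1, 2]
print(produced)     # [0, 1, 2]
```

Only three values were ever computed, even though the generator itself never ends.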
Some other corrections to help improve your understanding:
- The generator is not a pointer, and does not behave like the pointers you may be familiar with from other languages.
- One of the differences from other languages: as said above, each result of the generator is generated on the fly. The next result is not produced until it is requested.
- The keyword combination `for ... in` accepts an iterable object as its second argument.
- The iterable object can be a generator, as in your example, but it can also be any other iterable object, such as a `list`, a `dict`, a `str` object (string), or a user-defined type that provides the required functionality.
- The `iter` function is applied to the object to get an iterator (by the way: don't use `iter` as a variable name in Python, as you have done; it is not a reserved keyword, but it is a built-in function, and reusing the name shadows it). Actually, to be more precise, the object's `__iter__` method is called (which is, for the most part, all the `iter` function does anyway; `__iter__` is one of Python's so-called "magic methods").
- If the call to `__iter__` is successful, the function `next()` is applied to the resulting iterator over and over again, in a loop, and the first variable supplied to `for ... in` is assigned the result of each `next()` call. (Remember: the iterator could be a generator, or a container object's iterator, or any other iterator.) Actually, to be more precise: it calls the iterator object's `__next__` method, which is another "magic method".
- The `for` loop ends when `next()` raises the `StopIteration` exception (which usually happens when the iterator has no further object to yield when `next()` is called).
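The steps in the list above can be tried by hand with the built-in `iter()` and `next()` functions. This is a small sketch; `letters`, `it`, `first`, `second` and `exhausted` are illustrative names.

```python
letters = ["a", "b"]      # any iterable would do

it = iter(letters)        # asks the list for an iterator (letters.__iter__())
first = next(it)          # calls it.__next__() and returns "a"
second = next(it)         # returns "b"

try:
    next(it)              # nothing left, so StopIteration is raised
    exhausted = False
except StopIteration:
    exhausted = True      # this is the signal a for loop uses to stop

print(first, second, exhausted)  # a b True
```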
You can "manually" implement a `for` loop in Python this way (probably not perfect, but close enough):
    iterable = [1, 2, 3]  # example input; any iterable object works here

    try:
        temp = iterable.__iter__()
    except AttributeError:  # catch the exception class, not an instance
        raise TypeError("'{}' object is not iterable".format(type(iterable).__name__))
    else:
        while True:
            try:
                _ = temp.__next__()
            except StopIteration:
                break
            except AttributeError:
                raise TypeError("iter() returned non-iterator of type '{}'".format(type(temp).__name__))
            # this is the "body" of the for loop
            continue
There is pretty much no difference between the above and your example code.
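This also suggests why the second loop in the question prints nothing. A generator's `__iter__` returns the generator itself, so each `for` loop receives the same, already exhausted, object, whereas a list hands out a fresh iterator on every call. A short sketch with illustrative names:

```python
gen = (i for i in range(3))
same_object = iter(gen) is gen      # a generator is its own iterator

numbers = [0, 1, 2]
it1 = iter(numbers)
it2 = iter(numbers)
independent = it1 is not it2        # a list creates a new iterator each time

next(it1)                           # advance only the first iterator
after_advance = (next(it1), next(it2))

print(same_object)    # True
print(independent)    # True
print(after_advance)  # (1, 0)
```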
Actually, the more interesting part of a `for` loop is not the `for`, but the `in`. Using `in` by itself produces a different effect than `for ... in`, but it is very useful to understand what `in` does with its arguments, since `for ... in` implements very similar behavior.
When used by itself, the `in` keyword first calls the object's `__contains__` method, which is yet another "magic method" (note that this step is skipped when using `for ... in`). Using `in` by itself on a container, you can do things like this:
    1 in [1, 2, 3]  # True
    'He' in 'Hello'  # True
    3 in range(10)  # True
    'eH' in 'Hello'[::-1]  # True
If the iterable object is NOT a container (i.e. it doesn't have a `__contains__` method), `in` next tries to call the object's `__iter__` method. As was said previously: the `__iter__` method returns what is known in Python as an iterator. Basically, an iterator is an object that you can use the built-in generic function `next()` on (see footnote 1). A generator is just one type of iterator.
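A generator has no `__contains__` method, so a membership test on it falls back to iteration and consumes items along the way. The sketch below uses illustrative names to show that side effect.

```python
gen = (i for i in range(5))

has_contains = hasattr(gen, "__contains__")  # generators do not define it

found = 2 in gen        # iterates 0, 1, 2 and stops at the match
remaining = list(gen)   # only the items after the match are left
missing = 10 in gen     # the generator is now empty

print(has_contains)  # False
print(found)         # True
print(remaining)     # [3, 4]
print(missing)       # False
```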
If you wish to create your own object type to iterate over (i.e., you can use `for ... in`, or just `in`, on it), it's useful to know about the `yield` keyword, which is used in generators (as mentioned above).
    class MyIterable():
        def __iter__(self):
            yield 1

    m = MyIterable()
    for _ in m: print(_)  # 1
    1 in m  # True
The presence of `yield` turns a function or method into a generator function instead of a regular function/method. You don't need to write a `__next__` method if you use a generator (it brings `__next__` along with it automatically).
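A practical consequence, sketched with a hypothetical `Countdown` class: because `__iter__` is a generator method, every loop calls it again and gets a brand-new generator, so the object itself can be looped over repeatedly.

```python
class Countdown:
    """Hypothetical example class; not part of the original answer."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # Each call creates a fresh generator starting from self.start.
        n = self.start
        while n > 0:
            yield n
            n -= 1

c = Countdown(3)
first = list(c)
second = list(c)

print(first)   # [3, 2, 1]
print(second)  # [3, 2, 1]
```

This contrasts with the bare generator in the question, which stays empty after its first loop.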
If you wish to create your own container object type (i.e., you can use `in` on it by itself, but NOT `for ... in`), you just need the `__contains__` method.
    class MyUselessContainer():
        def __contains__(self, obj):
            return True

    m = MyUselessContainer()
    1 in m  # True
    'Foo' in m  # True
    TypeError in m  # True
    None in m  # True
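To check the "but NOT `for ... in`" part, here is a sketch with a hypothetical `EvenNumbers` container that defines only `__contains__`. Membership tests work, while looping over it raises `TypeError`.

```python
class EvenNumbers:
    """Hypothetical container: supports `in`, but is not iterable."""

    def __contains__(self, obj):
        return isinstance(obj, int) and obj % 2 == 0

evens = EvenNumbers()
has_four = 4 in evens
has_three = 3 in evens

try:
    for _ in evens:      # no __iter__ (and no __getitem__), so this fails
        pass
    error = None
except TypeError as exc:
    error = exc

print(has_four, has_three)   # True False
print(type(error).__name__)  # TypeError
```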
1 Note that, to be an iterator, an object must implement the iterator protocol. This only means that both the `__next__` and `__iter__` methods must be correctly implemented (generators come with this functionality "for free", so you don't need to worry about it when using them). Also note that the `__next__` method is actually `next` (no underscores) in Python 2.
2 See this answer for the different ways to create iterable classes.
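As a rough illustration of the iterator protocol described in footnote 1, the hypothetical `UpTo` class below implements both `__iter__` and `__next__` by hand instead of relying on `yield`. The `next = __next__` alias is an assumption added only to cover the Python 2 spelling mentioned there.

```python
class UpTo:
    """Hypothetical hand-written iterator yielding 0 .. limit - 1."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __iter__(self):
        return self          # an iterator returns itself

    def __next__(self):
        if self.current >= self.limit:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

    next = __next__          # Python 2 name for the same method

u = UpTo(3)
first = list(u)
second = list(u)   # like a generator, it is exhausted after one pass

print(first)   # [0, 1, 2]
print(second)  # []
```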