Problem description
I've never really understood the difference between these two indexes. Can someone please explain what the difference is (performance-wise, what the index structure will look like in the db, storage-wise etc)?
Index with included columns
CREATE NONCLUSTERED INDEX IX_Address_PostalCode
ON Person.Address (PostalCode)
INCLUDE (AddressLine1, AddressLine2, City, StateProvinceID);
'Normal' index
CREATE NONCLUSTERED INDEX IX_Address_PostalCode
ON Person.Address (PostalCode, AddressLine1, AddressLine2, City, StateProvinceID);
Recommended answer
The internal storage of indexes uses a B-Tree structure and consists of "index pages" (the root and all intermediate pages) and "index data pages" (the leaf pages only).
Note: do not confuse "index data pages" with the "data pages" (leaf pages of clustered indexes), which store most of the columns of actual data.
- Only the index key columns are stored on the index pages; INCLUDE columns are stored only on the leaf pages.
- By placing some columns in the INCLUDE section, less data per index key is stored on each index page.
  - Meaning fewer pages are needed to hold the index keys. (Making it easier to cache these frequently used pages in memory for longer.)
  - And possibly fewer levels in the tree. (In such a case the performance benefit can be much bigger, because every tree level traversed is potentially another disk access.)
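
The effect on page counts and tree depth can be checked rather than assumed. The following is a minimal sketch, assuming both index definitions from the question have been created on Person.Address under the hypothetical names IX_Address_PostalCode_Include and IX_Address_PostalCode_Wide (the question gives both the same name, so they could not coexist as written). It reads the per-level statistics from sys.dm_db_index_physical_stats in DETAILED mode, which reports every level of the B-Tree, not only the leaf level.

```sql
-- Sketch only: the two index names below are hypothetical stand-ins for the
-- INCLUDE variant and the wide-key variant from the question.
SELECT  i.name                      AS index_name,
        ps.index_depth,             -- total number of B-Tree levels
        ps.index_level,             -- 0 = leaf level, highest value = root
        ps.page_count,              -- pages at this level
        ps.avg_record_size_in_bytes -- average row size at this level
FROM    sys.dm_db_index_physical_stats(
            DB_ID(), OBJECT_ID(N'Person.Address'), NULL, NULL, 'DETAILED') AS ps
JOIN    sys.indexes AS i
        ON  i.object_id = ps.object_id
        AND i.index_id  = ps.index_id
WHERE   i.name IN (N'IX_Address_PostalCode_Include',
                   N'IX_Address_PostalCode_Wide')
ORDER BY i.name, ps.index_level DESC;
```

If the reasoning above holds for your data, the INCLUDE variant should show smaller records and fewer pages on the non-leaf levels (index_level > 0), while the leaf levels of the two indexes should be of similar size. On a small table the difference may not be visible at all.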
When an index is used, the index key is used to navigate through the index pages to the correct index data page.

- If the index has INCLUDE columns, that data is immediately available should the query need it.
- If the query requires columns not available in either the index keys or the INCLUDE columns, then an additional "bookmark lookup" is required to the correct row in the clustered index (or heap if no clustered index is defined).

Some things to note that hopefully address some of your confusion:

- If the keys of your index and filters in your query are not selective enough, then the index will be ignored (regardless of what's in your INCLUDE columns).
- Every index you create has overhead for INSERT and UPDATE statements; more so for "bigger" indexes. (Bigger applies to INCLUDE columns as well.)
- So while you could in theory create a multitude of big indexes with include columns to match all the permutations of access paths, it would be very counter-productive.

It's worth noting that before INCLUDE columns were added as a feature:

- It was a common index tuning 'trick' to expand the keys of an index to include columns that weren't needed in the index/filter. (Known as a covering index.)
- These columns were commonly required in output columns or as reference columns for joins to other tables.
- This would avoid the infamous "bookmark lookups", but had the disadvantage of making the index 'wider' than strictly necessary.
- In fact, very often the earlier columns in the index would already identify a unique row, meaning the extra included columns would be completely redundant if not for the "avoiding bookmark lookups" benefit.

INCLUDE columns basically allow the same benefit more efficiently.
NB Something very important to point out: you generally get zero benefit out of INCLUDE columns in your indexes if you're in the lazy habit of always writing your queries as SELECT * .... By returning all columns you're basically ensuring a bookmark lookup is required in any case.
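
To make that concrete, here is a rough sketch against the index with included columns from the question. The postal code value is only an assumed example, and whether the optimizer actually picks the index depends on your data and statistics. SET STATISTICS IO ON is used so the logical reads of the two queries can be compared in the messages output.

```sql
SET STATISTICS IO ON;

-- Covered: every referenced column is either the index key (PostalCode)
-- or one of the INCLUDE columns, so the index alone can answer the query.
SELECT AddressLine1, AddressLine2, City, StateProvinceID, PostalCode
FROM   Person.Address
WHERE  PostalCode = N'98011';   -- assumed example value

-- Not covered: SELECT * asks for columns the index does not carry, so each
-- matching row needs a lookup into the clustered index (or heap), or the
-- optimizer may skip the nonclustered index entirely.
SELECT *
FROM   Person.Address
WHERE  PostalCode = N'98011';   -- assumed example value

SET STATISTICS IO OFF;
```

Comparing the two execution plans should show the extra lookup (or a scan) only for the second query.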