問題描述
我想知道以下內容:
- 如何從我的數據庫中的多個表中獲取數據?
- 有哪些方法可以做到這一點?
- 什么是聯接和聯合?它們之間有何不同?
- 與其他的相比,我應該什么時候使用它們?
我打算在我的(例如 - PHP)應用程序中使用它,但不想對數據庫運行多個查詢,我有哪些選項可以在單個查詢中從多個表中獲取數據?>
注意:我寫這篇文章是因為我希望能夠鏈接到關于我在 PHP 隊列中經常遇到的眾多問題的寫得很好的指南,所以當我發布一個回答.
答案涵蓋以下內容:
- 第 1 部分 - 聯接和工會
- 第 2 部分 - 子查詢
- 第 3 部分 - 技巧和高效的代碼
- 第 4 部分 - From 子句中的子查詢
- 第 5 部分 - 約翰的詭計混雜
第 1 部分 - 聯接和聯合
這個答案涵蓋:
- 第 1 部分
- 使用內連接連接兩個或多個表(請參閱維基百科條目了解更多信息)
- 如何使用聯合查詢
- 左外連接和右外連接(這個 stackOverflow 答案非常適合描述連接類型)
- 交叉查詢(以及如果您的數據庫不支持它們,如何重現它們) - 這是 SQL-Server 的一個功能 (查看信息)和 莉>
- 第 2 部分
- 子查詢 - 它們是什么,可以在哪里使用以及需要注意什么
- 笛卡爾加入 AKA - 哦,痛苦!
有多種方法可以從數據庫中的多個表中檢索數據.在這個答案中,我將使用 ANSI-92 連接語法.這可能與其他一些使用舊的 ANSI-89 語法的教程不同(如果你習慣了 89,可能看起來不那么直觀 - 但我只能說嘗試一下),因為它是當查詢開始變得更復雜時,更容易理解.為什么要使用它?有性能提升嗎?簡短的回答是否定的,但是一旦你習慣了它就更容易閱讀了.使用這種語法閱讀其他人編寫的查詢會更容易.
我還將使用一個小型車場的概念,它有一個數據庫來跟蹤它有哪些可用的汽車.所有者已聘請您作為他的 IT 計算機人員,并希望您能夠毫不猶豫地將他要求的數據交給他.
我制作了一些最終表將使用的查找表.這將為我們提供一個合理的模型來工作.首先,我將針對具有以下結構的示例數據庫運行查詢.我會試著想想在開始時犯的常見錯誤,并解釋它們出了什么問題——當然也會展示如何糾正它們.
第一個表格只是一個顏色列表,以便我們知道車場里有什么顏色.
mysql>創建表顏色(id int(3) not null auto_increment 主鍵,->顏色變量(15),油漆變量(10));查詢正常,0 行受影響(0.01 秒)mysql>從顏色顯示列;+-------+-------------+------+-----+---------+-----------------+|領域 |類型 |空 |鑰匙 |默認 |額外 |+-------+-------------+------+-----+---------+-----------------+|身份證 |整數(3) |否 |PRI |空 |自動增量||顏色 |varchar(15) |是 ||空 |||油漆|varchar(10) |是 ||空 ||+-------+-------------+------+-----+---------+-----------------+3 行(0.01 秒)mysql>插入顏色(顏色,油漆)值('紅色','金屬'),->('綠色', '光澤'), ('藍色', '金屬'),->('白色' '光澤'), ('黑色' '光澤');查詢正常,5 行受影響(0.00 秒)記錄:5 重復:0 警告:0mysql>從顏色中選擇*;+----+-------+----------+|身份證 |顏色 |油漆|+----+-------+----------+|1 |紅色 |金屬 ||2 |綠色 |光澤 ||3 |藍色 |金屬 ||4 |白色 |光澤 ||5 |黑色 |光澤 |+----+-------+----------+5 行(0.00 秒)
品牌表標識了caryard 可能銷售的不同品牌汽車.
mysql>創建表品牌(id int(3) not null auto_increment 主鍵,->品牌 varchar(15));查詢正常,0 行受影響(0.01 秒)mysql>顯示品牌欄目;+-------+-------------+------+-----+---------+-----------------+|領域 |類型 |空 |鑰匙 |默認 |額外 |+-------+-------------+------+-----+---------+-----------------+|身份證 |整數(3) |否 |PRI |空 |自動增量||品牌 |varchar(15) |是 ||空 ||+-------+-------------+------+-----+---------+-----------------+2 行(0.01 秒)mysql>插入品牌 (brand) 值 ('Ford'), ('Toyota'),->('Nissan'), ('Smart'), ('BMW');查詢正常,5 行受影響(0.00 秒)記錄:5 重復:0 警告:0mysql>從品牌中選擇*;+----+--------+|身份證 |品牌 |+----+--------+|1 |福特 ||2 |豐田 ||3 |日產 ||4 |智能 ||5 |寶馬 |+----+--------+5 行(0.00 秒)
模型表將涵蓋不同類型的汽車,使用不同的汽車類型而不是實際的汽車模型會更簡單.
mysql>創建表模型(id int(3) not null auto_increment 主鍵,->模型 varchar(15));查詢正常,0 行受影響(0.01 秒)mysql>顯示模型中的列;+-------+-------------+------+-----+---------+-----------------+|領域 |類型 |空 |鑰匙 |默認 |額外 |+-------+-------------+------+-----+---------+-----------------+|身份證 |整數(3) |否 |PRI |空 |自動增量||模型 |varchar(15) |是 ||空 ||+-------+-------------+------+-----+---------+-----------------+2 行(0.00 秒)mysql>插入模型 (model) 值 ('Sports'), ('Sedan'), ('4WD'), ('Luxury');查詢正常,4 行受影響(0.00 秒)記錄:4 重復:0 警告:0mysql>從模型中選擇 *;+----+--------+|身份證 |模型 |+----+--------+|1 |體育 ||2 |轎車 ||3 |四驅 ||4 |豪華 |+----+--------+4 行(0.00 秒)
最后,將所有這些其他表捆綁在一起,將所有東西捆綁在一起的表.ID 字段實際上是用于識別汽車的唯一批號.
mysql>創建表汽車(id int(3) not null auto_increment 主鍵,->顏色 int(3), 品牌 int(3), 型號 int(3));查詢正常,0 行受影響(0.01 秒)mysql>顯示來自汽車的列;+-------+--------+------+-----+---------+----------------+|領域 |類型 |空 |鑰匙 |默認 |額外 |+-------+--------+------+-----+---------+----------------+|身份證 |整數(3) |否 |PRI |空 |自動增量||顏色 |整數(3) |是 ||空 |||品牌 |整數(3) |是 ||空 |||模型 |整數(3) |是 ||空 ||+-------+--------+------+-----+---------+----------------+4 行(0.00 秒)mysql>插入汽車(顏色,品牌,型號)值(1,2,1),(3,1,2),(5,3,1),->(4,4,2), (2,2,3), (3,5,4), (4,1,3), (2,2,1), (5,2,3), (4),5,1);查詢正常,10 行受影響(0.00 秒)記錄:10 重復:0 警告:0mysql>從汽車中選擇 *;+----+-------+-------+-------+|身份證 |顏色 |品牌 |模型 |+----+-------+-------+-------+|1 |1 |2 |1 ||2 |3 |1 |2 ||3 |5 |3 |1 ||4 |4 |4 |2 ||5 |2 |2 |3 ||6 |3 |5 |4 ||7 |4 |1 |3 ||8 |2 |2 |1 ||9 |5 |2 |3 ||10 |4 |5 |1 |+----+-------+-------+-------+10 行(0.00 秒)
這將為我們提供足夠的數據(我希望)來涵蓋以下不同類型連接的示例,并提供足夠的數據使它們值得.
因此,老板想知道他擁有的所有跑車的 ID.
這是一個簡單的兩表連接.我們有一個表來標識模型和包含可用庫存的表.可以看到,cars
表的model
列中的數據與cars
表的models
列相關> 我們有桌子.現在,我們知道模型表的 1
用于 Sports
的 ID,所以讓我們編寫連接.
選擇ID,模型從汽車加入模型型號=ID
所以這個查詢看起來不錯吧?我們已經確定了兩個表并包含了我們需要的信息,并使用了一個連接來正確識別要連接的列.
ERROR 1052 (23000): 字段列表中的ID"列不明確
哦,不!我們的第一個查詢出錯了!是的,它是一個李子.您會看到,查詢確實獲得了正確的列,但其中一些列存在于兩個表中,因此數據庫對我們的實際含義和位置感到困惑.有兩種解決方案可以解決這個問題.第一個很好很簡單,我們可以使用 tableName.columnName
來告訴數據庫我們的意思,就像這樣:
選擇汽車.ID,模型.model從汽車加入模型在cars.model=models.ID+----+--------+|身份證 |模型 |+----+--------+|1 |體育 ||3 |體育 ||8 |體育 ||10 |體育 ||2 |轎車 ||4 |轎車 ||5 |四驅 ||7 |四驅 ||9 |四驅 ||6 |豪華 |+----+--------+10 行(0.00 秒)
另一個可能更常用,稱為表別名.這個例子中的表格有漂亮而簡短的名字,但是輸入類似 KPI_DAILY_SALES_BY_DEPARTMENT
的東西可能會很快變老,所以一個簡單的方法是給表格起這樣的昵稱:
選擇援助,b.模型從汽車加入模型 ba.model=b.ID
現在,回到請求.正如你所看到的,我們有我們需要的信息,但我們也有沒有被要求的信息,所以我們需要在語句中包含一個 where 子句,以只獲取被要求的跑車.由于我更喜歡??表別名方法而不是一遍又一遍地使用表名,所以從現在開始我將堅持使用它.
顯然,我們需要在查詢中添加一個 where 子句.我們可以通過 ID=1
或 model='Sports'
來識別跑車.由于 ID 已編入索引和主鍵(而且它的鍵入次數較少),因此讓我們在查詢中使用它.
選擇援助,b.模型從汽車加入模型 ba.model=b.ID在哪里b.ID=1+----+--------+|身份證 |模型 |+----+--------+|1 |體育 ||3 |體育 ||8 |體育 ||10 |體育 |+----+--------+4 行(0.00 秒)
賓果游戲!老板很高興.當然,作為老板,從不滿足于他的要求,他查看信息,然后說我也想要顏色.
好的,所以我們已經編寫了查詢的很大一部分,但是我們需要使用第三個表,即顏色.現在,我們的主信息表 cars
存儲了汽車顏色 ID,這會鏈接回顏色 ID 列.因此,以與原始類似的方式,我們可以加入第三個表:
選擇援助,b.模型從汽車加入模型 ba.model=b.ID加入顏色 ca.color=c.ID在哪里b.ID=1+----+--------+|身份證 |模型 |+----+--------+|1 |體育 ||3 |體育 ||8 |體育 ||10 |體育 |+----+--------+4 行(0.00 秒)
該死,雖然表已正確連接并且相關列已鏈接,但我們忘記從我們剛剛鏈接的新表中提取實際信息.
選擇援助,b.模型,c.顏色從汽車加入模型 ba.model=b.ID加入顏色 ca.color=c.ID在哪里b.ID=1+----+--------+-------+|身份證 |模型 |顏色 |+----+--------+-------+|1 |體育 |紅色 ||8 |體育 |綠色 ||10 |體育 |白色 ||3 |體育 |黑色 |+----+--------+-------+4 行(0.00 秒)
是的,那是我們暫時離開的老板.現在,更詳細地解釋其中的一些內容.如您所見,我們語句中的 from
子句鏈接了我們的主表(我經常使用包含信息的表,而不是查找表或維度表.查詢將與所有表一樣正常工作切換了,但是當我們在幾個月后回到這個查詢來閱讀它時就沒那么有意義了,所以通常最好嘗試編寫一個很好且易于理解的查詢 - 直觀地布局它,使用 nice縮進,以便一切都盡可能清楚.如果您繼續教導他人,請嘗試將這些特征灌輸到他們的查詢中 - 特別是如果您要對它們進行故障排除時.
完全有可能以這種方式鏈接越來越多的表格.
選擇援助,b.模型,c.顏色從汽車加入模型 ba.model=b.ID加入顏色 ca.color=c.ID加盟品牌da.brand=d.ID在哪里b.ID=1
雖然我忘記在 join
語句中包含一個我們可能想要連接多個列的表,但這里有一個示例.如果 models
表具有品牌特定的型號,因此也有一個名為 brand
的列,該列鏈接回 上的
字段,可以這樣做:brands
表ID
選擇援助,b.模型,c.顏色從汽車加入模型 ba.model=b.ID加入顏色 ca.color=c.ID加盟品牌da.brand=d.IDb.brand=d.ID在哪里b.ID=1
您可以看到,上面的查詢不僅將連接表鏈接到主 cars
表,還指定了已連接表之間的連接.如果沒有這樣做,則結果稱為笛卡爾連接 - 這是 dba 代表不好.笛卡爾連接是一種返回行的連接,因為信息沒有告訴數據庫如何限制結果,因此查詢返回所有符合條件的行.
所以,舉一個笛卡爾連接的例子,讓我們運行以下查詢:
選擇援助,b.模型從汽車加入模型 b+----+--------+|身份證 |模型 |+----+--------+|1 |體育 ||1 |轎車 ||1 |四驅 ||1 |豪華 ||2 |體育 ||2 |轎車 ||2 |四驅 ||2 |豪華 ||3 |體育 ||3 |轎車 ||3 |四驅 ||3 |豪華 ||4 |體育 ||4 |轎車 ||4 |四驅 ||4 |豪華 ||5 |體育 ||5 |轎車 ||5 |四驅 ||5 |豪華 ||6 |體育 ||6 |轎車 ||6 |四驅 ||6 |豪華 ||7 |體育 ||7 |轎車 ||7 |四驅 ||7 |豪華 ||8 |體育 ||8 |轎車 ||8 |四驅 ||8 |豪華 ||9 |體育 ||9 |轎車 ||9 |四驅 ||9 |豪華 ||10 |體育 ||10 |轎車 ||10 |四驅 ||10 |豪華 |+----+--------+40 行(0.00 秒)
天哪,太丑了.然而,就數據庫而言,它正是所要求的.在查詢中,我們要求 cars
的 ID
和 models
的 model
.但是,因為我們沒有指定如何連接這些表,數據庫已經將第一個表中的每一行與每一行進行了匹配.第二張桌子.
好的,所以老板回來了,他又想了解更多信息.我想要相同的列表,但還包括四驅車.
然而,這給了我們一個很好的借口來看看兩種不同的方法來實現這一點.我們可以在 where 子句中添加另一個條件,如下所示:
選擇援助,b.模型,c.顏色從汽車加入模型 ba.model=b.ID加入顏色 ca.color=c.ID加盟品牌da.brand=d.ID在哪里b.ID=1或 b.ID=3
雖然上面的方法工作得很好,但讓我們換個角度看,這是展示 union
查詢如何工作的一個很好的借口.
我們知道以下將返回所有跑車:
選擇援助,b.模型,c.顏色從汽車加入模型 ba.model=b.ID加入顏色 ca.color=c.ID加盟品牌da.brand=d.ID在哪里b.ID=1
以下將返回所有 4WD:
選擇援助,b.模型,c.顏色從汽車加入模型 ba.model=b.ID加入顏色 ca.color=c.ID加盟品牌da.brand=d.ID在哪里b.ID=3
因此,通過在它們之間添加 union all
子句,第二個查詢的結果將附加到第一個查詢的結果中.
選擇援助,b.模型,c.顏色從汽車加入模型 ba.model=b.ID加入顏色 ca.color=c.ID加盟品牌da.brand=d.ID在哪里b.ID=1聯合所有選擇援助,b.模型,c.顏色從汽車加入模型 ba.model=b.ID加入顏色 ca.color=c.ID加盟品牌da.brand=d.ID在哪里b.ID=3+----+--------+-------+|身份證 |模型 |顏色 |+----+--------+-------+|1 |體育 |紅色 ||8 |體育 |綠色 ||10 |體育 |白色 ||3 |體育 |黑色 ||5 |四驅 |綠色 ||7 |四驅 |白色 ||9 |四驅 |黑色 |+----+--------+-------+7 行(0.00 秒)
如您所見,首先返回第一個查詢的結果,然后是第二個查詢的結果.
在這個例子中,簡單地使用第一個查詢當然會容易得多,但是 union
查詢對于特定情況可能非常有用.它們是從不容易連接在一起的表中返回特定結果的好方法 - 或者就此而言完全不相關的表.但是,有一些規則需要遵循.
- 第一個查詢中的列類型必須與下面所有其他查詢中的列類型相匹配.
- 第一個查詢中的列名稱將用于標識整個結果集.
- 每個查詢中的列數必須相同.
現在,您可能想知道 區別在于使用union
和union all
.union
查詢將刪除重復項,而 union all
不會.這確實意味著在使用 union
而不是 union all
時,性能會受到很小的影響,但結果可能是值得的 - 我不會在此推測這類事情不過.
關于這個注釋,這里可能需要注意一些額外的注釋.
- 如果我們想對結果進行排序,我們可以使用
order by
但您不能再使用別名.在上面的查詢中,附加一個order by a.ID
會導致錯誤 - 就結果而言,該列被稱為ID
而不是a.ID
- 即使在兩個查詢中使用了相同的別名. - 我們只能有一個
order by
語句,而且必須是最后一個語句.
對于接下來的示例,我將向我們的表中添加一些額外的行.
我已將 Holden
添加到品牌表中.我還在 cars
中添加了一行,其 color
值為 12
- 在顏色表中沒有引用.
好的,老板又回來了,大聲喊叫 - *我想統計我們攜帶的每個品牌以及其中的汽車數量!` - 典型的,我們只是討論了一個有趣的部分,然后老板想要更多的工作.
Rightyo,所以我們需要做的第一件事就是獲得可能品牌的完整列表.
選擇一個品牌從品牌+--------+|品牌 |+--------+|福特 ||豐田 ||日產 ||智能 ||寶馬 ||霍爾頓 |+--------+6 行(0.00 秒)
現在,當我們將其加入到我們的汽車表中時,我們得到以下結果:
選擇一個品牌從品牌加入汽車 ba.ID=b.brand通過...分組一個品牌+--------+|品牌 |+--------+|寶馬 ||福特 ||日產 ||智能 ||豐田 |+--------+5 行(0.00 秒)
這當然是一個問題 - 我們沒有看到任何提及我添加的可愛的 Holden
品牌.
這是因為連接在兩個表中查找匹配的行.由于Holden
類型的汽車中沒有數據,因此不會返回.這是我們可以使用 outer
連接的地方.這將返回所有一個表中的結果,無論它們在另一個表中是否匹配:
選擇一個品牌從品牌左外連接車 ba.ID=b.brand通過...分組一個品牌+--------+|品牌 |+--------+|寶馬 ||福特 ||霍爾頓 ||日產 ||智能 ||豐田 |+--------+6 行(0.00 秒)
既然我們有了這個,我們可以添加一個可愛的聚合函數來計算并讓老板暫時擺脫困境.
選擇一個品牌,count(b.id) 作為 countOfBrand從品牌左外連接車 ba.ID=b.brand通過...分組一個品牌+--------+--------------+|品牌 |countOfBrand |+--------+--------------+|寶馬 |2 ||福特 |2 ||霍爾頓 |0 ||日產 |1 ||智能 |1 ||豐田 |5 |+--------+--------------+6 行(0.00 秒)
這樣,老板就躲開了.
現在,為了更詳細地解釋這一點,外連接可以是 left
或 right
類型.Left 或 Right 定義完全包含哪個表.left outer join
將包含左側表中的所有行,而(你猜對了)right external join
將包含右側表中的所有結果進入結果.
某些數據庫將允許完全外連接
,這將從兩個表中帶回結果(無論是否匹配),但并非所有數據庫都支持這種方式.
現在,我可能認為此時此刻,您想知道是否可以在查詢中合并連接類型 - 答案是肯定的,您絕對可以.
選擇b.品牌,c.顏色,count(a.id) 作為 countOfBrand從汽車右外連接品牌 bb.ID=a.brand加入顏色 ca.color=c.ID通過...分組一個品牌,c.顏色+--------+-------+--------------+|品牌 |顏色 |countOfBrand |+--------+-------+--------------+|福特 |藍色 |1 ||福特 |白色 |1 ||豐田 |黑色 |1 ||豐田 |綠色 |2 ||豐田 |紅色 |1 ||日產 |黑色 |1 ||智能 |白色 |1 ||寶馬 |藍色 |1 ||寶馬 |白色 |1 |+--------+-------+--------------+9 行(0.00 秒)
那么,為什么這不是預期的結果?這是因為雖然我們選擇了從汽車到品牌的外部連接,但在連接到顏色中并沒有指定 - 所以特定的連接只會帶回兩個表中匹配的結果.
以下查詢可用于獲得我們預期的結果:
選擇一個品牌,c.顏色,count(b.id) 作為 countOfBrand從品牌左外連接車 ba.ID=b.brand左外連接顏色 c在 b.color=c.ID 上通過...分組一個品牌,c.顏色+--------+-------+--------------+|品牌 |顏色 |countOfBrand |+--------+-------+--------------+|寶馬 |藍色 |1 ||寶馬 |白色 |1 ||福特 |藍色 |1 ||福特 |白色 |1 ||霍爾頓 |空 |0 ||日產 |黑色 |1 ||智能 |白色 |1 ||豐田 |空 |1 ||豐田 |黑色 |1 ||豐田 |綠色 |2 ||豐田 |紅色 |1 |+--------+-------+--------------+11 行(0.00 秒)
如我們所見,我們在查詢中有兩個外部聯接,結果按預期通過.
現在,你問的那些其他類型的連接怎么樣?交叉路口呢?
好吧,并非所有數據庫都支持 intersection
,但幾乎所有數據庫都允許您通過連接(或至少結構良好的 where 語句)創建交集.
Intersection 是一種連接類型,有點類似于上述的 union
- 但區別在于它僅返回相同的數據行(并且我確實意味著相同)在聯合加入的各種單獨查詢之間.只會返回在各方面都相同的行.
一個簡單的例子是這樣的:
選擇*從顏色在哪里ID>2相交選擇*從顏色在哪里id<4
雖然普通的 union
查詢將返回表的所有行(第一個查詢返回 ID>2
上的任何內容,第二個返回任何具有 ID<;4
) 這將導致一個完整的集合,交叉查詢將只返回匹配 id=3
的行,因為它滿足兩個條件.
現在,如果您的數據庫不支持 intersect
查詢,則可以使用以下查詢輕松完成上述操作:
選擇援助,a.顏色,a.油漆從顏色 a加入顏色 ba.ID=b.ID在哪里a.ID>2b.ID<4+----+-------+----------+|身份證 |顏色 |油漆|+----+-------+----------+|3 |藍色 |金屬 |+----+-------+----------+1 行(0.00 秒)
如果您希望使用本身不支持交集查詢的數據庫在兩個不同的表之間執行交集,則需要在表的每一列上創建一個連接.>
I would like to know the following:
- how to get data from multiple tables in my database?
- what types of methods are there to do this?
- what are joins and unions and how are they different from one another?
- When should I use each one compared to the others?
I am planning to use this in my (for example - PHP) application, but don't want to run multiple queries against the database, what options do I have to get data from multiple tables in a single query?
Note: I am writing this as I would like to be able to link to a well written guide on the numerous questions that I constantly come across in the PHP queue, so I can link to this for further detail when I post an answer.
The answers cover off the following:
- Part 1 - Joins and Unions
- Part 2 - Subqueries
- Part 3 - Tricks and Efficient Code
- Part 4 - Subqueries in the From Clause
- Part 5 - Mixed Bag of John's Tricks
Part 1 - Joins and Unions
This answer covers:
- Part 1
- Joining two or more tables using an inner join (See the wikipedia entry for additional info)
- How to use a union query
- Left and Right Outer Joins (this stackOverflow answer is excellent to describe types of joins)
- Intersect queries (and how to reproduce them if your database doesn't support them) - this is a function of SQL-Server (see info) and part of the reason I wrote this whole thing in the first place.
- Part 2
- Subqueries - what they are, where they can be used and what to watch out for
- Cartesian joins AKA - Oh, the misery!
There are a number of ways to retrieve data from multiple tables in a database. In this answer, I will be using ANSI-92 join syntax. This may be different to a number of other tutorials out there which use the older ANSI-89 syntax (and if you are used to 89, may seem much less intuitive - but all I can say is to try it) as it is much easier to understand when the queries start getting more complex. Why use it? Is there a performance gain? The short answer is no, but it is easier to read once you get used to it. It is easier to read queries written by other folks using this syntax.
I am also going to use the concept of a small caryard which has a database to keep track of what cars it has available. The owner has hired you as his IT Computer guy and expects you to be able to drop him the data that he asks for at the drop of a hat.
I have made a number of lookup tables that will be used by the final table. This will give us a reasonable model to work from. To start off, I will be running my queries against an example database that has the following structure. I will try to think of common mistakes that are made when starting out and explain what goes wrong with them - as well as of course showing how to correct them.
The first table is simply a color listing so that we know what colors we have in the car yard.
mysql> create table colors(id int(3) not null auto_increment primary key,
-> color varchar(15), paint varchar(10));
Query OK, 0 rows affected (0.01 sec)
mysql> show columns from colors;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(3) | NO | PRI | NULL | auto_increment |
| color | varchar(15) | YES | | NULL | |
| paint | varchar(10) | YES | | NULL | |
+-------+-------------+------+-----+---------+----------------+
3 rows in set (0.01 sec)
mysql> insert into colors (color, paint) values ('Red', 'Metallic'),
-> ('Green', 'Gloss'), ('Blue', 'Metallic'),
-> ('White' 'Gloss'), ('Black' 'Gloss');
Query OK, 5 rows affected (0.00 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> select * from colors;
+----+-------+----------+
| id | color | paint |
+----+-------+----------+
| 1 | Red | Metallic |
| 2 | Green | Gloss |
| 3 | Blue | Metallic |
| 4 | White | Gloss |
| 5 | Black | Gloss |
+----+-------+----------+
5 rows in set (0.00 sec)
The brands table identifies the different brands of the cars out caryard could possibly sell.
mysql> create table brands (id int(3) not null auto_increment primary key,
-> brand varchar(15));
Query OK, 0 rows affected (0.01 sec)
mysql> show columns from brands;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(3) | NO | PRI | NULL | auto_increment |
| brand | varchar(15) | YES | | NULL | |
+-------+-------------+------+-----+---------+----------------+
2 rows in set (0.01 sec)
mysql> insert into brands (brand) values ('Ford'), ('Toyota'),
-> ('Nissan'), ('Smart'), ('BMW');
Query OK, 5 rows affected (0.00 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> select * from brands;
+----+--------+
| id | brand |
+----+--------+
| 1 | Ford |
| 2 | Toyota |
| 3 | Nissan |
| 4 | Smart |
| 5 | BMW |
+----+--------+
5 rows in set (0.00 sec)
The model table will cover off different types of cars, it is going to be simpler for this to use different car types rather than actual car models.
mysql> create table models (id int(3) not null auto_increment primary key,
-> model varchar(15));
Query OK, 0 rows affected (0.01 sec)
mysql> show columns from models;
+-------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+----------------+
| id | int(3) | NO | PRI | NULL | auto_increment |
| model | varchar(15) | YES | | NULL | |
+-------+-------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)
mysql> insert into models (model) values ('Sports'), ('Sedan'), ('4WD'), ('Luxury');
Query OK, 4 rows affected (0.00 sec)
Records: 4 Duplicates: 0 Warnings: 0
mysql> select * from models;
+----+--------+
| id | model |
+----+--------+
| 1 | Sports |
| 2 | Sedan |
| 3 | 4WD |
| 4 | Luxury |
+----+--------+
4 rows in set (0.00 sec)
And finally, to tie up all these other tables, the table that ties everything together. The ID field is actually the unique lot number used to identify cars.
mysql> create table cars (id int(3) not null auto_increment primary key,
-> color int(3), brand int(3), model int(3));
Query OK, 0 rows affected (0.01 sec)
mysql> show columns from cars;
+-------+--------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------+------+-----+---------+----------------+
| id | int(3) | NO | PRI | NULL | auto_increment |
| color | int(3) | YES | | NULL | |
| brand | int(3) | YES | | NULL | |
| model | int(3) | YES | | NULL | |
+-------+--------+------+-----+---------+----------------+
4 rows in set (0.00 sec)
mysql> insert into cars (color, brand, model) values (1,2,1), (3,1,2), (5,3,1),
-> (4,4,2), (2,2,3), (3,5,4), (4,1,3), (2,2,1), (5,2,3), (4,5,1);
Query OK, 10 rows affected (0.00 sec)
Records: 10 Duplicates: 0 Warnings: 0
mysql> select * from cars;
+----+-------+-------+-------+
| id | color | brand | model |
+----+-------+-------+-------+
| 1 | 1 | 2 | 1 |
| 2 | 3 | 1 | 2 |
| 3 | 5 | 3 | 1 |
| 4 | 4 | 4 | 2 |
| 5 | 2 | 2 | 3 |
| 6 | 3 | 5 | 4 |
| 7 | 4 | 1 | 3 |
| 8 | 2 | 2 | 1 |
| 9 | 5 | 2 | 3 |
| 10 | 4 | 5 | 1 |
+----+-------+-------+-------+
10 rows in set (0.00 sec)
This will give us enough data (I hope) to cover off the examples below of different types of joins and also give enough data to make them worthwhile.
So getting into the grit of it, the boss wants to know The IDs of all the sports cars he has.
This is a simple two table join. We have a table that identifies the model and the table with the available stock in it. As you can see, the data in the model
column of the cars
table relates to the models
column of the cars
table we have. Now, we know that the models table has an ID of 1
for Sports
so lets write the join.
select
ID,
model
from
cars
join models
on model=ID
So this query looks good right? We have identified the two tables and contain the information we need and use a join that correctly identifies what columns to join on.
ERROR 1052 (23000): Column 'ID' in field list is ambiguous
Oh noes! An error in our first query! Yes, and it is a plum. You see, the query has indeed got the right columns, but some of them exist in both tables, so the database gets confused about what actual column we mean and where. There are two solutions to solve this. The first is nice and simple, we can use tableName.columnName
to tell the database exactly what we mean, like this:
select
cars.ID,
models.model
from
cars
join models
on cars.model=models.ID
+----+--------+
| ID | model |
+----+--------+
| 1 | Sports |
| 3 | Sports |
| 8 | Sports |
| 10 | Sports |
| 2 | Sedan |
| 4 | Sedan |
| 5 | 4WD |
| 7 | 4WD |
| 9 | 4WD |
| 6 | Luxury |
+----+--------+
10 rows in set (0.00 sec)
The other is probably more often used and is called table aliasing. The tables in this example have nice and short simple names, but typing out something like KPI_DAILY_SALES_BY_DEPARTMENT
would probably get old quickly, so a simple way is to nickname the table like this:
select
a.ID,
b.model
from
cars a
join models b
on a.model=b.ID
Now, back to the request. As you can see we have the information we need, but we also have information that wasn't asked for, so we need to include a where clause in the statement to only get the Sports cars as was asked. As I prefer the table alias method rather than using the table names over and over, I will stick to it from this point onwards.
Clearly, we need to add a where clause to our query. We can identify Sports cars either by ID=1
or model='Sports'
. As the ID is indexed and the primary key (and it happens to be less typing), lets use that in our query.
select
a.ID,
b.model
from
cars a
join models b
on a.model=b.ID
where
b.ID=1
+----+--------+
| ID | model |
+----+--------+
| 1 | Sports |
| 3 | Sports |
| 8 | Sports |
| 10 | Sports |
+----+--------+
4 rows in set (0.00 sec)
Bingo! The boss is happy. Of course, being a boss and never being happy with what he asked for, he looks at the information, then says I want the colors as well.
Okay, so we have a good part of our query already written, but we need to use a third table which is colors. Now, our main information table cars
stores the car color ID and this links back to the colors ID column. So, in a similar manner to the original, we can join a third table:
select
a.ID,
b.model
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
where
b.ID=1
+----+--------+
| ID | model |
+----+--------+
| 1 | Sports |
| 3 | Sports |
| 8 | Sports |
| 10 | Sports |
+----+--------+
4 rows in set (0.00 sec)
Damn, although the table was correctly joined and the related columns were linked, we forgot to pull in the actual information from the new table that we just linked.
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
where
b.ID=1
+----+--------+-------+
| ID | model | color |
+----+--------+-------+
| 1 | Sports | Red |
| 8 | Sports | Green |
| 10 | Sports | White |
| 3 | Sports | Black |
+----+--------+-------+
4 rows in set (0.00 sec)
Right, that's the boss off our back for a moment. Now, to explain some of this in a little more detail. As you can see, the from
clause in our statement links our main table (I often use a table that contains information rather than a lookup or dimension table. The query would work just as well with the tables all switched around, but make less sense when we come back to this query to read it in a few months time, so it is often best to try to write a query that will be nice and easy to understand - lay it out intuitively, use nice indenting so that everything is as clear as it can be. If you go on to teach others, try to instill these characteristics in their queries - especially if you will be troubleshooting them.
It is entirely possible to keep linking more and more tables in this manner.
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
where
b.ID=1
While I forgot to include a table where we might want to join more than one column in the join
statement, here is an example. If the models
table had brand-specific models and therefore also had a column called brand
which linked back to the brands
table on the ID
field, it could be done as this:
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
and b.brand=d.ID
where
b.ID=1
You can see, the query above not only links the joined tables to the main cars
table, but also specifies joins between the already joined tables. If this wasn't done, the result is called a cartesian join - which is dba speak for bad. A cartesian join is one where rows are returned because the information doesn't tell the database how to limit the results, so the query returns all the rows that fit the criteria.
So, to give an example of a cartesian join, lets run the following query:
select
a.ID,
b.model
from
cars a
join models b
+----+--------+
| ID | model |
+----+--------+
| 1 | Sports |
| 1 | Sedan |
| 1 | 4WD |
| 1 | Luxury |
| 2 | Sports |
| 2 | Sedan |
| 2 | 4WD |
| 2 | Luxury |
| 3 | Sports |
| 3 | Sedan |
| 3 | 4WD |
| 3 | Luxury |
| 4 | Sports |
| 4 | Sedan |
| 4 | 4WD |
| 4 | Luxury |
| 5 | Sports |
| 5 | Sedan |
| 5 | 4WD |
| 5 | Luxury |
| 6 | Sports |
| 6 | Sedan |
| 6 | 4WD |
| 6 | Luxury |
| 7 | Sports |
| 7 | Sedan |
| 7 | 4WD |
| 7 | Luxury |
| 8 | Sports |
| 8 | Sedan |
| 8 | 4WD |
| 8 | Luxury |
| 9 | Sports |
| 9 | Sedan |
| 9 | 4WD |
| 9 | Luxury |
| 10 | Sports |
| 10 | Sedan |
| 10 | 4WD |
| 10 | Luxury |
+----+--------+
40 rows in set (0.00 sec)
Good god, that's ugly. However, as far as the database is concerned, it is exactly what was asked for. In the query, we asked for for the ID
from cars
and the model
from models
. However, because we didn't specify how to join the tables, the database has matched every row from the first table with every row from the second table.
Okay, so the boss is back, and he wants more information again. I want the same list, but also include 4WDs in it.
This however, gives us a great excuse to look at two different ways to accomplish this. We could add another condition to the where clause like this:
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
where
b.ID=1
or b.ID=3
While the above will work perfectly well, lets look at it differently, this is a great excuse to show how a union
query will work.
We know that the following will return all the Sports cars:
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
where
b.ID=1
And the following would return all the 4WDs:
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
where
b.ID=3
So by adding a union all
clause between them, the results of the second query will be appended to the results of the first query.
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
where
b.ID=1
union all
select
a.ID,
b.model,
c.color
from
cars a
join models b
on a.model=b.ID
join colors c
on a.color=c.ID
join brands d
on a.brand=d.ID
where
b.ID=3
+----+--------+-------+
| ID | model | color |
+----+--------+-------+
| 1 | Sports | Red |
| 8 | Sports | Green |
| 10 | Sports | White |
| 3 | Sports | Black |
| 5 | 4WD | Green |
| 7 | 4WD | White |
| 9 | 4WD | Black |
+----+--------+-------+
7 rows in set (0.00 sec)
As you can see, the results of the first query are returned first, followed by the results of the second query.
In this example, it would of course have been much easier to simply use the first query, but union
queries can be great for specific cases. They are a great way to return specific results from tables from tables that aren't easily joined together - or for that matter completely unrelated tables. There are a few rules to follow however.
- The column types from the first query must match the column types from every other query below.
- The names of the columns from the first query will be used to identify the entire set of results.
- The number of columns in each query must be the same.
Now, you might be wondering what the difference is between using union
and union all
. A union
query will remove duplicates, while a union all
will not. This does mean that there is a small performance hit when using union
over union all
but the results may be worth it - I won't speculate on that sort of thing in this though.
On this note, it might be worth noting some additional notes here.
- If we wanted to order the results, we can use an
order by
but you can't use the alias anymore. In the query above, appending anorder by a.ID
would result in an error - as far as the results are concerned, the column is calledID
rather thana.ID
- even though the same alias has been used in both queries. - We can only have one
order by
statement, and it must be as the last statement.
For the next examples, I am adding a few extra rows to our tables.
I have added Holden
to the brands table.
I have also added a row into cars
that has the color
value of 12
- which has no reference in the colors table.
Okay, the boss is back again, barking requests out - *I want a count of each brand we carry and the number of cars in it!` - Typical, we just get to an interesting section of our discussion and the boss wants more work.
Rightyo, so the first thing we need to do is get a complete listing of possible brands.
select
a.brand
from
brands a
+--------+
| brand |
+--------+
| Ford |
| Toyota |
| Nissan |
| Smart |
| BMW |
| Holden |
+--------+
6 rows in set (0.00 sec)
Now, when we join this to our cars table we get the following result:
select
a.brand
from
brands a
join cars b
on a.ID=b.brand
group by
a.brand
+--------+
| brand |
+--------+
| BMW |
| Ford |
| Nissan |
| Smart |
| Toyota |
+--------+
5 rows in set (0.00 sec)
Which is of course a problem - we aren't seeing any mention of the lovely Holden
brand I added.
This is because a join looks for matching rows in both tables. As there is no data in cars that is of type Holden
it isn't returned. This is where we can use an outer
join. This will return all the results from one table whether they are matched in the other table or not:
select
a.brand
from
brands a
left outer join cars b
on a.ID=b.brand
group by
a.brand
+--------+
| brand |
+--------+
| BMW |
| Ford |
| Holden |
| Nissan |
| Smart |
| Toyota |
+--------+
6 rows in set (0.00 sec)
Now that we have that, we can add a lovely aggregate function to get a count and get the boss off our backs for a moment.
select
a.brand,
count(b.id) as countOfBrand
from
brands a
left outer join cars b
on a.ID=b.brand
group by
a.brand
+--------+--------------+
| brand | countOfBrand |
+--------+--------------+
| BMW | 2 |
| Ford | 2 |
| Holden | 0 |
| Nissan | 1 |
| Smart | 1 |
| Toyota | 5 |
+--------+--------------+
6 rows in set (0.00 sec)
And with that, away the boss skulks.
Now, to explain this in some more detail, outer joins can be of the left
or right
type. The Left or Right defines which table is fully included. A left outer join
will include all the rows from the table on the left, while (you guessed it) a right outer join
brings all the results from the table on the right into the results.
Some databases will allow a full outer join
which will bring back results (whether matched or not) from both tables, but this isn't supported in all databases.
Now, I probably figure at this point in time, you are wondering whether or not you can merge join types in a query - and the answer is yes, you absolutely can.
select
b.brand,
c.color,
count(a.id) as countOfBrand
from
cars a
right outer join brands b
on b.ID=a.brand
join colors c
on a.color=c.ID
group by
a.brand,
c.color
+--------+-------+--------------+
| brand | color | countOfBrand |
+--------+-------+--------------+
| Ford | Blue | 1 |
| Ford | White | 1 |
| Toyota | Black | 1 |
| Toyota | Green | 2 |
| Toyota | Red | 1 |
| Nissan | Black | 1 |
| Smart | White | 1 |
| BMW | Blue | 1 |
| BMW | White | 1 |
+--------+-------+--------------+
9 rows in set (0.00 sec)
So, why is that not the results that were expected? It is because although we have selected the outer join from cars to brands, it wasn't specified in the join to colors - so that particular join will only bring back results that match in both tables.
Here is the query that would work to get the results that we expected:
select
a.brand,
c.color,
count(b.id) as countOfBrand
from
brands a
left outer join cars b
on a.ID=b.brand
left outer join colors c
on b.color=c.ID
group by
a.brand,
c.color
+--------+-------+--------------+
| brand | color | countOfBrand |
+--------+-------+--------------+
| BMW | Blue | 1 |
| BMW | White | 1 |
| Ford | Blue | 1 |
| Ford | White | 1 |
| Holden | NULL | 0 |
| Nissan | Black | 1 |
| Smart | White | 1 |
| Toyota | NULL | 1 |
| Toyota | Black | 1 |
| Toyota | Green | 2 |
| Toyota | Red | 1 |
+--------+-------+--------------+
11 rows in set (0.00 sec)
As we can see, we have two outer joins in the query and the results are coming through as expected.
Now, how about those other types of joins you ask? What about Intersections?
Well, not all databases support the intersection
but pretty much all databases will allow you to create an intersection through a join (or a well structured where statement at the least).
An Intersection is a type of join somewhat similar to a union
as described above - but the difference is that it only returns rows of data that are identical (and I do mean identical) between the various individual queries joined by the union. Only rows that are identical in every regard will be returned.
A simple example would be as such:
select
*
from
colors
where
ID>2
intersect
select
*
from
colors
where
id<4
While a normal union
query would return all the rows of the table (the first query returning anything over ID>2
and the second anything having ID<4
) which would result in a full set, an intersect query would only return the row matching id=3
as it meets both criteria.
Now, if your database doesn't support an intersect
query, the above can be easily accomlished with the following query:
select
a.ID,
a.color,
a.paint
from
colors a
join colors b
on a.ID=b.ID
where
a.ID>2
and b.ID<4
+----+-------+----------+
| ID | color | paint |
+----+-------+----------+
| 3 | Blue | Metallic |
+----+-------+----------+
1 row in set (0.00 sec)
If you wish to perform an intersection across two different tables using a database that doesn't inherently support an intersection query, you will need to create a join on every column of the tables.
這篇關于SQL查詢從多個表返回數據的文章就介紹到這了,希望我們推薦的答案對大家有所幫助,也希望大家多多支持html5模板網!