How can you create a new table with a Python script that uses two existing tables as input? For example, by performing a left join using pandas merge?
Some details:
Using Home > Edit queries you can utilize Python under Transform > Run Python Script. This opens a Run Python Script dialog box where you're told that # 'dataset' holds the input data for this script. You'll find the same phrase if you just click OK and look at the formula bar:
= Python.Execute("# 'dataset' holds the input data for this script#(lf)",[dataset=#"Changed Type"])
This also adds a new step under Applied Steps called Run Python script, where you can edit the Python script by clicking the gear symbol on the right.
How can you change that setup to reference more than one table?
Sample data
Here are two tables that can be stored as CSV files and loaded using Home > Get Data > Text/CSV
Table1
Date,Value1
2108-10-12,1
2108-10-13,2
2108-10-14,3
2108-10-15,4
2108-10-16,5
Table2
Date,Value2
2108-10-12,10
2108-10-13,11
2108-10-14,12
2108-10-15,13
2108-10-16,14
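If you would rather generate the two sample files than copy them by hand, here is a minimal sketch. The file names Table1.csv and Table2.csv and the temporary output folder are my own choices, not something from the original post; the idea is that loading files with these names should give you queries called Table1 and Table2.

```python
import tempfile
from pathlib import Path

import pandas as pd

TABLE1_CSV = """Date,Value1
2108-10-12,1
2108-10-13,2
2108-10-14,3
2108-10-15,4
2108-10-16,5
"""

TABLE2_CSV = """Date,Value2
2108-10-12,10
2108-10-13,11
2108-10-14,12
2108-10-15,13
2108-10-16,14
"""


def write_sample_tables(folder):
    """Write both sample tables as CSV files and return their paths by name."""
    folder = Path(folder)
    paths = {}
    for name, text in (("Table1", TABLE1_CSV), ("Table2", TABLE2_CSV)):
        path = folder / f"{name}.csv"
        path.write_text(text, encoding="utf-8")
        paths[name] = path
    return paths


# Hypothetical output location: a fresh temporary folder.
out_dir = tempfile.mkdtemp()
paths = write_sample_tables(out_dir)

# Read the files back with Date kept as text, to check what was written.
table1 = pd.read_csv(paths["Table1"], dtype={"Date": str})
table2 = pd.read_csv(paths["Table2"], dtype={"Date": str})
print(table1)
print(table2)
```

Point Home > Get Data > Text/CSV at the two files in the printed folder to load them.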
This is the same challenge that has been described for R scripts here. That setup should work for Python too. However, I've found that the approach has one drawback: it stores the new joined or calculated table as an edited version of one of the previous tables. The following suggestion demonstrates how you can produce a completely new calculated table without altering the input tables (except for changing the data type of the Date columns from Date to Text because of this).
Short answer:
In the Power Query editor, follow these steps:
1. Change the data type of the Date columns in both tables to Text.
2. Click Enter Data. Only click OK.
3. Activate the new Table3 and use Transform > Run Python Script. Only click OK.
4. Activate the formula bar and replace what's in it with = Python.Execute("# Python:",[df1=Table1, df2=Table2]). Click Enter.
5. If you're prompted to do so, click Edit Permission and Run in the next step.
6. Under Applied Steps, in the new step named Run Python Script, click the gear icon to open the Run Python Script editor.
7. Insert the snippet below and click OK.
Code:
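import pandas as pd
# Left join: keep every row of df1 and add the matching columns from df2
df3 = pd.merge(df1, df2, how='left', on=['Date'])
# Multiply the merged columns so each product uses values from the same row
df3['Value3'] = df3['Value1'] * df3['Value2']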
Next to df3, click Table, and that's it.
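To check what the resulting table should contain before touching Power BI, the join can be reproduced in plain Python. This is only a sketch: the two DataFrames are hand-built stand-ins for what Power BI would pass in as df1 and df2, with Date kept as text.

```python
import pandas as pd

dates = ["2108-10-12", "2108-10-13", "2108-10-14", "2108-10-15", "2108-10-16"]

# Stand-ins for the frames Power BI would hand to the script as df1 and df2.
df1 = pd.DataFrame({"Date": dates, "Value1": [1, 2, 3, 4, 5]})
df2 = pd.DataFrame({"Date": dates, "Value2": [10, 11, 12, 13, 14]})

# Left join on Date, then a calculated column from the merged values.
df3 = pd.merge(df1, df2, how="left", on=["Date"])
df3["Value3"] = df3["Value1"] * df3["Value2"]

print(df3)
```

The printed frame has the columns Date, Value1, Value2 and Value3, with one row per date.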
The details:
The list above will have to be followed very carefully to get things working. So here are all of the dirty little details:
1. Load the tables as CSV files in Power BI Desktop using Get Data.
2. Click Edit Queries.
3. In Table1, click the symbol next to the Date column, select Text and click Replace Current.
4. Do the same for Table2.
5. On the Home tab, click Enter Data.
6. In the box that appears, do nothing other than click OK.
7. This will insert an empty table named Table3 under Queries, and that's exactly what we want.
8. Go to the Transform tab and click Run Python Script.
9. This opens the Run Python Script editor. You could start writing your script right here, but that would make the next steps unnecessarily complicated. So do nothing but click OK.
10. In the formula bar you will see the formula = Python.Execute("# 'dataset' holds the input data for this script#(lf)",[dataset=#"Changed Type"]). Notice that you've got a new step under Applied Steps named Run Python Script.
11. There are several interesting details at this point, but first we're going to break down the arguments of the function = Python.Execute("# 'dataset' holds the input data for this script#(lf)",[dataset=#"Changed Type"]).
The part "# 'dataset' holds the input data for this script#(lf)" simply inserts the comment that you can see in the Python Script Editor. So it's not important, but you can't just leave it blank either. I like to use something shorter like "# Python:".
The part [dataset=#"Changed Type"] is a pointer to the empty Table3 in the state that it is in under Changed Type. So if the last thing you do before inserting a Python script is something other than changing data types, this part will look different. The table is then made available in your Python script as a pandas DataFrame named dataset. With this in mind, we can make some very useful changes to the formula:
12. Change the formula bar to = Python.Execute("# Python:",[df1=Table1, df2=Table2]) and hit Enter. This will make Table1 and Table2 available for your Python scripts as two pandas dataframes named df1 and df2, respectively.
13. Click the gear (or is it a flower?) icon next to Run Python script under Applied Steps.
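One way to picture what the second argument does: each field name in the record becomes a variable name inside the script. The sketch below mimics that with a plain dictionary and exec. It is a conceptual model only and says nothing about how Power BI is actually implemented; run_script and the two small tables are hypothetical helpers made up for this illustration.

```python
import pandas as pd


def run_script(script, inputs):
    """Run `script` with every key of `inputs` bound as a variable name.

    Conceptual stand-in only: this is NOT how Power BI is implemented.
    """
    namespace = dict(inputs)
    exec(script, namespace)
    return namespace


table1 = pd.DataFrame({"Date": ["2108-10-12", "2108-10-13"], "Value1": [1, 2]})
table2 = pd.DataFrame({"Date": ["2108-10-12", "2108-10-13"], "Value2": [10, 11]})

# Default formula: a single input that the script sees as `dataset`.
single = run_script("rows = len(dataset)", {"dataset": table1})

# Edited formula: two inputs that the script sees as `df1` and `df2`.
double = run_script(
    "import pandas as pd\ndf3 = pd.merge(df1, df2, how='left', on=['Date'])",
    {"df1": table1, "df2": table2},
)

print(single["rows"])
print(list(double["df3"].columns))
```

The names on the left of each = in the record are free to choose; df1 and df2 are just the ones used in this walkthrough.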
14. Insert the following snippet:
Code:
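import pandas as pd
# Left join: keep every row of df1 and add the matching columns from df2
df3 = pd.merge(df1, df2, how='left', on=['Date'])
# Multiply the merged columns so each product uses values from the same row
df3['Value3'] = df3['Value1'] * df3['Value2']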
This will join df1 and df2 on the Date column and insert a new calculated column named Value3. Not too fancy, but with this setup you can do anything you want with your data in the world of Power BI and with the power of Python.
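One thing worth checking when you adapt the snippet is where the calculated column takes its values from. The sketch below is not part of the original answer. It assumes Table2 arrives in reverse row order (a scenario made up for the demo) and compares products taken from the merged frame with products taken from the two inputs. The validate argument is an optional extra that makes pandas raise an error if a Date appears more than once in either table.

```python
import pandas as pd

dates = ["2108-10-12", "2108-10-13", "2108-10-14", "2108-10-15", "2108-10-16"]

df1 = pd.DataFrame({"Date": dates, "Value1": [1, 2, 3, 4, 5]})
# Same rows as Table2, but deliberately in reverse order (assumption for the demo).
df2 = pd.DataFrame({"Date": dates[::-1], "Value2": [14, 13, 12, 11, 10]})

# validate="one_to_one" raises MergeError if a Date repeats in either table.
df3 = pd.merge(df1, df2, how="left", on=["Date"], validate="one_to_one")

# Products from the merged frame pair values that share a Date.
df3["Value3"] = df3["Value1"] * df3["Value2"]

# Products from the two inputs pair values that share a row position.
by_position = df1["Value1"] * df2["Value2"]

print(df3["Value3"].tolist())
print(by_position.tolist())
```

The two printed lists differ, which is why computing from the columns of the merged frame is the safer habit when the input tables are not guaranteed to be sorted the same way.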
15. Click OK. You'll see df3 listed under the input dataframes df1 and df2 in the blue square. If you've assigned any other dataframes as a step in your calculations in the Python script, they will be listed here too. To turn it into a table that Power BI can access, just click Table as indicated by the green arrow.
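As far as I understand it, only pandas DataFrames show up in that list, so a plain number or a Series computed in the script would not appear. A minimal sketch of wrapping such results so they become tables of their own; the names summary and value3_by_date are made up for illustration, and df3 is rebuilt by hand here so the block runs outside Power BI.

```python
import pandas as pd

# Hand-built stand-in for the df3 produced by the merge step.
df3 = pd.DataFrame(
    {
        "Date": ["2108-10-12", "2108-10-13", "2108-10-14", "2108-10-15", "2108-10-16"],
        "Value1": [1, 2, 3, 4, 5],
        "Value2": [10, 11, 12, 13, 14],
    }
)
df3["Value3"] = df3["Value1"] * df3["Value2"]

# Scalars are not DataFrames, so wrap them in a one-row frame.
summary = pd.DataFrame(
    {"Rows": [len(df3)], "TotalValue3": [df3["Value3"].sum()]}
)

# A Series is not a DataFrame either; reset_index() turns it into one.
value3_by_date = df3.set_index("Date")["Value3"].reset_index()

print(summary)
print(value3_by_date)
```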
16. And that's it.
Note that the data type of the Date column is set to Date by default, but you can change that to Text as explained earlier.
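Because the walkthrough passes dates to Python as text, any date arithmetic has to parse them first. The following is a sketch of one option, not something from the original answer: parse inside the script, do the arithmetic, and format the result back to text before the table is returned. The NextDay column is invented for the example.

```python
import pandas as pd

# Small stand-in for the output table, with Date held as text.
df3 = pd.DataFrame({"Date": ["2108-10-12", "2108-10-13"], "Value3": [10, 22]})

# Parse the text dates so date arithmetic is possible inside the script.
parsed = pd.to_datetime(df3["Date"], format="%Y-%m-%d")

# Add one day, then format back to text in the same ISO layout.
df3["NextDay"] = (parsed + pd.Timedelta(days=1)).dt.strftime("%Y-%m-%d")

# The original Date column is left untouched as text.
print(df3)
```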
Click Home > Close & Apply to exit the Power Query Editor and go back to where it all started in Power BI Desktop.