Problem description
I am updating a MySQL database with PySpark, running on AWS Glue.
I have a dataframe as follows:
ZIP_CODE is the primary key, and I need to avoid duplicate keys and primary key exceptions, so I am using INSERT INTO ... ON DUPLICATE KEY UPDATE.
Since I have more than one row to insert or update, I use a for loop in Python to go through the records and run an INSERT against the database for each one. The code is as follows:
When I run the insert query function above, I get the following error message and cannot find any clue as to its cause. Please help.
If I simply print the values for one row, they are printed as follows:
Thanks. I am working on AWS Glue/PySpark, so I need to use native Python libraries.
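For reference, the statement form the question relies on is MySQL's `INSERT ... ON DUPLICATE KEY UPDATE`. The sketch below only builds such a statement as a parameterized string; it is not the code from the question. The helper `build_upsert_sql`, the table name `zip_codes`, and the columns `CITY` and `STATE` are hypothetical, and the `%s` placeholders assume a DB-API driver that uses that parameter style. Note that `VALUES(col)` in the update clause is deprecated in recent MySQL 8.0 releases, although it is still accepted.

```python
def build_upsert_sql(table, columns, key_column):
    """Return a parameterized MySQL upsert statement (illustrative helper)."""
    # One %s placeholder per column; values are passed separately to execute().
    placeholders = ", ".join(["%s"] * len(columns))
    column_list = ", ".join(columns)
    # On a key collision, overwrite every non-key column with the new value.
    updates = ", ".join(
        f"{col} = VALUES({col})" for col in columns if col != key_column
    )
    return (
        f"INSERT INTO {table} ({column_list}) "
        f"VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )


# Hypothetical table and columns; only ZIP_CODE comes from the question.
sql = build_upsert_sql("zip_codes", ["ZIP_CODE", "CITY", "STATE"], "ZIP_CODE")
print(sql)
```

Table and column names are interpolated directly into the string, so they should come from trusted code rather than user input; only the row values go through placeholders.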
Recommended answer
The following insert query works, with a for loop.
Resulting output:
Thanks. Hope this will be of reference to others.
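The for-loop approach described in this answer could look roughly like the sketch below. It is an illustration under assumptions, not the answer's original code: `upsert_rows` is a hypothetical helper, `connection` is assumed to be any DB-API 2.0 connection (for example one opened with a pure-Python MySQL driver), and the table `zip_codes` and column `CITY` are made up. Rows are assumed to support lookup by column name, as the `Row` objects returned by a PySpark `df.collect()` do.

```python
# Hypothetical table and column names; only ZIP_CODE comes from the question.
UPSERT_SQL = (
    "INSERT INTO zip_codes (ZIP_CODE, CITY) VALUES (%s, %s) "
    "ON DUPLICATE KEY UPDATE CITY = VALUES(CITY)"
)


def upsert_rows(connection, rows):
    """Run one parameterized upsert per row, then commit once.

    Returns the number of rows processed.
    """
    cursor = connection.cursor()
    count = 0
    try:
        for row in rows:
            # Pass values as parameters so the driver handles quoting.
            cursor.execute(UPSERT_SQL, (row["ZIP_CODE"], row["CITY"]))
            count += 1
        # Commit after the loop so all rows land in a single transaction.
        connection.commit()
    finally:
        cursor.close()
    return count
```

With a real connection this might be called as `upsert_rows(conn, df.collect())`. Collecting to the driver is only reasonable for small dataframes, and passing values as parameters instead of formatting them into the SQL string avoids quoting errors in the statement.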