Problem description
I am updating a MySQL database using the PySpark framework, running on AWS Glue.
I have a DataFrame as follows:
ZIP_CODE is the primary key, and I need to make sure there are no duplicate keys or primary-key exceptions, so I am using INSERT INTO ... ON DUPLICATE KEY UPDATE.
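For reference, here is a minimal sketch of how such an upsert statement can be assembled in Python. Only ZIP_CODE comes from the question; the table name `zip_codes`, the columns `CITY` and `STATE`, and the helper `build_upsert_sql` are hypothetical names chosen for illustration. The `%s` placeholders follow the parameter style used by common MySQL DB-API drivers.

```python
def build_upsert_sql(table, key_column, other_columns):
    """Return a parameterized MySQL upsert statement (DB-API %s placeholders).

    Table and column names are interpolated directly, so they must come from
    trusted code, never from user input. Row values are NOT interpolated;
    they are supplied later as parameters.
    """
    columns = [key_column] + list(other_columns)
    column_list = ", ".join(columns)
    placeholders = ", ".join(["%s"] * len(columns))
    # On a key collision, overwrite the non-key columns with the incoming values.
    updates = ", ".join(f"{c} = VALUES({c})" for c in other_columns)
    return (
        f"INSERT INTO {table} ({column_list}) VALUES ({placeholders}) "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )


# Hypothetical table and columns; only ZIP_CODE appears in the question.
sql = build_upsert_sql("zip_codes", "ZIP_CODE", ["CITY", "STATE"])
print(sql)
```

Keeping the values out of the SQL string avoids quoting mistakes, which are a frequent source of syntax errors when statements are built by string concatenation inside a loop.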
Since I have more than one row to insert or update, I use a for loop in Python to iterate over the records and run an INSERT against the database for each one. The code is as follows:
When I run the insert query function above, I get the following error message and cannot find any clue as to its cause. Please help.
If I simply print the values for one row, they are printed as follows:
Thanks. I am working on AWS Glue/PySpark, so I need to use native Python libraries.
Recommended answer
The following insert query works, with a for loop.
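As an illustration only (this is not the answerer's original code), a loop of this kind might look like the sketch below. It assumes a DB-API 2.0 connection object, a hypothetical table `zip_codes` with a hypothetical `CITY` column next to the ZIP_CODE key, and rows already collected into a list of tuples; `upsert_rows` is likewise a made-up name.

```python
# Hypothetical table and column names; only ZIP_CODE comes from the question.
UPSERT_SQL = (
    "INSERT INTO zip_codes (ZIP_CODE, CITY) VALUES (%s, %s) "
    "ON DUPLICATE KEY UPDATE CITY = VALUES(CITY)"
)


def upsert_rows(connection, rows):
    """Upsert each (zip_code, city) pair, one parameterized statement per row.

    `connection` is assumed to be a DB-API 2.0 connection and `rows` a list
    of 2-tuples. Returns the number of rows processed.
    """
    cursor = connection.cursor()
    try:
        for zip_code, city in rows:
            # Values are passed separately so the driver handles quoting.
            cursor.execute(UPSERT_SQL, (zip_code, city))
        # Commit once after the loop so the batch is applied together.
        connection.commit()
    except Exception:
        # Undo any partial work, then let the caller see the error.
        connection.rollback()
        raise
    finally:
        cursor.close()
    return len(rows)
```

In a Glue job, the rows could be taken from the DataFrame with something like `[(r["ZIP_CODE"], r["CITY"]) for r in df.collect()]`, and the connection could come from a pure-Python driver such as PyMySQL, assuming that driver is available to the job. Collecting to the driver is only reasonable for small DataFrames.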
Resulting output:
Thanks. I hope this will be a useful reference for others.