Problem Description
From this guide, I have successfully run the sample exercise. But when running my MapReduce job, I am getting the following error:
ERROR streaming.StreamJob: Job not Successful!
10/12/16 17:13:38 INFO streaming.StreamJob: killJob...
Streaming Job Failed!
Error from the log file
java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:311)
at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:545)
at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:132)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
mapper.py
import sys

# number each input line and count word occurrences within it
i = 0
for line in sys.stdin:
    i += 1
    count = {}
    for word in line.strip().split():
        count[word] = count.get(word, 0) + 1
    # emit "word line_number:count" for every distinct word in the line
    for word, weight in count.items():
        print '%s %s:%s' % (word, str(i), str(weight))
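For example, if the first input line is "to be or not to be", the mapper emits one record per distinct word (dict iteration order is arbitrary):

to 1:2
be 1:2
or 1:1
not 1:1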
reducer.py
import sys

keymap = {}          # unused in the current version
o_tweet = "2323"     # sentinel: no previous word seen yet
id_list = []
for line in sys.stdin:
    tweet, tw = line.strip().split()
    tweet_id, w = tw.split(':')
    w = int(w)
    # print tweet, o_tweet, tweet_id, id_list
    if tweet == o_tweet:
        # same word as the previous line: pair it with every line seen so far
        for i, wt in id_list:
            print '%s:%s %s' % (tweet_id, i, str(w + wt))
        id_list.append((tweet_id, w))
    else:
        # new word: start a fresh list
        id_list = [(tweet_id, w)]
        o_tweet = tweet
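For example, given sorted reducer input containing

be 1:2
be 2:1

the reducer prints 2:1 3, pairing the two lines that share "be" with their combined weight 2+1.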
[edit] Command to run the job:
hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-0.20.0-streaming.jar -file /home/hadoop/mapper.py -mapper /home/hadoop/mapper.py -file /home/hadoop/reducer.py -reducer /home/hadoop/reducer.py -input my-input/* -output my-output
Input is any random sequence of sentences.
Thanks,
Recommended Answer
Your -mapper and -reducer should just be the script name:
hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-0.20.0-streaming.jar -file /home/hadoop/mapper.py -mapper mapper.py -file /home/hadoop/reducer.py -reducer reducer.py -input my-input/* -output my-output
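If the scripts still cannot be executed directly on the task nodes, a commonly used variant is to name the interpreter explicitly in the quoted command (same files and paths as above):

hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-0.20.0-streaming.jar -file /home/hadoop/mapper.py -mapper "python mapper.py" -file /home/hadoop/reducer.py -reducer "python reducer.py" -input my-input/* -output my-output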
When the job runs, your scripts are shipped to a job folder inside HDFS, and the executing attempt task resolves them relative to its own working directory, i.e. as ".". (FYI: if you ever want to add another -file, such as a lookup table, you can open it in Python as if it were in the same directory as your scripts while your script is running in the M/R job.)
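As a minimal sketch (lookup.txt is a hypothetical side file, shipped by adding -file /home/hadoop/lookup.txt to the command), the script can load it with a plain relative path:

# lookup.txt is a hypothetical side file distributed via -file; Hadoop places
# it in the task's working directory, so a relative open() finds it
lookup = {}
with open('lookup.txt') as f:
    for line in f:
        key, value = line.strip().split('\t', 1)
        lookup[key] = value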
Also make sure you have run chmod a+x mapper.py and chmod a+x reducer.py.
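Concretely, that means adding an interpreter line as the very first line of each script, since without it the task node cannot execute the file directly:

#!/usr/bin/env python

and setting the execute bit on both files:

chmod a+x /home/hadoop/mapper.py
chmod a+x /home/hadoop/reducer.py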
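Finally, "subprocess failed with code 2" means the mapper subprocess exited with a non-zero status, so a quick way to rule out bugs in the scripts themselves is to run the same pipeline locally (sample.txt stands in for any small test file):

cat sample.txt | python mapper.py | sort | python reducer.py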