Problem description
I am trying to run a hadoop-streaming python job.
bin/hadoop jar contrib/streaming/hadoop-0.20.1-streaming.jar \
-D stream.non.zero.exit.is.failure=true \
-input /ixml \
-output /oxml \
-mapper scripts/mapper.py \
-file scripts/mapper.py \
-inputreader "StreamXmlRecordReader,begin=channel,end=/channel" \
-jobconf mapred.reduce.tasks=0
I made sure mapper.py has all the necessary permissions, but the job fails with:
Caused by: java.io.IOException: Cannot run program "mapper.py":
error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:460)
at org.apache.hadoop.streaming.PipeMapRed.configure(PipeMapRed.java:214)
... 19 more
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:53)
at java.lang.ProcessImpl.start(ProcessImpl.java:91)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:453)
I also tried copying mapper.py to HDFS and passing it as hdfs://localhost/mapper.py, but that does not work either. Any thoughts on how to fix this error?
Recommended answer
Looking at the example on the HadoopStreaming wiki page, it seems that you should change
-mapper scripts/mapper.py
-file scripts/mapper.py
to
-mapper mapper.py
-file scripts/mapper.py
since "shipped files go to the working directory". You might also need to specify the python interpreter directly:
-mapper "/path/to/python mapper.py"
-file scripts/mapper.py
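
The rule behind both variants can be written down as a small Python sketch: -file takes the script's path on the submitting machine, while -mapper takes the name the script has in the task's working directory, which is just its base name. The helper name streaming_mapper_args is hypothetical and exists only for illustration; it is not part of Hadoop.

```python
import posixpath


def streaming_mapper_args(local_script, interpreter=None):
    """Build the -mapper/-file arguments for a script shipped with -file.

    Hypothetical helper for illustration only; not part of Hadoop.
    """
    # -file takes the path on the submitting machine.
    # -mapper takes the name as seen inside the task's working directory,
    # which is just the file's base name.
    shipped_name = posixpath.basename(local_script)
    if interpreter:
        # Interpreter and script name form one command string, so they
        # are passed together as a single argument.
        mapper_cmd = "{} {}".format(interpreter, shipped_name)
    else:
        mapper_cmd = shipped_name
    return ["-mapper", mapper_cmd, "-file", local_script]


print(streaming_mapper_args("scripts/mapper.py"))
# ['-mapper', 'mapper.py', '-file', 'scripts/mapper.py']
print(streaming_mapper_args("scripts/mapper.py", "/path/to/python"))
# ['-mapper', '/path/to/python mapper.py', '-file', 'scripts/mapper.py']
```

If a list like this is appended to the rest of the command and passed to subprocess as an argument list, no extra quoting is needed. On a shell command line, the interpreter form has to be quoted so that it reaches -mapper as one argument.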