EMR cluster creation fails at a step

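My first attempt to create an EMR cluster from a Lambda function fails with the error below. I intend to use script-runner.jar to launch a Python script located in an S3 bucket. Can someone help me understand this error? What exactly am I missing?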

2019-11-21T20:34:59.990Z INFO Ensure step 1 jar file s3a://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar
INFO Failed to download: s3a://<region>.elasticmapreduce/libs/script-runner/script-runner.jar
java.io.IOException: Unable to download 's3a://<region>.elasticmapreduce/libs/script-runner/script-runner.jar'. Only s3 + local files are supported
    at aws157.instancecontroller.util.S3Wrapper.fetchHadoopFileToLocal(S3Wrapper.java:353)
    at aws157.instancecontroller.master.steprunner.HadoopJarStepRunner$Runner.<init>(HadoopJarStepRunner.java:243)
    at aws157.instancecontroller.master.steprunner.HadoopJarStepRunner.createRunner(HadoopJarStepRunner.java:152)
    at aws157.instancecontroller.master.steprunner.HadoopJarStepRunner.createRunner(HadoopJarStepRunner.java:146)
    at aws157.instancecontroller.master.steprunner.StepExecutor.runStep(StepExecutor.java:136)
    at aws157.instancecontroller.master.steprunner.StepExecutor.run(StepExecutor.java:70)
    at aws157.instancecontroller.master.steprunner.StepExecutionmanager.enqueueStep(StepExecutionmanager.java:246)
    at aws157.instancecontroller.master.steprunner.StepExecutionmanager.doRun(StepExecutionmanager.java:193)
    at aws157.instancecontroller.master.steprunner.StepExecutionmanager.access$000(StepExecutionmanager.java:33)
    at aws157.instancecontroller.master.steprunner.StepExecutionmanager$1.run(StepExecutionmanager.java:94)

My loosely written Lambda function is as follows:

#!/usr/bin/python
# -*- coding: utf-8 -*-
import json
import boto3
import datetime


def lambda_handler(event, context):
    print('Creating EMR')
    connection = boto3.client('emr', region_name='us-east-1')
    print(event)

    cluster_id = connection.run_job_flow(
        Name='MyTest',
        VisibleToAllUsers=True,
        JobFlowRole='EMR_EC2_DefaultRole',
        ServiceRole='EMR_DefaultRole',
        LogUri='s3://bucket-emr/logs',
        ReleaseLabel='emr-5.21.0',
        Applications=[{'Name': 'Hadoop'}, {'Name': 'Spark'}],
        Instances={
            'InstanceGroups': [
                {'Name': 'Master nodes', 'Market': 'ON_DEMAND', 'InstanceRole': 'MASTER',
                 'InstanceType': 'm3.xlarge', 'InstanceCount': 1},
                # InstanceType is required by run_job_flow but is missing from the post.
                {'Name': 'Slave nodes', 'Market': 'SPOT', 'InstanceRole': 'CORE',
                 'InstanceCount': 2},
            ],
            'KeepJobFlowAliveWhenNoSteps': True,
            'Ec2KeyName': 'keys-kvp',
            'Ec2SubnetId': 'subnet-dsb65490',
            'EmrManagedMasterSecurityGroup': 'sg-0daa54d041d1033',
            'EmrManagedSlaveSecurityGroup': 'sg-0daa54d041d1033',
        },
        Configurations=[{
            'Classification': 'spark-env',
            'Properties': {},
            'Configurations': [{
                'Classification': 'export',
                'Properties': {
                    'PYSPARK_PYTHON': 'python36',
                    'PYSPARK_DRIVER_PYTHON': 'python36',
                },
            }],
        }],
        Steps=[{
            'Name': 'mystep',
            'ActionOnFailure': 'TERMINATE_CLUSTER',
            'HadoopJarStep': {
                'Jar': 's3a://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar',
                'Args': [
                    '/home/hadoop/spark/bin/spark-submit',
                    '--deploy-mode', 'cluster', '--master', 'yarn',
                    's3a://inscape-script/wordcount.py',
                ],
            },
        }],
    )

    return 'Started cluster {}'.format(cluster_id)

What am I missing when creating the cluster? Thanks in advance.

kubikiri answered: EMR cluster creation fails at a step

Could you try changing the 'Jar' parameter to this:

'Jar': 's3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar',

https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hadoop-script.html
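
To make the change concrete, here is a rough sketch of what the step entry might look like with the s3:// scheme. The helper name build_script_runner_step is made up for illustration and is not part of boto3; the step name, failure action, and Args are taken from the question unchanged, so this only addresses the "Only s3 + local files are supported" error and does not verify that the Args are right for this EMR release.

```python
def build_script_runner_step(region='us-east-1'):
    """Return the question's step dict with the Jar URI using s3:// (hypothetical helper)."""
    return {
        'Name': 'mystep',
        'ActionOnFailure': 'TERMINATE_CLUSTER',
        'HadoopJarStep': {
            # The step runner only accepts s3:// or local paths for the jar,
            # so s3a:// is replaced with s3:// here.
            'Jar': 's3://{}.elasticmapreduce/libs/script-runner/script-runner.jar'.format(region),
            # Args are copied from the question as-is (an assumption, not verified).
            'Args': [
                '/home/hadoop/spark/bin/spark-submit',
                '--deploy-mode', 'cluster',
                '--master', 'yarn',
                's3a://inscape-script/wordcount.py',
            ],
        },
    }


step = build_script_runner_step()
print(step['HadoopJarStep']['Jar'])
# prints: s3://us-east-1.elasticmapreduce/libs/script-runner/script-runner.jar
```

The returned dict could then be passed as Steps=[step] in the run_job_flow call from the question.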

You can also try command-runner by changing the 'Jar' parameter to

/var/lib/aws/emr/step-runner/hadoop-jars/command-runner.jar
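
As a sketch of the command-runner variant, the step could look roughly like this. The helper name build_command_runner_step and the constant COMMAND_RUNNER_JAR are invented for illustration; the jar path is the one given above, and the script URI is the one from the question. With command-runner the first argument is the command itself, so this assumes spark-submit is on the PATH of the master node.

```python
# Jar path as given in the answer; the constant name is hypothetical.
COMMAND_RUNNER_JAR = '/var/lib/aws/emr/step-runner/hadoop-jars/command-runner.jar'


def build_command_runner_step(script_uri):
    """Return a step dict that runs spark-submit through command-runner (hypothetical helper)."""
    return {
        'Name': 'mystep',
        'ActionOnFailure': 'TERMINATE_CLUSTER',
        'HadoopJarStep': {
            'Jar': COMMAND_RUNNER_JAR,
            # command-runner executes Args as a command line on the master node.
            'Args': [
                'spark-submit',
                '--deploy-mode', 'cluster',
                '--master', 'yarn',
                script_uri,
            ],
        },
    }


# Script location copied from the question (not verified here).
step = build_command_runner_step('s3a://inscape-script/wordcount.py')
print(step['HadoopJarStep']['Args'][0])
# prints: spark-submit
```

As with the script-runner version, the dict would go into Steps=[step] when calling run_job_flow.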