Google Cloud Dataflow job raises an error after a few hours

I'm running a Dataflow streaming job using the 2.11.0 release. After a few hours, I get the following authentication error:

File "streaming_twitter.py", line 188, in <lambda>
File "streaming_twitter.py", line 102, in estimate
File "streaming_twitter.py", line 84, in estimate_aiplatform
File "streaming_twitter.py", line 42, in get_service
File "/usr/local/lib/python2.7/dist-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/googleapiclient/discovery.py", line 227, in build
    credentials=credentials)
File "/usr/local/lib/python2.7/dist-packages/googleapiclient/_helpers.py", line 363, in build_from_document
    credentials = _auth.default_credentials()
File "/usr/local/lib/python2.7/dist-packages/googleapiclient/_auth.py", in default_credentials
    credentials, _ = google.auth.default()
File "/usr/local/lib/python2.7/dist-packages/google/auth/_default.py", line 306, in default
    raise exceptions.DefaultCredentialsError(_HELP_MESSAGE)
DefaultCredentialsError: Could not automatically determine credentials. Please set GOOGLE_APPLICATION_CREDENTIALS or explicitly create credentials and re-run the application.

This Dataflow job makes API requests to AI Platform Prediction. It looks as though the authentication token is expiring.

Code snippet:

from googleapiclient import discovery

def get_service():
    # Build a client for the AI Platform ('ml', v1) API; DISCOVERY_SERVICE is defined elsewhere in the script
    return discovery.build('ml', 'v1', discoveryServiceUrl=DISCOVERY_SERVICE, cache_discovery=True)

I tried adding the following lines to the get_service function:

    os.environ[
        "GOOGLE_APPLICATION_CREDENTIALS"] = "/tmp/key.json"

But I got:

DefaultCredentialsError: File "/tmp/key.json" was not found. [while running 'generatedPtransform-930']

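I assume this is because the file does not exist on the Dataflow machines. Another option is to use the developerKey parameter of the build method, but AI Platform Prediction does not seem to support it; I get this error: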

Expected OAuth 2 access token,login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project."> [while running 'generatedPtransform-22624']

I'd like to understand how to solve this and what the best practice is.

Any suggestions?

tiantianwlx answered: Google Cloud Dataflow job raises an error after a few hours

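Setting os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = '/tmp/key.json' only works locally with the DirectRunner. Once the pipeline is deployed to a distributed runner such as Dataflow, the workers will not be able to find the local file /tmp/key.json.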
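To make that failure mode concrete, here is a minimal sketch that checks whether the path in GOOGLE_APPLICATION_CREDENTIALS exists on the machine where the code is actually running. The helper name credentials_file_status is hypothetical and not part of any Google library; it only inspects the local filesystem.

```python
import os


def credentials_file_status(environ=None):
    """Report whether GOOGLE_APPLICATION_CREDENTIALS points at a real file.

    Hypothetical helper for illustration only: it looks at the filesystem
    of whichever process calls it (your laptop or a remote worker).
    """
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return "unset"
    if os.path.isfile(path):
        return "found"
    # The variable is set, but the file is not on this machine.
    return "missing"
```

Called on the machine that launches the job, this would report "found" if /tmp/key.json is there; called from code running on a Dataflow worker, the same path would be expected to come back as "missing", which matches the "was not found" error in the question.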

If you want every worker to use a specific service account, you can tell Beam which service account to use as the workers' identity.

First, grant the roles/dataflow.worker role to the service account you want your workers to use. There is no need to download a service account key file :)
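One way to grant the role is the gcloud CLI. The sketch below only builds the command as an argument list and leaves running it to you. The helper name grant_worker_role_command is hypothetical, and the project ID and email you pass in are your own values.

```python
def grant_worker_role_command(project_id, service_account_email):
    """Build the gcloud command that grants roles/dataflow.worker.

    Hypothetical helper: it returns the argument list and does not run it.
    """
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        "--member=serviceAccount:" + service_account_email,
        "--role=roles/dataflow.worker",
    ]

# To run it, assuming gcloud is installed and authenticated:
#   import subprocess
#   subprocess.check_call(grant_worker_role_command(project_id, email))
```

Depending on what the pipeline does, the service account may need additional roles (for example, permission to call AI Platform Prediction); that is outside what this sketch covers.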

Then, if you let PipelineOptions parse the command-line arguments, you can simply use the service_account_email option and specify it when running the pipeline, like --service_account_email your-email@your-project.iam.gserviceaccount.com.
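As a rough sketch of how the flag reaches Beam, the function below hands an argument list to PipelineOptions and reads the parsed value back through GoogleCloudOptions. The function name parse_options is hypothetical, and the sketch assumes apache_beam is installed; the import sits inside the function only so that loading the snippet does not itself require Beam.

```python
def parse_options(argv):
    """Let Beam parse pipeline flags, including --service_account_email.

    Hypothetical helper. Returns the PipelineOptions object together with
    the worker service account email it parsed (None if the flag is absent).
    """
    from apache_beam.options.pipeline_options import (
        GoogleCloudOptions,
        PipelineOptions,
    )

    options = PipelineOptions(argv)
    email = options.view_as(GoogleCloudOptions).service_account_email
    return options, email
```

In a real script you would typically pass sys.argv[1:] (or let PipelineOptions read the command line itself), then construct the pipeline with beam.Pipeline(options=options) so the Dataflow runner picks up the worker identity.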

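The service account that your GOOGLE_APPLICATION_CREDENTIALS points to is only used to launch the job; every worker uses the service account specified by service_account_email. If service_account_email is not passed, the workers do not read your key file: according to the Dataflow documentation, they run as the project's Compute Engine default service account.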
