AWS EKS Pod execution role

Ngchiwa Ng
3 min read · Jan 1, 2020


This article explains how the AWS SDK obtains credentials to access AWS services from a Pod running on EKS.

The story …

Recently we migrated our application from ECS to EKS. Afterwards we found that the application's calls to AWS services had become slower, so we tried to understand why.

Our test code:

// test.js
;(async () => {
  const params = {
    Message: 'hello message',
    TopicArn: '<YOUR Topic Arn>' // replace with your SNS topic ARN
  }
  const AWS = require('aws-sdk')
  AWS.config.update({
    region: 'ap-southeast-1'
  })
  console.time('init')
  const sns = new AWS.SNS()
  console.timeEnd('init')
  console.time('pub')
  await sns.publish(params).promise() // 1st publish
  console.timeEnd('pub')
  console.time('pub2')
  await sns.publish(params).promise() // 2nd publish
  console.timeEnd('pub2')
})()

Running this code in a Pod on EKS:

node test.js
init: 4.223ms
pub: 4495.597ms //EKS result
pub2: 40.573ms

Why does publishing the first message take over 4 seconds on EKS?

The same code running on ECS:

init: 6.839ms
pub: 154.057ms //ECS result
pub2: 34.925ms

To find the answer, we added a console.log to aws-sdk so that it prints every endpoint it requests:

node_modules/aws-sdk/lib/http/node.js

AWS.NodeHttpClient = AWS.util.inherit({
  handleRequest: function handleRequest(httpRequest, httpOptions, callback, errCallback) {
    console.log("endpoint: ", httpRequest.endpoint.href) // added for tracing

Running the test code again on ECS:

init: 4.824ms
endpoint: http://169.254.170.2/v2/credentials/xxx //<
endpoint: https://sns.ap-southeast-1.amazonaws.com/
pub: 182.565ms
endpoint: https://sns.ap-southeast-1.amazonaws.com/
pub2: 29.538ms

Running the test code again on EKS:

init: 4.211ms
endpoint: http://169.254.169.254/latest/api/token
endpoint: http://169.254.169.254/latest/api/token
endpoint: http://169.254.169.254/latest/api/token
endpoint: http://169.254.169.254/latest/api/token
endpoint: http://169.254.169.254/latest/meta-data/iam/security-credentials/
endpoint: http://169.254.169.254/latest/meta-data/iam/security-credentials/XXXRole
endpoint: https://sns.ap-southeast-1.amazonaws.com/
pub: 4441.124ms
endpoint: https://sns.ap-southeast-1.amazonaws.com/
pub2: 33.226ms

The behavior on ECS and EKS is very different.

On ECS, the SDK makes a single call to http://169.254.170.2/v2/credentials/xxxx to get credentials.

On EKS, the SDK calls /latest/api/token four times and only then calls /latest/meta-data/iam/security-credentials/<EC2 Role> to get credentials. Each time the call to /latest/api/token fails, the SDK sleeps for 1 second and retries, which would account for most of the extra 4 seconds.
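The arithmetic behind that delay can be sketched as follows. This is only a simplified model: the attempt count and the 1 second sleep are taken from the trace and the description above, they are not constants read from the SDK source, and the function name is made up for illustration.

```javascript
// Simplified model of the retry cost seen in the EKS trace.
// Both inputs are assumptions taken from the trace, not SDK constants.
function estimateTokenDelayMs(failedAttempts, sleepMs) {
  let waitedMs = 0
  for (let attempt = 1; attempt <= failedAttempts; attempt++) {
    waitedMs += sleepMs // each failed token call costs one sleep
  }
  return waitedMs
}

const delayMs = estimateTokenDelayMs(4, 1000)
console.log(`estimated delay: ${delayMs}ms`) // estimated delay: 4000ms
```

Four failed calls at roughly 1 second each is close to the gap between the first publish on EKS (about 4.4 s) and on ECS (about 0.2 s).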

To understand why they differ, we need to know more about CredentialProviderChain, the chain of providers the SDK walks through to find credentials.

On ECS, the lookup succeeds at ECSCredentials.

On EKS, the lookup ends at AWS.EC2MetadataCredentials, which gets the credentials from the EC2 instance metadata service.
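The idea of a provider chain can be illustrated with a toy version. This is a sketch and not the SDK's implementation: the provider names mirror the aws-sdk classes, but the load functions are stand-ins, the lists are shortened, the order is assumed for illustration, and the lookup is synchronous here while the real one is asynchronous.

```javascript
// Toy provider chain: try each provider in order, return the first success.
// Provider names mirror the aws-sdk classes; the bodies are stand-ins.
function resolveCredentials(providers) {
  const tried = []
  for (const provider of providers) {
    tried.push(provider.name)
    try {
      const credentials = provider.load()
      return { source: provider.name, credentials, tried }
    } catch (err) {
      // Nothing available from this provider here; move on to the next one.
    }
  }
  throw new Error(`no credentials found, tried: ${tried.join(', ')}`)
}

const fail = (reason) => () => { throw new Error(reason) }
const succeed = (keyId) => () => ({ accessKeyId: keyId })

// Hypothetical ECS task: the container credentials endpoint answers.
const onEcs = [
  { name: 'ECSCredentials', load: succeed('from-ecs') },
  { name: 'EC2MetadataCredentials', load: succeed('from-ec2') }
]

// Hypothetical EKS Pod without a service account role.
const onEks = [
  { name: 'ECSCredentials', load: fail('no container credentials endpoint') },
  { name: 'TokenFileWebIdentityCredentials', load: fail('no token file configured') },
  { name: 'EC2MetadataCredentials', load: succeed('from-ec2') }
]

console.log(resolveCredentials(onEcs).source) // ECSCredentials
console.log(resolveCredentials(onEks).source) // EC2MetadataCredentials
```

The earlier a provider succeeds, the fewer lookups happen before the first real request, which matches the difference between the two traces.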

How to fix it

The fix is to have the SDK get its credentials through AWS.TokenFileWebIdentityCredentials instead.

More detail: https://docs.aws.amazon.com/eks/latest/userguide/specify-service-account-role.html

In our case we had not specified a service account role for the Pod, so the SDK fell back to EC2MetadataCredentials.
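A quick way to see whether a Pod is set up for web identity is to check its environment. The sketch below assumes that EKS injects the variables AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE into Pods whose service account has an IAM role. That is our understanding of the setup described in the linked guide, and the helper name is made up.

```javascript
// Returns the web identity settings that are missing from an environment.
// Assumption: EKS injects AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE into
// Pods whose service account has an IAM role attached.
function missingWebIdentitySettings(env) {
  const required = ['AWS_ROLE_ARN', 'AWS_WEB_IDENTITY_TOKEN_FILE']
  return required.filter((name) => !env[name])
}

const missing = missingWebIdentitySettings(process.env)
if (missing.length === 0) {
  console.log('web identity is configured for this Pod')
} else {
  console.log(`web identity not configured, missing: ${missing.join(', ')}`)
}
```

If either variable is missing, the SDK has nothing to hand to TokenFileWebIdentityCredentials and the lookup continues down the chain.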

After we adopted web identity, the test code produced this result:

node sns.js
init: 4.540ms
endpoint: https://sts.amazonaws.com/
endpoint: https://sns.ap-southeast-1.amazonaws.com/
pub: 2768.026ms
endpoint: https://sns.ap-southeast-1.amazonaws.com/
pub2: 31.137ms

It looks faster, and the four calls to /latest/api/token are gone :)

Tips:

If you want it to be even faster, do not create a new aws-sdk client object for every call; reuse a single instance (singleton). As you can see, the second publish takes only 31 ms because the credentials do not need to be fetched again.

endpoint:  https://sns.ap-southeast-1.amazonaws.com/
pub2: 31.137ms
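One way to follow this tip is a small lazy singleton helper. This is a minimal sketch with made-up names: the factory is injected so the snippet runs without aws-sdk installed, and in the real application it would be something like () => new AWS.SNS().

```javascript
// Lazily create one client and hand the same instance to every caller.
// createClient is injected so the sketch runs without aws-sdk installed;
// in the real application it would be () => new AWS.SNS().
function makeSingleton(createClient) {
  let instance = null
  return function getClient() {
    if (instance === null) {
      instance = createClient() // runs once, on first use
    }
    return instance
  }
}

let created = 0
const getSns = makeSingleton(() => {
  created += 1
  return { id: created } // stand-in for new AWS.SNS()
})

const first = getSns()
const second = getSns()
console.log(first === second, created) // true 1
```

Every module that calls getSns() shares the same client, so only the first request pays for the credential lookup.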

thank you :)
