AWS EKS Pod execution role
This article explains how the AWS SDK obtains credentials to access AWS services from a pod running in EKS.
the story …
Recently, we migrated our application from ECS to EKS and found that its calls to AWS services had become slower, so we tried to understand why.
Our testing code
// test.js
;(async () => {
  const params = {
    Message: 'hello message',
    TopicArn: '<YOUR Topic Arn>' // replace with your own topic ARN
  }
  const AWS = require('aws-sdk')
  AWS.config.update({
    region: 'ap-southeast-1'
  })

  console.time('init')
  const sns = new AWS.SNS()
  console.timeEnd('init')

  console.time('pub')
  await sns.publish(params).promise() // 1st publish message
  console.timeEnd('pub')

  console.time('pub2')
  await sns.publish(params).promise() // 2nd publish message
  console.timeEnd('pub2')
})()
We ran this code in a pod (EKS):
node test.js
init: 4.223ms
pub: 4495.597ms //EKS result
pub2: 40.573ms
Why does the first publish take over 4 seconds in EKS?
The same code running in ECS:
init: 6.839ms
pub: 154.057ms //ECS result
pub2: 34.925ms
We added a console.log to aws-sdk to find out the answer:
// node_modules/aws-sdk/lib/http/node.js
AWS.NodeHttpClient = AWS.util.inherit({
  handleRequest: function handleRequest(httpRequest, httpOptions, callback, errCallback) {
    console.log("endpoint: ", httpRequest.endpoint.href) // ADDED for tracing
    // ... rest of handleRequest unchanged
Running the test code again in ECS:
init: 4.824ms
endpoint: http://169.254.170.2/v2/credentials/xxx // <- credentials request
endpoint: https://sns.ap-southeast-1.amazonaws.com/
pub: 182.565ms
endpoint: https://sns.ap-southeast-1.amazonaws.com/
pub2: 29.538ms
Running the test code again in EKS:
init: 4.211ms
endpoint: http://169.254.169.254/latest/api/token
endpoint: http://169.254.169.254/latest/api/token
endpoint: http://169.254.169.254/latest/api/token
endpoint: http://169.254.169.254/latest/api/token
endpoint: http://169.254.169.254/latest/meta-data/iam/security-credentials/
endpoint: http://169.254.169.254/latest/meta-data/iam/security-credentials/XXXRole
endpoint: https://sns.ap-southeast-1.amazonaws.com/
pub: 4441.124ms
endpoint: https://sns.ap-southeast-1.amazonaws.com/
pub2: 33.226ms
The behavior is very different between ECS and EKS:
ECS makes a single call to http://169.254.170.2/v2/credentials/xxxx to get credentials.
EKS calls /latest/api/token 4 times and then calls /latest/meta-data/iam/security-credentials/<EC2 Role> to get credentials. When a call to /latest/api/token fails, the SDK sleeps 1 second and retries, which would explain most of the 4 seconds.
To understand why they differ, we need to know more about CredentialProviderChain.
It is the chain of providers that the SDK tries, in order, to get credentials.
In ECS, the lookup succeeds at AWS.ECSCredentials.
In EKS, the lookup falls through the earlier providers
and gets the credentials from AWS.EC2MetadataCredentials.
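To make the idea of "trying providers in order" concrete, here is a minimal sketch. It is not the SDK's code: it is synchronous (the real chain is asynchronous), it only includes the three providers discussed in this article, and the order shown is our understanding of the aws-sdk v2 default chain. The `platform` flags (`ecsTaskRole`, `serviceAccountRole`, `ec2InstanceRole`) and the helper names are made up for illustration.

```javascript
// Simplified, synchronous sketch of a provider chain (the real SDK is asynchronous).
// Try each provider in order; the first one that returns credentials wins.
function resolveFromChain(providers) {
  const tried = []
  for (const provider of providers) {
    tried.push(provider.name)
    const credentials = provider.load() // null means "not available here"
    if (credentials) {
      return { source: provider.name, tried, credentials }
    }
  }
  throw new Error('No credentials found, tried: ' + tried.join(', '))
}

// Hypothetical stand-ins for the three providers discussed in this article.
// `platform` is an invented flag object, not something the SDK reads.
function buildChain(platform) {
  return [
    {
      name: 'ECSCredentials',
      load: () => (platform.ecsTaskRole ? { via: 'http://169.254.170.2' } : null)
    },
    {
      name: 'TokenFileWebIdentityCredentials',
      load: () => (platform.serviceAccountRole ? { via: 'https://sts.amazonaws.com/' } : null)
    },
    {
      name: 'EC2MetadataCredentials',
      load: () => (platform.ec2InstanceRole ? { via: 'http://169.254.169.254' } : null)
    }
  ]
}

// ECS task with a task role: stops at the first provider.
const ecs = resolveFromChain(buildChain({ ecsTaskRole: true, ec2InstanceRole: true }))
// EKS pod without a service account role: falls through to the node's EC2 role.
const eksNodeRole = resolveFromChain(buildChain({ ec2InstanceRole: true }))

console.log(ecs.source)               // ECSCredentials
console.log(eksNodeRole.source)       // EC2MetadataCredentials
console.log(eksNodeRole.tried.length) // 3
```

The further down the chain the SDK has to go, the more lookups happen before the first real request, which matches the extra endpoints seen in the EKS trace.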
How to fix it
The fix is to have the SDK get its credentials through AWS.TokenFileWebIdentityCredentials, by specifying an IAM role for the pod's service account.
More detail: https://docs.aws.amazon.com/eks/latest/userguide/specify-service-account-role.html
In our case, we had not specified a service account role for the pod, so the SDK fell back to AWS.EC2MetadataCredentials.
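A quick way to check which path a pod is likely to take is to look at its environment variables. The sketch below is a rough diagnostic, not SDK code: `guessCredentialSource` is a hypothetical helper, it only looks at environment variables, and it ignores other providers such as the shared credentials file. The variable names are the ones the SDK and the ECS/EKS platforms use, to the best of our knowledge.

```javascript
// Guess which provider the SDK's default chain is likely to end up using,
// based only on environment variables. A diagnostic sketch, not SDK code.
function guessCredentialSource(env) {
  if (env.AWS_ACCESS_KEY_ID && env.AWS_SECRET_ACCESS_KEY) {
    return 'EnvironmentCredentials'
  }
  if (env.AWS_CONTAINER_CREDENTIALS_RELATIVE_URI || env.AWS_CONTAINER_CREDENTIALS_FULL_URI) {
    return 'ECSCredentials'
  }
  // Set in the pod when an IAM role is attached to its service account.
  if (env.AWS_WEB_IDENTITY_TOKEN_FILE && env.AWS_ROLE_ARN) {
    return 'TokenFileWebIdentityCredentials'
  }
  // Nothing else configured: fall back to the instance metadata service.
  return 'EC2MetadataCredentials'
}

console.log('likely credential source:', guessCredentialSource(process.env))
```

If this prints EC2MetadataCredentials inside an EKS pod, the service account role is probably not set up yet.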
After we adopted web identity, running the test code gave this result:
node sns.js
init: 4.540ms
endpoint: https://sts.amazonaws.com/
endpoint: https://sns.ap-southeast-1.amazonaws.com/
pub: 2768.026ms
endpoint: https://sns.ap-southeast-1.amazonaws.com/
pub2: 31.137ms
It seems faster: no more 4 calls to /latest/api/token :)
Tips:
If you want it to be even faster, do not create a new aws-sdk client object for every call; use a singleton. You can see that the 2nd publish takes only 31 ms, because it does not need to get credentials again.
endpoint: https://sns.ap-southeast-1.amazonaws.com/
pub2: 31.137ms
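One possible way to get a singleton is a small "create once" helper. This is only a sketch: `once` and `getSns` are names we made up, and the fake client below stands in for `new AWS.SNS()` so the example runs without aws-sdk installed.

```javascript
// Generic "create once, reuse afterwards" helper.
function once(factory) {
  let instance
  let created = false
  return function getInstance() {
    if (!created) {
      instance = factory()
      created = true
    }
    return instance
  }
}

// Hypothetical stand-in for `new AWS.SNS()`, so the sketch runs without aws-sdk.
let constructed = 0
const getSns = once(() => {
  constructed += 1
  return {
    publish: (params) => ({ promise: async () => ({ MessageId: 'fake' }) })
  }
})

const first = getSns()
const second = getSns()
console.log('same client:', first === second)     // prints "same client: true"
console.log('clients constructed:', constructed)  // prints "clients constructed: 1"
```

In a real application the factory would be `() => new AWS.SNS()`, and every caller would use `getSns().publish(params).promise()` instead of constructing its own client.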
thank you :)
ref: