How to use an AWS IAM role on ECS to access a DataFusion S3 object store instead of passing keys explicitly


In our current architecture, a Python microservice running on AWS reads Parquet files from S3 using DataFusion. Authentication is handled via AWS access keys passed through environment variables, as shown below.

from datafusion import SessionContext
from datafusion.object_store import AmazonS3

def get_s3_datafusion_context(bucket_name: str, table_name: str, file_key: str) -> SessionContext:
    # Build an S3 object store with static keys taken from our
    # application settings (populated from environment variables).
    s3 = AmazonS3(
        bucket_name=bucket_name,
        region=settings.AWS_DEFAULT_REGION,
        access_key_id=settings.AWS_ACCESS_KEY_ID,
        secret_access_key=settings.AWS_SECRET_ACCESS_KEY,
    )
    ctx = SessionContext()
    ctx.register_object_store("s3://", s3, None)
    # Register the Parquet file as a queryable table.
    table_path = f"s3://{bucket_name}/{file_key}"
    ctx.register_parquet(table_name, table_path)
    return ctx

However, for security reasons we have been asked to use the IAM service role attached to the EC2 instance instead of passing keys directly through environment variables. How could I change the above function to achieve that?

Note: I've tried removing access_key_id and secret_access_key from the parameters, but it still seems to expect non-empty values for those parameters.
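A minimal sketch of one possible direction, assuming the service can depend on boto3: let boto3 resolve credentials through its standard provider chain (which on ECS includes the task role endpoint and on EC2 the instance profile) and hand the resolved keys to DataFusion, rather than leaving the AmazonS3 parameters empty. Whether this works as-is depends on the installed datafusion version; see the caveat in the comments.

import boto3
from datafusion import SessionContext
from datafusion.object_store import AmazonS3

def get_s3_datafusion_context(bucket_name: str, table_name: str, file_key: str) -> SessionContext:
    # boto3 walks the standard AWS credential chain: env vars, shared
    # config, the ECS task role endpoint, then the EC2 instance profile.
    # No keys need to be present in environment variables.
    session = boto3.Session()
    creds = session.get_credentials().get_frozen_credentials()

    # Caveat (assumption to verify): credentials obtained from a role are
    # temporary and include a session token (creds.token). If the installed
    # datafusion release offers no way to pass that token to AmazonS3, S3
    # will reject these keys; in that case check whether a newer release
    # resolves role credentials itself when the key parameters are omitted.
    s3 = AmazonS3(
        bucket_name=bucket_name,
        region=session.region_name or settings.AWS_DEFAULT_REGION,
        access_key_id=creds.access_key,
        secret_access_key=creds.secret_key,
    )

    ctx = SessionContext()
    ctx.register_object_store("s3://", s3, None)
    ctx.register_parquet(table_name, f"s3://{bucket_name}/{file_key}")
    return ctx

Note also that role credentials expire after a few hours, so a long-lived service would need to rebuild the object store before expiry instead of caching the context indefinitely.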
