Monday, 6 February 2017

Using AWS Lambda to Update S3 Object Metadata


We host a client’s website on AWS infrastructure. The client is a mass media company with a presence all over the world. When we tested the site with Google’s PageSpeed Insights (https://developers.google.com/speed/pagespeed/insights/), we found that some cacheable resources had no expiration set for the browser, which was contributing to a longer page load time.
The tool therefore suggested that we leverage browser caching to reduce the page load time.
The website was delivered through CloudFront. For browser caching to take effect, the caching header has to be present in the response that comes from the origin. In our case the origin for the static assets was an S3 bucket, so its responses needed a Cache-Control header with a max-age value.
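To see whether a response actually carries the header described above, a small check like the following can help. It is a minimal sketch using only the Python standard library; the function names are illustrative assumptions and are not part of the original setup.

```python
from typing import Optional
from urllib.request import Request, urlopen


def max_age_seconds(cache_control: Optional[str]) -> Optional[int]:
    """Return the max-age value of a Cache-Control header, or None if it is absent."""
    if not cache_control:
        return None
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.strip().lower() == "max-age":
            value = value.strip().strip('"')
            return int(value) if value.isdigit() else None
    return None


def fetch_cache_control(url: str) -> Optional[str]:
    """Send a HEAD request and return the Cache-Control response header, if any."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=10) as response:
        return response.headers.get("Cache-Control")


# Hypothetical usage against one of your own static assets:
#   max_age_seconds(fetch_cache_control("https://<your-domain>/<path-to-asset>"))
```

A result of None for a static asset is the situation PageSpeed Insights was flagging.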
When a lot of static content resides in an S3 bucket and a parameter has to be set as part of each object's metadata, updating it after every upload is a manual task. When the number of files to manage is large, this quickly becomes tedious.
The requirement was to set the following:
S3→ bucket_name→ Object_name → Properties→ Metadata → Add more metadata
Key : Cache-Control    Value : max-age=604800
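The same setting can be applied to a single existing object from a script. The sketch below assumes boto3 and a client created with boto3.client("s3"); the function name is hypothetical. S3 does not edit metadata in place, so the object is copied onto itself with the REPLACE directive.

```python
def set_cache_control(s3_client, bucket: str, key: str, max_age: int = 604800) -> str:
    """Rewrite one object's metadata so that it carries Cache-Control: max-age=<max_age>.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    # Read the current headers so they are not lost by the REPLACE directive.
    head = s3_client.head_object(Bucket=bucket, Key=key)
    value = f"max-age={max_age}"  # 604800 seconds = 7 days
    s3_client.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},
        MetadataDirective="REPLACE",
        CacheControl=value,
        ContentType=head.get("ContentType", "binary/octet-stream"),
        Metadata=head.get("Metadata", {}),
    )
    return value


# Hypothetical usage:
#   import boto3
#   set_cache_control(boto3.client("s3"), "bucket_name", "Object_name")
```

Only the content type and the user-defined metadata are carried over here. Other headers, such as Content-Encoding, would need to be passed along in the same way if your objects use them.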

One workaround for the above problem is to use AWS Lambda.


AWS Lambda
AWS Lambda is a compute service that lets you run code without provisioning or managing servers. AWS Lambda executes your code only when needed and scales automatically, from a few requests per day to thousands per second. You pay only for the compute time you consume - there is no charge when your code is not running. With AWS Lambda, you can run code for virtually any type of application or backend service - all with zero administration. AWS Lambda runs your code on a high-availability compute infrastructure and performs all of the administration of the compute resources, including server and operating system maintenance, capacity provisioning and automatic scaling, code monitoring and logging.

Let’s have a look at how Lambda works:




Implementation of the above using Lambda :


  1. Sign in to your AWS account and navigate to Compute → Lambda.
  2. Create a new Lambda function. First add a trigger for your function. Lambda can be integrated with a number of AWS services; choose the service that produces your trigger event and proceed with the further configuration. I chose S3.
  3. Specify the bucket whose objects should receive the metadata, and choose the event type (object creation in my case). You can also restrict the function to objects under a given path and with a given file extension.
  4. Upload the function code to Lambda. Lambda supports code written in Node.js (JavaScript), Python, and Java (Java 8 compatible).
  5. Select the handler and a role. Grant the role the required permissions.
  6. Review and create the function.
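The original function code is not shown in this post, so here is a minimal sketch of what a Python handler for step 4 could look like. It assumes boto3 (which the Lambda Python runtime provides), an S3 object-creation trigger, and the max-age value used above; the names process_event and lambda_handler are illustrative.

```python
from urllib.parse import unquote_plus

CACHE_CONTROL = "max-age=604800"  # seven days, the value used in this post


def _s3_client():
    # Imported lazily so the logic below can be exercised without AWS access.
    import boto3
    return boto3.client("s3")


def process_event(event, s3_client):
    """Add Cache-Control to every object named in an S3 event; return the updated keys."""
    updated = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        head = s3_client.head_object(Bucket=bucket, Key=key)
        if head.get("CacheControl") == CACHE_CONTROL:
            # Already set. The copy below can itself raise an object-creation
            # event, so this check keeps the function from looping.
            continue
        s3_client.copy_object(
            Bucket=bucket,
            Key=key,
            CopySource={"Bucket": bucket, "Key": key},
            MetadataDirective="REPLACE",
            CacheControl=CACHE_CONTROL,
            ContentType=head.get("ContentType", "binary/octet-stream"),
            Metadata=head.get("Metadata", {}),
        )
        updated.append(key)
    return updated


def lambda_handler(event, context):
    """Entry point; with this file saved as lambda_function.py the handler
    setting would be lambda_function.lambda_handler (an assumed name)."""
    return {"updated": process_event(event, _s3_client())}
```

For this sketch the role from step 5 would need at least s3:GetObject and s3:PutObject on the bucket's objects, plus permission to write to CloudWatch Logs.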

Testing :
  1. Upload a file to the S3 bucket; the upload acts as the trigger for Lambda.
  2. After the upload, check that the metadata has been added:
S3 → bucket_name → Object_name → Properties → Metadata
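The same check can be scripted. Because S3 invokes the function asynchronously, the header may appear a moment after the upload, so this sketch polls a few times. It assumes a boto3 S3 client; the function name and the retry settings are hypothetical.

```python
import time


def wait_for_cache_control(s3_client, bucket, key, expected="max-age=604800",
                           attempts=5, delay_seconds=2.0):
    """Return True once the object reports the expected Cache-Control value."""
    for attempt in range(attempts):
        head = s3_client.head_object(Bucket=bucket, Key=key)
        if head.get("CacheControl") == expected:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_seconds)  # give the Lambda function time to run
    return False


# Hypothetical usage:
#   import boto3
#   wait_for_cache_control(boto3.client("s3"), "bucket_name", "Object_name")
```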


Logs & Graphs :
  1. Navigate to CloudWatch → Logs → /aws/lambda/your_Lambda_Function_name
  2. A summary of each Lambda execution is available in the logs.
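The execution summaries can also be read programmatically. The sketch below assumes a boto3 CloudWatch Logs client (boto3.client("logs")) and the REPORT line that Lambda writes at the end of each invocation; the helper names are hypothetical.

```python
import re
from typing import List, Optional

# Matches "Duration: 12.34 ms" but not "Billed Duration: ...".
REPORT_DURATION = re.compile(r"(?<!Billed )Duration: ([\d.]+) ms")


def report_duration_ms(message: str) -> Optional[float]:
    """Extract the duration in milliseconds from a Lambda REPORT log line."""
    if not message.startswith("REPORT"):
        return None
    match = REPORT_DURATION.search(message)
    return float(match.group(1)) if match else None


def recent_durations(logs_client, function_name: str, limit: int = 50) -> List[float]:
    """Return the durations found in REPORT lines of the function's log group."""
    response = logs_client.filter_log_events(
        logGroupName=f"/aws/lambda/{function_name}",
        filterPattern="REPORT",
        limit=limit,
    )
    durations = [report_duration_ms(e["message"]) for e in response.get("events", [])]
    return [d for d in durations if d is not None]


# Hypothetical usage:
#   import boto3
#   recent_durations(boto3.client("logs"), "your_Lambda_Function_name")
```

This reads only the first page of results, which is usually enough for a quick look after a test upload.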

More Use Cases for Lambda :

  1. Log Analysis
  2. Serverless Websites
  3. Automating Tasks like backups
  4. Data transformations/conversions on the fly


Potential Benefits:

  1. Serverless
  2. Event-driven
  3. Sub-second Billing
  4. Completely Automated Administration
  5. Built-in Fault Tolerance

Conclusion :
    Lambda is a service that can be used for event-driven actions. You pay only for the compute time you use, and you can run code for virtually any type of back-end service with a fully automated workflow. We were able to automate the process of adding or updating the metadata of uploaded objects. Now the client just needs to upload the objects, and Lambda takes care of the rest without any additional manual work.

