There are cases in AWS where you need to sign your HTTP requests to get past an IAM gate, especially when those requests are destined for an API endpoint. For example, if a private API Gateway is reached over a private link - passing through IAM policy checks at both the VPC endpoint and the API level (i.e. the API resource policy) - then you may need to sign those requests accordingly to successfully call your API.

In this case, the API Gateway VPC endpoint has a policy that only allows execution by an IAM entity that is part of the AWS organization, using the IAM global condition key aws:PrincipalOrgID. Without it, the VPC endpoint would happily pass any canonical HTTP request through to the API Gateway layer.

We are going to use a combination of Terraform and Python to run this setup.

Creating the testbed

To see all this play out, we need six things:

  1. VPC with private subnets and, optionally, a NAT Gateway
  2. VPC endpoint for API Gateway created in the private subnet(s)
  3. REST API of PRIVATE type with resources for GET and POST methods
  4. Lambda function to execute via the REST API
  5. EC2 bastion host configured for SSM Session Manager that will serve as the execution environment within the private subnet
  6. Python scripts (for GET and POST requests) that sign the HTTP requests to the private REST API, run from the EC2 bastion

1. VPC

This is a standard VPC with a private subnet that will host the endpoint for API Gateway (or endpoints, if you decide to create ones for SSM Session Manager as well). It will also host the Lambda behind the REST API, plus the EC2 bastion. The Terraform module registry has a popular module to quickly set one up; I used an existing VPC in this case.

2. VPC Endpoint for API

I put together a simple Terraform configuration that stands up a VPC endpoint for the API Gateway service (execute-api) and an EC2 bastion (step #5). The default_policy.json.tpl in the vpce module contains a policy with an IAM condition that expects a principal from the same organization. (Real cases might have further constraints; this condition is enough for our proof of concept.) Check out this repository to see the actual config.

.
├── main.tf
├── modules
│   ├── ec2-bastion
│   │   ├── main.tf
│   │   ├── output.tf
│   │   ├── providers.tf
│   │   └── variables.tf
│   └── vpce
│       ├── default_policy.json.tpl
│       ├── main.tf
│       ├── output.tf
│       ├── providers.tf
│       └── variables.tf
├── output.tf
├── providers.tf
└── variables.tf
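For reference, the condition rendered from default_policy.json.tpl looks roughly like this (a sketch, not the exact template from the repo; the organization ID is a placeholder):

```json
{
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": "*",
      "Action": "execute-api:Invoke",
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalOrgID": "o-exampleorgid"
        }
      }
    }
  ]
}
```

Any anonymous caller fails the StringEquals check, which is exactly what we'll observe later with a plain curl.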

3. The REST API

There are various ways to stand up a REST API in AWS these days; I think the easiest and quickest is AWS Chalice. Upon creating a sample project, it gives you boilerplate app code for your endpoints and sets up the Lambda layer as well (that last bit alone will save you some work). The working version was very brief (under 20 lines):

from chalice import Chalice

app = Chalice(app_name='hoot')

@app.route('/')
def index():
    return {'hello': 'world'}

@app.route('/greet', methods=["POST"])
def greet():
    try:
        request_data = app.current_request.json_body
        if request_data and 'name' in request_data:
            return {'hello': request_data['name']}
    except Exception as e:
        print(e)
    return {'hello': 'stranger'}

... and the deployment configuration, .chalice/config.json, didn't take much work to make the API private:

{
  "version": "2.0",
  "app_name": "hoot",
  "api_gateway_endpoint_type": "PRIVATE",
  "stages": {
    "dev": {
      "api_gateway_stage": "api",
      "subnet_ids": ["subnet-0f616cfdedd17b617", "subnet-086c890182a284389"],
      "security_group_ids": ["sg-0a89986f0238188e9"],
      "api_gateway_endpoint_vpce": "vpce-0bc94abee8e95ccd6"
    }
  }
}

Sure, production-grade code would be bigger, but for rapid development I don't look anywhere else.

A working example can be found at this GitLab repo. Follow the instructions there and you can pretty much jump to step 6.

4. The Lambda function

Look at that, Chalice already created the Lambda function for us. Moving on.

5. The EC2 bastion

The Terraform config I ran in step 2 already created the needed bastion. It serves as an environment that has access to the VPC endpoint and can run Python scripts. I used SSM Session Manager to log into the box and run the test scripts from there.

If there is a route to the VPC (via corporate network, VPN, etc.), then the bastion isn't needed and local tools can be used to test the API endpoint.

6. The Python scripts

At this point, after Chalice does its thing, we have an API endpoint we can work with. We can start an SSM session into the bastion and issue curl https://<api-id>.execute-api.us-east-1.amazonaws.com/api/ in the console, which should return this message:

{
    "Message": "User: anonymous is not authorized to perform: execute-api:Invoke on resource: arn:aws:execute-api:us-east-1:205656158400:<api-id>/api/GET/"
}

We are being blocked by the IAM policy on the VPC endpoint, which only allows requests from an IAM principal that is part of the organization; an unsigned curl request arrives as anonymous.

I put together two scripts that sign the requests - one for GET and one for POST. The files can be found at this location and can simply be pasted into an SSM session opened in the browser (AWS Management Console > EC2 > select the instance > Connect). There should be get.py and post.py after this.
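The scripts themselves aren't reproduced here, but the heart of what they do - Signature Version 4 signing - can be sketched with the standard library alone. The function name and structure below are my own, not the repo's code; the actual scripts send the resulting headers with the requests package.

```python
import datetime
import hashlib
import hmac


def _sign(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()


def sigv4_headers(method, host, path, region, access_key, secret_key,
                  session_token=None, body=b""):
    """Build SigV4 headers for an execute-api request (stdlib-only sketch)."""
    service = "execute-api"
    now = datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")

    # Step 1: canonical request (empty query string, two signed headers)
    payload_hash = hashlib.sha256(body).hexdigest()
    canonical_headers = f"host:{host}\nx-amz-date:{amz_date}\n"
    signed_headers = "host;x-amz-date"
    canonical_request = "\n".join(
        [method, path, "", canonical_headers, signed_headers, payload_hash])

    # Step 2: string to sign, scoped to date/region/service
    scope = f"{date_stamp}/{region}/{service}/aws4_request"
    string_to_sign = "\n".join(
        ["AWS4-HMAC-SHA256", amz_date, scope,
         hashlib.sha256(canonical_request.encode()).hexdigest()])

    # Step 3: derive the signing key and sign
    key = _sign(("AWS4" + secret_key).encode(), date_stamp)
    key = _sign(key, region)
    key = _sign(key, service)
    key = _sign(key, "aws4_request")
    signature = hmac.new(key, string_to_sign.encode(), hashlib.sha256).hexdigest()

    headers = {
        "x-amz-date": amz_date,
        "Authorization": (f"AWS4-HMAC-SHA256 Credential={access_key}/{scope}, "
                          f"SignedHeaders={signed_headers}, Signature={signature}"),
    }
    if session_token:  # required for temporary credentials (post.py case)
        headers["X-Amz-Security-Token"] = session_token
    return headers
```

With headers like these attached (e.g. requests.get(url, headers=sigv4_headers(...))), the VPC endpoint can resolve the caller to a real IAM principal and evaluate aws:PrincipalOrgID against it.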

Run

Both scripts need the requests package. Install it using your preferred method; I chose to set up a local virtual env and install the package there:

$ cd $HOME
$ python3 -m venv venv
$ source venv/bin/activate
(venv) $ pip install requests

The scripts read the following environment variables:

  • GWY_ID - API ID
  • AWS_REGION (us-east-1 by default)
  • AWS_ACCESS_KEY_ID
  • AWS_SECRET_ACCESS_KEY
  • AWS_SESSION_TOKEN (only post.py reads it)

Set those variables before running the scripts.

$ export GWY_ID=<api-id>
$ export AWS_ACCESS_KEY_ID=AKIA1234P6ZZZZUGABCD
...

Let's run get.py:

(venv) sh-4.2$ python get.py

BEGIN REQUEST++++++++++++++++++++++++++++++++++++
Request URL = https://<api-id>.execute-api.us-east-1.amazonaws.com/api

RESPONSE++++++++++++++++++++++++++++++++++++
Response code: 200

{"hello":"world"}

This looks better than the "not authorized" message. Let's run post.py next. This one uses temporary credentials, so set the env vars for it accordingly. Temporary credentials can be retrieved from the bastion itself:

$ curl http://169.254.169.254/latest/meta-data/iam/security-credentials/<ec2-role-name>
{
  "Code" : "Success",
  "LastUpdated" : "2022-09-06T12:08:39Z",
  "Type" : "AWS-HMAC",
  "AccessKeyId" : "hidden",
  "SecretAccessKey" : "hidden",
  "Token" : "hidden",
  "Expiration" : "2022-09-06T18:09:09Z"
}
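Rather than copy-pasting the values by hand, the JSON above maps directly onto the environment variables the scripts read. A small helper (a sketch of my own, not part of the repo) does the translation:

```python
import json


def imds_creds_to_env(doc: str) -> dict:
    """Map an IMDS security-credentials JSON document to the AWS_* env
    vars that get.py/post.py read."""
    creds = json.loads(doc)
    return {
        "AWS_ACCESS_KEY_ID": creds["AccessKeyId"],
        "AWS_SECRET_ACCESS_KEY": creds["SecretAccessKey"],
        "AWS_SESSION_TOKEN": creds["Token"],
    }
```

Feeding the curl output through this and calling os.environ.update() on the result sets everything post.py needs.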

Plug the values into the AWS_* variables, including AWS_SESSION_TOKEN, and run the post.py script:

(venv) sh-4.2$ python post.py

BEGIN REQUEST++++++++++++++++++++++++++++++++++++
Request URL = https://<api-id>.execute-api.us-east-1.amazonaws.com/api/greet

RESPONSE++++++++++++++++++++++++++++++++++++
Response code: 200

{"hello":"buddy"}

Going through this exercise helped me understand AWS authorization, IAM global condition keys, and VPC endpoint policies. While one doesn't think twice about using an endpoint with a default star policy, a slight change to it may require some preparation on the client's side.