Learning the black box of AWS — A guide for junior developers

My attempt to demystify some of the AWS terms/services

Since I was a fledgling software engineer who had no prior exposure to the industry, the learning curve was steep during the first six months of my career.

One year in, I thought I had pretty much leveled up on the basic knowledge a software engineer should have. However, it only took monitoring alerts and DB connections for me to realize that I had, in fact, miles to go.

How did alert monitoring make me doubt my knowledge?

I could not have explained in any detail what CloudWatch or EC2 was, even though both were literally everywhere. In the back of my head, there was this nagging doubt that I had little understanding of the system or the infrastructure.

I stumbled upon the answers by chance, through my company’s AWS certification programs. It took 20+ hours of videos, several more of studying cheat sheets, making notes and hands-on coding with an AWS trial account to get a grip on what was actually going on behind the scenes of running an application.

How, in fact, the application was actually running.

Now, for the full understanding of AWS services, there is no better way than to prepare for the AWS certifications. Even the simplest certification (if such a thing exists) needs up to a month of intensive training. It is difficult for everyone to carve out so much time from their busy lives. And not everyone would be motivated to do it if they didn’t first grasp what AWS is.

Here is my attempt to demystify some of the AWS terms/services one would definitely have encountered, if not worked on, even six months into a tech career.

First things first, what is AWS?

It’s a cloud platform that offers services to handle the infra side of things (to name one of many) in software. What infra side? The Computing, The Storage, The Network, The Security, among others. According to AWS, they aim to eliminate the undifferentiated heavy lifting for us. Think of DB storage or running your application in the cloud: one of the most popular choices is AWS. It offers over 200 different services, spanning Infrastructure (IaaS), Platform (PaaS) and Software (SaaS) as a Service.

Availability Zone, seems familiar, what is it exactly?

An Availability Zone (AZ) is a collection of one or more data centres, where the actual compute, storage, network and database resources that are served to us are hosted. Each AZ is grouped with other AZs within a geographical region (called an AWS Region) and linked to them by low-latency connections. However, each AZ has separate power and networking so as to minimise the impact of the failure of a single AZ. The low-latency connections let services replicate data across AZs, so that if the data centres of one AZ fail, your data can still be available (how much is replicated depends on the service and how you configure it).

What’s this daily buzz over EC2 instances in tech?

Elastic Compute Cloud (EC2) instances are virtual servers that can be provisioned on demand. We can customise the servers according to our needs, from the Operating System to the instance type and size, which ultimately boils down to the amount of compute, memory and networking capacity we would like.

We can resize these instances, keep a boilerplate template for the type of servers to be launched, and choose when they are launched or shut down.

Based on the compute, memory, networking capacity of EC2 instances, we can choose from General Purpose, Compute Optimised, Storage Optimised, Memory Optimised, Accelerated Computing instances, among others.
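To make the naming less opaque, here is a minimal sketch that splits an instance type such as m5.large into its parts. The function name and the small family-to-category table are my own assumptions for illustration; the table covers only a handful of common families and is not an official or complete list.

```python
import re

# Assumed, partial lookup table: a few common family letters only.
FAMILY_CATEGORY = {
    "t": "General Purpose",
    "m": "General Purpose",
    "c": "Compute Optimised",
    "r": "Memory Optimised",
    "x": "Memory Optimised",
    "i": "Storage Optimised",
    "d": "Storage Optimised",
    "p": "Accelerated Computing",
    "g": "Accelerated Computing",
}


def describe_instance_type(instance_type):
    """Split e.g. 'c6g.xlarge' into family, generation, attributes and size."""
    match = re.fullmatch(r"([a-z]+)(\d+)([a-z-]*)\.(.+)", instance_type)
    if match is None:
        raise ValueError(f"unrecognised instance type: {instance_type!r}")
    family, generation, attributes, size = match.groups()
    return {
        "family": family,
        "generation": int(generation),
        "attributes": attributes,  # extra letters such as 'g' or 'dn'
        "size": size,
        # Families missing from the table are reported as unknown.
        "category": FAMILY_CATEGORY.get(family, "Unknown (not in this sketch)"),
    }


info = describe_instance_type("c6g.xlarge")
```

Here the family letter picks the category, the number is the hardware generation, and the size after the dot decides how much compute, memory and networking you get.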

On the basis of pricing for these instances (which also affects availability, since Spot Instances can be reclaimed by AWS at short notice), we can choose from On-Demand, Spot Instances, Savings Plans and Reserved Instances.

Mind your Q’s!

Simple Queue Service (SQS) is a fully managed queuing service in the cloud. We can have a service push messages to SQS, which is a queue in the cloud, and have another service (the consumer) read from the queue, so that these services are isolated and independent in terms of latency. So the queue acts as a temporary repository for the messages.

A consumed message is not removed automatically: the consumer deletes it once processing succeeds, otherwise it becomes visible again after the visibility timeout. Also, not all SQS queues are FIFO, since a standard queue is distributed across multiple servers behind the scenes. This also means messages can occasionally be duplicated or arrive out of order, and the consumer should be prepared for that.

It is not strictly real-time since consumers must poll for messages in SQS. Polling is the way we retrieve messages from the queue. By default, SQS uses short polling, which returns immediately even if the queue being read from is empty, rather than long polling, which waits until a message arrives in the queue or the long poll timeout expires.

The message size for SQS ranges from 1 byte to 256KB (the limit at the time of writing; check the current quotas), and the SQS Extended Client Library, originally available for Java only, extends this to 2GB by storing the payload in S3. The message retention, which is the duration for which messages remain in SQS, is customisable from 1 minute to 14 days, the default being 4 days.
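Here is a minimal sketch of a consumer that long-polls a queue. The receive_message and delete_message calls and their parameters follow the boto3 SQS client; drain_once, FakeSqs and the queue URL are hypothetical names of mine, and the fake client exists only so the example runs without an AWS account. Note that the consumer deletes each message explicitly after handling it.

```python
def drain_once(sqs_client, queue_url, handle, wait_seconds=20):
    """Long-poll once, process each message, then delete it explicitly."""
    response = sqs_client.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=10,        # SQS returns at most 10 per call
        WaitTimeSeconds=wait_seconds,  # > 0 turns on long polling (max 20)
    )
    processed = 0
    for message in response.get("Messages", []):
        handle(message["Body"])
        # Without this delete, the message becomes visible again later.
        sqs_client.delete_message(
            QueueUrl=queue_url, ReceiptHandle=message["ReceiptHandle"]
        )
        processed += 1
    return processed


class FakeSqs:
    """Tiny in-memory stand-in (an assumption) so the sketch runs offline."""

    def __init__(self, bodies):
        self.messages = [
            {"Body": body, "ReceiptHandle": f"rh-{i}"}
            for i, body in enumerate(bodies)
        ]

    def receive_message(self, **kwargs):
        batch = self.messages[: kwargs["MaxNumberOfMessages"]]
        return {"Messages": batch} if batch else {}

    def delete_message(self, QueueUrl, ReceiptHandle):
        self.messages = [
            m for m in self.messages if m["ReceiptHandle"] != ReceiptHandle
        ]


fake = FakeSqs(["order-1", "order-2"])
seen = []
count = drain_once(fake, "https://example.invalid/queue", seen.append)
```

With a real account you would pass boto3.client("sqs") and your own queue URL instead of the fake.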

The most popular queue types SQS offers are the Standard and FIFO queues.

Standard queues allow a nearly unlimited number of transactions per second while ensuring that every message is delivered at least once. However, they have their own limitations: messages can be delivered out of order and more than once.

FIFO queues, on the other hand, maintain the ordering of messages, but are limited by default to 300 transactions per second (more with batching).
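Since a standard queue may hand over the same message twice, the consumer is usually written to be idempotent. Below is a minimal sketch that skips messages whose MessageId has already been handled. The function name and the in-memory set are assumptions for illustration; a real service would keep the seen IDs somewhere durable.

```python
def process_once(messages, handle, seen_ids=None):
    """Call handle(body) for each message, skipping repeated MessageIds."""
    seen_ids = set() if seen_ids is None else seen_ids
    handled = 0
    for message in messages:
        if message["MessageId"] in seen_ids:
            continue  # duplicate delivery, already processed
        handle(message["Body"])
        seen_ids.add(message["MessageId"])
        handled += 1
    return handled


# The second entry simulates a duplicate delivery of the first message.
deliveries = [
    {"MessageId": "a1", "Body": "charge card"},
    {"MessageId": "a1", "Body": "charge card"},
    {"MessageId": "b2", "Body": "send receipt"},
]
bodies = []
handled_count = process_once(deliveries, bodies.append)
```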

There are other types of queues as well, including dead letter queues and delay queues.

Tired of random notifications? But how do they even work?

Simple Notification Service (SNS) lets us send notifications via text messages, webhooks, emails, lambdas, SQS and mobile push notifications. SNS is a highly available, durable, and secure pub/sub messaging service in the cloud that enables us to decouple microservices, serverless apps, and distributed systems.

Publish/subscribe, or pub/sub as it is commonly known, is a messaging system in which message senders, or publishers, transmit messages to a topic, which is a communication channel. The subscribers, or receivers, can join a particular topic. So when a message appears within a topic, it is immediately forwarded to all the subscribers of that topic.

So in this case, the subscribers don’t have to poll for messages; each message is delivered to them as soon as it is available.
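The push model can be sketched in a few lines of plain Python. This is an in-memory toy to show the idea of topics and subscribers, not the SNS API; Broker and its methods are names I made up.

```python
from collections import defaultdict


class Broker:
    """Toy pub/sub broker: topics map to lists of subscriber callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Push the message to every subscriber; nobody has to poll.
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])


broker = Broker()
email_inbox, sms_inbox = [], []
broker.subscribe("order-shipped", email_inbox.append)
broker.subscribe("order-shipped", sms_inbox.append)
delivered_to = broker.publish("order-shipped", "Order 42 is on its way")
```

The publisher only knows the topic name, which is what keeps publishers and subscribers decoupled.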

SNS topics group multiple subscriptions together, and format the message according to the subscription’s protocol.

Both SQS and SNS are a means of asynchronous communication between services.

The No-brainer for NoSQL, DynamoDB!

DynamoDB is a key-value and document NoSQL DB that is designed for consistent, low-latency performance at any scale. NoSQL does not use the tabular relations used in relational databases and does not rely on SQL for querying the results. Instead, the data is stored using many different methods, including column-oriented, document-oriented, graph-based, and key-value stores.

DynamoDB is fully managed and supports an assortment of features like in-memory caching, backup and restore, multi-region and multi-active replication, built-in security, durability, and high availability.

The read/write operations per second are specified by us. Data is replicated across multiple Availability Zones (typically three) to ensure high availability. The trade-off is that an update has to reach all copies (across AZs), so a read can return stale data if it is served from a copy that has not yet been updated. You can work around this concern by specifying your read consistency.

Eventually Consistent Reads (the default): Reads are faster because it is still possible to read while copies (in different AZs) are being updated, but the result may not reflect the latest write. All copies usually become consistent within a second.

Strongly Consistent Reads: The read returns the most up-to-date data, reflecting all writes that succeeded before it. Consistency is guaranteed, at the cost of higher latency and more read capacity per read.
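In code, the choice comes down to a single flag on the read. The sketch below uses the get_item call and its ConsistentRead parameter from the boto3 DynamoDB client. The table name, key and FakeDynamoDb class are hypothetical; the fake only records the call so the example runs offline.

```python
def read_item(dynamodb_client, table_name, key, strong=False):
    """Fetch one item; strong=True asks for a strongly consistent read."""
    response = dynamodb_client.get_item(
        TableName=table_name,
        Key=key,
        ConsistentRead=strong,  # False (the default) = eventually consistent
    )
    return response.get("Item")


class FakeDynamoDb:
    """Stand-in client (an assumption) that records calls instead of using AWS."""

    def __init__(self, item):
        self.item = item
        self.calls = []

    def get_item(self, **kwargs):
        self.calls.append(kwargs)
        return {"Item": self.item}


fake_db = FakeDynamoDb({"user_id": {"S": "u-1"}, "name": {"S": "Asha"}})
item = read_item(fake_db, "Users", {"user_id": {"S": "u-1"}}, strong=True)
```

With a real account you would pass boto3.client("dynamodb") and your own table name and key.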

Run your function in the cloud, with AWS Lambda, but what does it actually mean?

With AWS Lambda, you can run your code without provisioning servers. To run your code, you don’t need to set up any virtual machines (like EC2). Servers automatically start and stop when needed. The functions are essentially serverless (since you do not need to manage servers) and are billed by the number of requests and the time your code runs. They scale automatically according to the traffic. Lambda comes with support for Java, Go, PowerShell, Node.js, C#, Python, and Ruby code. Since December ’20, Lambda has had a memory limit of 10GB, up from its previous limit of 3008MB.

Lambdas are code runners in the cloud and are often used as extensions/plugins to other services. For example, we can define a lambda to process S3 objects (like resizing images) as soon as they are uploaded to a bucket.

Lambdas, however, come with certain limitations that restrict their use cases. Lambda has a timeout of 15min, which is the maximum amount of time a function is allowed to run. Furthermore, Lambda has cold starts (overhead in function invocation), which means that it takes time for Lambda to provision and start the server, copy the code and run the function.

A way of putting additional code and content onto Lambda is with Lambda layers. A layer is a zip archive containing libraries and runtime dependencies. A function can have up to 5 layers, but there is a hard limit of 250MB on the total unzipped size (function plus layers).
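Here is a minimal sketch of what such a function looks like in Python when it reacts to an S3 notification. The handler(event, context) signature and the Records/s3 fields follow the event that S3 sends to Lambda; the bucket name, object key and the returned dictionary are made up for illustration, and the actual image processing is left out.

```python
from urllib.parse import unquote_plus


def handler(event, context):
    """Entry point that Lambda calls; here `event` is an S3 notification."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (spaces become '+').
        key = unquote_plus(record["s3"]["object"]["key"])
        uploaded.append(f"s3://{bucket}/{key}")
        # Real work (for example resizing the image) would go here.
    return {"processed": uploaded}


# Hypothetical event, trimmed to the fields the handler reads.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "photos-bucket"},
                "object": {"key": "holiday+pics/cat.jpg"},
            }
        }
    ]
}
result = handler(sample_event, None)
```

Calling the handler locally with a sample event like this is also a handy way to test a function before deploying it.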

CloudWatch is watching you, erm, your services!

CloudWatch gathers logs and metrics, as well as events over time for AWS Cloud resources and AWS-hosted applications, and displays them as statistics for your infrastructure.

Those statistics can be used to discover insights and create alarms, which can then be used to trigger automated actions to respond to changes in your AWS resources.

You can centrally manage all the data collected from your cloud-based infrastructure with CloudWatch. For example, when the CPU use exceeds a certain level, you may set CloudWatch to send out notifications or start a new instance.
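As a rough sketch, the CPU example could be described by the parameters below. The keys match what the boto3 CloudWatch client’s put_metric_alarm call expects. The function name, alarm name, threshold, instance ID and topic ARN are placeholders of mine.

```python
def cpu_alarm_params(instance_id, topic_arn, threshold_percent=80.0):
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,           # seconds per datapoint
        "EvaluationPeriods": 2,  # alarm after two breaching periods in a row
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # e.g. an SNS topic that notifies the team
    }


# Placeholder identifiers, not real resources.
params = cpu_alarm_params(
    "i-0123456789abcdef0",
    "arn:aws:sns:us-east-1:123456789012:ops-alerts",
)
# With credentials configured, this would create the alarm:
# boto3.client("cloudwatch").put_metric_alarm(**params)
```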

CloudWatch is integrated with more than 70 AWS services, and its own features include CloudWatch Logs, CloudWatch Events, CloudWatch Alarms, etc.

Time to talk about the 3 S’s

Simple Storage Service (S3) is an object-based storage service in the cloud that is built for durability and scalability. It stores data as objects (within resources that are called buckets) as opposed to file or block storage. We need to create an S3 bucket in one of the AWS Regions before uploading data (images, videos, documents, etc.). S3 has virtually unlimited storage, and we can store, retrieve and delete data from buckets at any time from anywhere.

We can manage who has access to the bucket and its objects (for instance, who can create, remove, and retrieve items), as well as see access logs for the bucket and its objects. We may also select the AWS Region in which a bucket is kept to reduce latency, save expenses, or meet regulatory requirements.

All S3 objects typically have

  1. Key: which is the name of the object

  2. Value: the data itself, as a sequence of bytes

  3. Metadata: additional info

  4. Version Id: version of the object

Objects can be up to 5 TB in size. However, a single upload request supports only up to 5GB; to upload more, we need multi-part upload. S3 buckets store objects, and "folders" are simply shared prefixes in object keys that group objects together. All S3 bucket names must be unique across all regions.

S3 supports many types of storage classes: Standard, Intelligent-Tiering, Standard-IA, One Zone-IA, Glacier and Glacier Deep Archive.
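The arithmetic behind multi-part upload can be sketched as follows. The 5GB single-upload limit comes from the paragraph above. The 5 MiB minimum part size and the 10,000-part cap are the documented multipart limits as I understand them, so check the current S3 documentation before relying on them. The function name and the 64 MiB default part size are arbitrary choices of mine.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3
SINGLE_PUT_LIMIT = 5 * GIB  # largest object a single upload can carry
MIN_PART_SIZE = 5 * MIB     # assumed minimum part size (except the last part)
MAX_PARTS = 10_000          # assumed cap on parts per multipart upload


def plan_upload(size_bytes, part_size=64 * MIB):
    """Decide between one upload and multipart, and count the parts."""
    if size_bytes <= SINGLE_PUT_LIMIT:
        return {"strategy": "single", "parts": 1}
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the minimum part size")
    # Grow the part size when needed so we never exceed MAX_PARTS.
    part_size = max(part_size, math.ceil(size_bytes / MAX_PARTS))
    return {
        "strategy": "multipart",
        "part_size": part_size,
        "parts": math.ceil(size_bytes / part_size),
    }


small = plan_upload(1 * GIB)
large = plan_upload(10 * GIB)
```

In practice the SDKs do this splitting for you; the sketch only shows why a large file ends up as many smaller parts.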

All S3 buckets are private by default. We can set the default encryption behaviour for an S3 bucket so that all new objects are encrypted when they are stored in the bucket. The objects are encrypted using server-side encryption with either Amazon S3-managed keys (SSE-S3) or AWS KMS keys stored in AWS Key Management Service (AWS KMS) (SSE-KMS).

S3 supports pre-signed URLs which grant temporary access to objects to either upload or download object data. It provides access temporarily to private data.

To protect against the deletion of files in S3, we can turn on versioning and enable MFA delete, which requires an MFA code in order to permanently delete an object version from a bucket. Only the bucket owner logged in as the root user in AWS can enable MFA delete.

Who You Are, Who I Am, that’s IAM

Identity and Access Management (IAM) is used to manage the access of AWS users and resources. It controls who can use resources by authenticating (signing in) and authorizing (granting permissions).

But why is IAM required?

IAM provides the following advantages:

Shared AWS account access: There is no need to share passwords or access keys to allow others to use your AWS account. You can achieve this with IAM roles.

Separate permission levels: Different users can have different permissions for different resources. Some users might be given full access to EC2 and other services provided by AWS. Other users can be given read-only access to some S3 buckets, but not to others.

Identity federation: You can grant temporary access to users who already have passwords somewhere else, for example in your corporate network or with an internet identity provider.

Integrated with many AWS services: IAM is integrated with services such as EC2, CloudTrail, CloudWatch, AutoScaling, Lambda, S3, DynamoDB, etc.

It’s free: If it’s free, why not use it?

IAM terminologies:

Users are AWS users who can log in to the console or interact with AWS resources programmatically. A user consists of a name and credentials.

User Groups collect IAM users and assign common permissions to them, so they share the permission level of the group, e.g. an admin group.

Roles bundle permissions/policies into an identity that a user or an AWS service can assume temporarily, instead of being tied to one person’s long-term credentials.

Policies are JSON docs that grant (or deny) permissions to a user, group or role to access services. Policies are attached to IAM identities.
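To give a feel for what such a JSON document looks like, here is a minimal sketch that builds a read-only policy for one bucket. The Version, Statement, Effect, Action and Resource keys are the standard policy elements; the function name and the bucket name are hypothetical.

```python
import json


def read_only_bucket_policy(bucket_name):
    """Policy document allowing listing a bucket and reading its objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetObject"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",    # the bucket (for listing)
                    f"arn:aws:s3:::{bucket_name}/*",  # objects inside it
                ],
            }
        ],
    }


policy = read_only_bucket_policy("reports-bucket")
policy_json = json.dumps(policy, indent=2)  # what you would attach in IAM
```

Anything not explicitly allowed is denied by default, which is why this short document is enough to express read-only access.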

Infra’s Silver Lining: CloudFormation?

CloudFormation provides a template language that allows AWS resources to be provisioned via code. It means we can provision our EC2 instances, SQS queues, databases, etc. via just a piece of code. It is the AWS Infrastructure As Code(IAC) tool. For CloudFormation, the templates are written in JSON or YAML.

What makes up a template:

  1. Template metadata: additional information about the template

  2. Description: describes the purpose of the template

  3. Parameters: values to be passed to the template at runtime (eg. instance type)

  4. Mappings: Lookup table that maps keys to values, e.g. for each region, the image id string differs, mapping region keys to different image ids

  5. Conditions: if-else statements within the template for the resources

  6. Transform: macros that CloudFormation uses to process the template (for example, to expand serverless shorthand)

  7. Resources: (mandatory) resources we want to create/provision eg-IAM, EC2 instance

  8. Outputs: values that are returned, eg-the IP addresses of new servers
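Put together, a very small template might look like the sketch below, written here as a Python dictionary and dumped to JSON. The section names and the AWS::SQS::Queue resource type are real; the logical names (OrdersQueue, RetentionSeconds, QueueArn) and the values are made up for illustration.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Minimal sketch: a single SQS queue",
    "Parameters": {
        # 345600 seconds = 4 days
        "RetentionSeconds": {"Type": "Number", "Default": 345600}
    },
    "Resources": {
        "OrdersQueue": {
            "Type": "AWS::SQS::Queue",
            "Properties": {
                # Ref reads the parameter value passed in at deploy time.
                "MessageRetentionPeriod": {"Ref": "RetentionSeconds"}
            },
        }
    },
    "Outputs": {
        # Fn::GetAtt returns an attribute of a created resource.
        "QueueArn": {"Value": {"Fn::GetAtt": ["OrdersQueue", "Arn"]}}
    },
}

template_body = json.dumps(template, indent=2)
```

Only the Resources section is required; the rest can be added as a template grows.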

AWS Quick Starts are a collection of pre-built CloudFormation templates. Once we have created and deployed a CloudFormation template, CloudFormation can be used to update the stack (of resources built). Leave it to CloudFormation to adjust, remove, and rebuild resources when changes are made to the existing template and then published.

And that concludes the non-exhaustive list of basic AWS services. I hope this article helped to shed some light on the black box that AWS is, especially to junior developers.

I’d recommend heading over to freeCodeCamp and checking out the AWS certified challenge if this made you interested/motivated enough to earn some certifications for yourself! If you’re not up for a month’s prep, then you can check out AWS fundamentals course on Coursera to learn about AWS at your pace.

P.S. Apologies for the bad puns you’d have encountered throughout the article.

See-you-nara!
