Using Amazon Web Services (AWS) with the Command Line — Data Pipelines in the Cloud (II)
Welcome back to the Data Pipelines in the Cloud series! In the first part, I introduced Airflow as a tool for orchestrating data pipelines, explained why and when to use it, and provided a brief tutorial on how to code and execute a minimal Airflow pipeline on your local environment using the Windows Subsystem for Linux.
In this second part, we’ll get closer to what an actual Airflow pipeline in production may look like. For this, we’ll learn how to use Amazon’s cloud services (AWS) through the Command Line and set up a relational database in the cloud and a bucket for object storage1. I’ll guide you through uploading a sample CSV file to the bucket, which we’ll later use as input for a simple Airflow DAG that performs a meaningful transformation on this data.
Data Pipelines in the Cloud Series
Setting Up Your AWS Environment
If you’re reading this, chances are you already have an AWS account up and running. In that case, just jump to the next section. If not, follow these steps:
Creating an AWS account
You can do this by going to the AWS homepage, clicking the “Create an AWS Account” button on the top-right corner, and then following the instructions on screen. You’ll need to provide a payment method, although almost everything in this tutorial should be covered by the free credits given to new users. If everything went well you should see the following screen at the end of the process.
Creating a regular user account (to avoid using the root user)
The username and password you just set up give you root user access, which means unrestricted permissions to do anything on AWS. It’s considered good practice to use these credentials only for creating a new, more limited, regular user account within your AWS environment. You should then use this new account to carry out your everyday operations (and, of course, set up two-factor authentication). Here is how you can do that:
- Go to console.aws.amazon.com, select root user and log in with your current credentials.
- Once in the AWS Management Console, type “IAM” in the search box at the top and select the first result. From here, click on the button Add MFA (MFA = Multi-factor authentication) and follow the instructions.
- After setting up MFA for the root user, go back to the IAM dashboard (you can do that using the search box, just like before) and then click on the ‘Users’ link located in the left sidebar, just below ‘Access management’ and ‘User groups’. Once in the Users section, press the button “Create user”.
- In the Specify user details page, write a user name for your admin account (you’ll use this name to log in) and mark the check-box ✅ Provide user access to the AWS Management Console. Then pick the option I want to create an IAM user in the blue box below2.
- Next, on the Set Permissions page, choose the option “Attach policies directly”, then write AdministratorAccess into the search box located just below “Permissions policies” and select the policy named AdministratorAccess.
- If all goes well, the details of the new user account will appear on the next page. Save all this information (including the console sign-in URL) in a secure location, like a password manager3.
- Finally, go to the console sign-in URL and enter the username and console password provided in the previous step. Once you’re in the Management Console of the admin user, you should go to IAM again and set up MFA for this account as well.
Now we’re done with the account setup! Let’s move on to setting up AWS’s Command Line Interface so we can do stuff on the cloud directly from the terminal.
Setting up the AWS Command Line Interface (CLI)
One cool thing about AWS is that everything we do there (like every button you click) is just an API call under the hood4. These API calls or requests can be sent to AWS either through the web UI we’ve been using until now or programmatically through commands sent from a terminal.
This is important because, using the Terminal, we can spare all the button-pressing and text-box-filling on the web UI and instead just paste text commands, hit Enter, and execute the same actions more quickly and with less chance of human error. We can even use full scripts that automatically perform many actions on AWS based on some schedule or custom triggers.
Let it sink in for a moment how ridiculously cool and futuristic it is to live in an era where you can literally turn on and use massive servers that 1) are not yours, 2) are prohibitively expensive to buy, and 3) may be located on the other end of the world – all for a cheap hourly rate and by just typing some words on your machine, as if they were in your own room 🤯.
Yes, it is possible to learn this power. I’ll now walk you through the whole process, from installing the AWS Command Line Interface, which is the tool required to send commands to AWS from the Terminal, to setting up the AWS services we need, like the S3 bucket and the relational database. I’ll explain each command so you can learn what they do instead of just copying and pasting them.
Note: I’ll assume you’re using Ubuntu Bash, but most commands should be the same for other Linux distros and macOS. If you’re a Windows user, please check the instructions in Part 1 on how to set up and access Ubuntu through the Windows Subsystem for Linux.
Let’s start by downloading and installing the latest version of the AWS CLI. To do this, open your bash and enter the following commands:
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
Here:
- curl is a tool for downloading (or sending) files through URLs (curl = cURL = Client for URL). The first argument we pass is the URL of the file we want to download. The -o flag, which is an alias for --output, allows us to specify the name of the file in our system after downloading it.
- unzip uncompresses the awscliv2.zip file we just downloaded.
- sudo ./aws/install uses administrator (super user) rights to execute the AWS CLI installer from the resulting uncompressed folder.
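By the way, the download URL above is for x86_64 machines. If you happen to be on an ARM-based Linux machine, AWS publishes an equivalent aarch64 installer; swapping the URL in the first command should be all that changes (the URL below follows AWS’s naming for that installer, so double-check the official install docs if it doesn’t work for you):

curl "https://awscli.amazonaws.com/awscli-exe-linux-aarch64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install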
Then, we can check that the AWS CLI was installed correctly by executing the following:
aws --version
Which should return something like aws-cli/2.XX.XX5.
Configuring the AWS CLI with an AWS Account
After installing this tool, let’s set it up with our AWS account. For this, we first need to obtain access keys that enable programmatic access to AWS.
Access keys have two components: an Access Key ID (which looks like this: AKIAIOSFODNN7EXAMPLE) and a Secret Access Key (which looks like this: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY). The Secret Access Key, being extremely sensitive, is only visible once on the AWS website during key creation. It’s crucial to store this key immediately in a secure location, such as a good password manager.
Here is how we obtain these keys:
Go to the IAM console located at console.aws.amazon.com/iam.
Click on “Users” on the left sidebar and then click on your user name within the table of users that will appear.
Select the “Security credentials” tab and then scroll down until you see the section “Access keys”. There, click on “Create access key”.
You’ll then be asked for your use case. Choose Command Line Interface (CLI).
You’ll be asked for an optional tag. You can leave this blank or write something like tutorial-airflow. It really doesn’t matter6.
Finally, your Access Key ID and Secret Access Key will be displayed. You’ll have the option to download the keys as a CSV, but I recommend that you save them immediately in a password manager. Whatever you do, don’t share your secret access key with anyone and never put it in a file that could be committed to Git.
Once you have your keys, it’s time to use them to configure the AWS CLI to work with your account. For this, go back to the terminal and execute:
aws configure
You’ll be prompted to enter the Access Key ID and Secret Access Key you got previously. Additionally, AWS will ask for your Default region name (us-east-1 is OK) and Default output format (json is fine too).
You can verify your account was correctly configured by running these commands:
cat ~/.aws/config
cat ~/.aws/credentials
Which will show the contents of the config and credentials files that the AWS CLI edits each time you configure an account.
The output should display the same values (access keys and options) that you just passed to aws configure under a section named [default]. That means your account is configured as the default profile (good!). The default profile is important because it’s the one used to execute the AWS CLI commands, well, by default7.
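For reference, the two files should look roughly like this after configuration (the values below are just placeholders in AWS’s documentation-example format, not real credentials):

# ~/.aws/config
[default]
region = us-east-1
output = json

# ~/.aws/credentials
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY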
Finally, test the AWS connection with this command:
aws sts get-caller-identity
If everything is set up correctly, you should see a JSON with your account’s details as output.
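The exact values will differ for your account, but the JSON should have roughly this shape (the account ID and user name below are placeholders):

{
    "UserId": "AIDAEXAMPLEUSERID",
    "Account": "123456789012",
    "Arn": "arn:aws:iam::123456789012:user/your-admin-user"
}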
By the way, if you made it to this point, congratulations! You’re now able to connect to and interact with AWS directly from your command line! (i.e., you can harness the power of distant servers just by typing on your computer!).
Avoiding Bill Surprises: Turning Off What We Don’t Use
Before we continue, a cautionary note. As I’ve been mentioning, having control over cloud resources is a great power, so it comes with a great responsibility. A great financial responsibility, to be specific: cloud services are like utilities where you pay for what you use (the famous pay-as-you-go model). This is usually regarded as a big advantage of the cloud: it eliminates the need for a huge upfront cost of purchasing servers and other hardware just to get your project up and running. And this is especially true when there is uncertainty about the amount and type of computing and storage needed in the future (i.e. almost always).
However, since AWS (intriguingly) doesn’t allow users to set hard limits on their spending, there is always the risk of leaving stuff turned on by mistake and then getting an astronomic bill at the end of the month8. So, yes, it’s extremely important to avoid oopsies on this front9.
Fortunately, this is another area where programmatic control of AWS (using AWS through the command line or SDK) comes in handy. If scripts and CLI commands were used to turn on cloud resources, similar scripts could be employed to shut them down once they’re no longer needed, effectively minimizing the risk of expensive human slips.
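As a small illustration, here is a minimal sketch of the kind of check you could run (or schedule) to see whether any RDS instances or S3 buckets are still alive in your account. It only reads information, so it’s safe to execute at any time:

# List RDS database instances and their current status
aws rds describe-db-instances --query 'DBInstances[*].[DBInstanceIdentifier,DBInstanceStatus]' --output table

# List all S3 buckets in the account
aws s3 ls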
How Much Will I Have to Pay to Follow This Tutorial?
If your heart skips a beat when reading words like astronomic bill, you may also be wondering exactly how much money you will have to pay to follow the upcoming part of this tutorial. The answer is less than 1 cent (0.01 USD) under two assumptions: 1) your AWS Account is new, and 2) you turn off everything we use after completing the tutorial.
The cost is very low because almost everything we’re going to do is covered by the AWS Free Tier, which allows new users to practice and learn how to use many AWS services at no charge. The cost is not exactly zero because, at some point, we have to do something10 that is not covered by the free tier. The cost of doing this is 0.005 USD per hour (half a cent per hour)11.
If you’re okay with this, let’s go ahead and implement our cloud data pipeline!
Our Data Pipeline (or What Are We Trying to Do Anyway?)
Now, it’s time to examine our pipeline’s input data and the transformations we’ll apply to it. The details will be covered in the next post, but for now, suffice it to say that our input is a sample of focused work sessions, each with a Start and End datetime, and that the pipeline will perform the following transformations:
- Read the source data.
- Aggregate it at a daily level and pivot it.
- Add the new transformed records to a database table, handling possible duplicates.
As you can see, the data is rather simple, and the transformations are straightforward (akin to a short pandas or dplyr data wrangling script), but the challenge is to perform them in an automated and unsupervised way under Airflow’s orchestration while also interacting with cloud APIs and services.
Provisioning AWS Services for Our Pipeline
Let’s explore the AWS services we can use to implement this pipeline.
An S3 Bucket to Store the Raw Data
We need a place to upload and store the raw input data (.csv files). AWS S3 (Simple Storage Service) is commonly used for storing unstructured or raw data like this due to its cost-effectiveness, ease of use and seamless integration with other services. S3 is actually simple because it is a managed service. This means all of the underlying computing and networking required to actually store and retrieve the files is abstracted away from us (🥳). We don’t need to send instructions to turn on machines and configure their networking properties or manage the underlying storage infrastructure. S3 does it all for us.
So, let’s get started by creating a bucket (a container for objects in S3) and uploading the input data to it.
First, we must define a name for the bucket. This name must be unique across ALL the S3 buckets in the world (yes, really). Also, in real-world scenarios, it is important to use a name that is extremely difficult to guess in order to minimize the risk of unauthorized access to your bucket.
For this tutorial, we’ll settle for meeting the uniqueness requirement. To ensure we don’t run into any naming conflicts, we’ll use a dynamically generated name. This way, many people can run the commands without causing name collisions.
Here is how to do it:
bucket_name="tutorial-airflow-$(date +%Y-%m-%d-%H-%M-%S)"
aws s3 mb s3://$bucket_name
echo "Bucket created: $bucket_name"
Explanation of each command:
- bucket_name=... creates a variable in the Terminal that appends the current datetime to the prefix tutorial-airflow-. This ensures uniqueness in bucket_name as long as no two people execute the code at precisely the same time (very unlikely).
- aws s3 mb s3://$bucket_name uses the s3 mb (mb = make bucket) command of the AWS CLI to create a bucket with the name stored in the previously defined variable.
- echo... displays the bucket’s name as defined by the variable $bucket_name. This step is not mandatory, but it helps us verify the bucket name (I also recommend writing down the bucket name elsewhere, just in case).
A way to verify that the bucket was created successfully is to run the following command, which will show you a list of all the buckets in your account:
aws s3 ls
Let’s move on to uploading the data:
curl -o input_data.csv https://raw.githubusercontent.com/franciscoyira/aws_airflow_tutorial/master/raw_data/input_data.csv
aws s3 cp input_data.csv s3://$bucket_name/input_data/input_data.csv
Explanation of each command:
- The first line downloads the input data from my GitHub to your local file system using the curl command we have already seen.
- The second line uses the cp (copy) command of the AWS CLI to copy the input_data.csv we just downloaded to our S3 bucket. It’s worth mentioning that everything that comes after $bucket_name is the unique identifier of the object we’re uploading (in this case, the identifier is input_data/input_data.csv).
You can verify that the file was correctly uploaded in two ways:
- You should see a confirmation message that starts with upload: ./input_data.csv to s3://... after executing the aws s3 cp command.
- You can run the ls command on s3://$bucket_name/input_data/ and the output should display the input_data.csv file.
aws s3 ls s3://$bucket_name/input_data/
A PostgreSQL Database on Amazon RDS to Store the Output
Our pipeline will store its output in a PostgreSQL relational database, similar to many real-world data pipelines, where the end users of the data consume the processed information using SQL and similar tools.
To set up a PostgreSQL database in the cloud, we’ll use AWS RDS (Relational Database Service). This managed service facilitates deploying and operating SQL-based databases like PostgreSQL, MySQL, MariaDB, Oracle, and Microsoft SQL Server. We’ll use PostgreSQL mainly because it is a widely used database included in the AWS Free Tier.
As with S3, managed here means that many low-level procedures necessary for getting the database up and running are abstracted away from us. Unfortunately, RDS is not as managed as S3, in the sense that we still have to tinker with some technical details. At a minimum, we have to pick the following:
A database engine (PostgreSQL, MariaDB, Oracle, etc.) and its version.
Type and size of the compute instance. This refers to the hardware (literally the physical machine) where the database software will run. AWS offers a gazillion different compute instances for RDS, differentiated mainly by memory, processor, network performance, and price. Since we want to stick to the Free Tier, the choice is simpler for us: we’ll go with db.t3.micro, a basic instance with 1GB of RAM, 2 vCPUs (virtual processors), and low network performance.
Security groups and network configuration. We have to assign the computer on which the database runs to a sort of firewall called a security group and add rules to allow traffic from our IP to go through this firewall.
Provisioning the Database
Here are the Bash commands that set up the PostgreSQL database we want:
GROUP_ID=$(aws ec2 create-security-group --group-name airflow-rds-sg --description "Security group for RDS accessible by Airflow" --query 'GroupId' --output text)
MY_IP=$(curl ipinfo.io/ip)
aws ec2 authorize-security-group-ingress --group-id ${GROUP_ID} --protocol tcp --port 5432 --cidr ${MY_IP}/32
MY_PASSWORD=$(openssl rand -base64 24 | tr -dc 'a-zA-Z0-9' | fold -w 12 | head -n 1)
echo $MY_PASSWORD
aws rds create-db-instance \
--db-instance-identifier airflow-postgres \
--db-instance-class db.t3.micro \
--engine postgres \
--publicly-accessible \
--allocated-storage 20 \
--db-name airflow_tutorial \
--master-username postgres \
--master-user-password ${MY_PASSWORD} \
--vpc-security-group-ids ${GROUP_ID}
Here’s a breakdown of what each line does:
- In the first line, the command between the parentheses (aws ec2...) creates our security group or firewall. The option --group-name airflow-rds-sg specifies “airflow-rds-sg” as the name of the security group. The flags --query 'GroupId' --output text ensure the command returns the Security Group ID as output in the terminal. Then, everything is wrapped in GROUP_ID=$(...) to save that ID as a variable. We need to specify the Group ID when creating the database, so it’s better to pass it as a variable than to copy and paste it manually.
- The second line retrieves our own IP address with the command curl ipinfo.io/ip12 and saves it as a Bash variable named MY_IP.
- The next line (aws ec2 authorize...) modifies our security group to allow incoming traffic from our own IP through port 5432, which is the port used by PostgreSQL. Note the use of the previously defined variables ${GROUP_ID} and ${MY_IP}. Without them, we would have to manually copy and paste the Group ID and IP to run this line (cumbersome and error-prone).
- Then, the line MY_PASSWORD=$(openssl rand ...) randomly generates a 12-character alphanumeric password and stores it as a variable. We’ll need this password to access the database, so we display it on the screen using echo $MY_PASSWORD. Copy and save it in a secure location.
- The last command (aws rds create-db-instance...) is the one that provisions the database itself. Although it appears to be composed of several instructions, it is actually a single command split into multiple lines with \ for enhanced readability (each backslash tells the shell that the command continues on the next line). Let’s break down the parameters of aws rds create-db-instance to understand what they do:
  - db-instance-identifier is the name of the computer (“compute instance”) on which the database software will run. We set “airflow-postgres”, but you can choose any name as long as it doesn’t contain special characters13.
  - db-instance-class is the type of compute instance we’ll use. To avoid charges, we choose a basic instance called db.t3.micro, which is available on the ✨Free Tier✨.
  - engine allows us to choose between database engines like MySQL, MariaDB, Oracle, and so on. As mentioned before, we’ll use postgres.
  - publicly-accessible assigns a public IP address to our database, allowing us to reach it from our own computer14.
  - allocated-storage: how many GB of storage will our database have? We have chosen 20 GB, which is the lowest value that our db-instance-class supports.
  - db-name is the name of the database itself (not the compute instance). It can be any name adhering to some basic guidelines15, or it can even be omitted altogether (in which case it defaults to “postgres”). We’ll use “airflow_tutorial” for now.
  - master-username and master-user-password are the credentials used to access the database after it’s launched. We’re setting the username to “postgres” and the password to the random string we generated earlier and stored in MY_PASSWORD.
  - vpc-security-group-ids links the database compute instance to a specific security group, which controls who can reach the instance at all. We link the instance to the ID of the security group we created earlier (stored in the GROUP_ID variable). This group is already configured to allow incoming traffic from our local IP.
You can run the following command to check that the database instance was successfully created:
aws rds describe-db-instances --db-instance-identifier airflow-postgres
The output should be a rather large JSON describing the properties and state of the instance we just created.
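Note that the instance takes a few minutes to become ready: in that JSON, the DBInstanceStatus field should eventually change from creating to available. If you’d rather not re-run the command manually, the AWS CLI also provides a waiter that simply blocks until the instance is ready:

# Returns once the RDS instance reaches the 'available' state
aws rds wait db-instance-available --db-instance-identifier airflow-postgres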
Testing the Connection
Can we connect to the database already? Not yet! There are still a couple of steps left.
First, download the Certificate Authority (CA) certificate bundle that is required to establish a secure connection to the server.
curl -o us-east-1-bundle.pem https://truststore.pki.rds.amazonaws.com/us-east-1/us-east-1-bundle.pem
Second, install two more packages: the PostgreSQL client (postgresql-client), which allows us to run queries on our database, and a command-line tool called jq, which can extract values from JSON output. This latter package is optional but recommended, as it allows us to store database parameters as variables so we don’t have to copy and paste them manually.
sudo apt install jq postgresql-client
Once the instance is available, we can retrieve its endpoint and connect to it:
output=$(aws rds describe-db-instances --db-instance-identifier airflow-postgres --query 'DBInstances[*].Endpoint.[Address,Port]')
ENDPOINT_URL=$(echo $output | jq -r '.[0][0]')
PORT=$(echo $output | jq -r '.[0][1]')
psql "host=${ENDPOINT_URL} port=${PORT} dbname=airflow_tutorial user=postgres password=${MY_PASSWORD} sslmode=verify-ca sslrootcert=us-east-1-bundle.pem"
Here is a breakdown of each of the commands above:
- In the first line, aws rds describe-db-instances... returns a JSON with a lot of information about our instance. We use --query 'DBInstances[*].Endpoint.[Address,Port]' to filter this JSON and keep only the Address or Endpoint (the URL we’ll use to reach the database) and the Port. The filtered JSON is then saved in the output variable.
- In the following two lines, we use the jq tool we just installed to parse the output JSON and extract the Endpoint and Port as separate variables. echo $output prints the whole JSON, which looks like this:
[ [ "airflow-postgres.qrwgo0b3axh0.us-east-1.rds.amazonaws.com", 5432 ] ]
- Then jq extracts the first element of the innermost level of brackets ('.[0][0]') to get the ENDPOINT_URL (airflow-postgres.qrwgo0b3axh0.us-east-1.rds.amazonaws.com), and the second element ('.[0][1]') to get the port (5432). The option -r makes jq return raw text instead of more JSON.
- Lastly, in the last line we connect to the database by invoking the PostgreSQL client (psql) and passing it a connection string with the variables we defined earlier, sslmode for secure communication, and the path to the downloaded certificate for server identity verification.
If you successfully connected to the database, you should see the prompt airflow_tutorial=> on the Terminal after executing the last command. This means the database is ready to accept SQL queries 🎉🎉🎉🎉🎉.
You can then go ahead and test it with a simple query like SELECT 'HELLO WORLD' AS test;
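If you’d rather not stay inside the interactive prompt, psql can also run a single query passed with the -c flag and then exit, reusing the same connection string (exit the prompt first with \q if you’re still inside it):

psql "host=${ENDPOINT_URL} port=${PORT} dbname=airflow_tutorial user=postgres password=${MY_PASSWORD} sslmode=verify-ca sslrootcert=us-east-1-bundle.pem" -c "SELECT 'HELLO WORLD' AS test;"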
Congratulations! You’ve successfully provisioned and connected to a relational database in the cloud.
This is a big step towards implementing our data pipeline, as the database is the final destination of the data we’ll process.
Wrap Up: Shutting Down Resources to Minimize Costs
We’re done setting up the AWS services required for our Airflow pipeline to work.
In Part 3 of this series, I’ll show you how to set up the pipeline itself, assuming you already have Airflow and Docker installed and running on your local machine. If you haven’t set them up yet, follow the instructions in Part 1 to do so. Also make sure to subscribe to the blog so you get notified when Part 3 is published.
Before wrapping up, though, let’s make sure to turn off and delete the resources we’ve just created to avoid unnecessary costs.
To do that, first exit the Postgres client with \q.
Then delete the RDS instance and the S3 bucket:
aws rds delete-db-instance --db-instance-identifier airflow-postgres --skip-final-snapshot
aws s3 rb s3://$bucket_name --force
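Deleting the RDS instance takes a few minutes. If you don’t want to guess when it’s done, the AWS CLI provides a waiter that blocks until the instance is actually gone:

# Returns once the RDS instance has been fully deleted
aws rds wait db-instance-deleted --db-instance-identifier airflow-postgres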
Once the instance deletion is complete (the waiter above returns, or simply wait a couple of minutes), delete the security group:
aws ec2 delete-security-group --group-id $GROUP_ID
Feedback Survey
I hope you enjoyed this post and found it useful. If you have any feedback, please let me know by filling out the survey below. It will take less than 2 minutes, and it will help me improve the content of this blog. Thank you!
If you’re not familiar with object storage buckets, you can think of them, for the purposes of this post, as just folders where you can store any kind of data and allow access to it from the Internet in a very efficient way. They’re often used for storing large amounts of unstructured data (e.g. photos, videos, pages of a static website) that doesn’t change very often. Here is a video with a more detailed explanation.↩︎
AWS will strongly encourage you to use the ‘Identity Centre’ option, but that feature is primarily designed for organizations, making it a bit of overkill for this tutorial. If you decide to go with that option, you’ll have to go through a bunch of extra steps that are outside the scope of this post and provide little benefit for a “playground” account.↩︎
As an extra tip, note that using the sign-in URL is just a short-cut to avoid having to write the Account ID each time you log in. The account ID is the 12-digit number you see right before .signin.aws.amazon.com. You may as well just go to signin.aws.amazon.com and then manually type the Account ID, along with the User name and Password.↩︎
An API call is a request sent by a computer program to a server to access a specific service or data. These requests must adhere to specific protocols in order to be understood by the server.↩︎
The latest stable version is 2.15.41. Your version should be equal to or higher than that if you’re installing AWS CLI now. As long as your version starts with 2, you should be fine.↩︎
Tags are only relevant when you become a heavy user of AWS and have so many resources created that you need tags to remember what each one was for. If you’re following these steps for the first time, that’s not your case. Also, you can always go back and add tags to previously untagged resources. So, yeah, it really doesn’t hurt to leave this blank.↩︎
You can also add multiple AWS accounts with distinct profile names, or even create several profiles linked to the same account but configured with different options, like different regions. For this, you would run aws configure --profile YOUR_PROFILE_NAME. This profile information would also be stored in the config and credentials files, but under another section delimited by [YOUR_PROFILE_NAME]. Then, you can append --profile YOUR_PROFILE_NAME to the aws commands to tell the AWS CLI to execute them using that profile and not the default profile. That is, you would write aws [COMMAND] --profile YOUR_PROFILE_NAME instead of just aws [COMMAND]. Also, if you want to rename an existing profile for any reason, you can just change the names between square brackets […] in the config and credentials files.↩︎
I was curious about whether there is another explanation for this besides just bad faith on AWS’s side, and, allegedly, the problem is that calculating cloud spending in real time is a very complex problem that would require a disproportionate amount of compute resources and that faces other non-trivial implementation challenges. I don’t know enough to take sides on whether that’s true or not, but here are some interesting threads on HackerNews about the topic if you want to dig deeper: 1 and 2.↩︎
The pay-as-you-go model also means that it is crucial to become familiar with cost management tools and cost optimization concepts. For many cloud projects, there are several ways to architect and implement a solution that provides the same functionality, but the final costs differ widely between these options. And then there is the aspect of monitoring. More could be written about these topics, but it exceeds the scope of this post.↩︎
Spoiler: we have to expose a public IP address in order to query our cloud relational database from our local Airflow instance.↩︎
I’m fully aware that going from 0 USD to 0.001 USD crosses some psychological threshold, and I spent A LOT of time looking for ways to tweak the tutorial so that its cost was precisely zero. The best I could come up with is having the public IP address turned off most of the time and turning it on programmatically by the Airflow pipeline itself (and then turning it off automatically by the same pipeline). Technically, the AWS Free Tier provides one free Public IP address for an EC2 (compute) instance, but I couldn’t find a way to make the relational database use that IP address instead of provisioning a new one (if you know a way to do this, please let me know!).↩︎
ipinfo.io/ip is not a special command but literally just a public website that returns your IP as plain text. It works on your web browser too.↩︎
From the AWS documentation: db-instance-identifier must contain from 1 to 63 letters, numbers, or hyphens. First character must be a letter. Can’t end with a hyphen or contain two consecutive hyphens.↩︎
This will work as long as the security group is configured to allow incoming traffic from our computer too.↩︎
For PostgreSQL, db-name has similar constraints to the db-instance-identifier: Must contain 1 to 63 letters, numbers, or underscores. Must begin with a letter. Subsequent characters can be letters, underscores, or digits (0-9). Can’t be a word reserved by the specified database engine.↩︎