Doing your Database on AWS (Part 1)

If you would like to read the other parts in this article series please go to:

The venerable database has undergone plenty of changes over the years. A computer database is, in its simplest meaning, a collection of digital data.

Database management was one of the earliest uses of computers, with IBM’s hierarchical IMS in the 1960s being one of the first. A decade later, the next step in database evolution brought us relational databases, most of which use the Structured Query Language (SQL) and are modeled around tables.

Relational databases are great for structured data that is easily organized into fields, but not so great for the deluge of unstructured data that organizations and individuals collect today. Beginning in the 2000s, databases designed for unstructured data began to emerge, with NoSQL databases that are fast and don’t require fixed table schemas. Many NoSQL databases store data as documents in formats such as JSON or XML.

There are other database models, such as the entity-relationship model, object model, array model, semantic model, document model, star schema model and more. Companies today often use a combination of different types of databases for different types of data, and the advent of Big Data – a combination of structured and unstructured data generated in large volume – has brought new challenges to database management.

With cloud computing increasing in popularity, it should come as no surprise that organizations are looking at taking their databases to the cloud, and cloud providers are gearing up to offer the options that customers want. Amazon Web Services can provide the resources for both relational and NoSQL database needs through its RDS and DynamoDB services, and if you’ve got Big Data, AWS has Redshift, a petabyte-scale data warehousing service. In this article, we’ll look more closely at each of these and how your organization can use them to best benefit your business.

Benefits of cloud-based database solutions

Databases, perhaps more than most applications, have a tendency to grow quickly. If you run your databases on premises, that means you may need to invest in new hardware in order to keep up with the growth. Putting your database in the cloud allows you to take advantage of the highly scalable nature of the cloud provider’s expansive infrastructure. At the same time, you benefit from the economies of scale so that you get better performance at lower cost.

A cloud provider can offer automatic failover in case of hardware failure so that if problems do occur, you are assured that recovery will be fast and automated, with no action required on your part.

Cloud databases are sometimes referred to as DBaaS (Database as a Service). By putting the infrastructure management in the hands of the cloud provider, you can focus on your applications without worrying about the physical layer. Databases in an IaaS environment allow you to maintain full control over your databases and the data they contain, and cloud services provide integrated monitoring, backup, redundant storage and security mechanisms to protect the integrity of your data.

Of course, DBaaS isn’t perfect, and it’s important to consider the downside as well when deciding whether to take your databases off premises. The disadvantages are those common to any cloud deployment. Although the major cloud providers have strong security mechanisms in place (stronger than those of most organizations), there are still inherent security and privacy issues involved whenever you place sensitive or regulated data in the public cloud. Access to your database is also dependent on your Internet connection, so it’s important for your organization to have reliable redundant connections in place to avoid downtime and loss of productivity.

Amazon Relational Database Service (RDS)

Amazon RDS is the solution for those who want to deploy a traditional relational database in the AWS cloud. You can set up a database based on Microsoft SQL Server, MySQL, Oracle, PostgreSQL or Amazon Aurora. You provision your choice of database engine on your database instance, and Amazon RDS automatically keeps the database up to date with security patches. RDS also automatically backs up the database for you.
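
As a rough illustration of what provisioning involves, the sketch below builds the set of parameters for a new database instance. It assumes the Python boto3 SDK and its create_db_instance call, which the article itself does not cover. The helper name build_create_params is hypothetical, and the identifier, instance class, user name, password and sizes are placeholder values rather than recommendations.

```python
def build_create_params(identifier, engine, storage_gb, username, password):
    """Return keyword arguments for boto3's rds.create_db_instance().

    Every concrete value below is an illustrative placeholder.
    """
    return {
        "DBInstanceIdentifier": identifier,
        "Engine": engine,                  # e.g. "mysql"
        "DBInstanceClass": "db.t3.micro",  # assumed small instance class
        "AllocatedStorage": storage_gb,    # size in GB
        "MasterUsername": username,
        "MasterUserPassword": password,
        "BackupRetentionPeriod": 7,        # days of automated backups to keep
        "AutoMinorVersionUpgrade": True,   # let RDS apply minor engine updates
    }


params = build_create_params("example-db", "mysql", 20, "admin", "change-me")

# With boto3 installed and AWS credentials configured, the request would be:
#   import boto3
#   boto3.client("rds").create_db_instance(**params)
for key, value in sorted(params.items()):
    if key != "MasterUserPassword":  # avoid echoing the password
        print(key, "=", value)
```

Keeping the parameters in a plain dictionary makes it easy to review them before any request is actually sent.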

Database storage options

As with all Amazon Web Services, RDS integrates with other AWS components to provide the resources that your database needs. An important element of any database is the storage system, and you have a number of choices with RDS:

  • General Purpose Storage (SSD)
  • Provisioned IOPS (SSD)
  • Magnetic Storage

When you create or make changes to a database instance in Amazon RDS, you will need to choose the storage type that you want to use along with the amount of storage capacity you require. The good news is that you can change your storage type later. To do that, you’ll need to modify the database instance. You might experience a short (up to 120 seconds) outage when you make the change. Storage is on Amazon Elastic Block Store (EBS) volumes.

The storage type you choose depends on the performance you need and how much you want to pay. Magnetic storage is the least expensive but it is shared storage; that is, multiple customers use this storage and that can affect the performance. If your storage needs are light, though, you can save money by using magnetic storage.

General purpose SSD storage is, of course, faster than storage on traditional magnetic disks. Amazon rates it at 3 IOPS (input/output operations per second) per GB, with burst speeds of up to 3000 IOPS. The capacity can range from 5 GB to 3 TB. This works well for small and medium-sized databases and is a cost-effective choice that balances price and performance.
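
To make the general purpose figures concrete, here is a small worked calculation. It uses only the two numbers quoted above (3 IOPS per GB and a 3000 IOPS burst figure); the function names are hypothetical, and any other limits that AWS may apply are deliberately left out.

```python
GP_IOPS_PER_GB = 3    # baseline rate quoted in the article
GP_BURST_IOPS = 3000  # burst figure quoted in the article


def gp_baseline_iops(size_gb):
    """Baseline IOPS for a general purpose SSD volume of size_gb gigabytes."""
    return GP_IOPS_PER_GB * size_gb


def can_burst_above_baseline(size_gb):
    """True while the burst figure is still higher than the baseline."""
    return gp_baseline_iops(size_gb) < GP_BURST_IOPS


for size_gb in (100, 500, 1000):
    print(size_gb, "GB:", gp_baseline_iops(size_gb), "baseline IOPS,",
          "burst helps" if can_burst_above_baseline(size_gb) else "no burst headroom")
```

Under these figures, a 100 GB volume has a baseline of 300 IOPS, and at 1000 GB the baseline already equals the 3000 IOPS burst figure.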

Provisioned IOPS storage is what you need if you’re serious about database performance. These volumes top out at 1 TB or 3 TB, depending on the database instance type. MySQL, Oracle and PostgreSQL offer the larger capacity whereas Microsoft SQL Server (Standard or Enterprise edition) only supports up to 1 TB. You can specify the amount of dedicated IOPS you need, up to 4000 IOPS.
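
The limits in the paragraph above can be written down as a simple pre-flight check. This is only a sketch: the engine labels are hypothetical shorthand rather than official API names, 1 TB is assumed to mean 1024 GB, and the numbers are the ones quoted in this article, which may not match current AWS limits.

```python
GB_PER_TB = 1024          # assumption: 1 TB = 1024 GB
MAX_PROVISIONED_IOPS = 4000

# Maximum Provisioned IOPS volume size per engine, as stated in the article.
MAX_STORAGE_GB = {
    "mysql": 3 * GB_PER_TB,
    "oracle": 3 * GB_PER_TB,
    "postgresql": 3 * GB_PER_TB,
    "sqlserver": 1 * GB_PER_TB,
}


def check_piops_request(engine, storage_gb, iops):
    """Return a list of problems; an empty list means the request fits."""
    problems = []
    limit_gb = MAX_STORAGE_GB.get(engine)
    if limit_gb is None:
        problems.append(f"unknown engine: {engine}")
    elif storage_gb > limit_gb:
        problems.append(
            f"{storage_gb} GB exceeds the {limit_gb} GB limit for {engine}")
    if iops > MAX_PROVISIONED_IOPS:
        problems.append(
            f"{iops} IOPS exceeds the {MAX_PROVISIONED_IOPS} IOPS limit")
    return problems


print(check_piops_request("mysql", 2048, 3000))
print(check_piops_request("sqlserver", 2048, 5000))
```

Returning a list of problems instead of raising on the first one lets you report every violated limit at once.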

If you find that you need more storage capacity, you can add storage space. This can take a while, anywhere from several hours to several days, depending on the storage type, storage size and database load. You will still be able to use the database while storage is being added, but you might see the performance slow down during this time.
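
Both changes described in this section, switching the storage type and adding capacity, are made by modifying the database instance. The sketch below builds the parameters for such a request, assuming the Python boto3 SDK and its modify_db_instance call. The helper name and the instance identifier are hypothetical; "standard", "gp2" and "io1" are the API's values for magnetic, general purpose SSD and Provisioned IOPS storage respectively.

```python
STORAGE_TYPES = {"standard", "gp2", "io1"}  # magnetic, general purpose, PIOPS


def build_modify_params(identifier, storage_type=None, storage_gb=None, iops=None):
    """Return keyword arguments for boto3's rds.modify_db_instance()."""
    params = {"DBInstanceIdentifier": identifier, "ApplyImmediately": True}
    if storage_type is not None:
        if storage_type not in STORAGE_TYPES:
            raise ValueError(f"unsupported storage type: {storage_type}")
        params["StorageType"] = storage_type
    if storage_gb is not None:
        params["AllocatedStorage"] = storage_gb  # new total size in GB
    if storage_type == "io1":
        if iops is None:
            raise ValueError("Provisioned IOPS storage needs an iops value")
        params["Iops"] = iops
    return params


change = build_modify_params("example-db", storage_type="io1",
                             storage_gb=200, iops=2000)
print(change)

# With boto3 installed and AWS credentials configured, the request would be:
#   import boto3
#   boto3.client("rds").modify_db_instance(**change)
```

Because the change can cause a brief outage and slower performance while it is applied, you may prefer to leave out ApplyImmediately so that the change waits for the instance's maintenance window.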

Database platform options

Amazon RDS can be deployed on either of the Amazon Elastic Compute Cloud (EC2) platforms: EC2-Classic and EC2-VPC. Most new customers will be using EC2-VPC. Either way, you can create a Virtual Private Cloud (VPC) and locate your database instance in it. A VPC is just a virtual network, isolated from the other virtual networks that exist in the Amazon cloud.

There are, however, some differences between the two. EC2-VPC comes with a default VPC. When you create a new database instance, it will be located in the default VPC unless you choose to place it in a different VPC that you have created. EC2-Classic does not come with a default VPC, so you need to create one if you want to put your database instance into a VPC.

The type of security group that you use to control access to your database instance also depends on which of the two platforms you’re using. On the EC2-VPC platform, you need to create an EC2 or VPC security group to provide access to your database instance.
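
A security group grants access through inbound rules. As a hedged sketch, the code below builds one such rule that opens a database engine's default port to a single address range. It assumes the rule format used by the boto3 EC2 call authorize_security_group_ingress; the helper name, the engine labels and the example address range are hypothetical.

```python
import ipaddress

# Well-known default listener ports for the engines discussed in this article.
DEFAULT_PORTS = {
    "mysql": 3306,
    "postgresql": 5432,
    "oracle": 1521,
    "sqlserver": 1433,
}


def build_ingress_rule(engine, cidr):
    """Return one inbound TCP rule allowing cidr to reach the engine's port."""
    ipaddress.ip_network(cidr)  # raises ValueError if cidr is malformed
    port = DEFAULT_PORTS[engine]
    return {
        "IpProtocol": "tcp",
        "FromPort": port,
        "ToPort": port,
        "IpRanges": [{"CidrIp": cidr}],
    }


rule = build_ingress_rule("mysql", "203.0.113.0/24")
print(rule)

# With boto3 installed and AWS credentials configured, the rule could be
# applied to a security group whose ID you supply (your_group_id is a
# placeholder):
#   import boto3
#   boto3.client("ec2").authorize_security_group_ingress(
#       GroupId=your_group_id, IpPermissions=[rule])
```

Restricting the rule to a narrow address range, rather than opening the port to every address, keeps the database reachable only from the networks that need it.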

If you don’t know which platform your AWS account uses, that information is available on the home page of the EC2 console or the RDS console. The Supported Platforms field will say only “VPC” if you’re using EC2-VPC. If you’re using EC2-Classic, that field will say “EC2, VPC.”
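
The same check can be done programmatically. The sketch below interprets the response of the boto3 EC2 call describe_account_attributes for the "supported-platforms" attribute, following the rule given above. The function name is hypothetical, and the sample responses are hand-written to match the documented response shape rather than captured from a real account.

```python
def classify_platform(response):
    """Map a describe_account_attributes response to a platform name."""
    values = set()
    for attribute in response.get("AccountAttributes", []):
        if attribute.get("AttributeName") == "supported-platforms":
            for item in attribute.get("AttributeValues", []):
                values.add(item["AttributeValue"])
    if values == {"VPC"}:
        return "EC2-VPC"
    if "EC2" in values:
        return "EC2-Classic"
    return "unknown"


# Hand-written sample responses (not real account data).
vpc_only = {"AccountAttributes": [{
    "AttributeName": "supported-platforms",
    "AttributeValues": [{"AttributeValue": "VPC"}],
}]}
classic = {"AccountAttributes": [{
    "AttributeName": "supported-platforms",
    "AttributeValues": [{"AttributeValue": "EC2"}, {"AttributeValue": "VPC"}],
}]}

print(classify_platform(vpc_only))
print(classify_platform(classic))

# With boto3 installed and AWS credentials configured, a live response
# could be fetched with:
#   import boto3
#   response = boto3.client("ec2").describe_account_attributes(
#       AttributeNames=["supported-platforms"])
```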

Summary

At this point, you should have some basic understanding of the benefits (as well as the drawbacks) of cloud-based databases and a high level overview of Amazon’s relational database offering in the AWS cloud, RDS. We have discussed the storage options that are available for RDS databases and introduced you to the two database platforms and the concept of running your database inside a Virtual Private Cloud (VPC) within the Amazon cloud.

The topic of deploying and managing a database in the AWS cloud is a complex one, and we have barely touched the basics. This multi-part series will delve much deeper into those complexities. Our next installment, Part 2, will pick up where we left off here and go into some detail about how to work with a database instance in a VPC, including some best practices. We will also talk about how you can move a database instance into a VPC if it isn’t in one already.
