AWS Anti-Patterns

From Federal Burro of Information

Graciously provided by a work Colleague. Not my own work. David (talk) 15:01, 3 November 2017 (UTC)

S3

Anti-Patterns

Amazon S3 is optimal for storing numerous classes of information that are relatively static and benefit from its durability, availability, and elasticity features. However, in a number of situations Amazon S3 is not the optimal solution. Amazon S3 has the following anti-patterns:

  • File system—Amazon S3 uses a flat namespace and isn’t meant to serve as a standalone, POSIX-compliant file system. However, by using delimiters (commonly either the ‘/’ or ‘\’ character) you can construct your keys to emulate the hierarchical folder structure of a file system within a given bucket.
  • Structured data with query—Amazon S3 doesn’t offer query capabilities: to retrieve a specific object you need to already know the bucket name and key. Thus, you can’t use Amazon S3 as a database by itself. Instead, pair Amazon S3 with a database to index and query metadata about Amazon S3 buckets and objects.
  • Rapidly changing data—Data that must be updated very frequently might be better served by a storage solution with lower read / write latencies, such as Amazon EBS volumes, Amazon RDS or other relational databases, or Amazon DynamoDB.
  • Backup and archival storage—Data that requires long-term encrypted archival storage with infrequent read access may be stored more cost-effectively in Amazon Glacier.
  • Dynamic website hosting—While Amazon S3 is ideal for websites with only static content, dynamic websites that depend on database interaction or use server-side scripting should be hosted on Amazon EC2.
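
The delimiter idea in the first bullet can be sketched in plain Python. This is only an illustration: `list_folder` is a hypothetical helper (not part of any AWS SDK), the keys are made up, and '/' is assumed as the delimiter. It groups flat keys into "files" and "subfolders" roughly the way a prefix-and-delimiter listing does, without the namespace ever becoming hierarchical.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Emulate one 'folder' level over a flat key namespace.

    Returns (files, subfolders): the keys directly under the prefix, and
    the distinct next-level prefixes that end with the delimiter.
    """
    files, subfolders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # A delimiter remains, so the key sits in a deeper "folder".
            subfolders.add(prefix + head + delimiter)
        else:
            files.append(key)
    return sorted(files), sorted(subfolders)


# Hypothetical keys in a single bucket; the namespace itself stays flat.
keys = [
    "logs/2017/11/app.log",
    "logs/2017/12/app.log",
    "logs/readme.txt",
    "photos/cat.jpg",
]
files, subfolders = list_folder(keys, prefix="logs/")
top_files, top_folders = list_folder(keys)
```

Note that nothing here supports renames, appends, or locking; "moving a folder" would still mean rewriting every key under the prefix, which is why the emulation stops short of a real file system.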

CloudFront

Anti-Patterns

Amazon CloudFront is optimal for delivery of popular static or dynamic content. However, in a number of situations Amazon CloudFront is not the optimal solution. Amazon CloudFront has the following anti-patterns:

  • Programmatic cache invalidation—While Amazon CloudFront supports cache invalidation, AWS recommends using object versioning rather than programmatic cache invalidation.
  • Infrequently requested data—It may be better to serve infrequently-accessed data directly from the origin server, avoiding the additional cost of origin fetches for data that is not likely to be reused at the edge.
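
One way to read "object versioning" in the first bullet is to put a version identifier in the object name, so that changed content gets a new URL and the old cached copy is simply never requested again. The sketch below assumes a short content hash as the version (a plain counter such as `site_v2.css` would work as well); `versioned_name` and the sample paths are hypothetical.

```python
import hashlib
from pathlib import PurePosixPath


def versioned_name(path, content, digest_len=8):
    """Return the path with a short content hash inserted before the suffix."""
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


# The same logical file, before and after an edit, gets two distinct names.
old = versioned_name("css/site.css", b"body { color: black; }")
new = versioned_name("css/site.css", b"body { color: navy; }")
```

Pages then reference the new name, and edge caches fetch it on first request; no invalidation call is involved.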

AWS Storage Gateway

Anti-Patterns

AWS Storage Gateway has the following anti-patterns:

  • Database storage—Amazon EC2 instances using Amazon EBS volumes are a natural choice for database storage and workloads.

SQS

Anti-Patterns

Amazon SQS has the following anti-patterns:

  • Binary or large messages—Amazon SQS messages must be text, and can be a maximum of 64 KB in length. If the data you need to store in a queue exceeds this length, or is binary, it is best to use Amazon S3 or Amazon RDS to store the large or binary data, and store a pointer to the data in Amazon SQS.
  • Long-term storage—If message data must be stored for longer than 14 days, Amazon S3 or some other storage mechanism is more appropriate.
  • High-speed message queuing or very short tasks—If your application requires a very high-speed message send and receive response from a single producer or consumer, use of Amazon DynamoDB or a message-queuing system hosted on Amazon EC2 may be more appropriate.
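
The pointer approach from the first bullet can be sketched with in-memory stand-ins. Everything below is an assumption made for illustration: a `dict` plays the role of the S3 bucket, a `deque` plays the role of the queue, the JSON envelope is invented, and `MAX_MESSAGE_BYTES` uses the 64 KB figure quoted above, which should be checked against the current service limit.

```python
import json
import uuid
from collections import deque

MAX_MESSAGE_BYTES = 64 * 1024  # figure quoted in the text; verify the current limit

object_store = {}  # stand-in for an S3 bucket: key -> bytes
queue = deque()    # stand-in for a queue that only carries text messages


def send(payload: bytes) -> None:
    """Enqueue small text payloads inline; store anything else and enqueue a pointer."""
    try:
        message = json.dumps({"body": payload.decode("utf-8")})
    except UnicodeDecodeError:
        message = None  # binary data cannot travel as a text message
    if message is None or len(message.encode("utf-8")) > MAX_MESSAGE_BYTES:
        key = f"payloads/{uuid.uuid4()}"
        object_store[key] = payload
        message = json.dumps({"pointer": key})
    queue.append(message)


def receive() -> bytes:
    """Dequeue one message and resolve the pointer if there is one."""
    message = json.loads(queue.popleft())
    if "pointer" in message:
        return object_store[message["pointer"]]
    return message["body"].encode("utf-8")


send(b"resize image 42")   # small text: travels inline
send(bytes(range(256)))    # binary: stored, only the key is queued
first, second = receive(), receive()
```

A real consumer would also need to decide when the stored object can be deleted, which this sketch leaves out.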

RDS

Anti-Patterns

Amazon RDS is a great solution for a cloud-based, fully managed relational database, but in a number of scenarios it may not be the right choice. Amazon RDS has the following anti-patterns:

  • Index and query-focused data—Many cloud-based solutions don’t require advanced features found in a relational database, such as joins and complex transactions. If your application is more oriented toward indexing and querying data, you may find Amazon DynamoDB to be more appropriate for your needs.
  • Numerous BLOBs—While all of the database engines provided by Amazon RDS support binary large objects (BLOBs), if your application makes heavy use of them (audio files, videos, images, and so on), you may find Amazon S3 to be a better choice.
  • Automated scalability—As stated previously, Amazon RDS provides pushbutton scaling. If you need fully automated scaling, Amazon DynamoDB may be a better choice.
  • Other database platforms—At this time, Amazon RDS provides MySQL, Oracle, and SQL Server databases. If you need another database platform (such as IBM DB2, Informix, PostgreSQL, or Sybase) you need to deploy a self-managed database on an Amazon EC2 instance by using a relational database AMI, or by installing database software on an Amazon EC2 instance.
  • Complete control—If your application requires complete, OS-level control of the database server, with full root or admin login privileges (for example, to install additional third-party software on the same server), a self-managed database on Amazon EC2 may be a better match.

DynamoDB

Anti-Patterns

Amazon DynamoDB has the following anti-patterns:

  • Prewritten application tied to a traditional relational database—If you are attempting to port an existing application to the AWS cloud, and need to continue using a relational database, you may elect to use either Amazon RDS (MySQL, Oracle, or SQL Server), or one of the many preconfigured Amazon EC2 database AMIs. You are also free to create your own Amazon EC2 instance, and install your own choice of database software.
  • Joins and/or complex transactions—While many solutions are able to leverage Amazon DynamoDB to support their users, it’s possible that your application may require joins, complex transactions, and other relational infrastructure provided by traditional database platforms. If this is the case, you may want to explore Amazon RDS or Amazon EC2 with a self-managed database.
  • BLOB data—If you plan on storing large (greater than 64 KB) BLOB data, such as digital video, images, or music, you’ll want to consider Amazon S3. However, Amazon DynamoDB still has a role to play in this scenario, for keeping track of metadata (e.g., item name, size, date created, owner, location, and so on) about your binary objects.
  • Large data with low I/O rate—Amazon DynamoDB uses SSD drives and is optimized for workloads with a high I/O rate per GB stored. If you plan to store very large amounts of data that are infrequently accessed, other storage options, such as Amazon S3, may be a better choice.

ElastiCache

Anti-Patterns

Amazon ElastiCache has the following anti-patterns:

  • Persistent data—If you need very fast access to data, but also need strong data durability (persistence), Amazon DynamoDB is probably a better choice.

Redshift

Anti-Patterns

Amazon Redshift has the following anti-patterns:

  • OLTP workloads—Amazon Redshift is a column-oriented database suited to data warehouse and analytics, where queries are typically performed over very large datasets. If your application involves online transaction processing, a traditional row-based database system, such as Amazon RDS, is a better match.
  • BLOB data—If you plan on storing binary data (e.g., video, pictures, or music), you’ll want to consider Amazon S3.

Also see Redshift Notes

EC2 Databases

Anti-Patterns

Running your own relational database on Amazon EC2 is a great solution for many users, but a number of scenarios exist where other solutions might be the better choice. Self-managed relational databases on Amazon EC2 have the following anti-patterns:

  • Index and query-focused data—Many cloud-based solutions don’t require advanced features found in a relational database, such as joins or complex transactions. If your application is more oriented toward indexing and querying data, you may find Amazon DynamoDB to be more appropriate for your needs, and significantly easier to manage.
  • Numerous BLOBs—Many relational databases support BLOBs (audio files, videos, images, and so on). If your application makes heavy use of them, you may find Amazon S3 to be a better choice. You can use a database to manage the metadata.
  • Automatic scaling—Users of relational databases on AWS can, in many cases, leverage the scalability and elasticity of the underlying AWS platform, but this requires system administrators or DBAs to perform a manual or scripted task. If you need pushbutton scaling or fully-automated scaling, you may opt for another storage choice such as Amazon DynamoDB or Amazon RDS.
  • MySQL, Oracle, SQL Server—If you are running a self-managed MySQL, Oracle, or SQL Server database on Amazon EC2, you should consider the automated backup, patching, Provisioned IOPS, replication, and pushbutton scaling features offered by a fully-managed Amazon RDS database.
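
The split suggested under "Numerous BLOBs" (binary objects in an object store, metadata in the database) might look like the following. It is a sketch under stated assumptions: `sqlite3` stands in for whichever relational database is in use, a `dict` stands in for an S3 bucket, and the `media` table, its columns, and the sample objects are all invented for the example.

```python
import sqlite3

blob_store = {}  # stand-in for an S3 bucket: key -> bytes

db = sqlite3.connect(":memory:")  # stand-in for the relational database
db.execute(
    "CREATE TABLE media ("
    " object_key TEXT PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " content_type TEXT NOT NULL,"
    " size_bytes INTEGER NOT NULL)"
)


def store_media(object_key, owner, content_type, data):
    """Write the bytes to the blob store and only the metadata to the database."""
    blob_store[object_key] = data
    db.execute(
        "INSERT INTO media VALUES (?, ?, ?, ?)",
        (object_key, owner, content_type, len(data)),
    )


store_media("video/intro.mp4", "alice", "video/mp4", b"\x00" * 2048)
store_media("img/logo.png", "alice", "image/png", b"\x89PNG" + b"\x00" * 100)
store_media("img/cat.jpg", "bob", "image/jpeg", b"\xff\xd8" + b"\x00" * 500)

# Query the metadata without reading any of the binary objects.
rows = db.execute(
    "SELECT object_key, size_bytes FROM media"
    " WHERE owner = ? ORDER BY size_bytes DESC",
    ("alice",),
).fetchall()
```

The database rows stay small and indexable, while the object key is all that is needed to fetch the bytes when they are actually wanted.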

EC2 Ephemeral Storage

Anti-Patterns

Amazon EC2 local instance store volumes are fast, free (that is, included in the price of the Amazon EC2 instance) “scratch volumes” best suited for storing temporary data that can easily be regenerated, or data that is replicated for durability. In many situations, however, other AWS storage options may be more appropriate. Amazon EC2 instance store volumes have the following anti-patterns:

  • Persistent storage—If you need persistent virtual disk storage similar to a physical disk drive for files or other data that must persist longer than the lifetime of a single Amazon EC2 instance, Amazon EBS volumes or Amazon S3 are more appropriate.
  • Relational database storage—In most cases, relational databases require storage that persists beyond the lifetime of a single Amazon EC2 instance, making Amazon EBS volumes the natural choice.
  • Shared storage—Instance store volumes are dedicated to a single Amazon EC2 instance, and cannot be shared with other systems or users. If you need storage that can be detached from one instance and attached to a different instance, or if you need the ability to share data easily, Amazon S3 or Amazon EBS volumes are the better choice.
  • Snapshots—If you need the convenience, long-term durability, availability, and shareability of point-in-time disk snapshots, Amazon EBS volumes are a better choice.

EBS Volumes

Anti-Patterns

As described previously, Amazon EBS is ideal for information that needs to be persisted beyond the life of a single Amazon EC2 instance. However, in certain situations other AWS storage options may be more appropriate. Amazon EBS has the following anti-patterns:

  • Temporary storage—If you are using Amazon EBS for temporary storage (such as scratch disks, buffers, queues, and caches), consider using local instance store volumes, Amazon SQS, or ElastiCache (Memcached or Redis).
  • Highly-durable storage—If you need very highly-durable storage, use Amazon S3 or Amazon Glacier. Amazon S3 standard storage is designed for 99.999999999% annual durability per object. In contrast, Amazon EBS volumes with less than 20 GB of modified data since the last snapshot are designed for between 99.5% and 99.9% annual durability; volumes with more modified data can be expected to have proportionally lower durability.
  • Static data or web content—If your data doesn’t change that often, Amazon S3 may represent a more cost-effective and scalable solution for storing this fixed information. Also, web content served out of Amazon EBS requires a web server running on Amazon EC2, while you can deliver web content directly out of Amazon S3.
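
To put the durability percentages in the "Highly-durable storage" bullet side by side, the arithmetic below converts them into expected losses per year. It is a rough reading of the design figures only: it assumes independent failures, ignores snapshots and replication, and the counts (10,000 objects, 1,000 volumes) are arbitrary example sizes. `Decimal` is used because a figure like 99.999999999% does not survive ordinary floating-point subtraction cleanly.

```python
from decimal import Decimal


def expected_annual_losses(count, annual_durability_pct):
    """Expected number of items lost per year at a given annual durability (%)."""
    loss_probability = 1 - Decimal(annual_durability_pct) / 100
    return count * loss_probability


# 10,000 objects at the S3 design figure quoted above.
s3_losses = expected_annual_losses(10_000, "99.999999999")
years_per_s3_loss = 1 / s3_losses

# 1,000 volumes at the two ends of the EBS range quoted above.
ebs_best = expected_annual_losses(1_000, "99.9")
ebs_worst = expected_annual_losses(1_000, "99.5")
```

Under these assumptions the S3 figure works out to about one lost object per ten million years for 10,000 objects, while the EBS range works out to between one and five lost volumes per year for 1,000 volumes, which is why regular snapshots matter for EBS.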

Import/Export

Anti-Patterns

  • AWS Import/Export is optimal for large data that would take too long to load over the Internet, so the anti-pattern is simply data that is more easily transferred over the Internet. If your data can be transferred over the Internet in less than one week, AWS Import/Export may not be the ideal solution.
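
The one-week rule of thumb can be checked with a quick estimate. The sketch below is illustrative only: the function names are made up, terabytes are treated as decimal (10^12 bytes), and the 80% link utilization is an assumed figure rather than anything stated above.

```python
def transfer_days(data_terabytes, link_mbps, utilization=0.8):
    """Days needed to move the data over a link at the given utilization."""
    bits = data_terabytes * 1e12 * 8  # decimal terabytes to bits
    usable_bits_per_second = link_mbps * 1e6 * utilization
    return bits / usable_bits_per_second / 86_400  # seconds per day


def prefer_shipping(data_terabytes, link_mbps, threshold_days=7):
    """True when the online transfer would take longer than the threshold."""
    return transfer_days(data_terabytes, link_mbps) > threshold_days


days_10tb = transfer_days(10, 100)  # 10 TB over a 100 Mbit/s link
days_1tb = transfer_days(1, 100)    # 1 TB over the same link
```

With these assumptions, 10 TB over a 100 Mbit/s link takes roughly eleven and a half days, so shipping is worth considering, while 1 TB takes a little over a day and is easier to send over the Internet.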

Glacier

Anti-Patterns

Amazon Glacier has the following anti-patterns:

  • Rapidly changing data—Data that must be updated very frequently might be better served by a storage solution with lower read/write latencies, such as Amazon EBS or a database.
  • Real-time access—Data stored in Amazon Glacier is not available in real time. Retrieval jobs typically require 3-5 hours to complete, so if you need immediate access to your data, Amazon S3 is a better choice.