AWS EFS Deep Dive: What is it and when to use it
Introduction
Amazon Elastic File System (EFS) is a fully managed shared file storage service. It integrates with other AWS services such as EC2, and its main benefits are scalability, availability, and durability.
Benefits
The main benefits of EFS are summarized below:
Fully Managed
Before EFS was launched, customers had to provision their own file servers and storage volumes, and replicate the data themselves to keep it durable and available. EFS is a fully managed service, which means that AWS manages the underlying file storage infrastructure. EFS provides a simple interface that allows customers to create a new file system with a few clicks.
Elastic & Scalable
EFS automatically scales the storage of your file system based on the amount of data stored in it. The file system grows as you add files and shrinks as you remove them. Customers don’t need to provision any storage in advance.
EFS can support a petabyte-scale file system and the throughput of the file system also scales with the capacity of the file system.
Highly-Available & Durable
EFS is designed to be highly available and durable. Any data that you store in EFS is replicated within and across Availability Zones (AZs) for durability.
Shared Access
An EFS file system can be mounted by thousands of instances that access it concurrently. An EFS file system can be accessed in the following ways:
- Within the same VPC: Connect to EC2 instances in the VPC where the file system lives
- AWS Direct Connect: Connect to servers running on-premises (on-prem servers)
- Intra-region VPC peering: Connect to EC2 instances in a peered VPC in the same region
- Inter-region VPC peering: Connect to EC2 instances in other regions
- AWS Transit Gateway: Connect to EC2 instances in a different VPC
- Across accounts via Shared VPC: Connect to EC2 instances across different accounts
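As a rough illustration of what mounting looks like from a client, the sketch below builds the DNS name of a file system and an NFS mount command for it. EFS is mounted over NFSv4, and the mount options shown are the commonly recommended ones. The file system ID, region, and mount point are hypothetical placeholders, and the helper names are made up for this example.

```python
def efs_dns_name(file_system_id: str, region: str) -> str:
    """Return the regional DNS name for an EFS file system."""
    return f"{file_system_id}.efs.{region}.amazonaws.com"


def nfs_mount_command(file_system_id: str, region: str,
                      mount_point: str = "/mnt/efs") -> str:
    """Build (but do not run) an NFSv4.1 mount command for the file system."""
    options = ",".join([
        "nfsvers=4.1",
        "rsize=1048576",
        "wsize=1048576",
        "hard",
        "timeo=600",
        "retrans=2",
        "noresvport",
    ])
    target = f"{efs_dns_name(file_system_id, region)}:/"
    return f"sudo mount -t nfs4 -o {options} {target} {mount_point}"


# Hypothetical file system ID and region, for illustration only.
print(nfs_mount_command("fs-12345678", "us-east-1"))
```

Every client that can reach a mount target of the file system over the network paths listed above would run an equivalent command.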
Performance Modes
EFS provides two performance modes when creating a new file system:
- General-purpose: This is the default performance mode for EFS file systems. AWS recommends this mode for the majority of workloads
- Max I/O: Max I/O mode is recommended for large-scale workloads
The table below summarizes the differences between the two modes:
| | General purpose | Max I/O |
|---|---|---|
| When to use it? | Latency-sensitive applications, general-purpose workloads | Large-scale and data-heavy applications |
| Advantages | Lowest latencies for file operations | Unlimited scalability for throughput/IOPS |
| Limitations | Limit of 35,000 IOPS | Higher metadata latencies |
Throughput Modes
EFS also provides two throughput modes:
- Bursting Throughput: This is the default throughput mode, where throughput for file operations scales with the amount of data stored in the file system. EFS provides a baseline rate of 50 KB/s per GB of data stored. EFS also provides burst credits that can be spent to get a higher throughput for a limited time. [This article][efs-credits] explains how the burst credits work in more detail.
- Provisioned Throughput: This mode allows you to provision throughput for the file system regardless of the size of the file system. In this case, Amazon charges you for both the storage and the throughput provisioned.
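To make the bursting baseline concrete, here is a minimal sketch that turns the 50 KB/s per GB rate into a baseline throughput for a given amount of stored data. It assumes binary units (KiB, MiB, GiB) and ignores burst credits entirely; the function name is made up for this example.

```python
KIB_PER_MIB = 1024
BASELINE_KIB_PER_S_PER_GIB = 50  # baseline rate quoted above


def baseline_throughput_mibps(stored_gib: float) -> float:
    """Baseline throughput in MiB/s for a file system in bursting mode."""
    if stored_gib < 0:
        raise ValueError("stored_gib must not be negative")
    return stored_gib * BASELINE_KIB_PER_S_PER_GIB / KIB_PER_MIB


for size_gib in (100, 1024, 10240):
    print(f"{size_gib} GiB stored -> "
          f"{baseline_throughput_mibps(size_gib):.2f} MiB/s baseline")
```

Under these assumptions, 1 TiB of stored data yields a 50 MiB/s baseline. If a workload needs more than its baseline for sustained periods, that is a hint to consider provisioned throughput.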
| | Bursting throughput | Provisioned throughput |
|---|---|---|
| When to use it? | Workloads with varying throughput | Workloads that need consistently high throughput |
| Advantages | Throughput scales with storage | User-defined throughput |
| Limitations | Fixed throughput to storage ratio | Throughput needs to be changed manually. Charged for both storage and throughput |
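Both the performance mode and the throughput mode are chosen through parameters of the EFS `CreateFileSystem` API. The sketch below only builds and validates a parameter dictionary that could be passed to boto3 as `boto3.client("efs").create_file_system(**params)`. It does not call AWS. The helper name and the creation token are hypothetical, and the validation covers only the modes discussed in this article.

```python
from typing import Any, Dict, Optional

PERFORMANCE_MODES = {"generalPurpose", "maxIO"}
THROUGHPUT_MODES = {"bursting", "provisioned"}


def build_create_file_system_params(
    creation_token: str,
    performance_mode: str = "generalPurpose",
    throughput_mode: str = "bursting",
    provisioned_mibps: Optional[float] = None,
    encrypted: bool = True,
) -> Dict[str, Any]:
    """Build keyword arguments for an EFS create_file_system call."""
    if performance_mode not in PERFORMANCE_MODES:
        raise ValueError(f"unknown performance mode: {performance_mode}")
    if throughput_mode not in THROUGHPUT_MODES:
        raise ValueError(f"unknown throughput mode: {throughput_mode}")

    params: Dict[str, Any] = {
        "CreationToken": creation_token,
        "PerformanceMode": performance_mode,
        "ThroughputMode": throughput_mode,
        "Encrypted": encrypted,
    }
    if throughput_mode == "provisioned":
        # Provisioned mode requires an explicit throughput figure.
        if provisioned_mibps is None or provisioned_mibps <= 0:
            raise ValueError("provisioned mode needs a positive throughput")
        params["ProvisionedThroughputInMibps"] = float(provisioned_mibps)
    elif provisioned_mibps is not None:
        raise ValueError("provisioned_mibps only applies to provisioned mode")
    return params


# Hypothetical token; any unique string would do.
print(build_create_file_system_params("demo-token", throughput_mode="provisioned",
                                      provisioned_mibps=128))
```

Keeping the validation separate from the API call makes it easy to check a configuration before anything is created.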
Storage Classes
Amazon EFS also offers two storage classes for data stored in EFS:
- Standard Storage: This is the default storage class. Standard storage is designed for frequently accessed files.
- Infrequent Access: This storage class is designed for files that are accessed less frequently. Data stored on the infrequent access storage class costs less than Standard.
Once customers enable lifecycle management for their file system, files that have not been accessed for the period set in the lifecycle policy (for example, 30 days) are moved from Standard storage to EFS Infrequent Access.
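Lifecycle management is configured through the EFS `PutLifecycleConfiguration` API, which takes a list of policies such as `{"TransitionToIA": "AFTER_30_DAYS"}`. The sketch below builds that list for a chosen number of days. The helper name is hypothetical, and the set of allowed day counts is an assumption covering a subset of the values the API accepts.

```python
from typing import Dict, List

# Assumed subset of the day counts accepted for TransitionToIA.
ALLOWED_IA_DAYS = (7, 14, 30, 60, 90)


def build_lifecycle_policies(days_without_access: int = 30) -> List[Dict[str, str]]:
    """Build the LifecyclePolicies list for moving idle files to Infrequent Access."""
    if days_without_access not in ALLOWED_IA_DAYS:
        raise ValueError(
            f"days_without_access must be one of {ALLOWED_IA_DAYS}"
        )
    return [{"TransitionToIA": f"AFTER_{days_without_access}_DAYS"}]


# With boto3 this could be applied roughly as:
#   boto3.client("efs").put_lifecycle_configuration(
#       FileSystemId="fs-12345678",  # hypothetical ID
#       LifecyclePolicies=build_lifecycle_policies(30),
#   )
print(build_lifecycle_policies(30))
```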
Security
Amazon EFS provides the following security controls:
- Network traffic: Security groups and network ACLs can be used to control which clients can reach the file system over the network.
- File and directory access: EFS supports POSIX permissions on the file system.
- Data encryption: EFS supports two forms of encryption for file systems: encryption of data at rest and encryption of data in transit.
- Identity and Access Management: AWS IAM can be used to restrict access to the EFS file system. IAM supports both action-level and resource-level policies for EFS.
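As one way the IAM and encryption-in-transit controls can be combined, the sketch below builds a file system policy that lets a single IAM role mount the file system, optionally with write access, and only over TLS. The `elasticfilesystem:ClientMount` and `elasticfilesystem:ClientWrite` actions and the `aws:SecureTransport` condition key are real. The ARNs, account ID, and helper name are hypothetical placeholders.

```python
import json
from typing import Any, Dict


def build_file_system_policy(file_system_arn: str, role_arn: str,
                             allow_write: bool = False) -> Dict[str, Any]:
    """Build an EFS file system policy for one role, requiring TLS."""
    actions = ["elasticfilesystem:ClientMount"]
    if allow_write:
        actions.append("elasticfilesystem:ClientWrite")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowClientAccessOverTls",
                "Effect": "Allow",
                "Principal": {"AWS": role_arn},
                "Action": actions,
                "Resource": file_system_arn,
                # Only allow connections that use encryption in transit.
                "Condition": {"Bool": {"aws:SecureTransport": "true"}},
            }
        ],
    }


# Hypothetical ARNs, for illustration only.
policy = build_file_system_policy(
    "arn:aws:elasticfilesystem:us-east-1:123456789012:file-system/fs-12345678",
    "arn:aws:iam::123456789012:role/app-role",
    allow_write=True,
)
print(json.dumps(policy, indent=2))
```

The resulting JSON string could be attached with `put_file_system_policy`. Security groups and POSIX permissions still apply on top of it.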
Pricing
Amazon EFS uses a pay-as-you-go model, so customers pay only for the resources they use. There are no minimum fees or upfront commitments. EFS pricing is determined by your choice of throughput mode and storage class, and by how much data you store. This article provides a comprehensive overview of EFS pricing.
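The pricing structure can be sketched as a simple sum of storage charges per storage class plus an optional provisioned throughput charge. The function below is a back-of-the-envelope model only: all prices are supplied by the caller, the figures in the example are made-up placeholders and not actual AWS prices, and any per-request or data-access charges are ignored.

```python
def estimate_monthly_cost(
    standard_gib: float,
    ia_gib: float,
    standard_price_per_gib: float,
    ia_price_per_gib: float,
    provisioned_mibps: float = 0.0,
    throughput_price_per_mibps: float = 0.0,
) -> float:
    """Estimate a monthly bill from storage per class and provisioned throughput."""
    values = (standard_gib, ia_gib, standard_price_per_gib,
              ia_price_per_gib, provisioned_mibps, throughput_price_per_mibps)
    if any(v < 0 for v in values):
        raise ValueError("sizes and prices must not be negative")
    storage_cost = (standard_gib * standard_price_per_gib
                    + ia_gib * ia_price_per_gib)
    throughput_cost = provisioned_mibps * throughput_price_per_mibps
    return storage_cost + throughput_cost


# Placeholder prices for illustration; look up current prices for your region.
cost = estimate_monthly_cost(
    standard_gib=500, ia_gib=1500,
    standard_price_per_gib=0.25, ia_price_per_gib=0.02,
    provisioned_mibps=10, throughput_price_per_mibps=5.0,
)
print(f"Estimated monthly cost: {cost:.2f}")
```

Even with placeholder numbers, the model shows why moving rarely accessed data to Infrequent Access can lower the storage part of the bill.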
When to use it?
Amazon EFS is deeply integrated with other AWS services. It can be used with the following services to build applications that need scalable and reliable file storage:
- EC2
- ECS
- EKS
- Fargate
- SageMaker
Some of the common use-cases for Amazon EFS are:
EFS as container storage
EFS is integrated with container services such as EKS, ECS, and Fargate. Containerized applications frequently need durable file storage that outlives any single container, and EFS is a good fit for such use-cases. Examples of such applications are Jupyter notebooks and Jenkins agents.
Example: Persistent storage for container logging using Amazon EFS
EFS as shared storage for scale-out apps
EFS can also be used as shared storage for applications that distribute their load across multiple instances. Examples of such applications are content management systems like WordPress, video processing, and machine learning training.
Example: How Alpha Vertex uses EFS
AWS EFS vs EBS vs S3
The other popular storage services provided by AWS are EBS (block storage) and S3 (object storage). This article provides a comprehensive overview of the differences between the three and how to choose the right service for your use-case.
Conclusion
Amazon EFS is one of the most popular storage services provided by Amazon. Understanding its key concepts and features helps you choose it for the right use-case and get the best performance out of it.