In this post, we’ll look at how to optimize S3 performance without relying on Transfer Acceleration to speed up transfers.
First, we’ll examine S3 prefixes: what they are and how to use them to optimize S3 performance. We’ll also look at the limitations that apply when using KMS, Amazon’s encryption service.
Second, we’ll analyze S3 performance for both uploads and downloads.
Optimize S3 Performance with Prefixes
When we create a new S3 bucket, we define a bucket name. The path (or URL) of an object can then contain folders, for example a folder FolderA with a subfolder SubB, followed by the object name, such as documentA.pdf.
An S3 prefix is simply the folder path inside a bucket.
So, in the example above, the S3 prefix is /FolderA/SubB.
S3 Prefix: /FolderB
A prefix can also be a single folder, such as /FolderB above, with nothing inside it and no other sub-folders.
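The idea of a prefix can be sketched in a few lines of code. This is a minimal illustration, not an AWS API; the keys below are hypothetical examples:

```python
# Sketch: deriving the prefix from an S3 object key.
# The keys below are hypothetical example keys.

def s3_prefix(key: str) -> str:
    """Return everything before the final object name, i.e. the prefix."""
    head, _, _ = key.rpartition("/")
    return "/" + head if head else "/"

print(s3_prefix("FolderA/SubB/documentA.pdf"))  # /FolderA/SubB
print(s3_prefix("FolderB/report.csv"))          # /FolderB
print(s3_prefix("topLevel.txt"))                # / (no prefix folders)
```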
How can S3 prefixes give us better performance?
Amazon S3 has remarkably low latency: you can typically get the first byte out of S3 within approximately 100–200 ms, and you can achieve a high request rate, 3,500 COPY/PUT/DELETE/POST requests and 5,500 GET/HEAD requests per second per prefix.
So, the more prefixes you have inside your S3 bucket, the better the performance you can get, and the essential number to look at is the 5,500 GET requests. When we access objects under one specific prefix, GET requests are limited to 5,500 per second per prefix. That means if we want better read performance out of S3, we should spread our reads across various prefixes (folders).
Let’s see an example: if you’re utilizing 3 prefixes, you can reach 5,500 × 3 = 16,500 requests per second.
Or, if we were to use 4 different prefixes, we would get 5,500 × 4 = 22,000 requests per second.
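The arithmetic above can be written as a tiny back-of-the-envelope calculation (the 5,500 figure comes straight from the per-prefix GET/HEAD limit discussed earlier):

```python
# Back-of-the-envelope math: each prefix supports ~5,500 GET/HEAD
# requests per second, so aggregate read throughput scales linearly
# with the number of prefixes you spread reads across.

GET_RPS_PER_PREFIX = 5_500

def max_get_rps(num_prefixes: int) -> int:
    """Approximate aggregate GET requests/second across prefixes."""
    return GET_RPS_PER_PREFIX * num_prefixes

print(max_get_rps(3))  # 16500
print(max_get_rps(4))  # 22000
```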
So the fundamental idea is this: the more folders and subfolders (prefixes) you spread your objects across, the better the performance you can get out of your S3 bucket.
Limitations with KMS
Suppose we use the Key Management Service (KMS), Amazon’s encryption service, and have enabled SSE-KMS to encrypt and decrypt our objects in S3. In that case, we must remember that KMS has built-in limits. When we upload a file, S3 calls the GenerateDataKey operation of the KMS API behind the scenes; likewise, when we download a file, S3 calls the Decrypt operation of the KMS API. Essentially, these built-in limits are region-specific, but they are around 5,500, 10,000, or 30,000 requests per second, so upload and download throughput will depend on your KMS quota. And currently, we can’t even request a quota increase for KMS.
If you need performance and encryption simultaneously, you might want to consider using the native S3-managed encryption (SSE-S3) that’s built in, rather than KMS.
If you are troubleshooting slow S3 transfers while using KMS, it could simply be that you’re hitting the KMS limit, and that could be what’s making your uploads or downloads considerably slower.
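To make the SSE-KMS vs. SSE-S3 choice concrete, here is a minimal sketch of the encryption parameters an S3 PutObject request would carry under each scheme. The bucket and key names are hypothetical, and the function only builds the keyword arguments; with boto3 you would pass them to `s3_client.put_object(**kwargs)`:

```python
# Sketch: the ServerSideEncryption parameter an S3 PutObject call would
# carry under SSE-KMS vs. SSE-S3. Bucket/key names are hypothetical.

def put_object_kwargs(bucket: str, key: str, body: bytes,
                      use_kms: bool, kms_key_id: str = "") -> dict:
    """Build the parameter dict for an S3 PutObject request."""
    kwargs = {"Bucket": bucket, "Key": key, "Body": body}
    if use_kms:
        # SSE-KMS: every upload also triggers a KMS GenerateDataKey call,
        # which counts against the region's KMS request quota.
        kwargs["ServerSideEncryption"] = "aws:kms"
        if kms_key_id:
            kwargs["SSEKMSKeyId"] = kms_key_id
    else:
        # SSE-S3: S3-managed keys (AES256), no KMS request quota involved.
        kwargs["ServerSideEncryption"] = "AES256"
    return kwargs

print(put_object_kwargs("my-bucket", "FolderA/doc.pdf", b"...", use_kms=False))
```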
Would you like to learn the difference between AWS KMS and CloudHSM?
Optimize S3 Performance on Upload
To improve the S3 upload process, use multipart uploads. They are recommended for files over 100 MB and required for any file over 5 GB in size. Multipart uploads essentially let you parallelize your uploads, which improves your efficiency.
For a big file, you cut it into pieces and then upload those pieces simultaneously. These parallel uploads are what improve your throughput.
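Here is a minimal sketch of how a large object gets split into parts. Real multipart uploads go through the S3 CreateMultipartUpload/UploadPart/CompleteMultipartUpload APIs (boto3’s `upload_file` handles this automatically); the 100 MB part size below is a hypothetical choice:

```python
# Sketch: splitting a large object into parts for a multipart upload.
# Each part could then be uploaded in parallel via the S3 UploadPart API.

PART_SIZE = 100 * 1024 * 1024  # 100 MB, a hypothetical part size

def split_into_parts(total_size: int, part_size: int = PART_SIZE):
    """Return (part_number, start_byte, end_byte) tuples for each part."""
    parts = []
    start = 0
    part_number = 1
    while start < total_size:
        end = min(start + part_size, total_size) - 1
        parts.append((part_number, start, end))
        start = end + 1
        part_number += 1
    return parts

# A hypothetical 5 GB file splits into 52 parts of at most 100 MB each:
print(len(split_into_parts(5 * 1024**3)))  # 52
```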
So, how can you get better performance on uploads? Simple: use multipart uploads. And we have a very similar mechanism for downloads.
For downloads, the equivalent is called S3 byte-range fetches: parallel downloads of specific byte ranges of the same object, just as multipart uploads are parallel uploads. An added advantage is that if one download fails, the failure only affects that one specific byte range. Byte-range fetches can both accelerate your downloads and be used to download only a partial amount of a file.
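A byte-range fetch is just an ordinary GET request with an HTTP `Range` header. As a minimal sketch (object size and chunk size are hypothetical), here is how the headers for a set of parallel range requests could be computed:

```python
# Sketch: building HTTP Range headers for S3 byte-range fetches, so the
# same object can be downloaded in parallel chunks (or only partially).

def range_headers(object_size: int, chunk_size: int) -> list:
    """Return one 'bytes=start-end' Range header value per parallel GET."""
    headers = []
    for start in range(0, object_size, chunk_size):
        end = min(start + chunk_size, object_size) - 1
        headers.append(f"bytes={start}-{end}")
    return headers

print(range_headers(1000, 400))  # ['bytes=0-399', 'bytes=400-799', 'bytes=800-999']
```

Each header value would be sent with its own GET request (e.g. boto3’s `get_object(..., Range=...)`), and the responses reassembled in order.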
So, we have learned how to optimize S3 performance using prefixes, which are just the folders and subfolders within your S3 bucket. The more prefixes you spread your objects across, the better the performance you’ll get, because you can sustain a higher request rate by spreading your data and your reads across different prefixes.
Also, we have learned that if you use SSE-KMS to encrypt your objects in S3, there are built-in, region-specific limits: uploading and downloading data counts towards the KMS quota, and currently you can’t request a quota increase for KMS. And remember, when you’re uploading objects to S3, use multipart uploads to improve your performance.
So that is it for this article. If you have any questions, please let me know. Thank you.