Verify multiple S3 files exist from list in Excel

Very new to Postman and APIs in general. I have a list of files in Excel I would like to confirm exist in an S3 bucket before sending to the client (who will be downloading them from S3). Unfortunately, the list comes from a different application and the S3 bucket is a mirror. End-users of the source application have to “push” the files to the mirror (so, yes, technically not a mirror) and we’ve identified some files where that hasn’t happened properly… which is the reason I need to validate this list. I’ve spent hours looking for a solution but no luck. Is there a way to do this? Thanks!

Hey @mwright92 !

So from what I’m understanding, you’re trying to use Postman and the AWS S3 API to check if files exist in a bucket.

  1. Set up the AWS SDK: Make sure you have the AWS SDK installed and its authentication set up on your machine.
  2. Get your bucket name: Once you are authenticated, you should be able to access your S3 bucket.
  3. Use the AWS S3 API with Postman: You can use the S3 API ‘ListObjectsV2’ operation (the list-type=2 form of ‘ListObjects’), which returns some or all (up to 1000 per request) of the objects in a bucket. You do not need to make any changes to the application.
  4. Create a GET request in Postman: The format of the GET request would look something like this: https://s3.amazonaws.com/your-bucket-name/?list-type=2&prefix=file-name. Replace your-bucket-name and file-name with your bucket’s name and the file name you want to check respectively.
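  5. Parse the response: If the response contains a key that exactly matches the file name, then the file exists in the bucket. If not, then the file does not exist. Keep in mind that prefix matches every key starting with that string, so a partial match is not enough.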
  6. Automate the process: If you have a long list of files, you can create a script to automate the process. The script would iterate through the file names, send a request to the S3 API, and check the response.
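
Steps 4 and 5 could be sketched in Python with only the standard library. This is a rough illustration rather than a definitive implementation: the helper names build_list_url and key_in_listing are made up for this example, the bucket and file names are placeholders, and the request itself would still need to be signed (for example with Postman's AWS Signature authorization) before a private bucket accepts it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_list_url(bucket, file_name):
    """Build the path-style ListObjectsV2 URL from step 4.

    `bucket` and `file_name` are placeholders supplied by the caller.
    """
    query = urlencode({"list-type": "2", "prefix": file_name})
    return f"https://s3.amazonaws.com/{bucket}/?{query}"


def key_in_listing(xml_text, file_name):
    """Return True only if a <Key> in the response equals file_name exactly.

    `prefix` matches every key that starts with the string, so
    "report.pdf" would also return "report.pdf.bak".
    """
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # Drop the XML namespace, if any, and compare the local tag name.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "Key" and element.text == file_name:
            return True
    return False
```

The exact-match check is the design choice worth noting here: testing only whether the response mentions the file name somewhere would report a hit for longer keys that share the same prefix.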

Remember to secure your credentials and not share them with anyone or push them to a public repository.

I hope this helps! Let me know if you have any further questions.

Hey Kevin!

Thanks for the reply! Your understanding is spot-on.

Unfortunately, I left out one critical piece of information from my original post. I’ve been able to do everything through step 5 already in Postman. I can individually verify a file exists. The problem is I have just over 200,000 files I need to confirm in S3. So, the scripting to automate the process is the part I’m stuck on.

Can that be done through Postman? If so, are there any help docs on how to do that? Thanks!

One solution would be to convert your Excel file into a CSV or JSON file to use as a data file in Postman’s Collection Runner. You can set up a test script in the “Tests” tab of your request to verify if each file exists, and then run the collection. The Collection Runner will iterate over each entry in the data file, essentially automating the process of checking each file’s existence in your S3 bucket. After the run you can see which files exist and which do not.
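
As a rough sketch of the data-file step, the list could be written out as a CSV with one header row, assuming the file names have already been exported from Excel into a plain Python list. The column name fileName and the helper name write_data_file are hypothetical; any header works as long as the request uses the same name.

```python
import csv
import io


def write_data_file(file_names, out):
    """Write one file name per row under a single header column.

    `out` is any text file-like object, e.g. open("files.csv", "w", newline="").
    """
    writer = csv.writer(out)
    writer.writerow(["fileName"])  # hypothetical column name
    for name in file_names:
        writer.writerow([name])


# Example: build the CSV in memory to inspect it before saving.
buffer = io.StringIO()
write_data_file(["invoice-001.pdf", "scan, page 2.pdf"], buffer)
csv_text = buffer.getvalue()
```

With a header like this, the request in Postman would reference the current row's value as {{fileName}}, for instance in the prefix query parameter.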

That said, this approach sends one request per entry in your data file, which, with 200k files, is not very practical or performant.

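It may be more efficient to use the AWS SDK directly from a language such as Python or Node.js. Since a single list request can return up to 1000 keys, listing the bucket and comparing locally takes far fewer requests than checking each file on its own.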
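
A minimal sketch of the SDK route in Python, under a few assumptions: boto3 is installed, AWS credentials are already configured on the machine, and the names in the list match the S3 object keys exactly. The function names here are hypothetical. Instead of one request per file, it pages through the bucket listing and compares against the list locally.

```python
def list_all_keys(s3_client, bucket, prefix=""):
    """Collect every object key under `prefix`, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = set()
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page that holds no objects.
        for obj in page.get("Contents", []):
            keys.add(obj["Key"])
    return keys


def find_missing(expected_names, existing_keys):
    """Return the expected names that are not present in the bucket."""
    return [name for name in expected_names if name not in existing_keys]


def check_bucket(bucket, expected_names):
    """Convenience wrapper; assumes boto3 and credentials are set up."""
    import boto3  # third-party AWS SDK for Python

    client = boto3.client("s3")
    return find_missing(expected_names, list_all_keys(client, bucket))
```

Calling check_bucket("your-bucket-name", names_from_excel) would return the names that were never pushed. If the bucket holds far more objects than the list, passing a prefix to list_all_keys narrows what gets listed.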

Thanks, Kevin! I will play around with that option and see how long it takes to process a batch of files. The good news is this is a one-time effort, not recurring, so even if it takes a while, it would still be worth it. That said, I can’t imagine a separate request for 200k individual files will be quick or efficient. Appreciate the help!
