Introduction

Pandas is a Python library used mainly for data analysis and data manipulation, and it is a common tool for preparing data in machine learning work.

In this tutorial, we will look at how you can read a file stored in AWS S3 directly into a Pandas DataFrame.

Pandas

Pandas can read and write remote files directly on S3. To do this, it relies on the s3fs Python package, which provides a Pythonic file interface to S3.

You can install s3fs using pip:

pip install s3fs
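
If you are unsure whether s3fs ended up in the same environment that Pandas runs in, a quick check along the following lines can help. This is only a minimal sketch: the helper name s3fs_available is made up for illustration and is not part of Pandas or s3fs.

```python
import importlib.util


def s3fs_available() -> bool:
    """Return True if the s3fs package can be imported in this environment."""
    # find_spec returns None when no top-level module of that name is found
    return importlib.util.find_spec("s3fs") is not None


if not s3fs_available():
    print("s3fs is not installed; run: pip install s3fs")
```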

Downloading a file from S3

A file stored in S3 can be read directly into a DataFrame by passing its s3:// URL to read_csv:

import pandas as pd
# Read the CSV object straight from S3 into a DataFrame (requires s3fs)
df = pd.read_csv("s3://your-bucket/data.csv")
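
In practice you often need to control how credentials are handled, and you may want to write results back to S3 as well. The sketch below wraps both directions in small helpers. The helper names (s3_url, read_csv_from_s3, write_csv_to_s3) and the bucket and key values are hypothetical; the storage_options argument assumes Pandas 1.2 or newer, and its contents are passed on to s3fs.

```python
import pandas as pd


def s3_url(bucket: str, key: str) -> str:
    """Build an s3:// URL from a bucket name and an object key."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def read_csv_from_s3(bucket: str, key: str, anon: bool = False, **read_csv_kwargs) -> pd.DataFrame:
    """Read a CSV object from S3 into a DataFrame.

    anon=True skips credentials and only works for publicly readable buckets.
    With anon=False, s3fs looks up credentials in the usual AWS places
    (environment variables, ~/.aws/credentials, or an attached IAM role).
    """
    return pd.read_csv(
        s3_url(bucket, key),
        storage_options={"anon": anon},  # forwarded to s3fs
        **read_csv_kwargs,
    )


def write_csv_to_s3(df: pd.DataFrame, bucket: str, key: str) -> None:
    """Write a DataFrame to S3 as CSV, without the index column."""
    df.to_csv(s3_url(bucket, key), index=False)
```

With valid credentials and a real bucket, a call such as df = read_csv_from_s3("your-bucket", "data.csv") should behave like the direct read_csv call above, and write_csv_to_s3(df, "your-bucket", "copy.csv") would upload the result.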