# Using an AWS S3 Store as a Source to Feed an MSK Topic

This article shows how to load data from AWS S3 into DeltaStream, where you can use it to enrich other DeltaStream objects before writing the results to any other supported data store.

### Before you begin

* You must already have an [Amazon Web Services](https://aws.amazon.com/) account

### Creating the stream

1. Create a stream from the S3 file.

```sql
CREATE STREAM bronze_taxi_json (
    "VendorID"             BIGINT,
    tpep_pickup_datetime    BIGINT,   -- epoch-seconds
    tpep_dropoff_datetime   BIGINT,
    passenger_count         DOUBLE,
    trip_distance           DOUBLE,
    "RatecodeID"            DOUBLE,
    store_and_fwd_flag      STRING,
    "PULocationID"          BIGINT,
    "DOLocationID"          BIGINT,
    payment_type            BIGINT,
    fare_amount             DOUBLE,
    extra                   DOUBLE,
    mta_tax                 DOUBLE,
    tip_amount              DOUBLE,
    tolls_amount            DOUBLE,
    improvement_surcharge   DOUBLE,
    total_amount            DOUBLE,
    congestion_surcharge    DOUBLE,
    airport_fee             DOUBLE)
WITH (
    'store' = 'yellow-taxi-s3',
    'timestamp' = 'tpep_pickup_datetime',
    'value.format' = 'JSONL',
    's3.uri' = 's3://s3-demo-bucket/yellow-taxi');
```

{% hint style="success" %}
**Tip** You do not need to create a new stream for each S3 file. Instead, point the URI at a folder: the stream reads all existing files in that folder, then waits for new files and reads them as they arrive. This is the default behavior.
{% endhint %}

From this point forward, you can treat this stream as you would any other stream in DeltaStream.
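As a minimal sketch of what that can look like, the query below reads a few columns from `bronze_taxi_json` and filters on trip distance. The column selection and the `10.0` threshold are illustrative assumptions rather than part of the original walkthrough; adapt them to your own data.

```sql
-- Illustrative only: the 10.0 threshold is an arbitrary example value.
SELECT
    "VendorID",
    tpep_pickup_datetime,
    trip_distance,
    total_amount
FROM bronze_taxi_json
WHERE trip_distance > 10.0;
```

Column names created with double quotes, such as `"VendorID"`, keep the same quoting here as in the `CREATE STREAM` statement.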

{% hint style="warning" %}
**Important** The default query memory size is 2 GB. If the files in your S3 bucket are larger, you may experience memory errors. To avoid this, increase the query memory size so the system can read the files from your bucket. To do this, append the following clause to your query:

`WITH (`\
`'query.memory.size' = '3Gi'`\
`);`
{% endhint %}

