
S3 Data Load into Snowflake

This document describes how to load data from an S3 bucket into a Snowflake table.

Data Retrieval Process

  1. Select the warehouse that will provide the compute for the load

    • If the warehouse does not exist yet, create it first, either through your Snowflake account's web interface or with the command CREATE WAREHOUSE [WAREHOUSE NAME]

    • USE WAREHOUSE COMPUTE_WH

    [Screenshot: compute warehouse]
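    The warehouse step can be sketched as follows. COMPUTE_WH comes from the text above; the size and auto-suspend settings are assumptions chosen for illustration and should be adjusted to your account.

    ```sql
    -- Create the warehouse only if it is missing.
    -- WAREHOUSE_SIZE and AUTO_SUSPEND are illustrative assumptions.
    CREATE WAREHOUSE IF NOT EXISTS COMPUTE_WH
      WAREHOUSE_SIZE = 'XSMALL'
      AUTO_SUSPEND = 60          -- suspend after 60 seconds of inactivity
      AUTO_RESUME = TRUE
      INITIALLY_SUSPENDED = TRUE;

    -- Make it the active warehouse for this session.
    USE WAREHOUSE COMPUTE_WH;
    ```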

  2. Create the database schema if it does not already exist

    • A schema describes how data is organized in a relational database, like a blueprint. Here we mean the logical database schema: a named container inside a database that groups tables and other objects, so in this step we define the database and schema names.

    • Although a table's column layout is also called a schema, we use the CREATE SCHEMA command to set up the container in our database and the CREATE TABLE command to define the column layout

    • e.g. [DB name].[DB name]_raw_data

    [Screenshot: schema creation]
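    A minimal sketch of the schema step, following the [DB name].[DB name]_raw_data naming pattern above. The database name my_db is a hypothetical placeholder, not a name from this document.

    ```sql
    -- my_db is a placeholder; substitute your own database name.
    CREATE DATABASE IF NOT EXISTS my_db;

    -- IF NOT EXISTS makes the statement safe to re-run.
    CREATE SCHEMA IF NOT EXISTS my_db.my_db_raw_data;

    -- Make the schema the default for the rest of the session.
    USE SCHEMA my_db.my_db_raw_data;
    ```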

  3. Create a Snowflake table whose schema matches the Parquet file

    • Define each column with the same name and a compatible type as the corresponding column in the Parquet file

    [Screenshot: table creation]
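    As an illustration of the table step, the sketch below assumes a Parquet file with four columns. The table name events and every column name and type are hypothetical; replace them with the actual names and types from your file.

    ```sql
    -- Table and column definitions are assumptions for illustration only.
    -- They must mirror the names and types found in the Parquet file.
    CREATE TABLE IF NOT EXISTS my_db.my_db_raw_data.events (
      event_id   NUMBER(38, 0),
      event_name VARCHAR,
      event_ts   TIMESTAMP_NTZ,
      amount     DOUBLE
    );
    ```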

  4. Create a Snowflake stage to reference the S3 bucket

    • A Snowflake stage object is a named reference to a storage location, here the S3 bucket, from which data is loaded (stages can also be used for unloading data).

    • You will need your AWS access key credentials

    • Set the URL to the S3 bucket location you're retrieving from

    • Set CREDENTIALS to your AWS key ID and secret key

    [Screenshot: stage creation]
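    The stage step might look like the sketch below. The stage name, bucket path, and credential values are placeholders, not values from this document. Attaching a Parquet file format to the stage is an assumption made so that later loads do not need to repeat it.

    ```sql
    -- Stage name, URL, and credentials are placeholders; supply your own.
    CREATE STAGE IF NOT EXISTS my_db.my_db_raw_data.s3_parquet_stage
      URL = 's3://my-bucket/path/to/parquet/'
      CREDENTIALS = (
        AWS_KEY_ID = '<your AWS key ID>'
        AWS_SECRET_KEY = '<your AWS secret key>'
      )
      FILE_FORMAT = (TYPE = PARQUET);

    -- Optional check: list the files Snowflake can see through the stage.
    LIST @my_db.my_db_raw_data.s3_parquet_stage;
    ```

    Avoid committing real keys to scripts or version control. Snowflake also supports storage integrations as an alternative to inline credentials.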

  5. Copy the data from the created stage into the Snowflake table

    [Screenshot: activation console]
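    A minimal sketch of the copy step, using hypothetical table and stage names (events and s3_parquet_stage). It assumes the table's column names match those in the Parquet file, so MATCH_BY_COLUMN_NAME can map each Parquet column to its table column.

    ```sql
    -- Table and stage names are placeholders for illustration.
    COPY INTO my_db.my_db_raw_data.events
      FROM @my_db.my_db_raw_data.s3_parquet_stage
      FILE_FORMAT = (TYPE = PARQUET)
      MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;

    -- Quick sanity check that rows arrived.
    SELECT COUNT(*) FROM my_db.my_db_raw_data.events;
    ```

    Without MATCH_BY_COLUMN_NAME, Snowflake reads each Parquet row as a single semi-structured value, so loading into a multi-column table would need an explicit SELECT transformation in the COPY statement instead.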