1. Source Configuration¶
You can find an overview of this section here
1.1. Introduction¶
This may seem optional, but we start our model implementation by specifying our data source configuratins in a .yml
file, coinciding with our model directory.
This is great for long-term reference & configurations.
1.1.1. Why should I Care?¶
1.1.1.1. My thoughts¶
Well, there are cases where you have a reference to table "CuteDB"."NiceSchema"."Customers"
in several locations of your Data Model. However, what happense when you migrate to a new databse due to poor naming conventions from "CuteDB"."NiceSchema"."Customers"
to "AwesomeDB"."NiceSchema"."Customers"
.
You can do a mass replacement in your repository, depending on your Code editor. Or, you can make the changes all in one data reference Before starting our model,
1.1.1.2. Quoting dbt Docs:¶
Sources make it possible to name and describe the data loaded into your warehouse by your Extract and Load tools.
By declaring these tables as sources in dbt, you can then:
Use
{{ source() }}
function, helping define the lineage of your dataTest your assumptions about your source data
Calculate the freshness of your source data
1.2. Using Sources¶
We configure our yml file for specifying source table references, in our upcoming modeling. This includes:
Snowflake specific table references
aliases
Note: We can add more to this file, like built-in test & documentation functionality, but we’ll cover that in a later topic
Now, let’s see how we can use Sources to specify our source path configurations
Here is a snippet of what I implemented. Everything in comments will be in #’s
Here’s a snippet of what we implement for our source configuration
version: 2
sources:
- name: customers # Customer alias for our table references in {{ source() }}
database: SNOWFLAKE_SAMPLE_DATA
schema: TPCDS_SF100TCL
loaded_at_field: _loaded_at
tables:
- name: CUSTOMER
- name: CUSTOMER_DEMOGRAPHICS
- name: CATALOG_SALES
1.3. Tips¶
1.3.1. Errors¶
Why do I receive an error:
The selector 'snowflake_demo_catalog_sales' does not match any nodes and will be ignored
?Ans: This is because either our profiles.yml file does not have the right configuration, or our models have not been explicitly mentiones
Try out a prefix of
dbt --warn-error run
in your dbt run (Available in 0.17+)
1.4. Other Questions¶
Why is my schema like
dev_{name}_tablename
?Recall our created profiles.yml file. We specified our default target to be
dev
. In thedev
configuration, we specify our table & view creation under a prefixdev_{name}
E.g. I had
schema: dev_raulingaverage
How does dbt know to find my tablename in the sources file?
Ans: We have our alias
customer
equivalent to{DATABASE}.{SCHEMA}
How can it access such table reference?
Ans: In this tutorial, I am running on
SYSADMIN
in Snowflake. However, in a production environment, you may have to coordinate with your Snowflake admin to provide permissions to your particular table reference :/