juan_luis_cano_rodriguez
07/22/2022, 2:33 PM
the Meltano convention is to name the model directory after the extractor using snake_case (i.e. tap_gitlab). for an ELT (E -> L -> T), wouldn't it be more appropriate to name it after the *L*oader (target)? after all,
"Once your raw data has arrived in your data warehouse, its schema will likely need to be transformed to be more appropriate for analysis."
(emphasis mine, from https://docs.meltano.com/getting-started#transform-loaded-data-for-analysis)
juan_luis_cano_rodriguez
07/22/2022, 2:36 PM
visch
07/22/2022, 2:59 PM
juan_luis_cano_rodriguez
07/22/2022, 3:00 PM
visch
07/22/2022, 3:00 PM
juan_luis_cano_rodriguez
07/22/2022, 3:15 PM
"you have N data sources, how do you put those into your target? You need to separate them somehow"
hmm but that's the EL part, right? IIUC, the T part came after putting all the sources into the target
visch
07/22/2022, 3:15 PM
juan_luis_cano_rodriguez
07/22/2022, 3:16 PM
alexander_butler
07/22/2022, 3:17 PM
alexander_butler
07/22/2022, 3:18 PM
visch
07/22/2022, 3:18 PM
meltano models using dbt, can you just show what you mean via code?
juan_luis_cano_rodriguez
07/22/2022, 3:19 PM
juan_luis_cano_rodriguez
07/22/2022, 3:20 PM
"For example, the /transform/models/tap_gitlab/source.yml below configures dbt sources from the postgres tables where our tap-gitlab EL job output to"
I... need to re-read that a few more times
visch
07/22/2022, 3:20 PM
juan_luis_cano_rodriguez
07/22/2022, 3:22 PM
visch
07/22/2022, 3:22 PM
juan_luis_cano_rodriguez
07/22/2022, 3:23 PM
visch
07/22/2022, 3:23 PM
juan_luis_cano_rodriguez
07/22/2022, 3:24 PM
mkdir /transform/models/tap_gitlab/
juan_luis_cano_rodriguez
07/22/2022, 3:24 PM
tap_gitlab and not, say, target_bigquery. that's the root of the question
visch
07/22/2022, 3:25 PM
cat /transform/models/tap_gitlab/*
juan_luis_cano_rodriguez
07/22/2022, 3:29 PM
source.yml with

config-version: 2
version: 2
sources:
  - name: tap_gitlab
    schema: public
    tables:
      - name: commits
      - name: tags

and a commits_last_7d.sql with

{{
  config(
    materialized='table'
  )
}}
select *
from {{ source('tap_gitlab', 'commits') }}
where created_at::date >= current_date - interval '7 days'

but this is what I don't get. I thought that the transformation phase happened solely inside the warehouse/target. but the naming convention and the source.yml seem to indicate that dbt is reading the data from the source.
juan_luis_cano_rodriguez
07/22/2022, 3:30 PM
visch
07/22/2022, 3:30 PM
profiles.yml that points to your DW.
juan_luis_cano_rodriguez
07/22/2022, 3:30 PM
visch
07/22/2022, 3:30 PM

config-version: 2
version: 2
sources:
  - name: tap_gitlab
    schema: public
    tables:
      - name: commits
      - name: tags

Is a table in your DW
visch
07/22/2022, 3:31 PM
juan_luis_cano_rodriguez
07/22/2022, 3:31 PM
tap_gitlab?
visch
07/22/2022, 3:31 PM
visch
07/22/2022, 3:31 PM
alexander_butler
07/22/2022, 3:31 PM
juan_luis_cano_rodriguez
07/22/2022, 3:32 PM
visch
07/22/2022, 3:32 PM
juan_luis_cano_rodriguez
07/22/2022, 3:32 PM
juan_luis_cano_rodriguez
07/22/2022, 3:32 PM
visch
07/22/2022, 3:39 PM
.meltano/transformers/dbt
./compiled has the generated sql as well if you want to peruse it and trust it some more. Also running dbt in debug works
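
To make that last point concrete, here is a rough sketch of what the compiled commits_last_7d.sql might look like. It assumes a Postgres target whose database is called warehouse. That name is hypothetical; the real one comes from whatever profiles.yml points to. The exact quoting can also differ by adapter and project settings. The source name tap_gitlab does not appear in the output at all. It is only a label, and source('tap_gitlab', 'commits') resolves to the schema and table declared in source.yml, both of which live in the warehouse.

```sql
-- Sketch of the compiled model, not copied from a real run.
-- "warehouse" is an assumed database name taken from the dbt profile;
-- "public" and "commits" come from the schema and table entries in source.yml.
select *
from "warehouse"."public"."commits"
where created_at::date >= current_date - interval '7 days'
```

So the directory name tap_gitlab only records which extractor produced the raw tables. The query itself runs entirely inside the warehouse.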