# best-practices
v
Hi all, looking for any suggestions on my case: I have multiple dbt projects installed into Meltano's main dbt project via packages.yml. In packages.yml, which dbt project gets installed depends on which tap is running, so I use an environment variable there. For example:
git: "{{ env_var('DBT_REPO') }}".
But two of my dbt projects have the same model name, so if these two schedules run at the same time I get this error:
dbt found two resources with the name xxx. Since these resources have the same name,
dbt will be unable to find the correct resource
t
That’s not really a Meltano issue and more of a dbt one. You can try using an alias for a given model, but I’d recommend having unique model names across projects. https://docs.getdbt.com/reference/resource-configs/alias
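A minimal sketch of what that could look like in the dbt_project.yml of one of the packages (project and model names here are placeholders):
models:
  project_a:
    shared_model:
      +alias: project_a_shared_model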
v
Thank you. I also tried using an alias, and another error occurred:
dbt found two resources with the database representation "xxx_yyy_dataset.Table_DDD".
  dbt cannot create two resources with identical database representations
It's very easy to end up with the same model names since I have many dbt projects. Could Meltano create an independent dbt project instance for every run to avoid this?
t
How would you expect this to look in the project? Currently we expect a Meltano project to have a single dbt project.
d
@vh, do your dbt projects all share the same models, but you just need to apply them with different source schemas? We did something similar to this, and defined variables in our dbt project to hold the schema name that should be used for a given dbt run. We then pass in those variables in a custom BashOperator task in our Airflow DAGs, rather than using Meltano's built-in transform operation.
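A rough sketch of that pattern, with placeholder names; the var is then overridden per run, e.g. with dbt run --vars '{source_schema: raw_tap_1}' from the BashOperator:
# dbt_project.yml
vars:
  source_schema: raw_default
# models/sources.yml
sources:
  - name: tap_source
    schema: "{{ var('source_schema') }}"
    tables:
      - name: example_table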
v
@dustin_miller In my case, I have the same model names, but they belong to different dbt projects and are not shared across projects. When two or more schedules run at the same time, Meltano's main dbt project installs both of these dbt projects, which leads to the error above.
@taylor Is it possible that, for each run, Meltano clones the dbt project with the variables/configuration fully filled in for that run?
Currently, I can work around this by setting the schedules to run at different times.
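For example, only offsetting the cron expressions in meltano.yml (values are just illustrative, other schedule keys omitted):
schedules:
- name: transformer-1
  interval: '0 */2 * * *'   # on the hour
- name: transformer-2
  interval: '30 */2 * * *'  # 30 minutes later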
t
@vh I don’t quite understand what’s happening here. Are you installing multiple dbt packages or transforms?
d
I meant: are the models themselves the same between the projects? In general, for multitenant sorts of tasks, I would consider writing your own DAG to extend what Meltano generates, or just keeping the orchestration separate.
v
@taylor No, there is only one transform and one dbt package in packages.yml. The package repository is determined when the transform runs, via an environment variable. For example:
schedules:
- name: transformer-1
  extractor: tap-1
  loader: target-1
  transform: only
  interval: '0 */1 * * *'
- name: transformer-2
  extractor: tap-2
  loader: target-2
  transform: only
  interval: '0 */1 * * *'
And in packages.yml:
packages:
- git: "{{ env_var('TARGET_DBT_REPO') }}"
@dustin_miller Thank you, you're right. I'm just looking for a quick way to do it from Meltano.