Hello all, Having some trouble with a dbt transfo...
# troubleshooting
d
Hello all, Having some trouble with a dbt transform. Long story short: I had a syntax error in my model (incorrect column name), and dbt failed as you would expect. I have since fixed the column name in my model repo, however the dbt transform seems to be still using the old model. Where are the the dbt packages downloaded, and how can I force it to refresh? I have tried
meltano invoke dbt deps
but still no worky. My log:
Copy code
bt                   | Running with dbt=0.16.1
dbt                   | Found 4 models, 0 tests, 0 snapshots, 0 analyses, 127 macros, 0 operations, 0 seed files, 0 sources
dbt                   |
dbt                   | 07:04:05 | Concurrency: 2 threads (target='postgres')
dbt                   | 07:04:05 |
dbt                   | 07:04:05 | 1 of 4 START table model analytics.site_metrics...................... [RUN]
dbt                   | 07:04:05 | 2 of 4 START table model analytics.sites............................. [RUN]
dbt                   | 07:04:06 | 1 of 4 OK created table model analytics.site_metrics................. [SELECT 106 in 0.22s]
dbt                   | 07:04:06 | 3 of 4 START table model analytics.project_metrics................... [RUN]
dbt                   | 07:04:06 | 2 of 4 OK created table model analytics.sites........................ [SELECT 65 in 0.22s]
dbt                   | 07:04:06 | 4 of 4 START table model analytics.projects.......................... [RUN]
dbt                   | 07:04:06 | 4 of 4 ERROR creating table model analytics.projects................. [ERROR in 0.07s]
dbt                   | 07:04:06 | 3 of 4 OK created table model analytics.project_metrics.............. [SELECT 27 in 0.09s]
dbt                   | 07:04:06 |
dbt                   | 07:04:06 | Finished running 4 table models in 0.53s.
dbt                   |
dbt                   | Completed with 1 error and 0 warnings:
dbt                   |
dbt                   | Database Error in model projects (models/base/projects.sql)
dbt                   |   column "project_group_id" does not exist
dbt                   |   LINE 22:         project_group_id
dbt                   |                    ^
dbt                   |   compiled SQL at target/run/tap_s3_csv/base/projects.sql
"project_group_id" should actually be
project_id
If I look in
<meltano-project>/transform/target/compiled/<my_tap>/base/projects.sql
it appears to tbe hte latest version:
Copy code
WITH sites AS (
    SELECT * FROM "postgres"."analytics"."sites"

),

projects AS (
    SELECT
        project_id,
        ...
Any ideas?
p
do you also see the latest version in
<meltano-project>/transform/target/run/<my_tap>/base/projects.sql
?
d
@prratek_ramchandani Yes, the latest version is in that path
d
@david_tout How about under
.meltano/transformers/dbt/target/run/...
?
If you run
meltano invoke dbt clean
ahead of
meltano invoke dbt deps
, does that fix it?
d
Hi @douwe_maan Under that path I only have a virtual env:
Copy code
ssm-user@ip-10-47-105-208:~/dev/meltano-prototype/s3-to-psql$ cd .meltano/transformers/dbt
ssm-user@ip-10-47-105-208:~/dev/meltano-prototype/s3-to-psql/.meltano/transformers/dbt$ ls
venv
d
OK, that's good then ๐Ÿ™‚
d
What am I supposed to provide for
dbt clean
?
Copy code
.venv) ssm-user@ip-10-47-105-208:~/dev/meltano-prototype/s3-to-psql$ meltano invoke dbt clean
Running with dbt=0.16.1
Encountered an error:
Compilation Error
  Env var required but not provided: 'DBT_TARGET'
(.venv) ssm-user@ip-10-47-105-208:~/dev/meltano-prototype/s3-to-psql$ meltano invoke dbt clean --target tap_s3_csv
Running with dbt=0.16.1
Encountered an error while reading profiles:
  ERROR Runtime Error
  The profile 'meltano' does not have a target named 'tap_s3_csv'. The valid target names for this profile are:
   - snowflake
   - postgres
   - bigquery
Defined profiles:
 - meltano

For more information on configuring profiles, please consult the dbt docs:

<https://docs.getdbt.com/docs/configure-your-profile>

Encountered an error:
Runtime Error
  Could not run dbt
(.venv) ssm-user@ip-10-47-105-208:~/dev/meltano-prototype/s3-to-psql$ meltano invoke dbt clean --target postgres
Running with dbt=0.16.1
Encountered an error:
Compilation Error
  Env var required but not provided: 'PG_ADDRESS'
That almost sounds like it's going to clean out the target database?
d
Hmm
https://docs.getdbt.com/reference/commands/clean
dbt clean
is a utility function that deletes all folders specified in the
clean-targets
list specified in
dbt_project.yml
. This is useful for deleting the
dbt_modules
and
target
directories.
d
I might have something misconfigured...
d
So it's not as scary as it sounds, but it does apparently need all the connection information
d
In this case, we can hose the DB ๐Ÿ™‚ it's a personal sandbox
Can we tell meltano to provide that info from
meltano.yml
, or maybe I am better off doing a bunch of
export PG_ADDRESS
manually...
d
The reason the config is missing is because we don't have https://gitlab.com/meltano/meltano/-/issues/2546 yet
d
exports
it is
d
When dbt is run as part of
meltano elt
, it takes the configuration context from the loader, but
meltano invoke
doesn't know how to do that yet
d
No worries ๐Ÿ™‚
d
Note that
dbt clean
and
dbt deps
are both run as part of
meltano elt ... --transform=run
d
I will export the env vars and try again
d
Idea! Run
meltano elt tap target --transform=only
That will skip the tap and target but still inject the target config into dbt ๐Ÿ™‚
d
I've just do it via `export`ing each param manually, and that worked (well it invoked)
I'll try
dbt deps
now, and if that fails can retry with
--transform=only
OK I think I have spotted a possible problem
Copy code
(.venv) ssm-user@ip-10-47-105-208:~/dev/meltano-prototype/s3-to-psql$ meltano invoke dbt clean --target postgres
Running with dbt=0.16.1
Checking compile_targets/*
 Cleaned compile_targets/*
Checking dbt_modules/*
 Cleaned dbt_modules/*
Checking logs/*
 Cleaned logs/*
Finished cleaning all paths.
Note
compile_targets
in there? Originally i was being too clever for my own good and messing around with paths. I've just seen that my
dbt_project.yml
file in my meltano project has:
Copy code
clean-targets:
- compile_targets
- dbt_modules
- logs

target-path: target
modules-path: dbt_modules
log-path: logs
D'oh
d
๐Ÿ˜„
d
Ok interestingly we stil have the same problem. I fixed that path issue
Copy code
(.venv) ssm-user@ip-10-47-105-208:~/dev/meltano-prototype/s3-to-psql$ meltano elt tap-s3-csv target-postgres --transform=only
meltano | Extract & load skipped.
meltano | Running transformation...
dbt     | Running with dbt=0.16.1
dbt     | Checking target/*
dbt     |  Cleaned target/*
dbt     | Checking dbt_modules/*
dbt     |  Cleaned dbt_modules/*
dbt     | Checking logs/*
dbt     |  Cleaned logs/*
dbt     | Finished cleaning all paths.
dbt     | Running with dbt=0.16.1
dbt     | WARNING: The git package "<mailto:git@gitlab.com|git@gitlab.com>:nghyway/data/dbt-tap-s3-csv-prototype.git" is not pinned.
dbt     |       This can introduce breaking changes into your project without warning!
dbt     |
dbt     | See <https://docs.getdbt.com/docs/package-management#section-specifying-package-versions>
dbt     | WARNING: The git package "<mailto:git@gitlab.com|git@gitlab.com>:nghyway/data/dbt-tap-s3-csv-prototype.git" is not pinned.
dbt     |       This can introduce breaking changes into your project without warning!
dbt     |
dbt     | See <https://docs.getdbt.com/docs/package-management#section-specifying-package-versions>
dbt     | Installing <mailto:git@gitlab.com|git@gitlab.com>:nghyway/data/dbt-tap-s3-csv-prototype.git@master
dbt     |   Installed from revision master
dbt     |
dbt     | Running with dbt=0.16.1
dbt     | Found 4 models, 0 tests, 0 snapshots, 0 analyses, 127 macros, 0 operations, 0 seed files, 0 sources
dbt     |
dbt     | 23:31:44 | Concurrency: 2 threads (target='postgres')
dbt     | 23:31:44 |
dbt     | 23:31:44 | 1 of 4 START table model analytics.site_metrics...................... [RUN]
dbt     | 23:31:44 | 2 of 4 START table model analytics.sites............................. [RUN]
dbt     | 23:31:44 | 2 of 4 OK created table model analytics.sites........................ [SELECT 65 in 0.13s]
dbt     | 23:31:44 | 1 of 4 OK created table model analytics.site_metrics................. [SELECT 107 in 0.13s]
dbt     | 23:31:44 | 3 of 4 START table model analytics.projects.......................... [RUN]
dbt     | 23:31:44 | 4 of 4 START table model analytics.project_metrics................... [RUN]
dbt     | 23:31:44 | 3 of 4 ERROR creating table model analytics.projects................. [ERROR in 0.04s]
dbt     | 23:31:44 | 4 of 4 OK created table model analytics.project_metrics.............. [SELECT 27 in 0.07s]
dbt     | 23:31:44 |
dbt     | 23:31:44 | Finished running 4 table models in 0.40s.
dbt     |
dbt     | Completed with 1 error and 0 warnings:
dbt     |
dbt     | Database Error in model projects (models/base/projects.sql)
dbt     |   column "project_group_id" does not exist
dbt     |   LINE 22:         project_group_id
dbt     |                    ^
dbt     |   compiled SQL at target/run/tap_s3_csv/base/projects.sql
dbt     |
dbt     | Done. PASS=3 WARN=0 ERROR=1 SKIP=0 TOTAL=4
meltano | Transformation failed (1): Done. PASS=3 WARN=0 ERROR=1 SKIP=0 TOTAL=4
meltano | ELT could not be completed: `dbt run` failed
ELT could not be completed: `dbt run` failed
Tried
dbt deps
as well. Same issue:
Copy code
(.venv) ssm-user@ip-10-47-105-208:~/dev/meltano-prototype/s3-to-psql$ meltano invoke dbt deps --target postgres
Running with dbt=0.16.1
WARNING: The git package "<mailto:git@gitlab.com|git@gitlab.com>:nghyway/data/dbt-tap-s3-csv-prototype.git" is not pinned.
        This can introduce breaking changes into your project without warning!

See <https://docs.getdbt.com/docs/package-management#section-specifying-package-versions>
WARNING: The git package "<mailto:git@gitlab.com|git@gitlab.com>:nghyway/data/dbt-tap-s3-csv-prototype.git" is not pinned.
        This can introduce breaking changes into your project without warning!

See <https://docs.getdbt.com/docs/package-management#section-specifying-package-versions>
Installing <mailto:git@gitlab.com|git@gitlab.com>:nghyway/data/dbt-tap-s3-csv-prototype.git@master
  Installed from revision master
d
Can you do a directory-wide search inside your Meltano project for
project_group_id
? It may be under some other hidden
.
directory
(I'm off for the day, I'll try to help again tomorrow!)
d
sorry, call came in. Thanks for the help!
946am this morning here in Sydney, Aus, so will be bashing away for a while yet ๐Ÿ™‚
d
@aaronsteers will still be around for another hour I think so he may be able to help as well ๐Ÿ™‚
But right now the challenge is to find where that stale file is coming from
Can you revert
clean-targets
back to what it was in https://gitlab.com/meltano/files-dbt/-/blob/master/bundle/transform/dbt_project.yml#L20? Maybe that
../.meltano
bit is important
(And now I'm off, good luck!)
d
will do, thanks again
a
Just tagging in here. Iโ€™m caught up on this thread. Sounds like we have at or around
target/run/tap_s3_csv/base/projects.sql
that has a syntax or code error? Agreed with Douwe, making sure the clean-targets is set and working correctly makes sense. And you should also be able to go in and manually delete any offending files if the clean doesnโ€™t seem to work due to relative paths or similar issue. Do I understand correctly this is a sql transform you wrote and have since updated or deleted, or is this one of the out-of-box transforms?
d
The former. It's a really simple query, but I made a mistake and had an incorrect column name in it. I've since fixed it, but it's still quering the old name
@aaronsteers I've set the
dbt_project.yml
file back to the
../.meltano...
path, and re-run thr transform:
Copy code
(.venv) ssm-user@ip-10-47-105-208:~/dev/meltano-prototype/s3-to-psql$ meltano elt tap-s3-csv target-postgres --transform=only
meltano | Extract & load skipped.
meltano | Running transformation...
dbt     | Running with dbt=0.16.1
dbt     | Checking ../.meltano/transformers/dbt/target/*
dbt     | ERROR: not cleaning ../.meltano/transformers/dbt/target/* because it is protected
dbt     | Checking dbt_modules/*
dbt     |  Cleaned dbt_modules/*
dbt     | Checking logs/*
dbt     |  Cleaned logs/*
dbt     | Finished cleaning all paths.
dbt     | Running with dbt=0.16.1
dbt     | WARNING: The git package "<mailto:git@gitlab.com|git@gitlab.com>:nghyway/data/dbt-tap-s3-csv-prototype.git" is not pinned.
dbt     |       This can introduce breaking changes into your project without warning!
dbt     |
dbt     | See <https://docs.getdbt.com/docs/package-management#section-specifying-package-versions>
dbt     | WARNING: The git package "<mailto:git@gitlab.com|git@gitlab.com>:nghyway/data/dbt-tap-s3-csv-prototype.git" is not pinned.
dbt     |       This can introduce breaking changes into your project without warning!
dbt     |
dbt     | See <https://docs.getdbt.com/docs/package-management#section-specifying-package-versions>
dbt     | Installing <mailto:git@gitlab.com|git@gitlab.com>:nghyway/data/dbt-tap-s3-csv-prototype.git@master
dbt     |   Installed from revision master
dbt     |
dbt     | Running with dbt=0.16.1
dbt     | Found 4 models, 0 tests, 0 snapshots, 0 analyses, 127 macros, 0 operations, 0 seed files, 0 sources
dbt     |
dbt     | 23:58:14 | Concurrency: 2 threads (target='postgres')
dbt     | 23:58:14 |
dbt     | 23:58:14 | 1 of 4 START table model analytics.site_metrics...................... [RUN]
dbt     | 23:58:14 | 2 of 4 START table model analytics.sites............................. [RUN]
dbt     | 23:58:14 | 1 of 4 OK created table model analytics.site_metrics................. [SELECT 107 in 0.12s]
dbt     | 23:58:14 | 2 of 4 OK created table model analytics.sites........................ [SELECT 65 in 0.12s]
dbt     | 23:58:14 | 3 of 4 START table model analytics.project_metrics................... [RUN]
dbt     | 23:58:14 | 4 of 4 START table model analytics.projects.......................... [RUN]
dbt     | 23:58:15 | 4 of 4 ERROR creating table model analytics.projects................. [ERROR in 0.04s]
dbt     | 23:58:15 | 3 of 4 OK created table model analytics.project_metrics.............. [SELECT 27 in 0.08s]
dbt     | 23:58:15 |
dbt     | 23:58:15 | Finished running 4 table models in 0.40s.
dbt     |
dbt     | Completed with 1 error and 0 warnings:
dbt     |
dbt     | Database Error in model projects (models/base/projects.sql)
dbt     |   column "project_group_id" does not exist
dbt     |   LINE 22:         project_group_id
dbt     |                    ^
dbt     |   compiled SQL at ../.meltano/transformers/dbt/target/run/tap_s3_csv/base/projects.sql
dbt     |
dbt     | Done. PASS=3 WARN=0 ERROR=1 SKIP=0 TOTAL=4
meltano | Transformation failed (1): Done. PASS=3 WARN=0 ERROR=1 SKIP=0 TOTAL=4
meltano | ELT could not be completed: `dbt run` failed
ELT could not be completed: `dbt run` failed
Looks like we're getting a permission error:
ERROR: not cleaning ../.meltano/transformers/dbt/target/* because it is protected
However, the files in there are updated and correct...
Here are the files:
Copy code
.venv) ssm-user@ip-10-47-105-208:~/dev/meltano-prototype/s3-to-psql$ cat .meltano/transformers/dbt/target/compiled/tap_s3_csv/base/projects.sql
WITH sites AS (
    SELECT * FROM "postgres"."analytics"."sites"

),

projects AS (
    SELECT
        project_id,
        sum(contract_sum)       AS contract_sum,
        avg(latitude)           AS latitude,
        avg(longitude)          AS longitude,
        sum(opportunity_sum)    AS opportunity_sum,
        max(updated_at)         AS updated_at
    FROM
        sites
    GROUP BY
        project_group_id

)

SELECT * FROM projects
(.venv) ssm-user@ip-10-47-105-208:~/dev/meltano-prototype/s3-to-psql$ cat .meltano/transformers/dbt/target/compiled/tap_s3_csv/basels /projects.sql                    run/tap_s3_csv/base/projects.sql  source ../.venv/bin/activate^C
(.venv) ssm-user@ip-10-47-105-208:~/dev/meltano-prototype/s3-to-psql$ cat .meltano/transformers/dbt/target/run/tap_s3_csv/base/projects.sql


  create  table "postgres"."analytics"."projects__dbt_tmp"
  as (
    WITH sites AS (
    SELECT * FROM "postgres"."analytics"."sites"

),

projects AS (
    SELECT
        project_id,
        sum(contract_sum)       AS contract_sum,
        avg(latitude)           AS latitude,
        avg(longitude)          AS longitude,
        sum(opportunity_sum)    AS opportunity_sum,
        max(updated_at)         AS updated_at
    FROM
        sites
    GROUP BY
        project_group_id

)

SELECT * FROM projects
a
If I understand correctly that the above files are correct/updated, can you confirm if thereโ€™s a
project_group_id
column in the table
projects
?
Sorry - I mean from โ€˜sitesโ€™
d
was at standup...
postgres.analytics.sites
has:
Copy code
site_id                | project_id |
--------------------------------------+------------+
0461ea55-2fe5-4417-8558-5429a9f59976 |     102553 |
the
sites.sql
file in the model has:
Copy code
WITH source AS (
    SELECT * FROM {{var('schema')}}.projects

),

sites AS (
    SELECT
        __sdc_primary_key   AS site_id,
        projectgroupid      AS project_id,
        contract_sum,
        latitude,
        longitude,
        opportunity_sum,
        postcode            AS post_code,
        updated_at,
        zoom_level
    FROM
        source

)

SELECT * FROM sites
The mistake is that originally I was going to call it
project_group_id
but renamed to just
project_id
to be simpler, and now it's stuck on the original name... ๐Ÿ˜ž
@aaronsteers @douwe_maan Both of you have been immensely helpful, and I've learned a lot. I think I found the problem. Syntax error in a group by clause (still had the old name) and the entire time I was mis-reading where it was saying the problem was. Sorry for all the trouble!
Copy code
SELECT
        project_id,
        sum(contract_sum)       AS contract_sum,
        avg(latitude)           AS latitude,
        avg(longitude)          AS longitude,
        sum(opportunity_sum)    AS opportunity_sum,
        max(updated_at)         AS updated_at
    FROM
        sites
    GROUP BY
        project_group_id
Oopsie!
Copy code
dbt     | 00:37:29 | Concurrency: 2 threads (target='postgres')
dbt     | 00:37:29 |
dbt     | 00:37:29 | 1 of 4 START table model analytics.site_metrics...................... [RUN]
dbt     | 00:37:29 | 2 of 4 START table model analytics.sites............................. [RUN]
dbt     | 00:37:30 | 1 of 4 OK created table model analytics.site_metrics................. [SELECT 107 in 0.15s]
dbt     | 00:37:30 | 3 of 4 START table model analytics.project_metrics................... [RUN]
dbt     | 00:37:30 | 2 of 4 OK created table model analytics.sites........................ [SELECT 65 in 0.16s]
dbt     | 00:37:30 | 4 of 4 START table model analytics.projects.......................... [RUN]
dbt     | 00:37:30 | 4 of 4 OK created table model analytics.projects..................... [SELECT 4 in 0.06s]
dbt     | 00:37:30 | 3 of 4 OK created table model analytics.project_metrics.............. [SELECT 27 in 0.08s]
dbt     | 00:37:30 |
dbt     | 00:37:30 | Finished running 4 table models in 0.41s.
dbt     |
dbt     | Completed successfully
Just confirming for all that it works now ๐Ÿ™‚
d
@david_tout Glad we figured it out in the end!