# singer-tap-development
s
Hey everyone, happy midweek! Is there a way to specify the replication method from the `meltano.yml`? I have the replication method set to incremental in one of my taps, but I would love to trigger a full table replication once in a while
e
Not in `meltano.yml`, but the `run` command has a `--full-refresh` flag
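For example, a one-off full re-sync could look like this (a sketch: `target-postgres` is a placeholder loader name, not something from this thread):

```shell
# Ignore saved incremental state for this run, so every selected stream
# is re-synced in full. Plugin names below are illustrative.
meltano run --full-refresh tap-wrike target-postgres
```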
s
Would that get rid of historical values? I have a tap set to incremental, and I'm trying to remove historical values that have been deleted from the source
a
@Stéphane Burwash - Can you try the `metadata` option in this Stack Overflow answer?
This might work, but may be dependent upon specific tap behaviors:
```yaml
extractors:
- name: tap-postgres
  metadata:
    "stream_name":
      replication-method: FULL_TABLE
```
s
I'll try it, thanks @aaronsteers 😄
@aaronsteers for some reason this doesn't seem to be functioning - I'll need to investigate
Found this in my catalog (using `meltano invoke --dump=catalog tap-wrike--timelog-ids > state.json`)
"stream": "timelogs",
      "metadata": [
        {
          "breadcrumb": [],
          "metadata": {
            "table-key-properties": [
              "id"
            ],
            "forced-replication-method": "INCREMENTAL",
            "valid-replication-keys": [
              "updatedDate"
            ],
            "inclusion": "available",
            "selected": true,
            "replication-method": "FULL_TABLE"
          }
        },
What is the difference between `replication-method` and `forced-replication-method`?
a
Which tap are you using in this case?
s
tap-wrike -> It's a custom tap
a
👍 I found this:
_(message has been deleted)_
s
Even when I set the `forced-replication-method`, it forces incremental. Also, when I remove the replication key, the tap stops functioning, so it seems to be forcing incremental even with the metadata flag
a
So, unfortunately, it looks like the specific tap implementation is ignoring your preference here.
I was going to suggest overriding `forced-replication-method`, but it sounds like you tried that already
I can provide a little bit of background here: there are some streams that just can't be implemented with INCREMENTAL or FULL_TABLE replication.
s
Yeah sadly already tried it:
```yaml
- name: tap-wrike--timelog-ids
  metadata:
    timelogs:
      # set all streams to "full table" mode
      forced-replication-method: FULL_TABLE
      replication-method: FULL_TABLE
  inherit_from: tap-wrike
  select:
  - timelogs.id
```
This will crash because I don't specify the replication key
a
For instance, if the source API requires a start date.
What about syncing this/these stream(s) using a start date like 1900-01-01?
Looks like this is the API in question: https://developers.wrike.com/api/v4/timelogs/
I don't see why full table could not be supported, but perhaps it's just a lot of data
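The start-date workaround above could be sketched in `meltano.yml` as an inherited plugin definition (assuming the tap actually exposes a `start_date` setting; the plugin name and timestamp format here are illustrative):

```yaml
- name: tap-wrike--backfill
  inherit_from: tap-wrike
  config:
    # Assumes the tap supports start_date; pushing it far into the past
    # approximates a full-table sync for an incremental stream.
    start_date: '1900-01-01T00:00:00Z'
```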
s
It's very weird - it seems to force incremental every time
e
@Stéphane Burwash can you try using `replication_method`? (i.e. with an underscore)
a
Agreed, this is worth trying. 👆 I looked closer at their code and logged this issue: https://github.com/potloc/tap-wrike/issues/7 If you have time to try it out, you could fork the repo, remove the one line with `replication_method=replication_method,` and then try from the fork.
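To run a fork through Meltano, the extractor's `pip_url` can point at the fork's git repo (the URL below is a placeholder, not an actual fork from this thread):

```yaml
- name: tap-wrike
  # Placeholder URL; replace with your fork containing the patch,
  # then re-run `meltano install extractor tap-wrike`.
  pip_url: git+https://github.com/<your-username>/tap-wrike.git
```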
s
Thanks @aaronsteers, I'll make a PR and look into that. Just to confirm (because this would be much easier 😅), there is no way to pass the `--full-refresh` flag in a job?
Welp, I did what any sensible person would do - I created a second version of my tap built on the SDK 😅 It currently only has timelogs, but I'll build it up over time
a
OMG - Well, nice work!! 🚀
> there is no way to pass the `--full-refresh` flag in a job?
Unfortunately not, as far as I'm aware. We have an issue tracking this as a feature request, though. I can dig up the link if helpful.
s
Not at the moment thanks - my current solution will work. Could definitely be a cool feature though
a
> Could definitely be a cool feature though
Agreed. I imagine post-launch of Meltano Cloud GA (if not sooner), we're going to want this as a way for users to do full backfills on Cloud-hosted projects.