# singer-tap-development
s
Hey everyone, happy midweek! Is there a way to specify the replication method from the `meltano.yml`? I have the replication method set to incremental in one of my taps, but I would love to trigger a full table replication once in a while
e
Not in `meltano.yml`, but the `run` command has a `--full-refresh` flag
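For example, a one-off full re-sync could look like this (a sketch: `target-postgres` is a placeholder loader name, not something from this thread):

```shell
# Ignore saved incremental state for this run, so every selected stream
# is re-synced in full. Plugin names below are illustrative.
meltano run --full-refresh tap-wrike target-postgres
```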
s
Would that get rid of historical values? I have a tap set to incremental, and I'm trying to remove historical values that have been deleted from the source
a
@Stéphane Burwash - Can you try the `metadata` option in this Stack Overflow answer?
This might work, but may be dependent upon specific tap behaviors:
```yaml
extractors:
- name: tap-postgres
  metadata:
    "stream_name":
      replication-method: FULL_TABLE
```
s
I'll try it, thanks @aaronsteers 😄
@aaronsteers for some reason this doesn't seem to be functioning - I'll need to investigate
Found this in my catalog (using `meltano invoke --dump=catalog tap-wrike--timelog-ids > state.json`)
"stream": "timelogs",
      "metadata": [
        {
          "breadcrumb": [],
          "metadata": {
            "table-key-properties": [
              "id"
            ],
            "forced-replication-method": "INCREMENTAL",
            "valid-replication-keys": [
              "updatedDate"
            ],
            "inclusion": "available",
            "selected": true,
            "replication-method": "FULL_TABLE"
          }
        },
What is the difference between `replication-method` and `forced-replication-method`?
a
Which tap are you using in this case?
s
tap-wrike -> It's a custom tap
a
👍 I found this:
_(message has been deleted)_
s
Even when I set the `forced-replication-method`, it forces incremental. Also, when I remove the replication key, the tap stops functioning, so it seems to be forcing incremental even with the metadata flag
a
So, unfortunately, it looks like the specific tap implementation is ignoring your preference here.
I was going to suggest overriding `forced-replication-method`, but it sounds like you tried that already
I can provide a little bit of background here: there are some streams that just can't be implemented with INCREMENTAL or FULL_TABLE replication.
s
Yeah sadly already tried it:
```yaml
- name: tap-wrike--timelog-ids
  metadata:
    timelogs:
      # set all streams to "full table" mode
      forced-replication-method: FULL_TABLE
      replication-method: FULL_TABLE
  inherit_from: tap-wrike
  select:
  - timelogs.id
```
This will crash because I don't specify the replication key
a
For instance, if the source API requires a start date.
What about syncing this/these stream(s) using a start date like 1900-01-01?
Looks like this is the API in question: https://developers.wrike.com/api/v4/timelogs/
I don't see why full table could not be supported, but perhaps it's just a lot of data
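The start-date workaround above could be sketched in `meltano.yml` as an inherited plugin definition (assuming the tap actually exposes a `start_date` setting; the plugin name and timestamp format here are illustrative):

```yaml
- name: tap-wrike--backfill
  inherit_from: tap-wrike
  config:
    # Assumes the tap supports start_date; pushing it far into the past
    # approximates a full-table sync for an incremental stream.
    start_date: '1900-01-01T00:00:00Z'
```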
s
It's very weird - it seems to force incremental every time
e
@Stéphane Burwash can you try using `replication_method`? (i.e. with an underscore)
a
Agreed, this is worth trying. 👆 I looked closer at their code and logged this issue: https://github.com/potloc/tap-wrike/issues/7 If you have time to try it out, you could fork the repo, remove the one line with `replication_method=replication_method,` and then try from the fork.
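To run a fork through Meltano, the extractor's `pip_url` can point at the fork's git repo (the URL below is a placeholder, not an actual fork from this thread):

```yaml
- name: tap-wrike
  # Placeholder URL; replace with your fork containing the patch,
  # then re-run `meltano install extractor tap-wrike`.
  pip_url: git+https://github.com/<your-username>/tap-wrike.git
```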
s
Thanks @aaronsteers, I'll make a PR and look into that. Just to confirm (because this would be much easier 😅), there is no way to pass the `--full-refresh` flag in a job?
Welp, I did what any sensible person would do - I created a second version of my tap built on the SDK 😅 It currently only has timelogs, but I'll build it up over time
a
OMG - Well, nice work!! 🚀
> there is no way to pass the `--full-refresh` flag in a job?
Unfortunately not, as far as I'm aware. We have an issue tracking this as a feature request, though. I can dig up the link if helpful.
s
Not at the moment thanks - my current solution will work. Could definitely be a cool feature though
a
> Could definitely be a cool feature though
Agreed. I imagine post-launch of Meltano Cloud GA (if not sooner), we're going to want this as a way for users to do full backfills on Cloud-hosted projects.