I am using `tap-spreadsheets-anywhere` to extract ...
# troubleshooting
n
I am using `tap-spreadsheets-anywhere` to extract S3 inventory reports from an S3 bucket and am trying to fine-tune the replication method. I have set the replication in the meltano.yml as follows:
metadata:
      inventory_reporting:
        replication-method: INCREMENTAL
        replication-key: last_modified_date
However, it seems that because the S3 inventory reports are written daily and have their own file-level timestamp, the tap defaults to the file's modified date rather than the column-level modified date. My desired outcome is to load only the records from the latest report that have a last_modified_date later than the last ELT run, but right now it's loading all rows from the latest report whenever the report's modified date is later than the last ELT run. Has anyone run into this? Am I missing a setting to override the file-level date and use the more granular row-level modified date for replication?
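To make the desired behavior concrete, here is a minimal Python sketch of row-level filtering against a bookmark. It is not something the tap does today, and it is not the tap's API: the helper names (`parse_ts`, `filter_new_rows`), the sample data, the `key` column, and the ISO 8601 timestamp format are all assumptions for illustration. Only the `last_modified_date` column name comes from the config above.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical sample report; the real inventory columns may differ.
SAMPLE_REPORT = """key,last_modified_date
a.txt,2024-01-01T00:00:00Z
b.txt,2024-01-03T00:00:00Z
c.txt,2024-01-05T00:00:00Z
"""


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def filter_new_rows(rows, bookmark, key="last_modified_date"):
    """Return the rows newer than `bookmark` plus the advanced bookmark.

    `bookmark` stands in for the timestamp of the last ELT run.
    """
    new_rows = [row for row in rows if parse_ts(row[key]) > bookmark]
    # Advance the bookmark to the newest row seen; keep it if nothing is new.
    new_bookmark = max((parse_ts(row[key]) for row in new_rows), default=bookmark)
    return new_rows, new_bookmark


if __name__ == "__main__":
    rows = list(csv.DictReader(io.StringIO(SAMPLE_REPORT)))
    last_run = datetime(2024, 1, 2, tzinfo=timezone.utc)
    fresh, next_bookmark = filter_new_rows(rows, last_run)
    for row in fresh:
        print(row["key"], row["last_modified_date"])
    print("next bookmark:", next_bookmark.isoformat())
```

A filter like this could run as a post-processing step on the loaded data while the tap continues to select files by their file-level modified date.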
e
Unfortunately, I don't think it's supported by the tap. But do open an issue in their repo. Folks over there may have workarounds or implementation ideas. https://github.com/ets/tap-spreadsheets-anywhere/issues
n
Thanks so much @edgar_ramirez_mondragon. I submitted an issue, but I'm noticing there has not been a ton of activity on the repo in the last few months. Any insight into the project?
e
In the case of that tap's repo, I know inactivity doesn't mean it's not being used, since it's one of the most popular plugins. It probably just means it's working fine for folks as is, so a new feature like this should probably be picked up by someone with the use case.