Hello! I'm trying to do Change Data Capture on a ...
# troubleshooting
x
Hello! I'm trying to do Change Data Capture on a Postgres database. I'm using Pipelinewise tap-postgres with log-based replication and target-jsonl:
meltano run tap-postgres target-jsonl
I'm using S3 as the state backend and then using the AWS CLI to upload the JSONL output into an S3 bucket. This all works fine initially, but I run into problems when updating the `select` pattern for tables in tap-postgres. After the run completes, no entries are added to state.json for the new tables, so every run ends up doing a full replication of those tables. The state.json itself is getting updated: its update timestamp changes, and the log sequence number for the old tables advances. The logs also do not indicate problems:
2022-11-29T22:01:41.099318Z [info     ] Reading state from AWS S3
2022-11-29T22:01:41.559899Z [info     ] smart_open.s3.MultipartWriter('XXXXX', 'meltano_state/prod:tap-postgres-to-target-jsonl/lock'): uploading part_num: 1, 17 bytes (total 0.000GB)

....

2022-11-29T22:02:03.196648Z [info     ] Writing state to AWS S3
2022-11-29T22:03:42.028608Z [info     ] smart_open.s3.MultipartWriter('XXXX', 'meltano_state/prod:tap-postgres-to-target-jsonl/lock'): uploading part_num: 1, 17 bytes (total 0.000GB)
2022-11-29T22:03:42.243697Z [info     ] smart_open.s3.MultipartWriter('XXXX', 'meltano_state/prod:tap-postgres-to-target-jsonl/state.json'): uploading part_num: 1, 774 bytes (total 0.000GB)
2022-11-29T22:03:42.382744Z [info     ] Incremental state has been updated at 2022-11-29 22:03:42.382636.
I know the docs recommend a Postgres backend, but I would rather not set up a Postgres DB just for this. The S3 persistence should work since I'm just running from one source!
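For anyone wanting to confirm the same symptom, here is a minimal sketch of how the downloaded state.json could be checked for missing bookmarks. It assumes the file follows the Singer convention of a `bookmarks` mapping keyed by stream name, possibly nested under other keys; the exact nesting, the stream names, and the helper names below are illustrative assumptions, not taken from the actual state file.

```python
import json


def find_singer_state(doc):
    """Return the first nested dict that has a dict-valued 'bookmarks' key, or None."""
    if isinstance(doc, dict):
        if isinstance(doc.get("bookmarks"), dict):
            return doc
        for value in doc.values():
            found = find_singer_state(value)
            if found is not None:
                return found
    return None


def streams_missing_bookmarks(doc, expected_streams):
    """List the expected streams that have no bookmark entry in the state document."""
    state = find_singer_state(doc) or {}
    bookmarks = state.get("bookmarks", {})
    return sorted(s for s in expected_streams if s not in bookmarks)


if __name__ == "__main__":
    # Hypothetical state document; a real one would be loaded with json.load().
    example = json.loads(
        '{"completed": {"singer_state": {"bookmarks": {"public-orders": {"lsn": 123}}}}}'
    )
    # Hypothetical stream names: one old table and one newly selected table.
    print(streams_missing_bookmarks(example, ["public-orders", "public-customers"]))
```

Any stream reported here after a successful run would be one that gets fully replicated again on the next run.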
OK I think I tracked down the bug but it's still not fixed: https://meltano.slack.com/archives/C01TCRBBJD7/p1626380302112600
c
@xiaozhou_wang thanks for looking into this, it does look like this is an issue in the tap, another user had this issue when running it as a standalone singer tap. I see you've added to the discussion in the issue there to try to push a solution forward. Thanks for that! If you continue to have issues after the fix is merged in the tap, let me know and we can file a bug report to look into it further.
x
Once the change is merged, how does it flow back into Meltano? I noticed, for instance, that PyPI has pipelinewise-tap-postgres at 2.0.0, but when I run `meltano install` the latest version I get is 1.8.4. If this fix goes into tap-postgres, will it be unavailable in Meltano?