silverbullet1
01/14/2023, 8:25 PM
meltano.yml file. I am planning to orchestrate this via our existing Airflow installation (via KPO) by just running the meltano elt tap-xyz target-abc command incrementally. I am using S3 as a state backend.
My solution is working fine except for the state management. Every time the pipeline starts, it does a complete fresh import even though the state files are being written to S3. I see this at the end of the logs:
00954--tap-lever--target-redshift stdio=stderr
2023-01-14T20:11:06.023202Z [info ] Writing state to AWS S3
2023-01-14T20:11:07.232593Z [info ] smart_open.s3.MultipartWriter('mybucketxyz', 'lever-data-state/2023-01-14T200954--tap-lever--target-redshift/lock'): uploading part_num: 1, 17 bytes (total 0.000GB)
2023-01-14T20:11:07.480713Z [info ] smart_open.s3.MultipartWriter('mybucketxyz', 'lever-data-state/2023-01-14T200954--tap-lever--target-redshift/state.json'): uploading part_num: 1, 141 bytes (total 0.000GB)
2023-01-14T20:11:07.744721Z [info ] Incremental state has been updated at 2023-01-14 20:11:07.744534.
2023-01-14T20:11:07.758494Z [info ] Extract & load complete! name=meltano run_id=c62c772d-5297-4a7f-ba20-46cf1965b638 state_id=2023-01-14T200954--tap-lever--target-redshift
2023-01-14T20:11:07.759291Z [info ] Transformation skipped.
Next run:
2023-01-14T20:11:39.625689Z [info ] Reading state from AWS S3
2023-01-14T20:11:41.101692Z [info ] smart_open.s3.MultipartWriter('mybucketxyz', 'lever-data-state/2023-01-14T201131--tap-lever--target-redshift/lock'): uploading part_num: 1, 17 bytes (total 0.000GB)
2023-01-14T20:11:41.313667Z [info ] No state found for 2023-01-14T201131--tap-lever--target-redshift.
2023-01-14T20:11:41.369482Z [warning ] No state was found, complete import.
Do I need an external Meltano DB to implement the solution? From what I understood from the docs, the DB is optional if we rely on cloud storage for state management.
kamal_singh_naruka
01/16/2023, 12:13 PM
meltano elt tap-xyz target-abc --state-id <some-id>
silverbullet1
01/16/2023, 12:16 PM
Sven Balnojan
01/17/2023, 8:16 AM
edgar_ramirez_mondragon
01/18/2023, 4:16 PM
It should pick the state automatically from S3, right?
meltano run would auto-generate the state key and pull the right payload from S3, but meltano elt requires you to explicitly set the --state-id option.
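The logs above show a different timestamp-based state ID on each meltano elt run (2023-01-14T200954--... and then 2023-01-14T201131--...), so the second run looks under a key that has no state yet. Below is a minimal Python sketch of building the command with a fixed --state-id, for example to pass as the arguments of an Airflow task. The helper name build_elt_command and the default ID format are hypothetical and not part of Meltano; any string that stays the same between runs should serve.

```python
import shlex
from typing import List, Optional


def build_elt_command(tap: str, target: str, state_id: Optional[str] = None) -> List[str]:
    """Return the argv for `meltano elt` with an explicit, stable --state-id."""
    # Assumption: derive the ID from the plugin names so it is identical on
    # every run. The "<tap>-to-<target>" format is only an illustration.
    if state_id is None:
        state_id = f"{tap}-to-{target}"
    return ["meltano", "elt", tap, target, "--state-id", state_id]


if __name__ == "__main__":
    # Print the shell command that the scheduler would execute.
    print(shlex.join(build_elt_command("tap-lever", "target-redshift")))
```

Because the ID no longer contains the run timestamp, each run reads and writes the same state key in the S3 backend.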