luke_rodgers
05/09/2023, 4:31 PM
storage_write_api), incremental replication on a ~260M row table
after running for about 12 hours, meltano starts using almost all available RAM on a 4G EC2 box, and basically grinds to a halt.
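(A rough sketch of one way to keep a run like this from silently eating the box: supervise `meltano run` from a small wrapper and stop it when resident memory approaches the limit, so a scheduler can retry instead of the job grinding on. The tap name, 3 GiB threshold, and poll interval below are assumptions, not details from this thread.)

```python
#!/usr/bin/env python3
"""Sketch: supervise `meltano run` and fail fast when memory gets tight.

Assumptions (not from the thread): tap name, 3 GiB limit on a 4 GiB box,
30 s poll interval. Requires the third-party psutil package.
"""
import subprocess
import sys
import time

import psutil

MEMORY_LIMIT_BYTES = 3 * 1024**3  # leave ~1 GiB headroom on a 4 GiB box
POLL_SECONDS = 30


def total_rss(root: psutil.Process) -> int:
    """RSS of the meltano process plus its tap/target children."""
    rss = root.memory_info().rss
    for child in root.children(recursive=True):
        try:
            rss += child.memory_info().rss
        except psutil.NoSuchProcess:
            pass  # child exited between listing and sampling
    return rss


def main() -> int:
    proc = subprocess.Popen(["meltano", "run", "tap-postgres", "target-bigquery"])
    root = psutil.Process(proc.pid)
    while proc.poll() is None:
        if total_rss(root) > MEMORY_LIMIT_BYTES:
            print("memory limit exceeded, terminating run", file=sys.stderr)
            proc.terminate()
            try:
                proc.wait(timeout=60)
            except subprocess.TimeoutExpired:
                proc.kill()
            return 1
        time.sleep(POLL_SECONDS)
    return proc.returncode


if __name__ == "__main__":
    sys.exit(main())
```

This doesn't address the underlying memory growth, but it turns a silent stall into a visible failure.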
when things are working smoothly, logs look like this:
2023-05-09T16:25:04.617198Z [info ] 2023-05-09 16:25:04,613 | INFO | target-bigquery | [a57699c1088f4a17a83f1812f70b389e] Sent 500 rows to projects/REDACTED/datasets/meltano_prod/tables/public_versions/streams/_default with offset 71500. cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-05-09T16:25:05.896054Z [info ] 2023-05-09 16:25:05,895 | INFO | target-bigquery | Target sink for 'public-versions' is full. Draining... cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-05-09T16:25:05.896655Z [info ] 2023-05-09 16:25:05,895 | INFO | target-bigquery | [a57699c1088f4a17a83f1812f70b389e] Sent 500 rows to projects/REDACTED/datasets/meltano_prod/tables/public_versions/streams/_default with offset 72000. cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-05-09T16:25:06.589807Z [info ] 2023-05-09 16:25:06,589 | INFO | target-bigquery | Target sink for 'public-versions' is full. Draining... cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-05-09T16:25:06.591187Z [info ] 2023-05-09 16:25:06,591 | INFO | target-bigquery | [a57699c1088f4a17a83f1812f70b389e] Sent 500 rows to projects/REDACTED/datasets/meltano_prod/tables/public_versions/streams/_default with offset 72500. cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-05-09T16:25:07.034480Z [info ] 2023-05-09 16:25:07,034 | INFO | target-bigquery | Target sink for 'public-versions' is full. Draining... cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-05-09T16:25:07.035043Z [info ] 2023-05-09 16:25:07,034 | INFO | target-bigquery | [a57699c1088f4a17a83f1812f70b389e] Sent 500 rows to projects/REDACTED/datasets/meltano_prod/tables/public_versions/streams/_default with offset 73000. cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-05-09T16:25:07.463604Z [info ] 2023-05-09 16:25:07,463 | INFO | target-bigquery | Target sink for 'public-versions' is full. Draining... cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-05-09T16:25:07.466982Z [info ] 2023-05-09 16:25:07,466 | INFO | target-bigquery | [a57699c1088f4a17a83f1812f70b389e] Sent 500 rows to projects/REDACTED/datasets/meltano_prod/tables/public_versions/streams/_default with offset 73500. cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-05-09T16:25:07.853789Z [info ] 2023-05-09 16:25:07,853 | INFO | target-bigquery | Target sink for 'public-versions' is full. Draining... cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
2023-05-09T16:25:07.854452Z [info ] 2023-05-09 16:25:07,854 | INFO | target-bigquery | [a57699c1088f4a17a83f1812f70b389e] Sent 500 rows to projects/REDACTED/datasets/meltano_prod/tables/public_versions/streams/_default with offset 74000. cmd_type=elb consumer=True name=target-bigquery producer=False stdio=stderr string_id=target-bigquery
but at a certain point, the logs start looking like this:
```
2023-05-09T15:09:21.273318Z [info ] 2023-05-09 15:09:21,272 | INFO | target-bigquery | Target sink …
```
visch
05/09/2023, 4:33 PM
pat_nadolny
05/09/2023, 4:45 PM
> i would expect meltano to eventually just crash or enter a recover process if it's been in this state for long enough, but it probably only does so if the target bubbles up an error or something
I think this is true: Meltano is running the tap/target and doesn't get any alert that the target is hanging unless it errors out, so I don't think Meltano would be able to do anything. Depending on what you figure out the cause is, it's possible the SDK could implement a way to detect this for target developers and raise an error instead of hanging.
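(To illustrate the idea pat_nadolny describes, a sketch only, not an existing Singer SDK or target-bigquery API: a target could run a small watchdog that tracks the last successful batch append and fails the process if the sink stops making progress, so Meltano sees an error instead of an indefinite hang.)

```python
"""Sketch of stall detection for a target; the class name, timeouts, and the
hard exit are illustrative assumptions, not an existing SDK feature."""
import os
import sys
import threading
import time


class DrainWatchdog:
    """Fail the process if the sink stops making progress for too long."""

    def __init__(self, stall_timeout: float = 600.0, poll_interval: float = 10.0) -> None:
        self._stall_timeout = stall_timeout
        self._poll_interval = poll_interval
        self._last_progress = time.monotonic()
        self._lock = threading.Lock()
        self._thread = threading.Thread(target=self._watch, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def progress(self) -> None:
        """Call after every successful batch append to reset the deadline."""
        with self._lock:
            self._last_progress = time.monotonic()

    def _watch(self) -> None:
        while True:
            time.sleep(self._poll_interval)
            with self._lock:
                stalled_for = time.monotonic() - self._last_progress
            if stalled_for > self._stall_timeout:
                print(
                    f"no rows appended for {stalled_for:.0f}s; exiting so the run "
                    "fails visibly instead of hanging",
                    file=sys.stderr,
                )
                # A plain exception can't interrupt a call stuck inside gRPC,
                # so exit the process and let the orchestrator handle the failure.
                os._exit(1)
```

The sink would call `progress()` each time a batch is acknowledged, e.g. right after the "Sent 500 rows" point in the logs above.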
thomas_briggs
05/09/2023, 5:11 PM
luke_rodgers
05/09/2023, 7:20 PM
thomas_briggs
05/09/2023, 7:36 PM
thomas_briggs
05/09/2023, 7:37 PM
.meltano/logs/elt/<pipeline_name> directory
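(For reference, a quick way to pull up the newest run log under that directory; the pipeline name below is a placeholder, run from the project root.)

```python
"""Print the newest log file under .meltano/logs/elt/<pipeline_name>."""
from pathlib import Path

pipeline_name = "<pipeline_name>"  # substitute the actual pipeline name
log_dir = Path(".meltano/logs/elt") / pipeline_name

# Each run writes its own log file; pick the most recently modified one.
newest = max(log_dir.rglob("*.log"), key=lambda p: p.stat().st_mtime, default=None)
print(newest or f"no logs found under {log_dir}")
```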
luke_rodgers
05/10/2023, 12:08 AM
luke_rodgers
05/10/2023, 1:51 AM