thanat_varathon
07/20/2023, 3:54 PM
tap-oracle and target-oracle (both default variant)
Here’s the meltano.yml:
version: 1
default_environment: dev
send_anonymous_usage_stats: false
environments:
- name: dev
- name: prod
plugins:
  extractors:
  - name: tap-oracle-erp
    inherit_from: tap-oracle
    variant: s7clarke10
    pip_url: git+https://github.com/s7clarke10/pipelinewise-tap-oracle.git
    config:
      filter_schemas: MY_SCHEMA
      filter_tables:
      - TABLE_1
      - TABLE_2
      - .......
      - TABLE_80
      default_replication_method: INCREMENTAL
    metadata:
      '*':
        replication-key: UPDATED
  loaders:
  - name: target-oracle-warehouse
    inherit_from: target-oracle
    variant: radbrt
    pip_url: git+https://github.com/radbrt/target-oracle.git
When I run meltano run tap-oracle-erp target-oracle-warehouse on the server, I notice that the tap starts pulling data from the source tables, but the target does not immediately insert it into the target tables. For example, here’s the log line that shows the tap pulling data from a source table:
2023-07-20T10:58:02.873132Z [info ] time=2023-07-20 10:58:02 name=singer level=INFO message=select SELECT <rest of query>" cmd_type=elb consumer=False name=tap-oracle-erp producer=True stdio=stderr string_id=tap-oracle-erp
but the log showing the target inserting data for the same table into the sink appears about 30 minutes later (in between, the syncs of other tables run normally):
2023-07-20T11:27:40.090421Z [info ] 2023-07-20 11:27:40,090 Creating temp table c_doctype cmd_type=elb consumer=True name=target-oracle-warehouse producer=False stdio=stderr string_id=target-oracle-warehouse
2023-07-20T11:27:40.329562Z [info ] 2023-07-20 11:27:40,328 Inserting with SQL: INSERT INTO c_doctype_temp cmd_type=elb consumer=True name=target-oracle-warehouse producer=False stdio=stderr string_id=target-oracle-warehouse
2023-07-20T11:27:40.329888Z [info ] (<column_names) cmd_type=elb consumer=True name=target-oracle-warehouse producer=False stdio=stderr string_id=target-oracle-warehouse
2023-07-20T11:27:40.330042Z [info ] VALUES (<column_names>) cmd_type=elb consumer=True name=target-oracle-warehouse producer=False stdio=stderr string_id=target-oracle-warehouse
I would like to ask how Meltano works under the hood when the config has many tables to sync (80 in this case). Does Meltano try to finish syncing the tables one by one, sequentially? Or does it pull the data and store it somewhere first, and then gradually push it into the target? Or does this behavior depend on each tap/target combination? (I just wonder why the tap activity and the target activity for the same table show up 30 minutes apart.)