# getting-started
mykola_zavada
Hello everyone, I'm using tap-postgres with target-snowflake and I'm curious about the singer_sdk metric record_count, as it doesn't show exactly the number of records that were inserted. In this example the iteration delivered exactly 250,000 records. What does the singer_sdk record_count metric mean?
singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 249964, "tags": {"stream": "public-table_name", "context": {}}}
name=target_snowflake message=Flush triggered by batch_size_rows (250000) reached in public-table_name
name=target_snowflake message=Uploading 250000 rows to stage
name=target_snowflake message=Loading 250000 rows into "table_name"
name=target_snowflake message=Loading into PROD_POSTGRES_DB_SCD."table_name": {"inserts": 250000, "updates": 0, "size_bytes": 10546932}
Edgar
Hi @mykola_zavada! Is that the only record_count metric message you see?
mykola_zavada
Hi Edgar! It can be one or two when the target's batch_size_rows is set to 250,000:
singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 196808, "tags": {"stream": "public-decision_tree_patient_request", "context": {}}}
singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 209960, "tags": {"stream": "public-decision_tree_patient_request", "context": {}}}
message=Flush triggered by batch_size_rows (250000) reached in public-decision_tree_patient_request
message=Uploading 250000 rows to stage
message=Loading 250000 rows into 'PROD_POSTGRES_DB_SCD."DECISION_TREE_PATIENT_REQUEST"'
message=Loading into PROD_POSTGRES_DB_SCD."DECISION_TREE_PATIENT_REQUEST": {"inserts": 250000, "updates": 0, "size_bytes": 34397997}
message=Emitting state {"bookmarks": {"public-decision_tree_patient_request": {"replication_key_signpost": "2023-10-19T13:55:53.645203+00:00", "starting_replication_value": null, "

Incremental state has been updated at 2023-10-19 15:01:42.811357.
singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 79978, "tags": {"stream": "public-decision_tree_patient_request", "context": {}}}
singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 207160, "tags": {"stream": "public-decision_tree_patient_request", "context": {}}}
message=Flush triggered by batch_size_rows (250000) reached in public-decision_tree_patient_request
message=Uploading 250000 rows to stage
message=Loading 250000 rows into 'PROD_POSTGRES_DB_SCD."DECISION_TREE_PATIENT_REQUEST"'
message=Loading into PROD_POSTGRES_DB_SCD."DECISION_TREE_PATIENT_REQUEST": {"inserts": 250000, "updates": 0, "size_bytes": 34433573}
message=Emitting state {"bookmarks": {"public-decision_tree_patient_request": {"replication_key_signpost": "2023-10-19T13:55:53.645203+00:00", "starting_replication_value": null, "

Incremental state has been updated at 2023-10-19 15:03:29.706612.
singer_sdk.metrics   | METRIC: {"type": "counter", "metric": "record_count", "value": 91771, "tags": {"stream": "public-decision_tree_patient_request", "context": {}}}
message=Flush triggered by batch_size_rows (250000) reached in public-decision_tree_patient_request
message=Uploading 250000 rows to stage
message=Loading 250000 rows into 'PROD_POSTGRES_DB_SCD."DECISION_TREE_PATIENT_REQUEST"'
message=Loading into PROD_POSTGRES_DB_SCD."DECISION_TREE_PATIENT_REQUEST": {"inserts": 250000, "updates": 0, "size_bytes": 34350380}
message=Emitting state {"bookmarks": {"public-decision_tree_patient_request": {"replication_key_signpost": "2023-10-19T13:55:53.645203+00:00", "starting_replication_value": null, "
Incremental state has been updated at 2023-10-19 15:05:14.840612.
I use tap-postgres by meltanolabs and target-snowflake by transferwise
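For context, a minimal meltano.yml sketch of that setup, assuming the transferwise (pipelinewise) variant of target-snowflake; only the batch-size setting seen in the logs is shown, everything else (connection details, etc.) is omitted:

plugins:
  loaders:
    - name: target-snowflake
      variant: transferwise
      config:
        # Flush a batch to Snowflake once this many records are buffered
        # per stream; matches "Flush triggered by batch_size_rows (250000)".
        batch_size_rows: 250000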
Edgar
Ok, that makes more sense. The metric reflects the rate at which records are read from the source, while the target always inserts batches of the same (configurable) size. You can think of the tap metrics as a time series with values emitted roughly every minute; the rate at which the tap reads records and the rate at which the target processes 250k-record batches may differ.
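A self-contained Python sketch of that interaction (not actual singer_sdk or target-snowflake code; the ~60s metric interval and the per-second read rates are assumptions): the tap-side counter flushes on a timer, while the target-side buffer flushes every batch_size_rows records, so the two numbers rarely coincide.

# Illustrative sketch only, not singer_sdk internals. Assumed numbers:
# a ~60s metric interval and made-up per-second read rates.
METRIC_INTERVAL_S = 60      # tap emits record_count roughly once a minute
BATCH_SIZE_ROWS = 250_000   # target flushes fixed-size batches

def simulate(rates_per_second):
    since_last_metric = buffered = clock = 0
    for rate in rates_per_second:       # one entry per simulated second
        clock += 1
        since_last_metric += rate
        buffered += rate
        if clock % METRIC_INTERVAL_S == 0:
            # The metric value is "records read since the last metric",
            # not "records in the last inserted batch".
            print(f"METRIC record_count value={since_last_metric}")
            since_last_metric = 0
        while buffered >= BATCH_SIZE_ROWS:
            print(f"Flush triggered by batch_size_rows ({BATCH_SIZE_ROWS})")
            buffered -= BATCH_SIZE_ROWS

# Read speed varies over time, so metric values won't equal 250,000.
# (Real logs interleave two processes, so ordering can differ too.)
simulate([3300] * 60 + [3500] * 60 + [1500] * 60)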
mykola_zavada
Ok, so the tap first read 196,808 + 209,960 = 406,768 records, then the target inserted 250k, leaving 406,768 - 250,000 = 156,768; then it read more, 156,768 + 79,978 + 207,160 = 443,906, and the target inserted another 250k ... that makes sense! Thank you!
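The same tally in a few lines of Python, using the record_count values from the logs above:

# Sum of the record_count metrics vs. the three 250k flushes in the logs.
reads = [196_808, 209_960, 79_978, 207_160, 91_771]
total = sum(reads)                       # 785,677 records read so far
flushed = (total // 250_000) * 250_000   # three full batches = 750,000
print(total - flushed)                   # 35,677 records still buffered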