Have starting using `tap-ga4` this week, but notic...
# plugins-general
a
I started using `tap-ga4` this week, but I'm noticing some errors occurring. The error seems very intermittent though: one run fails, another run later succeeds. My scheduled runs at midnight seem to fail 90% of the time. The error seems to be related to the `devicemodel` field in the `devices` stream, and a `(not set)` text value from the API somehow being coerced to `null`, which, as the `devicemodel` field is a PK field, postgres rightly complains about. Stack trace in the thread. Seeing as this field seems to always contain `(not set)` for me, I would probably fork the tap and drop it from the schema if I am the only one with the issue.
2023-09-01 07:55:16 +0000 - dagster - INFO - ga4_assets - c58a6f9c-c55c-44fc-ac76-5220a5e996d6 - ga4 - 2023-09-01T07:55:08.767132Z [info     ] psycopg2.errors.NotNullViolation: null value in column "devicemodel" of relation "tmp_c0f2008b_b0ca_40c8_be2e_8283e01b71c4" violates not-null constraint cmd_type=elb consumer=True name=target-postgres producer=False stdio=stderr string_id=target-postgres
2023-09-01 07:55:16 +0000 - dagster - INFO - ga4_assets - c58a6f9c-c55c-44fc-ac76-5220a5e996d6 - ga4 - 2023-09-01T07:55:08.767235Z [info     ] DETAIL:  Failing row contains (167, 7.088752074034902, 0.9989423585404548, Safari, 20230831, mobile, null, 0.0010576414595452142, 1896, iOS, 2023-08-31, 2023-03-20, 1896, 1.0026441036488631, 1891, 11.323353293413174). cmd_type=elb consumer=True name=target-postgres producer=False stdio=stderr string_id=target-postgres
2023-09-01 07:55:16 +0000 - dagster - INFO - ga4_assets - c58a6f9c-c55c-44fc-ac76-5220a5e996d6 - ga4 - 2023-09-01T07:55:08.767349Z [info     ] CONTEXT:  COPY tmp_c0f2008b_b0ca_40c8_be2e_8283e01b71c4, line 6: "167,7.088752074034902,0.9989423585404548,"Safari","20230831","mobile",,0.0010576414595452142,1896,"i..." cmd_type=elb consumer=True name=target-postgres producer=False stdio=stderr string_id=target-postgres
2023-09-01 07:55:16 +0000 - dagster - INFO - ga4_assets - c58a6f9c-c55c-44fc-ac76-5220a5e996d6 - ga4 - 2023-09-01T07:55:08.767456Z [info     ]                                cmd_type=elb consumer=True name=target-postgres producer=False stdio=stderr string_id=target-postgres
2023-09-01 07:55:16 +0000 - dagster - INFO - ga4_assets - c58a6f9c-c55c-44fc-ac76-5220a5e996d6 - ga4 - 2023-09-01T07:55:08.790390Z [error    ] Loader failed
2023-09-01 07:55:16 +0000 - dagster - INFO - ga4_assets - c58a6f9c-c55c-44fc-ac76-5220a5e996d6 - ga4 - 2023-09-01T07:55:08.790690Z [error    ] Block run completed.           block_type=ExtractLoadBlocks err=RunnerError('Loader failed') exit_codes={<PluginType.LOADERS: 'loaders'>: 1} set_number=0 success=False
2023-09-01 07:55:16 +0000 - dagster - INFO - ga4_assets - c58a6f9c-c55c-44fc-ac76-5220a5e996d6 - ga4 - Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
I tried running the same feed with `target-jsonl` and all the data looks fine, no `null` where `(not set)` should be.
Running `meltano run tap-ga4 target-postgres` with the `pipelinewise` variant of postgres.
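The CONTEXT line in the trace above shows an empty, unquoted field where `devicemodel` should be (`"mobile",,0.0010576414595452142`). In Postgres `COPY ... CSV`, an unquoted empty field is read as NULL by default, while a quoted `""` is read as an empty string. A minimal sketch of how a loader could produce such a field follows. The `to_csv_line` function is hypothetical and only assumes the loader skips falsy values when building the CSV line; it is not taken from the target's source.

```python
import json


def to_csv_line(record, columns):
    """Hypothetical serializer: JSON-encode each value, but write falsy
    values other than 0 (None, "", empty containers) as an empty field."""
    fields = []
    for name in columns:
        value = record.get(name)
        if value == 0 or value:
            fields.append(json.dumps(value, ensure_ascii=False))
        else:
            # An empty string ends up here, so it is written unquoted.
            # COPY in CSV format would then read this field as NULL.
            fields.append("")
    return ",".join(fields)


row = {
    "browser": "Safari",
    "deviceCategory": "mobile",
    "deviceModel": "",  # empty string, as the API might return it
    "sessions": 1896,
}
line = to_csv_line(row, ["browser", "deviceCategory", "deviceModel", "sessions"])
print(line)
```

If the loader behaves like this, `target-jsonl` output would still look fine, because JSON keeps `""` and `null` distinct.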
p
Hmm, this is odd 🤔. If it's causing issues for most users we can remove that field from the default reports, but those are only default reports and I'd expect most people to override them with their own. You can pass in whatever report definition you want to the tap, so you could fix your issue without having to fork the tap.
a
Will do. It's really hard to diagnose what's going on: I just tried the run now around 2100 and it completed fine, yet I'll bet the midnight run fails again. Very confusing. I'll proceed with the custom reports definition approach, thanks for the pointer.
Having a ga4 day! I think I've solved this one too: the API is actually providing empty strings for `deviceModel`, which is not the typical `(not set)` placeholder. It wouldn't matter except for the fact that I am using postgres as a target and this is a PK column, so empty strings getting coerced to null is bad news here. I'll just use a mapper, I think. It looks like there is some process in the GA4 backend to tidy these up: the old values that used to error for me no longer cause errors, but newer ones do.
config:
      start_date: 2023-03-20
      property_id: ${TAP_GA4_PROPERTY_ID}
      client_secrets: ${TAP_GOOGLE_ANALYTICS_CLIENT_SECRETS}
      key_file_location: ''
      stream_maps:
        devices:
          deviceModel: "'(not set)' if deviceModel == '' else deviceModel"
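For anyone wanting to check what the `stream_maps` expression above does to individual records, here is a plain-Python sketch of the same conditional. The `map_device_model` helper and the sample records are made up for illustration; the real mapping is evaluated by the tap, not by this function.

```python
def map_device_model(record: dict) -> dict:
    """Apply the same conditional as the stream map expression:
    '(not set)' if deviceModel == '' else deviceModel
    """
    mapped = dict(record)  # copy so the input record is left untouched
    device_model = mapped.get("deviceModel")
    mapped["deviceModel"] = "(not set)" if device_model == "" else device_model
    return mapped


# Illustrative records only, not real GA4 output.
samples = [
    {"deviceModel": ""},
    {"deviceModel": "iPhone"},
    {"deviceModel": "(not set)"},
]
results = [map_device_model(r)["deviceModel"] for r in samples]
print(results)
```

Note that the expression only rewrites empty strings. A value that is already missing or `None` would pass through unchanged and would still violate the not-null constraint.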