hey team, help needed!! :wave: I'm trying to inges...
# troubleshooting
v
hey team, help needed!! 👋 I'm trying to ingest Google Ads data with meltano, but unfortunately facing weird issues on the most popular taps. Is this the right place to ask for help? • matatika variant: ◦ fails with:
400 Client Error: Bad Request for path: /v16/customers:listAccessibleCustomers
singer variant: ◦ fails with:
key error: tap-google-ads.api-field-names
considerations: • We've recently received approval from Google for a Basic Acess developer_token (which can query production accounts) and I am able to use purely Google Ads python SDK with python and get campaigns with this sample. • in the thread, I will post details of my meltano configs for attempting to ingest data with both taps What am I doing wrong?
1
matatika variant:
Copy code
- name: tap-googleads
  variant: matatika
  pip_url: git+<https://github.com/Matatika/tap-googleads.git@v0.5.0>
  config:
    customer_id: XXX-XXX-XXXX # (as seen in Google Ads account)
    developer_token: "xxx"
    oauth_credentials:
        client_id: "xxx"
        client_secret: "xxx"
        token_uri: <https://oauth2.googleapis.com/token>
        auth_uri: <https://accounts.google.com/o/oauth2/auth>
        refresh_token: "xxx"
        type: authorized_user
singer variant:
Copy code
- name: tap-googleads
  variant: singer-io
  pip_url: git+<https://github.com/singer-io/tap-google-ads.git>
  config:
    start_date: 2024-10-01T00:00:00Z
    customer_ids: XXXXXXXXXX # taken from google ads account but no hyphens
    login_customer_ids:
    - customerId: XXXXXXXXXX # taken from google ads account but no hyphens
      loginCustomerId: XXXXXXXXXX # taken from google ads account but no hyphens
    developer_token: "xxx"
    oauth_client_id: "xxx"
    oauth_client_secret: "xxx"
    refresh_token: "xxx"
r
Hi @Vitor Fortunato, are you authenticating as a manager account? If so, you will need to set
login_customer_id
. https://github.com/Matatika/tap-googleads?tab=readme-ov-file#login_customer_id
👀 1
v
no, I'm using the direct customer_id of an account that has data. let me try to authenticate as manager
r
I mean the
refresh_token
credential you are providing may pertain to a manager account, so you would need to provide
login_customer_id
in that case.
I see you did provide it when trying the Singer variant.
v
we have both manager account (MCC) and a sub-account under it, with data. trying now with:
Copy code
login_customer_id: XXX-XXX-XXXX # (manager account id / MCC)
customer_ids: XXX-XXX-XXXX # (sub-account id)
results in the same error
400 Client Error: Bad Request for path: /v16/customers:listAccessibleCustomers
----- but let me double-check the refresh token I have, because it might not match with the manager account, if that's a potential issue
thank you so much for the quick check by the way 🙏
np 1
r
Oh 😅 Are you including hyphens in the
customer_id
? They need to be omitted, as in the SInger variant. https://github.com/Matatika/tap-googleads?tab=readme-ov-file#customer_idscustomer_id
🦸 1
v
there you go! I think that was the issue! I was just able to get some output 🙏
🙌 1
r
It's always something simple... I guess you can try without
login_customer_id
again if you want. 😄
Also, I noticed you are using an older version of the tap, so if you want to update to the latest version you can run:
Copy code
meltano add --update tap-googleads
FYI the
v16
API the current version of the tap you are running with uses is due to be sunset in Feb next year: https://developers.google.com/google-ads/api/docs/sunset-dates
v
thank you! I'm a meltano newbie 🙂 I actually didn't specify the tap-google ads version, I think v16 was the default after doing
meltano add extractor tap-googleads
r
Yeah, it would have been the default when you initially added it. We tend to do that with our plugins on Meltano Hub for a little more stability (in case we make a bad commit or breaking change on the main branch).
👍 1
v
Hey @Reuben (Matatika) Now I fell into another issue that might be a bit more complex as I think it could be related to this issue that is being tracked already in the repo. I'm trying to use target-s3 and meltano fails with:
Copy code
singer_sdk.exceptions.MissingKeyPropertiesError: Record is missing one or more key_properties.
Key Properties: ['resource_names'], Record Keys: ['resourceNames']
I know data is coming like:
Copy code
{
  "resourceNames": [
    "customers/xx",
    "customers/xxx"
  ]
}
any workaround that I can apply here?
r
Ah, that was you commenting on the issue! I'll take a look into this, but in the meantime, you might be able use stream maps and alias the property to the expected format:
Copy code
config:
      stream_maps:
        stream_accessible_customers:
          resource_names: resourceNames
          resourceNames: __NULL__
https://sdk.meltano.com/en/latest/stream_maps.html#aliasing-properties You may have to do this for other streams that define using snake-case. Additionally, some streams also have an implied dependency on flattening enabled to form the primary key from nested properties (indicated by a double underscore
__
), so you may have to apply flattening and then define some more mappings as above for the time being.
👀 1
Will be working on that issue in https://github.com/Matatika/tap-googleads/pull/69
v
🦸
k
Hi @Reuben (Matatika), I have similar issues when using target-postgres:
Copy code
2024-11-08T21:58:33.116338Z [info     ]     raise MissingKeyPropertiesError( cmd_type=elb consumer=True job_name=dev:tap-googleads-to-target-postgres name=target-postgres producer=False run_id=60201b9b-63c9-4048-a146-3872a6d6278b stdio=stderr string_id=target-postgres
2024-11-08T21:58:33.116534Z [info     ] singer_sdk.exceptions.MissingKeyPropertiesError: Record is missing one or more key_properties. cmd_type=elb consumer=True job_name=dev:tap-googleads-to-target-postgres name=target-postgres producer=False run_id=60201b9b-63c9-4048-a146-3872a6d6278b stdio=stderr string_id=target-postgres
2024-11-08T21:58:33.117009Z [info     ] Key Properties: ['campaign__id'], Record Keys: ['campaign', 'customer_id', '_sdc_extracted_at', '_sdc_received_at', '_sdc_batched_at', '_sdc_deleted_at', '_sdc_sequence', '_sdc_table_version', '_sdc_sync_started_at'] cmd_type=elb consumer=True job_name=dev:tap-googleads-to-target-postgres name=target-postgres producer=False run_id=60201b9b-63c9-4048-a146-3872a6d6278b stdio=stderr string_id=target-postgres
The job will stops with "Loader failed".
r
@Klaus Rubenstein Same as above if you want a workaround, or you can try out the draft PR fix with
Copy code
pip_url: git+<https://github.com/Matatika/tap-googleads.git@|https://github.com/Matatika/tap-googleads.git@>5-primary-keys-format
k
@Reuben (Matatika) Thanks Reuben, will try it!
@Reuben (Matatika) Next issue on my side. Seems to be a bit more tricky with a postgres as target? target-jsonl works without problems.
Copy code
NotImplementedError: Altering columns is not supported. Could not convert column 'tap_googleads.stream_accessible_customers.resourceNames' from 'JSONB' to 'TEXT'.
r
I think that might be a side effect of the PR fix. Can you share your
meltano.yml
? What happens if you drop the
stream_accessible_customers
table and let the target recreate it?
v
I'm also facing issues after trying the PR code fix. I see target-jsonl working just fine and I see fields alrady flattened there, However, when using target-s3: • I see warnings like the one below for every streaming:
Copy code
"WARNING", "message": "Unable to find a config for key_properties - stream_accessible_customers",
• and then, a bunch of errors like:
Copy code
"ERROR", "message": "Error while processing stream_campaign: 000904 (42000): SQL compilation error: error line 3 at position 52\ninvalid identifier 'CAMPAIGN__ID'"
"ERROR", "message": "Error while processing stream_adgroupsperformance: 000904 (42000): SQL compilation error: error line 3 at position 52\ninvalid identifier 'CAMPAIGN__ID'"
"ERROR", "message": "Error while processing stream_adgroups: 000904 (42000): SQL compilation error: error line 3 at position 52\ninvalid identifier 'ADGROUP__ID'"
"ERROR", "message": "Error while processing stream_campaign_performance: 000904 (42000): SQL compilation error: error line 3 at position 52\ninvalid identifier 'CAMPAIGN__NAME'"
"ERROR", "message": "Error while processing stream_customer_hierarchy: 000904 (42000): SQL compilation error: error line 3 at position 52\ninvalid identifier 'CUSTOMERCLIENT__ID'"
r
Can you share your
meltano.yml
please? What
target-s3
are you using (unclear why SQL is involved here)?
v
ohh! actually, never mind. I think it worked just fine as I see all files already in S3. The error I have is on another part of the code that takes s3 to snowflake. It is not related anymore to tap-googleads
👍 1
thank you so much Reuben!!
np 1
r
Makes sense 🙂
k
@Reuben (Matatika) Dropping the
stream_accessible_customers
table was the solution! Thanks for your time Reuben, will check the data now but looks fine so far! :)
👍 1
v
I've seen your PR only works if data gets flattened. Any chance do make it work without flattening data?
r
@Vitor Fortunato Primary keys can only be top-level properties, so the intention is to use flattening to achieve this. The PR would enable flattening by default so primary keys work out-the-box with (most) loaders. rather than hitting the same errors that have been discussed in this thread previously. I haven't reviewed what properties make up the primary keys for each stream, so maybe there's an optimisation to be made there but I'm pretty confident flattening would need to be used in some places regardless. There is always the option of unsetting the primary keys for all streams
Copy code
config:
      stream_maps:
        '*':
          __key_properties__: __NULL__
but then rows will only be appended rather than updated, so you will inevitably get duplicate data. https://sdk.meltano.com/en/v0.42.1/stream_maps.html#unset-or-modify-the-stream-s-primary-key-behavior
Do you have a particular reason why you don't want the data flattened?
v
we're still investigating what could be the root cause. after flattening fields and using your MR, it does work to dump files using target-s3. However, stream schemas are not updated accordingly. If we use
meltano invoke tap-googleads --discover
to get the schema, it will still come with unflattened fields. This is a problem to us because we use this schema to run s3 to snowflake copy
r
meltano run tap-googleads target-snowflake
will respect the flattening. Why are you loading to S3 first? For bulk loading? @Edgar Ramírez (Arch.dev) Should the schema from
--discover
respect
flattening_enabled
here (and potentially other mappings)?
v
we always use s3 as landing first
r
The default
target-snowflake
runs a bulk load from the local file-system, so you shouldn't need the intermediate S3 layer if you're concerned about speed.
e
@Edgar Ramírez (Arch.dev) Should the schema from
--discover
respect
flattening_enabled
here (and potentially other mappings)?
It doesn't at the moment. The
SCHEMA
messages would be the source of truth for the emitted records.
👍 1
There are a few issues on the repo discussing whether and how Meltano should generate an artifact at the end of each run that includes, among other things, the final schema for each stream. For example: https://github.com/meltano/meltano/issues/2753.
👀 1