# troubleshooting
j
I'm struggling with the meltanolabs hubspot tap. With this config it's grabbing duplicate company records like crazy, and the import never seems to finish. I cut it off at 150k records, which is 3 times the number of records that exist even before the filter.
extractors:
    - name: tap-hubspot
      variant: meltanolabs
      pip_url: git+https://github.com/MeltanoLabs/tap-hubspot.git
      config:
        start_date: 2024-04-01T00:00:00+0000
        flattening_enabled: true
        flattening_max_depth: 2
        stream_maps:
          companies:
            __filter__: properties.lifecyclestage != 'lead'
      select:
        - users.*
        - deals.*
        - companies.*
e
@Jesse Johnson thanks for reporting this! Can you try and see if pinning an older version fixes the issue:
pip_url: git+https://github.com/MeltanoLabs/tap-hubspot.git@9d62d3b90ce79d6d69eca544f1448d1049937131
cc @Pat Nadolny (Arch) in case you've ever seen this type of behavior
👀 1
j
I solved this by going into streams.py and selecting FULL instead of incremental for companies
the /objects/companies/search endpoint seemed to be the issue
which i saw was recently changed
e
Gotcha. Probably worth trying a replication-method override
extractors:
    - name: tap-hubspot
      metadata:
        companies:
          replication-method: FULL_TABLE
https://docs.meltano.com/concepts/plugins#metadata-extra
j
ah, thanks. i wasn't sure if that would work. i'll try that instead
p
> in case you've ever seen this type of behavior
I haven't seen this myself, but I'll report back if I do
🙏 1
j
I tried that, and it looks like it still uses the search endpoint
e
I'm not too familiar with the tap, but would the solution be for it not to use the search endpoint?
j
depends on the use case. since i'm pulling tens of thousands of records and then using a stream_map, i think the search endpoint just gets overwhelmed?
e
So it's too slow to even get the first response?
j
I'm not sure, it just runs until i stop it. I could try doing some more in-depth troubleshooting at some point
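One way to start that troubleshooting is to confirm the duplicates directly in the tap's output. A minimal sketch, assuming the output is standard Singer JSON lines (RECORD messages with `stream` and `record` fields) and that company records carry an `id` key. The helper name `count_duplicate_ids` and the sample lines are made up for illustration:

```python
import json
from collections import Counter
from typing import Iterable


def count_duplicate_ids(
    lines: Iterable[str], stream: str = "companies", key: str = "id"
) -> Counter:
    """Count key values that appear more than once in one stream's RECORD messages."""
    seen: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip any non-JSON log lines mixed into the output
        if not isinstance(message, dict):
            continue
        if message.get("type") == "RECORD" and message.get("stream") == stream:
            # "id" as the record key is an assumption about the stream's schema
            seen[message.get("record", {}).get(key)] += 1
    # keep only the values seen more than once
    return Counter({value: n for value, n in seen.items() if n > 1})


# Hypothetical sample output, not real HubSpot data
sample_output = [
    '{"type": "SCHEMA", "stream": "companies", "schema": {}}',
    '{"type": "RECORD", "stream": "companies", "record": {"id": "1"}}',
    '{"type": "RECORD", "stream": "companies", "record": {"id": "2"}}',
    '{"type": "RECORD", "stream": "companies", "record": {"id": "1"}}',
    '{"type": "RECORD", "stream": "deals", "record": {"id": "1"}}',
]

duplicates = count_duplicate_ids(sample_output)
print(duplicates)
```

To run it against the real tap, the output of `meltano invoke tap-hubspot` could be redirected to a file and its lines passed in instead of `sample_output`. If the same ids keep coming back, that would point at the pagination of the search endpoint rather than at the stream map.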
one thing you may be able to help me with is the filter i'm using for the stream_map: If i want to use an AND statement, what would that look like?
__filter__: (properties.lifecyclestage != 'lead') and (properties.lifecyclestage != null)
That attempt at an and statement throws an error:
Failed to evaluate simpleeval expressions
I'm going to try: (properties.lifecyclestage is not None)
👌 1
ok, that's happier! Disregard
👍 1
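For anyone landing here later: the error message names simpleeval, which evaluates Python-style expressions, so `null` is not a defined name there while `None` is. The sketch below uses a plain Python function as a stand-in for the combined expression to show which records it would keep. It is not how the SDK evaluates stream maps internally, the combined `and` form is an assumption (only the `is not None` part was confirmed above), and the sample records are made up:

```python
# Stand-in for the stream map expression:
#   (properties.lifecyclestage is not None) and (properties.lifecyclestage != 'lead')


def keep_company(record: dict) -> bool:
    """Return True when the record would pass the filter."""
    stage = record.get("properties", {}).get("lifecyclestage")
    # None has to be checked explicitly: None != 'lead' is True in Python,
    # so without the first clause empty stages would be kept.
    return (stage is not None) and (stage != "lead")


# Hypothetical sample records; real HubSpot records have many more fields
sample_records = [
    {"id": "1", "properties": {"lifecyclestage": "lead"}},
    {"id": "2", "properties": {"lifecyclestage": "customer"}},
    {"id": "3", "properties": {"lifecyclestage": None}},
]

kept_ids = [r["id"] for r in sample_records if keep_company(r)]
print(kept_ids)
```

Only the record with a non-empty stage other than `lead` survives, which is the behavior the `and` version of the filter is after.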