# troubleshooting
j
I'm finding myself baffled by the tap-hubspot variant from meltanolabs. When I run `meltano select tap-hubspot --list --all`, I get a result that includes this custom property we use:
companies.properties.redacted_custom_prop_name
. I then run
meltano select tap-hubspot companies redacted_custom_prop_name
, it seems to add it to the meltano.yml. However, if i then invoke tap-hubspot, i get this error:
❯ meltano run tap-hubspot --refresh-catalog target-jsonl
2024-10-29T17:28:34.698950Z [info     ] Environment 'dev' is active   
2024-10-29T17:28:38.535979Z [warning  ] Property `redacted_custom_prop_name` was not found in the schema of stream `companies`
Including my redacted yml file in the thread
version: 1
default_environment: dev
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-hubspot
    variant: meltanolabs
    pip_url: git+https://github.com/MeltanoLabs/tap-hubspot.git
    config:
      flattening_enabled: true
      flattening_max_depth: 2
      stream_maps:
        '*':
          __alias__: __stream_name__ + '_redacted'
        companies:
          __filter__: properties.redacted_custom_prop_1 is not None or properties.redacted_custom_prop_2
            is not None or properties.redacted_custom_prop_3 is not None or properties.redacted_custom_prop_4
            is not None or properties.redacted_custom_prop_5 is not None or properties.redacted_custom_prop_6
            is not None or properties.redacted_custom_prop_7 is not None or properties.redacted_custom_prop_8
            is not None or properties.redacted_custom_prop_9 is not None or properties.redacted_custom_prop_10
            is not None or properties.redacted_custom_prop_11 is not None or
            properties.redacted_custom_prop_12 is not None or properties.redacted_custom_prop_13
            is not None or properties.redacted_custom_prop_14 is not
            None or properties.redacted_custom_prop_15 is not None
    # metadata:
    #   companies:
    #     replication-method: FULL_TABLE
    select:
    - companies.redacted_custom_prop_name
  loaders:
  - name: target-jsonl
    variant: andyh1203
    pip_url: target-jsonl
When i run
meltano invoke --dump catalog tap-hubspot
i see it in the catalog as:
.streams[2].schema.properties.properties.properties.redacted_custom_prop_name
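The catalog path above (`schema.properties.properties.properties.<name>`) suggests the custom property sits inside a nested object that is itself called `properties`. That would match the `companies.properties.redacted_custom_prop_name` name printed by `--list --all`, whereas the select rule in the yml is `companies.redacted_custom_prop_name`. A minimal Python sketch for listing the dotted names in a catalog dump follows. `property_paths`, `selectable_names`, and `example_catalog` are hypothetical names made up for illustration, and the idea that these dotted names are what `meltano select` expects is an assumption based only on the `--list --all` output quoted earlier.

```python
def property_paths(schema, prefix=""):
    """Yield dotted paths for every property in a JSON schema, descending into nested objects."""
    for name, sub in schema.get("properties", {}).items():
        path = f"{prefix}.{name}" if prefix else name
        yield path
        # Nested objects carry their own "properties" mapping.
        if isinstance(sub, dict) and "properties" in sub:
            yield from property_paths(sub, path)


def selectable_names(catalog):
    """Return 'stream.path' names for every property of every stream in a catalog dict."""
    names = []
    for stream in catalog.get("streams", []):
        stream_id = stream.get("tap_stream_id") or stream.get("stream")
        for path in property_paths(stream.get("schema", {})):
            names.append(f"{stream_id}.{path}")
    return names


# Hand-written catalog mirroring the shape reported above (assumption: not real tap output).
example_catalog = {
    "streams": [
        {
            "tap_stream_id": "companies",
            "schema": {
                "type": "object",
                "properties": {
                    "id": {"type": ["string", "null"]},
                    "properties": {
                        "type": "object",
                        "properties": {
                            "redacted_custom_prop_name": {"type": ["string", "null"]}
                        },
                    },
                },
            },
        }
    ]
}

if __name__ == "__main__":
    for name in selectable_names(example_catalog):
        print(name)
```

To run it against the real catalog, the output of `meltano invoke --dump catalog tap-hubspot` could be saved to a file and loaded with `json.load` in place of `example_catalog`.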
the goal i'm trying to achieve is creating a pipeline for 16 specific custom properties into BQ
Annoyingly, it works if i surround the name with *
spoke too soon, they show up in
meltano select tap-hubspot --list
but not in the jsonl output
😢
v
It's possible your catalog is cached; try the `--refresh-catalog` option: https://docs.meltano.com/reference/command-line-interface/#run. Note that it helps a lot to see the command you're running when you have an issue. You're getting close; there are just some underlying details with Meltano and edge cases you're hitting. You'll get it.
j
same deal, unfortunately.
meltano run --refresh-catalog tap-hubspot target-jsonl
prints this to the output jsonl file:
{"id": "5534339884", "hs_lastmodifieddate": "2024-10-22T11:34:09.657Z"}
{"id": "5535570836", "hs_lastmodifieddate": "2024-10-04T15:08:38.041Z"}
{"id": "5536384277", "hs_lastmodifieddate": "2024-10-04T15:08:38.036Z"}
{"id": "5536505886", "hs_lastmodifieddate": "2024-10-04T15:08:37.932Z"}
{"id": "5538551911", "hs_lastmodifieddate": "2024-10-29T13:44:06.500Z"}
{"id": "5543298638", "hs_lastmodifieddate": "2024-10-25T13:49:10.450Z"}
let me try deleting state
nah, same thing
v
It's not state. It sounds like you're running a lot of things all at once. First, let's get the tap to output your data (I thought you already had that):
meltano invoke tap-hubspot > out
does that output the data you want?
j
I'm not, really. I'm just working on the tap right now and using target-jsonl as a dummy data dump:
{"type": "STATE", "value": {}}
{"type": "SCHEMA", "stream": "companies", "schema": {"properties": {"id": {"type": ["string", "null"]}, "hs_lastmodifieddate": {"format": "date-time", "type": ["string", "null"]}}, "type": "object"}, "key_properties": ["id"], "bookmark_properties": ["hs_lastmodifieddate"]}
{"type": "RECORD", "stream": "companies", "record": {"id": "redacted1", "hs_lastmodifieddate": "2024-10-27T10:31:22.625Z"}, "time_extracted": "2024-10-30 14:09:57.830256+00:00"}
{"type": "RECORD", "stream": "companies", "record": {"id": "redacted2", "hs_lastmodifieddate": "2024-10-29T13:43:34.347Z"}, "time_extracted": "2024-10-30 14:09:57.830460+00:00"}
{"type": "RECORD", "stream": "companies", "record": {"id": "redacted3", "hs_lastmodifieddate": "2024-10-30T01:14:29.665Z"}, "time_extracted": "2024-10-30 14:09:57.830556+00:00"}
{"type": "RECORD", "stream": "companies", "record": {"id": "redacted4", "hs_lastmodifieddate": "2024-10-22T13:51:22.693Z"}, "time_extracted": "2024-10-30 14:09:57.830628+00:00"}
{"type": "RECORD", "stream": "companies", "record": {"id": "redacted5", "hs_lastmodifieddate": "2024-10-21T21:14:20.458Z"}, "time_extracted": "2024-10-30 14:09:57.830690+00:00"}
{"type": "RECORD", "stream": "companies", "record": {"id": "redacted6", "hs_lastmodifieddate": "2024-10-22T13:51:22.739Z"}, "time_extracted": "2024-10-30 14:09:57.830753+00:00"}
{"type": "RECORD", "stream": "companies", "record": {"id": "redacted7", "hs_lastmodifieddate": "2024-10-04T04:57:42.782Z"}, "time_extracted": "2024-10-30 14:09:57.830805+00:00"}
{"type": "RECORD", "stream": "companies", "record": {"id": "redacted8", "hs_lastmodifieddate": "2024-10-26T01:04:27.656Z"}, "time_extracted": "2024-10-30 14:09:57.830870+00:00"}
{"type": "RECORD", "stream": "companies", "record": {"id": "redacted9", "hs_lastmodifieddate": "2024-10-04T15:08:37.126Z"}, "time_extracted": "2024-10-30 14:09:57.830921+00:00"}
{"type": "RECORD", "stream": "companies", "record": {"id": "redacted10", "hs_lastmodifieddate": "2024-10-27T10:31:51.096Z"}, "time_extracted": "2024-10-30 14:09:57.830967+00:00"}
{"type": "RECORD", "stream": "companies", "record": {"id": "redacted11", "hs_lastmodifieddate": "2024-10-30T12:31:43.097Z"}, "time_extracted": "2024-10-30 14:09:57.831008+00:00"}
{"type": "RECORD", "stream": "companies", "record": {"id": "redacted12", "hs_lastmodifieddate": "2024-10-27T10:31:50.508Z"}, "time_extracted": "2024-10-30 14:09:57.831052+00:00"}
If you mean to output data without a select statement, that works fine
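One way to check what the tap actually emits, rather than eyeballing the dump, is to read the SCHEMA messages and list the property names per stream. This is a rough sketch, not part of Meltano or the tap: `schema_properties` and `sample` are hypothetical names, and the sample lines are trimmed copies of the output pasted above.

```python
import json


def schema_properties(lines):
    """Map each stream name to the set of top-level property names seen in its SCHEMA messages."""
    seen = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON message, e.g. log lines
        if not isinstance(message, dict):
            continue
        if message.get("type") == "SCHEMA":
            props = message.get("schema", {}).get("properties", {})
            seen.setdefault(message["stream"], set()).update(props)
    return seen


# Two lines taken from the tap output above.
sample = [
    '{"type": "STATE", "value": {}}',
    '{"type": "SCHEMA", "stream": "companies", "schema": {"properties": '
    '{"id": {"type": ["string", "null"]}, "hs_lastmodifieddate": '
    '{"format": "date-time", "type": ["string", "null"]}}, "type": "object"}, '
    '"key_properties": ["id"], "bookmark_properties": ["hs_lastmodifieddate"]}',
]

if __name__ == "__main__":
    for stream, names in schema_properties(sample).items():
        print(stream, sorted(names))
```

With the real dump, the same function could be fed `open("out")`, where `out` is the file written by `meltano invoke tap-hubspot > out`. If the selected custom property is missing from the set, the select rule did not match anything in the schema.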
v
Is that data everything that you want or not?
j
it's everything + all the fields i'm trying to exclude, so maybe i should do some filtering after the tap since filtering at the tap level isn't working
that is to say, column level selection is what i'm struggling with
that being said, if i can't get tap-hubspot to recognize property names, my stream_maps config for filtering out rows i don't need also doesn't work
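If filtering after the tap turns out to be the fallback, the idea can be sketched in plain Python over Singer RECORD messages: drop rows where every wanted property is null, and keep only the wanted columns plus the key. This is an illustrative sketch only, not a Meltano feature. `filter_record`, `custom_a`, and `custom_b` are hypothetical names, and it assumes the wanted properties appear as top-level keys of each record (with flattening enabled the real key names may differ).

```python
def filter_record(message, wanted, always_keep=("id",)):
    """Trim a Singer RECORD message to the wanted fields.

    Returns None when every wanted field is missing or null.
    Non-RECORD messages (SCHEMA, STATE) are passed through unchanged.
    """
    if message.get("type") != "RECORD":
        return message
    record = message.get("record", {})
    # Row-level filter: skip rows where none of the wanted fields has a value.
    if all(record.get(name) is None for name in wanted):
        return None
    # Column-level filter: keep wanted fields plus the key columns.
    keep = set(wanted) | set(always_keep)
    trimmed = {key: value for key, value in record.items() if key in keep}
    return {**message, "record": trimmed}


if __name__ == "__main__":
    example = {
        "type": "RECORD",
        "stream": "companies",
        "record": {"id": "1", "custom_a": None, "custom_b": "x", "noise": 1},
    }
    print(filter_record(example, ["custom_a", "custom_b"]))
```

Note that SCHEMA messages pass through untouched here, so a downstream target would still see the full schema; trimming the schema to match would be a further step.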