# singer-taps
l
hello, i'm using tap-hubspot --variant "airbyte" to extract data from several hubspot streams. the tap takes about 45 minutes to extract very small amounts of data (~80MB) to S3, but the log is flooded with lots of errors along the lines of
2025-10-16T13:05:43.487376Z [info     ] {'level': 'WARN', 'message': "Couldn't parse date/datetime string in hs_lifecyclestage_lead_date, trying to parse timestamp... Field value: 1709470649329. Ex: Unable to parse string [1709470649329]"} cmd_type=elb consumer=False job_name=staging:tap-hubspot-to-target-s3--raw-crm:eu-west-1-20251016 name=tap-hubspot producer=True run_id=0199ed1f-676c-7a87-ba25-9ddc70d8434c stdio=stderr string_id=tap-hubspot
Since the amount of data is very low and other ETLs run much faster, I imagine the issue is the sheer number of parsing attempts and failures, each of which gets logged. It looks like there is a log entry for each row in the source data. I tried (to no avail) to filter out the specific fields using selection / custom mappers, but the errors persist. It is crucial for me to use the airbyte variant, as it is the only variant that supports custom hubspot objects out-of-the-box. I'm looking for ways to tackle this issue. The goal is to make the ETL run in a few minutes instead of 45.
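To check the "one log entry per row" impression, the warnings can be tallied per field. This is only a rough sketch: the regex is derived from the single warning line quoted above, and the function name `count_parse_warnings` is made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the warning text quoted above; the field name is the
# token between "string in " and the following comma.
WARNING_RE = re.compile(r"Couldn't parse date/datetime string in (\w+),")


def count_parse_warnings(lines):
    """Return a Counter mapping field name -> number of parse warnings seen."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Small inline sample; in practice, pass an open log file instead of a list.
sample = [
    "[info] {'level': 'WARN', 'message': \"Couldn't parse date/datetime string "
    "in hs_lifecyclestage_lead_date, trying to parse timestamp...\"}",
    "[info] some unrelated line",
]
for field, n in count_parse_warnings(sample).most_common():
    print(n, field)
```

If the per-field counts come out close to the number of extracted rows, that would support the theory that the fallback runs (and logs) once per record.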
e
Yeah, that seems to be coming from the Airbyte connector: https://github.com/airbytehq/airbyte/blob/1dbd573f50ed3af34f3d61d2a9e5fdff9ce096ab/airbyte-integrations/connectors/source-hubspot/components.py#[…]0. I don't see a way of turning it off, unfortunately, but I only took a quick look.
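For context, the field value in the warning (1709470649329) looks like a Unix epoch in milliseconds rather than a date string, which would explain why the string parse fails first. A minimal sketch of that interpretation follows; it is not the connector's actual code, and `epoch_ms_to_utc` is a made-up helper name.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_ms_to_utc(value):
    """Interpret an int (or digit string) as milliseconds since the Unix epoch."""
    # timedelta arithmetic keeps the millisecond part exact (no float rounding).
    return EPOCH + timedelta(milliseconds=int(value))


# The value from the log line above.
print(epoch_ms_to_utc("1709470649329").isoformat())
# prints 2024-03-03T12:57:29.329000+00:00
```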
l
@Edgar Ramírez (Arch.dev) I tried to override the discovered schema by adding a "schema" extra to the tap definition, like below, but unfortunately it has no effect. Any idea why? According to https://docs.meltano.com/concepts/plugins?meltano-tabs=meltano.yml#schema-extra it should be perfect for this case. I'm using meltano v3.9
- name: tap-hubspot
  variant: airbyte
  pip_url: git+https://github.com/MeltanoLabs/tap-airbyte-wrapper.git
  config:
    airbyte_config:
      credentials:
        access_token: ${HUBSPOT_ACCESS_TOKEN}
        credentials_title: Private App Credentials
    force_native: true
  select:
  - companies.*
  - contacts.*
  schema:
    contacts:
      hs_lifecyclestage_lead_date:
        type: ["integer", "null"]
    companies:
      hs_lifecyclestage_lead_date:
        type: ["integer", "null"]
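One way to see whether the override reaches the catalog at all is to dump it with `meltano invoke --dump=catalog tap-hubspot > catalog.json` and look up the property's type. The sketch below assumes the standard Singer catalog layout (`streams[].schema.properties`); the file name `catalog.json`, the helper `property_type`, and the inline sample catalog are illustrative only, not real output.

```python
import json


def property_type(catalog, stream_name, prop):
    """Return the JSON-schema "type" of one property in a Singer catalog, or None."""
    for stream in catalog.get("streams", []):
        if stream_name in (stream.get("tap_stream_id"), stream.get("stream")):
            props = stream.get("schema", {}).get("properties", {})
            return props.get(prop, {}).get("type")
    return None


# Made-up sample standing in for the dumped catalog; for a real check use:
#   with open("catalog.json") as fh: catalog = json.load(fh)
catalog = json.loads("""
{"streams": [{"tap_stream_id": "contacts",
              "schema": {"properties": {
                  "hs_lifecyclestage_lead_date": {"type": ["integer", "null"]}}}}]}
""")
print(property_type(catalog, "contacts", "hs_lifecyclestage_lead_date"))
```

If the dumped catalog already shows `["integer", "null"]`, the override is being applied on the Meltano side. In that case the warnings would presumably come from parsing inside the connector itself, which a catalog-level schema change would not affect.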