I’m running into a weird issue with <dreamdata-io/...
# troubleshooting
p
I’m running into a weird issue with dreamdata-io/tap-twitter-ads where it fails with:
target-jsonl          | Exception: A record for stream accounts was encountered before a corresponding schema
Running the elt with --log-level=debug, it does in fact look like the tap is emitting a row of type RECORD first. It fails with the same error even if I pass in a custom catalog that doesn’t contain the accounts stream, set selected: false in the catalog, or run the elt command with --select campaigns.* as the selector. See the relevant bit of the log output below:
tap-twitter-ads       | INFO Last/Currently Syncing Stream: None
tap-twitter-ads       | INFO Sync Parent Streams: ['accounts', 'campaigns']
tap-twitter-ads       | INFO Sync Child Streams: []
tap-twitter-ads       | INFO Sync Report Streams: ['campaign_events']
tap-twitter-ads       | INFO Account ID: xxxxx - START Syncing
tap-twitter-ads (out) | {"type": "STATE", "value": {"currently_syncing": "accounts"}}
tap-twitter-ads       | INFO Stream: accounts - Currently Syncing
tap-twitter-ads       | INFO Stream: accounts - START Syncing, Account ID: xxxxx
tap-twitter-ads       | INFO Stream: accounts - endpoint_config: {'path': 'accounts', 'data_key': 'data', 'key_properties': ['id'], 'replication_method': 'FULL_TABLE', 'replication_keys': [], 'params': {'account_ids': '{account_ids}', 'sort_by': ['updated_at-desc'], 'with_deleted': '{with_deleted}', 'count': 1000, 'cursor': None}}
tap-twitter-ads       | INFO country_code_list = ['US', 'CA']
tap-twitter-ads       | INFO sub_types = ['none']
tap-twitter-ads       | INFO sub_type = none
tap-twitter-ads       | INFO Stream: accounts - Request URL: https://ads-api.twitter.com/9/accounts
tap-twitter-ads       | INFO Stream: accounts - Request params: {'account_ids': 'xxxxx', 'sort_by': ['updated_at-desc'], 'with_deleted': 'true', 'count': 1000, 'cursor': None}
tap-twitter-ads (out) | {"type": "RECORD", "stream": "accounts", "record": {"name": "Some Name", "business_name": null, "timezone": "America/New_York", "timezone_switch_at": "2013-05-22T04:00:00Z", "country_code": null, "id": "r4ls2", "created_at": "2012-01-10T17:06:51Z", "updated_at": "2021-09-16T21:28:54Z", "industry_type": null, "business_id": null, "approval_status": "ACCEPTED", "deleted": false}, "time_extracted": "2021-09-17T14:22:13.760759Z"}
tap-twitter-ads       | INFO METRIC: {"type": "counter", "metric": "record_count", "value": 1, "tags": {"endpoint": "accounts"}}
tap-twitter-ads       | INFO Stream: accounts, Account ID: xxxxx - FINISHED Sub Type: none, Total Sub Type Records: 1
tap-twitter-ads       | INFO Stream: accounts - FINISHED Syncing, Account ID: xxxxx, Total Records: 1
Any ideas?
p
Hmm, this seems like it could be a bug with the tap? I don’t see any SCHEMA messages being written out in their code. It’s the tap’s responsibility to emit a stream’s schema before any of its records are sent. If you run meltano invoke tap-twitter-ads to see the raw messages in stdout, are there any SCHEMA messages?
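If it helps, here is a rough sketch of how the raw output could be checked for that ordering problem. It assumes the tap's stdout is one JSON message per line; the function name find_records_before_schema is made up for illustration and isn't part of Meltano or singer-python.

```python
import json


def find_records_before_schema(lines):
    """Return stream names whose first RECORD arrives before any SCHEMA."""
    seen_schema = set()
    offenders = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            message = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a Singer message (e.g. a stray log line)
        if not isinstance(message, dict):
            continue
        stream = message.get("stream")
        if message.get("type") == "SCHEMA":
            seen_schema.add(stream)
        elif message.get("type") == "RECORD":
            # A RECORD with no earlier SCHEMA for the same stream is the
            # situation the target complains about.
            if stream not in seen_schema and stream not in offenders:
                offenders.append(stream)
    return offenders


# Shortened versions of the messages from the log output above.
sample = [
    '{"type": "STATE", "value": {"currently_syncing": "accounts"}}',
    '{"type": "RECORD", "stream": "accounts", "record": {"id": "r4ls2"}}',
]
print(find_records_before_schema(sample))  # ['accounts']
```

To run it against the real tap, the same function could be fed sys.stdin after piping the meltano invoke tap-twitter-ads output into the script.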
p
Oh yeah, doesn’t look like it, just RECORD and STATE messages. If I have schemas in a custom catalog, do you know if there’s a way I can tell Meltano to use that instead, or does the tap still have to emit a SCHEMA message even if I use a catalog or schema extra?
p
I don’t think so. Meltano can accept a custom catalog, but that’s just passed to the tap, which uses it to select streams. If the tap doesn’t output SCHEMA messages, then most if not all targets will fail validation. You can try submitting an issue on the repo and see what they say; maybe we’re missing something, or the target they use doesn’t do validation. Emitting the schema should just be a one-line change, since the singer-python package has a utility for it.
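For reference, the utility in singer-python is singer.write_schema(stream_name, schema, key_properties), called once per stream before its records are written. The sketch below shows the message order that call produces, using only the standard library so it runs without the package. The schema is a made-up two-field example, not the tap's real accounts schema, and build_messages / emit are hypothetical helper names.

```python
import json
import sys


def build_messages(stream_name, schema, key_properties, records):
    """Build Singer messages for one stream, SCHEMA first, then RECORDs."""
    messages = [
        {
            "type": "SCHEMA",
            "stream": stream_name,
            "schema": schema,
            "key_properties": key_properties,
        }
    ]
    for record in records:
        messages.append(
            {"type": "RECORD", "stream": stream_name, "record": record}
        )
    return messages


def emit(messages, out=None):
    """Write each message as one JSON line (stdout by default)."""
    out = out or sys.stdout
    for message in messages:
        out.write(json.dumps(message) + "\n")
    out.flush()


# Assumed, heavily simplified schema for illustration only.
accounts_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "name": {"type": ["null", "string"]},
    },
}

messages = build_messages(
    "accounts",
    accounts_schema,
    ["id"],  # key_properties, matching the endpoint_config in the log
    [{"id": "r4ls2", "name": "Some Name"}],
)
emit(messages)
```

With the SCHEMA line arriving first, a target has what it needs to validate the accounts records that follow.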