# troubleshooting
j
Hello - first time using Meltano here. Trying to set up a tap-airtable target-bigquery pipeline, and I think I’m not setting it up correctly. This is my meltano.yml (excluding the ‘token’ config under tap-airtable).
Copy code
version: 1
send_anonymous_usage_stats: true
project_id: 7c2647e3-78bf-4c06-bebd-9d6cd5a05904
plugins:
  extractors:
  - name: tap-airtable
    namespace: tap_airtable
    pip_url: git+https://github.com/goes-funky/tap-airtable.git
    executable: tap-airtable
  loaders:
  - name: target-bigquery
    variant: adswerve
    pip_url: git+https://github.com/adswerve/target-bigquery.git@0.11.3
environments:
- name: dev
  config:
    plugins:
      extractors:
      - name: tap-airtable
        config:
          metadata_url: https://api.airtable.com/v2/meta/
          records_url: https://api.airtable.com/v0
          select_by_default: true
          remove_emojis: false
          base_id: app0Ie9bbcFynqxOY
      loaders:
      - name: target-bigquery
        config:
          project_id: core-data-stack
          dataset_id: tap_airtable2
- name: staging
- name: prod
This tells me (even after adding the token config to tap-airtable) that the token key can’t be found. Moving the config into the “plugins” section made it recognize the config variables and gives me the successful output below, but it’s not actually moving any data; it just creates an empty dataset in BigQuery.
Copy code
2022-03-11T14:08:18.171618Z [info     ] Running extract & load...      job_id=airtable-to-bigquery name=meltano run_id=a2442ffa-1271-4bb8-bb1b-cb222dd727bf
2022-03-11T14:08:18.952617Z [info     ] INFO Pushing state: {}         cmd_type=loader job_id=airtable-to-bigquery name=target-bigquery run_id=a2442ffa-1271-4bb8-bb1b-cb222dd727bf stdio=stderr
2022-03-11T14:08:18.964065Z [info     ] Incremental state has been updated at 2022-03-11 14:08:18.963820.
2022-03-11T14:08:19.015530Z [info     ] Extract & load complete!       job_id=airtable-to-bigquery name=meltano run_id=a2442ffa-1271-4bb8-bb1b-cb222dd727bf
2022-03-11T14:08:19.015736Z [info     ] Transformation skipped.        job_id=airtable-to-bigquery name=meltano run_id=a2442ffa-1271-4bb8-bb1b-cb222dd727bf
2022-03-11T14:08:19.139882Z [info     ] Emitter initialized with endpoint http://sp.meltano.com/i
2022-03-11T14:08:19.255666Z [info     ] Attempting to send 1 events
2022-03-11T14:08:19.255861Z [info     ] Sending GET request to http://sp.meltano.com/i...
2022-03-11T14:08:20.172190Z [info     ] GET request finished with status code: 200
I’ve tried copying the output of “tap-airtable --config config.json --discover” to a “catalog” field in the yaml but still no difference. Also, when I run
Copy code
meltano select tap-airtable --list --all
… I get the error:
Copy code
Cannot list the selected attributes: Could not find catalog. Verify that the tap supports discovery mode and advertises the `discover` capability as well as either `catalog` or `properties`
v
Welcome! It looks like the links to the guides that I used don't work right now. On https://hub.meltano.com/taps/airtable, the "as a custom extractor" link (https://meltano.com/docs/plugin-management.html#custom-plugins) goes to the wrong place. @amanda.folson FYI: I think it should go to https://docs.meltano.com/concepts/project#custom-plugin-definitions. To cut to the chase, it looks like you need to add the catalog and discover capabilities:
Copy code
capabilities:
    - catalog
    - discover
If you have a GitLab account, could you bump this issue with a thumbs up? https://gitlab.com/meltano/hub/-/issues/161
Something has to tell Meltano that the tap supports catalog and discover. That should fix things for you, since it looks like the Airtable tap auto-selects everything (normally you need to add a select rule such as '*.*').
x
I have come across exactly the same issue here
It does not move any data even when I tried to define
select: - table_name.*
e
@xinge_li even when the tap declares both the catalog and discover capabilities?
x
Yes
v
Please share meltano.yml and logs. This tap works for others based on the hub so it's most likely a configuration thing
x
The meltano.yml I have for this tap:
Copy code
- name: tap-airtable
  namespace: tap_airtable
  pip_url: git+https://github.com/goes-funky/tap-airtable.git
  executable: tap-airtable
  capabilities:
  - catalog
  - discover
  config:
    metadata_url: https://api.airtable.com/v2/meta/
    records_url: https://api.airtable.com/v0/
    token: $AIRTABLE_TOKEN
    base_id: $base_id
    selected_by_default: true
    remove_emojis: true
    default_replication_method: INCREMENTAL
  select:
  - table_name.*
The logs are below; afterwards, nothing is in the destination database:
Copy code
2022-03-12T22:23:17.664930Z [info     ] Running extract & load...      job_id=2022-03-12T222315--tap-airtable--target-postgres--airtable name=meltano run_id=e4c88362-2c7f-4a44-92fc-067a04dca422
2022-03-12T22:23:20.323409Z [info     ] Extract & load complete!       job_id=2022-03-12T222315--tap-airtable--target-postgres--airtable name=meltano run_id=e4c88362-2c7f-4a44-92fc-067a04dca422
2022-03-12T22:23:20.323875Z [info     ] Transformation skipped.        job_id=2022-03-12T222315--tap-airtable--target-postgres--airtable name=meltano run_id=e4c88362-2c7f-4a44-92fc-067a04dca422
v
Can you post all of the logs? Also try
Copy code
select:
    - *.*
While we're debugging it might make some sense to add this as well
Copy code
metadata:
      '*':
        replication-method: FULL_TABLE
This removes any incremental behavior (I'm not certain Airtable's tap supports it, but it's worth a shot). I haven't dug into the code, so logs would help!
x
Above is all the logs I have, nothing more 😅 I will try
Copy code
- *.*
That did not work; it fails with this error:
Copy code
while scanning an alias
  in "meltano.yml", line 147, column 7
expected alphabetic or numeric character, but found '.'
  in "meltano.yml", line 147, column 8
Setting the FULL_TABLE replication method did not help either.
When I run with --log-level debug, it seems like the tables are skipped, and I don’t know why:
Copy code
2022-03-14T22:00:36.720253Z [info     ] INFO discover base BASE_ID name=tap-airtable stdio=stderr type=discovery
2022-03-14T22:00:37.352486Z [info     ]                                name=tap-airtable stdio=stderr type=discovery
2022-03-14T22:00:37.356984Z [debug    ] Visiting CatalogNode.STREAM at '.streams[0]'.
2022-03-14T22:00:37.357500Z [debug    ] Setting '.streams[0].selected' to 'False'
2022-03-14T22:00:37.357637Z [debug    ] Setting '.streams[0].selected' to 'True'
2022-03-14T22:00:37.357795Z [debug    ] Skipping node at '.streams[0].tap_stream_id'
2022-03-14T22:00:37.357914Z [debug    ] Skipping node at '.streams[0].database_name'
2022-03-14T22:00:37.358022Z [debug    ] Skipping node at '.streams[0].table_name'
2022-03-14T22:00:37.358137Z [debug    ] Skipping node at '.streams[0].key_properties[0]'
2022-03-14T22:00:37.358281Z [debug    ] Visiting CatalogNode.PROPERTY at '.streams[0].schema.properties.id'.
2022-03-14T22:00:37.358413Z [debug    ] Skipping node at '.streams[0].schema.properties.id.inclusion'
2022-03-14T22:00:37.358534Z [debug    ] Visiting CatalogNode.PROPERTY at '.streams[0].schema.properties.Autonumber'.
2022-03-14T22:00:37.358655Z [debug    ] Skipping node at '.streams[0].schema.properties.Autonumber.inclusion'
2022-03-14T22:00:37.358776Z [debug    ] Skipping node at '.streams[0].schema.properties.Signup date.inclusion'
2022-03-14T22:00:37.358893Z [debug    ] Skipping node at '.streams[0].schema.properties.Signup date.type[0]'
2022-03-14T22:00:37.359037Z [debug    ] Skipping node at '.streams[0].schema.properties.Signup date.type[1]'
2022-03-14T22:00:37.359148Z [debug    ] Skipping node at '.streams[0].schema.properties.Claim ID.inclusion'
2022-03-14T22:00:37.359251Z [debug    ] Skipping node at '.streams[0].schema.properties.Claim ID.type[0]'
2022-03-14T22:00:37.359350Z [debug    ] Skipping node at '.streams[0].schema.properties.Claim ID.type[1]'
...
What could be the reason?
v
Because they are not selected
x
Hi! Sorry to bother you again. After I ran meltano select --all tap-airtable, the log still shows the same "Skipping node" messages.
When I run meltano select tap-airtable --list, I get:
Copy code
Legend:
	selected
	excluded
	automatic

Enabled patterns:
	*.*

Selected attributes:
	[selected ] tbl0AJPrnjs3ojBo9.author
	[selected ] tbl0AJPrnjs3ojBo9.case
	[selected ] tbl0AJPrnjs3ojBo9.checked
	[selected ] tbl0AJPrnjs3ojBo9.createdDate
	[selected ] tbl0AJPrnjs3ojBo9.date
	[automatic] tbl0AJPrnjs3ojBo9.id
	[selected ] tbl0AJPrnjs3ojBo9.message
	[selected ] tbl0AJPrnjs3ojBo9.reminderId
	[selected ] tbl0AJPrnjs3ojBo9.type
It seems that the tables are selected
But I still get nothing
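For reference, a minimal sketch of how select patterns relate to the stream.property names in the listing above. It assumes shell-style glob matching with a leading "!" meaning exclude; is_selected is a hypothetical helper for illustration, not Meltano's actual code.

```python
from fnmatch import fnmatchcase


def is_selected(attribute, patterns):
    """Return True if `attribute` matches an include pattern and no exclude pattern.

    Hypothetical helper: patterns starting with "!" are treated as exclusions.
    """
    included = any(
        fnmatchcase(attribute, p) for p in patterns if not p.startswith("!")
    )
    excluded = any(
        fnmatchcase(attribute, p[1:]) for p in patterns if p.startswith("!")
    )
    return included and not excluded


# The stream ID is the Airtable table ID from the listing, not a display name.
print(is_selected("tbl0AJPrnjs3ojBo9.author", ["*.*"]))
print(is_selected("tbl0AJPrnjs3ojBo9.author", ["table_name.*"]))
```

Under these assumptions, a pattern like table_name.* only matches if the stream ID is literally table_name, so it is worth comparing patterns against the IDs that meltano select --list prints.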
v
new logs with debug please!
x
Untitled.txt
v
Can you post the meltano.yml as well? From that log I see: Extract & load complete! job_id=2022-03-17T141958--tap-airtable--target-postgres--airtable name=meltano run_id=b0a52a63-2ff4-4be2-b945-ce89818dd462
For fun, you could try a different job_id, just to see if maybe it's state messing this up (it's not clear to me that this is what's happening).
x
Although it says complete, no table is created
Copy code
- name: tap-airtable
  namespace: tap_airtable
  pip_url: git+https://github.com/goes-funky/tap-airtable.git
  executable: tap-airtable
  capabilities:
  - catalog
  - discover
  config:
    metadata_url: https://api.airtable.com/v2/meta/
    records_url: https://api.airtable.com/v0/
    token: $AIRTABLE_TOKEN
    base_id: appmmCT4fEgPEjMt6
    selected_by_default: true
    remove_emojis: true
  select:
  - '*.*'
- name: target-postgres--airtable
  inherit_from: target-postgres
  config:
    host: $LOADER_PG_ADDRESS_A
    port: $LOADER_PG_PORT_A
    dbname: $LOADER_PG_DATABASE_A
    user: $LOADER_PG_USERNAME_A
    password: $LOADER_PG_PASSWORD_A
    default_target_schema: airtable
I tried changing the job_id, and it does not help at all. What annoys me is that it always reports success while there is obviously something wrong here.
v
Yeah, something smells here, and it's something with this tap. This line from the README clued me in:
tap-airtable -c config.json --properties properties.json
This is an older tap, so it doesn't use catalog; it uses properties. Try changing
Copy code
capabilities:
    - catalog
    - discover
To
Copy code
capabilities:
    - properties
    - discover
and state shouldn't matter at all because we have state turned off (the tap doesn't support it, so it seems)
That was it I'd guess, let us know how it goes
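A rough sketch of why the declared capability matters, assuming the runner chooses which selection flag to pass to the tap based on the capabilities list. build_tap_args is a hypothetical helper for illustration only, not Meltano's actual implementation.

```python
from typing import List


def build_tap_args(capabilities: List[str], config_path: str, selection_path: str) -> List[str]:
    """Return the CLI arguments a runner might pass to a Singer tap.

    Hypothetical logic: prefer --catalog, fall back to --properties.
    """
    args = ["--config", config_path]
    if "catalog" in capabilities:
        args += ["--catalog", selection_path]
    elif "properties" in capabilities:
        # Older taps read the stream selection from --properties instead.
        args += ["--properties", selection_path]
    return args


print(build_tap_args(["properties", "discover"], "config.json", "properties.json"))
```

With the properties capability declared, the resulting arguments line up with the README invocation quoted above; with catalog declared, the tap would be handed a flag that, per the README, it does not use.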
x
I will try now
Actually now, it shows me some errors!
It gave me this error, which I couldn’t understand, when selecting all:
```
2022-03-17T15:35:15.742757Z [info ] INFO will import Rejected (Typeform) cmd_type=extractor job_id=insurello-export-airtable name=tap-airtable run_id=50f77f78-c903-4ff8-be6e-2e49a70a5f11 stdio=stderr
2022-03-17T15:35:16.218804Z [info ] INFO METRIC {"type": "counter", "metric": "page", "value": 0} cmd_type=extractor job_id=insurello-export-airtable name=tap-airtable run_id=50f77f78-c903-4ff8-be6e-2e49a70a5f11 stdio=stderr
2022-03-17T15:35:16.221421Z [debug ] {"type": "SCHEMA", "stream": "rejected_typeform", "schema": {"properties": {"Autonumber": {"inclusion": "automatic", "type": ["null", "number"]}, "Signup date": {"inclusion": "available", "type": ["null", "string"]}, "Claim ID": {"inclusion": "available", "type": ["null", "string"]}, "Name": {"inclusion": "available", "type": ["null", "string"]}, "Last name": {"inclusion": "available", "type": ["null", "string"]}, "Personal number": {"inclusion": "available", "type": ["null", "string"]}, "Email": {"inclusion": "available", "type": ["null", "string"]}, "Phone": {"inclusion": "available", "type": ["null", "string"]}, "Who": {"inclusion": "available", "type": ["null", "string"]}, "Diagnosis": {"inclusion": "available", "type": ["null", "string"]}, "Date": {"inclusion": "available", "format": "date", "type": ["null", "string"]}, "Healthcare facility": {"inclusion": "available", "type": ["null", "string"]}, "Private insurance": {"inclusion": "available", "type": ["null", "string"]}, "Insurance company": {"inclusion": "available", "type": ["null", "string"]}, "Union member": {"inclusion": "available", "type": ["null", "string"]}, "Union": {"inclusion": "available", "type": ["null", "string"]}, "Partner union member": {"inclusion": "available", "type": ["null", "string"]}, "Partner union": {"inclusion": "available", "type": ["null", "string"]}, "Diagnosis reported": {"inclusion": "available", "type": ["null", "string"]}, "Reported Insurance company": {"inclusion": "available", "type": ["null", "string"]}, "Compensation Insurance company": {"inclusion": "available", "type": ["null", "string"]}, "More details": {"inclusion": "available", "type": ["null", "string"]}, "Notes ID": {"inclusion": "available", "type": ["null", "string"]}, "Previous compensation": {"inclusion": "available", "type": ["null", "string"]}, "Diagnosis type (Other)": {"inclusion": "available", "type": ["null", "string"]}, "Status": {"inclusion": "available", "type": ["null", "string"]}, "Coinsured": {"inclusion": "available", "type": ["null", "string"]}, "Previously reported to IC": {"inclusion": "available", "type": ["null", "string"]}, "PoA - Status": {"inclusion": "available", "type": ["null", "string"]}, "PoA - URL": {"inclusion": "available", "type": ["null", "string"]}, "Cover letter - URL": {"inclusion": "available", "type": ["null", "string"]}, "Duplicated": {"inclusion": "available", "type": ["null", "string"]}, "id": {"inclusion": "automatic", "type": ["null", "string"]}}}, "key_properties": ["Autonumber"]} cmd_type=extractor job_id=insurello-export-airtable name=tap-airtable (out) run_id=50f77f78-c903-4ff8-be6e-2e49a70a5f11 stdio=stdout
2022-03-17T15:35:16.223047Z [info ] ERROR 'real_name' cmd_type=extractor job_id=insurello-export-airtable name=tap-airtable run_id=50f77f78-c903-4ff8-be6e-2e49a70a5f11 stdio=stderr
2022-03-17T15:35:16.223966Z [info ] ERROR 'real_name' cmd_type=extractor job_id=insurello-export-airtable name=tap-airtable run_id=50f77f78-c903-4ff8-be6e-2e49a70a5f11 stdio=stderr
2022-03-17T15:35:16.224343Z [info ] Traceback (most recent call last): cmd_type=extractor job_id=insurello-export-airtable name=tap-airtable run_id=50f77f78-c903-4ff8-be6e-2e49a70a5f11 stdio=stderr
2022-03-17T15:35:16.224635Z [info ] File "/Users/xingeli/data-pipeline/.meltano/extractors/tap-airtable/venv/lib/python3.8/site-packages/tap_airtable/__init__.py", line 17, in main cmd_type=extr…
```
I cannot figure out why, any help? thank you!
v
Well, that's progress: there's data!
@xinge_li can you put a MR into the hub to fix https://hub.meltano.com/taps/airtable (replace catalog with properties)
x
Hi! Yes 🙂
v
Copy code
2022-03-17T15:35:16.226426Z [info     ]   File "/Users/xingeli/data-pipeline/.meltano/extractors/tap-airtable/venv/lib/python3.8/site-packages/tap_airtable/services/__init__.py", line 193, in _find_column cmd_type=extractor job_id=insurello-export-airtable name=tap-airtable run_id=50f77f78-c903-4ff8-be6e-2e49a70a5f11 stdio=stderr
2022-03-17T15:35:16.226736Z [info     ]     return m["metadata"]["real_name"] cmd_type=extractor job_id=insurello-export-airtable name=tap-airtable run_id=50f77f78-c903-4ff8-be6e-2e49a70a5f11 stdio=stderr
2022-03-17T15:35:16.227610Z [info     ] KeyError: 'real_name'          cmd_type=extractor job_id=insurello-export-airtable name=tap-airtable run_id=50f77f78-c903-4ff8-be6e-2e49a70a5f11 stdio=stderr
I'd go look at the code at that line and see what's going on. real_name doesn't exist in metadata, which is probably the catalog. I'd look around at the catalog (dump it) and see what's going on.
Might be a good time to go back to meltano select --list and look at everything on the list; maybe just pick one stream that you want instead of all of them to do some debugging 🤷
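One way to look around in a dumped catalog is sketched below. It assumes the usual Singer layout (a top-level "streams" list, each stream carrying a "metadata" list of entries with "breadcrumb" and "metadata" keys); entries_missing_key and the sample data are made up for illustration and are not taken from this tap.

```python
import json


def entries_missing_key(catalog, key="real_name"):
    """Yield (stream id, breadcrumb) for every metadata entry that lacks `key`."""
    for stream in catalog.get("streams", []):
        for entry in stream.get("metadata", []):
            if key not in entry.get("metadata", {}):
                yield stream.get("tap_stream_id"), entry.get("breadcrumb")


# Tiny made-up catalog; in practice, load the dumped file with json.load().
sample = json.loads("""
{
  "streams": [
    {
      "tap_stream_id": "tbl123",
      "metadata": [
        {"breadcrumb": [], "metadata": {"selected": true}},
        {"breadcrumb": ["properties", "Name"],
         "metadata": {"real_name": "Name", "selected": true}}
      ]
    }
  ]
}
""")

for stream_id, breadcrumb in entries_missing_key(sample):
    print(stream_id, breadcrumb)
```

Any entry this prints is one where a lookup like m["metadata"]["real_name"] would raise KeyError.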
x
I did look at the code itself. From running --discover, I figured out that real_name is basically the name you entered in Airtable, so it's odd that it complains about it not existing. I will try to dump the catalog and see what’s going on.
I did select a specific table too, and the error is the same.
It seems the error appears regardless of whether I pick a stream or a column in a stream. Nothing looks strange in the dump.
v
Time to hop into the code to look at whatever real_name is
l
I also had these exact same issues with this tap and was able to fix the real_name errors by adding:
Copy code
if m["metadata"].get("real_name"):
    return m["metadata"]["real_name"]
https://github.com/goes-funky/tap-airtable/blob/d97e169c2d34db7163b5db36683e0215b2f0ac1f/tap_airtable/services/__init__.py#L193
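For context, here is a self-contained sketch of the same guard. The function name, the breadcrumb matching, and the fallback to the column name are all assumptions made for illustration; this is not the tap's actual code at the linked line.

```python
from typing import List, Optional


def find_real_name(metadata_entries: List[dict], column: str) -> Optional[str]:
    """Look up the Airtable display name for `column` without raising KeyError.

    Hypothetical stand-in for the tap's column lookup.
    """
    for m in metadata_entries:
        breadcrumb = m.get("breadcrumb") or []
        if breadcrumb and breadcrumb[-1] == column:
            # .get() tolerates entries that carry no "real_name" key;
            # falling back to the column name is an assumption of this sketch.
            return m.get("metadata", {}).get("real_name") or column
    return None


entries = [
    {"breadcrumb": [], "metadata": {"selected": True}},
    {"breadcrumb": ["properties", "signup_date"], "metadata": {"real_name": "Signup date"}},
    {"breadcrumb": ["properties", "id"], "metadata": {}},
]
print(find_real_name(entries, "signup_date"))
print(find_real_name(entries, "id"))
```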