# plugins-general
j
Hi guys, I'm very new to Meltano and I'm trying to pull some shopify data and replicate to bigquery, but it seems to be skipping all entities. Any idea why?
$ meltano elt tap-shopify target-bigquery 
meltano         | Running extract & load...
meltano         | Found catalog in extract/tap-shopify.catalog.json
tap-shopify     | INFO Skipping stream: orders
tap-shopify     | INFO Skipping stream: collects
tap-shopify     | INFO Skipping stream: products
tap-shopify     | INFO Skipping stream: transactions
tap-shopify     | INFO Skipping stream: abandoned_checkouts
tap-shopify     | INFO Skipping stream: metafields
tap-shopify     | INFO Skipping stream: custom_collections
tap-shopify     | INFO Skipping stream: customers
tap-shopify     | INFO Skipping stream: order_refunds
tap-shopify     | INFO ----------------------
tap-shopify     | INFO ----------------------
target-bigquery | INFO Pushing state: {}
target-bigquery | INFO Pushing state: {}
meltano         | Incremental state has been updated at 2021-05-21 19:29:00.293368.
meltano         | Incremental state has been updated at 2021-05-21 19:29:00.299481.
meltano         | Extract & load complete!
meltano         | Transformation skipped.
My `.env` has:
TAP_SHOPIFY_SHOP=<shop-name>
TAP_SHOPIFY_START_DATE=2021-05-01T00:00:00Z
TAP_SHOPIFY_API_KEY=<shop-key>

TARGET_BIGQUERY_PROJECT_ID=<bq-project>
TARGET_BIGQUERY_DATASET_ID=<bq-dataset>
TARGET_BIGQUERY_ADD_METADATA_COLUMNS=true
TARGET_BIGQUERY_REPLICATION_METHOD=truncate
TARGET_BIGQUERY_TABLE_PREFIX=meltano_
TARGET_BIGQUERY_PRIMARY_KEY_REQUIRED=true

TAP_SHOPIFY__CATALOG=extract/tap-shopify.catalog.json
My `meltano.yml`:
version: 1
send_anonymous_usage_stats: true
project_id: 2d8cbfe0-514e-4dd0-8711-efbf2148c262
plugins:
  extractors:
    - name: tap-shopify
      variant: singer-io
      pip_url: tap-shopify
      select:
        - orders.*
        - transactions.*
        - products.*
  loaders:
    - name: target-bigquery
      variant: adswerve
      pip_url: git+https://github.com/adswerve/target-bigquery.git@v0.10.2
And I got my catalog file by running:
meltano invoke tap-shopify --discover > extract/tap-shopify.catalog.json
d
Check the top-level metadata element for each stream in the catalog, the one where the breadcrumb is `[]`.
"metadata": [
  {
    "breadcrumb": [],
    "metadata": {
      "table-key-properties": [
        "id"
      ],
      "inclusion": "available",
      "selected": true
    }
  },
  ...
]
You might need to add `"selected": true` there.
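If you'd rather script that than edit the catalog by hand, here's a rough Python sketch. It assumes the usual Singer catalog layout (a top-level `streams` list where each stream carries a `metadata` list); the function name `select_streams` and the tiny sample catalog are made up for illustration.

```python
def select_streams(catalog, stream_names):
    """Set "selected": true on the top-level (breadcrumb == []) metadata
    entry of each named stream. Returns the names that were updated."""
    updated = []
    for stream in catalog.get("streams", []):
        name = stream.get("tap_stream_id") or stream.get("stream")
        if name not in stream_names:
            continue
        for entry in stream.get("metadata", []):
            if entry.get("breadcrumb") == []:
                entry.setdefault("metadata", {})["selected"] = True
                updated.append(name)
                break
    return updated


# Made-up, heavily trimmed catalog (illustration only).
sample = {
    "streams": [
        {"tap_stream_id": "orders", "metadata": [
            {"breadcrumb": [], "metadata": {"inclusion": "available"}}]},
        {"tap_stream_id": "collects", "metadata": [
            {"breadcrumb": [], "metadata": {"inclusion": "available"}}]},
    ]
}
print(select_streams(sample, {"orders"}))
```

For the real file you'd load `extract/tap-shopify.catalog.json` with `json.load`, call the function, and write the result back with `json.dump`.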
j
Thanks @dan_ladd! It seems that it's triggering now. So it seems I don't need to set this:
# meltano.yml
select:
  - orders.*
  - transactions.*
  - products.*
Right?
How can I use this schema from the catalog to solve this?
CRITICAL 'type' or 'anyOf' are required fields in property: {}
d
I don't do the select in `meltano.yml`, so I assume not.
a
> How can I use this schema from the catalog to solve this?
> CRITICAL 'type' or 'anyOf' are required fields in property: {}
@jose_ribeiro - Your experience may vary but I’ve mostly seen this when the tap cannot detect the data types for some reason. Can you tell if any properties in your schema are missing types?
d
CRITICAL 'type' or 'anyOf' are required fields in property: {}
Seems unique to target-bigquery. Can you search the catalog for `{}` and see if you find anything?
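A plain text search works, but it will also match things like `"Pushing state: {}"`-style empties that aren't schemas. A small Python sketch that only looks at schema properties might be easier on a big catalog. It assumes each stream's `schema` is ordinary JSON Schema with `properties` and `items`; it does not descend into `anyOf`, and `find_empty_schemas` plus the sample schema (including the `price_set` field) are invented for the example.

```python
def find_empty_schemas(schema, path=""):
    """Return dotted paths of properties whose JSON schema is an empty {}."""
    found = []
    for name, sub in schema.get("properties", {}).items():
        child = f"{path}.{name}" if path else name
        if sub == {}:
            found.append(child)
        elif isinstance(sub, dict):
            found.extend(find_empty_schemas(sub, child))
    items = schema.get("items")
    if items == {}:
        found.append(f"{path}[]")
    elif isinstance(items, dict):
        found.extend(find_empty_schemas(items, f"{path}[]"))
    return found


# Made-up schema fragment (illustration only).
orders_schema = {
    "type": "object",
    "properties": {
        "id": {"type": ["null", "integer"]},
        "total_price_set": {},
        "line_items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {"price_set": {}},
            },
        },
    },
}
print(find_empty_schemas(orders_schema))
# ['total_price_set', 'line_items[].price_set']
```

Run it once per entry in the catalog's `streams` list, passing that stream's `schema`.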
d
I think tap-shopify has some hard-coded `type: {}`s, which is not going to be supported by many targets that use that type to know what kind of table column to create in the destination database.
j
Yep, I have a few, like:
"subtotal_price_set": {},
"total_discounts_set": {},
"total_line_items_price_set": {},
"total_price_set": {},
"total_shipping_price_set": {},
"total_tax_set": {},
So I need to specify them one by one, I suppose.
d
yea if you fill those in with `type: ["null", "string"]`, etc. it should work
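If there are a lot of them, filling them in could be scripted. A minimal Python sketch, assuming the empty entries sit under `properties` (or as array `items`) in each stream's `schema`; `fill_empty_schemas` is a made-up helper name, and `["null", "string"]` is only the default suggested here, so pass a different `default_type` where a field really holds something else.

```python
def fill_empty_schemas(schema, default_type=("null", "string")):
    """Replace every {} property schema with {"type": [...]} in place.
    Returns how many entries were filled in."""
    count = 0
    props = schema.get("properties", {})
    for name, sub in props.items():
        if sub == {}:
            props[name] = {"type": list(default_type)}
            count += 1
        elif isinstance(sub, dict):
            count += fill_empty_schemas(sub, default_type)
    items = schema.get("items")
    if items == {}:
        schema["items"] = {"type": list(default_type)}
        count += 1
    elif isinstance(items, dict):
        count += fill_empty_schemas(items, default_type)
    return count


# Made-up schema fragment using two of the fields from this thread.
orders_schema = {
    "type": "object",
    "properties": {
        "id": {"type": ["null", "integer"]},
        "subtotal_price_set": {},
        "total_tax_set": {},
    },
}
print(fill_empty_schemas(orders_schema))
print(orders_schema["properties"]["total_tax_set"])
```

Properties that already have a type are left untouched, so it should be safe to run over the whole catalog more than once.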
d
> It seems that I don't need to set the `select`

If you're specifying your own catalog file, `select` will be ignored. Do you have a particular reason to specify your own catalog instead of using `select`, `metadata`, and `schema` to modify it dynamically? https://meltano.com/docs/integration.html#extractor-catalog-generation
j
I'm gonna try this out! thanks guys!
a
@douwe_maan - In the above, it looks like the select rules were not applied as he’d expected. From my quick check of his yaml, I didn’t see anything wrong with the select clauses.
d
@aaronsteers `.env` has:
TAP_SHOPIFY__CATALOG=extract/tap-shopify.catalog.json
Per https://meltano.com/docs/plugins.html#select-extra:
> These rules are not applied when a catalog is provided manually.
@jose_ribeiro In your case, I suggest using Meltano's dynamic catalog generation instead of manually generating it and setting it using https://meltano.com/docs/plugins.html#catalog-extra, so that you get nice selection rules like the ones you set
d
Either way, @jose_ribeiro needs to provide the catalog given the missing types, right?
j
I tried that first, @douwe_maan, but I got the "anyOf" error, so then I tried to set the catalog by hand.
it's running now! thanks @dan_ladd @douwe_maan @aaronsteers!
p
yep - my workaround for the missing types with a bigquery target has also been to supply a catalog with the types filled in