casey
02/17/2021, 3:55 AM
`tap-postgres` with `target-bigquery`. Both are "standard" flavors: transferwise and adswerve, respectively. But when I come upon a record that has an array type, I get the following error:
target-bigquery | CRITICAL 'type' or 'anyOf' are required fields in property: {'$ref': '#/definitions/sdc_recursive_string_array'}
target-bigquery | CRITICAL Traceback (most recent call last):
  File "/home/chb/code/work/learning_equality/le-pipeline/meltano-projects/get-start/.meltano/loaders/target-bigquery/venv/lib/python3.7/site-packages/target_bigquery/__init__.py", line 93, in main
    for state in state_iterator:
  File "/home/chb/code/work/learning_equality/le-pipeline/meltano-projects/get-start/.meltano/loaders/target-bigquery/venv/lib/python3.7/site-packages/target_bigquery/process.py", line 40, in process
    for s in handler.handle_record_message(msg):
  File "/home/chb/code/work/learning_equality/le-pipeline/meltano-projects/get-start/.meltano/loaders/target-bigquery/venv/lib/python3.7/site-packages/target_bigquery/processhandler.py", line 110, in handle_record_message
    new_rec = filter_by_schema(schema, msg.record)
  File "/home/chb/code/work/learning_equality/le-pipeline/meltano-projects/get-start/.meltano/loaders/target-bigquery/venv/lib/python3.7/site-packages/target_bigquery/schema.py", line 75, in filter
    record[key]) # adswerve fix to match schema field name
  File "/home/chb/code/work/learning_equality/le-pipeline/meltano-projects/get-start/.meltano/loaders/target-bigquery/venv/lib/python3.7/site-packages/target_bigquery/schema.py", line 84, in filter
    prop_type, _ = get_type(props)
  File "/home/chb/code/work/learning_equality/le-pipeline/meltano-projects/get-start/.meltano/loaders/target-bigquery/venv/lib/python3.7/site-packages/target_bigquery/schema.py", line 23, in get_type
    f"'type' or 'anyOf' are required fields in property: {property}"
ValueError: 'type' or 'anyOf' are required fields in property: {'$ref': '#/definitions/sdc_recursive_string_array'}
casey
02/17/2021, 4:11 AM
`.meltano/run/tap-postgres/tap.properties.json` and I see what appear to be type definitions:
"definitions": {
  ...
  "sdc_recursive_string_array": {
    "type": [
      "null",
      "string",
      "array"
    ],
    "items": {"$ref": "#/definitions/sdc_recursive_string_array"}
  },
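A note on the error above: the message says the target looks for a `type` or `anyOf` key on each property, while the failing property only carries a `$ref` into the `definitions` block quoted here. Below is a minimal Python sketch of what resolving such a local reference could look like. `resolve_local_ref`, `has_type_info`, and the property name `my_array_column` are hypothetical and written only for illustration; they are not functions or names from target-bigquery or tap-postgres. Only one level is resolved, because `sdc_recursive_string_array` refers to itself.

```python
from typing import Any, Dict

# Trimmed copy of the schema fragment quoted above; "my_array_column" is a
# made-up property name used only for this illustration.
SCHEMA: Dict[str, Any] = {
    "definitions": {
        "sdc_recursive_string_array": {
            "type": ["null", "string", "array"],
            "items": {"$ref": "#/definitions/sdc_recursive_string_array"},
        }
    },
    "properties": {
        "my_array_column": {"$ref": "#/definitions/sdc_recursive_string_array"}
    },
}


def has_type_info(prop: Dict[str, Any]) -> bool:
    """Mirror the condition named in the error message."""
    return "type" in prop or "anyOf" in prop


def resolve_local_ref(prop: Dict[str, Any], schema: Dict[str, Any]) -> Dict[str, Any]:
    """Replace a local '#/...' reference with the definition it points to.

    Only one level is resolved: the definition is self-referential, so
    expanding it fully would never terminate.
    """
    ref = prop.get("$ref")
    if ref is None:
        return prop
    if not ref.startswith("#/"):
        raise ValueError(f"only local references are handled here: {ref}")
    node: Any = schema
    for part in ref[2:].split("/"):
        node = node[part]
    return node


raw = SCHEMA["properties"]["my_array_column"]
resolved = resolve_local_ref(raw, SCHEMA)
print(has_type_info(raw))       # False
print(has_type_info(resolved))  # True
print(resolved["type"])
```

Whether a given target resolves references like this, or expects the tap to emit already-resolved schemas, depends on the variant in use.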
casey
02/17/2021, 4:13 AM
casey
02/17/2021, 4:41 AM
Integer
(Default: 0) Object type RECORD items from taps can be loaded into VARIANT columns as JSON (default) or we can flatten the schema by creating columns automatically.
When value is 0 (default) then flattening functionality is turned off."
casey
02/17/2021, 4:55 AM
douwe_maan
02/17/2021, 3:47 PM
douwe_maan
02/17/2021, 3:48 PM
casey
02/17/2021, 5:03 PM
douwe_maan
02/17/2021, 9:24 PM
douwe_maan
02/17/2021, 9:25 PM
casey
02/17/2021, 10:01 PM
casey
02/19/2021, 12:21 AM
douwe_maan
02/19/2021, 12:22 AM
casey
02/19/2021, 12:23 AM
douwe_maan
02/19/2021, 12:24 AM
casey
02/19/2021, 12:26 AM
douwe_maan
02/19/2021, 12:26 AM
douwe_maan
02/19/2021, 12:26 AM
douwe_maan
02/19/2021, 12:27 AM
douwe_maan
02/19/2021, 12:27 AM
casey
02/19/2021, 12:30 AM
casey
02/19/2021, 12:30 AM
douwe_maan
02/19/2021, 12:30 AM
douwe_maan
02/19/2021, 12:30 AM
`pip install <pip_url>`, it doesn't specifically inject any new dependencies
casey
02/19/2021, 12:31 AM
douwe_maan
02/19/2021, 12:32 AM
casey
02/19/2021, 12:32 AM
casey
02/19/2021, 1:47 AM
pipelinewise-singer-python==1.2.0
pipelinewise-tap-postgres==1.7.1
pipelinewise-target-bigquery==1.0.1
but the problem is that, if I let the version of meltano "float" so to speak, pip ends up installing `0.15.1`. No joke: I saw it literally iterate over each version, from 1.69 to 0.15.1, trying to find something that was compatible with the previously installed packages from pipelinewise. The deps at the heart of the matter are simplejson and jsonschema. It's as if the various maintainers agreed to pin versions in an interleaved fashion, with the transferwise/pipelinewise team pinning on every other range of versions of these two libraries and meltano choosing the ranges in between.
casey
02/19/2021, 1:53 AM
meltano 1.69.0 requires jsonschema<3.0.0,>=2.6.0, but you have jsonschema 3.2.0 which is incompatible.
meltano 1.69.0 requires simplejson<4.0.0,>=3.16.0, but you have simplejson 3.11.1 which is incompatible
and for meltano 1.68
meltano 1.68.0 depends on simplejson<4.0.0 and >=3.16.0
pipelinewise-singer-python 1.2.0 depends on simplejson==3.11.1
and for meltano 1.67
The conflict is caused by:
meltano 1.67.0 depends on jsonschema<3.0.0 and >=2.6.0
pipelinewise-singer-python 1.2.0 depends on jsonschema==3.2.0
meltano 1.67.0 depends on jsonschema<3.0.0 and >=2.6.0
pipelinewise-singer-python 1.1.4 depends on jsonschema==3.2.0
meltano 1.67.0 depends on jsonschema<3.0.0 and >=2.6.0
pipelinewise-singer-python 1.1.3 depends on jsonschema==3.2.0
meltano 1.67.0 depends on simplejson<4.0.0 and >=3.16.0
pipelinewise-singer-python 1.1.2 depends on simplejson==3.11.1
meltano 1.67.0 depends on simplejson<4.0.0 and >=3.16.0
pipelinewise-singer-python 1.1.1 depends on simplejson==3.11.1
meltano 1.67.0 depends on simplejson<4.0.0 and >=3.16.0
pipelinewise-singer-python 1.1.0 depends on simplejson==3.11.1
meltano 1.67.0 depends on simplejson<4.0.0 and >=3.16.0
pipelinewise-singer-python 1.0.0 depends on simplejson==3.11.1
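To make the conflict in the pip output above concrete, here is a small Python sketch that checks the exact pins of `pipelinewise-singer-python` 1.2.0 against the ranges meltano asks for. It is a simplified illustration, not pip's resolver: `parse` and `in_range` are hypothetical helpers that only understand plain dotted numeric versions.

```python
from typing import Tuple

Version = Tuple[int, ...]


def parse(version: str) -> Version:
    """Turn '3.11.1' into (3, 11, 1). Handles plain numeric versions only."""
    return tuple(int(part) for part in version.split("."))


def in_range(version: str, lower: str, upper: str) -> bool:
    """True when lower <= version < upper."""
    return parse(lower) <= parse(version) < parse(upper)


# Ranges and pins copied from the pip output above.
meltano_ranges = {
    "jsonschema": ("2.6.0", "3.0.0"),
    "simplejson": ("3.16.0", "4.0.0"),
}
singer_python_pins = {
    "jsonschema": "3.2.0",
    "simplejson": "3.11.1",
}

for package, (lower, upper) in meltano_ranges.items():
    pinned = singer_python_pins[package]
    ok = in_range(pinned, lower, upper)
    print(f"{package}=={pinned} within >={lower},<{upper}: {ok}")
```

Both checks come out false, which matches what pip reports: no single environment can hold meltano and these pinned versions at the same time.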
douwe_maan
02/19/2021, 4:12 PM
`(pipelinewise-)singer-python` where `meltano` is installed, if you're using `meltano add` to add specific plugins.
douwe_maan
02/19/2021, 4:29 PM
jake_hannan
02/19/2021, 5:09 PM
tap-postgres
target-bigquery
so jumping on this thread to follow updates 🙂
douwe_maan
02/19/2021, 5:10 PM
jake_hannan
02/19/2021, 5:26 PM
casey
02/19/2021, 5:31 PM
`simplejson`. So I just installed those via `pip install <git url>`. I cloned meltano and changed the `poetry.lock` and `pyproject.toml` files such that the `jsonschema` is `3.2.0` (it's at `2.6.0`, iirc). Then I did a `poetry build` in the `meltano` project directory. That gets me a wheel file that I then can use with `pip install ./meltano/dist/wheelfilename`.
douwe_maan
02/19/2021, 5:34 PM
`jsonschema`, a tap uses another, and a target uses another still.
douwe_maan
02/19/2021, 5:35 PM
`pip install meltano`, and then adding new plugins (taps and targets) using `meltano add --custom`, so that they get their own virtual environment.
douwe_maan
02/19/2021, 5:35 PM
casey
02/19/2021, 5:36 PM
casey
02/19/2021, 5:54 PM
douwe_maan
02/19/2021, 5:54 PM
douwe_maan
02/19/2021, 5:54 PM
casey
02/19/2021, 7:48 PM
`meltano invoke --plugin-type extractor pipelinewise-tap-postgres --help` fails after I add it as a custom plugin. Here's my `meltano.yml`:
version: 1
send_anonymous_usage_stats: true
project_id: 77721c56-636c-4ea0-af68-95f764fc9319
plugins:
  extractors:
  - name: pipelinewise-tap-postgres
    namespace: pipelinewise_tap_postgres
    pip_url: git+https://github.com/transferwise/pipelinewise-tap-postgres.git
    executable: pipelinewise-tap-postgres
    capabilities:
    - discover
    - state
    settings:
    - name: host
    - name: port
    - name: user
    - name: password
    - name: dbname
    - name: filter_schemas
    - name: ssl
    - name: logical_poll_seconds
    - name: break_at_end_lsn
    - name: max_run_seconds
    - name: debug_lsn
    config:
      host: x.x.x.x
      user: usrnamex
      dbname: dbnamex
casey
02/19/2021, 7:54 PM
[2021-02-19 14:53:14] meltano install extractor pipelinewise-tap-postgres
Installing 1 plugins...
Installing extractor 'pipelinewise-tap-postgres'...
Installed extractor 'pipelinewise-tap-postgres'
[chb] le-pipeline/meltano/le-meltano via v3.8.7 (le-meltano)
[2021-02-19 14:54:05] meltano invoke --plugin-type extractor pipelinewise-tap-postgres --help
Executable 'pipelinewise-tap-postgres' could not be found. Extractor 'pipelinewise-tap-postgres' may not have been installed yet using `meltano install extractor pipelinewise-tap-postgres`, or the executable name may be incorrect.
[chb] le-pipeline/meltano/le-meltano via v3.8.7 (le-meltano)
[2021-02-19 14:54:21]
casey
02/19/2021, 8:11 PM
[chb] leq-meltano via v3.8.7 (le-meltano)
[2021-02-19 15:05:44] meltano add --custom extractor pipelinewise-tap-postgres
Adding new custom extractor with name 'pipelinewise-tap-postgres'...
Specify the plugin's namespace, which will serve as the:
- identifier to find related/compatible plugins
- default database schema (`load_schema` extra),
for use by loaders that support a target schema
Hit Return to accept the default: plugin name with underscores instead of dashes
(namespace) [pipelinewise_tap_postgres]:
Specify the plugin's `pip install` argument, for example:
- PyPI package name:
pipelinewise-tap-postgres
- Git repository URL:
git+https://gitlab.com/meltano/pipelinewise-tap-postgres.git
- local directory, in editable/development mode:
-e extract/pipelinewise-tap-postgres
Default: plugin name as PyPI package name
(pip_url) [pipelinewise-tap-postgres]: git+https://github.com/transferwise/pipelinewise-tap-postgres.git
Specify the package's executable name
Default: package name derived from `pip_url`
(executable) [pipelinewise-tap-postgres]:
Specify the tap's supported Singer features (executable flags), for example:
`catalog`: supports the `--catalog` flag
`discover`: supports the `--discover` flag
`properties`: supports the `--properties` flag
`state`: supports the `--state` flag
To find out what features a tap supports, reference its documentation or try one
of the tricks under https://meltano.com/docs/contributor-guide.html#how-to-test-a-tap.
Multiple capabilities can be separated using commas.
Default: no capabilities
(capabilities) [[]]: discover,state
Specify the tap's supported settings (`config.json` keys)
Nested properties can be represented using the `.` separator,
e.g. `auth.username` for `{ "auth": { "username": value } }`.
To find out what settings a tap supports, reference its documentation.
Multiple setting names (keys) can be separated using commas.
Default: no settings
(settings) [[]]: host,port,user,password,dbname,filter_schemas,ssl,logical_poll_seconds,break_at_end_lsn,max_run_seconds,debug_lsn
Added extractor 'pipelinewise-tap-postgres' to your Meltano project
Installing extractor 'pipelinewise-tap-postgres'...
Installed extractor 'pipelinewise-tap-postgres'
[chb] leq-meltano via v3.8.7 (le-meltano)
[2021-02-19 15:08:01] meltano invoke pipelinewise-tap-postgres --help
Executable 'pipelinewise-tap-postgres' could not be found. Extractor 'pipelinewise-tap-postgres' may not have been installed yet using `meltano install extractor pipelinewise-tap-postgres`, or the executable name may be incorrect.
[chb] leq-meltano via v3.8.7 (le-meltano)
[2021-02-19 15:08:49]
casey
02/19/2021, 8:49 PM
`pipelinewise-tap-postgres` when prompted to `Specify the package's executable name`, the binary in `venv/bin` is named `tap-postgres`
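One way to find the right value for `executable` is to look at what the package actually installed into the plugin's `venv/bin`. The Python sketch below lists the executable files in a directory. `list_executables` is a hypothetical helper, and the path is an assumption made by analogy with the loader path in the traceback earlier in this thread (`.meltano/loaders/target-bigquery/venv/...`).

```python
import os
from pathlib import Path
from typing import List


def list_executables(bin_dir: Path) -> List[str]:
    """Return the names of executable regular files in bin_dir, sorted."""
    return sorted(
        entry.name
        for entry in bin_dir.iterdir()
        if entry.is_file() and os.access(entry, os.X_OK)
    )


# Assumed location, by analogy with the loader path in the traceback earlier
# in this thread; adjust it to wherever the plugin's venv actually lives.
venv_bin = Path(".meltano/extractors/pipelinewise-tap-postgres/venv/bin")

if venv_bin.is_dir():
    for name in list_executables(venv_bin):
        print(name)
else:
    print(f"{venv_bin} not found; run this from the Meltano project root")
```

The listing will also include `python`, `pip` and similar entries; the one to use for `executable` is the script the tap itself provides.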
douwe_maan
02/19/2021, 8:50 PM
douwe_maan
02/19/2021, 8:50 PM
casey
02/19/2021, 11:35 PM
`extractors` in `meltano.yml` to `tap-postgres`. Now, `meltano invoke pipelinewise-tap-postgres --help` works. But I wonder if this will pose problems in the future.
douwe_maan
02/19/2021, 11:37 PM
`executable` option is for; executable names and package names don't always match, and I don't imagine that'll change for this particular package
casey
02/20/2021, 4:40 AM
jake_hannan
02/20/2021, 6:43 PM
douwe_maan
02/22/2021, 3:55 PM
`adswerve` variant of `target-bigquery`: https://gitlab.com/meltano/meltano/blob/master/src/meltano/core/bundle/discovery.yml#L1132
If you're not sure how exactly you'd add it to `discovery.yml`, it would already help if you could create an issue and share the `meltano.yml` definition you ended up with!
casey
02/22/2021, 10:35 PM
jake_hannan
02/22/2021, 11:26 PM
casey
02/22/2021, 11:28 PM
jake_hannan
02/23/2021, 12:13 AM
loaders:
- name: pipelinewise-target-bigquery
  namespace: pipelinewise_target_bigquery
  pip_url: pipelinewise-target-bigquery
  executable: target-bigquery
  config:
    project_id: warehouse-name
    default_target_schema: schema-name
    primary_key_required: false
jake_hannan
02/23/2021, 12:13 AM