Hi everyone New to meltano so sorting through the best how t Meltano #plugins-general

Hi everyone. New to meltano so sorting through the...

clifford_mosley

11/04/2021, 3:22 PM

Hi everyone. New to meltano so sorting through the best "how tos". I'm beginning to use tap-jira against our Jira instance and data is flowing relatively well to a Postgres database. An early issue I'm struggling with is that I can't find the data from issues.fields object persisted anywhere in the database. In meltano.yml, I am exposing issues.* in the select but it seems limited to the schema it knows. What's the right meltano way for expanding this ... something in the meltano select I misconfigured, custom extractor based on tap-jira, integrating my own catalog.json or equivalent in, something else? my current meltano.yml select looks like this ...

Copy code

select:
    - issues.*
    - changelogs.*
    - worklogs.*
    - issue_comments.*
    - projects.projectKeys
    - projects.projectTypeKey
    - projects.roles
    - projects.self
    - projects.simplified
    - projects.url
    - '!*.historyMetadata*'

running a

meltano select tap-jira --list --all

shows that fields should be available ...

Copy code

[selected ] issues.fields
	[selected ] issues.fields.attachment
	[selected ] issues.fields.created
	[selected ] issues.fields.lastViewed
	[selected ] issues.fields.updated
	[selected ] issues.fieldsToInclude

I do in fact see an issues__fields__attachment table in postgres. Has anyone bumped into similar? I've tried limiting the select to only issues.fields and it yields the same behavior. Also tried applying some of the approaches @nick_hamlin and @douwe_maan discussed but was too clueless to fully understand them yet.

nick_hamlin

11/04/2021, 3:27 PM

@clifford_mosley, I definitely can’t claim this is the right way to solve this problem, but here’s the select I’m using that has been working (and successfully captures

issues.fields

Copy code

select:
    # In this case, it's easier to indicate what we DON'T want
    - '*.*'

    # Universally unhelpful fields
    - '!*.self'
    - '!*.properties'
    - '*.url*'

    # Cruft tables
    - '!versions.*'
    - '!project_types.*'
    - '!project_categories.*'

    # Cruft fields
    - '!users.avatarUrls'
    - '!users.expand'
    - '!users.timeZone'
    - '!resolutions.iconUrl'
    - '!projects.url'
    - '!changelogs.historyMetadata'
    - '!issues.editmeta'
    - '!issues.expand'
    - '!issues.properties'
    - '!issues.renderedFields'
    - '!issues.schema'

clifford_mosley

11/04/2021, 3:28 PM

cool, let me apply them

stephen_bailey

11/04/2021, 4:26 PM

hey @clifford_mosley -- i know that in redshift, there was a config option that would unnest objects automatically, up to a certain level. because the

fields

column in JIRA is just one massive thing i would not be surprised if it was split into a half dozen other files. can you list the tables in the target schema? I would expect them to be in

issues__fields

and other similar tables like the

issues__fields__attachment

one that you mention.

stephen_bailey

11/04/2021, 4:27 PM

the config option was

data_flattening_max_level

in pipelinewise-target-redshift:

Copy code

(Default: 0) Object type RECORD items from taps can be loaded into VARIANT columns as JSON (default) or we can flatten the schema by creating columns automatically.

When value is 0 (default) then flattening functionality is turned off.

https://github.com/transferwise/pipelinewise-target-redshift

clifford_mosley

11/04/2021, 7:08 PM

for future readers, my issue here (haven't solved it all) isn't the tap side ... when I changed target to a csv output, the field data is there. need to figure out why that data is being swallowed on push to postgres now. Nick had suggested in his Redshift version of similar, "fields" were persisted as a json blob that he post processed. This blob does not exist in my postgres table.

clifford_mosley

11/04/2021, 7:08 PM

switching to not using datamill variant of the postgres loader gave me the fields data as jsonb in the postgres db. for me, that's fine as I can post process it.

clifford_mosley

11/04/2021, 7:11 PM

and to your question/suggestion @stephen_bailey, with the datamill variant, I was seeing issues__fields__attachment but no issue__fields_

Open in Slack

Previous Next