Hi everyone. New to meltano so sorting through the...
# plugins-general
c
Hi everyone. New to meltano so sorting through the best "how tos". I'm beginning to use tap-jira against our Jira instance and data is flowing relatively well to a Postgres database. An early issue I'm struggling with is that I can't find the data from issues.fields object persisted anywhere in the database. In meltano.yml, I am exposing issues.* in the select but it seems limited to the schema it knows. What's the right meltano way for expanding this ... something in the meltano select I misconfigured, custom extractor based on tap-jira, integrating my own catalog.json or equivalent in, something else? my current meltano.yml select looks like this ...
Copy code
select:
    - issues.*
    - changelogs.*
    - worklogs.*
    - issue_comments.*
    - projects.projectKeys
    - projects.projectTypeKey
    - projects.roles
    - projects.self
    - projects.simplified
    - projects.url
    - '!*.historyMetadata*'
running a
meltano select tap-jira --list --all
shows that fields should be available ...
Copy code
[selected ] issues.fields
	[selected ] issues.fields.attachment
	[selected ] issues.fields.created
	[selected ] issues.fields.lastViewed
	[selected ] issues.fields.updated
	[selected ] issues.fieldsToInclude
I do in fact see an issues__fields__attachment table in postgres. Has anyone bumped into similar? I've tried limiting the select to only issues.fields and it yields the same behavior. Also tried applying some of the approaches @nick_hamlin and @douwe_maan discussed but was too clueless to fully understand them yet.
n
@clifford_mosley, I definitely can’t claim this is the right way to solve this problem, but here’s the select I’m using that has been working (and successfully captures
issues.fields
Copy code
select:
    # In this case, it's easier to indicate what we DON'T want
    - '*.*'

    # Universally unhelpful fields
    - '!*.self'
    - '!*.properties'
    - '*.url*'

    # Cruft tables
    - '!versions.*'
    - '!project_types.*'
    - '!project_categories.*'

    # Cruft fields
    - '!users.avatarUrls'
    - '!users.expand'
    - '!users.timeZone'
    - '!resolutions.iconUrl'
    - '!projects.url'
    - '!changelogs.historyMetadata'
    - '!issues.editmeta'
    - '!issues.expand'
    - '!issues.properties'
    - '!issues.renderedFields'
    - '!issues.schema'
c
cool, let me apply them
s
hey @clifford_mosley -- i know that in redshift, there was a config option that would unnest objects automatically, up to a certain level. because the
fields
column in JIRA is just one massive thing i would not be surprised if it was split into a half dozen other files. can you list the tables in the target schema? I would expect them to be in
issues__fields
and other similar tables like the
issues__fields__attachment
one that you mention.
the config option was
data_flattening_max_level
in pipelinewise-target-redshift:
Copy code
(Default: 0) Object type RECORD items from taps can be loaded into VARIANT columns as JSON (default) or we can flatten the schema by creating columns automatically.

When value is 0 (default) then flattening functionality is turned off.
https://github.com/transferwise/pipelinewise-target-redshift
c
for future readers, my issue here (haven't solved it all) isn't the tap side ... when I changed target to a csv output, the field data is there. need to figure out why that data is being swallowed on push to postgres now. Nick had suggested in his Redshift version of similar, "fields" were persisted as a json blob that he post processed. This blob does not exist in my postgres table.
switching to not using datamill variant of the postgres loader gave me the fields data as jsonb in the postgres db. for me, that's fine as I can post process it.
and to your question/suggestion @stephen_bailey, with the datamill variant, I was seeing issues__fields__attachment but no issue__fields_