Hey I m using tap hubspot to tap postgres and facing problem Meltano #troubleshooting

Hey! I'm using tap-hubspot to tap-postgres and fac...

joao_vitor_ferreira_silva

02/16/2022, 2:07 PM

Hey! I'm using tap-hubspot to tap-postgres and facing problem with json schema. I checked the doc of singer and saw that schema has been defined, but on postgress all colums is arriving as json. There is a way to normalize and mapping the data to postgres?

pat_nadolny

02/16/2022, 2:28 PM

Hey @joao_vitor_ferreira_silva 👋 - theres two ways to do this, either by using the data flattening setting in the target which will turn nested json into their own columns or its also common to manage denesting the json in your dbt staging models, although some warehouses are better than others at that. I think either way should work for you!

joao_vitor_ferreira_silva

02/16/2022, 3:48 PM

Thanks for the answer @pat_nadolny! Helped a lot. But now i'm stuck in another problem 😕 Do you ever saw that? Looks like schema have two columns with the same name and i have no idea how fix that unless mapping all columns by hand.

joao_vitor_ferreira_silva

02/16/2022, 3:49 PM

Do we have any setting to keep the "father object name"? Like "deals.id"

pat_nadolny

02/16/2022, 6:32 PM

@joao_vitor_ferreira_silva if the field isnt needed you could use selection rules to exclude it, that would get you unblocked for now at least

pat_nadolny

02/16/2022, 6:33 PM

@edgar_ramirez_mondragon do you know of a way to do this? Stream map aliasing can solve it but I dont think either tap or target are SDK based

edgar_ramirez_mondragon

02/16/2022, 6:40 PM

@pat_nadolny I can't think of another way (as you correctly point out neither are SDK-based). I'm curious one can end up with two unnested fields with the same name.

Do we have any setting to keep the "father object name"? Like "deals.id"

@joao_vitor_ferreira_silva do you mean this is a nested array turned into a new table, that by chance got an unnested field with the same name as the parent table?

joao_vitor_ferreira_silva

02/16/2022, 7:29 PM

Thanks again @pat_nadolny and @edgar_ramirez_mondragon. I 'll try the solution with selection rules. About the error i got the log

Copy code

ValueError: Duplicate column name produced in schema: phtbccadc__value cmd_type=loader job_id=hubspot-to-postgres name=target-postgres run_id=94cbbb95-ca7f-4c54-9283-b97130176b6b stdio=stderr

so i start to think the problem could be with the columns, right? (Sorry if not, i'm just a begginer with Meltano)

joao_vitor_ferreira_silva

02/16/2022, 7:46 PM

Been more specific, the problem is on file

db_sync.py

target-postgres

folder in this piece of code :

def flatten_schema(d, parent_key=[], sep='__', level=0, max_level=0):

items = []

if 'properties' not in d:

return {}

for k, v in d['properties'].items():

new_key = flatten_key(k, parent_key, sep)

if 'type' in v.keys():

if 'object' in v['type'] and 'properties' in v and level < max_level:

items.extend(flatten_schema(v, parent_key + [k], sep=sep, level=level + 1, max_level=max_level).items())

else:

items.append((new_key, v))

else:

if len(v.values()) > 0:

if list(v.values())[0][0]['type'] == 'string':

list(v.values())[0][0]['type'] = ['null', 'string']

items.append((new_key, list(v.values())[0][0]))

elif list(v.values())[0][0]['type'] == 'array':

list(v.values())[0][0]['type'] = ['null', 'array']

items.append((new_key, list(v.values())[0][0]))

elif list(v.values())[0][0]['type'] == 'object':

list(v.values())[0][0]['type'] = ['null', 'object']

items.append((new_key, list(v.values())[0][0]))

key_func = lambda item: item[0]

sorted_items = sorted(items, key=key_func)

for k, g in itertools.groupby(sorted_items, key=key_func):

if len(list(g)) > 1:

raise ValueError('Duplicate column name produced in schema: {}'.format(k))

return dict(sorted_items)

pat_nadolny

02/16/2022, 7:50 PM

@joao_vitor_ferreira_silva yeah that sounds right - maybe sharing some sample data would make it easier to understand whats happening.

meltano invoke tap-hubspot > output.json

will send the output to a file. You can share some of that as long as its non-sensitive data

3 Views

Open in Slack

Previous Next