# getting-started
g
Hello! I'm pretty new to data eng and had never heard of Meltano before. I found myself with an opportunity to learn with a task, but I'm having issues. Could someone share some insights on where I could look for documentation on Meltano? The Hub and docs.meltano.com aren't helping that much for a newbie on the subject. It's kind of a simple project on paper, but it's costing me some nights of sleep.
e
Hi @Guilherme Deutschendorf!
> I found myself with an opportunity to learn with a task, but I'm having issues
Can you say more about what issues you're having?
> Could someone share some insights on where I could look for documentation on Meltano? The Hub and docs.meltano.com aren't helping that much for a newbie on the subject.
The best resource is probably https://docs.meltano.com/getting-started/. If you need more low-level detail on how Meltano's extractors and loaders communicate, you could give https://hub.meltano.com/singer/spec a read.
> It's kind of a simple project on paper, but it's costing me some nights of sleep.
Feel free to share more details on the project if you can. I or other members of the community can surely help smooth out the rough edges. There's probably a few hundred years' worth of cumulative data eng experience in this community 😄
g
I'm building a pipeline to extract the tables from a single Postgres database. The requirement is that every table be extracted independently, so I created multiple plugins that inherit from tap-postgres. It works fine for most of them, but two extractions fail. I'm looking for guidance on solving it; if anyone could shed some light, I'd appreciate it. Here's one of the logs:
2024-07-03T14:33:43.423169Z [info     ]     raise InvalidRecord(e.message, record) from e cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.423282Z [info     ] singer_sdk.exceptions.InvalidRecord: Record Message Validation Error: Decimal('32.38') is not of type 'string', 'null' cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.423445Z [info     ] Traceback (most recent call last): cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.423567Z [info     ]   File "/home/guilherme/Área de trabalho/Projeto/projeto/.meltano/loaders/target-csv/venv/lib/python3.9/site-packages/singer_sdk/sinks/core.py", line 121, in validate cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.423687Z [info     ]     self.validator.validate(record) cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.423815Z [info     ]   File "/home/guilherme/Área de trabalho/Projeto/projeto/.meltano/loaders/target-csv/venv/lib/python3.9/site-packages/jsonschema/validators.py", line 451, in validate cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.423935Z [info     ]     raise error                cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.424045Z [info     ] jsonschema.exceptions.ValidationError: Decimal('32.38') is not of type 'string', 'null' cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.424163Z [info     ]                                cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.424272Z [info     ] Failed validating 'type' in schema['properties']['freight']: cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.424378Z [info     ]     {'type': ['string', 'null']} cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.424485Z [info     ]                                cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.424592Z [info     ] On instance['freight']:        cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.424697Z [info     ]     Decimal('32.38')           cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.424820Z [info     ]                                cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.424931Z [info     ] The above exception was the direct cause of the following exception: cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.425040Z [info     ]                                cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.425146Z [info     ] Traceback (most recent call last): cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.425253Z [info     ]   File "/home/guilherme/Área de trabalho/Projeto/projeto/.meltano/loaders/target-csv/venv/bin/target-csv", line 10, in <module> cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.425364Z [info     ]     sys.exit(TargetCSV.cli())  cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.425476Z [info     ]   File "/home/guilherme/Área de trabalho/Projeto/projeto/.meltano/loaders/target-csv/venv/lib/python3.9/site-packages/click/core.py", line 1157, in __call__ cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.425595Z [info     ]     return self.main(*args, **kwargs) cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.425703Z [info     ]   File "/home/guilherme/Área de trabalho/Projeto/projeto/.meltano/loaders/target-csv/venv/lib/python3.9/site-packages/click/core.py", line 1078, in main cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.425832Z [info     ]     rv = self.invoke(ctx)      cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.425952Z [info     ]   File "/home/guilherme/Área de trabalho/Projeto/projeto/.meltano/loaders/target-csv/venv/lib/python3.9/site-packages/singer_sdk/plugin_base.py", line 80, in invoke cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.426067Z [info     ]     return super().invoke(ctx) cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.426174Z [info     ]   File "/home/guilherme/Área de trabalho/Projeto/projeto/.meltano/loaders/target-csv/venv/lib/python3.9/site-packages/click/core.py", line 1434, in invoke cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.426288Z [info     ]     return ctx.invoke(self.callback, **ctx.params) cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.426397Z [info     ]   File "/home/guilherme/Área de trabalho/Projeto/projeto/.meltano/loaders/target-csv/venv/lib/python3.9/site-packages/click/core.py", line 783, in invoke cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.426510Z [info     ]     return __callback(*args, **kwargs) cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.426621Z [info     ]   File "/home/guilherme/Área de trabalho/Projeto/projeto/.meltano/loaders/target-csv/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 567, in invoke cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.426742Z [info     ]     target.listen(file_input)  cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.426854Z [info     ]   File "/home/guilherme/Área de trabalho/Projeto/projeto/.meltano/loaders/target-csv/venv/lib/python3.9/site-packages/singer_sdk/io_base.py", line 36, in listen cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.426964Z [info     ]     self._process_lines(file_input) cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.427071Z [info     ]   File "/home/guilherme/Área de trabalho/Projeto/projeto/.meltano/loaders/target-csv/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 307, in _process_lines cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.427180Z [info     ]     counter = super()._process_lines(file_input) cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.427287Z [info     ]   File "/home/guilherme/Área de trabalho/Projeto/projeto/.meltano/loaders/target-csv/venv/lib/python3.9/site-packages/singer_sdk/io_base.py", line 95, in _process_lines cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.427399Z [info     ]     self._process_record_message(line_dict) cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.427514Z [info     ]   File "/home/guilherme/Área de trabalho/Projeto/projeto/.meltano/loaders/target-csv/venv/lib/python3.9/site-packages/singer_sdk/target_base.py", line 357, in _process_record_message cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
2024-07-03T14:33:43.427634Z [info     ]     sink._validate_and_parse(transformed_record)  # noqa: SLF001 cmd_type=loader name=target-csv run_id=6bec69a3-ccf1-4cf7-abca-9a5004bf2769 state_id=2024-07-03T143341--tap-postgres--orders--target-csv stdio=stderr
e
OK, so it's a bit buried in there, but `freight` seems to come in with a `string` type but a `Decimal('32.38')` value. Do you know the type of that field in the source database?
g
It's float4 / real.
The other extraction I'm having issues with has the same problem: a column of type real that's coming out as string.
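For context, the failure in the log comes down to a JSON Schema type check: the stream schema declares `freight` as `['string', 'null']`, but the record carries a `Decimal`. The snippet below is a minimal sketch of that check, not the actual `jsonschema` or `singer_sdk` code; the helper `matches_type` is a hypothetical name made up for illustration, and it only covers the three type names that matter here.

```python
import numbers
from decimal import Decimal

# Simplified stand-in for a JSON Schema "type" check. The real validation
# in the traceback is performed by the jsonschema library inside singer_sdk.
def matches_type(value, allowed_types):
    """Return True if value satisfies at least one of the allowed type names."""
    checks = {
        "string": lambda v: isinstance(v, str),
        "null": lambda v: v is None,
        # bool is excluded because JSON Schema treats booleans as their own type
        "number": lambda v: isinstance(v, numbers.Number) and not isinstance(v, bool),
    }
    return any(checks[t](value) for t in allowed_types)

freight = Decimal("32.38")  # value taken from the log

declared_ok = matches_type(freight, ["string", "null"])  # what the stream schema declares
numeric_ok = matches_type(freight, ["number", "null"])   # a schema that declares a number
print(declared_ok, numeric_ok)
```

The first check fails and the second passes, which is why the record is rejected as long as the field is declared as a string.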
e
Gotcha. A quick workaround would be to use the `schema` extra to override the type of that field:
```yaml
plugins:
  extractors:
  - name: tap-postgres--myinheritedtap
    schema:
      # Placeholder stream name: replace with your <schema-name>-<table-name>
      <schema-name>-<table-name>:
        freight:
          type: [number, "null"]
```
Long-term, we may want to add it to the types mapping in https://github.com/MeltanoLabs/tap-postgres/blob/ee35e85c523f27b95d429d1cdfaca352fdba87cd/tap_postgres/client.py#L191
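To make the long-term idea concrete, here is a hypothetical sketch of what a Postgres-to-JSON-Schema type mapping could look like. This is not the actual tap-postgres code at the link above; the names `PG_TO_JSONSCHEMA` and `to_jsonschema_type` are invented for illustration, and the fallback to `string` is an assumption based on how `freight` showed up in the log.

```python
# Hypothetical sketch only: NOT the real tap-postgres implementation.
# PG_TO_JSONSCHEMA and to_jsonschema_type are made-up names.
PG_TO_JSONSCHEMA = {
    "integer": "integer",
    "bigint": "integer",
    "real": "number",    # float4, the type behind the `freight` column
    "float4": "number",
    "double precision": "number",
    "numeric": "number",
    "boolean": "boolean",
}

def to_jsonschema_type(pg_type, nullable=True):
    """Map a Postgres type name to a JSON Schema type list.

    Unknown types fall back to "string" (an assumption for this sketch).
    """
    json_type = PG_TO_JSONSCHEMA.get(pg_type.lower(), "string")
    return [json_type, "null"] if nullable else [json_type]

print(to_jsonschema_type("real"))
print(to_jsonschema_type("uuid", nullable=False))
```

With an entry for `real` in the mapping, the stream schema would declare the column as a number and the per-stream override would no longer be needed.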