paul_tiplady
02/24/2022, 6:47 PM
is currently omitted from the tap’s singer schema as far as I can see:
"amount": {
"inclusion": "available",
"multipleOf": 0.01,
"type": [
"null",
"number"
]
},
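For context, a minimal sketch of the underlying issue (mine, not code from the tap or target): BigQuery’s DECIMAL/NUMERIC type holds at most 38 significant digits with 9 after the decimal point, so a value this schema happily permits can still fail to fit and needs BIGDECIMAL/BIGNUMERIC instead:

```python
from decimal import Decimal

# BigQuery NUMERIC (alias DECIMAL): up to 38 significant digits,
# at most 9 of them after the decimal point (per BigQuery's docs).
NUMERIC_MAX = Decimal("9" * 29 + "." + "9" * 9)

def fits_numeric(value: Decimal) -> bool:
    """Rough check: does `value` fit a BigQuery NUMERIC losslessly?"""
    v = value.normalize()  # drop trailing zeros so scale is meaningful
    return abs(v) <= NUMERIC_MAX and v.as_tuple().exponent >= -9

print(fits_numeric(Decimal("19.99")))          # True
print(fits_numeric(Decimal("1e30")))           # False: 30 integer digits
print(fits_numeric(Decimal("0.0000000001")))   # False: scale 10 > 9
```

The JSON schema above only constrains the scale (`multipleOf: 0.01`), not the precision, so nothing in it rules out the failing cases.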
One other solution is just to flag “use BIGDECIMAL instead of DECIMAL” for all columns created in BQ by the target. But that’s obviously not a universal solution.

paul_tiplady
02/24/2022, 6:52 PM

paul_tiplady
02/24/2022, 6:57 PM
bq-bignumeric type if we can map the tap’s schema fields accordingly. But this seems unsatisfying from a Singer ecosystem perspective.

paul_tiplady
02/24/2022, 6:59 PM

edgar_ramirez_mondragon
02/24/2022, 7:14 PM

paul_tiplady
02/24/2022, 8:24 PM
02/24/2022, 8:24 PMbq-bigint
) are problematic.
2. It means I need to manually annotate every field to override the inferred type; schema inference is one of the big selling points of using Singer in the first place. These schema annotations become annoying to maintain; I already have a Python script to generate my meltano.yml file for 50 source-tables, and I’m trying to make that layer thinner, not thicker.
3. I think it’s probably a footgun for new users, too. The destination schema will look correct at a glance, and work for some data, until you happen to load a value that doesn’t fit in the non-big DECIMAL. Basically, the schema the tap generates is subtly incorrect, and if you don’t know to look for it, you could easily miss the problem.
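To make point 3 concrete, here is a sketch (my own, not the target’s actual code) of the kind of type-mapping decision a target has to make from this schema, and why it’s guesswork:

```python
def bq_numeric_type(prop: dict) -> str:
    """Pick a BigQuery column type for a JSON-schema `number` property.

    Sketch only: `multipleOf: 0.01` bounds the scale but says nothing
    about precision, so NUMERIC (38 digits, scale 9) can silently be
    too small, and BIGNUMERIC is the only safe default.
    """
    multiple_of = prop.get("multipleOf")
    if multiple_of is not None and multiple_of >= 1e-9:
        # Scale fits NUMERIC -- but precision is still unbounded.
        return "NUMERIC"
    return "BIGNUMERIC"

print(bq_numeric_type({"type": ["null", "number"], "multipleOf": 0.01}))
# NUMERIC -- looks right at a glance, until a 30-digit value arrives
```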
If the goal is to have taps/targets plug together cleanly and composably, then I think this is a fundamental problem with the Singer usage of JSONSchema. Interested in everyone else’s thoughts on whether this should be something that “just works”, or if it’s reasonable to have the long-term solution be to require schema overrides here.
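For readers following along, this is roughly the shape of the per-field schema override being discussed, expressed as the Python-generated meltano.yml structure the author mentions. The extractor, stream, and field names here are hypothetical, and forcing the field to a string type is just one possible workaround (it lets the target parse the full-precision value itself):

```python
# Sketch: generate a Meltano `schema` extra that overrides the tap's
# inferred type for one field. Names below are hypothetical examples.

def decimal_override(field: str) -> dict:
    """Override a numeric field to a string so no precision is lost
    in transit; the target can then load it into a BIGNUMERIC column."""
    return {field: {"type": ["null", "string"]}}

meltano_config = {
    "plugins": {
        "extractors": [{
            "name": "tap-postgres",  # hypothetical extractor
            "schema": {
                # hypothetical stream name -> per-field overrides
                "public-payments": decimal_override("amount"),
            },
        }]
    }
}

print(meltano_config["plugins"]["extractors"][0]["schema"])
```

Maintaining one of these per affected field across 50 tables is exactly the annotation burden described above.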