# plugins-general
p
Hey folks, just wanted to share some WIP I put forth on tap-mysql for representing the precision/scale of a DECIMAL field: https://github.com/transferwise/pipelinewise-tap-mysql/pull/108. This follows up on target-bigquery putting in some work to infer DECIMAL scale from the `multipleOf` field: https://github.com/adswerve/target-bigquery/pull/23. However, we noted that it's not possible to correctly define a field with precision using just `multipleOf`; you also need to be able to compute the precision. I'm not familiar with how other plugins tackle this; it seems like it should probably go at the `singer` library level to be honest, as I think every tap should really implement logic like this. Any pointers to prior art here that might be useful to take into account?
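For concreteness, here's roughly how a DECIMAL column could be described with `multipleOf` plus `maximum`/`minimum` bounds. This is only an illustrative sketch; the exact property choices in the PRs above may differ.

```python
# Sketch of a JSON Schema fragment for a MySQL DECIMAL(12, 2) column.
# `multipleOf` encodes the scale; `maximum`/`minimum` bound the integer part,
# which together give the precision.
decimal_12_2 = {
    "type": ["number", "null"],
    "multipleOf": 0.01,          # 2 fractional digits (scale)
    "maximum": 9999999999.99,    # 10 integer digits + 2 fractional digits = precision 12
    "minimum": -9999999999.99,
}
```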
I'm not sure I follow you on this point:
> However we noted that it's not possible to correctly define a field with precision using just `multipleOf`; you also need to be able to compute the precision.

Can you say more about this?
p
Sure, let's say you have a `DECIMAL(40,2)`. If you do `tap-mysql | target-bigquery`, `multipleOf` can get you `DECIMAL(???, 2)`, i.e. the scale, but not the precision. In JSON Schema, the only way I found to represent the precision was `maximum`/`minimum`. (It seems you guys are thinking along the same lines in the meltano target-postgres you linked.) The target can take a guess here if the precision is not specified, but then in the worst case you'll have an invalid schema that looks correct when you run your first pipeline, yet produces a runtime loader error once you encounter a row bigger than the target schema supports (for example, if you pick BigQuery NUMERIC but the source table actually requires BIGNUMERIC to hold every possible row). This should really fail at "compile time", i.e. the target should reject the initial schema message if it can't represent it correctly, or at the very least produce a warning for the operator to resolve.
f
I've just noticed that all our fixed-precision NUMERIC/DECIMAL types in PostgreSQL have resulted in FLOAT columns in our Snowflake target. This seems like a simpler case than distinguishing NUMERIC from BIGNUMERIC in BigQuery, since Snowflake effectively has a single NUMBER type for all fixed-precision use cases. Am I effectively running into the same issue of precision/scale being lost in translation, or something different?
m
I'm working on a tap-db2 -> target-snowflake implementation and have gone for the following approach to setting up different data types in Snowflake without losing precision in the RECORD messages:
• DECIMALs from DB2 are given the format property "singer.decimal" in the schema, with type property "number". The numeric precision & scale are stored as `schema[additionalProperties][scale_precision]`, e.g. `"additionalProperties": {"scale_precision": "(15,9)"}`
• When the tap writes the RECORD message, "singer.decimal" tells it to output the data value into the message as a string
• The Snowflake target obtains the precision/scale from the schema message to create a column of type NUMBER(precision, scale)
The idea for the singer.decimal format came from tap-oracle.
Forks: tap-db2 (feature/singer_decimal), target-snowflake (feature/snowflake_number_type)
Pip_url(s) for meltano.yml if you would like to test (there's a rough sketch of the tap/target flow after the snippet below):
```yaml
pip_url: git+https://github.com/mjsqu/tap-db2.git@feature/singer_decimal
pip_url: git+https://github.com/mjsqu/pipelinewise-target-snowflake@feature/snowflake_number_type
```
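If it's useful, here's a rough sketch of how that handshake could look on both sides, assuming the schema shape described above (the `"singer.decimal"` format plus the `scale_precision` string under `additionalProperties`). The function names are hypothetical and not taken from the forked repos.

```python
import re
from decimal import Decimal

# Tap side: for "singer.decimal" fields, write the value into the RECORD
# message as a string so no precision is lost in JSON serialization.
def serialize_decimal(value: Decimal) -> str:
    return str(value)

# Target side: read "(precision,scale)" back out of the SCHEMA message and
# build the Snowflake column type from it.
def snowflake_column_type(column_schema: dict) -> str:
    hint = column_schema.get("additionalProperties", {}).get("scale_precision", "")
    match = re.match(r"\((\d+)\s*,\s*(\d+)\)", hint)
    if not match:
        return "FLOAT"  # fall back if the hint is missing
    precision, scale = match.groups()
    return f"NUMBER({precision}, {scale})"

# Example: a DECIMAL(15,9) column from DB2 becomes NUMBER(15, 9) in Snowflake.
schema = {
    "type": "number",
    "format": "singer.decimal",
    "additionalProperties": {"scale_precision": "(15,9)"},
}
assert snowflake_column_type(schema) == "NUMBER(15, 9)"
```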