carlos_gonzalez
10/28/2022, 4:17 PM
run command and defining new properties for the stream with the new name but using the original value. That worked just fine, but I got a couple of problems:
• The run command does not let me manually provide a catalog for the extractor (tap-s3-csv), as the elt command does. Since I'm dealing with CSV files, the extractor sometimes autodetects data types incorrectly because it infers them from a sample of the data.
• The way I'm renaming the stream properties has the side effect of losing the original data types, so the resulting stream identifies everything as a string. I found a way around this by using int( and float( expressions, but that introduced conversion errors for invalid values (such as empty strings), and the expressions don't support string-to-datetime conversions either.
Any ideas how to deal with those two issues?
aaronsteers
10/28/2022, 6:02 PM
catalog mapping in your YAML. If you want to use multiple inputs, you can declare a few instances of your tap using the inherit_from feature.
aaronsteers
10/28/2022, 6:03 PM
aaronsteers
10/28/2022, 6:03 PM
```
extractors:
- name: tap-gitlab
  catalog: extract/tap-gitlab.catalog.json
```
aaronsteers
10/28/2022, 6:06 PM
> I found a way to go around this issue by using int( and float( expressions but that introduced conversion errors for invalid values (such as empty strings) and the expressions don't support string to datetime conversions either
As of now, we don't yet have datetime support in mappers, and the hints you are using are the best that is currently available. To solve for null values, you could use a workaround of str(my_col or ''), which just relies on the standard Python 'or' operator to coalesce from a null value to a non-null one.
aaronsteers
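For reference, the 'or' coalescing workaround can be sketched in plain Python. This is only an illustration, assuming the mapper evaluates expressions the same way standard Python does; raw_values and to_int_or_none are hypothetical names, not part of any Meltano API.

```python
# Hypothetical sample values, as a CSV extractor might emit them:
# everything arrives as a string, and missing cells are '' or None.
raw_values = ["42", "", None, "0"]


def to_int_or_none(value):
    """Skip the cast for null/empty input so int('') is never called."""
    return int(value) if value not in (None, "") else None


# `or` returns its right-hand operand when the left one is falsy
# (None or ''), so the cast always receives a usable value.
as_strings = [str(v or "") for v in raw_values]
as_ints = [int(v or 0) for v in raw_values]

# Defaulting to 0 hides the difference between "missing" and "zero";
# keeping None instead preserves that distinction.
as_nullable_ints = [to_int_or_none(v) for v in raw_values]
```

The same idea written as a single expression would be something like int(my_col or 0). Note that the string "0" is truthy, so it still goes through int( and becomes 0 rather than being treated as missing.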
10/28/2022, 6:08 PM
carlos_gonzalez
10/29/2022, 3:07 AM
steve_clarke
11/01/2022, 5:17 AM