# troubleshooting
for those who have targeted postgresql.. I wonder how you add a timestamp to the meltano run in python.. I have the following code in my `client.py`.. but when I go to query this in apache superset it gives me an error.. is there some extra manipulation I should do before I pass the timestamp via the dictionary?? Client.py:
```python
creation_datetime = datetime.now(timezone.utc)

...

new_record = {
    "key": valid_value.contract.symbol.lower(),
    "datetime_downloaded_from_api": creation_datetime,
}

yield new_record
```
Running `SELECT * FROM my_new_table;` in the Apache Superset query tool gives:
```
PostgreSQL Error

postgresql error: '>=' not supported between instances of 'datetime.timedelta' and 'int'
```
Maybe this is more of an Apache Superset question.. I will try to triage here but ask there too. Here's my streams.py:
```python
class ContractsStream(myStream):
    """Define custom stream."""

    name = "my_new_table"
    primary_keys = ["contract_id", "datetime_downloaded_from_api"]
    replication_key = None

    schema = th.PropertiesList(
        th.Property("datetime_downloaded_from_api", th.DateTimeType),
        th.Property("contract_id", th.IntegerType),
    ).to_dict()
```
hmmmm I can see perhaps the error involves the fact that the meltano tap default time inputs have a Z at the end.. and what I am giving the tap is something else
```
"datetime_downloaded_from_api": "2021-09-09T20:24:10.901279+00:00"}, "time_extracted": "2021-09-09T20:25:26.000565Z"}
```
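If the mismatch really is just the `+00:00` suffix vs. the trailing `Z`, one way to emit the Z-suffixed form (which Python's `isoformat()` does not produce by default for UTC) is the following sketch; `singer_utc_now` is a made-up helper name, not an SDK function:

```python
from datetime import datetime, timezone

def singer_utc_now() -> str:
    # Hypothetical helper: render the current UTC time with a trailing
    # "Z" (as in time_extracted above) instead of Python's "+00:00".
    return datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")

# e.g. something like "2021-09-09T20:25:26.000565Z"
print(singer_utc_now())
```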
I guess I can try dropping my timestamp entirely and seeing if I can leverage this `time_extracted` instead
I'm trying to pin runs to the rough start time.. instead of the actual moment of extraction.. but I am guessing maybe it's just better to use the raw data and learn enough SQL to use a range to coalesce the data
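Pinning rows to the rough start of the run (rather than each record's extraction moment) can also be done by capturing one timestamp when the tap process starts and stamping it onto every outgoing record; a minimal sketch, with made-up names (`RUN_STARTED_AT`, `stamp`):

```python
from datetime import datetime, timezone

# Captured once at module import, so every record emitted during this
# run shares the same value (the "rough start time" of the run).
RUN_STARTED_AT = datetime.now(timezone.utc).isoformat().replace("+00:00", "Z")

def stamp(record: dict) -> dict:
    """Attach the shared run-start timestamp to an outgoing record."""
    record["datetime_downloaded_from_api"] = RUN_STARTED_AT
    return record
```

That still leaves the SQL range/bucketing approach open for grouping rows downstream if finer control is needed.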
question now.. am I allowed to reference `time_extracted` as part of my primary key in the Stream definition???
```python
class ContractsStream(myStream):
    """Define custom stream."""

    name = "my_new_table"
    primary_keys = ["contract_id", "time_extracted"]
    replication_key = None

    schema = th.PropertiesList(
        th.Property("datetime_downloaded_from_api", th.DateTimeType),
        th.Property("contract_id", th.IntegerType),
    ).to_dict()
```
The Z implies UTC. Looking forward to hearing if `time_extracted` can be part of the primary key
thank you @edward_ryan … for some reason I see output when I invoke manually.. but when I run in meltano it boots up.. starts doing something in the tap.. but when I go to check my PostgreSQL instance nothing's showing up.. might be me though… I tried the primary_key on `time_extracted` but backed off in hopes of getting my tap working again.. once I have it downloading into postgresql successfully I can try again.. in the meantime I cracked a way forward by manually modifying the input to adhere to the UTC-with-Z format.. python by default does something else and I worried that was the cause of apache superset not being able to preview the table.. but alas I think superset still isn't showing a preview even when I DO have the tap working.. I will go bug the apache superset slack for more help on that front..
I think someone has solved the apache superset issue by backing off the latest psycopg binary build… I will test that next week and report back as I try to use `time_extracted` as part of the primary key
update: pinning my psycopg2-binary successfully fixed the bug in timestamps.. BUT.. `time_extracted` is not an included column.. at least not using the current papermills target for postgresql.. maybe it's an option that can be enabled.. but for now I have manually set a column and will look to proceed from there. here are the columns related to datetime that I saw... dunno if I want to put them in the primary key with these names, which are a bit obscure