matthew_funk
08/22/2023, 4:22 PM
Reuben (Matatika)
08/22/2023, 8:36 PM
post_process is the best place to modify record data. From our previous conversation, am I correct in assuming you want the date from the URL as a record property?
Unfortunately, post_process only provides access to the record row and context, as you have seen, and modifying the stream context directly is considered bad practice. You may have to do this in parse_response, where you have access to the response to pull out the date from the request URL. Of course, here you will have to update your stream schemas to pick up the new record property (e.g. current_date).
Reuben (Matatika)
08/22/2023, 8:43 PM
context correct here? As the docs suggest, I assume it is not a good idea to be arbitrarily setting properties in context (like you can in click, for example).
edgar_ramirez_mondragon
08/22/2023, 8:58 PM
parse_response is the right place to accomplish this.
matthew_funk
08/24/2023, 3:23 PM
matthew_funk
08/24/2023, 5:10 PM
Reuben (Matatika)
08/24/2023, 5:59 PM
current_date to the DemandStream schema yet?
matthew_funk
08/24/2023, 6:07 PM
matthew_funk
08/24/2023, 6:12 PM
Reuben (Matatika)
08/24/2023, 6:23 PM
.jsonl file. I run rm output/<stream>.jsonl; meltano run <tap> target-jsonl a fair bit to get around that behaviour.
matthew_funk
08/24/2023, 7:10 PM
Reuben (Matatika)
08/24/2023, 8:14 PM
pandas just to localize a date... I think it is a fairly big package also, so adding it as a dependency is probably going to increase plugin install time significantly.
As far as adding a dependency to the tap, the easiest way to do this is poetry add <package>. This adds entries in pyproject.toml and poetry.lock, as well as installing the package to the Poetry-managed virtual environment for the project (probably not where your pip install was targeting).
matthew_funk
08/24/2023, 8:17 PM
Reuben (Matatika)
08/24/2023, 10:15 PM
datetime module in Python to see how you can leverage that in the way you want (no doubt it is possible) - from a quick search I see a library called pytz mentioned a fair amount, so maybe look at how that works with datetime for handling timezones.
https://docs.python.org/3/library/datetime.html
https://pypi.org/project/pytz/
Reuben (Matatika)
08/24/2023, 10:20 PM
current_date timestamp value in the record you are trying to apply the offset for?
matthew_funk
08/25/2023, 1:38 PM
Reuben (Matatika)
08/25/2023, 2:07 PM
>>> import datetime
>>> today = datetime.date.today()
>>> time = datetime.time.fromisoformat("15:36")
>>> str(today)
'2023-08-25'
>>> str(time)
'15:36:00'
>>> combined = datetime.datetime.combine(today, time)
>>> str(combined)
'2023-08-25 15:36:00'
Not sure about the offset bit though...
Reuben (Matatika)
08/25/2023, 2:09 PM
Time from the record in post_process (I'm assuming you're already in post_process for the conversion), since it will be unneeded given current_date having the translated time component.
Reuben (Matatika)
08/25/2023, 2:20 PM
>>> str(combined.replace(tzinfo=datetime.timezone(datetime.timedelta(hours=-7))))
'2023-08-25 15:36:00-07:00'
🤔
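Pulling the thread together, the flow discussed above could be sketched roughly as below, using plain functions rather than a real stream class. Several things here are assumptions and not confirmed anywhere in this conversation: that the date arrives in a query parameter named date and is formatted as YYYY-MM-DD, that Time is an HH:MM string, and that a fixed UTC-7 offset is what is wanted. The helper names (date_from_url, add_current_date, localize_row), the example URL, and the sample row are all hypothetical.

```python
import datetime
from urllib.parse import parse_qs, urlparse

# Assumption: fixed UTC-7 offset, copied from the REPL line above.
# A fixed offset does not follow daylight saving changes.
LOCAL_OFFSET = datetime.timezone(datetime.timedelta(hours=-7))


def date_from_url(url: str, param: str = "date") -> str:
    """Return the date carried in the request URL's query string.

    Assumption: the date travels in a query parameter named "date" and is
    already formatted as YYYY-MM-DD; the real URL layout is not shown here.
    """
    return parse_qs(urlparse(url).query)[param][0]


def add_current_date(rows, url):
    """parse_response-style step: stamp every row with the URL's date."""
    current_date = date_from_url(url)
    for row in rows:
        yield {**row, "current_date": current_date}


def localize_row(row: dict) -> dict:
    """post_process-style step: fold Time into current_date, then drop Time."""
    day = datetime.date.fromisoformat(row["current_date"])
    time = datetime.time.fromisoformat(row["Time"])
    combined = datetime.datetime.combine(day, time, tzinfo=LOCAL_OFFSET)
    updated = {key: value for key, value in row.items() if key != "Time"}
    updated["current_date"] = combined.isoformat()
    return updated


# Hypothetical URL and row, for illustration only.
url = "https://api.example.com/demand?date=2023-08-25"
rows = [{"Time": "15:36", "value": 1}]
for row in add_current_date(rows, url):
    print(localize_row(row))
```

In an actual tap the URL would presumably be read from the response object handed to parse_response, and current_date would still need to be declared in the stream schema, as mentioned earlier. If the offset has to follow daylight saving time, the standard library's zoneinfo module (Python 3.9+) is one option to look at alongside pytz.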