I'm writing a new custom tap using the Meltano SDK...
# plugins-general
c
I'm writing a new custom tap using the Meltano SDK "RESTStream" class. I want to pick a replication_key (last updated timestamp), but in the data I receive from the remote API, the timestamp is a nested field. Is there an easy way to use a nested field as the replication_key? Example data:
"results": [
 {
   "last_updated": {
     "value": "2022-03-01 00:00:10",
     "display_value": "2022-03-01 00:00:10"
   },
   "name": {
     "value": "John Doe",
     "display_value": "John Doe"
   }
 },
 {
   .. next record
 }
]
a
Hi, @christoph. Yes, this is certainly doable with a few steps.
• In your `post_process()` method, you can say `record["last_updated"] = record.pop("last_updated")["value"]` or similar, which will replace the nested representation with a simple top-level one.
• Then `replication_key` can just be `"last_updated"`.
• And in your `schema` declaration, you just want to remember to declare the schema for `"last_updated"` as a single DateTime, rather than as an `object` with two nested properties.
Does this help?
For primary keys and replication keys alike, you generally really do want those as top-level elements. Generally just a tiny bit of work in `post_process()` should give you the shape of data that is easier to work with.
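The steps above could be sketched roughly as follows. This is a minimal, standalone illustration rather than a complete tap: `post_process` is written here as a plain function standing in for the method of the same name on a stream class, the `schema` dict is a hand-written JSON Schema fragment (an assumption about how the flattened field might be declared), and the sample record is copied from the question.

```python
from typing import Optional


def post_process(row: dict, context: Optional[dict] = None) -> Optional[dict]:
    """Standalone stand-in for a stream's post_process() method."""
    # Replace the nested {"value": ..., "display_value": ...} object
    # with its plain "value" string, keeping the same top-level key.
    row["last_updated"] = row.pop("last_updated")["value"]
    return row


# With the field flattened, the replication key is a simple top-level name.
replication_key = "last_updated"

# Assumed schema fragment: "last_updated" is declared as a single
# date-time string instead of an object with two nested properties.
schema = {
    "type": "object",
    "properties": {
        "last_updated": {"type": "string", "format": "date-time"},
        "name": {
            "type": "object",
            "properties": {
                "value": {"type": "string"},
                "display_value": {"type": "string"},
            },
        },
    },
}

# Sample record taken from the example data in the question.
record = {
    "last_updated": {
        "value": "2022-03-01 00:00:10",
        "display_value": "2022-03-01 00:00:10",
    },
    "name": {"value": "John Doe", "display_value": "John Doe"},
}

flat = post_process(record)
print(flat["last_updated"])
```

In a real tap the function body would live in the stream class's own `post_process()` method, and `replication_key` and `schema` would be attributes of that class. If the target expects strictly ISO 8601 timestamps, the value may also need reformatting; that step is not shown here.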
c
"In your `post_process()` method"
Ah. Right! Thanks. That was the missing piece of the puzzle! It's the one method from the cookiecutter template that I've always deleted straight away, hence it wasn't on my mind when looking for a possible solution 😂 Thanks for the input @aaronsteers I'll give that a go shortly.
a
Great! Feedback always appreciated. Glad I could help.