# singer-target-development
h
What's the difference between the `Per record` and `Per batch` options for `serialization_method` with the Singer target SDK?
d
cc @aaronsteers
a
Hi, @hassan_syyid - The difference is just in how your target prefers to write records. Are you writing one record at a time (such as with a singleton API endpoint), or do you need to write many rows at once in order to get the best performance (such as with Snowflake/Redshift that require loading via CSV)?
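As a rough sketch of how that choice surfaces in code: the SDK's `RecordSink` writes each record as it arrives, while `BatchSink` accumulates records and writes them together. The example below assumes the `singer_sdk.sinks` base classes; `_post_one` and `_bulk_load` are hypothetical placeholders for your own write logic.
```python
from singer_sdk.sinks import BatchSink, RecordSink


class SingletonEndpointSink(RecordSink):
    """Per record: write each record as soon as it arrives."""

    def process_record(self, record: dict, context: dict) -> None:
        # Hypothetical helper -- replace with a call to your API client.
        self._post_one(record)


class BulkLoadSink(BatchSink):
    """Per batch: let records accumulate, then write them in one operation."""

    def process_batch(self, context: dict) -> None:
        # The default process_record() appends incoming records to this list.
        records = context["records"]
        # Hypothetical helper -- e.g. stage a CSV and load it into the warehouse.
        self._bulk_load(records)
```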
h
Ahh makes sense thanks
a
👍
h
Writing an Airtable target currently 🙂
a
Oh cool! Do you know which camp that falls into? Or perhaps it supports both?
h
I think it's better suited for batch
a
Yeah, I see it does support batch... I was curious, so I wanted to check it out. What's weird/interesting is that it seems there are different API contracts depending on how you create the objects in the GUI interface. For a random object in my account, I found these docs. On the one hand, this particular endpoint doesn't accept more than 10 records per POST, but on the other hand it also doesn't allow more than 5 total requests per second. (50 records per second is still much better than only 5 records per second.) Your mileage may differ though, depending on how your objects are set up.
Because rate limiting may become an issue for you, I'll link this issue. We don't have formal rate limit handling yet, but you can use your own custom logic and/or post into this issue with ideas/proposals for improved central SDK-based handling: Formal handling of API rate limits (#140) · Issues · Meltano / Meltano SDK for Singer Taps and Targets · GitLab
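Until the SDK has built-in rate limit handling, a target can chunk and throttle its own requests. Below is a minimal sketch under a few assumptions: Airtable's usual `{"records": [{"fields": ...}]}` payload shape, Bearer-token auth, and an illustrative sleep; the URL and field mapping are placeholders rather than anything confirmed in this thread.
```python
import time

import requests


def post_in_chunks(records: list[dict], url: str, token: str) -> None:
    """POST records in chunks of 10, pausing to stay under ~5 requests/sec."""
    headers = {"Authorization": f"Bearer {token}"}
    for start in range(0, len(records), 10):  # the endpoint caps creates at 10 per request
        chunk = records[start:start + 10]
        payload = {"records": [{"fields": rec} for rec in chunk]}
        response = requests.post(url, json=payload, headers=headers)
        response.raise_for_status()
        time.sleep(0.25)  # crude throttle: roughly 4 requests per second
```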
h
@aaronsteers If I change `DEFAULT_BATCH_SIZE_ROWS`, will the default `process_rows` create batches of max 10 rows?
a
Sorry - you would think so... but actually I think you want `Sink.max_size`. I'll log an issue to clean up the ambiguity.
Setting `max_size = 10` should force `is_full` to report `True` whenever the sink reaches 10 records, which then causes the `process_batch()` method to be called.
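In sink code that would look roughly like the following (a sketch, assuming `max_size` can be overridden on the sink class as described above):
```python
from singer_sdk.sinks import BatchSink


class AirtableSink(BatchSink):
    # Once 10 records are buffered, is_full reports True and the SDK
    # calls process_batch() with those records in context["records"].
    max_size = 10
```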
h
Do I need to empty the `context["records"]` myself?
a
Nope. The context will be disposed of when you're done.
h
```
Uploaded 10 | success=True
Uploaded 20 | success=False
level=ERROR message={"error":{"type":"INVALID_RECORDS","message":"A maximum of 10 records can be created per request but you have provided 20."}}
Uploaded 28 | success=False
level=ERROR message={"error":{"type":"INVALID_RECORDS","message":"A maximum of 10 records can be created per request but you have provided 28."}}
```
Are you certain? Seems like the `records` count is going up
a
That would be a bug. 🐛 Can you go ahead and try resetting the 'records' entry and see if that resolves it?
h
Yup, adding `context["records"] = []` fixed it
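For reference, a sketch of that workaround inside `process_batch()`; `_post_records` is a hypothetical helper standing in for the actual Airtable POST logic:
```python
from singer_sdk.sinks import BatchSink


class AirtableSink(BatchSink):
    max_size = 10

    def process_batch(self, context: dict) -> None:
        records = context["records"]
        self._post_records(records)  # hypothetical helper that POSTs to Airtable
        # Workaround for the accumulation bug discussed above:
        # reset the buffer so the next batch starts empty.
        context["records"] = []
```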
a
Okay, thanks for the real-time feedback. I'm logging that as a bug and will fix in the next release.
Bugs notwithstanding, very cool to see the logs of records being posted! 🙂
Did you end up using a generic REST / requests library approach, or custom airtable library for auth and posting updates?
h
Just used requests for now
Got a little prototype working which is super cool
Barely wrote any code 😅
d
Just how we like it 😄
a
Nice! Looks like <100 lines of code. Maybe a new record? 🙂
s
Hey guys, I just found this thread about the target-airtable. I actually need one, too. How can I use @hassan_syyid's implementation to install it into a Python venv and use it together with a Singer tap?
h
You should be able to use it normally. Usually what I do is create a venv for the tap and one for the target. So you can do:
```
tap-quickbooks --config config.json --catalog catalog.json > data.txt
```
Then switch to the target's venv and run:
```
cat data.txt | target-airtable --config config.json
```