Hi All,
I was running a benchmark in Meltano (using the tap-rest-api-msdk and target-s3 plugins) and wanted to read the API response in 1024-byte chunks.
What I'm trying to achieve: even though the API returns a huge number of rows (>10k), I want to write the response to S3 in a streaming fashion (a small amount of data at a time) and release the memory as I go. That would reduce memory consumption. It will make the job run longer, but that's acceptable.
In my stream class, I have set stream = True:
def requests_session(self) -> requests.Session:
    # Lazily create a session with stream=True so response bodies
    # are not downloaded eagerly when a request is sent.
    if not self._requests_session or not self._requests_session.stream:
        self._requests_session = requests.Session()
        self._requests_session.stream = True
    return self._requests_session
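For what it's worth, `stream` is a real default attribute on `requests.Session` (it starts out `False`), so setting it on the session as above should make every request through that session stream its body. A minimal check of that behavior:

```python
import requests

# A requests.Session carries default request settings; `stream` is one
# of them and defaults to False. Setting it to True means responses made
# through this session do not download their bodies up front -- the body
# is fetched only as you iterate it (iter_content / iter_lines).
session = requests.Session()
default_stream = session.stream  # False out of the box
session.stream = True
```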
I'm reading the response in chunks with:
    for chunk in response.iter_content(chunk_size=1024):
        # yield records parsed from the JSON chunk
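One caveat with `iter_content(chunk_size=1024)` is that it splits the body at arbitrary byte boundaries, so a single JSON record can be cut across two chunks. If the API emits newline-delimited JSON, one approach is to buffer bytes until a full line is available before parsing. A sketch (the helper name `records_from_chunks` is mine, and the NDJSON body format is an assumption):

```python
import json
from typing import Iterable, Iterator


def records_from_chunks(chunks: Iterable[bytes]) -> Iterator[dict]:
    """Reassemble newline-delimited JSON records from raw byte chunks.

    A record may span two chunks, so buffer bytes until a complete
    line (one record) is available, then parse and yield it. Only the
    buffer is held in memory, never the whole response.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield json.loads(line)
    if buffer.strip():  # trailing record without a final newline
        yield json.loads(buffer)


# Simulate a body split at awkward boundaries, as iter_content might do.
body = b'{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'
chunks = [body[i:i + 10] for i in range(0, len(body), 10)]
records = list(records_from_chunks(chunks))
```

In the real stream, `chunks` would be `response.iter_content(chunk_size=1024)` instead of the simulated list.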
Do you have any reference for how to do this? Is the scenario described above achievable?
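If the body is line-delimited, `response.iter_lines()` may be simpler than `iter_content`, since it handles the chunk buffering for you (it only avoids loading the whole body when the request was made with `stream=True`). A sketch of a `parse_response`-style override, using a fake response object so it runs without a network call (the hook name mirrors the Singer SDK's `parse_response`, but treat the details as an assumption):

```python
import json


def parse_response(response):
    # Iterate the body line by line instead of calling response.json(),
    # so only one record at a time is held in memory. iter_lines() only
    # streams when the request was sent with stream=True.
    for line in response.iter_lines():
        if line:
            yield json.loads(line)


class _FakeResponse:
    """Stand-in for requests.Response, exposing only iter_lines()."""

    def __init__(self, body: bytes):
        self._body = body

    def iter_lines(self):
        return iter(self._body.splitlines())


rows = list(parse_response(_FakeResponse(b'{"id": 1}\n{"id": 2}\n')))
```

With a real `requests.Response` from a streaming session, the same generator would yield records as they arrive instead of after the full download.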