Meltano #getting-started

pedro_miguel_taboas

07/19/2023, 9:32 AM

Hello, everyone! I'm pretty new here, but for the past couple of weeks, I've been delving into Meltano and the SDK. At work, we're creating a tap, and while we've managed to make it perform its intended function, I've encountered a few issues and have some questions about our approach.

Our ultimate goal is to load data from an Excel file into Snowflake. The extractor calls an endpoint to retrieve an object with data about a series of hosted files. With this data, we then call another endpoint, using the most recent record for a specific file name (an Excel file).

Firstly, how would you go about it? Would you opt for two streams or just one?

My other question is this: if you were to implement a dynamic schema discovery function, how would you handle it? Currently, our working version employs only one stream, and schema discovery is handled by a dedicated schema function. This function retrieves data from the API and infers the schema from the response received (the transformed contents of the Excel file).

Lastly, my current issue: the schema discovery function works, but only when its result is cached. Otherwise, it gets 'stuck' and fails to complete the job. As a newcomer, my debugging skills are limited, and I'm finding it challenging to understand this behaviour (the caching was merely my intuition, without fully grasping what might be happening). Can someone kindly explain what could be causing this issue, or perhaps point me to where I could start debugging it? While it's currently working as expected, I'm a bit concerned about potential surprises down the road.

Thank you for all the sharing; it has already been immensely helpful to me. :) Have a great day!

visch

07/19/2023, 12:25 PM

Here are a few quick answers. Try this extractor for getting Excel data, or take inspiration from it: https://hub.meltano.com/extractors/tap-spreadsheets-anywhere To debug Meltano and see exactly which commands are being run, use the debug flag; see https://docs.meltano.com/reference/command-line-interface#debugging

user

07/19/2023, 1:43 PM

Also, here are some IDE debugging tips: https://sdk.meltano.com/en/latest/dev_guide.html#ide-tips

user

07/19/2023, 1:48 PM

Firstly, how would you go about it? Would you opt for two streams or just one?

It depends on whether you care to store the results of the first API call, the one that returns the object listing the files to request. It sounds like that first API call is only used to work out how to retrieve the actual data in the second call, so from what I hear I'd probably go with 1 stream that makes 2 requests. The data from the first request can be used to make the second request, and then it's safe to throw it away.
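
For illustration, here is a rough sketch of the "1 stream, 2 requests" idea in plain Python. It deliberately does not use the Meltano SDK classes: `LatestFileStream`, `list_files`, `fetch_rows`, and the `name` / `modified` / `id` fields are all hypothetical stand-ins for whatever the real API returns. It also caches the fetched rows and the inferred schema, on the assumption that the schema may be read more than once during a run; without caching, every read would repeat both requests.

```python
from __future__ import annotations

from functools import cached_property
from typing import Any, Callable, Iterator

# Hypothetical mapping from Python types to JSON Schema type names.
_JSON_TYPES = {bool: "boolean", int: "integer", float: "number", str: "string"}


def infer_schema(rows: list[dict[str, Any]]) -> dict[str, Any]:
    """Infer a flat JSON Schema; the first value seen for a key decides its type."""
    properties: dict[str, Any] = {}
    for row in rows:
        for key, value in row.items():
            json_type = _JSON_TYPES.get(type(value), "string")
            properties.setdefault(key, {"type": [json_type, "null"]})
    return {"type": "object", "properties": properties}


class LatestFileStream:
    """One logical stream that makes two requests (not the Meltano SDK API)."""

    def __init__(
        self,
        list_files: Callable[[], list[dict[str, Any]]],
        fetch_rows: Callable[[str], list[dict[str, Any]]],
        file_name: str,
    ) -> None:
        self._list_files = list_files  # request 1: metadata about hosted files
        self._fetch_rows = fetch_rows  # request 2: parsed contents of one file
        self._file_name = file_name

    @cached_property
    def _rows(self) -> list[dict[str, Any]]:
        # Request 1 is only used to find the newest matching file, then discarded.
        candidates = [f for f in self._list_files() if f["name"] == self._file_name]
        if not candidates:
            raise LookupError(f"no file named {self._file_name!r}")
        latest = max(candidates, key=lambda f: f["modified"])
        # Request 2 fetches the actual data.
        return self._fetch_rows(latest["id"])

    @cached_property
    def schema(self) -> dict[str, Any]:
        # Computed once; later reads reuse the cached result.
        return infer_schema(self._rows)

    def get_records(self) -> Iterator[dict[str, Any]]:
        yield from self._rows
```

In a real tap the two callables would wrap the HTTP calls, and the same shape (fetch once, reuse for both schema and records) would need to be adapted to the SDK's own stream class.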
