Hi I am new to meltano For now I have built a custom tap to Meltano #singer-tap-development

Prithiv Vijai

06/13/2024, 3:59 PM

Hi, I am new to Meltano. I have built a custom tap to extract data from an API. Is there any way to fetch all the data from the API without specifying a schema, like we would in a normal Python script? I don't know the schema or the fields of the API I am fetching from.

Reuben (Matatika)

06/13/2024, 4:27 PM

You probably want to set up dynamic discovery for your tap. Have a look at this code sample from the SDK docs: https://sdk.meltano.com/en/latest/code_samples.html#dynamically-discovering-schema-for-a-stream

Reuben (Matatika)

06/13/2024, 4:30 PM

This is how we do it in `tap-google-sheets`: https://github.com/Matatika/tap-google-sheets/blob/55162e5d40e3bba54029604e7ede5934438ba200/tap_google_sheets/tap.py#L94-L124

Prithiv Vijai

06/13/2024, 6:14 PM

Hi @Reuben (Matatika), thanks for replying, but I am unable to incorporate this logic for my use case and need help. Below is my custom tap, which I built using the cookiecutter template from GitLab. I tried many suggestions from ChatGPT, but nothing worked.

my_custom_tap.txt

Reuben (Matatika)

06/13/2024, 6:21 PM

What happens/what are you expecting to happen?

Reuben (Matatika)

06/13/2024, 6:23 PM

I can see you've defined a bunch of field names, but I assume you want that to come from some other API call response.

Prithiv Vijai

06/13/2024, 6:23 PM

This code works fine, but other fields in my API response are ignored because I have not specified them in my schema, and I want them too.

Prithiv Vijai

06/13/2024, 6:24 PM

What I want is to fetch everything (all the fields) from the API without specifying the schema.

Reuben (Matatika)

06/13/2024, 6:25 PM

Can you provide a sample response so I can see what level the fields you want to pull out are at?

Prithiv Vijai

06/13/2024, 6:26 PM

See, this is the log when I run `poetry run tap-flexsod > out.json`.

Prithiv Vijai

06/13/2024, 6:30 PM

The output is stored, but it only contains the fields I specified in the schema. I want to fetch the data from every field in my API, because I don't know the field names beforehand.

Reuben (Matatika)

06/13/2024, 6:43 PM

You will have to make the request to `/refService/getSubtypes` and dynamically build up `SubtypesStream` in `discover_streams`.

Prithiv Vijai

06/13/2024, 7:01 PM

Could you please explain more on that or provide me with some example ?

Reuben (Matatika)

06/13/2024, 7:31 PM

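```python
from typing import List

from singer_sdk import Stream, Tap
from singer_sdk import typing as th


class Tapflexsod(Tap):
    """flexsod tap class."""
    name = "tap-flexsod"

    def discover_streams(self) -> List[Stream]:
        """Return a list of discovered streams."""
        subtypes_stream = SubtypesStream(tap=self, schema={})  # your existing stream class
        # get_records takes a context argument; None means no partition context.
        subtypes_records = subtypes_stream.get_records(None)
        # Use the keys of the first record as field names, all typed as strings.
        first_record = next(iter(subtypes_records))
        subtypes_stream.schema = th.PropertiesList(
            *[th.Property(f, th.StringType) for f in first_record]
        ).to_dict()

        return [subtypes_stream]
```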

Of course, this will make two requests to `/refService/getSubtypes`, so at that point you might as well stub out `get_records` to return the records you already have.

```python
    # Inside the Tapflexsod class:
    def discover_streams(self) -> List[Stream]:
        """Return a list of discovered streams."""
        subtypes_stream = SubtypesStream(tap=self, schema={})
        # Fetch once and keep the records so the sync does not repeat the request.
        subtypes_records = list(subtypes_stream.get_records(None))
        subtypes_stream.schema = th.PropertiesList(
            *[th.Property(f, th.StringType) for f in subtypes_records[0]]
        ).to_dict()
        # Stub get_records on this instance; the argument is the unused context.
        subtypes_stream.get_records = lambda _: iter(subtypes_records)

        return [subtypes_stream]
```
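
The snippets above type every field as a string and take the field names from the first record only. If some records carry extra keys or non-string values, a slightly more careful inference might look like the sketch below. It is plain Python with no SDK calls; `json_type_for` and `infer_schema` are hypothetical helper names and the sample records are made up for illustration. The resulting dict is ordinary JSON Schema, so it could presumably be assigned to `subtypes_stream.schema` in place of the `th.PropertiesList(...)` call.

```python
from typing import Any, Dict, Iterable, List


def json_type_for(value: Any) -> str:
    """Map a Python value to a JSON Schema type name (hypothetical helper)."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        return "array"
    # Fall back to string for str and anything unrecognised.
    return "string"


def infer_schema(records: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a JSON Schema dict covering every key seen in the sample records."""
    types_by_field: Dict[str, List[str]] = {}
    for record in records:
        for field, value in record.items():
            seen = types_by_field.setdefault(field, [])
            if value is None:
                continue  # null is allowed on every field below
            type_name = json_type_for(value)
            if type_name not in seen:
                seen.append(type_name)

    properties = {}
    for field, seen in types_by_field.items():
        # Fields that were only ever null fall back to string.
        properties[field] = {"type": (seen or ["string"]) + ["null"]}
    return {"type": "object", "properties": properties}


# Made-up sample records standing in for the API response.
sample = [
    {"id": 1, "name": "a", "active": True},
    {"id": 2, "name": None, "score": 0.5},
]
schema = infer_schema(sample)
```

This still only reflects the records that were sampled, so a field that first appears later in the sync would be missing from the schema.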

Prithiv Vijai

06/14/2024, 2:41 PM

Thank you @Reuben (Matatika), it's working.

👍 1
