# singer-tap-development
p
Hi, I am new to Meltano. I have built a custom tap to extract data from an API. Is there any way to fetch all the data from the API without specifying a schema, as we would in a normal Python script? I don't know the schema or fields of the API I am fetching data from.
r
You probably want to set up dynamic discovery for your tap. Have a look at this code sample from the SDK docs: https://sdk.meltano.com/en/latest/code_samples.html#dynamically-discovering-schema-for-a-stream
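As a rough illustration of what dynamic discovery amounts to, here is a minimal sketch that derives a JSON Schema from one sample record. It is plain Python and does not use the SDK; the names `json_type`, `infer_schema`, and the sample data are made up for this example, and a real tap would take the sample from an actual API response.

```python
from typing import Any


def json_type(value: Any) -> list[str]:
    """Map a Python value to a nullable JSON Schema type list."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return ["boolean", "null"]
    if isinstance(value, int):
        return ["integer", "null"]
    if isinstance(value, float):
        return ["number", "null"]
    if isinstance(value, dict):
        return ["object", "null"]
    if isinstance(value, list):
        return ["array", "null"]
    # Strings, None, and anything unrecognised fall back to string.
    return ["string", "null"]


def infer_schema(sample: dict[str, Any]) -> dict[str, Any]:
    """Build a flat JSON Schema from the keys and values of one record."""
    return {
        "type": "object",
        "properties": {key: {"type": json_type(val)} for key, val in sample.items()},
    }


# Hypothetical sample record, standing in for one item of an API response.
sample_record = {"id": 7, "name": "alpha", "active": True, "score": 1.5}
schema = infer_schema(sample_record)
```

The resulting dict could then be passed as the `schema` of a stream, assuming the sample record is representative of the rest of the data.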
p
Hi @Reuben (Matatika), thanks for replying, but I am unable to incorporate this logic for my use case and need help. Below is my custom tap, which I built using the cookiecutter template from GitLab. I tried many suggestions from ChatGPT, but nothing worked.
r
What happens/what are you expecting to happen?
I can see you've defined a bunch of field names, but I assume you want that to come from some other API call response.
p
This code works fine, but other fields in my API response are ignored because I have not specified them in my schema, and I want them too.
What I want is to fetch everything (all the fields) from the API without specifying the schema.
r
Can you provide a sample response so I can see what level the fields you want to pull out are at?
p
This is the log when I run "poetry run tap-flexsod > out.json".
The output is stored, but it only contains the fields I specified in the schema. I want to fetch the data from every field in my API, because I don't know the field names in advance.
r
You will have to make the request to /refService/getSubtypes and dynamically build up SubtypesStream in discover_streams.
p
Could you please explain that in more detail or provide an example?
r
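from typing import List

from singer_sdk import Stream, Tap
from singer_sdk import typing as th

# SubtypesStream is the tap's existing stream class; import it from wherever it is defined.


class Tapflexsod(Tap):
    """flexsod tap class."""

    name = "tap-flexsod"

    def discover_streams(self) -> List[Stream]:
        """Return a list of discovered streams."""
        subtypes_stream = SubtypesStream(tap=self, schema={})
        # get_records takes a context argument; None means no partition context.
        subtypes_records = subtypes_stream.get_records(None)
        # Use the keys of the first record as field names, all typed as strings.
        subtypes_stream.schema = th.PropertiesList(
            *[th.Property(f, th.StringType) for f in next(subtypes_records)]
        ).to_dict()

        return [subtypes_stream]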
Of course, this will make two requests to /refService/getSubtypes, so at that point you might as well stub out get_records to return the records you already have.
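def discover_streams(self) -> List[Stream]:
    """Return a list of discovered streams."""
    subtypes_stream = SubtypesStream(tap=self, schema={})
    # Fetch all records once, during discovery (None means no partition context).
    subtypes_records = list(subtypes_stream.get_records(None))
    # Use the keys of the first record as field names, all typed as strings.
    subtypes_stream.schema = th.PropertiesList(
        *[th.Property(f, th.StringType) for f in subtypes_records[0]]
    ).to_dict()
    # Stub out get_records so the sync reuses the records fetched above.
    subtypes_stream.get_records = lambda _: iter(subtypes_records)

    return [subtypes_stream]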
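One caveat with taking field names from the first record only: if later records carry extra keys, those keys never reach the schema. A minimal sketch of one way around that, assuming all records are already in memory as in the snippet above, is to take the union of keys across every record. The function name `schema_from_records` and the sample records are hypothetical, and the code is plain Python rather than SDK code.

```python
from typing import Any, Iterable


def schema_from_records(records: Iterable[dict[str, Any]]) -> dict[str, Any]:
    """Build a JSON Schema whose properties are the union of all record keys."""
    properties: dict[str, Any] = {}
    for record in records:
        for key in record:
            # Keep the first-seen order; type every field as a nullable string.
            properties.setdefault(key, {"type": ["string", "null"]})
    return {"type": "object", "properties": properties}


# Hypothetical records: the second one has a key the first one lacks.
records = [{"id": "1", "name": "alpha"}, {"id": "2", "label": "beta"}]
schema = schema_from_records(records)
print(list(schema["properties"]))  # ['id', 'name', 'label']
```

Typing every field as a nullable string mirrors the all-strings approach above and lets a record omit a field without failing validation.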
p
Thank you @Reuben (Matatika), it's working.