anirudh goverdhana
03/19/2024, 11:52 AM
Reuben (Matatika)
03/19/2024, 12:39 PM
meltano.yml config and the error you are seeing (logs)?
Reuben (Matatika)
03/19/2024, 2:21 PM
plugins:
  extractors:
  - name: tap-rest-api-msdk
    variant: widen
    pip_url: tap-rest-api-msdk
    config:
      api_url: https://dummy.restapiexample.com/api/v1
      headers:
        User-Agent: meltano
      streams:
      - name: employees
        path: /employees
        records_path: $.data[*]
        primary_keys: [id]
It was necessary to set User-Agent to get a successful response.
anirudh goverdhana
03/19/2024, 2:23 PM
anirudh goverdhana
03/20/2024, 9:47 AM
Reuben (Matatika)
03/20/2024, 10:28 AM
Reuben (Matatika)
03/20/2024, 10:29 AM
anirudh goverdhana
03/20/2024, 10:34 AM
Reuben (Matatika)
03/20/2024, 10:39 AM
2024-03-20T10:36:20.611557Z [info ] Environment 'dev' is active
2024-03-20 10:36:21,575 | INFO | tap-rest-api-msdk | No schema found. Inferring schema from API call.
2024-03-20 10:36:22,209 | INFO | singer_sdk.helpers.jsonpath | JSONPath matches: 24
{"type": "STATE", "value": {}}
2024-03-20 10:36:22,211 | INFO | tap-rest-api-msdk | Beginning full_table sync of 'employees'...
2024-03-20 10:36:22,211 | INFO | tap-rest-api-msdk | Tap has custom mapper. Using 1 provided map(s).
{"type": "SCHEMA", "stream": "employees", "schema": {"properties": {"id": {"type": "integer"}, "employee_name": {"type": "string"}, "employee_salary": {"type": "integer"}, "employee_age": {"type": "integer"}, "profile_image": {"type": "string"}}, "type": "object", "required": ["employee_age", "employee_name", "employee_salary", "id", "profile_image"]}, "key_properties": ["id"]}
2024-03-20 10:36:22,212 | INFO | tap-rest-api-msdk | the next_page_token_jsonpath = $.next_page.
2024-03-20 10:36:22,780 | INFO | singer_sdk.metrics | METRIC: {"type": "timer", "metric": "http_request_duration", "value": 0.566765, "tags": {"stream": "employees", "endpoint": "/employees", "http_status_code": 429, "status": "failed"}}
2024-03-20 10:36:22,780 | INFO | backoff | Backing off _request(...) for 2.9s (singer_sdk.exceptions.RetriableAPIError: 429 Client Error: Too Many Requests for path: /api/v1/employees)
2024-03-20 10:36:22,780 | ERROR | root | Backing off 2.94 seconds after 1 tries calling function <bound method RESTStream._request of <tap_rest_api_msdk.streams.DynamicStream object at 0x73a01cb24430>> with args (<PreparedRequest [GET]>, None) and kwargs {}
2024-03-20 10:36:26,002 | INFO | singer_sdk.metrics | METRIC: {"type": "timer", "metric": "http_request_duration", "value": 0.271849, "tags": {"stream": "employees", "endpoint": "/employees", "http_status_code": 429, "status": "failed"}}
2024-03-20 10:36:26,002 | INFO | backoff | Backing off _request(...) for 4.5s (singer_sdk.exceptions.RetriableAPIError: 429 Client Error: Too Many Requests for path: /api/v1/employees)
2024-03-20 10:36:26,002 | ERROR | root | Backing off 4.51 seconds after 2 tries calling function <bound method RESTStream._request of <tap_rest_api_msdk.streams.DynamicStream object at 0x73a01cb24430>> with args (<PreparedRequest [GET]>, None) and kwargs {}
2024-03-20 10:36:30,783 | INFO | singer_sdk.metrics | METRIC: {"type": "timer", "metric": "http_request_duration", "value": 0.263652, "tags": {"stream": "employees", "endpoint": "/employees", "http_status_code": 429, "status": "failed"}}
2024-03-20 10:36:30,783 | INFO | backoff | Backing off _request(...) for 8.2s (singer_sdk.exceptions.RetriableAPIError: 429 Client Error: Too Many Requests for path: /api/v1/employees)
2024-03-20 10:36:30,783 | ERROR | root | Backing off 8.18 seconds after 3 tries calling function <bound method RESTStream._request of <tap_rest_api_msdk.streams.DynamicStream object at 0x73a01cb24430>> with args (<PreparedRequest [GET]>, None) and kwargs {}
2024-03-20 10:36:39,577 | INFO | singer_sdk.metrics | METRIC: {"type": "timer", "metric": "http_request_duration", "value": 0.598462, "tags": {"stream": "employees", "endpoint": "/employees", "http_status_code": 429, "status": "failed"}}
2024-03-20 10:36:39,577 | INFO | backoff | Backing off _request(...) for 16.1s (singer_sdk.exceptions.RetriableAPIError: 429 Client Error: Too Many Requests for path: /api/v1/employees)
2024-03-20 10:36:39,577 | ERROR | root | Backing off 16.07 seconds after 4 tries calling function <bound method RESTStream._request of <tap_rest_api_msdk.streams.DynamicStream object at 0x73a01cb24430>> with args (<PreparedRequest [GET]>, None) and kwargs {}
2024-03-20 10:36:56,230 | INFO | singer_sdk.metrics | METRIC: {"type": "timer", "metric": "http_request_duration", "value": 0.564422, "tags": {"stream": "employees", "endpoint": "/employees", "http_status_code": 429, "status": "failed"}}
2024-03-20 10:36:56,230 | ERROR | backoff | Giving up _request(...) after 5 tries (singer_sdk.exceptions.RetriableAPIError: 429 Client Error: Too Many Requests for path: /api/v1/employees)
2024-03-20 10:36:56,230 | INFO | singer_sdk.metrics | METRIC: {"type": "counter", "metric": "http_request_count", "value": 0, "tags": {"stream": "employees", "endpoint": "/employees"}}
2024-03-20 10:36:56,230 | INFO | singer_sdk.metrics | METRIC: {"type": "timer", "metric": "sync_duration", "value": 34.01871681213379, "tags": {"stream": "employees", "context": {}, "status": "failed"}}
2024-03-20 10:36:56,230 | INFO | singer_sdk.metrics | METRIC: {"type": "counter", "metric": "record_count", "value": 0, "tags": {"stream": "employees", "context": {}}}
2024-03-20 10:36:56,230 | ERROR | tap-rest-api-msdk | An unhandled error occurred while syncing 'employees'
Traceback (most recent call last):
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/streams/core.py", line 1187, in sync
for _ in self._sync_records(context=context):
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/streams/core.py", line 1081, in _sync_records
for record_result in self.get_records(current_context):
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/streams/rest.py", line 574, in get_records
for record in self.request_records(context):
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/streams/rest.py", line 395, in request_records
resp = decorated_request(prepared_request, context)
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/backoff/_sync.py", line 105, in retry
ret = target(*args, **kwargs)
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/streams/rest.py", line 274, in _request
self.validate_response(response)
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/streams/rest.py", line 185, in validate_response
raise RetriableAPIError(msg, response)
singer_sdk.exceptions.RetriableAPIError: 429 Client Error: Too Many Requests for path: /api/v1/employees
Traceback (most recent call last):
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/bin/tap-rest-api-msdk", line 8, in <module>
sys.exit(TapRestApiMsdk.cli())
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/tap_base.py", line 501, in invoke
tap.sync_all()
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/tap_base.py", line 460, in sync_all
stream.sync()
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/streams/core.py", line 1194, in sync
raise ex
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/streams/core.py", line 1187, in sync
for _ in self._sync_records(context=context):
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/streams/core.py", line 1081, in _sync_records
for record_result in self.get_records(current_context):
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/streams/rest.py", line 574, in get_records
for record in self.request_records(context):
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/streams/rest.py", line 395, in request_records
resp = decorated_request(prepared_request, context)
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/backoff/_sync.py", line 105, in retry
ret = target(*args, **kwargs)
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/streams/rest.py", line 274, in _request
self.validate_response(response)
File "/tmp/p/.meltano/extractors/tap-rest-api-msdk/venv/lib/python3.8/site-packages/singer_sdk/streams/rest.py", line 185, in validate_response
raise RetriableAPIError(msg, response)
singer_sdk.exceptions.RetriableAPIError: 429 Client Error: Too Many Requests for path: /api/v1/employees
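The retry waits in the log above (2.9s, 4.5s, 8.2s, 16.1s, then giving up after 5 tries) are consistent with a doubling schedule of 2, 4, 8 and 16 seconds plus up to about a second of random jitter. A minimal standard-library sketch of such a schedule is below. `backoff_delays` and its parameters are hypothetical names chosen for illustration, and this is not the SDK's own implementation.

```python
import random


def backoff_delays(max_tries=5, factor=2.0, base=2.0, jitter=1.0, rng=random.random):
    """Yield the wait (in seconds) before each retry.

    Each wait is factor * base**n plus up to `jitter` seconds of random noise.
    max_tries counts the first attempt too, so there are max_tries - 1 waits.
    """
    for n in range(max_tries - 1):
        yield factor * base ** n + jitter * rng()


# Jitter disabled so the schedule is deterministic.
delays = list(backoff_delays(rng=lambda: 0.0))
print(delays)  # [2.0, 4.0, 8.0, 16.0]
```

With jitter enabled, each wait lands somewhere in the second above its base value, which matches the shape of the logged waits.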
It makes the first request to infer the schema fine, finds 24 records, and then I immediately get rate-limited. I suggest you find a different free public API: https://github.com/public-apis/public-apis
Reuben (Matatika)
03/20/2024, 10:42 AM
"Error Occured! Page Not found" error because your api_url is https://dummy.restapiexample.com/api/v1/employees and the employees stream path is /employees, so the tap is trying to make a request to https://dummy.restapiexample.com/api/v1/employees/employees
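To make the api_url and path interaction concrete, here is a small sketch that assumes the request URL is formed by plain concatenation of the two values, as described above. `build_url` is a hypothetical helper for illustration, not a function from the tap.

```python
def build_url(api_url: str, path: str) -> str:
    """Join the base URL and the stream path by simple concatenation (assumed behaviour)."""
    return api_url + path


# api_url already contains the resource name, so it ends up duplicated.
wrong = build_url("https://dummy.restapiexample.com/api/v1/employees", "/employees")

# api_url stops at the API root and the stream path supplies the resource.
right = build_url("https://dummy.restapiexample.com/api/v1", "/employees")

print(wrong)  # https://dummy.restapiexample.com/api/v1/employees/employees
print(right)  # https://dummy.restapiexample.com/api/v1/employees
```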
Akshay Hangloo
05/08/2024, 11:34 PM
backoff_time_extension: 18
Reuben (Matatika)
05/09/2024, 12:27 AM
backoff_type (message or header) so that the tap actually applies your backoff_time_extension configuration, rather than falling back to the default SDK behaviour: https://github.com/Widen/tap-rest-api-msdk/blob/761f4bbf463cef95a836dc1b567c8305eba8083d/tap_rest_api_msdk/streams.py#L267-L273
As for whether or not you should build a custom tap - that's very much dependent on your use case. I've always thought of tap-rest-api-msdk as a great prototyping tool, but the configuration is bound to be more complicated than with a specific tap, which can abstract away a lot of that logic. If you're planning to use this in a production environment eventually, I would at least have a look at creating a custom tap once you have a POC working. From a FOSS perspective, it's also worth considering whether there is (or could be) interest from users in integrating the source with the wider Singer ecosystem.
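On the backoff_type point above, the general idea behind header-based backoff with an extension can be sketched as follows. This is an illustration under assumptions, not the tap's code: it assumes the API reports the wait in a Retry-After header as a number of seconds, and that the extension is simply added on top. `wait_seconds` and its `default` fallback are hypothetical.

```python
def wait_seconds(headers, extension=0.0, param="Retry-After", default=60.0):
    """Return how long to wait before retrying a rate-limited request.

    Reads the wait from the `param` response header (assumed to hold seconds)
    and adds `extension`. Falls back to `default` when the header is missing
    or is not a plain number (for example, an HTTP date).
    """
    raw = headers.get(param)
    try:
        base = float(raw)
    except (TypeError, ValueError):
        base = default
    return base + extension


print(wait_seconds({"Retry-After": "12"}, extension=18))  # 30.0
```

A dedicated custom tap could hard-code this kind of logic for one API, which is part of what makes its configuration simpler than the generic tap's.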