haleemur_ali
02/09/2024, 9:14 PM
start_date to yesterday on the default environment
We've run into a situation where a new developer on a project added a stream, but because start_date was set to a day in February 2021, the initial sync used up the API quota on the affected SaaS tool. To minimize this risk, I was hoping to set start_date on the dev environment dynamically to yesterday.
Edgar Ramírez (Arch.dev)
02/09/2024, 9:29 PM
Matt Menzenski
02/10/2024, 2:49 AM
haleemur_ali
02/11/2024, 10:44 PM
haleemur_ali
02/12/2024, 1:46 PM
export MELTANO_DEV_START_DATE=$(date -d '-2 day' '+%Y-%m-%dT00:00:00Z')
meltano "$@"
exit $?
In docker-compose, the entrypoint is set to the script above.
Then, for each tap, its dev environment sets config.start_date to $MELTANO_DEV_START_DATE.
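One caveat with the wrapper: `date -d` with a relative offset is a GNU extension, so the same line fails on macOS/BSD if someone runs the script outside the container. Below is a minimal sketch of a fallback, not something we run today. The function name `dev_start_date` and its day-offset argument are made up for illustration, and it uses `-u` so the trailing `Z` really means UTC (the original line uses local time).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print midnight UTC N days ago (default: 1 day).
dev_start_date() {
  local days="${1:-1}"
  # GNU date (typical Linux images) accepts -d with a relative offset.
  if date -u -d "-${days} day" '+%Y-%m-%dT00:00:00Z' 2>/dev/null; then
    return 0
  fi
  # BSD/macOS date uses -v for relative adjustments instead.
  date -u -v "-${days}d" '+%Y-%m-%dT00:00:00Z'
}

# Same two-day offset as the wrapper script.
export MELTANO_DEV_START_DATE="$(dev_start_date 2)"
```

The rest of the wrapper (`meltano "$@"`) would stay as it is.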
In our team, we have a Docker-based development workflow, so it's safe to assume that all devs will test their changes locally using
docker compose run meltano run <tap-name> target-jsonl
There are still a couple of traps folks might fall into.
1. The tap was run locally a few months ago, so Meltano resumes from the bookmark in the stored state instead of the configured start_date. It's possible to ignore the state by specifying --full-refresh, but it feels like a big risk to expect folks to remember to do this consistently.
2. The tap does not yet have a dev environment that sets config.start_date to $MELTANO_DEV_START_DATE (e.g. when a new tap is being introduced and the changes exist only on the developer's local machine). While this isn't a risk for the data engineering pipelines, since the tap hasn't been deployed to production yet, it is still worthwhile to prevent quota exhaustion, as these quotas can be shared company-wide resources.
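Both traps could in principle be closed in the entrypoint wrapper itself, so nobody has to remember anything. This is a rough sketch, not tested against our setup. It assumes Meltano's `<PLUGIN>_<SETTING>` environment-variable convention (e.g. `TAP_GITLAB_START_DATE`) takes precedence over meltano.yml for a tap's start_date, and that `run` is the first argument passed to the wrapper. The function names and the `DRY_RUN` switch are invented for the example.

```shell
#!/usr/bin/env bash
# Sketch of a dev-only entrypoint; the helper names are illustrative, not Meltano APIs.

# tap-foo-bar -> TAP_FOO_BAR_START_DATE (assumed Meltano env-var naming convention).
start_date_var() {
  printf '%s_START_DATE\n' "$(printf '%s' "$1" | tr 'a-z-' 'A-Z_')"
}

# Export a start_date override for every tap-* argument, so a tap that has
# no dev environment block yet still gets the recent start date (trap 2).
export_tap_overrides() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      tap-*) export "$(start_date_var "$arg")=${MELTANO_DEV_START_DATE}" ;;
    esac
  done
}

dev_meltano() {
  if [ "${1:-}" = "run" ]; then
    shift
    export_tap_overrides "$@"
    # Always ignore stored state in dev rather than relying on memory (trap 1).
    set -- run --full-refresh "$@"
  fi
  if [ -n "${DRY_RUN:-}" ]; then
    # Print the command instead of running it, for checking the rewrite.
    echo "meltano $*"
  else
    meltano "$@"
  fi
}
```

The wrapper would then end with `dev_meltano "$@"` instead of `meltano "$@"`. With `DRY_RUN=1`, `dev_meltano run tap-foo target-jsonl` only prints the command it would run. Invocations that put global options before `run` are passed through unchanged, so this only covers the `docker compose run meltano run ...` form.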