nitin_kharya
08/19/2021, 11:33 AM

thomas_schmidt
08/19/2021, 12:18 PM

nitin_kharya
08/19/2021, 12:29 PM

thomas_schmidt
08/19/2021, 2:36 PM

In our `.gitlab-ci.yml` file we take the following approach:
• On each merge request and on each merge into master, we build an image for Meltano runs (including the Meltano project + setup AND the Cloud SQL proxy installed)
• We have scheduled jobs defined which basically use the latest image, connect to the CloudSQL instance using the proxy, and then run the ELT job
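The image build job itself isn't shown below; a minimal sketch of what it could look like in the same `.gitlab-ci.yml` (the job name, `build` stage, and Docker-in-Docker setup are assumptions, not from the original):

```yaml
# Hypothetical build job: pushes an image tagged with the commit SHA,
# which the EL jobs then reference as ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}
build-image:
  stage: build
  image: docker:latest
  services:
    - docker:dind
  script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
    - docker build -t ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA} .
    - docker push ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}
  only:
    - merge_requests
    - master
```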
```yaml
# Generic setup for EL jobs
.el-setup:
  stage: el
  image:
    name: ${CI_REGISTRY_IMAGE}:${CI_COMMIT_SHA}
    entrypoint: [""]
  before_script:
    - cp -Rn /project/. .
    - echo $SA_FILE > ${GOOGLE_APPLICATION_CREDENTIALS}
    - ./cloud_sql_proxy -instances="<instance-name>"=tcp:0.0.0.0:1234 > cloudsql.log 2>&1 &
  artifacts:
    paths:
      - cloudsql.log
    expire_in: 1 week
  only:
    - schedules

# Single EL job
el-gitlab:
  extends: .el-setup
  script:
    - export TARGET_BIGQUERY_DATASET_ID=meltano_warehouse
    - meltano elt tap-gitlab target-bigquery --job_id=gitlab-bigquery
  retry: 2
```
Note: We inject our service account credentials through GitLab CI/CD environment variables (`echo $SA_FILE > ${GOOGLE_APPLICATION_CREDENTIALS}`), which then allows the proxy to connect to CloudSQL.
The database URI is then set as an environment variable: `MELTANO_DATABASE_URI=postgresql://<username>:<password>@<host>:<port>/<database>`
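As a sketch of how that variable could be wired into the same `.gitlab-ci.yml` (the `DB_USER`/`DB_PASSWORD`/`DB_NAME` names are assumptions; host and port match the proxy's `tcp:0.0.0.0:1234` mapping above):

```yaml
# Hypothetical: point Meltano's system database at the local Cloud SQL proxy.
# DB_USER, DB_PASSWORD, and DB_NAME are assumed to be masked CI/CD variables.
variables:
  MELTANO_DATABASE_URI: "postgresql://${DB_USER}:${DB_PASSWORD}@127.0.0.1:1234/${DB_NAME}"
```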
Let me know whether this helps you or whether you have additional questions!

nitin_kharya
08/25/2021, 2:10 PM