par_degerman
06/09/2023, 8:27 PM
plugins:
  extractors:
  - name: source-db
    inherit_from: tap-mysql
    variant: transferwise
    pip_url: pipelinewise-tap-mysql
    config:
      host: localhost
      user: meltano
      engine: mariadb
      use_gtid: true
    select:
    - "*-Orders.*"
    - "*-Payments.*"
    - "!*-Payments.CreditCardNumber"
    - "*-Customers.*"
    - "!*-Customers.owner*"
    - "!*-Customers.secondOwner*"
    - "*-EventDataKey.*"
    - "*-Export.*"
    - "*-Note.*"
    - "*-LogItem.*"
    - "!*-LogItem.data"
    metadata:
      "*":
        replication-method: LOG_BASED
The problem now is that the de-selected columns show up from time to time in the Snowflake warehouse. The pipelines are all orchestrated by Airflow, and the de-selected columns show up only for some of the databases (seemingly at random) and only in certain syncs.
So two questions:
A. Is the above not the correct way to select "all but a few" columns out of a large number of databases?
B. Is there a better way? Perhaps using a stream mapping to transform away the unwanted columns?
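For B, something like the following is roughly what I have in mind. It is only a sketch, assuming the meltano-map-transformer mapper and the Singer SDK stream-map convention that mapping a property to `__NULL__` drops it from the stream. The mapping name `drop-sensitive-columns` and the schema prefix `shop1` are hypothetical placeholders. I am also not sure whether wildcards are accepted in stream names here, so the sketch spells out one stream explicitly; with many databases that list would get long.

```yaml
plugins:
  mappers:
  - name: meltano-map-transformer
    variant: meltano
    pip_url: meltano-map-transform
    mappings:
    - name: drop-sensitive-columns   # hypothetical mapping name
      config:
        stream_maps:
          # Stream names follow the tap's <schema>-<table> pattern;
          # "shop1" is a made-up schema name for illustration.
          shop1-Payments:
            CreditCardNumber: __NULL__   # assumed: __NULL__ removes the property
          shop1-LogItem:
            data: __NULL__
```

The mapping would then be placed between the extractor and the loader when running the pipeline, with the same placeholder names as above.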