Anyone have thoughts (or better, guidance!) on def...
# best-practices
t
Anyone have thoughts (or better, guidance!) on defining indexes on tables in DBMS targets? We're pushing data from MySQL to Postgres, then using dbt to transform the data in PG. For the sake of performance our models are incremental, which requires reading rows from the tables meltano populates using a
_sdc_batched_at > ...
clause; to get reasonable performance at scale that column needs to be indexed. I can add manually indexes easily enough but it feels wrong to be making changes to tables that the meltano pipeline is supposed to be maintaining. 🤔 I've found the
schema_mapping
option within the pipelinewise Postgres target, which seems like a possible solution in our particular case but isn't a general answer. I also don't want to have to enumerate every table in our schema in the meltano.yml. 🙄 Arguably the target should create an index on that particular column automatically (which is does for the _sdc_deleted_at column) but it feels like there should be a more general solution out there too... this isn't the only column anyone will ever need to index, after all. Anyway... any thoughts would be appreciated because as of now all I have is a short list of options I don't like. 😉