Hi all, I have a general question about CDC technology. I'm using `tap-mysql` (PipelineWise variant) and `target-postgres` (MeltanoLabs variant). The ELT process is based on CDC using the binlog. Currently, the process only takes the primary key into account. Is it possible to specify which columns should be indexed when the process starts?

If not, I'd like to know how the ELT process behaves if an index is created on an existing column in PostgreSQL after the initial load. Will this break the CDC process? Could it cause the target to get out of sync with the binlog? I'm running MySQL 5.7 and PostgreSQL 13. I was considering creating a concurrent index on the `updated_at` column. What do you think? I'd appreciate any opinions or strategies.

My tap and target configs:
plugins:
  extractors:
  - name: tap-mysql
    variant: transferwise
    pip_url: git+https://github.com/edgarrmondragon/pipelinewise-tap-mysql.git@patch-1
    config:
      database: ***
      engine: mysql
      session_sqls:
      - SET @@session.time_zone='+0:00'
      - SET @@session.wait_timeout=86400
      - SET @@session.net_read_timeout=86400
      - SET @@session.innodb_lock_wait_timeout=3600
    select:
    - schema-table.*
    metadata:
      '*':
        replication-method: LOG_BASED

  loaders:
  - name: target-postgres
    variant: meltanolabs
    pip_url: meltanolabs-target-postgres
    config:
      batch_size_rows: 50000
      hard_delete: true
      load_method: upsert
      use_copy: true
      validate_records: true
      sanitize_null_text_characters: true
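
For reference, this is roughly the index step I have in mind, run by hand after the load. It is only a sketch under assumptions: the schema and table names (`public`, `orders`) and the `idx_<table>_<column>` naming are placeholders I made up, and `conn` is assumed to be a psycopg2 connection to the PostgreSQL target. The one thing I am sure of is that `CREATE INDEX CONCURRENTLY` cannot run inside a transaction block, hence the autocommit.

```python
import re

# Conservative identifier check so names can be safely interpolated into DDL.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def quote_ident(name: str) -> str:
    """Return a double-quoted SQL identifier, rejecting anything unusual."""
    if not _IDENT.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return f'"{name}"'


def build_concurrent_index_sql(schema: str, table: str, column: str) -> str:
    """Build the DDL statement; the index naming scheme is hypothetical."""
    index_name = f"idx_{table}_{column}"
    return (
        f"CREATE INDEX CONCURRENTLY IF NOT EXISTS {quote_ident(index_name)} "
        f"ON {quote_ident(schema)}.{quote_ident(table)} ({quote_ident(column)})"
    )


def create_index(conn, schema: str, table: str, column: str) -> None:
    """Run the statement on a psycopg2 connection (assumed, not part of my setup yet)."""
    # CREATE INDEX CONCURRENTLY is rejected inside a transaction block,
    # so the connection has to be in autocommit mode.
    conn.autocommit = True
    with conn.cursor() as cur:
        cur.execute(build_concurrent_index_sql(schema, table, column))


if __name__ == "__main__":
    # Placeholder names; prints the statement without touching a database.
    print(build_concurrent_index_sql("public", "orders", "updated_at"))
```

The idea would be to run this once per table outside of the Meltano job, so the loader itself is not changed.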