Hi all. I have the following problem: CDC process ...
# troubleshooting
k
Hi all. I have the following problem: a CDC process replicating from MySQL to PostgreSQL via the binlog broke down because the Internet connection was lost. The state.json file in the .meltano/run/tap-mysql directory has disappeared. Is it possible to somehow resume the process without the initial snapshot of the table? The table is about 200 GB with 700 million rows, and the process takes 4 days. I would also like to ask whether there is any way to speed up the process. Maybe there are certain optimization strategies?
plugins:
  extractors:
  - name: tap-mysql
    variant: transferwise
    pip_url: git+https://github.com/edgarrmondragon/pipelinewise-tap-mysql.git@patch-1
    config:
      database: ***
      engine: mysql
      session_sqls:
      - SET @@session.time_zone='+0:00'
      - SET @@session.wait_timeout=86400
      - SET @@session.net_read_timeout=86400
      - SET @@session.innodb_lock_wait_timeout=3600
    select:
    - schema-table.*
    metadata:
      '*':
        replication-method: LOG_BASED

  loaders:
  - name: target-postgres
    variant: meltanolabs
    pip_url: meltanolabs-target-postgres
    config:
      batch_size_rows: 50000
      hard_delete: true
      load_method: upsert
      use_copy: true
      validate_records: true
      sanitize_null_text_characters: true
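For scale, the figures quoted above can be turned into a rough throughput estimate. This is only back-of-the-envelope arithmetic on the approximate numbers in the message, assuming "200G" means 200 GB (decimal) and that the 4 days cover the full 700 million rows; it comes out at roughly 2,000 rows per second.

```python
# Approximate figures taken from the message above, so results are rough.
ROWS = 700_000_000
SIZE_BYTES = 200 * 10**9        # assumption: "200G" means 200 GB (decimal)
DURATION_S = 4 * 24 * 60 * 60   # 4 days in seconds
BATCH_SIZE_ROWS = 50_000        # batch_size_rows from the target-postgres config

rows_per_second = ROWS / DURATION_S
mb_per_second = SIZE_BYTES / DURATION_S / 10**6
batches = ROWS // BATCH_SIZE_ROWS

print(f"{rows_per_second:,.0f} rows/s")
print(f"{mb_per_second:.2f} MB/s")
print(f"{batches:,} batches of {BATCH_SIZE_ROWS:,} rows")
```

A sustained rate well below 1 MB/s hints that per-row or per-batch overhead, rather than raw bandwidth, may be what limits the run, though that would need to be confirmed by measuring the tap and the target separately.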
e
Does `meltano state list` reveal anything?
k
Unfortunately, I already cleared the state with the command
meltano state clear dev:tap-mysql-to-target-postgres
Usually, meltano state list outputs something like:
2025-02-07T04:48:59.453082Z [info ] The default environment 'dev' will be ignored for `meltano state`. To configure a specific environment, please use the option `--environment=<environment name>`.
dev:tap-mysql-to-target-postgres
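Since the state was cleared, one possible way to avoid a new snapshot is to write a state by hand that points at a known binlog position. The sketch below only builds the payload and prints a `meltano state set` command for it. It rests on several assumptions: that the tap keeps LOG_BASED bookmarks under the keys `log_file` and `log_pos` (worth checking against the state of a working run), that the stream ID is `schema-table` as in the `select` rule above, and that the binlog file is still retained on the MySQL server. The helper name `build_state_payload` and the binlog coordinates are hypothetical placeholders.

```python
import json
import shlex


def build_state_payload(stream_id, log_file, log_pos):
    """Build a Meltano state payload with one binlog bookmark.

    Assumption: tap-mysql stores LOG_BASED bookmarks as log_file/log_pos.
    """
    bookmark = {"log_file": log_file, "log_pos": int(log_pos)}
    return {"singer_state": {"bookmarks": {stream_id: bookmark}}}


# Hypothetical coordinates: replace with the position at which the last
# successful sync stopped, if it can be recovered (for example from logs).
payload = build_state_payload("schema-table", "mysql-bin.000123", 4)

state_id = "dev:tap-mysql-to-target-postgres"
command = "meltano state set {} {}".format(
    shlex.quote(state_id), shlex.quote(json.dumps(payload))
)
print(command)
```

If the chosen position is later than the point where replication actually stopped, the changes in between would be skipped, so this only makes sense when the last synced position is known with confidence.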