# best-practices
Hello, I need some advice. I created a new pipeline from MySQL to S3. The very first run does a FULL EXPORT of the table and the next ones should be incremental. Meltano is running in Docker, and during the run the container dies due to lack of storage. What is the right approach to avoid this kind of problem? For example, during the first run have Meltano copy the data in batches or so. Error message:
[Errno 105] No buffer space available
what is your loader?
you may need to tweak the batch sizing
e.g. for Snowflake, adjusting batch_size_rows: 350000
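For reference, here is roughly what that looks like in meltano.yml (a sketch only; the exact setting name and default differ between loader variants, so check your loader's page on MeltanoHub):

loaders:
  - name: target-snowflake
    config:
      # smaller batches mean less data buffered per flush before it is written out
      batch_size_rows: 350000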
I am extracting from MySQL and saving the files to S3 for Athena to consume. The meltano.yml:
- name: tap-mysql-billing
  inherit_from: tap-mysql
  config:
    host: billing-db
    port: 3306
    database: billing
    user: datalake
    password: $TAP_MYSQL_PASSWORD_BILLING
  select:
    - account_type_prices.account_type_id
    - account_type_prices.id
    - account_type_prices.price_id
    - account_types.id
    - account_types.name
  metadata:
    "account_type_prices*":
      replication-method: INCREMENTAL
      replication-key: id
I assume the first time it is doing a full extract.
Where can I adjust the batch size?
I don't see a batch size setting in the tap-mysql extractor: https://hub.meltano.com/extractors/tap-mysql
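(Batch-size knobs generally live on the loader rather than on tap-mysql, so the loader's settings are the place to look. A sketch of what that could look like in meltano.yml, assuming the S3 loader variant in use exposes a row-count batching setting; the real setting name, if any, is on the loader's MeltanoHub page or in the output of meltano config <loader> list:)

loaders:
  - name: target-s3-csv   # hypothetical name; use whichever S3 loader is actually installed
    config:
      # assumption: the loader exposes a batch_size_rows-style setting
      batch_size_rows: 100000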
Sorry for the delay - which loader are you using though?
Yes, the first run does a full extract, which can be intensive; it took some trial and error for me not to run out of memory too.
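One pattern that can keep that first run smaller, since the extractor already uses inherit_from and select: define one inherited tap per table (or small group of tables) and run them one at a time, so each full extract pushes less data through the container in a single go. A sketch reusing the names from the config above (assuming your Meltano version allows inheriting from an already-inherited plugin; otherwise inherit from tap-mysql directly and repeat the config):

extractors:
  - name: tap-mysql-billing-account-types
    inherit_from: tap-mysql-billing
    select:
      - account_types.*
  - name: tap-mysql-billing-account-type-prices
    inherit_from: tap-mysql-billing
    select:
      - account_type_prices.*
# then run each one separately, e.g. meltano run tap-mysql-billing-account-types <your-s3-loader>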