# troubleshooting
s
I tried to run the following Meltano command from Dagster, which is running on AWS ECS:
meltano run tap-s3-csv-update target-postgres-application --full-refresh
When it loaded the
current/update/update.csv
file (14.5 KB), it worked fine, but when it tried to load the
current/update/update_data.csv
file (50.0 MB), I got the errors below. The error messages don't show the root cause, and the same run works fine from my local environment. How can I solve this issue, and how should I debug it? Thanks in advance.
```
2023-06-28T22:17:53.819815Z [info] Environment 'dev' is active
2023-06-28T22:18:06.906567Z [info] Performing full refresh, ignoring state left behind by any previous runs.
2023-06-28T22:18:17.993238Z [info] time=2023-06-28 22:18:17 name=tap_s3_csv level=INFO message=Attempting to create AWS session cmd_type=elb consumer=False name=tap-s3-csv-update producer=True stdio=stderr string_id=tap-s3-csv-update
2023-06-28T22:18:18.095091Z [info] time=2023-06-28 22:18:18 name=tap_s3_csv level=INFO message=Starting sync. cmd_type=elb consumer=False name=tap-s3-csv-update producer=True stdio=stderr string_id=tap-s3-csv-update
2023-06-28T22:18:18.193323Z [info] time=2023-06-28 22:18:18 name=tap_s3_csv level=INFO message=data_update: Starting sync cmd_type=elb consumer=False name=tap-s3-csv-update producer=True stdio=stderr string_id=tap-s3-csv-update
2023-06-28T22:18:18.193762Z [info] time=2023-06-28 22:18:18 name=tap_s3_csv level=INFO message=Syncing table "data_update". cmd_type=elb consumer=False name=tap-s3-csv-update producer=True stdio=stderr string_id=tap-s3-csv-update
2023-06-28T22:18:18.194022Z [info] time=2023-06-28 22:18:18 name=tap_s3_csv level=INFO message=Getting files modified since 2023-06-01 00:00:00+00:00. cmd_type=elb consumer=False name=tap-s3-csv-update producer=True stdio=stderr string_id=tap-s3-csv-update
2023-06-28T22:18:18.194254Z [info] time=2023-06-28 22:18:18 name=tap_s3_csv level=INFO message=Checking bucket "my-data-platform-dev" for keys matching ".csv" cmd_type=elb consumer=False name=tap-s3-csv-update producer=True stdio=stderr string_id=tap-s3-csv-update
2023-06-28T22:18:18.194480Z [info] time=2023-06-28 22:18:18 name=tap_s3_csv level=INFO message=Skipping files which have a LastModified value older than 2023-06-01 00:00:00+00:00 cmd_type=elb consumer=False name=tap-s3-csv-update producer=True stdio=stderr string_id=tap-s3-csv-update
2023-06-28T22:18:19.136358Z [info] time=2023-06-28 22:18:19 name=tap_s3_csv level=INFO message=Found 3 files. cmd_type=elb consumer=False name=tap-s3-csv-update producer=True stdio=stderr string_id=tap-s3-csv-update
2023-06-28T22:18:19.139097Z [info] time=2023-06-28 22:18:19 name=tap_s3_csv level=INFO message=Skipping matched file "current/update/" as it is empty cmd_type=elb consumer=False name=tap-s3-csv-update producer=True stdio=stderr string_id=tap-s3-csv-update
2023-06-28T22:18:19.139614Z [info] time=…
```
It took about 30-40 minutes and failed.
a
Setting
NO_COLOR=1
as an environment variable will clean up the logs in Dagster a bit. Also, I would use
meltano --log-level=debug run ...
and see if you get any more info in the logs.
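For reference, one way to pin both of those settings for every run is a project-level env: block in meltano.yml. This is only a sketch: MELTANO_CLI_LOG_LEVEL is assumed to be the env-var equivalent of --log-level and is not confirmed in this thread.
```yaml
# Sketch of a meltano.yml fragment: the project-level `env:` block is applied
# to every `meltano` invocation, including the ones Dagster starts on ECS.
env:
  NO_COLOR: "1"                  # strip ANSI color codes from the log output
  MELTANO_CLI_LOG_LEVEL: debug   # assumed env-var form of `meltano --log-level=debug`
```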
Could you share the
tables:
section of
meltano.yml
It does look like a long time to download the larger file. Could you try reducing the size of
update_data.csv
and see if the run completes? Maybe start with a couple MB and work up from there?
v
Also, it sounds like there should be more logs that show the target-postgres stack trace.
s
Thanks, @Andy Carter & @visch Here is my tables configuration:
Copy code
tables:
  - search_prefix: current/update
    search_pattern: .csv
    table_name: data_update
    key_properties: ["[ID]"]
    delimiter: "\t"
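For context, this is roughly how that section sits inside meltano.yml for the transferwise variant. The bucket and start_date values are read off the log output above; the setting names (bucket, start_date, tables) and the inherit_from line are assumptions based on how that variant is usually configured, not something confirmed in this thread.
```yaml
plugins:
  extractors:
    - name: tap-s3-csv-update
      inherit_from: tap-s3-csv   # assumed: a project-specific copy of the base tap
      variant: transferwise
      config:
        bucket: my-data-platform-dev   # from the "Checking bucket ..." log line
        start_date: "2023-06-01"       # files with an older LastModified are skipped
        tables:
          - search_prefix: current/update
            search_pattern: .csv
            table_name: data_update
            key_properties: ["[ID]"]
            delimiter: "\t"
```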
@Andy Carter it works fine with the small files, and it also works fine with the 50 MB file from my local env.
I have run the meltano command using
--log-level=debug
last night and got more helpful info from Dagster. It is related to memory. I am trying to re-run the job with more memory. Here is the message:
Copy code
Multiprocess executor: child process for step load_update_data was terminated by signal 9 (SIGKILL). This usually indicates that the process was killed by the operating system due to running out of memory. Possible solutions include increasing the amount of memory available to the run, reducing the amount of memory used by the ops in the run, or configuring the executor to run fewer ops concurrently.
dagster._core.executor.child_process_executor.ChildProcessCrashException

Stack Trace:
  File "/usr/local/lib/python3.9/site-packages/dagster/_core/executor/multiprocess.py", line 240, in execute
    event_or_none = next(step_iter)
  File "/usr/local/lib/python3.9/site-packages/dagster/_core/executor/multiprocess.py", line 357, in execute_step_out_of_process
    for ret in execute_child_process_command(multiproc_ctx, command):
  File "/usr/local/lib/python3.9/site-packages/dagster/_core/executor/child_process_executor.py", line 174, in execute_child_process_command
    raise ChildProcessCrashException(exit_code=process.exitcode)
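The last suggestion in that error message, running fewer ops concurrently, can be expressed as run config for Dagster's default multiprocess executor. A minimal sketch, assuming the default executor is in use:
```yaml
# Dagster run config sketch: cap the multiprocess executor at one concurrent
# step so the heavy load_update_data step gets the container's full memory.
execution:
  config:
    multiprocess:
      max_concurrent: 1
```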
v
Interesting. I wonder which tap and target you're using (the variant). We're working on a new tap for S3 CSVs that should be more memory efficient, and I think both the default and meltanolabs variants of target-postgres are very efficient with memory as well. Allocating more memory is smart too; I'd just imagine it's something silly like loading the whole file into memory.
s
extractor: tap-s3-csv (variant: transferwise); loader: target-postgres (variant: transferwise)
My file is only 50 MB; the original file is 1 GB. I split it into multiple files and am only testing the first one.
Since I have one big file, do I need to add any other parameters to tune the loader?
v
They have their own "tuning" parameters, but the defaults should be decent and 50 MB shouldn't be a big deal, so I don't know. I'd love for you to try a separate tap and target we're working on (the target is more mature), but I'm not sure if you'd be willing to enter experiment land!
Might be worth just figuring out the root issue here honestly
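One example of such a tuning knob, if the transferwise variant is kept, is its batch_size_rows setting, which controls how many records are buffered before each flush to Postgres. A sketch of lowering it in meltano.yml; the setting name and its reported default of 100000 come from the pipelinewise target-postgres documentation, not from this thread.
```yaml
plugins:
  loaders:
    - name: target-postgres
      variant: transferwise
      config:
        # Smaller batches mean more frequent flushes and a smaller in-memory
        # buffer, at the cost of a slower load.
        batch_size_rows: 10000
```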
s
Sure, let me know which one and I can give it a try.
I think it is related to ECS, which Dagster is running on:
#C05CNUF699B , and #C04CLA0S4G0
s
I have increased the memory to 1 GB for the Dagster job, but the job still hangs and eventually fails.