dylan_just
08/28/2024, 10:12 PM
Edgar Ramírez (Arch.dev)
08/29/2024, 12:22 AM
dylan_just
08/29/2024, 12:29 AM
visch
08/29/2024, 2:37 PM
Charles Feduke
08/29/2024, 2:38 PM
dylan_just
09/01/2024, 9:41 PM
dylan_just
09/01/2024, 9:43 PM
Charles Feduke
09/02/2024, 3:47 PM
copy statement. For really large tables, if you’re not splitting the data into multiple files, you get no parallelism on unload/load and therefore limited benefit from having a multi-node Redshift cluster. (Again, this is mid-stage big data stuff, the things that Spark and Hadoop are really good at, and something Redshift supported.)
Charles Feduke
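As an illustration of the multi-file load described above, here is a minimal sketch. It is not from the thread. It assumes the part files are gzipped CSV already staged in S3, and every bucket, key, table, and role name in it is a hypothetical placeholder. It builds a Redshift manifest listing the part files and a COPY statement that points at that manifest, so the cluster can load the files in parallel.

```python
import json


def build_manifest(bucket: str, keys: list[str]) -> str:
    """Return a Redshift COPY manifest (JSON) listing every part file."""
    entries = [
        {"url": f"s3://{bucket}/{key}", "mandatory": True}
        for key in keys
    ]
    return json.dumps({"entries": entries}, indent=2)


def build_copy_sql(table: str, manifest_url: str, iam_role: str) -> str:
    """Return a COPY statement that loads all files named in the manifest.

    The values are interpolated directly for readability only; do not
    pass untrusted input through this function.
    """
    return (
        f"COPY {table}\n"
        f"FROM '{manifest_url}'\n"
        f"IAM_ROLE '{iam_role}'\n"
        "FORMAT AS CSV\n"
        "GZIP\n"
        "MANIFEST;"
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    part_keys = [f"stage/orders/part{i:04d}.csv.gz" for i in range(4)]
    print(build_manifest("example-bucket", part_keys))
    print(build_copy_sql(
        "analytics.orders",
        "s3://example-bucket/stage/orders/manifest.json",
        "arn:aws:iam::123456789012:role/example-redshift-load",
    ))
```

A reasonable starting point is to make the number of part files a multiple of the number of slices in the cluster, so that no slice sits idle during the load.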
09/02/2024, 3:48 PM
Charles Feduke
09/02/2024, 3:54 PM
max_parallelism property and set it to 1, and I remember going through this source code before… I don’t think the target does anything with that value (ideally it’d break up the file from the tap into N files).
Charles Feduke
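To make the "break up the file into N files" idea concrete, here is a minimal sketch. It is not from the thread and is not how any existing target behaves. It assumes a headerless, line-delimited file (one record per line, no embedded newlines), and the function name and part-file naming scheme are hypothetical.

```python
import itertools
from pathlib import Path


def split_into_parts(src: Path, out_dir: Path, n_parts: int) -> list[Path]:
    """Distribute the lines of src round-robin across n_parts files."""
    if n_parts < 1:
        raise ValueError("n_parts must be at least 1")
    out_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: rows.csv -> rows.part0000.csv, ...
    part_paths = [
        out_dir / f"{src.stem}.part{i:04d}{src.suffix}"
        for i in range(n_parts)
    ]
    handles = [p.open("w", encoding="utf-8") for p in part_paths]
    try:
        with src.open("r", encoding="utf-8") as lines:
            # cycle() hands each successive line to the next part file.
            for handle, line in zip(itertools.cycle(handles), lines):
                handle.write(line)
    finally:
        for handle in handles:
            handle.close()
    return part_paths
```

Round-robin assignment keeps the parts within one line of each other in size without counting the lines first, at the cost of not preserving contiguous row order within a part.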
09/02/2024, 4:00 PM
dylan_just
09/02/2024, 9:00 PM