juan_luis_cano_rodriguez
05/13/2022, 10:51 AMmeltano run to load a large PostgreSQL table, and queries on it are taking a long time, even simple ones (analysis of a `SELECT COUNT(*) FROM table WHERE timestamp > '2022-01-01'` for reference). from my limited knowledge of databases, I understand that setting proper indexes on the columns I'll use the most for queries will improve the performance.
what would the #C01UTUSP34M be here? running a data transformation with dbt that creates a new table, essentially duplicating the data but with proper indexes?
to be clear, I'm concerned that if I create an index on the table meltano created for loading, things will break. but maybe it's not the case?juan_luis_cano_rodriguez
05/13/2022, 12:17 PMVACUUM ANALYZE?" yep
⢠"why don't you use <insert cool OLAP db technology>?" - while I'm open to changing the underlying technology, that doesn't address my specific problem with PostgreSQL queries being slowthomas_briggs
05/13/2022, 12:44 PMthomas_briggs
05/13/2022, 12:45 PMthomas_briggs
05/13/2022, 12:48 PMthomas_briggs
05/13/2022, 12:49 PMjuan_luis_cano_rodriguez
05/13/2022, 1:04 PMI'm a database guy, so I may have the wrong perspective, but this feels like a pretty standard database/qery optimization problem, not so much a meltano/ELT questionright, I should have made my concerns more explicit: that if I put an index on the table meltano created for loading, things will break. but maybe it's not the case
juan_luis_cano_rodriguez
05/13/2022, 1:05 PMCan you define "large" and "slow"? And what are the specs on your RDS instance? The first step is ensure that your expectations are correct.so as you could see, the table has ~1 M rows, and this simple query
SELECT COUNT(*) FROM table WHERE timestamp > '2022-01-01' takes 90 seconds. I'm not a database expert but it looks like too much to me, maybe I would have expected something 10x faster at least?juan_luis_cano_rodriguez
05/13/2022, 1:06 PMjuan_luis_cano_rodriguez
05/13/2022, 1:08 PMdb.t3.micro instance, so IIUC, it has 2 vCPU, 1 GB RAMthomas_briggs
05/13/2022, 1:10 PMjuan_luis_cano_rodriguez
05/13/2022, 1:11 PMthomas_briggs
05/13/2022, 1:11 PMthomas_briggs
05/13/2022, 1:12 PMjuan_luis_cano_rodriguez
05/13/2022, 1:12 PMthomas_briggs
05/13/2022, 1:16 PMjuan_luis_cano_rodriguez
05/13/2022, 1:22 PMpipelinewise-target-postgres and not meltano. I will do the experiment with the other columns just for curiosity in any case š thanks for your help!thomas_briggs
05/13/2022, 1:22 PMalexander_butler
05/14/2022, 11:33 PMbrin index on timestamp
super slim
I optimized the crap out of RDS postgres on AWS before moving into a real OLAP databasejuan_luis_cano_rodriguez
05/15/2022, 5:16 PMjuan_luis_cano_rodriguez
05/16/2022, 6:12 PM