# random
v
Left the Snowflake warehouse on 4XL during the trial. Moved to prod, ran some queries against the metadata views of the Snowflake database (took 1 hr to run), got a $400 bill. I now understand why they make so much money. What a joke. I'd assume I'm not the only one
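Back-of-envelope on why that one hour cost ~$400 (a sketch: the credits-per-hour ladder is standard Snowflake warehouse sizing, but the ~$3/credit rate is my assumption and varies by edition and region):

```python
# Rough Snowflake cost model: credits/hour doubles with each warehouse size.
CREDITS_PER_HOUR = {"XS": 1, "S": 2, "M": 4, "L": 8,
                    "XL": 16, "2XL": 32, "3XL": 64, "4XL": 128}
PRICE_PER_CREDIT = 3.00  # assumption; actual rate depends on edition/region

def run_cost(size: str, hours: float, price: float = PRICE_PER_CREDIT) -> float:
    """Dollar cost of keeping a warehouse of `size` running for `hours`."""
    return CREDITS_PER_HOUR[size] * hours * price

print(run_cost("4XL", 1))  # 384.0 -- one hour on 4XL, right around that bill
print(run_cost("XS", 1))   # 3.0   -- the same hour on X-Small
```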
Yes, I downsized my compute warehouse down to Small. Crazy that you need to know ahead of time what compute warehouse cluster you need. I'm sure the blame is all on me, but this isn't "easy" to me
i'm the idiot
t
Yeah, we ended up with a bill from AWS from testing Redshift that was, ummm... a bit larger than that 😮 So I feel your pain.
s
^^ Love it. So compute & storage are becoming almost free, except when you put them together and call it a database 😄
j
Ouch! We had some migrations where we were keeping old stuff running and then moved everything to a new instance and paid 2x for a bit. That was not fun.
a
OMG - sorry to hear this @visch! You are definitely not the only one. This is a story as old as Snowflake itself.
Most orgs almost immediately pre-create warehouses and restrict access to X-SMALL only except for approved use cases.
To anyone who thinks Meltano and the open source data stack just compete with Fivetran and friends: imagine moving all your Snowflake CI and ETL workloads to something like Meltano+DuckDB and publishing just the result to Snowflake. All the ease of use of Snowflake for querying, but without Snowflake's compute markup for your daily workloads. 😄 🤫
j
working on it
a
Oh yeah, @jacob_matson - was thinking of you when I wrote that 👆 😄
j
i'm eyeing that dagster util as well.
I think with a bit of work it will run in a container (already working) but then also spin up additional containers as required
MDS in a box, in a container etc etc
v
Does sound nice for some of this stuff! I'm trying to pull metadata information out of Snowflake, so it's no good for me, but that sounds much better than running Python in a warehouse and watching the bill rack up
a
Sorry to hear that @visch - feels like a rite of passage for everyone using Snowflake. We had a similar experience too. On a brighter note: spoke to a couple of guys working on a Snowflake cost model; the aim is to get a cost per query: https://hub.getdbt.com/get-select/dbt_snowflake_monitoring/latest/ One good tip was to configure the auto-suspend to 1 minute. By default it's 10 minutes, so for your one metadata query the warehouse will sit idle for 9 minutes… Jokes
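The auto-suspend tip in numbers (same assumptions as my earlier sketch: 4XL = 128 credits/hour, ~$3/credit): those nine idle minutes cost a meaningful fraction of the query itself.

```python
# Cost of a 4XL warehouse sitting idle before AUTO_SUSPEND kicks in.
CREDITS_PER_HOUR_4XL = 128  # standard Snowflake sizing for a 4XL warehouse
PRICE_PER_CREDIT = 3.00     # assumption; varies by edition/region

def idle_cost(idle_minutes: float) -> float:
    """Dollars burned while the warehouse is running but doing nothing."""
    return CREDITS_PER_HOUR_4XL * (idle_minutes / 60) * PRICE_PER_CREDIT

print(idle_cost(9))  # ~57.6 -- nine idle minutes at the default setting
print(idle_cost(1))  # ~6.4  -- with a 1-minute auto-suspend
```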
j
lots of heat right now in Snowflake cost management: Capital One Slingshot, select.dev, and a few more
m
@aaronsteers That sounds like a workflow that could really solve my current case. Any idea where I can learn more?
j
@max_de_rooij mdsinabox.com for what’s working today or DM me and I can show you what’s next
a
@max_de_rooij - I'd definitely start with the mds-in-a-box example from @jacob_matson. Then, as a final step, there are a few options for pushing the result back to Snowflake or similar. Does this help?
a
It is certainly an interesting model. We are using Postgres in this way, pushing the result back to MongoDB for a client! We could definitely be using DuckDB (perhaps to Parquet) to do the same, and of course the results could simply be copied anywhere.
m
Thanks @aaronsteers and @jacob_matson. I've reserved some time in the coming week to implement a simple variant. Will DM you, Jacob 🙂