Hi, we are looking for a way to apply table level ...
# plugins-general
s
Hi, we are looking for a way to apply table level filter when extracting data from MSSQL server using tap-mssql plugin in meltano. We tried below option, but it is not working . Can you help to provide right method to achieve table level filtering during extraction? " extractors: - name: tap-mssql variant: wintersrd pip_url: tap-mssql config: database: host: port: tds_version: '7.3' use_date_datatype: true user: dev password: ** select: - dbo-emp.* filter: empId: =3 metadata: dbo-emp: replication-method: FULL_TABLE
a
Happy to be proven wrong, but I don't think meltano supports this, nor does the specific tap you are using. where did you see the
filter:
syntax from?
👍 1
You might need to do this downstream in dbt transformation or similar from your source tables.
s
ok, we were trying this as a workaround, since reading around 10 million records from MSSSQL and loading to target bigquery, using meltano code deployed in cloudrun service was getting timed out . It just loads 2 million records before time out, hence thought of filtering source data and load in multiple chunks
a
Just curious, did you find the
filter:
example somewhere in the main meltano docs? Or just something you tried out hoping it would work?
s
something, we tried out
👍 1
a
Given you're using full table replication, you're going to need to. One option might be to create a sql view in your host database that prefilters to
empId = 3
, and then use that as your source table for meltano to sync.
s
full_table was shown here , as an example - actual fact able would have incremental loading. As I understand, for incremental loading also, firt run would do the full load and incremental runs from second run onwards. Is there any way we cn restrict records fetched in the initial run?
a
You could run locally for your initial sync (without risk of timeout) but send your data to your production target, then migrate your state to your production env. That way you are only ever running incremental in cloudrun. https://docs.meltano.com/concepts/state_backends/#migrating-state Are you sure there's no way to extend the timeout?