I am having issues with tap-mssql variant SpaceCon...
# getting-started
d
I am having issues with tap-mssql variant SpaceCondor using log based replication (Change Tracking not CDC): https://hub.meltano.com/extractors/tap-mssql--spacecondor/ I have Change Tracking configured in SQL Server but am getting a full replication on each run when using meltano run (switched to target-jsonl from target-snowflake for debugging):
meltano run tap-mssql target-jsonl
The state gets updated with each run:
{"bookmarks": {"dbo-c_logical_field_user_values": {}}}
Extractor config:
extractors:
- name: tap-mssql
config:
host: PROJECT01
port: 60065
database: rss_test
username: svcTestAccount
default_replication_method: LOG_BASED
sqlalchemy_url_query_options:
- key: driver
value: ODBC Driver 18 for SQL Server
- key: TrustServerCertificate
value: yes
select:
- dbo-c_logical_field_user_values.*
I think this might be a configuration issue, not sure but perhaps isn't picking up the default replication method?
{"event": "Visiting CatalogNode.STREAM at '.streams[352]'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.910101Z"}
{"event": "Setting '.streams[352].selected' to 'False'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910162Z"}
{"event": "Setting '.streams[352].selected' to 'True'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910211Z"}
{"event": "Skipping node at '.streams[352].tap_stream_id'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910259Z"}
{"event": "Skipping node at '.streams[352].table_name'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910306Z"}
{"event": "Skipping node at '.streams[352].replication_method'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910354Z"}
{"event": "Skipping node at '.streams[352].key_properties[0]'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910402Z"}
{"event": "Visiting CatalogNode.PROPERTY at '.streams[352].schema.properties.logical_field_sid'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.910457Z"}
{"event": "Visiting CatalogNode.PROPERTY at '.streams[352].schema.properties.enabled_flag'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.910513Z"}
{"event": "Skipping node at '.streams[352].schema.properties.enabled_flag.maxLength'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910604Z"}
{"event": "Visiting CatalogNode.PROPERTY at '.streams[352].schema.properties.modified_by_user_sid'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.910779Z"}
{"event": "Visiting CatalogNode.PROPERTY at '.streams[352].schema.properties.modified_datetime'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.910924Z"}
{"event": "Skipping node at '.streams[352].schema.properties.modified_datetime.format'", "level": "debug", "timestamp": "2025-04-22T00:34:06.910988Z"}
{"event": "Visiting CatalogNode.PROPERTY at '.streams[352].schema.properties.timestamp'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.911099Z"}
{"event": "Visiting CatalogNode.PROPERTY at '.streams[352].schema.properties.system_modified_datetime'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.911160Z"}
{"event": "Skipping node at '.streams[352].schema.properties.system_modified_datetime.format'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911212Z"}
{"event": "Skipping node at '.streams[352].schema.type'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911262Z"}
{"event": "Skipping node at '.streams[352].schema.required[0]'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911312Z"}
{"event": "Skipping node at '.streams[352].schema.$schema'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911361Z"}
{"event": "Skipping node at '.streams[352].is_view'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911410Z"}
{"event": "Skipping node at '.streams[352].stream'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911458Z"}
{"event": "Visiting CatalogNode.METADATA at '.streams[352].metadata[0]'.", "level": "debug", "timestamp": "2025-04-22T00:34:06.911509Z"}
{"event": "Visiting metadata node for tap_stream_id 'dbo-c_logical_field_user_values', breadcrumb '['properties', 'logical_field_sid']'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911558Z"}
{"event": "Setting '.streams[352].metadata[0].metadata.selected' to 'False'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911616Z"}
{"event": "Setting '.streams[352].metadata[0].metadata.selected' to 'True'", "level": "debug", "timestamp": "2025-04-22T00:34:06.911665Z"}
Anyone have suggestions on troubleshooting? No errors like the previous question about not finding the state. SQL Server tables have Change Tracking enabled in SQL Server as:
ALTER TABLE dbo.' + @ls_table_name + N'
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = OFF);
e
I don't think the tap does actually do anything with the
default_replication_method
value, so you'll need to define the replication method as a
metadata
override: https://docs.meltano.com/concepts/plugins/#metadata-extra
d
Thanks @Edgar Ramírez (Arch.dev), that was the issue. I added the following metadata section to the config for anyone looking at this issue in the future:
extractors:
- name: tap-mssql
config:
host: PROJECT01
port: 60065
database: rss_test
username: svcTestAccount
sqlalchemy_url_query_options:
- key: driver
value: ODBC Driver 18 for SQL Server
- key: TrustServerCertificate
value: yes
select:
- dbo-c_logical_field_user_values.*
metadata:
dbo-*:
replication_method: LOG_BASED
This tap doesn't appear to be able to detect the primary key, so you will need to specify the metadata for each table in order to add the key. Is that "normal" behavior for SQL taps that support log based replication?
2025-04-22T17:26:19.153818Z [info     ] ValueError: Could not detect replication key for 'dbo-c_logical_field_user_values' stream(replication method=LOG_BASED) cmd_type=elb consumer=False job_name=dev:tap-mssql-to-target-jsonl name=tap-mssql producer=True run_id=af88abc1-c2ee-46e4-a181-6b8bb21292dd stdio=stderr string_id=tap-mssql
e
Can you try
Copy code
metadata:
  dbo-*:
    replication_method: LOG_BASED
    replication_key: _sdc_change_version
d
That generates this error:
singer_sdk.exceptions.InvalidReplicationKeyException: Field '_sdc_change_version' is not in schema for stream 'dbo-c_logical_field_user_values'
This configuration looks like it works correctly:
metadata:
dbo-c_logical_field_user_values:
replication_method: LOG_BASED
replication-key: logical_field_sid
logical_field_sid:
is-replication_key: true
The tap is generating something like this extracted from the logs (I wasn't running a trace) when I didn't have VIEW CHANGE TRACKING permissions set for the SQL account. It is using the replication key to join on the change tracking tables. SELECT c.SYS_CHANGE_VERSION AS _sdc_change_version, c.SYS_CHANGE_OPERATION AS _sdc_change_operation, c.logical_field_sid, tb.enabled_flag, tb.modified_by_user_sid, tb.modified_datetime, tb.timestamp, tb.system_modified_datetime FROM CHANGETABLE CHANGES dbo.c_logical_field_user_values,", "level": "info", "timestamp": "2025-04-22T210121.749032Z"} 1", "level": "info", "timestamp": "2025-04-22T210121.749182Z"} ) AS c", "level": "info", "timestamp": "2025-04-22T210121.749341Z"} LEFT JOIN", "level": "info", "timestamp": "2025-04-22T210121.749495Z"} dbo.c_logical_field_user_values AS tb", "level": "info", "timestamp": "2025-04-22T210121.749649Z"} ON", "level": "info", "timestamp": "2025-04-22T210121.749802Z"} tb.logical_field_sid = c.logical_field_sid", "level": "info", "timestamp": "2025-04-22T210121.749965Z"} ORDER BY", "level": "info", "timestamp": "2025-04-22T210121.750121Z"} c.SYS_CHANGE_VERSION ASC", "level": "info", "timestamp": "2025-04-22T210121.750245Z"}
How do you define compound keys for the replication-key in Meltano?
Looks like a combination of setting the key-properties and then the replication-key? • `key-properties`: A list of properties that together uniquely identify a record in the stream. For example,
["id"]
. • `replication-key`: The name of a property in the source to use as a bookmark. For example, this will often be an
"updated_at"
field or an auto-incrementing primary key (requires
replication-method
).
Change Tracking in SQL Server requires a primary key, but compound keys are allowed so just need to properly define the compound key in the configuration.
Anyone have suggestions on where in the tap code that I can view the replication code to see if compound keys are supported? I haven't looked much at tap development.
Looks like it is in client.py.
class MSSQLChangeTrackingStream(SQLStream)
I think the question is does the replication key support a compound key?
e
The answer to that is generally no. I'm not sure what changes would need to happen in the SDK and whether it can be done in a way that doesn't deviate fundamentally from the Singer spec.
d
Thanks, I will take a look at the spec. Compound keys are very common so I feel like I am missing something.
Seems like LOG_BASED replication shouldn't need a replication key as the changes are already tracked in the log.
It looks to me like a defect in this tap. It believe it shouldn't need a bookmark as you replicate all records in the change tracking logs.
Copy code
using_change_tracking = True

        if not self.primary_keys:
            using_change_tracking = False
            self.logger.warning(
                "Table has no primary keys. Cannot use CHANGE_TRACKING. "
                "Executing a full table sync instead."
            )
        elif not change_tracking_enabled:
            using_change_tracking = False
            self.logger.error(
                "Table is not enabled for CHANGE_TRACKING. "
                "Executing a full table sync instead."
            )
        elif not bookmark:
            using_change_tracking = False
            self.logger.warning(
                "There is no previous bookmark. Executing a full table sync."
            )
        elif bookmark < minimum_valid_version:
            using_change_tracking = False
            self.logger.warning(
                "CHANGE_TRACKING_MIN_VALID_VERSION has reported a value greater "
                "than current-log-version. Executing a full table sync."
            )
This is how the replication should work: https://www.stitchdata.com/docs/replication/replication-methods/log-based-incremental Ongoing replication jobs After the historical replication of a table is complete, Stitch reads updates for the table from the database’s logs. During ongoing replication jobs using Log-based Incremental Replication, a few things will happen: 1. Using the maximum log position ID from the previous job - in this case, the historical replication job - Stitch begins reading log messages in the binary file. Data for tables set to replicate is extracted. 2. At the end of the replication job, Stitch bookmarks its place in the log file by storing its current log position ID. 3. During the next replication job, Stitch will resume reading data with a greater log position ID than the log position ID from the previous job. 4. At the end of the replication job, Stitch bookmarks its place in the log file again. 5. Repeat.
Is there a way to use the Stich version of tap-mssql with Meltano?
e
Though it seems you'll have to define the
settings
after you add it, cause the plugin definition hosted in the Hub doesn't have them.
d
Looks like that is the same one. Thanks, I work on testing that one!
🙌 1
The installation of plugin fails on Meltano 3.8.0. Any way to install manually?
Copy code
~/source/meltano/singer_replication$ meltano add tap-mssql --variant singer-io
Added extractor 'tap-mssql' to your project
Variant:        singer-io
Repository:     <https://github.com/singer-io/tap-mssql>
Documentation:  <https://hub.meltano.com/extractors/tap-mssql--singer-io>

2025-07-18T19:22:51.369006Z [info     ] Skipped installing extractor 'tap-mssql'
2025-07-18T19:22:51.369190Z [info     ] Skipped installing 1/1 plugins

To learn more about extractor 'tap-mssql', visit <https://hub.meltano.com/extractors/tap-mssql--singer-io>
~/source/meltano/singer_replication$ meltano config tap-mssql test
2025-07-18T19:24:48.398054Z [info     ] The default environment 'dev' will be ignored for `meltano config`. To configure a specific environment, please use the option `--environment=<environment name>`.
2025-07-18T19:24:48.573616Z [error    ] Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to join our friendly Slack community.

Plugin configuration is invalid
Executable 'tap-mssql' could not be found. Extractor 'tap-mssql' may not have been installed yet using `meltano install extractor tap-mssql`, or the executable name may be incorrect.
It looks like manually build and install it on the system and then set the executable property? https://gitlab.com/meltano/meltano/-/issues/2341#note_407635818
e
Ah, it's clojure based so that's why it doesn't have a
pip_url
. https://github.com/singer-io/tap-mssql?tab=readme-ov-file#requirements