# singer-tap-development
m
This didn’t work the way I expected - is there a way to validate an ArrayType property such that it can be any combination of the “allowed values”?
th.Property(
            "operation_types",
            th.ArrayType(th.StringType),
            required=False,
            description=(
                "List of MongoDB change stream operation types to include in tap output. The default behavior is to "
                "limit to document-level operation types. See full list of operation types at "
                "https://www.mongodb.com/docs/manual/reference/change-events/#operation-types. Note that the list "
                "of allowed_values for this property includes some values not available to all MongoDB versions."
            ),
            default=[
                "create",
                "delete",
                "insert",
                "replace",
                "update",
            ],
            allowed_values=[
                "create",
                "createIndexes",
                "delete",
                "drop",
                "dropDatabase",
                "dropIndexes",
                "insert",
                "invalidate",
                "modify",
                "rename",
                "replace",
                "shardCollection",
                "update",
            ],
        ),
2023-04-11T13:43:37.069262Z [info     ]     raise ConfigValidationError(summary) cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-04-11T13:43:37.069428Z [info     ] singer_sdk.exceptions.ConfigValidationError: Config validation failed: ['create', 'delete', 'insert', 'replace', 'update'] is not one of ['create', 'createIndexes', 'delete', 'drop', 'dropDatabase', 'dropIndexes', 'insert', 'invalidate', 'modify', 'rename', 'replace', 'shardCollection', 'update'] cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
d
Currently the SDK uses `enum` to represent allowed values in the JSON schema, and that doesn’t work for arrays unless the `enum` is moved into `items` (https://github.com/meltano/sdk/blob/main/singer_sdk/typing.py#L479). Now:
"operation_types": {
      "type": ["array", "null"],
      "items": { "type": ["string"] },
      "enum": [
                "create",
                "createIndexes",
                "delete",
                "drop",
                "dropDatabase",
                "dropIndexes",
                "insert",
                "invalidate",
                "modify",
                "rename",
                "replace",
                "shardCollection",
                "update"
      ]
    }
Should be:
"operation_types": {
      "type": ["array", "null"],
      "items": {
          "type": ["string"],
          "enum": [
                "create",
                "createIndexes",
                "delete",
                "drop",
                "dropDatabase",
                "dropIndexes",
                "insert",
                "invalidate",
                "modify",
                "rename",
                "replace",
                "shardCollection",
                "update"
          ]
      }
    }
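To make the difference concrete, here is a minimal sketch in plain Python. It is not the `jsonschema` library the SDK relies on; `check_enum` is a hypothetical helper that models only the `enum` and `items` keywords, and the shortened `ALLOWED` list is made up for illustration. It shows why a top-level `enum` compares the whole list against each allowed string, while `enum` inside `items` checks each element:

```python
def check_enum(schema, value):
    """Return True if `value` satisfies the `enum` and `items` keywords of `schema`.

    Hypothetical, deliberately tiny model: no other JSON Schema keyword is handled.
    """
    # `enum` compares the whole instance against each allowed entry.
    if "enum" in schema and value not in schema["enum"]:
        return False
    # `items` applies its subschema to every element of an array.
    if "items" in schema and isinstance(value, list):
        return all(check_enum(schema["items"], item) for item in value)
    return True


# Shortened list of allowed values, for illustration only.
ALLOWED = ["create", "delete", "insert", "replace", "update", "drop"]

# Shape the SDK currently emits: `enum` sits next to `items`.
enum_on_array = {"type": ["array", "null"], "items": {"type": ["string"]}, "enum": ALLOWED}
# Proposed shape: `enum` sits inside `items`.
enum_on_items = {"type": ["array", "null"], "items": {"type": ["string"], "enum": ALLOWED}}

config_value = ["create", "delete", "insert"]

print(check_enum(enum_on_array, config_value))           # False: the list itself is not an allowed value
print(check_enum(enum_on_items, config_value))           # True: every element is allowed
print(check_enum(enum_on_items, ["testOp", "create"]))   # False: 'testOp' is not allowed
```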
m
hmm, thank you - so it seems like I need to move that `enum` property into the `items` object to get the behavior I want?
this seems like a bug if that’s the case
d
It’s more like an extension of the SDK’s `allowed_values` feature, something like this:
if self.allowed_values:
    if "items" in type_dict:  # array type; just a concept
        # constrain each element rather than the array as a whole
        type_dict["items"].update({"enum": self.allowed_values})
    else:
        type_dict.update({"enum": self.allowed_values})
I’ve created an issue: https://github.com/meltano/sdk/issues/1600, since there are already at least two of us who need this feature (:
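As a standalone sketch of that concept, outside the SDK: `apply_allowed_values` below is a hypothetical helper, not an SDK function, and it assumes an array type can be recognised by the presence of an `items` key in the type dict. It returns a new dict instead of mutating its input:

```python
def apply_allowed_values(type_dict, allowed_values):
    """Return a copy of `type_dict` with `enum` placed where it constrains values.

    Hypothetical helper; assumes array types carry an `items` subschema.
    """
    result = dict(type_dict)
    if not allowed_values:
        return result
    if "items" in result:
        # Array type: constrain each element, not the array as a whole.
        result["items"] = {**result["items"], "enum": list(allowed_values)}
    else:
        # Scalar type: `enum` applies to the value directly.
        result["enum"] = list(allowed_values)
    return result


array_schema = apply_allowed_values(
    {"type": ["array", "null"], "items": {"type": ["string"]}},
    ["create", "delete"],
)
scalar_schema = apply_allowed_values({"type": ["string"]}, ["create", "delete"])

print(array_schema)
print(scalar_schema)
```

For the array case, the `enum` ends up under `items` and no top-level `enum` is added; the scalar case keeps the current behavior.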
m
this is gross, but works great:
```
config_jsonschema = th.PropertiesList(
    th.Property(
        "mongodb_connection_string",
        th.StringType,
        required=False,
        secret=True,
        description=(
            "MongoDB connection string. See "
            "https://www.mongodb.com/docs/manual/reference/connection-string/#connection-string-uri-format "
            "for specification."
        ),
    ),
    th.Property(
        "mongodb_connection_string_file",
        th.StringType,
        required=False,
        description="Path (relative or absolute) to a file containing a MongoDB connection string URI.",
    ),
    th.Property(
        "prefix",
        th.StringType,
        required=False,
        default="",
        description="An optional prefix which will be added to each stream name.",
    ),
    th.Property(
        "start_date",
        th.DateTimeType,
        required=False,
        description=(
            "Start date. This is used for incremental replication only. Log based replication does not support "
            "this setting - do not provide it unless using the incremental replication method. Defaults to "
            "epoch zero time 1970-01-01 if tap uses incremental replication method."
        ),
    ),
    th.Property(
        "database_includes",
        th.ArrayType(
            th.ObjectType(
                th.Property("database", th.StringType, required=True),
                th.Property("collection", th.StringType, required=True),
            ),
        ),
        required=True,
        description=(
            "A list of objects, each specifying database and collection name, to be included in tap output."
        ),
    ),
    th.Property(
        "add_record_metadata",
        th.BooleanType,
        required=False,
        default=False,
        description="When True, _sdc metadata fields will be added to records produced by this tap.",
    ),
    th.Property(
        "allow_modify_change_streams",
        th.BooleanType,
        required=False,
        default=False,
        description=(
            "In DocumentDB (unlike MongoDB), change streams must be enabled specifically (see "
            "https://docs.aws.amazon.com/documentdb/latest/developerguide/change_streams.html#change_streams-enabling"
            "). If attempting to open a change stream against a collection on which change streams have not been "
            "enabled, an OperationFailure error will be raised. If this property is set to True, when this error "
            "is seen, the tap will execute an admin command to enable change streams and then retry the read "
            "operation. Note: this may incur new costs in AWS DocumentDB."
        ),
    ),
    th.Property(
        "operation_types",
        th.ArrayType(th.StringType),
        required=False,
        description=(
            "List of MongoDB change stream operation types to include in tap output. The default behavior is to "
            "limit to document-level operation types. See full list of operation types at "
            "https://www.mongodb.com/docs/manual/reference/change-events/#operation-types. Note that the list "
            "of allowed_values for this property includes some values not available to all MongoDB versions."
        ),
        default=[
            "create",
            "delete",
            "insert",
            "replace",
            "update",
        ],
    ),
).to_dict()
# Patch the generated schema so the enum constrains each array element.
config_jsonschema["properties"]["operation_types"]["items"]["enum"] = [
    "create",
    "createIndexes",
    "delete",
    "drop",
    "dropDatabase",
    …
```
if I pass an invalid value in the array, it fails with the correct error:
operation_types:
          - testOp    # not supported
          - create
          - delete
          - insert
          - replace
          - update
fails with:
2023-04-11T21:27:44.470506Z [info     ]     raise ConfigValidationError(summary) cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
2023-04-11T21:27:44.470774Z [info     ] singer_sdk.exceptions.ConfigValidationError: Config validation failed: 'testOp' is not one of ['create', 'createIndexes', 'delete', 'drop', 'dropDatabase', 'dropIndexes', 'insert', 'invalidate', 'modify', 'rename', 'replace', 'shardCollection', 'update'] cmd_type=elb consumer=False name=tap-mongodb producer=True stdio=stderr string_id=tap-mongodb
d