# troubleshooting
j
Hello all. I am new to Meltano. I was able to set up an Ubuntu server running Meltano, and I configured tap-cloudwatch and target-postgres. When I use a query like the one below, I get an error. If I take the | out, it runs fine. Can you not have a pipe?
fields @timestamp, @message | sort @timestamp desc | limit 25
e
Hi @Jason Collins! What's the error you get and what does your `meltano.yml` look like?
j
2025-01-10T17:58:44.295602Z [info     ] Incremental state has been updated at 2025-01-10 17:58:44.295557+00:00.
2025-01-10T17:58:44.487575Z [error    ] Extractor failed              
2025-01-10T17:58:44.487948Z [error    ] Block run completed.           block_type=ExtractLoadBlocks err=RunnerError('Extractor failed') exit_codes={<PluginType.EXTRACTORS: 'extractors'>: 1} set_number=0 success=False
version: 1
default_environment: dev
project_id: 07d96793-b7e7-4401-8406-5c69d3e089fb
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-cloudwatch
    variant: meltanolabs
    pip_url: git+https://github.com/meltanolabs/tap-cloudwatch.git
    config:
      log_group_name: redacted
      query: fields @timestamp, @message | limit 25
      start_date: '2025-01-06T00:00:00Z'
      aws_region_name: us-east-1
      aws_endpoint_url: ''
  loaders:
  - name: target-postgres
    variant: meltanolabs
    pip_url: meltanolabs-target-postgres
    config:
      database: meltano-cw-test
      hard_delete: false
      host: redacted
      user: redacted
      port: 5432
      use_copy: false
      batch_size_rows: 100
e
Gotcha, can you try `meltano invoke tap-cloudwatch` and share the results?
j
Sorry, that was also the wrong one. Let me get you the right yml. I have been changing the query around.
version: 1
default_environment: dev
project_id: 07d96793-b7e7-4401-8406-5c69d3e089fb
environments:
- name: dev
- name: staging
- name: prod
plugins:
  extractors:
  - name: tap-cloudwatch
    variant: meltanolabs
    pip_url: git+https://github.com/meltanolabs/tap-cloudwatch.git
    config:
      log_group_name: redacted
      query: fields @timestamp, @message | sort @timestamp desc | limit 25
      start_date: '2025-01-06T00:00:00Z'
      aws_region_name: us-east-1
      aws_endpoint_url: ''
  loaders:
  - name: target-postgres
    variant: meltanolabs
    pip_url: meltanolabs-target-postgres
    config:
      database: meltano-cw-test
      hard_delete: false
      host: redacted
      user: redacted
      port: 5432
      use_copy: false
      batch_size_rows: 100
Not sure how much of the invoke you need.
File "/home/ubuntu/dscwlogs-postgres/.meltano/extractors/tap-cloudwatch/venv/lib/python3.10/site-packages/tap_cloudwatch/client.py", line 69, in get_records
    for batch in cloudwatch_iter:
  File "/home/ubuntu/dscwlogs-postgres/.meltano/extractors/tap-cloudwatch/venv/lib/python3.10/site-packages/tap_cloudwatch/cloudwatch_api.py", line 139, in get_records_iterator
    self._validate_query(query)
  File "/home/ubuntu/dscwlogs-postgres/.meltano/extractors/tap-cloudwatch/venv/lib/python3.10/site-packages/tap_cloudwatch/cloudwatch_api.py", line 91, in _validate_query
    raise InvalidQueryException("sort not allowed")
tap_cloudwatch.exception.InvalidQueryException: sort not allowed
e
That's the piece I needed, thanks! It seems the tap adds the sorting itself, so it fails if the user passes a sort in the query: https://github.com/MeltanoLabs/tap-cloudwatch/blob/28ddd2fb45519cf3d435796c58095bbe1d7b469f/tap_cloudwatch/subquery.py#L106-L108
The reason for using `asc` sorting is to make replication resumable.
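To illustrate why ascending order helps with resuming, here is a minimal sketch. It is not the tap's actual code: the `sync` function, the `fail_after` parameter, and the record shape are all hypothetical, and it assumes ISO-8601 UTC timestamps that compare correctly as strings. The idea is that when records are emitted oldest first, the timestamp of the last emitted record is a safe bookmark, because everything older has already been written.

```python
from typing import Optional


def sync(records: list, bookmark: Optional[str] = None,
         fail_after: Optional[int] = None) -> tuple:
    """Emit records oldest-first and return (emitted, new_bookmark).

    Hypothetical illustration only; names are not from tap-cloudwatch.
    """
    emitted = []
    for record in sorted(records, key=lambda r: r["timestamp"]):
        # Skip anything at or before the bookmark: it was already replicated.
        if bookmark is not None and record["timestamp"] <= bookmark:
            continue
        # Simulate the run being interrupted part-way through.
        if fail_after is not None and len(emitted) >= fail_after:
            break
        emitted.append(record)
        # Ascending order means every older record is already emitted,
        # so this timestamp is safe to resume from.
        bookmark = record["timestamp"]
    return emitted, bookmark


logs = [
    {"timestamp": "2025-01-06T00:00:03Z", "message": "c"},
    {"timestamp": "2025-01-06T00:00:01Z", "message": "a"},
    {"timestamp": "2025-01-06T00:00:04Z", "message": "d"},
    {"timestamp": "2025-01-06T00:00:02Z", "message": "b"},
]

first_run, state = sync(logs, fail_after=2)    # interrupted after two records
second_run, state = sync(logs, bookmark=state)  # resumes where it stopped
```

With descending order the same bookmark would not work: an interrupted run would have written the newest records first, and the older ones could not be found again by filtering on "newer than the bookmark".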
j
That makes sense. Honestly, I'm not too worried about that. Once it is in the database it can be sorted. I get the error on the limit as well.
File "/home/ubuntu/dscwlogs-postgres/.meltano/extractors/tap-cloudwatch/venv/lib/python3.10/site-packages/tap_cloudwatch/cloudwatch_api.py", line 139, in get_records_iterator
    self._validate_query(query)
  File "/home/ubuntu/dscwlogs-postgres/.meltano/extractors/tap-cloudwatch/venv/lib/python3.10/site-packages/tap_cloudwatch/cloudwatch_api.py", line 93, in _validate_query
    raise InvalidQueryException("limit not allowed")
tap_cloudwatch.exception.InvalidQueryException: limit not allowed
Looks like, if I am reading the tap right, it handles the limit as well.
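For anyone hitting the same two errors, a small helper along these lines could clean a Logs Insights query before it goes into the `query` setting. This is only a sketch: `strip_disallowed` and `DISALLOWED` are hypothetical names, the list of rejected commands is taken from the two tracebacks above rather than from the tap's source, and the naive split on `|` assumes no pipe character appears inside a quoted string or regex in the query.

```python
# Commands the tracebacks above show being rejected by the tap.
DISALLOWED = ("sort", "limit")


def strip_disallowed(query: str) -> str:
    """Drop pipeline stages whose command is `sort` or `limit`.

    Hypothetical helper; splits naively on "|" and so assumes
    no pipe characters occur inside string or regex literals.
    """
    stages = [stage.strip() for stage in query.split("|")]
    kept = [
        stage
        for stage in stages
        if stage and stage.split()[0].lower() not in DISALLOWED
    ]
    return " | ".join(kept)


cleaned = strip_disallowed(
    "fields @timestamp, @message | sort @timestamp desc | limit 25"
)
print(cleaned)
```

Other piped commands are left alone, so only the stages named in the tracebacks are removed.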
j
Got it, thanks. I will keep hammering. Great tool so far.
e
Awesome! Do let me know if you run into any issues.