bubba
03/23/2023, 8:42 PM .devcontainer. After switching, I am running into errors running tap-salesforce -> target-snowflake. I was able to successfully run meltano invoke tap-salesforce > out.jsonl and then cat out.jsonl | meltano invoke target-snowflake-sfdc, but when running meltano run tap-salesforce target-snowflake I get an error. I will also note that it does not fail on smaller sync volumes. More info in the thread.
bubba
03/23/2023, 8:42 PM .devcontainer
// For format details, see https://aka.ms/devcontainer.json. For config options, see the
// README at: https://github.com/devcontainers/templates/tree/main/src/python
{
  "name": "Meltano Devcontainer",
  // Or use a Dockerfile or Docker Compose file. More info: https://containers.dev/guide/dockerfile
  "image": "mcr.microsoft.com/devcontainers/python:0-3.9",
  // Features to add to the dev container. More info: https://containers.dev/features.
  "features": {
    "github-cli": {
      "version": "latest"
    },
    "ghcr.io/devcontainers/features/docker-in-docker:1": {
      "version": "latest"
    },
    "ghcr.io/devcontainers-contrib/features/meltano": {},
    "ghcr.io/devcontainers/features/docker-in-docker:2": {},
    "ghcr.io/devcontainers/features/node:1": {},
    "ghcr.io/devcontainers-contrib/features/meltano:2": {}
  },
  "customizations": {
    "vscode": {
      "extensions": [
        "z3z1ma.meltano-power-user",
        "streetsidesoftware.code-spell-checker",
        "mechatroner.rainbow-csv",
        "bastienboutonnet.vscode-dbt",
        "eamodio.gitlens",
        "4ops.terraform",
        "redhat.vscode-yaml",
        "albert.TabOut",
        "ms-python.isort",
        "ms-python.flake8",
        "njpwerner.autodocstring",
        "samuelcolvin.jinjahtml",
        "Alpha4.jsonl"
      ],
      "settings": {
        "python.pythonPath": "/usr/local/bin/python"
      }
    }
  },
  "postStartCommand": "./.devcontainer/setup.sh && meltano --cwd=./ install --parallelism=4"
  // Use 'forwardPorts' to make a list of ports inside the container available locally.
  // "forwardPorts": [],
  // Use 'postCreateCommand' to run commands after the container is created.
  // "postCreateCommand": "pip3 install --user -r requirements.txt",
  // Configure tool-specific properties.
  // "customizations": {},
  // Uncomment to connect as root instead. More info: https://aka.ms/dev-containers-non-root.
  // "remoteUser": "root"
}
bubba
03/23/2023, 8:43 PM meltano.yml
version: 1
default_environment: dev
project_id: a1263922-28e7-4c50-98ff-2177807be279
environments:
- name: dev
- name: prod
send_anonymous_usage_stats: false
plugins:
  extractors:
  - name: tap-salesforce
    variant: meltanolabs
    pip_url: git+https://github.com/meltanolabs/tap-salesforce.git@v1.5.0
    config:
      api_type: BULK
    select:
    - "!*.*Maps_Latitude__c"
    - "!*.*Maps_Longitude__c"
    - Lead.*
  loaders:
  - name: target-snowflake
    variant: transferwise
    pip_url: pipelinewise-target-snowflake
    config:
      temp_dir: output
  - name: target-snowflake-sfdc
    inherit_from: target-snowflake
    config:
      add_metadata_columns: true
      file_format: POC.MELTANO.SALESFORCE_CSV
      default_target_schema: SALESFORCE
bubba
03/23/2023, 8:43 PM
2023-03-23T20:31:29.745908Z [info ] INFO Making GET request to https://7shifts.my.salesforce.com/services/async/41.0/job/750JA000001AbJAYA0/batch/751JA000001ZwXLYA0 with params: None cmd_type=elb consumer=False name=tap-salesforce producer=True stdio=stderr string_id=tap-salesforce
2023-03-23T20:31:29.823489Z [info ] INFO METRIC: {"type": "timer", "metric": "http_request_duration", "value": 0.07719254493713379, "tags": {"endpoint": "get_batch", "status": "succeeded"}} cmd_type=elb consumer=False name=tap-salesforce producer=True stdio=stderr string_id=tap-salesforce
2023-03-23T20:31:29.847603Z [info ] INFO Making GET request to https://7shifts.my.salesforce.com/services/async/41.0/job/750JA000001AbJAYA0/batch/751JA000001ZwXLYA0/result with params: None cmd_type=elb consumer=False name=tap-salesforce producer=True stdio=stderr string_id=tap-salesforce
2023-03-23T20:31:29.916989Z [info ] INFO METRIC: {"type": "timer", "metric": "http_request_duration", "value": 0.06950092315673828, "tags": {"endpoint": "batch_result_list", "sobject": "Lead", "status": "succeeded"}} cmd_type=elb consumer=False name=tap-salesforce producer=True stdio=stderr string_id=tap-salesforce
2023-03-23T20:31:29.918186Z [info ] INFO Making GET request to https://7shifts.my.salesforce.com/services/async/41.0/job/750JA000001AbJAYA0/batch/751JA000001ZwXLYA0/result/752JA000000jgkw with params: None cmd_type=elb consumer=False name=tap-salesforce producer=True stdio=stderr string_id=tap-salesforce
2023-03-23T20:32:06.399501Z [info ] INFO METRIC: {"type": "counter", "metric": "record_count", "value": 1, "tags": {"endpoint": "Lead"}} cmd_type=elb consumer=False name=tap-salesforce producer=True stdio=stderr string_id=tap-salesforce
2023-03-23T20:32:36.255855Z [error ]
Traceback (most recent call last):
File "/usr/local/py-utils/venvs/meltano/lib/python3.9/site-packages/meltano/core/logging/output_logger.py", line 203, in redirect_logging
yield
File "/usr/local/py-utils/venvs/meltano/lib/python3.9/site-packages/meltano/core/block/extract_load.py", line 435, in run
await self.run_with_job()
File "/usr/local/py-utils/venvs/meltano/lib/python3.9/site-packages/meltano/core/block/extract_load.py", line 461, in run_with_job
await self.execute()
File "/usr/local/py-utils/venvs/meltano/lib/python3.9/site-packages/meltano/core/block/extract_load.py", line 427, in execute
await manager.run()
File "/usr/local/py-utils/venvs/meltano/lib/python3.9/site-packages/meltano/core/block/extract_load.py", line 625, in run
await self._wait_for_process_completion(self.elb.head)
File "/usr/local/py-utils/venvs/meltano/lib/python3.9/site-packages/meltano/core/block/extract_load.py", line 690, in _wait_for_process_completion
raise output_futures_failed.exception()
File "/usr/local/py-utils/venvs/meltano/lib/python3.9/site-packages/meltano/core/logging/utils.py", line 236, in capture_subprocess_output
if not await _write_line_writer(writer, line):
File "/usr/local/py-utils/venvs/meltano/lib/python3.9/site-packages/meltano/core/logging/utils.py", line 206, in _write_line_writer
await writer.wait_closed()
File "/usr/local/lib/python3.9/asyncio/streams.py", line 359, in wait_closed
await self._protocol._get_close_waiter(self)
File "/usr/local/py-utils/venvs/meltano/lib/python3.9/site-packages/meltano/core/logging/utils.py", line 204, in _write_line_writer
await writer.drain()
File "/usr/local/lib/python3.9/asyncio/streams.py", line 387, in drain
await self._protocol._drain_helper()
File "/usr/local/lib/python3.9/asyncio/streams.py", line 197, in _drain_helper
await waiter
BrokenPipeError
bubba
03/23/2023, 8:46 PM Docker Desktop 4.17.0 (99724), Apple M1 Pro, Ventura 13.2.1
Sven Balnojan
03/24/2023, 9:09 AM
bubba
03/24/2023, 3:25 PM --log-level=debug didn't give me any additional information. By smaller sync volumes I mean only a couple hundred rows rather than a couple thousand. I'm running the default batch size of 100,000 rows for Snowflake but also tried 25,000 and 10,000. I am relatively new to containers, so I'm not sure how to scale up. I'll try outputting to jsonl and might try messing with container volumes. Thanks for the suggestions! I'll post if I find anything.
bubba
03/24/2023, 4:47 PM meltano run tap-salesforce target-jsonl works without error.
Sven Balnojan
03/27/2023, 7:30 AM
bubba
03/27/2023, 5:22 PM I had 6GB allocated to Docker on my Mac; after increasing it to 12GB the issue went away. I think the issue comes down to the way the SFDC tap works. Even when limiting the batch size in the Snowflake target I still ran into the issue. My understanding of how the tap works is lacking, but I'm guessing it's pulling all or most of the data and storing it in memory. Some of the objects have 400+ columns, so it could just be the large amount of data that is the issue.
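
For anyone landing on this thread later: the traceback ends in BrokenPipeError while Meltano is writing tap output into the loader's stdin, which is what a writer sees when the process on the other end of a pipe has gone away. That would be consistent with the loader being killed when the container ran out of memory, though the thread does not confirm this. Below is a minimal sketch of the mechanism only, not Meltano's code. The function name write_until_pipe_breaks and the one-line consumer are hypothetical stand-ins, and it assumes a POSIX system.

```python
import subprocess
import sys


def write_until_pipe_breaks(max_lines: int = 1_000_000) -> str:
    """Write records to a consumer that exits early and report what happens."""
    # Hypothetical stand-in for a loader that dies mid-run:
    # it reads a single line from stdin and then exits.
    consumer = subprocess.Popen(
        [sys.executable, "-c", "import sys; sys.stdin.readline()"],
        stdin=subprocess.PIPE,
    )
    try:
        for i in range(max_lines):
            # Stand-in for the tap's record stream.
            consumer.stdin.write(b'{"type": "RECORD", "n": %d}\n' % i)
            consumer.stdin.flush()
    except BrokenPipeError:
        # The read end of the pipe is closed because the consumer is gone.
        return "BrokenPipeError"
    finally:
        try:
            consumer.stdin.close()
        except BrokenPipeError:
            pass  # closing flushes leftover buffered data into the dead pipe
        consumer.wait()
    return "completed"


if __name__ == "__main__":
    print(write_until_pipe_breaks())
```

The writer only learns that the consumer is gone on a later write, so the error surfaces on the producer side even though the failure happened in the consumer. If that is what happened here, checking whether the loader process was killed is likely more informative than the tap logs.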