Asking for a friend... Whats the loader/extract th...
# best-practices
n
Asking for a friend... What's the loader/extractor that I can use to perform ETL from a folder that contains lots of JSON files to a Postgres DB? I have tried to use https://hub.meltano.com/extractors/tap-singer-jsonl/ but am running into this error:
(etl) ➜  etl git:(main) ✗ meltano config tap-singer-jsonl set local.folders /Users/noamsiegel/Downloads/tripadvisor-matched-files/

2025-02-04T07:34:25.310094Z [info     ] The default environment 'dev' will be ignored for `meltano config`. To configure a specific environment, please use the option `--environment=<environment name>`.
Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
join our friendly Slack community.

Failed to parse JSON array from string: '/Users/noamsiegel/Downloads/tripadvisor-matched-files/'
I tried both with and without the trailing "/" at the end of the folder path.
r
n
@Reuben (Matatika) I tried that, but my JSON is nested... not sure how to proceed.
r
What do you mean by "nested"? Like multiple JSON files in different subdirectories? The tap should be able to handle that IIRC.
n
@Reuben (Matatika) I mean that it is not a flat JSON structure. More like this:
{
  "company": {
    "name": "TechCorp",
    "location": "San Francisco",
    "departments": [
      {
        "name": "Engineering",
        "manager": {
          "name": "Alice Johnson",
          "title": "VP of Engineering"
        },
        "teams": [
          {
            "name": "Backend",
            "lead": {
              "name": "Bob Smith",
              "title": "Senior Backend Engineer"
            },
            "members": [
              {
                "name": "Charlie Green",
                "title": "Backend Engineer",
                "skills": [
                  "Python",
                  "Django",
                  "PostgreSQL"
                ],
                "projects": [
                  {
                    "name": "User Authentication",
                    "status": "In Progress"
                  },
                  {
                    "name": "API Development",
                    "status": "Completed"
                  }
                ]
              },
              {
                "name": "Dana White",
                "title": "Database Administrator",
                "skills": [
                  "SQL",
                  "AWS RDS",
                  "Performance Optimization"
                ]
              }
            ]
          },
          {
            "name": "Frontend",
            "lead": {
              "name": "Eve Black",
              "title": "Senior Frontend Engineer"
            },
            "members": [
              {
                "name": "Frank Adams",
                "title": "UI Engineer",
                "skills": [
                  "React",
                  "CSS",
                  "Figma"
                ]
              }
            ]
          }
        ]
      },
      {
        "name": "Marketing",
        "manager": {
          "name": "Grace Lee",
          "title": "VP of Marketing"
        },
        "teams": [
          {
            "name": "Content",
            "lead": {
              "name": "Hank Miller",
              "title": "Content Strategist"
            },
            "members": [
              {
                "name": "Isla Brown",
                "title": "Copywriter",
                "skills": [
                  "SEO",
                  "Blog Writing"
                ]
              }
            ]
          }
        ]
      }
    ]
  }
}
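One way to get a document shaped like the one above into a flat Postgres table is to walk the nesting and emit one row per innermost object before loading. The following is only a minimal sketch under that assumption: the function name `iter_member_rows` and the output column names are hypothetical, and it is not something tap-singer-jsonl or any other tap does for you.

```python
import json
from typing import Any, Iterator


def iter_member_rows(doc: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield one flat row per team member from the nested company document."""
    company = doc["company"]
    for dept in company.get("departments", []):
        for team in dept.get("teams", []):
            for member in team.get("members", []):
                yield {
                    "company": company["name"],
                    "department": dept["name"],
                    "team": team["name"],
                    "member_name": member["name"],
                    "member_title": member["title"],
                    # Lists have no flat equivalent, so keep them as JSON text
                    # (they could go into a Postgres JSONB column).
                    "skills": json.dumps(member.get("skills", [])),
                    "projects": json.dumps(member.get("projects", [])),
                }


if __name__ == "__main__":
    sample = {
        "company": {
            "name": "TechCorp",
            "departments": [
                {
                    "name": "Engineering",
                    "teams": [
                        {
                            "name": "Backend",
                            "members": [
                                {
                                    "name": "Charlie Green",
                                    "title": "Backend Engineer",
                                    "skills": ["Python", "Django", "PostgreSQL"],
                                }
                            ],
                        }
                    ],
                }
            ],
        }
    }
    for row in iter_member_rows(sample):
        print(row)
```

Keeping `skills` and `projects` as JSON text avoids inventing extra tables; splitting them into child tables would be the alternative if they need to be queried relationally.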
r
I had a look at the README and it looks like JSON support expects a specific format: https://github.com/ets/tap-spreadsheets-anywhere?tab=readme-ov-file#json-support
Is this what you ran into?
Going back to tap-singer-jsonl then, the above error can be resolved by passing the local.folders setting as a JSON array:
meltano config tap-singer-jsonl set local.folders '["/Users/noamsiegel/Downloads/tripadvisor-matched-files/"]'
or just manually enter it in the meltano.yml yourself:
config:
  local:
    folders:
    - /Users/noamsiegel/Downloads/tripadvisor-matched-files/
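The error message suggests that the value of an array-typed setting is parsed as JSON, which is why a bare path fails and a quoted JSON array works. The sketch below only illustrates that behaviour with Python's standard `json` module; `parse_array_setting` is a hypothetical helper and the path is a placeholder, so this is not Meltano's actual implementation.

```python
import json


def parse_array_setting(raw: str) -> list:
    """Parse a command-line value for an array-typed setting as a JSON array."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as exc:
        # A bare path such as /path/to/json-files/ is not valid JSON.
        raise ValueError(f"Failed to parse JSON array from string: {raw!r}") from exc
    if not isinstance(value, list):
        # Valid JSON, but not an array (for example a quoted string).
        raise ValueError(f"Failed to parse JSON array from string: {raw!r}")
    return value


if __name__ == "__main__":
    # The JSON array form is accepted.
    print(parse_array_setting('["/path/to/json-files/"]'))

    # The bare path is rejected.
    try:
        parse_array_setting("/path/to/json-files/")
    except ValueError as err:
        print(err)
```

The single quotes around the array in the shell command matter for the same reason: they keep the double quotes and brackets intact so the value arrives as valid JSON.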
n
Thank you @Reuben (Matatika)! That, with a few more fixes, did the trick for me. Now playing with putting the data into a Postgres table.
👍 1