# plugins-general
t
hey hey all! could i get your insight on the extractor config? what variables do i put into the csv or all-spreadsheets fields for the extractor config? I've used the csv listed in the steps and the file name, and the extractor is not configured. I've used the CLI to list the extractor values needed, but I don't know what that readout means. help.
d
@thalia_elie Did you see https://meltano.com/plugins/extractors/csv.html#settings and https://meltano.com/plugins/extractors/spreadsheets-anywhere.html#settings? What does your configuration look like when you run `meltano config tap-csv` or `meltano config tap-spreadsheets-anywhere`?
t
yes
i've read and reread each and gone thru each step
the issue is that the values i need to submit aren't clear to me.
i don't know what i'm supposed to input for values for the config
d
OK, I assume you're talking about `entity`, `file` and `keys` under https://meltano.com/plugins/extractors/csv.html#files?
Those values will depend on the specific CSV files you'd like to extract data from
t
good to know. i do not have keys for this file. what keys and entity is Meltano referring to?
d
As described in those docs, `entity` is the name you'd like to give to the contents of this CSV file, which would be used as the table name in the DB you'll load the data into.
`keys` is a list of column names that together uniquely identify each row
These will correspond to primary keys in the DB once data is loaded
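To make the `keys` idea concrete, here is a minimal sketch (plain Python, not part of Meltano) that checks whether a chosen set of columns really identifies each row uniquely. The sample data and the `duplicate_keys` helper are made up for illustration.

```python
import csv
import io
from collections import Counter


def duplicate_keys(csv_text, keys):
    """Return the key-value combinations that occur in more than one row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(tuple(row[k] for k in keys) for row in reader)
    return [combo for combo, n in counts.items() if n > 1]


# Hypothetical sample data, not the actual file from this thread.
SAMPLE = """name,age,gender
Ana,34,F
Ben,41,M
Ana,29,F
"""

# `name` alone repeats, so it would be a poor choice of key for this data.
print(duplicate_keys(SAMPLE, ["name"]))
# `name` and `age` together are unique for every row here.
print(duplicate_keys(SAMPLE, ["name", "age"]))
```

An empty list means the chosen columns are safe to use as `keys` for that file.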
t
ok. to answer the question you asked earlier, I used the UI to enter the data into the fields and the option to configure wasn't working. when I run the command in the CLI the response is: meltano command not found
d
OK, I definitely recommend starting with the CLI, which is the preferred and most feature complete way of interacting with Meltano right now
Did you follow the steps under https://meltano.com/docs/getting-started.html#install-meltano? Those should end with a working `meltano` command
t
i've seen in other comments that the UI is not supported so I am working thru the CLI.
yes. that's what i'm working on now
i'm back to the extractor setup now and I've listed it
this is what i get:
(.venv) ELSPHIM-400998:my-meltano-project eliet$ meltano config tap-csv list
files [env: TAP_CSV_FILES] current value: None (from default)
  Array of objects with `entity`, `file`, and `keys` keys
csv_files_definition [env: TAP_CSV_FILES_DEFINITION, TAP_CSV_CSV_FILES_DEFINITION] current value: None (from default)
  CSV Files Definition: Project-relative path to JSON file holding array of objects with `entity`, `file`, and `keys` keys
d
Right, `files` and `csv_files_definition` are the two supported settings, and the docs describe in more detail how either can be configured
The output lists the env vars that can be used to set each setting, as well as their current values (None in both cases)
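As a rough illustration of what an env var like `TAP_CSV_FILES` could hold: because `files` is an array of objects, the sketch below assumes the value is passed as a JSON-encoded string. That encoding is an assumption to verify against the Meltano docs, and the entity, file, and key values are hypothetical.

```python
import json
import os

# Hypothetical definition: one CSV file with a single key column.
files = [{"entity": "users", "file": "data/users.csv", "keys": ["id"]}]

# Assumption: an array setting travels through an env var as JSON text.
# Setting it here only affects this Python process and its children.
os.environ["TAP_CSV_FILES"] = json.dumps(files)

# Reading it back shows the structure survives the round trip.
decoded = json.loads(os.environ["TAP_CSV_FILES"])
print(decoded[0]["entity"], decoded[0]["keys"])
```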
t
i understand that the docs share information, and I do note that you are referencing them each time. i am still not clear on this config setup, which is why I joined the slack: to gain insight from a person about the docs. each section has a link to another doc, which creates a long list of tabs, but it isn't clear to me what values I need to put where
d
You're right, the docs can be a little overwhelming and take you into a rabbit hole of new concepts to learn! Let me be more specific: In your Meltano project, do you see `meltano.yml`? You can configure the `files` setting for `tap-csv` and its `entity`, `file`, and `keys` subproperties using a `config` entry, like in the example here: https://meltano.com/plugins/extractors/csv.html#how-to-use
plugins:
  extractors:
  - name: tap-csv
    variant: meltano
    pip_url: git+https://gitlab.com/meltano/tap-csv.git
    config:
      files:
        - entity: <entity>
          file: <path>
          keys: [<key>]
        # ...
t
so i see this set up:
meltano config <plugin> set <setting> <value>
and an example below. but these variables do not match what I'm getting from our slack thread. there aren't file definitions, entities or keys here. i would love to know if the extractor config for tap-csv or tap-spreadsheets-anywhere specifies where these values are entered? in the yaml? within a select command? that's really what I'm asking. where do I put the values needed to configure the extractor?
ok
i will work on that, thanks
d
And I'll work on https://gitlab.com/meltano/meltano/-/issues/2383 and https://gitlab.com/meltano/meltano/-/issues/2382 which I think would've helped you here 🙂
t
i've configured the yml file and added the following variables to a tap_csv_files_definition in a separate file.
d
OK! What does `meltano config tap-csv list` look like now?
t
it seems like my next step is to test with
meltano config <plugin> list
d
Does it list the path as the current value of `csv_files_definition`?
t
i get: TypeError: 'NoneType' object is not a mapping
d
Hmm what does `meltano.yml` look like?
t
no, i listed the path to my csv file. i can show both.
d
Please do
t
meltano.yml
csv files definition
d
OK, a few things:
• You're seeing an error because your YAML is invalid: `files: ...` needs to be nested under `config:`, so it needs one more level of indentation (2 more spaces ahead of each line)
If you fix that, what does `meltano config tap-csv list` show?
• If you'd like to store your files definition in a separate file, you need to set `csv_files_definition: path/to/file.json` under `config` in `meltano.yml`, instead of including the `files` config there directly
Of course, you can also keep the `files` in `meltano.yml` as you have now, without the separate file
• Your JSON file will likely fail to be parsed because of the `// ...`, which is not actually valid JSON, but is included in the docs example to illustrate that more JSON objects with their own entity/file/keys could follow the one in the example
Good point that it's confusing that the example has that line, while you shouldn't actually keep it in your file! I'll fix that
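The JSON point above is easy to check with Python's standard `json` module. This is only a sketch: the entity, file, and key values are hypothetical, and `is_valid_json` is a helper made up for the example.

```python
import json

# A definition as it might look when copied from the docs,
# with the placeholder comment line left in.
COPIED = """[
  {"entity": "users", "file": "users.csv", "keys": ["id"]}
  // ...
]"""

# The same definition with the placeholder line removed.
CLEANED = """[
  {"entity": "users", "file": "users.csv", "keys": ["id"]}
]"""


def is_valid_json(text):
    """Return True if `text` parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print("copied parses: ", is_valid_json(COPIED))
print("cleaned parses:", is_valid_json(CLEANED))
```

JSON has no comment syntax, so the `// ...` line has to be deleted, or replaced by real objects separated with commas.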
t
i fixed the indentations. moved files in 2 spaces and everything else below it:
the error reads: expected <block end>, but found '<block mapping start>'  in "/Users/eliet/meltano-projects/my-meltano-project/meltano.yml", line 12, column 12
d
Yeah, your `file:` and `keys:` lines below `entity:` should have `file` and `keys` start at the same indentation level as `entity` itself
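Both indentation fixes in this thread come down to the nesting the YAML is supposed to describe. The sketch below writes that nesting out as Python data: `files` lives inside `config`, and `entity`, `file`, and `keys` are siblings in one mapping. The `merged_config` helper is hypothetical, not Meltano's actual code; it only shows one plausible way an empty `config:` could lead to an error like the `'NoneType' object is not a mapping` seen earlier.

```python
# Mis-indented case: `files:` sits beside `config:` instead of under it,
# so `config:` has no content and parses to None.
MISNESTED = {
    "name": "tap-csv",
    "config": None,
    "files": [
        {"entity": "GitFlixUsers", "file": "GitFlixUsers.csv", "keys": ["name"]},
    ],
}

# Intended case: `files` is nested under `config`, and each list item is a
# single mapping whose three keys share one indentation level.
NESTED = {
    "name": "tap-csv",
    "config": {
        "files": [
            {"entity": "GitFlixUsers", "file": "GitFlixUsers.csv", "keys": ["name"]},
        ],
    },
}

DEFAULTS = {"files": None, "csv_files_definition": None}


def merged_config(plugin, defaults):
    """Overlay a plugin's `config` mapping on top of default values."""
    return {**defaults, **plugin["config"]}


try:
    merged_config(MISNESTED, DEFAULTS)
except TypeError as err:
    print("mis-nested:", err)

print("nested:", merged_config(NESTED, DEFAULTS)["files"][0]["entity"])
```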
t
ok. lined the rows up
here is the readout:
(.venv) ELSPHIM-400998:my-meltano-project eliet$ meltano config tap-csv list
files [env: TAP_CSV_FILES] current value: [{'entity': 'GitFlixUsers', 'file': '/Users/eliet/Desktop/GitFlixUsers', 'keys': ['name,age,gender,clv,avg_logins,logins']}] (from `meltano.yml`)
  Array of objects with `entity`, `file`, and `keys` keys
csv_files_definition [env: TAP_CSV_FILES_DEFINITION, TAP_CSV_CSV_FILES_DEFINITION] current value: None (from default)
  CSV Files Definition: Project-relative path to JSON file holding array of objects with `entity`, `file`, and `keys` keys
To learn more about extractor 'tap-csv' and its settings, visit https://meltano.com/plugins/extractors/csv.html
d
OK! That looks almost right, but I think your `file` value should have a `.csv` suffix to match the full path, and `keys` should only contain the columns that uniquely identify the row. If the `name` in each row is expected to be unique, that'd be a good candidate
I'm gonna break for lunch now, but I'll be back to help later this afternoon!
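The two problems in that readout can be spotted mechanically. Here is a small sketch, assuming the CSV header matches the column names that were pasted into `keys`; the `check_entry` helper is made up for illustration and is not a Meltano feature.

```python
import csv
import io

# Assumption: the file's header row holds these column names.
HEADER = next(csv.reader(io.StringIO("name,age,gender,clv,avg_logins,logins\n")))


def check_entry(entry, header):
    """Return a list of problems found in one `files` entry."""
    problems = []
    if not entry["file"].endswith(".csv"):
        problems.append("file has no .csv suffix")
    for key in entry["keys"]:
        if key not in header:
            problems.append(f"key {key!r} is not a column name")
    return problems


# The entry as it appeared in the readout: all columns in one string.
AS_LISTED = {
    "entity": "GitFlixUsers",
    "file": "/Users/eliet/Desktop/GitFlixUsers",
    "keys": ["name,age,gender,clv,avg_logins,logins"],
}

# The entry after the suggested fixes.
FIXED = {
    "entity": "GitFlixUsers",
    "file": "/Users/eliet/Desktop/GitFlixUsers.csv",
    "keys": ["name"],
}

print(check_entry(AS_LISTED, HEADER))
print(check_entry(FIXED, HEADER))
```

Note that `['name,age,...']` is a list with one long string in it, which is different from a list of six column names.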
t
ok. i added the .csv (thanks for catching that). I'm using the Meltano example doc to test and i see the first column listed as Name. would i only need one key, as long as it uniquely identifies the row? I'm guessing that in a real example i would set user ID as the unique identifier, but for this test run... is name as the only key acceptable?
d
Yep, exactly, normally you'd use the ID, but for the example `name` is sufficient if there's no ID
t
ok enjoy your lunch. this is the recent file. I've removed all other keys except names. and my read out when i run the list all command is the same
d
OK that looks correct!
Let's see if it works in an ELT pipeline once you've added a loader as well
t
i've selected a loader and have listed the settings
i see a command to configure this as well: meltano config <plugin> set <setting> <value>
i hope this is simpler than the yaml 😅 because listing the csv config didn't seem to direct me to editing the yaml file and I didn't use the task definition... so it wasn't directly helpful to me... I'm not sure if this readout is significant to my progress either
i'm GUESSING that i use the command to set the values for destination path, delimiter and quotechar
is that right?
i'm researching and i THINK this doc is relevant: https://meltano.com/plugins/loaders/csv.html#getting-started to my next step... am i on the right path?
i see directions: configure the settings below... but then it moves to the next step with nothing listed below in that section.
I think I add this script to the meltano.yml file
i think i got it
the top is me forgetting to make my destination folder a directory but i think i got it... yes?
i'm guessing transformation skipped because it's a csv file in the correct format, yes?
d
the top is me forgetting to make my destination folder a directory but i think i got it..yes?
Correct, this looks as expected!
im guessing transformation skipped because its a csv file in the correct format, yes?
Correct
im researching and i THINK this doc is relevant: https://meltano.com/plugins/loaders/csv.html#getting-started to my next step..am i on the right path?
Yes, that's right! You can find that link on the bottom of the `meltano config target-csv list` output as well: https://meltano.slack.com/archives/C013EKWA2Q1/p1605296726134600?thread_ts=1605289339.120700&cid=C013EKWA2Q1
i see directions : configure the settings below..but then it moves to the next step with nothing listed below in that section.
The idea is that you jump from that "Configure the settings below using `meltano config`." line to the Settings section using the link, where the doc then describes which settings are required and how they should be configured, with an example minimal configuration. But maybe that could be more clear?