why dates are hardcoded in tap-carbon-intensity? :...
# troubleshooting
Why are dates hardcoded in tap-carbon-intensity? 😞 The API works fine, but the tap extracts only a 30-minute sample from 2018, which is frustrating: https://gitlab.com/meltano/tap-carbon-intensity/-/blob/master/tap_carbon_intensity/__init__.py
The taps for JSONL and Parquet files are not importing because of Python compatibility issues.
The CSV tap fails during extraction with a catalog discovery failure.
It seems strange to me that there are so many issues with common taps like CSV, Parquet, and JSON, and with utilities like Superset or Metabase.
My experience with Meltano has been frustrating so far... Is anyone else struggling too, or is it just me, unlucky enough to be facing all these problems?
Why are dates hardcoded in tap-carbon-intensity? 😞 The API works fine, but the tap extracts only a 30-minute sample from 2018.
Hi, @aleksei_razvodov. As you've probably found by now, this tap is basically retired and Meltano went through an effort a few years back to remove references to it. If there are any tutorials still using it, please let us know and we can update those to use a different source. We do hide this from search results on hub.meltano.com but Google will still surface the page.
The taps for JSONL and Parquet files are not importing because of Python compatibility issues.
As a rule, all taps and targets should get their own isolated virtual environments when added with `meltano add`. Could you say more about what you are seeing here?
It seems strange to me that there are so many issues with common taps like CSV, Parquet, and JSON.
These are actually much less used in production use cases. The most common tap for CSV is tap-spreadsheets-anywhere, and I see it recently added support for JSONL as well. Parquet is a bit niche as a source. The only one I see on the Hub is one that I created a while back, but I'm not actively using it. We have more users using target-parquet and target-s3-parquet, but none or not many using `tap-parquet` as of now.
My experience with Meltano has been frustrating so far...
Is anyone else struggling too, or is it just me, unlucky enough, facing all these problems?
Sorry to hear it's been frustrating so far. I think on our side we can do a better job of warning when taps and targets on the Hub are more experimental in nature or which are not actively being maintained. You can get some idea from the "Meltano Stats" section in the side panel of the Hub, but we can do more, I think, to make these distinctions obvious.
@aleksei_razvodov - I really hope we can help you get set up for success, and I don't want you to get stuck on POC taps/targets which are perhaps less production worthy. Do you mind describing a bit more about the first pipelines you'd like to have set up? Are the StackExchange, Parquet, and JSONL data sources representative of the data you want to replicate and/or analyze, or were those intended more as a POC for you?
Thanks for the answer, @aaronsteers. I'll add more details about the issues I had with taps and utilities a little later.
I would be glad if you could help me find an open API source with live data for a whole website/resource, not for a specific project like ads analytics, project management and so on. I want to see a lot of data without the overhead of generating it on another resource. The theme doesn't matter.
At first I wanted to set up a live dashboard based on any API (aka streaming) data source.
I went through dozens of extractors on the Meltano Hub, but a lot of them asked for authorization and fetched information about a specific project. For example, I thought about getting statistics for promising open-source GitHub repos, but I couldn't figure out how to gather this information across all repos based on search queries.
I almost succeeded with tap-stackexchange, but it sometimes gives me an error reading 2.3/questions. Sometimes it works without any changes, and I don't understand why.
Then I decided to switch to basic data in CSV, JSON, or even Parquet, but faced a lot of issues there as well.
I went through the tutorial smoothly, but any step off that path generates more technical problems than I expected.
I think a warning like 'under maintenance' or 'not actively used, so it could be a little broken' would help a lot, because I saw usage counts in the hundreds and thousands but still faced a lot of issues I shouldn't face in a mature solution.
new project 'test'
$ meltano add extractor tap-singer-jsonl
Added extractor 'tap-singer-jsonl' to your Meltano project
Variant:        kgpayne (default)
Repository:     <https://github.com/kgpayne/tap-singer-jsonl>
Documentation:  <https://hub.meltano.com/extractors/tap-singer-jsonl--kgpayne>

Installing extractor 'tap-singer-jsonl'...
Extractor 'tap-singer-jsonl' could not be installed: failed to install plugin 'tap-singer-jsonl'.
ERROR: Ignored the following versions that require a different python version: 0.0.2 Requires-Python >=3.10,<3.12; 0.0.3 Requires-Python >=3.10,<3.12; 0.0.4 Requires-Python >=3.10,<3.12; 0.1.0 Requires-Python >=3.10,<3.12
ERROR: Could not find a version that satisfies the requirement tap-singer-jsonl (from versions: none)
ERROR: No matching distribution found for tap-singer-jsonl

Need help fixing this problem? Visit <http://melta.no/> for troubleshooting steps, or to
join our friendly Slack community.

Failed to install plugin(s)
$ whereis python
python: /usr/bin/python3.9-config /usr/bin/python3.9 /usr/lib/python3.9 /etc/python3.9 /usr/local/bin/python3.10-config /usr/local/bin/python3.10 /usr/local/lib/python3.9 /usr/local/lib/python3.10 /usr/include/python3.9
Maybe there's a problem with my python3 installation. I've tried to uninstall it but probably failed.
$ python -V
Python 3.10.0
$ python3 -V
Python 3.9.2
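For reference, the install error above comes down to a version-range check: every published release of tap-singer-jsonl declares `Requires-Python >=3.10,<3.12`. Below is a minimal sketch of that check, assuming (this is not confirmed in the thread) that the plugin environment was built with the Python 3.9.2 interpreter rather than 3.10.0. `parse_version` and `satisfies` are hypothetical helper names, and real installers apply the fuller PEP 440 rules.

```python
import re


def parse_version(text):
    """Turn '3.9.2' into (3, 9, 2), padding short versions so '3.10' == '3.10.0'."""
    parts = [int(part) for part in text.strip().split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def satisfies(version, specifier):
    """Check a version against a comma-separated range such as '>=3.10,<3.12'.

    Simplified on purpose: only >=, <=, >, <, == and != are handled.
    """
    ops = {
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
        "==": lambda a, b: a == b,
        "!=": lambda a, b: a != b,
    }
    current = parse_version(version)
    for clause in specifier.split(","):
        match = re.fullmatch(r"\s*(>=|<=|==|!=|>|<)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported specifier clause: {clause!r}")
        op, bound = match.groups()
        # Every clause must hold for the version to be accepted.
        if not ops[op](current, parse_version(bound)):
            return False
    return True


# The range declared by tap-singer-jsonl in the log above.
print(satisfies("3.9.2", ">=3.10,<3.12"))   # False
print(satisfies("3.10.0", ">=3.10,<3.12"))  # True
```

If the assumption holds, the plugin would be rejected under `python3` (3.9.2) but accepted under `python` (3.10.0), which matches the "Ignored the following versions" message.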
meltano add extractor tap-parquet
```
Added extractor 'tap-parquet' to your Meltano project
Variant:        dataops-tk (default)
Repository:     https://github.com/dataops-tk/tap-parquet
Documentation:  https://hub.meltano.com/extractors/tap-parquet--dataops-tk

Installing extractor 'tap-parquet'...
Extractor 'tap-parquet' could not be installed: failed to install plugin 'tap-parquet'.
Running command git clone --filter=blob:none --quiet https://github.com/dataops-tk/tap-parquet.git /tmp/pip-req-build-5r4jtbhg
ERROR: Ignored the following versions that require a different python version: 0.0.2.dev1110036431 Requires-Python >=3.6,<3.9; 0.0.2.dev1110045918 Requires-Python >=3.6,<3.9; 0.0.2.dev1110089869 Requires-Python >=3.6,<3.9; 0.0.2.dev1110124048 Requires-Python >=3.6,<3.9; 0.0.2.dev1110272955 Requires-Python >=3.6,<3.9; 0.0.2.dev1110380533 Requires-Python >=3.6,<3.9; 0.0.2.dev1110403648 Requires-Python >=3.6,<3.9; 0.0.2.dev1110492086 Requires-Python >=3.6,<3.9; 0.0.2.dev1110531009 Requires-Python >=3.6,<3.9; 0.0.2.dev1113404346 Requires-Python >=3.6,<3.9; 0.0.2.dev1118257716 Requires-Python >=3.6,<3.9; 0.0.2.dev1118390906 Requires-Python >=3.6,<3.9; 0.0.2.dev1118394141 Requires-Python >=3.6,<3.9; 0.0.2.dev1118449687 Requires-Python >=3.6,<3.9; 0.0.2.dev1118993814 Requires-Python >=3.6,<3.9; 0.0.2.dev1119285758 Requires-Python >=3.6,<3.9; 0.0.2.dev1119371075 Requires-Python >=3.6,<3.9; 0.0.2.dev1119430340 Requires-Python >=3.6,<3.9; 0.0.2.dev1119444960 Requires-Python >=3.6,<3.9; 0.0.2.dev1119472154 Requires-Python >=3.6,<3.9; 0.0.2.dev1121398448 Requires-Python >=3.6,<3.9; 0.0.2.dev1121409284 Requires-Python >=3.6,<3.9; 0.0.2.dev1122239779 Requires-Python >=3.6,<3.9; 0.0.2.dev1122378994 Requires-Python >=3.6,<3.9; 0.0.2.dev1122383494 Requires-Python >=3.6,<3.9; 0.0.2.dev1125555069 Requires-Python >=3.6,<3.9; 0.0.2.dev1125557515 Requires-Python >=3.6,<3.9; 0.0.2.dev1125956132 Requires-Python >=3.6,<3.9; 0.0.2.dev1125958239 Requires-Python >=3.6,<3.9; 0.0.2.dev1128525228 Requires-Python >=3.6,<3.9; 0.0.2.dev1129435727 Requires-Python >=3.6,<3.9; 0.0.2.dev1132353840 Requires-Python >=3.6,<3.9; 0.0.2.dev1132643726 Requires-Python >=3.6,<3.9; 0.0.2.dev1132716689 Requires-Python >=3.6,<3.9; 0.0.2.dev1132766666 Requires-Python >=3.6,<3.9; 0.0.2.dev1132772314 Requires-Python >=3.6,<3.9; 0.0.2.dev1132882587 Requires-Py…
```
Also, it would be great if the Meltano stats could be reached from a Stats page with ordering and filtering. It's hard to open each existing tap and check its popularity.
If every project should use its own virtual environment and that is not a built-in feature of `meltano init %project_name%`, it's not clear, at least to me. The tutorial and all the tap and target pages only say 'install meltano, init project and cd into it'.
Really great feedback, @aleksei_razvodov. Thanks. I'll put some comments/suggestions below.
Regarding the Python conflicts and virtual environments... I originally misunderstood your prior comment to mean that you were seeing version conflicts between dependencies. (Meltano keeps them isolated by default, so you don't have to do anything special for that.) Understanding now from the above that the conflict comes from the supported Python version, you can use this workaround to install plugins that (sometimes wrongly) declare that they do not support newer Python versions.
meltano add --no-install extractor tap-parquet
meltano install --force extractor tap-parquet
This sequence would first add the plugin to your project without installing it. Then it would install the plugin, ignoring Python version constraints. From the docs:
If the plugin you are trying to install declares that it does not support the version of Python you are using, but you want to attempt to use it anyway, you can override the Python version restriction by providing the `--force` flag to `meltano install`.
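The two-step workaround can also be scripted. This is only a sketch: the command arguments are the ones given in this message, while `force_install_commands` and `force_install` are hypothetical helper names, not part of Meltano. It defaults to a dry run that prints the commands, and a real run would need to happen inside a Meltano project directory.

```python
import subprocess


def force_install_commands(plugin_type, plugin_name):
    """Build the two commands from the workaround: add without installing, then force-install."""
    return [
        ["meltano", "add", "--no-install", plugin_type, plugin_name],
        ["meltano", "install", "--force", plugin_type, plugin_name],
    ]


def force_install(plugin_type, plugin_name, dry_run=True):
    """Print the commands (dry run) or execute them in order, stopping at the first failure."""
    commands = force_install_commands(plugin_type, plugin_name)
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True raises CalledProcessError if a command exits non-zero.
            subprocess.run(argv, check=True)
    return commands


if __name__ == "__main__":
    force_install("extractor", "tap-parquet")
```

Since `--force` bypasses the plugin's declared Python range, the plugin may still fail at run time even when the install succeeds.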
I would be glad if you could help me find an open API source with live data for a whole website/resource, not for a specific project like ads analytics, project management and so on. I want to see a lot of data without the overhead of generating it on another resource. The theme doesn't matter.
At first I wanted to set up a live dashboard based on any API (aka streaming) data source.
I went through dozens of extractors on the Meltano Hub, but a lot of them asked for authorization and fetched information about a specific project. For example, I thought about getting statistics for promising open-source GitHub repos, but I couldn't figure out how to gather this information across all repos based on search queries.
I almost succeeded with tap-stackexchange, but it sometimes gives me an error reading 2.3/questions. Sometimes it works without any changes, and I don't understand why.
Thanks for sharing this context. It is indeed a formidable challenge to find good taps for open datasets. Most require at least an API key for rate limiting reasons and to prevent abuse. There are a few trivial examples (Rick and Morty API, The Lord of the Rings API, PokeAPI) but it depends how robust/large of a dataset you are looking for. I can spin off a new topic for this - I know a few people (myself included) have sometimes needed these datasets for tutorials, trainings, and demos. We might get some good ideas from community members.
Thanks for your help, @aaronsteers. What do you think about running --no-install and install --force automatically on retry, with a big warning that it could work poorly? I've read a lot of documentation and didn't find that workaround. It will be easier now that the topic has been discussed on Slack, but it could be useful to add this step anyway.
Definitely something I think would be helpful. Do you mind logging an issue in our tracker?
Could you send an example, please? I've never done this before.
Oh sure! For Meltano bugs, feature requests, and everything else, we have https://github.com/meltano/meltano/issues/new/choose Does this help?
(Basically just the GitHub issue tracker.)