matt_elgazar
05/25/2024, 4:54 AM
actions
- set tickers: [ "AYO.F" ].
I believe my code handles empty data by yielding an empty record with the required properties.
When I run meltano invoke tap-yfinance the tap works properly and you can see actions ran fine (no errors, and the output contains the required properties). However, when running meltano el tap-yfinance target-jsonl --select actions it fails with simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
By entering debug mode (I’m using PyCharm) I was able to pinpoint the below:
breaks:
def get_records(self, context: dict | None) -> Iterable[dict]:
    logging.info(f"\n\n\n*** Running ticker {context['ticker']} *** \n\n\n")
    state = self.get_context_state(context)
    financial_tap = FinancialTap(schema=self.schema, ticker=context["ticker"], config=self.config, name=self.name)
    df = getattr(financial_tap, self.method_name)(ticker=context["ticker"])  # returns empty df
    yield {"timestamp": "2021-01-01 00:00:00"}
works fine:
def get_records(self, context: dict | None) -> Iterable[dict]:
    logging.info(f"\n\n\n*** Running ticker {context['ticker']} *** \n\n\n")
    state = self.get_context_state(context)
    financial_tap = FinancialTap(schema=self.schema, ticker=context["ticker"], config=self.config, name=self.name)
    # df = getattr(financial_tap, self.method_name)(ticker=context["ticker"])
    yield {"timestamp": "2021-01-01 00:00:00"}
In debug mode, df = getattr(<>) returns an empty df as expected, and stepping through the code it yields {"timestamp": "2021-01-01 00:00:00"} as expected. I’m really stuck now, because I’m not sure why a line whose result never gets used breaks things, and only when calling meltano el.
TL;DR: meltano el breaks, but meltano invoke works. The error is simplejson.scanner.JSONDecodeError: Expecting value: line 1 column 1 (char 0), but the data is valid JSON. meltano el breaks because of a line of code whose result doesn’t even get used.
Edgar Ramírez (Arch.dev)
05/27/2024, 5:39 PM
simplejson.scanner.JSONDecodeError makes me think it's an error coming from a call to simplejson.load, and that is not done anywhere in the SDK afaict. So, it's probably coming from target-jsonl or upstream from it in the singer-python library (maybe here?), that's why meltano invoke works but meltano el does not.
I'd inspect the output of meltano invoke line by line to see if there's something that's not valid JSON and is causing the target to crash.
Edgar Ramírez (Arch.dev)
05/28/2024, 3:42 PM
matt_elgazar
05/28/2024, 3:44 PM
yfinance library. My hypothesis is that they integrated some sort of logging change where if the ticker returns an empty dataset it calls logging.error or something somewhere, but I haven’t validated that. I’ll have to dig into it further, because when I downgrade to yfinance version 0.38 then meltano works fine, but 0.40 breaks meltano el. I’m not sure why that is, since the command doesn’t necessarily return an error per se.
Edgar Ramírez (Arch.dev)
05/28/2024, 3:50 PM
meltano invoke tap-yfinance > inspect.me.jsonl and searching the file for a line that's not valid JSON.
matt_elgazar
05/28/2024, 3:51 PM
print to stdout!
matt_elgazar
05/28/2024, 3:52 PM
print is called?
Edgar Ramírez (Arch.dev)
05/28/2024, 4:47 PM
matt_elgazar
05/28/2024, 5:53 PM
import yfinance as yf
yf.Ticker('XYZKALSFJKSDLF1231293098F').actions
XYZKALSFJKSDLF1231293098F: No timezone found, symbol may be delisted
Series([], dtype: object)
^ Of course that’s an invalid ticker. Now if we look at a ticker that used to be valid:
yf.Ticker('AYO.F').actions
AYO.F: No price data found, symbol may be delisted (1d 1925-06-21 -> 2024-05-28)
Series([], dtype: object)
^ this output may be the issue because it’s not valid JSON
matt_elgazar
05/28/2024, 5:54 PM
actions) you may be able to reproduce
tickers: ["AYO.F"]
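The check suggested earlier in the thread (dump the output of meltano invoke to a file and look for a line that is not valid JSON) could be sketched roughly as below. This is only an illustration: find_invalid_json_lines is a hypothetical helper name, the RECORD envelope in the sample is an assumed shape for a tap's output line, and the standard-library json module stands in for simplejson. The non-JSON sample line is the yfinance message quoted above.

```python
import json


def find_invalid_json_lines(lines):
    """Return (line_number, text) pairs for non-blank lines that do not parse as JSON."""
    bad = []
    for number, raw in enumerate(lines, start=1):
        text = raw.strip()
        if not text:
            continue  # blank lines are ignored here
        try:
            json.loads(text)
        except json.JSONDecodeError:
            bad.append((number, text))
    return bad


# Sample stream: one record line (assumed shape) followed by the stray message from the thread.
sample = [
    '{"type": "RECORD", "stream": "actions", "record": {"timestamp": "2021-01-01 00:00:00"}}',
    "AYO.F: No price data found, symbol may be delisted (1d 1925-06-21 -> 2024-05-28)",
]

for number, text in find_invalid_json_lines(sample):
    print(f"line {number} is not valid JSON: {text}")

# For a real dump, pass the open file instead of the sample list:
# with open("inspect.me.jsonl", encoding="utf-8") as fh:
#     print(find_invalid_json_lines(fh))
```

A line that starts with plain text such as AYO.F fails at the very first character, which would be consistent with the "line 1 column 1 (char 0)" position in the error reported at the top of the thread.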
Edgar Ramírez (Arch.dev)
05/28/2024, 7:29 PM
matt_elgazar
05/28/2024, 7:31 PM
price_history_wide so it doesn't use pdr
It might be here though! https://github.com/ranaroussi/yfinance/blob/930b305327e2e3769b5d62115b3ab25bc58f28de/yfinance/utils.py#L58-L62
matt_elgazar
05/28/2024, 7:32 PM
actions tap calls FinancialStream (a base stream) which calls ActionsStream
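If the cause does turn out to be a library call printing to stdout (the thread suspects this but does not confirm it), one possible workaround is to send anything printed during that call to stderr, so that only JSON lines reach the target. The sketch below is a guess at that approach, not the fix adopted in the thread: noisy_fetch is a hypothetical stand-in for the yfinance call and quiet_fetch is a hypothetical wrapper.

```python
import contextlib
import io
import json
import sys


def noisy_fetch(ticker):
    """Hypothetical stand-in for a library call that prints a diagnostic and returns no rows."""
    print(f"{ticker}: No price data found, symbol may be delisted")
    return []


def quiet_fetch(ticker):
    """Call noisy_fetch with anything it prints sent to stderr instead of stdout."""
    with contextlib.redirect_stdout(sys.stderr):
        return noisy_fetch(ticker)


# Capture stdout to see what a downstream target would receive.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    rows = quiet_fetch("AYO.F")
    print(json.dumps({"timestamp": "2021-01-01 00:00:00"}))

stdout_lines = captured.getvalue().splitlines()
print(stdout_lines)
```

Note that contextlib.redirect_stdout only swaps sys.stdout for the duration of the block. It would not catch output from a logging handler that already holds a reference to the original stdout object, or anything written directly to the underlying file descriptor.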