Starting off with a free API for now. It doesn’t need worldwide coverage, as long as it covers the US.
Choices:
| API | API Call Limit | Historical Data (yrs) | Delay |
|---|---|---|---|
| Polygon | 5 / min | 2 | Real time |
| Finnhub | 60 / min | 1 | Real time |
| Alpha Vantage | 25 / day | 20 | 15 mins |
| Alpaca | 200 / min | ~9 (from 2016) | 15 mins |
The API choice doesn’t matter that much since they’re all pretty interchangeable. Let’s go with Alpaca for now: its only downside is the 15-minute delay, which doesn’t affect much right now as I’m only paper trading.
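Since the providers mostly differ in how many calls they allow, here is a minimal sketch of a per-minute throttle keyed by the limits in the table above. The names `CALLS_PER_MINUTE` and `MinuteThrottle` are my own (hypothetical, not from any provider SDK), and the sliding 60-second window is an assumption about how the limits are enforced.

```python
import time
from collections import deque

# Per-minute call limits copied from the table above.
# Alpha Vantage is left out because its limit is per day, not per minute.
CALLS_PER_MINUTE = {"Polygon": 5, "Finnhub": 60, "Alpaca": 200}


class MinuteThrottle:
    """Sliding-window limiter: at most `limit` calls in any 60-second window."""

    def __init__(self, limit, clock=time.monotonic):
        self.limit = limit
        self.clock = clock      # injectable so the limiter can be tested without sleeping
        self.stamps = deque()   # timestamps of calls still inside the window

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the 60-second window.
        while self.stamps and now - self.stamps[0] >= 60.0:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            return 0.0
        return 60.0 - (now - self.stamps[0])

    def record(self):
        """Note that a call was just made."""
        self.stamps.append(self.clock())
```

Before each request the pipeline would call `time.sleep(throttle.wait_time())` and then `throttle.record()`. Swapping providers then only means picking a different entry from `CALLS_PER_MINUTE`.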
Data retrieval/handling pipeline is going to be coded in Python
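A rough sketch of the first step of that Python pipeline: building a historical-bars request for Alpaca without sending it. The base URL, the `/stocks/{symbol}/bars` path, the query parameter names, and the two header names are my understanding of Alpaca's v2 market-data REST API and should be checked against the current docs. The helper names `build_bars_url` and `auth_headers` are hypothetical.

```python
from urllib.parse import quote, urlencode

# Assumed base URL of Alpaca's market-data REST API (v2); verify against the docs.
ALPACA_DATA_URL = "https://data.alpaca.markets/v2"


def build_bars_url(symbol, start, end, timeframe="1Day", limit=1000):
    """Build the URL for a historical-bars request for one symbol.

    `start` and `end` are date strings such as "2024-01-02".
    """
    params = {"timeframe": timeframe, "start": start, "end": end, "limit": limit}
    return f"{ALPACA_DATA_URL}/stocks/{quote(symbol)}/bars?{urlencode(params)}"


def auth_headers(key_id, secret_key):
    """Headers that (as far as I know) Alpaca expects for API-key authentication."""
    return {"APCA-API-KEY-ID": key_id, "APCA-API-SECRET-KEY": secret_key}
```

The resulting URL and headers can be passed to `urllib.request.Request(url, headers=...)` or to any HTTP client. Keeping the URL building separate from the network call makes it easy to test offline.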
Core execution engine: I could use C++, Go, or Python. I’m going to go with C++, as I would like to build experience with it and, way more importantly, its latency is super low.
A couple of things I’ve just discovered:
<aside>
“IEX (Investors Exchange) is just one of many U.S. stock exchanges. You are only seeing the trades and quotes that happen on IEX. While this data is real-time, it represents only a small fraction of the total U.S. market volume (around 2-3%). For many stocks, this is perfectly fine for testing. However, you will not see the full picture of all buy and sell orders across the entire market.”
</aside>
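To get a feel for what "around 2-3%" means, here is a back-of-the-envelope sketch that scales an IEX-only volume figure up to a rough market-wide estimate. It assumes IEX's share is the same for every stock and every bar, which is almost certainly not true, so treat the result as an order-of-magnitude guide only. `estimate_market_volume` is a hypothetical helper name.

```python
def estimate_market_volume(iex_volume, iex_share):
    """Scale an IEX-only volume up to a rough market-wide figure.

    `iex_share` is IEX's assumed fraction of total volume, e.g. 0.02 for 2%.
    """
    if not 0 < iex_share <= 1:
        raise ValueError("iex_share must be in (0, 1]")
    return iex_volume / iex_share


# A bar showing 1,000 shares traded on IEX, using the 2-3% range from the quote:
low = estimate_market_volume(1_000, 0.03)   # roughly 33,000 shares market-wide
high = estimate_market_volume(1_000, 0.02)  # roughly 50,000 shares market-wide
```

In other words, each share seen in the IEX feed stands in for somewhere around 33 to 50 shares traded across the whole market, under that uniform-share assumption.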
Just looked a bit more into the downsides of feeding my algo trader only IEX data, and a big one is: “Do not assume a profitable backtest on IEX data will translate to a profitable live algorithm.”
It might cause the following issues (according to Perplexity (Gemini)):