23.4 C
Saturday, May 18, 2024

A stock analyzer powered by AI that uses LLM and Langchain

Ai, Lllms, Gpt, and Langchain are all contemporary buzzwords that you have probably heard. These modern technologies are all extremely helpful and revolutionary. There are countless uses for them. I attempted to create an intriguing use of language models and blockchain in the finance domain in this project.

An AI bot that can assist you in stock investment by examining all stock-related data, both current and historical, with the use of LLM


The stock analysis procedure takes a lot of time if you’re a retail investor and don’t have a background in finance or the ability to understand all the complex financial concepts. Every time I want to avoid doing all of this stuff manually, I wind up watching a YouTube video or reading a random blog on the internet. This is when I had the idea to create a Langchian and LLM-based bot that could analyze investments using both historical and real-time data.

The basic concept is to retrieve both historical and real-time data, which comprises the following:

  1. Data about historical stock prices.
  1. A financial statement of the company
  2. Most recent business-related news

And the LLM should use all of this information to conduct a fundamental analysis on a particular stock.

I mostly tried with 2 strategies for this project. Find out which one performed poorly but might be improved with a little tweaking, and which one is operating effectively.

Let’s begin exploring the code. Since I won’t be posting teeny-tiny code details to the blog, you may visit my github to get the entire code.

from bs4 import BeautifulSoup
import requests
import yfinance as yf

# Fetch stock data from Yahoo Finance
def get_stock_price(ticker,history=5):
    # time.sleep(4) #To avoid rate limit error
    if "." in ticker:
    stock = yf.Ticker(ticker)
    df = stock.history(period="1y")
    df.index=[str(x).split()[0] for x in list(df.index)]
    # print(df.columns)
    return df.to_string()

# Script to scrap top5 google news for given company name
def google_query(search_term):
    if "news" not in search_term:
        search_term=search_term+" stock news"
    return url

def get_recent_stock_news(company_name):
    # time.sleep(4) #To avoid rate limit error
    headers={'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/ Safari/537.36'}

    for n in soup.find_all("div","n0jPhd ynAwRc tNxQIb nDgy9d"):
    for n in soup.find_all("div","IJl0Z"):

    if len(news)>6:
    for i,n in enumerate(news):
        news_string+=f"{i}. {n}\n"
    top5_news="Recent News:\n\n"+news_string
    return top5_news

# Fetch financial statements from Yahoo Finance
def get_financial_statements(ticker):
    # time.sleep(4) #To avoid rate limit error
    if "." in ticker:
    company = yf.Ticker(ticker)
    balance_sheet = company.balance_sheet
    if balance_sheet.shape[1]>=3:
        balance_sheet=balance_sheet.iloc[:,:3]    # Remove 4th years data
    balance_sheet = balance_sheet.to_string()
    return balance_sheet

The necessary data is fetched by the functions get_stock_price, get_financial_statements, and get_recent_stock_news using the Yahoo Finance api and bs4 scraping. This function can be customized to meet your needs, for instance, you can scrape current stock data from several sources and retrieve data that is one month or one year old.


In Langhian, agents are essentially something that is in charge of making decisions. I employed a zeroshot ReaAct agent, short for response and action, which essentially thinks continuously and acts in response to the notion. The issue with this strategy is that it becomes caught in an unending cycle of thought and action since the goal of stock analysis seems too complex for it to confidently pick the next course of action, leading to an endless cycle or poor outcomes that are not very linked to the initial query.

Let’s examine the code:

from langchain.tools import DuckDuckGoSearchRun

# Making tool list

        name="get stock data",
        description="Use when you are asked to evaluate or analyze a stock. This will output historic share price data. You should input the the stock ticker to it "
        name="DuckDuckGo Search",
        description="Use only when you need to get NSE/BSE stock ticker from internet, you can also get recent stock related news. Dont use it for any other analysis or task"
        name="get recent news",
        description="Use this to fetch recent news about stocks"

        name="get financial statements",
        description="Use this to get financial statement of the company. With the help of this data companys historic performance can be evaluaated. You should input stock ticker to it"

from langchain.agents import initialize_agent 

# new_prompt="<Plz refere github repo>"
# zero_shot_agent.agent.llm_chain.prompt.template=new_prompt


zero_shot_agent("Is Bajaj Finance a good investment choice right now?")

Keep in mind that this code is an extension of what we previously covered. Here, all we’re doing is creating a list and transforming the data scraping routines into langchain tools so that the agent can access them. An agent is defined in the later section using the initialize_agent class. It accepts three arguments: llm, tool list, and agent type. This strategy appears to produce a passable result. This strategy might or might not be successful, but by changing the prompt, we can make even more progress.


The ReAct agent struggled to make the right decisions because stock analysis is a difficult undertaking. So I attempted specifying the phases before the analysis itself in this method. All of the data is first downloaded, and it is then fed into a llm for thorough analysis.

#Openai function calling

import json
        "name": "get_company_Stock_ticker",
        "description": "This will get the indian NSE/BSE stock ticker of the company",
        "parameters": {
            "type": "object",
            "properties": {
                "ticker_symbol": {
                    "type": "string",
                    "description": "This is the stock symbol of the company.",

                "company_name": {
                    "type": "string",
                    "description": "This is the name of the company given in query",
            "required": ["company_name","ticker_symbol"],

def get_stock_ticker(query):
    response = openai.ChatCompletion.create(
                "content":f"Given the user request, what is the comapany name and the company stock ticker ?: {query}?"
            function_call={"name": "get_company_Stock_ticker"},
    message = response["choices"][0]["message"]
    arguments = json.loads(message["function_call"]["arguments"])
    company_name = arguments["company_name"]
    company_ticker = arguments["ticker_symbol"]
    return company_name,company_ticker

def Anazlyze_stock(query):
    #agent.run(query) Outputs Company name, Ticker

    # available_information=f"Stock Price: {stock_data}\n\nStock Financials: {stock_financials}\n\nStock News: {stock_news}"
    available_information=f"Stock Financials: {stock_financials}\n\nStock News: {stock_news}"

    analysis=llm(f"Give detail stock analysis, Use the available data and provide investment recommendation. \
             The user is fully aware about the investment risk, dont include any kind of warning like 'It is recommended to conduct further research and analysis or consult with a financial advisor before making an investment decision' in the answer \
             User question: {query} \
             You have the following information available about {Company_name}. Write (5-8) pointwise investment analysis to answer user query, At the end conclude with proper explaination.Try to Give positives and negatives  : \
              {available_information} "

    return analysis

Open AI just added a function call that is incredibly useful for getting the structured output we want from LLM in json format. The same is applied in this method. Function calls are used to retrieve the first stock ticker because the majority of the ensuing code relied on this one argument. React agent in approach 1 was only failing in this phase, delaying all subsequent steps. Once the stock ticker has been successfully extracted, stock information, news, and financial statements may then be retrieved by simply entering the ticker symbol. When all the stock-related data is accessible, the LLM uses it to perform a thorough stock analysis.

Eg- sample input and output of the bot-

Anazlyze_stock("Is it a good time to invest in Yes Bank?")


'Query': 'Is it a good time to invest in Yes Bank?', 'Company_name': 'Yes Bank', 'Ticker': 'YESBANK'
Investment Thesis for Yes Bank:
1. Financial Performance: Yes Bank has shown improvement in its financials over the past three years. The net debt has increased, indicating higher borrowing, but the tangible book value and common stock equity have also increased, suggesting a stronger financial position.
2. Total Capitalization: The total capitalization of Yes Bank has been consistently increasing, indicating a growing investor base and potential for future growth. This can be seen as a positive sign for investors considering investing in the bank.
3. Total Assets: Yes Bank's total assets have also been increasing steadily, indicating the bank's ability to attract and manage a larger pool of assets. This growth in assets can contribute to the bank's profitability and potential for future expansion.
4. Stock News: Recent news about Yes Bank suggests that the stock has seen a marginal increase in price and has been holding steady. This stability in the stock price can be seen as a positive sign for investors, indicating a potential for future growth.
5. Weak Underlying Business: However, it is important to note that there are concerns about the bank's weak underlying business, as indicated by the soft quarter expected in Q1. This may lead to a decline in profitability, which could impact the stock price in the short term.
6. Overall Market Conditions: It is also important to consider the overall market conditions and the banking sector as a whole before making an investment decision. Factors such as economic conditions, regulatory changes, and competition can significantly impact the performance of Yes Bank and its stock price.
Based on the available data and information, it can be concluded that investing in Yes Bank at this time carries

Additional improvements that can be made include: a) Adding more tools. For instance, a math tool for performing intricate technical analysis
Support for other opensource software b) More reliable prompting for consistent output c) LLMS
Note: This is just a fun hobby project; I am not a professional in finance. Please feel free to contribute any recommendations or modification.

Thank You

Related Articles


Please enter your comment!
Please enter your name here

Latest Articles