Project objective & overview
The objective of the ENT-GPT project was to develop a virtual ENT doctor capable
of answering patient queries about ENT (Ear, Nose, and Throat) diseases. This
project was undertaken for a renowned ENT surgeon based in Noida, India. The
virtual doctor was designed to give accurate and informative responses to
patient queries by combining a large language model with retrieval over a
curated body of ENT reference material.
Challenges
The primary challenge in this project was aggregating and integrating
extensive medical data from diverse sources. We used web scraping to collect
data from authoritative sources such as Wikipedia, Mayo Clinic, enthealth.org, and
renowned ENT books. Cleaning and preprocessing this large volume of data to
ensure its quality and relevance was a significant hurdle. Another challenge was
implementing the Retrieval-Augmented Generation (RAG) technique effectively,
which required careful handling of the vector database and tight integration
with the large language model (LLM) to ensure accurate responses.
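To make the preprocessing hurdle concrete, here is a minimal sketch of two typical
cleaning steps: stripping markup from a scraped page and splitting the text into
overlapping chunks sized for a vector database. It is an illustration under
assumptions, not the project's actual pipeline. The function names html_to_text
and chunk_words, the chunk size, and the overlap are all hypothetical, and
Python's standard html.parser stands in for Beautiful Soup so the example runs
without third-party packages.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Hypothetical helper: return the page's visible text on one line."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left behind by the removed markup.
    return re.sub(r"\s+", " ", " ".join(parser.parts)).strip()


def chunk_words(text: str, size: int = 120, overlap: int = 20) -> list[str]:
    """Hypothetical helper: split text into word windows that overlap.

    The default size and overlap are assumed values, not the project's.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks
```

Overlapping windows are a common choice because a sentence that straddles a chunk
boundary still appears whole in at least one chunk, which helps retrieval later.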
Methodology and Implementation
We began by gathering comprehensive data on ENT diseases through data
scraping from authentic and reliable sources. This data was meticulously
processed and stored in a vector database. Our approach involved the following
steps:
1. Data Collection: Aggregated data from Wikipedia, Mayo Clinic, enthealth.org,
and esteemed ENT books.
2. Data Processing: Ensured the collected data was cleaned, structured, and
stored efficiently in a vector database.
3. RAG Technique Implementation: Employed the Retrieval-Augmented
Generation technique where user queries were first matched with similar
paragraphs in the vector database. The retrieved information was then used to
generate detailed responses using a large language model (LLM).
4. Model Development: Developed and fine-tuned the model to ensure it
provided precise and informative answers to user queries.
5. User Interface: Built an interactive and user-friendly UI where patients could
easily ask their questions related to ENT health.
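The retrieve-then-generate flow in the RAG step can be sketched in a few lines.
This is a toy illustration under stated assumptions, not the project's code: word
counts with cosine similarity stand in for learned embeddings and the Pinecone
index, build_prompt stands in for the LangChain and ChatGPT call, and the function
names and sample passages are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase word counts (assumption: a real system
    would use a learned embedding model instead)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    joined = "\n".join(f"- {p}" for p in context)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{joined}\n"
        f"Question: {query}\n"
    )


# Illustrative placeholder passages, not the project's data.
PASSAGES = [
    "Otitis media is an infection of the middle ear.",
    "Sinusitis is inflammation of the sinuses.",
    "Tonsillitis is inflammation of the tonsils.",
]
```

Calling retrieve("What is an infection of the middle ear?", PASSAGES, k=1) selects
the otitis media passage, and build_prompt wraps it together with the question. In
a deployed system the prompt would then be passed to the language model, which is
left out here.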
Technologies used
- Pinecone
- ChatGPT
- LangChain
- Streamlit
- RAG
- Beautiful Soup