Exoplanet RAG

A Retrieval-Augmented Generation (RAG) system for querying exoplanet research papers from arXiv. The project demonstrates how RAG can be applied to scientific literature, enabling natural-language queries about exoplanet research.

View on GitHub

Features

  • Fetch and process arXiv papers about exoplanets automatically
  • Create vector embeddings for efficient similarity search
  • Answer questions using retrieved context and a local LLM
  • Web interface built with Streamlit
  • Local deployment with Ollama for privacy
  • Configurable models and parameters
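
The fetching step in the list above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual code: the function names, the search expression, and the sample feed are assumptions made for the example. The only fixed parts are the public arXiv API endpoint and the Atom format it returns.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Public arXiv API endpoint; it returns results as an Atom feed.
ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"


def build_arxiv_query_url(search_query: str, max_results: int = 10) -> str:
    """Return an arXiv API URL for the given search expression."""
    params = {"search_query": search_query, "start": 0, "max_results": max_results}
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_arxiv_feed(xml_text: str) -> list[dict]:
    """Extract id, title, and abstract from each <entry> of an Atom feed."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.iter(f"{ATOM}entry"):
        papers.append({
            "id": entry.findtext(f"{ATOM}id", default="").strip(),
            # Titles and abstracts are often wrapped across lines; collapse whitespace.
            "title": " ".join(entry.findtext(f"{ATOM}title", default="").split()),
            "summary": " ".join(entry.findtext(f"{ATOM}summary", default="").split()),
        })
    return papers


# Hypothetical sample feed so the sketch runs without network access.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <title>A placeholder
      exoplanet title</title>
    <summary>  Placeholder abstract text.  </summary>
  </entry>
</feed>"""

if __name__ == "__main__":
    # astro-ph.EP is arXiv's Earth and Planetary Astrophysics category.
    print(build_arxiv_query_url("cat:astro-ph.EP AND all:exoplanet", max_results=5))
    print(parse_arxiv_feed(SAMPLE_FEED))
```

In a real run, the URL would be fetched (for example with `urllib.request.urlopen`) and the response body passed to `parse_arxiv_feed`. The abstracts would then be split into chunks and embedded.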

Technology Stack

  • Framework: LangChain for RAG pipeline
  • LLM: Ollama (Gemma3:1b) for local inference
  • Embeddings: Sentence Transformers (all-MiniLM-L6-v2)
  • Vector Store: ChromaDB
  • Frontend: Streamlit web interface
  • Data Source: arXiv API for research papers
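
To show how these pieces fit together, here is a deliberately simplified sketch of the retrieve-then-prompt flow. It does not use the stack listed above. A toy word-count vector stands in for the Sentence Transformers embeddings, a plain Python list stands in for ChromaDB, and the final prompt is only built, not sent to Ollama. All function names and sample chunks are illustrative assumptions.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a sentence-embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (stand-in for a vector-store search)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the prompt that would be passed to the local LLM."""
    joined = "\n\n".join(context)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"
    )


# Hypothetical text chunks standing in for processed paper abstracts.
CHUNKS = [
    "The transit method detects exoplanets by the dip in starlight as a planet crosses its star.",
    "The habitable zone is the range of orbits where liquid water could exist on a planet's surface.",
    "Sub-Neptunes are planets with radii between those of Earth and Neptune.",
]

if __name__ == "__main__":
    question = "What is the habitable zone?"
    print(build_prompt(question, retrieve(question, CHUNKS, k=1)))
```

In the real pipeline, the embedding model captures meaning rather than exact word overlap, so paraphrased questions can still match relevant passages. The overall shape (embed, rank by similarity, prepend the top chunks to the question) is the same.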

Example Queries

  • “What are sub Neptunes?”
  • “How are exoplanets detected?”
  • “What is the habitable zone?”
  • “What is an atmospheric retrieval?”

Figure: RAG interface query results
Figure: Exoplanet RAG system interface