Exoplanet RAG
A Retrieval-Augmented Generation (RAG) system for querying exoplanet research papers from ArXiv. The project demonstrates how RAG can be applied to scientific literature by answering natural-language questions about exoplanet research.
Features
- Fetch and process ArXiv papers about exoplanets automatically
- Create vector embeddings for efficient similarity search
- Answer questions using retrieved context and a local LLM
- Web interface built with Streamlit
- Local deployment with Ollama for privacy
- Configurable models and parameters
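The retrieval step behind these features can be illustrated with a minimal, standard-library-only sketch. This is not the project's code: the `embed` function below is a toy bag-of-words stand-in for a real embedding model, the `retrieve` helper is hypothetical, and the three text chunks are illustrative placeholders rather than excerpts from any paper.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query (hypothetical helper)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


# Placeholder chunks for illustration only.
chunks = [
    "Sub-Neptunes have radii between those of Earth and Neptune.",
    "Exoplanets are often detected with the transit method.",
    "The habitable zone is where liquid water could exist on a planet's surface.",
]
top = retrieve("How are exoplanets detected?", chunks, k=1)
print(top[0])
```

A real deployment would replace `embed` with a learned embedding model and the linear scan with a vector store, but the ranking idea is the same: embed the query, score each chunk, and keep the best matches as context.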
Technology Stack
- Framework: LangChain for RAG pipeline
- LLM: Ollama (Gemma3:1b) for local inference
- Embeddings: Sentence Transformers (all-MiniLM-L6-v2)
- Vector Store: ChromaDB
- Frontend: Streamlit web interface
- Data Source: ArXiv API for research papers
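As a rough illustration of the data-source step, the sketch below builds a query URL for the public ArXiv API and parses the Atom feed it returns, using only the Python standard library. The function names `build_query_url` and `parse_feed` are hypothetical, the search term is an assumption, and the sample feed is a hand-written placeholder so that the example runs without network access. The project itself may fetch papers differently, for example through a LangChain loader.

```python
import urllib.parse
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace used by the feed


def build_query_url(search_terms: str, max_results: int = 10) -> str:
    """Build an ArXiv API URL that searches all fields (hypothetical helper)."""
    params = {
        "search_query": f"all:{search_terms}",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_feed(atom_xml: str) -> list[dict]:
    """Extract title and abstract from each entry of an Atom feed."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.iter(f"{ATOM}entry"):
        papers.append({
            # Collapse the line breaks that often appear inside feed text.
            "title": " ".join(entry.findtext(f"{ATOM}title", "").split()),
            "summary": " ".join(entry.findtext(f"{ATOM}summary", "").split()),
        })
    return papers


# Hand-written placeholder feed; a real one would come from fetching `url`.
sample_feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Placeholder
      title</title>
    <summary>Placeholder abstract.</summary>
  </entry>
</feed>"""

url = build_query_url("exoplanet", max_results=5)
papers = parse_feed(sample_feed)
print(url)
print(papers)
```

The parsed abstracts would then be split into chunks, embedded, and written to the vector store.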
Example Queries
- "What are sub-Neptunes?"
- "How are exoplanets detected?"
- "What is the habitable zone?"
- "What is an atmospheric retrieval?"
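To show how a query like these might reach the local model, here is a minimal sketch that assembles a prompt from retrieved chunks and prepares a request for Ollama's `/api/generate` endpoint. It is an assumption-laden illustration, not the project's code: the prompt wording, the helper names, the `gemma3:1b` model tag, and the default `localhost:11434` address are all assumed, and the project's LangChain pipeline may build its prompts differently.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local address


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine numbered context chunks and the question into one prompt."""
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


def build_ollama_payload(prompt: str, model: str = "gemma3:1b") -> bytes:
    """Encode a non-streaming generate request as JSON bytes."""
    body = {"model": model, "prompt": prompt, "stream": False}
    return json.dumps(body).encode("utf-8")


def ask_ollama(prompt: str) -> str:
    """Send the prompt to a running Ollama server (not called in this sketch)."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=build_ollama_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


# Placeholder context; in the real pipeline these would be retrieved chunks.
prompt = build_prompt("What is the habitable zone?", ["chunk a", "chunk b"])
payload = json.loads(build_ollama_payload(prompt))
print(prompt)
```

Calling `ask_ollama(prompt)` requires an Ollama server running locally with the chosen model already pulled.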

