The RAG Challenge: A GitHub Copilot Experiment
Tim Kitchens recently released an enlightening video comparing three coding assistants - Aider, Cursor, and Windsurf - in building a RAG (Retrieval-Augmented Generation) application without writing any code by hand: just one detailed prompt followed by simple follow-up prompts. Tim shares the prompt in his video description, and it uses a text version of LangChain’s RAG tutorial. However, one notable player was missing from this comparison: GitHub Copilot. As someone who’s been exploring various LLM tools, I couldn’t help but wonder how Copilot, especially with its recent improvements and the availability of Claude 3.5 Sonnet, would fare in this challenge. So, I decided to embark on this journey myself, armed with GitHub Copilot and its newly released Edits functionality.
The Challenge
The task seemed straightforward: recreate Tim’s RAG application using GitHub Copilot. But as they say, the devil is in the details. The application needed to process documents, create embeddings, and provide a clean interface for querying - all while following specific constraints about package versions and project structure.
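To make those three moving parts concrete, here is a minimal sketch of a RAG pipeline in plain Python. It is not the application from Tim’s prompt, which builds on LangChain and a real embedding model. The bag-of-words "embedding" is a toy stand-in, and every function name here (`chunk_text`, `embed`, `retrieve`, `build_prompt`) is hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=8, overlap=2):
    """Split text into overlapping chunks of up to `chunk_size` words."""
    words = text.split()
    step = chunk_size - overlap
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), step)]


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, chunks, k=2):
    """Assemble the text that would be sent to the language model."""
    context = "\n".join(retrieve(query, chunks, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"
```

In a real application the `embed` step would call an embedding model and the chunks would live in a vector store, but the flow stays the same: chunk, embed, retrieve, then prompt.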
First Attempts: When Things Go South
Remember when Thanos said “Reality is often disappointing”? Well, my first few attempts with Copilot felt exactly like that. Let me break down the journey:
Attempt #1: The UI Nightmare
My first attempt turned into what I’d call “The Great Virtual Environment Debacle.” As someone who isn’t primarily a Python developer, watching Copilot struggle with .venv setup and requirements.txt was like watching a cat try to swim - technically possible, but not pretty.
The problems started stacking up: virtual environment issues, dependency conflicts, and a seemingly endless stream of configuration tweaks. There came a point where my developer instincts screamed “Start over!” And like any good developer who knows when to cut their losses, I listened.
Attempts #2 & #3: The Directory Structure Saga
The next two attempts were what I’d call “The Tale of Two Directories” - or more accurately, too many directories. Copilot seemed determined to create a complex directory structure that would make even the most ardent microservices architect blush. By the third attempt, I realized I needed to be more explicit in my requirements for a flat structure.
The Fourth Time’s the Charm
Finally, on the fourth attempt, everything clicked. The key difference? Just four messages, and only one overlapping with Tim’s original attempts - dealing with BeautifulSoup’s content extension permissions. What made this attempt successful was the combination of:
- Using the #terminalLastCommand helper to give Copilot context about terminal operations
- Leveraging GitHub Copilot Vision (through the ‘Vision for Copilot Preview’ extension) - pasting a screenshot turned out to be the game-changer. While this requires setting up your API keys, having visual context capabilities puts Copilot on par with tools like Aider and Cursor in terms of understanding visual error contexts
The Power of Copilot Edits
Copilot Edits, while still in preview, showed promising capabilities. It’s like having a junior developer who’s eager to learn and quick to adapt. The integration with VSCode’s terminal, allowing direct command insertion, was particularly impressive. While it might not yet match Aider’s prowess in multi-file editing, it’s definitely getting there.
Tools and Models: A Level Playing Field
In this experiment, like Tim’s original comparison, I used Claude 3.5 Sonnet as the underlying model. This created a level playing field for comparing the tools themselves - Aider, Cursor, Windsurf, and in my case, GitHub Copilot. It’s interesting to note that while all these tools can leverage the same powerful model, their approaches to interfacing with it and handling multi-file projects differ significantly.
These differences in approach highlight both the strengths and areas for improvement in each tool. While Aider excels in file management and Cursor shines in its visual feedback, Copilot brings its own advantages through tight VSCode integration and a familiar interface.
Looking Ahead
GitHub Copilot might still be playing catch-up in some areas, particularly multi-file editing compared to tools like Aider, but recent improvements are promising. The latest VSCode release has focused heavily on enhancing the Copilot experience, and Copilot Edits, though in its early stages, shows potential.
A Developer’s Guide to Copilot RAG Development
After playing around with both Copilot and Aider, I’ve discovered some tricks to make Copilot work more like its command-line cousin. Here’s how you can level up your Copilot game:
Setting Up Read-Only Context
One of Aider’s superpowers is its ability to handle read-only files in a session - something that developers often need but rarely get right. VSCode has recently caught up with this functionality [1], and here’s how you can leverage it:
{
  "github.copilot.chat.codeGeneration.instructions": [
    {
      // Add prompt file that contains RAG implementation details
      "file": ".prompts/rag_app_prompt.md"
    },
    {
      // Add reference documentation for context
      "file": "reference/langchainRag.txt"
    }
    // Instead of passing multiple files like this, we can also pass direct text:
    // {
    //   "text": "Follow these coding conventions: ..."
    // }
    // This is particularly useful when we want to provide read-only examples,
    // coding conventions, or other reference material like DB schemas
    // or API documentation
  ]
}
You can add this either to your global settings.json for all projects, or in .vscode/settings.json for project-specific context. I took Tim’s prompt from his video description and added LangChain’s RAG guide as reference material. This ensures Copilot always has this context available during our chats - like having a senior developer who’s always read the documentation.
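A typo in one of those paths is easy to miss, so a small sanity check can help. The sketch below is only an illustration and not part of Copilot or VSCode: `missing_instruction_files` is a hypothetical helper, and it assumes the "file" paths are meant to resolve relative to the project root.

```python
from pathlib import Path

SETTING_KEY = "github.copilot.chat.codeGeneration.instructions"


def missing_instruction_files(settings, project_root):
    """Return the "file" paths in the instructions setting that do not exist."""
    root = Path(project_root)
    entries = settings.get(SETTING_KEY, [])
    return [
        entry["file"]
        for entry in entries
        # Entries that use "text" instead of "file" have nothing to check.
        if "file" in entry and not (root / entry["file"]).is_file()
    ]
```

To use it, pass in the settings as a dictionary. Note that Python’s `json` module rejects comments, so a settings.json with `//` comments like the one above would need them removed (or a comment-tolerant parser) before it can be loaded.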
Managing File Access with Copilot Edits
While Aider excels at managing file access during sessions, Copilot Edits has its own approach. You can explicitly control which files Copilot can modify - in my case, I limited it to .py files and requirements.txt. Everything else, including the crucial .env file, was off-limits. This is particularly important given Tim’s experience where Cursor accidentally exposed his OpenAI key - a reminder that even AI needs boundaries!
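If you want to double-check which files fall inside that boundary before adding them to an Edits session, a rough sketch like the following can list them. This is not how Copilot Edits enforces anything; the file selection itself happens in the editor. The patterns and the `editable_files` helper are hypothetical, and the code assumes the flat project layout described earlier.

```python
from fnmatch import fnmatch
from pathlib import Path

ALLOWED_PATTERNS = ("*.py", "requirements.txt")  # files the assistant may edit
BLOCKED_PATTERNS = (".env", ".env.*")            # secrets stay out of the session


def editable_files(project_root):
    """List top-level files that match the allowlist and not the blocklist."""
    result = []
    for path in sorted(Path(project_root).iterdir()):
        if not path.is_file():
            continue  # flat layout assumed: subdirectories such as .venv are skipped
        if any(fnmatch(path.name, pattern) for pattern in BLOCKED_PATTERNS):
            continue
        if any(fnmatch(path.name, pattern) for pattern in ALLOWED_PATTERNS):
            result.append(path.name)
    return result
```

Checking the blocklist first means a secret file is excluded even if a later change to the allowlist would accidentally match it.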
The Magic of Minimal Prompting
Once you’ve set up the proper guardrails, the actual development becomes surprisingly straightforward. Here’s what worked for me:
- Start with a simple “Create RAG app” prompt
- Use the #terminalLastCommand helper to feed error messages back to Copilot, saying only “fix it”
- Make use of visual references - While Windsurf currently can’t process images, both Aider and Cursor handle them well. Thanks to the ‘Vision for Copilot Preview’ extension, Copilot has joined this club. After installing the extension and configuring your API keys, you can share screenshots of errors or expected outputs, making debugging sessions much more intuitive.
A funny moment came during the second iteration when the app couldn’t parse HTML pages. Instead of diving into complex debugging, I just pointed out the obvious: “It’s impossible that three different sites can’t be parsed with this logic.” Sometimes, a little common sense goes a long way in AI-assisted development!
Lessons Learned
- Tool Interface Matters: How each tool interfaces with the model can significantly impact the development experience
- Visual Context Helps: Copilot Vision can significantly improve understanding and solution generation
- Keep It Simple: Sometimes, a flat structure is better than an over-engineered solution
- Iteration is Key: Don’t be afraid to start over if things aren’t working
Final Thoughts
While GitHub Copilot might not yet be the one-stop solution for complex multi-file projects, it’s evolving rapidly. The combination of Copilot Edits, Vision, and integration with powerful models like Sonnet makes it a formidable tool in the AI-assisted development landscape. We’re witnessing the evolution of coding assistants, each finding their unique strengths and use cases.
May the AI Force be with you…