I see you use the Zoekt project for code search. Why did you choose this over alternatives, and how has your experience been so far?
congrats guys, this new feature looks really cool :)
How does this compare to ingesting all your code into some RAG tool and using that in a chat? I understand the citations part, which is a cool feature indeed, but tools for graph-RAG in particular, such as graphiti https://github.com/getzep/graphiti, can store so much more information in a graph than the code repository alone provides, such as info about collaborators, infrastructure, metrics, logs, etc.
You certainly could create an embedding of your code and then hook it up to Open WebUI or an equivalent as a chat interface - we've actually spoken to some teams that have rolled their own custom solution like that!
From a product POV: our main focus with Sourcebot is providing a world-class DX and UX so that it is really easy to use. Practically speaking, for DX: a sys-admin should be able to throw Sourcebot up into their cluster in minutes with minimal maintenance overhead. For UX: provide a snappy interface that is minimal and gets out of your way.
From a technology POV: vector embeddings (and techniques like graph-RAG) are definitely something we are going to investigate as a means of improving the agent's ability to find relevant context fast. Bringing in additional context sources (like git history, logs, GitHub issues, etc.) is also something we plan to investigate. It's a really fascinating problem :)
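For readers wondering what the roll-your-own route involves, here is a minimal sketch of the retrieval half only. It assumes a toy bag-of-words vector in place of a real embedding model, and the names (`embed`, `cosine`, `top_k`, `CHUNKS`) and sample snippets are hypothetical, not anything from Sourcebot:

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_k(query: str, chunks: dict[str, str], k: int = 2) -> list[str]:
    """Return the paths of the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda path: cosine(q, embed(chunks[path])), reverse=True)
    return ranked[:k]

# Hypothetical code chunks keyed by file path.
CHUNKS = {
    "auth/login.py": "def login(user, password): verify password hash and create session",
    "db/pool.py": "def get_connection(): return a pooled database connection",
    "api/routes.py": "register http routes for the rest api",
}

best = top_k("how is the password verified at login", CHUNKS, k=1)
```

A real setup would swap `embed` for an actual embedding model, persist the vectors in an index, and pass the retrieved chunks to the chat model as context.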
I was very excited for a strong off-the-shelf code vector embedding search tool.
I wanted to encourage you to explore that direction, since it's a) very powerful, b) annoying to hand-roll, and thus c) sorely needed as open source.
Love this idea. The docs are good, I just need to read them better :)
Trying it out now. Keep it fully open source and nicely pluggable and I'll keep being a fan!
Ah I was just replying to your previous comment - I'm guessing you found this? ;) https://docs.sourcebot.dev/docs/connections/local-repos
Thanks for the support!
Yes, thanks! I opened an issue on your support site. I got stuck on a file ownership error when trying to mount local repos. Excited to try it if I can get it to work :)
So can I use Functional Source licensed code in internal products if I’m a commercial org?
hey I'm Michael (the other cofounder). If the products are purely internal[1] then you're able to use, modify, and distribute the code as you please (even if you're a commercial org). If you have any additional questions about the license feel free to reach out at license@sourcebot.dev
The Fair Source website is a great resource to learn more: https://fair.io/
[1] The only restriction on the code is that it cannot be used for a commercial product that substitutes for our software. We have a few teams that have connected Sourcebot into internal dev dashboards! This is 100% allowed by the license
In reading the docs, it doesn't look like the MCP server supports the Ask Sourcebot capability. Is that correct or am I missing something in the docs? Is that planned to be added?
Yeah, they are currently separate - the MCP server exposes the same tools that Ask Sourcebot uses, but the actual LLM call is made by the MCP client. It would be interesting to merge them though - maybe have an Exa-style MCP tool that lets MCP clients ask questions, similar to how we are doing it with Ask Sourcebot.
Would be great to hear more about your use case though.
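To make the split concrete, here is a rough sketch of the two tool shapes being discussed: a plain search tool whose results the client's LLM interprets, and a hypothetical "ask"-style tool where the server runs the LLM call itself and returns an answer with citations. Nothing here is Sourcebot's or Exa's actual API; `search_code`, `ask_codebase`, the in-memory `INDEX`, and the injected `llm` callable are all assumptions for illustration:

```python
from typing import Callable

# Hypothetical in-memory index standing in for a real code search backend.
INDEX = {
    "auth/login.py": "def login(user, password): ...",
    "db/pool.py": "def get_connection(): ...",
}

def search_code(query: str) -> list[dict]:
    """Shape 1: return raw matches; the MCP client's own LLM reasons over them."""
    q = query.lower()
    return [{"path": p, "snippet": s} for p, s in INDEX.items() if q in s.lower()]

def ask_codebase(question: str, keyword: str, llm: Callable[[str], str]) -> dict:
    """Shape 2: the server searches, calls an LLM itself, and returns a cited answer."""
    hits = search_code(keyword)
    context = "\n".join(f"{h['path']}: {h['snippet']}" for h in hits)
    answer = llm(f"Question: {question}\nContext:\n{context}")
    return {"answer": answer, "citations": [h["path"] for h in hits]}

# Usage with a stub in place of a real model call.
result = ask_codebase("Where is login handled?", "login", llm=lambda prompt: "stub answer")
```

The trade-off is where the model runs: with the first shape the client picks and pays for the LLM, while the second moves that choice to the server.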
Just tried it, very cool!
Love that it’s free to use
I thought this had something to do with Perplexity
We used Perplexity as a mental model since there is some overlap, e.g., an LLM using search and citing its sources, it being a web app, etc.
This looks pretty neat. Just spotted in the docs that it has an MCP server too, however, I haven't found anything in the docs about using a locally hosted model. Running this on a box in the corner of the office would be great, but external AI providers would be a deal breaker.
Running Sourcebot with a self-hosted LLM is something we plan to support and have documented in the golden path very soon, so stay tuned.
We are using the Vercel AI SDK, which supports Ollama via a community provider, but that provider doesn't support V5 yet (which Sourcebot is on): https://v5.ai-sdk.dev/providers/community-providers/ollama
Nioce
I got this set up and working in basically 5 minutes. Going to try to set it up at work. Super cool! It seems like the open source version already has a bunch of features, how do you plan on making sure you can sustainably support it?
Awesome, glad to hear! We are monetizing enterprise features like audit logging and SSO. The core product will remain free and under an FSL license.
SSO is not an enterprise feature :( https://sso.tax
I'm using OIDC SSO (via Pocket ID) just for my own sanity. I don't want or need multiple sets of credentials for my home lab applications.
Why not use a password manager instead?
That is an orthogonal solution to SSO. I have many apps in my home lab. It doesn't make sense to have individual credentials for everything, even if it is effectively free to keep track of them. Rotating dozens of passwords (even spread out over time) is not my idea of a fun day, nor is supporting individual logins for friends/family who use the apps in my network.
SSO is the quick and easy way, especially when other people are involved.