# 🔒 PrivateGPT 📑
<img width="900" alt="demo" src="https://lh3.googleusercontent.com/drive-viewer/AK7aPaBasLxbp49Hrwnmi_Ctii1oIM18nFJrBO0ERSE3wpkS-syjiQBE32_tUSdqnjn6etUDjUSkdJeFa8acqRb0lZbkZ6CyAw=s1600">

PrivateGPT is a production-ready AI project that allows you to ask questions about your documents using the power
of Large Language Models (LLMs), even in scenarios without an Internet connection. It is 100% private: no data
leaves your execution environment at any point.

The project provides an API offering all the primitives required to build private, context-aware AI applications.
It follows and extends the [OpenAI API standard](https://openai.com/blog/openai-api),
and supports both normal and streaming responses.

The API is divided into two logical blocks:

**High-level API**, which abstracts all the complexity of a RAG (Retrieval Augmented Generation)
pipeline implementation:
- Ingestion of documents: internally managing document parsing,
  splitting, metadata extraction, embedding generation and storage.
- Chat & Completions using context from ingested documents:
  abstracting the retrieval of context, the prompt engineering and the response generation.
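
The following is a minimal sketch of driving the high-level API over plain HTTP, assuming a locally running instance (the development setup listens on port 8001). The paths and fields used (`/v1/ingest`, `/v1/chat/completions`, `use_context`) follow the OpenAI-style scheme described above, but treat the exact shapes as illustrative and check https://docs.privategpt.dev/ for the authoritative contract.

```python
# Sketch: ingest a document, then chat using context from ingested docs.
# Assumes a PrivateGPT server on localhost:8001; verify paths and fields
# against https://docs.privategpt.dev/ before relying on them.
import requests

BASE_URL = "http://localhost:8001"

# 1. Ingest a document (multipart file upload).
with open("my_notes.pdf", "rb") as f:
    ingest = requests.post(f"{BASE_URL}/v1/ingest", files={"file": f})
ingest.raise_for_status()

# 2. Chat & Completions, letting the server retrieve context first.
chat = requests.post(
    f"{BASE_URL}/v1/chat/completions",
    json={
        "messages": [{"role": "user", "content": "Summarize my notes."}],
        "use_context": True,  # run retrieval over the ingested documents
    },
)
chat.raise_for_status()
print(chat.json()["choices"][0]["message"]["content"])
```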

**Low-level API**, which allows advanced users to implement their own complex pipelines:
- Embeddings generation: based on a piece of text.
- Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested documents.
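
The low-level primitives can be exercised the same way. This sketch makes the same assumptions as the example above (local server on port 8001); the response field names are illustrative guesses, so check the API reference before relying on them.

```python
# Sketch: low-level primitives - embeddings and contextual chunks retrieval.
# Same assumptions as the previous example; response shapes are illustrative.
import requests

BASE_URL = "http://localhost:8001"

# Embeddings generation based on a piece of text.
emb = requests.post(f"{BASE_URL}/v1/embeddings", json={"input": "What is PrivateGPT?"})
emb.raise_for_status()
vector = emb.json()["data"][0]["embedding"]
print(f"embedding with {len(vector)} dimensions")

# Contextual chunks retrieval: most relevant chunks for a given query.
chunks = requests.post(f"{BASE_URL}/v1/chunks", json={"text": "privacy guarantees"})
chunks.raise_for_status()
for chunk in chunks.json()["data"]:
    print(chunk)  # each entry references the ingested document it came from
```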

In addition to this, a working [Gradio UI](https://www.gradio.app/)
client is provided to test the API, together with a set of useful tools such as a bulk model
download script, an ingestion script, a documents folder watch, etc.

> 👂 **Need help applying PrivateGPT to your specific use case?**
> [Let us know more about it](https://forms.gle/4cSDmH13RZBHV9at7)
> and we'll try to help! We are refining PrivateGPT through your feedback.
## 🎞️ Overview
DISCLAIMER: This README is not updated as frequently as the [documentation](https://docs.privategpt.dev/).
Please check it out for the latest updates!

### Motivation behind PrivateGPT
Generative AI is a game changer for our society, but adoption in companies of all sizes and in data-sensitive
domains like healthcare or legal is limited by a clear concern: **privacy**.
Not being able to ensure that your data is fully under your control when using third-party AI tools
is a risk those industries cannot take.

### Primordial version
The first version of PrivateGPT was launched in May 2023 as a novel approach to address the privacy
concern by using LLMs in a completely offline way.
This was done by leveraging existing technologies developed by the thriving Open Source AI community:
[LangChain](https://github.com/hwchase17/langchain), [LlamaIndex](https://www.llamaindex.ai/),
[GPT4All](https://github.com/nomic-ai/gpt4all),
[LlamaCpp](https://github.com/ggerganov/llama.cpp),
[Chroma](https://www.trychroma.com/)
and [SentenceTransformers](https://www.sbert.net/).

That version, which rapidly became a go-to project for privacy-sensitive setups and served as the seed
for thousands of local-focused generative AI projects, was the foundation of what PrivateGPT is becoming nowadays;
it remains a simpler and more educational implementation for understanding the basic concepts required
to build a fully local, and therefore private, chatGPT-like tool.

If you want to keep experimenting with it, we have saved it in the
[primordial branch](https://github.com/imartinez/privateGPT/branches) of the project.

> It is strongly recommended to do a clean clone and install of this new version of
> PrivateGPT if you come from the previous, primordial version.

### Present and Future of PrivateGPT
PrivateGPT is now evolving towards becoming a gateway to generative AI models and primitives, including
completions, document ingestion, RAG pipelines and other low-level building blocks.
We want to make it easier for any developer to build AI applications and experiences, as well as to provide
a suitable, extensive architecture for the community to keep contributing.

Stay tuned to our [releases](https://github.com/imartinez/privateGPT/releases) to check all the new features and changes included.

## 📄 Documentation
Full documentation on installation, dependencies, configuration, running the server, deployment options,
ingesting local documents, API details and UI features can be found here: https://docs.privategpt.dev/

## 🧩 Architecture
Conceptually, PrivateGPT is an API that wraps a RAG pipeline and exposes its
primitives.
* The API is built using [FastAPI](https://fastapi.tiangolo.com/) and follows
  [OpenAI's API scheme](https://platform.openai.com/docs/api-reference).
* The RAG pipeline is based on [LlamaIndex](https://www.llamaindex.ai/).

The design of PrivateGPT makes it easy to extend and adapt both the API and the
RAG implementation. Some key architectural decisions are:
* Dependency Injection, decoupling the different components and layers.
* Usage of LlamaIndex abstractions such as `LLM`, `BaseEmbedding` or `VectorStore`,
  making it straightforward to swap in different implementations of those abstractions.
* Simplicity, adding as few layers and new abstractions as possible.
* Ready to use, providing a full implementation of the API and the RAG
  pipeline.
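
To make the Dependency Injection idea concrete, here is a minimal, self-contained sketch with hypothetical class names (they do not mirror the real `private_gpt` modules): a service depends only on an abstract `LLM` interface, so swapping the concrete model is a one-line wiring change.

```python
# Hypothetical sketch of DI over an LLM-style abstraction; names are
# illustrative and do not correspond to the actual project modules.
from typing import Protocol


class LLM(Protocol):
    """Stand-in for the LlamaIndex `LLM` abstraction."""

    def complete(self, prompt: str) -> str: ...


class LocalLlamaCpp:
    def complete(self, prompt: str) -> str:
        return f"[local completion for {prompt!r}]"


class RemoteOpenAI:
    def complete(self, prompt: str) -> str:
        return f"[remote completion for {prompt!r}]"


class CompletionsService:
    # The service only knows the LLM interface, never a concrete model,
    # so changing the backing implementation never touches service code.
    def __init__(self, llm: LLM) -> None:
        self._llm = llm

    def run(self, prompt: str) -> str:
        return self._llm.complete(prompt)


# Wiring happens in one place (in PrivateGPT, driven by settings files):
service = CompletionsService(llm=LocalLlamaCpp())
print(service.run("Hello, PrivateGPT"))
```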

Main building blocks:
* APIs are defined in `private_gpt:server:<api>`. Each package contains an
  `<api>_router.py` (FastAPI layer) and an `<api>_service.py` (the
  service implementation). Each *Service* uses LlamaIndex base abstractions instead
  of specific implementations, decoupling the actual implementation from its usage.
* Components are placed in
  `private_gpt:components:<component>`. Each *Component* is in charge of providing
  actual implementations to the base abstractions used in the Services - for example
  `LLMComponent` is in charge of providing an actual implementation of an `LLM`
  (for example `LlamaCPP` or `OpenAI`).
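
As an illustration of that layering, the following trimmed-down, hypothetical sketch shows the shape of a router/service pair (the real `<api>_router.py` and `<api>_service.py` files contain more logic and use the project's dependency-injection wiring):

```python
# Hypothetical router/service pair in the spirit of private_gpt:server:<api>.
# FastAPI handles HTTP concerns; the service owns the behavior.
from fastapi import APIRouter, Depends
from pydantic import BaseModel


class ChunksBody(BaseModel):
    text: str
    limit: int = 10


class ChunksService:
    """Would depend on LlamaIndex retriever abstractions, not concrete stores."""

    def retrieve(self, text: str, limit: int) -> list[str]:
        # Placeholder logic standing in for real vector-store retrieval.
        return [f"chunk matching {text!r}"][:limit]


def get_chunks_service() -> ChunksService:
    # Stand-in for the project's dependency-injection wiring.
    return ChunksService()


chunks_router = APIRouter(prefix="/v1")


@chunks_router.post("/chunks")
def chunks(
    body: ChunksBody, service: ChunksService = Depends(get_chunks_service)
) -> list[str]:
    # The router stays thin: validate the request, delegate to the service.
    return service.retrieve(body.text, body.limit)
```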

## 💡 Contributing
Contributions are welcome! To ensure code quality, we have enabled several format and
typing checks; just run `make check` before committing to make sure your code is OK.
Remember to test your code! You'll find a tests folder with helpers, and you can run
tests using the `make test` command.

Interested in contributing to PrivateGPT? We have the following challenges ahead of us in case
you want to give a hand:

### Improvements
- Better RAG pipeline implementation (improvements to both the indexing and querying stages)
- Code documentation
- Expose execution parameters such as top_p, temperature, max_tokens, etc. in Completions and Chat Completions
- Expose chunk size in the Ingest API
- Implement Update and Delete document in the Ingest API
- Add information about token consumption in each response
- Add to the Completion APIs (chat and completion) the context docs used to answer the question
- Return the actual LLM or Embeddings model name used in the `model` field

### Features
- Implement a concurrency lock to avoid errors when there are several calls to the local LlamaCPP model
- API key-based request control to the API
- CORS support
- Support for Sagemaker
- Support for function calling
- Add an md5 check to detect files that are already ingested
- Select a document to query in the UI
- Better observability of the RAG pipeline

### Project Infrastructure
- Create a “wipe” shortcut in `make` to remove all contents of the `local_data` folder except `.gitignore`
- Packaged version as a local desktop app (Windows executable, Mac app, Linux app)
- Dockerize the application for platforms outside Linux (Docker Desktop for Mac and Windows)
- Document how to deploy to AWS, GCP and Azure

## 💬 Community
Join the conversation around PrivateGPT on our:
- [Twitter (aka X)](https://twitter.com/PrivateGPT_AI)
- [Discord](https://discord.gg/bK6mRVpErU)

## 📖 Citation
Reference to cite if you use PrivateGPT in a paper:

```bibtex
@software{PrivateGPT_2023,
  author = {Martinez, I. and Gallego, D. and Orgaz, P.},
  month = {5},
  title = {PrivateGPT},
  url = {https://github.com/imartinez/privateGPT},
  year = {2023}
}
```