Commit Graph

337 Commits

Author SHA1 Message Date
PromptEngineer
47e59d29fc
Create LICENSE 2025-07-25 20:41:47 -07:00
PromptEngineer
84df73b853
Merge pull request #904 from PromtEngineer/devin/1753129860-add-unstructured-file-support
feat: Add support for DOCX and HTML file formats using docling
2025-07-21 15:40:00 -07:00
Devin AI
8f3417af15 feat: Update frontend file validation to accept MD, DOC, TXT, and HTML formats
- Update IndexForm.tsx to accept .md, .doc, .txt, .html, .htm file extensions
- Update IndexWizard.tsx file input accept attribute for new formats
- Update chat-input.tsx validation logic to handle new MIME types and extensions
- Update empty-chat-state.tsx validation logic for comprehensive file support
- Update test-upload.html to accept all supported file formats
- Resolves frontend file upload restrictions for unstructured document formats

Co-Authored-By: PromptEngineer <jnfarooq@outlook.com>
2025-07-21 21:50:46 +00:00
Devin AI
583c72e340 feat: Add TXT and MD file format support to DocumentConverter
- Add .txt and .md extensions to SUPPORTED_FORMATS mapping
- Add _convert_txt_to_markdown method for plain text files
- Support docling's native MD InputFormat for markdown files
- Add proper format detection and routing logic
- Preserve existing PDF OCR detection and multi-format support

Co-Authored-By: PromptEngineer <jnfarooq@outlook.com>
2025-07-21 20:47:24 +00:00
Devin AI
d5929ce29b feat: Add support for DOCX and HTML file formats using docling
- Rename PDFConverter to DocumentConverter with multi-format support
- Add SUPPORTED_FORMATS mapping for PDF, DOCX, HTML, HTM extensions
- Update indexing pipeline to use DocumentConverter
- Update file validation across all frontend components and scripts
- Preserve existing PDF OCR detection logic
- Add format-specific conversion methods for different document types

Co-Authored-By: PromptEngineer <jnfarooq@outlook.com>
2025-07-21 20:40:39 +00:00
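The extension-based routing this commit describes can be sketched roughly as follows. This is an illustration only, not the repository's code: the name `SUPPORTED_FORMATS` comes from the commit message, but the mapped values and the `detect_format` helper are hypothetical.

```python
from pathlib import Path

# Hypothetical mapping: only the extensions are taken from the commit message;
# the string values are placeholders for whatever format identifiers the real code uses.
SUPPORTED_FORMATS = {
    ".pdf": "pdf",
    ".docx": "docx",
    ".html": "html",
    ".htm": "html",
}


def detect_format(path: str) -> str:
    """Return the format label for a file, based on its lowercased extension."""
    ext = Path(path).suffix.lower()
    if ext not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported file extension: {ext or '(none)'}")
    return SUPPORTED_FORMATS[ext]
```

A converter built this way would call `detect_format` first and then dispatch to a format-specific conversion method, keeping the PDF-only OCR check on the `"pdf"` branch.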
PromptEngineer
6f69e61473
Merge pull request #874 from PromtEngineer/localgpt-v2
Localgpt v2
2025-07-19 20:47:36 -07:00
PromptEngineer
a4e5087aef
Merge pull request #871 from PromtEngineer/fix/lancedb-nan-handling
Fix: Add comprehensive NaN handling for LanceDB indexing
2025-07-18 00:56:47 -07:00
PromptEngineer
acf6efb5a4 fix: Add comprehensive NaN handling for LanceDB indexing
- Add NaN and infinite value detection in QwenEmbedder and OllamaEmbedder
- Implement LanceDB table creation with on_bad_vectors='drop' parameter
- Add fallback strategy with on_bad_vectors='fill' and fill_value=0.0
- Add pre-filtering of chunks with invalid embeddings before indexing
- Add NaN validation to LateChunkEncoder
- Add detailed logging for skipped chunks and error handling
- Resolves LanceDB error: 'Vector column has NaNs' during indexing

This fix ensures robust handling of edge cases in embedding generation
and prevents indexing failures due to invalid vector values.
2025-07-18 00:26:39 -07:00
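The pre-filtering step listed in this commit (dropping chunks whose embeddings contain NaN or infinite values before they reach the vector store) might look roughly like the sketch below. The function names are hypothetical, and the sketch deliberately avoids calling LanceDB itself; the `on_bad_vectors` options named in the commit would act as a second line of defence at table creation.

```python
import math
from typing import List, Sequence, Tuple


def is_valid_vector(vec: Sequence[float]) -> bool:
    """True only if the vector is non-empty and every component is finite."""
    return len(vec) > 0 and all(math.isfinite(x) for x in vec)


def drop_invalid_embeddings(
    chunks: Sequence[str], vectors: Sequence[Sequence[float]]
) -> Tuple[List[str], List[Sequence[float]], List[int]]:
    """Split (chunk, vector) pairs into kept pairs and the indices that were skipped."""
    kept_chunks: List[str] = []
    kept_vectors: List[Sequence[float]] = []
    skipped: List[int] = []
    for i, (chunk, vec) in enumerate(zip(chunks, vectors)):
        if is_valid_vector(vec):
            kept_chunks.append(chunk)
            kept_vectors.append(vec)
        else:
            skipped.append(i)  # caller can log these, as the commit describes
    return kept_chunks, kept_vectors, skipped
```

Returning the skipped indices rather than silently discarding them makes it straightforward to log which chunks were left out of the index.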
PromptEngineer
a810687e3d
Merge pull request #870 from PromtEngineer/fix/database-path-auto-detection
fix: implement automatic database path detection for multi-environment compatibility
2025-07-17 22:22:00 -07:00
PromptEngineer
1a2bedd642
Merge branch 'main' into fix/database-path-auto-detection 2025-07-17 22:20:45 -07:00
PromptEngineer
35697b23a4 fix: implement automatic database path detection for multi-environment compatibility
- Add environment auto-detection in ChatDatabase class
- Support both local development and Docker container paths
- Local development: uses 'backend/chat_data.db' (relative path)
- Docker containers: uses '/app/backend/chat_data.db' (absolute path)
- Maintain backward compatibility with explicit path overrides
- Update RAG API server to use auto-detection

This resolves the SQLite database connection error that occurred
when running LocalGPT in local development environments while
maintaining compatibility with Docker deployments.

Fixes: Database path hardcoded to Docker container path
Tested: Local development and Docker environment detection
Breaking: No breaking changes - maintains backward compatibility
2025-07-17 22:13:25 -07:00
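A minimal sketch of the path auto-detection described above, under stated assumptions: the two paths are quoted from the commit message, but the detection heuristic (checking for `/.dockerenv` or an `/app/backend` directory) and the function names are hypothetical and may differ from what `ChatDatabase` actually does.

```python
import os
from typing import Optional

LOCAL_DB_PATH = "backend/chat_data.db"        # relative path, local development
DOCKER_DB_PATH = "/app/backend/chat_data.db"  # absolute path, Docker container


def running_in_docker() -> bool:
    """Assumed heuristic: Docker creates /.dockerenv; the image mounts /app/backend."""
    return os.path.exists("/.dockerenv") or os.path.isdir("/app/backend")


def resolve_db_path(
    explicit_path: Optional[str] = None, in_docker: Optional[bool] = None
) -> str:
    """An explicit path always wins; otherwise pick the path for the detected environment."""
    if explicit_path:
        return explicit_path
    if in_docker is None:
        in_docker = running_in_docker()
    return DOCKER_DB_PATH if in_docker else LOCAL_DB_PATH
```

Honouring the explicit override first is what keeps the change backward compatible: existing callers that pass a path see no difference.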
PromptEngineer
a3402f4274
Merge pull request #853 from PromtEngineer/devin/1737063744-docker-setup-fix
Fix Docker container SQLite database path issue (#849)
2025-07-15 22:51:35 -07:00
Devin AI
fb75541eb3 fix: resolve Docker networking issue for Ollama connectivity
- Modified OllamaClient to read OLLAMA_HOST environment variable
- Updated docker-compose.yml to pass OLLAMA_HOST to backend service
- Changed docker.env to use Docker gateway IP (172.18.0.1:11434)
- Configured Ollama service to bind to 0.0.0.0:11434 for container access
- Added test script to verify Ollama connectivity from within container
- All backend tests now pass including chat functionality

Co-Authored-By: PromptEngineer <jnfarooq@outlook.com>
2025-07-15 21:34:17 +00:00
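Reading `OLLAMA_HOST` from the environment, as this commit describes for `OllamaClient`, could be sketched as below. The helper name, the fallback to `http://localhost:11434` (Ollama's usual local address), and the scheme normalization are assumptions for illustration, not the repository's implementation.

```python
import os
from typing import Mapping, Optional

DEFAULT_OLLAMA_HOST = "http://localhost:11434"  # assumed fallback


def resolve_ollama_host(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the Ollama base URL from OLLAMA_HOST, or the assumed default."""
    env = os.environ if env is None else env
    host = env.get("OLLAMA_HOST", "").strip() or DEFAULT_OLLAMA_HOST
    if "://" not in host:
        # Values such as "172.18.0.1:11434" carry no scheme; assume plain HTTP.
        host = "http://" + host
    return host.rstrip("/")
```

Inside a container, `localhost` refers to the container itself, which is why the commit points the variable at the Docker gateway address instead.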
Devin AI
f21686f51c fix: resolve Docker container SQLite database path issue
- Updated database path from relative 'backend/chat_data.db' to absolute '/app/backend/chat_data.db'
- Modified docker-compose.yml to mount entire backend directory for proper database persistence
- Updated Dockerfile.backend to ensure backend directory exists in container
- Fixes GitHub issue #849: sqlite3.OperationalError unable to open database file

Co-Authored-By: PromptEngineer <jnfarooq@outlook.com>
2025-07-15 21:15:28 +00:00
PromptEngineer
3e3e83c41a
Update README.md 2025-07-15 01:04:36 -07:00
PromptEngineer
91c34392d2
Merge pull request #848 from PromtEngineer/devin/1752564985-fix-markdown-excessive-newlines
Fix excessive empty lines in streaming markdown responses
2025-07-15 00:48:31 -07:00
Devin AI
dc9722de28 fix: normalize excessive whitespace in streaming markdown responses
- Create comprehensive text normalization utility to clean up excessive newlines
- Apply normalization to streaming tokens in session-chat.tsx
- Apply normalization to rendered text in conversation-page.tsx
- Add test case demonstrating the fix for excessive empty lines
- Preserve proper markdown formatting while removing visual gaps

Co-Authored-By: PromptEngineer <jnfarooq@outlook.com>
2025-07-15 07:30:00 +00:00
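The normalization idea in this commit can be illustrated with a short sketch. The actual change lives in the TypeScript frontend; the Python below is only an assumed, simplified equivalent, and `normalize_whitespace` is a hypothetical name. It does not special-case fenced code blocks, where blank lines may be meaningful.

```python
import re


def normalize_whitespace(text: str) -> str:
    """Collapse runs of blank lines to a single blank line, preserving paragraph breaks."""
    text = text.replace("\r\n", "\n")
    # Turn whitespace-only lines into truly empty lines so they count as blank.
    text = re.sub(r"^[ \t]+$", "", text, flags=re.MULTILINE)
    # Three or more consecutive newlines mean more than one blank line; keep one.
    return re.sub(r"\n{3,}", "\n\n", text)
```

For streamed output, a function like this would typically be applied to the accumulated text rather than to each token, since a run of newlines can be split across tokens.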
PromptEngineer
6af1165894
Merge pull request #847 from PromtEngineer/devin/1752559300-fix-token-based-chunking
Fix: Default to token-based chunking for accurate chunk sizing
2025-07-15 00:07:32 -07:00
Devin AI
a13a71d247 fix: make both chunking methods token-based
- Update MarkdownRecursiveChunker to use tokenizer for token-based sizing
- Update DoclingChunker to use tokenizer with proper error handling
- Ensure IndexingPipeline passes tokenizer_model to both chunkers
- Update UI tooltips to reflect that both modes now use tokens
- Keep Docling as default for enhanced granularity features
- Add fallback to character-based approximation when tokenizer fails

Co-Authored-By: PromptEngineer <jnfarooq@outlook.com>
2025-07-15 06:38:55 +00:00
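Token-based sizing with a character-based fallback, as listed in this commit, might be sketched like this. The `tokenize` callable stands in for whatever tokenizer the chunkers load, and the ratio of four characters per token is an assumed rough heuristic, not a value taken from the repository.

```python
from typing import Callable, Optional, Sequence

CHARS_PER_TOKEN = 4  # assumed approximation used only when no tokenizer is available


def count_tokens(
    text: str, tokenize: Optional[Callable[[str], Sequence]] = None
) -> int:
    """Count tokens with the tokenizer if possible, else approximate from length."""
    if tokenize is not None:
        try:
            return len(tokenize(text))
        except Exception:
            pass  # tokenizer failed; fall through to the approximation
    # Ceiling division so any non-empty text counts as at least one token.
    return -(-len(text) // CHARS_PER_TOKEN)


def fits(text: str, max_tokens: int, tokenize: Optional[Callable[[str], Sequence]] = None) -> bool:
    """True if the text is within the chunk size budget, measured in tokens."""
    return count_tokens(text, tokenize) <= max_tokens
```

A chunker would call `fits` while accumulating text, so that a setting such as 512 is interpreted as tokens in both chunking modes.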
Devin AI
3b648520c9 fix: default to token-based chunking for accurate chunk sizing
- Change default chunker_mode from 'legacy' to 'docling' for token-based chunking
- Update UI to reflect new default with DoclingChunk enabled by default
- Improve tooltips to clarify token vs character chunking behavior
- Fixes issue where 512 token setting was using character-based chunking

Co-Authored-By: PromptEngineer <jnfarooq@outlook.com>
2025-07-15 06:05:22 +00:00
PromptEngineer
9d9d051d5c
Update README.md 2025-07-14 08:59:47 -07:00
PromptEngineer
92c247b88f
Update README.md 2025-07-14 08:58:28 -07:00
PromptEngineer
a4c318aad0 docs: refresh badges, quick-start notes, and clean up acknowledgments 2025-07-13 14:00:51 -07:00
PromptEngineer
1e42b46683
Merge pull request #846 from PromtEngineer/devin/1752388447-readme-improvements
Updated README with installation, API, and configuration details
2025-07-12 23:48:35 -07:00
Devin AI
07d68571e9 docs: remove graph queries reference from Smart Routing feature
- Updated Smart Routing description to remove mention of 'graph queries'
- Changed from 'RAG, direct LLM, or graph queries' to 'RAG and direct LLM responses'
- Addresses PR feedback that graph-related features are not yet enabled

Co-Authored-By: PromptEngineer <jnfarooq@outlook.com>
2025-07-13 06:44:26 +00:00
Devin AI
934891e625 docs: add advanced features documentation and improve dependency info
- Document Query Decomposition with API examples
- Add Answer Verification system documentation
- Document Contextual Enrichment during indexing
- Add Late Chunking configuration details
- Document Multimodal Support with vision model integration
- Add detailed dependency information in installation section
- Improve batch processing documentation with real config examples
- Update Advanced Features section with comprehensive API examples

Co-Authored-By: PromptEngineer <jnfarooq@outlook.com>
2025-07-13 06:37:13 +00:00
Devin AI
7a6d75a74d docs: fix installation instructions and update API documentation
- Fix incorrect repository URL in Docker deployment section
- Update API endpoints to match actual backend implementation
- Add session-based chat endpoints and management
- Document real index management endpoints
- Update model configuration with actual OLLAMA_CONFIG and EXTERNAL_MODELS
- Replace generic pipeline configs with actual default/fast configurations
- Add system launcher documentation and service architecture
- Improve health monitoring and logging documentation
- Update environment variables to match code implementation

Co-Authored-By: PromptEngineer <jnfarooq@outlook.com>
2025-07-13 06:35:35 +00:00
PromptEngineer
0037eec98c docs: add UI preview images and section to README 2025-07-12 22:47:36 -07:00
PromptEngineer
bf406cf549 docs: add retrieval agent mermaid diagram and clean up README sections 2025-07-12 22:22:39 -07:00
PromptEngineer
cd6e569377 docs: update README with badges, features, and repository links 2025-07-12 21:28:49 -07:00
PromptEngineer
6d73a61e5c refactor: Remove unused imports across codebase
Removed unused import statements from various Python files to improve code clarity and reduce unnecessary dependencies.
2025-07-12 02:34:17 -07:00
PromptEngineer
9f7a62b4f1 fix: Correct run_system script and markdown rendering 2025-07-12 02:18:29 -07:00
PromptEngineer
150896340a Merge branch 'localgpt-v2' of https://github.com/PromtEngineer/localGPT into localgpt-v2 2025-07-12 01:52:36 -07:00
PromptEngineer
c93b8639ab fix(db): Correct database path and chat history logic 2025-07-12 01:51:57 -07:00
PromptEngineer
7eb58d19f0
Update requirements-docker.txt 2025-07-12 01:08:37 -07:00
PromptEngineer
3cd11bc617
Update requirements.txt 2025-07-12 01:08:16 -07:00
PromptEngineer
2421514f3e Integrate multimodal RAG codebase
- Replaced existing localGPT codebase with multimodal RAG implementation
- Includes full-stack application with backend, frontend, and RAG system
- Added Docker support and comprehensive documentation
- Enhanced with multimodal capabilities for document processing
- Preserved git history for localGPT while integrating new functionality
2025-07-11 00:17:15 -07:00
PromptEngineer
4e0d9e75e9
Merge pull request #839 from arvind-elayappan/main
Updated_HF_version
2025-03-01 18:42:50 -08:00
arvind-elayappan
deb7369fc3 Updated_HF_version
Updated huggingface_hub version
huggingface_hub==0.25.0
The latest versions conflict with other packages
2025-02-09 17:20:14 +00:00
PromptEngineer
732c57d326
Update README.md
2024-11-05 05:53:24 -08:00
PromptEngineer
b48c62e7b3
Merge pull request #832 from jxmai/patch-1
Restore context windows size to 8096 tokens in contants config
2024-11-04 07:57:27 -08:00
jxmai
c3620d972b
Restore context windows size to 8096 tokens 2024-10-28 17:24:40 -05:00
PromptEngineer
51eb664c43
Merge pull request #828 from siddhivelankar23/main
Add hpu support for Intel® Gaudi®
2024-10-27 20:30:18 -07:00
Siddhi Velankar
c83249b77c
default model llama3 2024-10-24 14:13:06 -05:00
Siddhi Velankar
f396f76c60
cleanup readme 2024-10-23 14:46:46 -05:00
Siddhi Velankar
fcd15fc316
add hpu details to README 2024-10-23 14:43:59 -05:00
siddhivelankar23
c443d3f108 Merge branch 'main' of https://github.com/siddhivelankar23/localGPT 2024-10-23 19:34:03 +00:00
siddhivelankar23
d5635adc38 cleanup 2024-10-23 19:32:15 +00:00
Siddhi Velankar
2cb80aeb58
delete gaudi_spawn.py 2024-10-23 14:27:17 -05:00
siddhivelankar23
6d89108b47 add dockerfile for hpu 2024-10-23 19:26:01 +00:00