Introduction to DealAI.lt
What is DealAI.lt?
DealAI.lt is a Lithuanian e-commerce intelligence platform that aggregates, processes, and indexes over 60,000 products from multiple Lithuanian online retailers. It combines web scraping, real-time data processing, and Elasticsearch-based search to support product discovery and price comparison.
Platform Purpose
DealAI.lt serves as a centralized product intelligence hub that:
- Aggregates product data from multiple Lithuanian e-commerce websites
- Normalizes diverse product information into a unified database
- Provides advanced search capabilities with Lithuanian language support
- Tracks price changes and availability over time
- Monitors data quality and freshness across all sources
- Delivers business intelligence through comprehensive dashboards
Key Features
🔍 Advanced Product Search
Elasticsearch-powered search engine with:
- Full-text search across multiple fields (title, brand, description, SKU)
- Lithuanian language analysis with snowball stemming
- Fuzzy matching for typo tolerance
- Multi-faceted filtering (price, category, brand, availability)
- Real-time search suggestions and highlighting
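To illustrate how the capabilities above might map onto a single request, here is a minimal sketch that builds an Elasticsearch query body as a plain Python dictionary. The field names (`title`, `brand`, `brand.keyword`, `description`, `sku`, `price`, `in_stock`), the title boost, and the function name are assumptions for illustration; the actual index mapping is not described in this introduction.

```python
from typing import Any, Dict, List, Optional


def build_search_body(
    text: str,
    brand: Optional[str] = None,
    min_price: Optional[float] = None,
    max_price: Optional[float] = None,
    in_stock_only: bool = False,
) -> Dict[str, Any]:
    """Build a query body combining full-text search, fuzziness, and filters."""
    filters: List[Dict[str, Any]] = []
    if brand is not None:
        # Exact match on an assumed keyword sub-field of `brand`.
        filters.append({"term": {"brand.keyword": brand}})
    price_range: Dict[str, float] = {}
    if min_price is not None:
        price_range["gte"] = min_price
    if max_price is not None:
        price_range["lte"] = max_price
    if price_range:
        filters.append({"range": {"price": price_range}})
    if in_stock_only:
        filters.append({"term": {"in_stock": True}})

    return {
        "query": {
            "bool": {
                # Scored full-text match across several fields, tolerant of typos.
                "must": [
                    {
                        "multi_match": {
                            "query": text,
                            "fields": ["title^2", "brand", "description", "sku"],
                            "fuzziness": "AUTO",
                        }
                    }
                ],
                # Filters narrow the result set without affecting relevance scores.
                "filter": filters,
            }
        },
        # Ask Elasticsearch to return highlighted fragments for these fields.
        "highlight": {"fields": {"title": {}, "description": {}}},
    }
```

A call such as `build_search_body("nešiojamas kompiuteris", brand="Lenovo", max_price=800)` returns a dictionary that could be sent as the JSON body of a `_search` request. Keeping facets in `filter` rather than `must` lets Elasticsearch cache them and leaves scoring to the text match.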
💰 Price Intelligence
Comprehensive price tracking system:
- Historical price data with time-series analysis
- Discount detection and alert capabilities
- Price trend visualization with Chart.js
- Availability monitoring across retailers
- Competitive pricing insights
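As a rough illustration of discount detection over historical price data, the sketch below compares the latest observed price with the median of recent earlier observations. The `PricePoint` type, the 30-observation window, and the 10% threshold are assumptions chosen for the example, not values taken from the platform.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median
from typing import List, Optional


@dataclass(frozen=True)
class PricePoint:
    day: date
    price: float


def detect_discount(
    history: List[PricePoint],
    window: int = 30,
    threshold: float = 0.10,
) -> Optional[float]:
    """Return the fractional discount of the latest price, or None.

    The latest price is compared with the median of up to `window`
    earlier observations; drops smaller than `threshold` are ignored.
    """
    if len(history) < 2:
        return None  # nothing to compare against
    ordered = sorted(history, key=lambda p: p.day)
    latest = ordered[-1]
    # Baseline excludes the latest point itself.
    baseline = median(p.price for p in ordered[-(window + 1):-1])
    if baseline <= 0:
        return None
    drop = (baseline - latest.price) / baseline
    return drop if drop >= threshold else None
```

Using a median rather than the previous price makes the baseline less sensitive to a single mis-scraped value, which is one reasonable design choice among several.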
🤖 Automated Data Collection
Enterprise-grade web scraping infrastructure:
- Distributed Scrapyd architecture
- Automated job scheduling via cron
- Queue-based processing for reliability
- Error handling and retry mechanisms
- Real-time job monitoring and management
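The following sketch shows one way scheduling and retry could fit together from Python. It prepares a request for Scrapyd's `schedule.json` endpoint and wraps an arbitrary action in exponential-backoff retries. The helper names, the retry counts, and the project and spider names mentioned afterwards are hypothetical; only the endpoint and its `project` and `spider` parameters come from Scrapyd itself.

```python
import time
import urllib.parse
import urllib.request
from typing import Callable, TypeVar

T = TypeVar("T")


def build_schedule_request(base_url: str, project: str, spider: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST to Scrapyd's schedule.json endpoint."""
    data = urllib.parse.urlencode({"project": project, "spider": spider}).encode("utf-8")
    url = f"{base_url.rstrip('/')}/schedule.json"
    return urllib.request.Request(url, data=data, method="POST")


def with_retries(
    action: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action`, retrying on OSError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return action()
        except OSError:  # urllib's URLError is a subclass of OSError
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise ValueError("attempts must be at least 1")
```

For example, `build_schedule_request("http://localhost:6800", "dealai", "example_spider")` yields a request that could be sent with `with_retries(lambda: urllib.request.urlopen(request, timeout=10).read())`. A cron entry would then only need to invoke a small script like this on a schedule.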
📊 Real-time Dashboards
Administrative interfaces for:
- Scrapyd job monitoring and control
- Elasticsearch synchronization tracking
- Product catalog management
- Analytics and reporting
- System health monitoring
📸 Visual Documentation
Automated screenshot system:
- Product page capture for visual records
- Change detection and comparison
- Quality assurance tools
- Historical visual archive
Target Audience
This documentation is designed for:
Developers
- Backend developers working with PHP, WordPress, and PostgreSQL
- Frontend developers building user interfaces
- DevOps engineers managing infrastructure
- Integration developers connecting to APIs
Administrators
- System administrators managing the platform
- Data administrators overseeing product catalog
- Operations teams monitoring performance
- Business analysts using dashboards
Decision Makers
- Technical leads evaluating the architecture
- Product managers planning features
- Business stakeholders understanding capabilities
- Compliance officers reviewing security
Documentation Structure
This documentation is organized into the following sections:
Getting Started
Learn about the platform architecture, technology stack, and setup process.
Core Systems
Deep dive into database architecture, Elasticsearch integration, web scraping, and data processing pipelines.
Features
Detailed documentation of product search, price tracking, categories, and analytics.
Dashboards & Admin
User guides for administrative interfaces and control panels.
API & Integration
Technical reference for database functions, Elasticsearch API, and AJAX endpoints.
File Structure
Guide to the codebase organization and key files.
Development
Best practices, debugging, performance optimization, and troubleshooting.
Reference
Database schema, configuration options, environment variables, and glossary.
System Requirements
Server Requirements
- OS: Linux (Ubuntu 20.04+ / Debian 10+ recommended)
- PHP: 8.0 or higher
- Web Server: Apache 2.4+ or Nginx 1.18+
- Memory: 4GB RAM minimum, 8GB recommended
- Storage: 50GB minimum (grows with product data)
Database Requirements
- PostgreSQL: 13.0 or higher
- Elasticsearch: 7.x (7.10+ recommended)
- Database Storage: 20GB minimum for 60K products
Service Requirements
- Scrapyd: Latest stable version
- Python: 3.8+ for scrapers
- Cron: System-level job scheduling
Client Requirements
- Modern web browser (Chrome, Firefox, Safari, Edge)
- JavaScript enabled
- Minimum 1024x768 screen resolution
Architecture Principles
DealAI.lt is built on several key principles:
Separation of Concerns
- Clear separation between data collection, storage, search, and presentation
- Modular PHP includes for different functionalities
- Reusable template parts for UI components
Scalability
- Batch processing for large datasets
- Connection pooling for database efficiency
- Strategic indexing for query performance
- Distributed crawling architecture
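Batch processing can be sketched with a small generator that splits any stream of records into fixed-size chunks, so that each chunk can be written in one bulk operation while memory use stays bounded. The helper name and the idea of feeding it product records are assumptions for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls the next `size` items without loading the whole input.
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Each yielded batch could then be handed to a single multi-row database insert or a single Elasticsearch bulk request instead of one call per product.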
Reliability
- Error handling throughout the stack
- State management for resumable operations
- Failed job tracking and retry logic
- Comprehensive logging for debugging
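One simple way to make a long-running operation resumable is to persist a checkpoint after every completed item and to start from that checkpoint on the next run. The sketch below assumes a JSON state file and a `next_index` key, both hypothetical; the platform's actual state format is not described here.

```python
import json
from pathlib import Path
from typing import Callable, Sequence


def run_resumable(
    items: Sequence[str],
    handle: Callable[[str], None],
    state_file: Path,
) -> int:
    """Process `items` in order, persisting progress after each one.

    Returns the number of items handled in this run.
    """
    start = 0
    if state_file.exists():
        # Resume from the index recorded by a previous run.
        start = json.loads(state_file.read_text(encoding="utf-8"))["next_index"]
    handled = 0
    for index in range(start, len(items)):
        handle(items[index])  # an exception here leaves the checkpoint untouched
        state_file.write_text(json.dumps({"next_index": index + 1}), encoding="utf-8")
        handled += 1
    return handled
```

If `handle` raises partway through, rerunning the function with the same state file retries the failed item and continues from there rather than starting over.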
Performance
- Sub-second search response times
- Optimized database queries
- Elasticsearch for full-text search
- Caching strategies for frequently accessed data
Security
- WordPress authentication and authorization
- SQL injection prevention via parameterized queries
- Input sanitization and validation
- Admin-only access to sensitive operations
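The idea behind parameterized queries is that user input travels to the database as a bound value and is never spliced into the SQL text. The platform itself uses PHP and PostgreSQL; the sketch below uses Python's built-in `sqlite3` module purely as a stand-in so that it runs without a server, and the `products` table and its columns are hypothetical.

```python
import sqlite3
from typing import List, Tuple


def find_products_by_brand(conn: sqlite3.Connection, brand: str) -> List[Tuple[str, float]]:
    """Look up products by brand using a bound parameter, not string formatting."""
    cursor = conn.execute(
        "SELECT title, price FROM products WHERE brand = ? ORDER BY title",
        (brand,),  # the driver passes this as data, so quotes cannot alter the query
    )
    return cursor.fetchall()


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (title TEXT, brand TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?, ?)",
        [("Phone A", "Acme", 199.0), ("Phone B", "Acme", 249.0), ("Laptop C", "Other", 799.0)],
    )
    return conn
```

With this approach, an input such as `Acme' OR '1'='1` is treated as a literal brand name and matches no rows, whereas the same string concatenated into the SQL text would change the query's logic.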
Next Steps
Now that you understand what DealAI.lt is, continue with:
- Quick Start Guide - Get up and running quickly
- Architecture Overview - Understand the system design
- Technology Stack - Learn about the technologies used
- Installation & Setup - Detailed setup instructions
Need Help?
- Troubleshooting: See our Troubleshooting Guide
- API Reference: Check the API Documentation
- File Structure: Navigate via Directory Overview