Introduction to DealAI.lt

DealAI.lt is a Lithuanian e-commerce intelligence platform that aggregates, normalizes, and indexes data for over 60,000 products from multiple Lithuanian online retailers. It combines web scraping, real-time data processing, and Elasticsearch-based search into a single product discovery and price comparison system.

DealAI.lt serves as a centralized product intelligence hub that:

  • Aggregates product data from multiple Lithuanian e-commerce websites
  • Normalizes diverse product information into a unified database
  • Provides advanced search capabilities with Lithuanian language support
  • Tracks price changes and availability over time
  • Monitors data quality and freshness across all sources
  • Delivers business intelligence through comprehensive dashboards
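
The normalization step above can be illustrated with a minimal Python sketch. The `Product` fields, the `normalize` and `parse_price` helpers, and the raw record keys (`sku`, `title`, `price`) are all hypothetical names chosen for illustration, not the platform's actual schema. The price parser assumes the common Lithuanian format with a decimal comma and space-separated thousands.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Any, Dict


@dataclass(frozen=True)
class Product:
    """Hypothetical unified record; the real schema may differ."""
    retailer: str
    sku: str
    title: str
    price: Decimal


def parse_price(raw: str) -> Decimal:
    """Parse a price such as '1 299,99 €' (decimal comma, spaces as thousands)."""
    cleaned = (
        raw.replace("€", "")
        .replace("\u00a0", "")  # non-breaking spaces often appear in scraped HTML
        .replace(" ", "")
        .replace(",", ".")
    )
    return Decimal(cleaned)


def normalize(retailer: str, raw: Dict[str, Any]) -> Product:
    """Map one retailer-specific record onto the unified Product shape."""
    return Product(
        retailer=retailer,
        sku=str(raw["sku"]).strip().upper(),
        title=" ".join(str(raw["title"]).split()),  # collapse stray whitespace
        price=parse_price(str(raw["price"])),
    )
```

Using `Decimal` rather than `float` keeps prices exact, which matters once values are compared across retailers or over time.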

Elasticsearch-powered search engine with:

  • Full-text search across multiple fields (title, brand, description, SKU)
  • Lithuanian language analysis with snowball stemming
  • Fuzzy matching for typo tolerance
  • Multi-faceted filtering (price, category, brand, availability)
  • Real-time search suggestions and highlighting
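
As a rough illustration of how these search features map onto the Elasticsearch query DSL, the sketch below builds a request body that combines a fuzzy `multi_match` query with filters and title highlighting. The function name, the field names (`brand.keyword`, `price`, `in_stock`), and the title boost are assumptions; the platform's real index mapping is not shown in this introduction.

```python
from typing import Any, Dict, List, Optional


def build_search_query(
    text: str,
    brand: Optional[str] = None,
    max_price: Optional[float] = None,
    in_stock_only: bool = False,
) -> Dict[str, Any]:
    """Build an Elasticsearch request body: fuzzy full-text match plus filters."""
    filters: List[Dict[str, Any]] = []
    if brand is not None:
        filters.append({"term": {"brand.keyword": brand}})
    if max_price is not None:
        filters.append({"range": {"price": {"lte": max_price}}})
    if in_stock_only:
        filters.append({"term": {"in_stock": True}})

    return {
        "query": {
            "bool": {
                "must": {
                    "multi_match": {
                        "query": text,
                        # Assumed field names; the title is weighted higher.
                        "fields": ["title^3", "brand", "description", "sku"],
                        # AUTO scales the allowed edit distance with term length.
                        "fuzziness": "AUTO",
                    }
                },
                # Filters narrow results without affecting relevance scores.
                "filter": filters,
            }
        },
        "highlight": {"fields": {"title": {}}},
    }
```

Placing price, brand, and availability conditions in the `filter` clause rather than `must` lets Elasticsearch cache them and keeps scoring driven by the text match alone.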

Comprehensive price tracking system:

  • Historical price data with time-series analysis
  • Discount detection and alert capabilities
  • Price trend visualization with Chart.js
  • Availability monitoring across retailers
  • Competitive pricing insights
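
Discount detection over historical price data can be sketched in a few lines. This is a simplified illustration, not the platform's actual rule: the `detect_discount` name, the `(date, price)` history shape, and the 10% default threshold are assumptions, and only the two most recent observations are compared.

```python
from typing import List, Optional, Tuple


def detect_discount(
    history: List[Tuple[str, float]],
    threshold_pct: float = 10.0,
) -> Optional[float]:
    """Return the latest price drop in percent if it meets the threshold.

    `history` is a list of (ISO date, price) pairs ordered oldest to newest.
    Returns None when there is no qualifying drop.
    """
    if len(history) < 2:
        return None
    previous = history[-2][1]
    latest = history[-1][1]
    if previous <= 0:
        return None  # guard against bad data and division by zero
    drop_pct = (previous - latest) / previous * 100.0
    if drop_pct < threshold_pct:
        return None
    return round(drop_pct, 2)
```

A production rule would likely compare against a longer baseline, such as a rolling median, so that a retailer briefly raising a price before "discounting" it is not reported as a genuine deal.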

Enterprise-grade web scraping infrastructure:

  • Distributed Scrapyd architecture
  • Automated job scheduling via cron
  • Queue-based processing for reliability
  • Error handling and retry mechanisms
  • Real-time job monitoring and management
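
The retry mechanism mentioned above is commonly implemented as exponential backoff. The helper below is a generic sketch under that assumption; `run_with_retry` and its parameters are hypothetical and do not describe how the platform's Scrapyd jobs are actually retried. The `sleep` argument is injectable so the delay schedule can be verified without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retry(
    job: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `job`, retrying on any exception with exponentially growing delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Doubling the delay gives a struggling retailer site time to recover instead of hitting it again immediately, and re-raising after the final attempt lets the caller record the job as failed.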

Administrative interfaces for:

  • Scrapyd job monitoring and control
  • Elasticsearch synchronization tracking
  • Product catalog management
  • Analytics and reporting
  • System health monitoring

Automated screenshot system:

  • Product page capture for visual records
  • Change detection and comparison
  • Quality assurance tools
  • Historical visual archive
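
The simplest form of screenshot change detection compares content hashes of two captures. The sketch below assumes that approach purely for illustration; the function names are hypothetical, and the platform may use a different comparison method.

```python
import hashlib


def screenshot_fingerprint(image_bytes: bytes) -> str:
    """Return a SHA-256 hex digest identifying the exact image content."""
    return hashlib.sha256(image_bytes).hexdigest()


def has_changed(previous: bytes, current: bytes) -> bool:
    """Report whether two captures differ at the byte level."""
    return screenshot_fingerprint(previous) != screenshot_fingerprint(current)
```

Byte-level hashing flags any difference, including rendering noise such as rotating banners or timestamps. A perceptual hash or pixel-difference threshold would be a more tolerant alternative when only meaningful visual changes should be reported.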

This documentation is designed for:

  • Backend developers working with PHP, WordPress, and PostgreSQL
  • Frontend developers building user interfaces
  • DevOps engineers managing infrastructure
  • Integration developers connecting to APIs
  • System administrators managing the platform
  • Data administrators overseeing the product catalog
  • Operations teams monitoring performance
  • Business analysts using dashboards
  • Technical leads evaluating the architecture
  • Product managers planning features
  • Business stakeholders understanding capabilities
  • Compliance officers reviewing security

This documentation is organized into the following sections:

  • Learn about the platform architecture, technology stack, and setup process.
  • Deep dive into database architecture, Elasticsearch integration, web scraping, and data processing pipelines.
  • Detailed documentation of product search, price tracking, categories, and analytics.
  • User guides for administrative interfaces and control panels.
  • Technical reference for database functions, Elasticsearch API, and AJAX endpoints.
  • Comprehensive guide to the codebase organization and key files.
  • Best practices, debugging, performance optimization, and troubleshooting.
  • Database schema, configuration options, environment variables, and glossary.

Server requirements:

  • OS: Linux (Ubuntu 20.04+ / Debian 10+ recommended)
  • PHP: 8.0 or higher
  • Web Server: Apache 2.4+ or Nginx 1.18+
  • Memory: 4GB RAM minimum, 8GB recommended
  • Storage: 50GB minimum (grows with product data)

Database and search requirements:

  • PostgreSQL: 13.0 or higher
  • Elasticsearch: 7.x (7.10+ recommended)
  • Database Storage: 20GB minimum for 60K products

Scraping requirements:

  • Scrapyd: Latest stable version
  • Python: 3.8+ for scrapers
  • Cron: System-level job scheduling

Client requirements:

  • Modern web browser (Chrome, Firefox, Safari, Edge)
  • JavaScript enabled
  • Minimum 1024x768 screen resolution

DealAI.lt is built on several key principles:

  • Clear separation between data collection, storage, search, and presentation
  • Modular PHP includes for different functionalities
  • Reusable template parts for UI components
  • Batch processing for large datasets
  • Connection pooling for database efficiency
  • Strategic indexing for query performance
  • Distributed crawling architecture
  • Error handling throughout the stack
  • State management for resumable operations
  • Failed job tracking and retry logic
  • Comprehensive logging for debugging
  • Sub-second search response times
  • Optimized database queries
  • Elasticsearch for full-text search
  • Caching strategies for frequently accessed data
  • WordPress authentication and authorization
  • SQL injection prevention via parameterized queries
  • Input sanitization and validation
  • Admin-only access to sensitive operations
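
Two of these principles, batch processing and parameterized queries, combine naturally in a bulk insert. The sketch below uses Python's built-in `sqlite3` module as a stand-in so that it runs without a server; the platform itself uses PostgreSQL, and the `products` table, its columns, and the helper names are assumptions made for illustration.

```python
import sqlite3
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[str, float]  # (title, price); hypothetical column set


def batches(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield consecutive chunks of at most `size` rows."""
    batch: List[Row] = []
    for row in rows:
        batch.append(row)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial chunk


def insert_products(
    conn: sqlite3.Connection,
    rows: Iterable[Row],
    batch_size: int = 500,
) -> int:
    """Insert rows in batches using placeholders, never string formatting."""
    inserted = 0
    for batch in batches(rows, batch_size):
        # Values are bound by the driver, so quotes in titles cannot alter the SQL.
        conn.executemany(
            "INSERT INTO products (title, price) VALUES (?, ?)", batch
        )
        conn.commit()  # one commit per batch bounds transaction size
        inserted += len(batch)
    return inserted
```

With a PostgreSQL driver the placeholder syntax differs (for example `%s` in psycopg), but the principle is the same: the SQL text stays constant and untrusted values travel separately as parameters.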

Now that you understand what DealAI.lt is, continue with:

  1. Quick Start Guide - Get up and running quickly
  2. Architecture Overview - Understand the system design
  3. Technology Stack - Learn about the technologies used
  4. Installation & Setup - Detailed setup instructions