The model of the DanVit search project has now been developed

Obviously, the search engine should include the following modules:

  •     An effective web crawler that will regularly traverse the designated part of the Internet (web resources related to Sevastopol), avoiding duplicates and any SEO tricks. Everything downloaded (HTML, text, DOC, XML, PDF, Flash, graphics) will be stored in a database. Will it hold up? We'll see. (A minimal crawling sketch follows this list.)
  •     An indexer that analyzes the information collected by the crawler, cleans it up (again filtering out duplicates and all SEO tricks), ranks it, and prepares it for high-quality full-text search (see the indexing sketch below).
  •     The search engine proper, which accepts user queries in natural language (with a syntax as close as possible to Google, Yandex, etc.) and quickly returns relevant results (a query-ranking sketch also follows this list). The paradigm: "The perfect search engine understands exactly what the user means and returns exactly the results they want."
  •     A web-based user interface, simple and convenient. It will probably resemble the interfaces of the global search engines, so as not to confuse users.
  •     An admin interface for managing all system parameters, implemented as a web application.
  •     Additional modules: analytics, statistics, backup, optimization, testing, etc.
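
As a rough illustration of the crawler's job, here is a minimal sketch in Python: a breadth-first traversal restricted to a whitelist of hosts, with a visited set so the same URL is never downloaded twice. The seed URL and the allowed host are placeholders; the real crawl frontier, the handling of non-HTML content, and the database storage are not decided here.

```python
import urllib.parse
import urllib.request
from collections import deque
from html.parser import HTMLParser

# Hypothetical seed and host whitelist; the real list of Sevastopol-related
# resources would be configured through the admin interface.
SEEDS = ["http://example-sevastopol-site.test/"]
ALLOWED_HOSTS = {"example-sevastopol-site.test"}

class LinkExtractor(HTMLParser):
    """Collects href attributes from <a> tags on a downloaded page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def crawl(seeds, allowed_hosts, max_pages=100):
    """Breadth-first crawl restricted to allowed hosts, skipping repeats."""
    visited = set()
    queue = deque(seeds)
    pages = {}
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        if url in visited:
            continue  # avoid downloading the same URL twice
        visited.add(url)
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except Exception:
            continue  # unreachable or non-text resource; skip for now
        pages[url] = html  # the real system would store this in the database
        parser = LinkExtractor()
        parser.feed(html)
        for link in parser.links:
            absolute = urllib.parse.urljoin(url, link)
            host = urllib.parse.urlparse(absolute).netloc
            if host in allowed_hosts and absolute not in visited:
                queue.append(absolute)
    return pages
```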
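
The indexer's core data structure is, in essence, an inverted index mapping each term to the pages that contain it. A minimal sketch, assuming the crawler hands over a dict of URL to extracted text (the duplicate filtering, SEO cleanup, and ranking described above are left out):

```python
import re
from collections import defaultdict

def tokenize(text):
    """Naive tokenizer: lowercase word tokens. The real indexer would also
    strip HTML markup, remove stop words, and stem inflected word forms."""
    return re.findall(r"\w+", text.lower())

def build_index(pages):
    """Build an inverted index: term -> {url: term frequency in that page}."""
    index = defaultdict(lambda: defaultdict(int))
    for url, text in pages.items():
        for term in tokenize(text):
            index[term][url] += 1
    return index
```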
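
Finally, the query side can be illustrated with a simple tf-idf ranking over that inverted index. This is only a stand-in for the actual relevance model, which has not been chosen yet; the `tokenize` helper mirrors the one in the indexing sketch.

```python
import math
import re

def tokenize(text):
    """Same naive tokenizer as in the indexing sketch."""
    return re.findall(r"\w+", text.lower())

def search(index, total_docs, query, limit=10):
    """Rank pages by a summed tf-idf score for the query terms."""
    scores = {}
    for term in tokenize(query):
        postings = index.get(term, {})
        if not postings:
            continue  # term does not occur in any indexed page
        idf = math.log(total_docs / len(postings))
        for url, tf in postings.items():
            scores[url] = scores.get(url, 0.0) + tf * idf
    # Highest score first; the web interface would paginate this list.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:limit]
```

Chained together, crawl, build_index, and search form a toy end-to-end pipeline: far from the production design, but enough to show how the modules connect.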


The questions that follow concern the specific implementation: the choice of tools and studying the experience of colleagues.
