General outline of lucene.net engine usage with database and ORM problems

August 6, 2015

The question may look a bit confusing, but that is because of my lack of experience. I have worked through a few Lucene tutorials and have some basic knowledge. I am going to implement this engine in .NET as a simple application showing the most common usage:

SE – Search Engine

  • load existing data from file/database
  • put this into the Lucene store
  • put some values in Lucene's index
  • add data search (queries)
  • update/delete data

This usage looks like the most common one. Let's say we have some data set from the net. As I understand the whole process, if I wanted to create an Android (Java) or web app, it would go something like this:

  1. Create some database structure
  2. Perform an ETL process to transform the raw data set into database records (e.g. in a MySQL DBMS)
  3. Implement Lucene (Lucene.Net)
    Use a data access layer or ORM (ADO.NET/NHibernate) to transform database data into objects that Lucene understands. For example, does a Lucene Document correspond to a database table record, i.e. SE Document structure = database table structure? (I have never done any ORM mapping.)

If we have an existing relational database, do we need to create a new one that is easier for the SE to work with? I have never used ORM mapping, so I don't really know how it should be done. Let's say we have a basic forum with a simple relational database of users and their posts. If a user wants to search for a post, he retrieves the data through the SE. If he wants to add or delete a post, then (as I understand it) he does this directly against the database, without the SE. After adding or deleting data in the database, we have to inform the SE: update it (delete the current documents and add the whole database again from the beginning?), build a new index, and optimize it. I even wondered whether an application with an SE could ever exist without a database. I understand the SE has its own binary flat-file structure, but for users/posts data, would it be possible to avoid using any DBMS?
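To make the add/delete part of this flow concrete, here is a minimal sketch of the alternative to a full rebuild: after a post changes in the database, only that one post is touched in the index, keyed by its id. This is a toy in-memory index written for illustration, not the Lucene API; the class `ToyPostIndex` and its methods `upsert`, `delete`, and `search` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy stand-in for a search index (hypothetical, NOT a Lucene class).
class ToyPostIndex {
    // post id -> text currently indexed for that post
    private final Map<Integer, String> textById = new HashMap<>();
    // lower-cased term -> ids of the posts that contain it
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Split text into lower-cased alphanumeric terms.
    private static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    // Remove a single post from the index; all other posts stay as they are.
    void delete(int postId) {
        String old = textById.remove(postId);
        if (old == null) {
            return; // nothing indexed under this id
        }
        for (String term : tokenize(old)) {
            Set<Integer> ids = postings.get(term);
            if (ids != null) {
                ids.remove(postId);
                if (ids.isEmpty()) {
                    postings.remove(term);
                }
            }
        }
    }

    // Add or replace a single post: drop the old version (if any), then index the new one.
    void upsert(int postId, String text) {
        delete(postId);
        textById.put(postId, text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new HashSet<>()).add(postId);
        }
    }

    // Ids of the posts containing the term, in ascending order.
    Set<Integer> search(String term) {
        return new TreeSet<>(postings.getOrDefault(term.toLowerCase(Locale.ROOT), new HashSet<>()));
    }
}
```

In this sketch the application would call `upsert` right after inserting or editing a post row, and `delete` right after removing one. Lucene offers comparable per-document operations on its index writer (`updateDocument` and `deleteDocuments` in Java, `UpdateDocument` and `DeleteDocuments` in Lucene.Net), typically keyed on a unique id field, so re-adding the whole database after each change should not be necessary.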

I know it looks a bit messy, but the topic touches several different areas, and it is better to ask now than to waste time later because of a general misunderstanding.

I would appreciate any information from someone who has already faced this.

Thanks

EDIT:
Let's just say we want to test some useful SE usage. We will need a database with data to test it, so there will be some mapping to .NET objects or directly to Lucene Documents(?), which are later put into Lucene-specific storage.

One Response to “General outline of lucene.net engine usage with database and ORM problems”

  1. Using an ORM purely to retrieve data and add it to a Lucene index is overkill. Lucene indexes documents, which are themselves not much more than field/value pairs. You would be better off using ADO.NET directly, or a micro ORM, to get the data out and into Lucene documents ready for indexing.

    If your data is not already in a relational database, then you might also consider whether you need an RDBMS at all. Lucene can store data as well as index it.
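As a rough illustration of the two points above, the sketch below maps one database row straight to a document made of field/value pairs, with a per-field choice between "stored" (readable back from a search hit) and "indexed" (searchable). The classes `ToyField`, `ToyDocument`, and `PostRowMapper`, and the column names `id`, `author`, and `body`, are hypothetical stand-ins chosen for the example; they are not Lucene types.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index field: one name/value pair plus two flags.
class ToyField {
    final String name;
    final String value;
    final boolean stored;  // value can be read back from a search hit
    final boolean indexed; // value can be searched

    ToyField(String name, String value, boolean stored, boolean indexed) {
        this.name = name;
        this.value = value;
        this.stored = stored;
        this.indexed = indexed;
    }
}

// Hypothetical stand-in for an index document: just a list of fields.
class ToyDocument {
    final List<ToyField> fields = new ArrayList<>();

    ToyDocument add(ToyField field) {
        fields.add(field);
        return this;
    }

    // What a search hit could return without going back to the database.
    Map<String, String> storedValues() {
        Map<String, String> out = new LinkedHashMap<>();
        for (ToyField f : fields) {
            if (f.stored) {
                out.put(f.name, f.value);
            }
        }
        return out;
    }
}

class PostRowMapper {
    // Map one row (column name -> value), as a plain data reader might
    // return it, to a document. No ORM entity classes are involved.
    static ToyDocument toDocument(Map<String, Object> row) {
        return new ToyDocument()
            .add(new ToyField("id", String.valueOf(row.get("id")), true, true))
            .add(new ToyField("author", String.valueOf(row.get("author")), true, true))
            // Searchable but not stored: the database stays the source of truth for the text.
            .add(new ToyField("body", String.valueOf(row.get("body")), false, true));
    }
}
```

Here the body is indexed but not stored, which suits a setup where the database remains the source of truth and the stored id is used to fetch the full post. Without an RDBMS, the body would be marked as stored too, so that search hits carry the complete data. In Lucene itself this choice is made per field through options such as `Field.Store.YES` and `Field.Store.NO`.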
