Cloud Search & Analytical SQL Server

Overview

Cloud search provides advanced search and indexing of JSON documents or your MongoDB data without writing any data integration code. The search engine is built from the ground up in C++ and optimized for speed and for real-time updates and deletes to the posting-list index. It operates either as a standalone MongoDB-compatible database or as a replica instance that syncs all your data in real time by continuously reading the mongod oplog.

What is the difference between Cloud Search and native MongoDB search?

  • Cloud search offers more boolean operators, such as ANY, AND, ATLEAST, and NOT, and lets you combine them in ways that are not possible in MongoDB search.
  • Cloud search also supports proximity (ordered and unordered), prefix, and fuzzy search, as well as advanced boosting using expressions.
  • Cloud search does not restrict you to a single text index per collection.
  • Cloud search supports highlighting and snippet generation using postings-highlighter.
  • Cloud search can intersect, union, and complement several special (text and geo) and normal indexes together using in-memory sparse bitmap data structures. Combining indexes this way is a major contributor to search performance.
  • Cloud search supports multi-term and single-term type-ahead suggestions using prefix search.
  • Cloud search supports boosting and weights at query time rather than at index time as in MongoDB search, so you can boost one index over another, for example boosting the title field over the description field in an e-commerce search application.
  • Cloud search features are accessed using the $textx operator instead of the MongoDB $text operator.
  • Cloud search supports SQL SELECT instead of the MongoDB aggregation pipeline.
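
To make the index-combination point concrete, here is a minimal sketch of intersecting, unioning, and complementing index results as bitmaps. It is an illustration only, not the engine's implementation: it uses plain Python integers as dense bitmaps rather than sparse bitmaps, and the names (to_bitmap, to_doc_ids, text_hits, geo_hits, discontinued) and document ids are hypothetical.

```python
def to_bitmap(doc_ids):
    """Pack non-negative doc ids into an int, one bit per document."""
    bits = 0
    for doc_id in doc_ids:
        bits |= 1 << doc_id
    return bits


def to_doc_ids(bits):
    """Unpack a bitmap into a sorted list of doc ids."""
    ids = []
    doc_id = 0
    while bits:
        if bits & 1:
            ids.append(doc_id)
        bits >>= 1
        doc_id += 1
    return ids


# Assumed collection of ten documents with ids 0..9.
UNIVERSE = to_bitmap(range(10))

# Hypothetical results of three separate index lookups.
text_hits = to_bitmap([1, 2, 3, 5, 8])   # matched by a text index
geo_hits = to_bitmap([2, 3, 4, 8, 9])    # matched by a geo index
discontinued = to_bitmap([3, 9])         # matched by a normal attribute index

both = text_hits & geo_hits                   # intersection
either = text_hits | geo_hits                 # union
not_discontinued = UNIVERSE & ~discontinued   # complement within the collection

# "text AND geo AND NOT discontinued"
result = both & not_discontinued
print(to_doc_ids(result))
```

Each lookup produces one bitmap, so a query over several indexes reduces to a handful of bitwise operations regardless of which index types produced the inputs.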

How does Cloud search work with the MongoDB database?

Cloud search syncs your data from MongoDB by tailing the database oplog. All updates to your MongoDB database are automatically reflected in your search index within a few milliseconds.
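
The idea of applying oplog entries to a search index can be sketched as follows. This is a simplified model under stated assumptions, not the engine's code: the TinySearchIndex class is hypothetical, it indexes only an assumed "title" field, the entries are hand-written rather than read from a live oplog, and update entries are reduced to either a "$set" document or a full replacement (real update entry formats vary by server version).

```python
from collections import defaultdict


class TinySearchIndex:
    """Hypothetical in-memory inverted index kept in sync from oplog-style entries."""

    def __init__(self):
        self.docs = {}                    # _id -> current document
        self.postings = defaultdict(set)  # term -> set of _id

    def _terms(self, doc):
        # Assumption: only the "title" field is indexed, split on whitespace.
        return set(str(doc.get("title", "")).lower().split())

    def _unindex(self, doc_id):
        old = self.docs.pop(doc_id, None)
        if old is None:
            return
        for term in self._terms(old):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]

    def _index(self, doc):
        self.docs[doc["_id"]] = doc
        for term in self._terms(doc):
            self.postings[term].add(doc["_id"])

    def apply(self, entry):
        """Apply one entry: 'i' insert, 'u' update, 'd' delete."""
        op = entry["op"]
        if op == "i":
            self._unindex(entry["o"]["_id"])
            self._index(entry["o"])
        elif op == "u":
            doc_id = entry["o2"]["_id"]
            change = entry["o"]
            if "$set" in change:
                updated = dict(self.docs.get(doc_id, {"_id": doc_id}))
                updated.update(change["$set"])
            else:
                updated = {"_id": doc_id, **change}
            self._unindex(doc_id)
            self._index(updated)
        elif op == "d":
            self._unindex(entry["o"]["_id"])

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


# Hand-written entries standing in for what a tailing reader would receive.
oplog = [
    {"op": "i", "ns": "shop.items", "o": {"_id": 1, "title": "Red Shoes"}},
    {"op": "i", "ns": "shop.items", "o": {"_id": 2, "title": "Blue Shoes"}},
    {"op": "u", "ns": "shop.items", "o2": {"_id": 1}, "o": {"$set": {"title": "Red Boots"}}},
    {"op": "d", "ns": "shop.items", "o": {"_id": 2}},
]

index = TinySearchIndex()
for entry in oplog:
    index.apply(entry)

print(index.search("boots"), index.search("shoes"))
```

Removing a document's old terms before re-indexing it is what keeps the postings consistent after updates and deletes.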

The search engine uses only in-memory skip-list indexes to process queries. For regex queries, for example, the index keys are scanned, the regex is applied to each key, and a set union is performed over the sets associated with the matching keys to complete the query. The $textx and $text operators use an inverted index. Geospatial queries leverage Google's S2 library. All other attributes perform a B-tree search to find the skip list associated with each key. Logical operators execute on the skip lists that result from the B-tree scans.
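
The regex path described above can be sketched in a few lines. This is a rough illustration under assumptions: a plain dict stands in for the B-tree of keys, Python sets stand in for the per-key skip lists, and the index contents and the names city_index, status_index, and regex_lookup are hypothetical.

```python
import re

# Hypothetical index on a "city" attribute: key -> set of matching doc ids.
city_index = {
    "berlin": {1, 4},
    "bern": {2},
    "boston": {3, 5},
    "lisbon": {6},
}

# Hypothetical index on a "status" attribute.
status_index = {
    "active": {1, 3, 6},
    "archived": {2, 4, 5},
}


def regex_lookup(index, pattern):
    """Scan every key, apply the regex, and union the sets of matching keys."""
    compiled = re.compile(pattern)
    matched = set()
    for key in sorted(index):  # keys visited in order, as an ordered index would
        if compiled.search(key):
            matched |= index[key]
    return matched


starts_with_ber = regex_lookup(city_index, r"^ber")
ends_with_on = regex_lookup(city_index, r"on$")

# A logical AND then operates on the per-attribute results.
active_ber = starts_with_ber & status_index["active"]
print(sorted(starts_with_ber), sorted(ends_with_on), sorted(active_ber))
```

Because the regex is applied to index keys rather than to documents, the cost of the scan depends on the number of distinct keys, not on the number of documents.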

Cloud search as your primary database

Because cloud search is MongoDB-compatible and supports almost all CRUD operations (MongoDB map-reduce is not supported), it is possible to use cloud search as your only database. The search engine supports CRUD operations the same way MongoDB does. You could also keep only your search documents in the cloud search engine and your other documents in MongoDB.