NoSQL Database - MongoDB



What is NoSQL?



A NoSQL ("Not Only SQL") database is a non-relational database that provides a mechanism for storing and retrieving data modeled in forms other than the tables of a relational database, for example as documents or key-value pairs.

In its current sense, the term was popularized in 2009 by Johan Oskarsson and Eric Evans, who used it to describe open source, distributed, non-relational databases.


Why NoSQL?


1. One of the biggest reasons to use NoSQL is that you have a Big Data project to tackle, which is characterized by:
  • High data velocity – lots of data coming in very quickly, possibly from different locations.
  • Data variety – storage of data that is structured, semi-structured and unstructured.
  • Data volume – data that involves many terabytes or petabytes in size.
  • Data complexity – data that is stored and managed in different locations or data centers.


2. Nature of Data:

A traditional RDBMS is designed for structured data, whereas a commonly cited estimate is that around 80% of today's data is unstructured. The relational model normalizes data into tables and defines a rigid schema. A NoSQL data model is typically schema-less and can accept structured, semi-structured and unstructured data.

3. Performance:

In an RDBMS, data can be spread across any number of tables, which are then joined at query time. Joins degrade performance as the data grows: with terabytes of data or more, a query that joins several large tables can respond very slowly.
In NoSQL databases, joins are generally avoided. Data is retrieved by key, as key-value pairs, or as whole objects that already contain their related data, so a lookup does not require combining several tables, whatever the size of the data set.
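As a rough illustration of the difference, the sketch below uses plain Python dictionaries, not a real database or driver, to compare a join-style read across two "tables" with a single lookup of a document that embeds its related data. The names `users`, `orders` and `user_docs` are invented for this example.

```python
# Relational style: data is split across two "tables" and joined at read time.
users = [{"id": 1, "name": "Asha"}, {"id": 2, "name": "Ravi"}]
orders = [
    {"user_id": 1, "item": "book"},
    {"user_id": 1, "item": "pen"},
    {"user_id": 2, "item": "lamp"},
]

def orders_with_join(user_id):
    """Scan both tables and combine matching rows (a naive nested-loop join)."""
    return [
        {"name": u["name"], "item": o["item"]}
        for u in users if u["id"] == user_id
        for o in orders if o["user_id"] == u["id"]
    ]

# Document style: each user record embeds its own orders, keyed by user id.
user_docs = {
    1: {"name": "Asha", "orders": [{"item": "book"}, {"item": "pen"}]},
    2: {"name": "Ravi", "orders": [{"item": "lamp"}]},
}

def orders_from_document(user_id):
    """One key lookup returns the user together with the embedded orders."""
    doc = user_docs[user_id]
    return [{"name": doc["name"], "item": o["item"]} for o in doc["orders"]]

print(orders_with_join(1))
print(orders_from_document(1))
```

Both functions return the same rows; the second simply avoids scanning a separate orders table. The trade-off is that embedded data may be duplicated if the same order details are needed in more than one document.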


4. Scalability:

To deal with many concurrent users and large volumes of data, applications and their databases need to scale in one of two ways: scale up or scale out.

  1. Scale up:

Scaling up implies a centralized approach that relies on bigger and bigger servers: more resources are added to a single, larger machine. It is the approach traditionally associated with SQL databases.


  2. Scale out:

Scaling out is the approach associated with NoSQL databases. A distributed set of nodes, known as a cluster, is used in this architecture, and new nodes can be added to the cluster to split the data horizontally. The nodes are independent of one another, and data is partitioned across them, ideally evenly, through a process called 'sharding'.
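The idea of sharding can be sketched in a few lines. The example below is a simplified illustration using hash-modulo placement, not MongoDB's actual sharding mechanism; the function names `shard_for` and `distribute` and the `user0` ... `user999` keys are assumptions made for this example.

```python
import hashlib

def shard_for(key, num_shards):
    """Map a shard key to a shard number using a stable hash."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

def distribute(keys, num_shards):
    """Group keys by the shard they hash to."""
    shards = {i: [] for i in range(num_shards)}
    for key in keys:
        shards[shard_for(key, num_shards)].append(key)
    return shards

keys = [f"user{i}" for i in range(1000)]
for shard, members in distribute(keys, 4).items():
    print(shard, len(members))
```

Because the hash spreads keys roughly uniformly, each of the four shards ends up with a similar share of the 1000 keys. A known weakness of plain modulo placement is that changing the number of shards moves most keys, which is one reason production systems tend to use more elaborate schemes.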

MongoDB


MongoDB is an open source database developed by 10gen (now MongoDB, Inc.). It is an agile NoSQL database that allows schemas to change quickly as applications evolve.


MongoDB is a powerful, flexible and scalable general-purpose database. It combines the ability to scale out with features such as secondary indexes, range queries, sorting, aggregations etc.

It is a document-oriented database that replaces the concept of a "row" with a more flexible model, the "document".

A document-oriented approach makes it possible to represent complex hierarchical relationships in a single record.
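To make this concrete, here is a hypothetical blog post written as one nested Python dictionary, which is close in shape to a JSON document. The field names and values are invented for illustration.

```python
import json

# A hypothetical blog post stored as one document: the author and the
# comments are nested inside the post instead of living in separate tables.
post = {
    "title": "Why NoSQL?",
    "author": {"name": "Asha", "email": "asha@example.com"},
    "tags": ["nosql", "mongodb"],
    "comments": [
        {"user": "Ravi", "text": "Nice overview", "replies": [
            {"user": "Asha", "text": "Thanks!"},
        ]},
    ],
}

# Nested values are reached by walking the structure directly.
first_reply = post["comments"][0]["replies"][0]["text"]
print(first_reply)

# The whole hierarchy serializes to a single JSON record.
print(json.dumps(post, indent=2))
```

In a relational design the same data would likely need separate tables for posts, authors, tags, comments and replies.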


There are also no predefined schemas: a document's keys and values are not of fixed types or sizes. Without a fixed schema, adding or removing fields becomes very easy.
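A small sketch of what this flexibility looks like in practice, using a plain Python list of dictionaries as a stand-in for a collection. The `products` data is hypothetical, and this shows the idea only, not how MongoDB stores documents internally.

```python
# A "collection" here is just a list of dictionaries.
products = [
    {"name": "Laptop", "price": 900, "ram_gb": 16},
    {"name": "T-shirt", "price": 15, "size": "M", "color": "blue"},
]

# Adding a field to one document needs no schema migration.
products[0]["warranty_years"] = 2

# Removing a field is equally local to that document.
del products[1]["color"]

# Readers must tolerate missing fields, e.g. by supplying a default.
for p in products:
    print(p["name"], p.get("warranty_years", "n/a"))
```

The flip side, visible in the last loop, is that code reading the data has to cope with documents that lack a given field.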


Documents are represented in a JSON-like format and stored as BSON (binary JSON). Because a document maps naturally onto an object in an object-oriented language, data can be stored as objects directly, often without the object-relational mapping (ORM) layer that an RDBMS typically requires.


Features of MongoDB

MongoDB is intended to be a general-purpose database, so aside from creating, reading, updating and deleting data, it provides other features such as:


a. Indexing :

It supports generic secondary indexes, which allow a variety of fast queries, and provides unique and compound indexing capabilities as well.
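The benefit of a secondary index can be sketched with a dictionary that maps a field value to the positions of the matching documents. This is only a conceptual model: it uses a hash map where database indexes are usually tree-based, and the names `build_index`, `find_by_index` and the `people` data are invented for the example.

```python
from collections import defaultdict

people = [
    {"_id": 1, "name": "Asha", "city": "Pune"},
    {"_id": 2, "name": "Ravi", "city": "Delhi"},
    {"_id": 3, "name": "Meera", "city": "Pune"},
]

def find_by_scan(docs, field, value):
    """Without an index, every document has to be examined."""
    return [d for d in docs if d.get(field) == value]

def build_index(docs, field):
    """Build a secondary index: field value -> positions of matching documents."""
    index = defaultdict(list)
    for pos, d in enumerate(docs):
        if field in d:
            index[d[field]].append(pos)
    return index

def find_by_index(docs, index, value):
    """With an index, only the matching documents are touched."""
    return [docs[pos] for pos in index.get(value, [])]

city_index = build_index(people, "city")
print(find_by_index(people, city_index, "Pune"))
```

Both lookups return the same documents; the indexed version just avoids reading the ones that cannot match, at the cost of maintaining the index on every write.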


b. Aggregation:

MongoDB supports an "aggregation pipeline". It lets you build complex aggregations from simple stages: documents pass through the stages in order, and each stage filters, groups or reshapes the data it receives.
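The pipeline idea can be imitated in plain Python, with each stage as a function from a list of documents to a list of documents. The stage helpers below (`match`, `group_sum`, `sort_desc`) are invented for this sketch; they loosely mirror MongoDB's `$match`, `$group` and `$sort` stages but are not the MongoDB API, and the `orders` data is made up.

```python
from collections import defaultdict

orders = [
    {"customer": "Asha", "status": "paid", "amount": 40},
    {"customer": "Ravi", "status": "paid", "amount": 25},
    {"customer": "Asha", "status": "paid", "amount": 10},
    {"customer": "Ravi", "status": "cancelled", "amount": 99},
]

def match(field, value):
    """Stage that keeps only documents where field equals value."""
    return lambda docs: [d for d in docs if d.get(field) == value]

def group_sum(key, field):
    """Stage that groups by key and sums field within each group."""
    def stage(docs):
        totals = defaultdict(int)
        for d in docs:
            totals[d[key]] += d[field]
        return [{"_id": k, "total": v} for k, v in totals.items()]
    return stage

def sort_desc(field):
    """Stage that orders documents by field, largest first."""
    return lambda docs: sorted(docs, key=lambda d: d[field], reverse=True)

def run_pipeline(docs, stages):
    """Feed the output of each stage into the next one."""
    for stage in stages:
        docs = stage(docs)
    return docs

result = run_pipeline(orders, [
    match("status", "paid"),
    group_sum("customer", "amount"),
    sort_desc("total"),
])
print(result)
```

Here the cancelled order is dropped by the first stage, the remaining amounts are summed per customer, and the totals are returned largest first.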

c. Special Collection Types:

It supports time-to-live (TTL) collections for data that should expire at a certain time, such as session data that is only relevant for a specific time interval.
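The expiry rule itself is simple and can be sketched as follows. This is a stand-alone illustration in which the caller invokes the purge explicitly, whereas a database would apply such a rule for you; `purge_expired` and the `sessions` data are hypothetical names for this example.

```python
from datetime import datetime, timedelta

def purge_expired(docs, ttl, now):
    """Return only the documents whose age is still within the TTL."""
    return [d for d in docs if now - d["created_at"] < ttl]

# A fixed "current time" keeps the example deterministic.
now = datetime(2024, 1, 1, 12, 0, 0)
sessions = [
    {"user": "Asha", "created_at": now - timedelta(minutes=5)},
    {"user": "Ravi", "created_at": now - timedelta(hours=2)},
]

live = purge_expired(sessions, ttl=timedelta(minutes=30), now=now)
print([s["user"] for s in live])
```

With a 30-minute TTL, the session created five minutes ago survives and the one created two hours ago is dropped.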

d. File Storage:

It can store large files together with their metadata (through GridFS), and a deployment can grow to hold very large amounts of data, up to petabytes.

e. Built-in Replication for High Availability:

MongoDB provides high availability with replica sets. A replica set consists of two or more copies of the data held on different nodes. If one of the nodes goes down because of an error, the remaining replicas are used to serve the data.
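The following toy model shows why keeping several copies helps. It is deliberately simplified and is not how MongoDB implements replica sets: it has no primary node, no elections and no catch-up for a node that rejoins. The `Node` and `ReplicaSet` classes are invented for this sketch.

```python
class Node:
    """One member of the set, holding its own copy of the data."""
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.alive = True

class ReplicaSet:
    """Toy model: writes are copied to every live node; reads use any live node."""
    def __init__(self, names):
        self.nodes = [Node(n) for n in names]

    def write(self, key, value):
        for node in self.nodes:
            if node.alive:
                node.data[key] = value

    def read(self, key):
        for node in self.nodes:
            if node.alive:
                return node.data.get(key)
        raise RuntimeError("no live node available")

rs = ReplicaSet(["node-a", "node-b", "node-c"])
rs.write("greeting", "hello")
rs.nodes[0].alive = False          # simulate the first node going down
print(rs.read("greeting"))         # the read falls through to a surviving node
```

The read succeeds even though the first node is down, because the other two nodes hold the same value. Only when every node is unavailable does the read fail.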


Industries Where it is Used

There are various domains where MongoDB is used:

1. Financial Services

  • Risk Analytics
  • Reference data management
  • Time series Data


2. Media and Entertainment

  • Content management and delivery
  • User data management
  • Mobile and social applications


3. Health care

  • Electronic Health Records (EHR)
  • Mobile applications for doctors


4. Government

  • Surveillance data aggregation
  • Crime data management and analytics
  • Healthcare record management


Most of the data in use today is unstructured. Relational databases handle unstructured and semi-structured data poorly, and can store even structured data only up to a certain limit, which is why the need for NoSQL databases arose. MongoDB is a NoSQL database that overcomes these problems of relational databases and works with all kinds of data at high speed.