sub-title

Also check Orama's Quora and Orama's GitHub
I shall not claim to know so much, but only that I learn new things everyday

Friday, 10 June 2022

NoSQL (Not only SQL)

Introduction

When I woke up this morning, I thought I should write a little about something that I have not yet had the opportunity to use, and this is none other than NoSQL. This is a very interesting subject, especially for those who are already in the SQL space, me inclusive.

To some, it may sound unusual that I should be writing about something that I have not used. On the contrary, that should be the absolute motivation for me to write about NoSQL.

Since I currently do not use NoSQL (but I use SQL a lot), one way for me to keep abreast with developments in the NoSQL space is to write and read about it, and also try practicing it. That is a proven way to learn any new technology.

It is a matter of time that I will one day be faced with a requirement to use NoSQL database. This is precisely the reason for this post.

Note: SQL (the old cousin of NoSQL) is second nature to me. I have used it since time immemorial, and I continue to use it as a matter of routine. By the time I finish writing this post, I will have already learnt how to do some basic queries on a NoSQL database. No mean achievement!


A little background to Database Management System (DBMS)

A Database Management System (DBMS) is a software package that allows us to store and manage data electronically. The data may be structured (in rows and columns) or unstructured (free-format). There are many types of DBMSs, but a typical classification of DBMSs is in terms of Relational (or SQL) and Non-Relational (NoSQL).

Relational Database Management System (RDBMS) stores tabular or structured data in tables that are related (relations). It uses SQL as a language to manipulate the database. Examples are: Oracle, MySQL, MS SQL Server, MS Access, PostgreSQL, MariaDB, SQLite, etc.

Non-Relational Database Management System (NRDBMS) or NoSQL Database stores non-tabular data in a way different from Relational Databases. It provides flexible schema and scale easily with large amounts of data and high user loads. Examples are: MongoDB, CouchDB, Cassandra, etc.


The advent of NoSQL

NoSQL stands for “Not Only SQL”, right?. It is used for non-relational databases, and it emerged in the late 2000s following growing need to manage large amounts of unstructured data.

The main types of NoSQL databases are:

1. Document (JSON or JSON-like format, XML, etc). Document data model maps naturally (like ORM) to objects in application code, so it is easy for developers to adapt.
2. Key-value (unique keys for easy retrieval)
3. Wide-column (flexible tabular format that allows variable data in dynamic columns)
4. Graph (nodes as main objects, and edges for relationships)

NoSQL is a run-way from the complex, normalized data model that is characteristic of the traditional relational database. With NoSQL, the attention shifts away from database complexity, but possibly at a cost to data integrity.

Because of reduced database complexity, queries run faster and take less resources compared to the traditional SQL queries. Also, Developer cost may reduce by extension (and Developer productivity may increase), while storage requirements increase since duplication is common and left unattended to.

NoSQL database can easily scale horizontally (scale-out) and vertically unlike the relational counterpart that scales only vertically (scales-up). Scaling is easy because NoSQL does away with table relationships. Scaling is possible because the database can be split into partitions among multiple servers. Each document has a key (_id) whose HASH value falls within a range that designates the server where the particular document is stored. The keyspace (range of keys) helps to identify where to store or read an item (document). If the database is small, then the whole database collection is stored on only one server.

NoSQL makes it possible to store unstructured data, eliminating the need to define the database schema in advance. NoSQL is schemaless. Contrast with SQL database, where changing structure after loading data is challenging, requiring database migration, as well as multiple teams across development, DBA, and Ops to coordinate changes; and also resulting into downtime and possible data loss. NoSQL database doesn't suffer these challenges.

Knowledge of the traditional SQL database, dictionary data structure, JSON, OOP, ORM can be helpful in understanding and implementing the various types of NoSQL databases.


Wrap-up on NoSQL

As already highlighted above, NoSQL is used for unstructured data. Possible use-cases are situations where you have large amounts of unstructured data. This can cut across industry, e.g. healthcare, financial, social-media, etc.

Clearly, my time will come (may be sooner than later) that I will be required to use NoSQL. What a modest way to prepare for that time!

As promised in the Introduction, I am already able to write some basic NoSQL queries (MongoDB query language in this case) such as:

-- users is a collection of documents (compare to table of rows in SQL). Collection is table.
-- documents are synonymous with records, but are independent of each other
-- user_name is a key (just like a field in SQL)


-- insert document
db.users.insert({
user_id: 1,
user_name: 'richardorama'
})


-- find all documents
db.users.find()


-- find a given document
db.users.findOne({user_name: "richardorama"})


-- update document
db.users.update(
{ user_name: 'richardorama'},
{ $set: { notes: "first user"} },
{ multi: false } -- To update multiple documents, set a parameter 'multi' to true
)


-- transaction
session.startTransaction();
db.users.insert ({
user_id: 2,
user_name: 'anotheruser',
age: 20
})
-- reduce age of anotheruser by 2
db.user.update (
{ user_name: { $eq: 'anotheruser', } },
{ $inc: { age: -2 } }
})
session.commitTransaction();


-- JOINS
-- there are two ways to perform joins: using the $lookup operator and de-normalization

-- lookup (like left join user to products)
db
.users.aggregate([{$lookup: { from: "products", localField: "product_id", foreignField: "_id", as: "products" } }])


-- de
-normalization: note that product and customer are de-normalized, otherwise they would be separate collections
{ "_id": 49854, "product": { "name": "45' Yacht", "price": "250000", "description": "A luxurious oceangoing yacht." }, "customer": { "name": "John Q. Millionaire", "address": "1947 Mt. Olympus Dr.", "city": "Los Angeles", "state": "CA", "zip": "90046" }, "quantity": 3, "notes": "Three 45' Yachts for John Q. Millionaire. One for the east coast, one for the west coast, one for the Mediterranean". }

... just like an ORM query.

Now we can see that with prior ORM knowledge, NoSQL query is quite easy for developers to adapt to.

Safe journey to the NoSQL Land.

No comments:

Post a Comment