How to handle many to many relationship in MongoDB

MongoDB is one type of NoSQL database (Not Only SQL). MongoDB was generated by 10gen in 2009 to bridge the gap between key-value stores (which are fast and scalable) and relational databases (which have rich functionality). Its name is from “Humongos”. It is a document oriented NoSQL database, which is in JSon like format.

The terminology comparison between MongoDB and Relational database

MongoDB can set up relationship between collections (tables) through reference or embed. Let us see how it handle many to many relationship through an example.

First we set up 3 collection using student course data.

  • course collection
    var coursedata = [
    {_id:”a”, coursename:”English”, cost: 200 },
    {_id:”b”, coursename:”Big Data”, cost: 600 },
    {_id:”c”, coursename:”Business”, cost: 400 },
    {_id:”d”, coursename:”Biology”, cost: 300 }
    ];
    db.course.insert (coursedata);

  • studentcourse collection
    var studentcoursedata =[
    {sid:1, cid : “a”},
    {sid:1, cid : “b”},
    {sid:2, cid : “d”},
    {sid:3, cid : “a”},
    {sid:3, cid : “c”},
    {sid:4, cid : “b”},
    {sid:4, cid : “c”},
    ];
    db.studentcourse.insert(studentcoursedata);

  • student collection var studentdata =[
    {_id:1, studentname:”John” },
    {_id:2, studentname:”Mary” },
    {_id:3, studentname:”Adam” },
    {_id:4, studentname:”Terry” }
    ];
    db.student.insert (studentdata);

Now, we have a question such as “who take course of big data?”

In RDBMS, you would use join or subquery to solve the issue. Since MongoDB did not support multiple collection join yet, I would try the subquery approach first.

  • method 1 subquery
    Let us write SQL to show the logic

select studentname
from student
where _id in (select sid from studentcourse where cid in (select _id from course where coursename = “Big Data” ))

Unlike RDBMS, there is no one step solution, we have to use variable in MongoDB

//1. get course big data info, this is one to one relation, use findOne, notice courseid = ‘b’
var course_BigData = db.course.findOne({coursename:”Big Data”});
course_BigData

//2. get studentid because this is one to many relation, need to use loop to put all sid into one array for later use (this step identify studentid is 1 and 4)
var StudentCourses = db.studentcourse.find({cid:course_BigData._id});
var studentIDs = [];
while (StudentCourses.hasNext() == true) {
var StudentCourse = StudentCourses.next();
studentIDs.push(StudentCourse.sid);
}
studentIDs

//3. get studentname list, so John and Terry take the big data course
db.student.find({_id:{$in:studentIDs}}, {studentname:1, _id:0});

  • Method 2 join
    Again let us write SQL to show logic
    select studentname
    from student s
    inner join studentcourse sc on s._id = sc.sid
    inner join course c on sc.cid = c._id where c.coursename = ‘Big Data’
  1. From MongoDB 3.2, it starts to support join using aggregation $lookup. it works like pipeline. In brief, $match fiter course collection with coursename = ‘Big Data’, $lookup join course and studentcourse collection with _id and cid and embed the join result as studentcourse_docs. Since I only need sid, I $project this filed. Because it is in the array format, I use $unwind to unwind the array
  2. extract the sid as an array for later use, this is similar to previous method

  3. same as last method

//1 var studentcoursedocs = db.course.aggregate([
{$match: { coursename: “Big Data” }} ,
{
$lookup:
{
from: “studentcourse”,
localField: “_id”,
foreignField: “cid”,
as: “studentcourse_docs”
}
},
{ $project : { _id: 0, “studentcourse_docs.sid” : 1 } },
{ $unwind : “$studentcourse_docs”}
]);
//2
var studentIDs = [];
while (studentcoursedocs.hasNext() == true) {
var studentcoursedoc = studentcoursedocs.next();
studentIDs.push(studentcoursedoc.studentcourse_docs.sid);
}
studentIDs;

Hope this post can help someone, :)

Wenlei

Written on May 20, 2017