Relationships

2022-04-15

2022-11-12

MongoDB

Association (similar to SQL)

In SQL, an association may be one-to-many, and the "many" side may contain zero, one, or many records. In MongoDB, the "many" side is stored as an array, which may likewise be empty. The entities being associated in MongoDB are documents in a collection.

There are one-to-one, one-to-many, and many-to-many associations in MongoDB. Associations are further divided into embedded and referenced.

Embedded: another document is embedded directly in the current document and becomes one of its Object fields.

_id: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx")
name: "derek"
address: Object
  city: "sydney"
  address: "01234 Yesenia Loop"
  postcode: 3456

The above is the one-to-one case. In the one-to-many case, the embedded documents form an array, each element of which is an object.

_id: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx")
name: "derek"
address: Array
  0: Object
    city: "sydney"
    address: "01234 Yesenia Loop"
    postcode: 3456
  1: Object
    city: "brisbane"
    address: "166 adelaide street"
    postcode: 4000
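
As a minimal sketch, the two embedded shapes above can be modelled as plain Python dictionaries. No MongoDB driver is involved here; the variable names and the `cities_of` helper are hypothetical and only illustrate the layout.

```python
# One-to-one: the address is a nested object inside the student document.
student_one_to_one = {
    "name": "derek",
    "address": {"city": "sydney", "address": "01234 Yesenia Loop", "postcode": 3456},
}

# One-to-many: the addresses form an array of nested objects.
student_one_to_many = {
    "name": "derek",
    "address": [
        {"city": "sydney", "address": "01234 Yesenia Loop", "postcode": 3456},
        {"city": "brisbane", "address": "166 adelaide street", "postcode": 4000},
    ],
}


def cities_of(student):
    """Return the cities of a student, whichever embedded shape is used."""
    address = student["address"]
    # Normalise the one-to-one shape (a single object) into a list.
    addresses = address if isinstance(address, list) else [address]
    return [a["city"] for a in addresses]
```

Either way, a single read of the student document returns the address data with it, which is the main appeal of embedding.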

In the many-to-many case, embedding is more complicated than referencing. Because the relationship is many-to-many, the same embedded object is copied into multiple documents, so modifying it means scanning the entire collection to update every copy. For this reason, many-to-many associations generally use the referenced form.
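
The update cost can be sketched with in-memory lists and dictionaries standing in for collections. This is an illustration under assumed data, not driver code; the function names and the short `"a1"` identifier are hypothetical.

```python
# Hypothetical data: the same address is embedded in two student documents.
students_embedded = [
    {"name": "derek", "address": [{"city": "sydney", "postcode": 3456}]},
    {"name": "alice", "address": [{"city": "sydney", "postcode": 3456}]},
]


def update_embedded(students, city, new_postcode):
    """Embedded form: every student must be visited to find all the copies."""
    updated = 0
    for student in students:
        for address in student["address"]:
            if address["city"] == city:
                address["postcode"] = new_postcode
                updated += 1
    return updated


# Referenced form: the address is stored once; students hold only its _id.
addresses = {"a1": {"city": "sydney", "postcode": 3456}}
students_referenced = [
    {"name": "derek", "address": ["a1"]},
    {"name": "alice", "address": ["a1"]},
]


def update_referenced(address_collection, address_id, new_postcode):
    """Referenced form: one write updates the single shared document."""
    address_collection[address_id]["postcode"] = new_postcode
    return 1
```

In the embedded version the number of writes grows with the number of students sharing the address, while the referenced version always writes once.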

Referenced: one document stores the _id of another, associated document. In the one-to-one case:

Reference

_id: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx")
name: "derek"
address: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxx111")

In another collection

_id: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx")
city: "sydney"
address: "01234 Yesenia Loop"
postcode: 3456
student: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx")

In a one-to-many case, the _ids of multiple other documents will form an array, each element being an _id.

Reference

_id: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx")
name: "derek"
address: Array
  0: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx1")
  1: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx2")

In another collection

_id: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx1")
city: "sydney"
address: "01234 Yesenia Loop"
postcode: 3456
student: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx")
------------------------------------------------
_id: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx2")
city: "brisbane"
address: "166 adelaide street"
postcode: 4000
student: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx")
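
Reading a referenced association takes a second lookup to turn each stored _id back into a document. The following is a rough in-memory sketch of that step, assuming dictionaries keyed by _id in place of real collections; `resolve_addresses` and the short identifiers are hypothetical.

```python
# Hypothetical in-memory "collections", keyed by _id for simplicity.
addresses = {
    "a1": {"_id": "a1", "city": "sydney", "postcode": 3456, "student": "s1"},
    "a2": {"_id": "a2", "city": "brisbane", "postcode": 4000, "student": "s1"},
}
student = {"_id": "s1", "name": "derek", "address": ["a1", "a2"]}


def resolve_addresses(student_doc, address_collection):
    """Return a copy of the student with each address _id replaced by its document."""
    resolved = dict(student_doc)  # shallow copy; the stored student is untouched
    resolved["address"] = [address_collection[_id] for _id in student_doc["address"]]
    return resolved
```

The stored student keeps only the _ids; the full address documents are attached at read time.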

The many-to-many case is the same, except that the documents on the other side also use an array to store the _ids of their associated documents.

Reference

_id: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx")
name: "derek"
address: Array
  0: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx1")
  1: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx2")

In another collection

_id: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx1")
city: "sydney"
address: "01234 Yesenia Loop"
postcode: 3456
student: Array
  0: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx")
  1: ObjectId("yyyyyyyyyyyyyyyyyyyyyyyyy")
---------------------------------------------
_id: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx2")
city: "brisbane"
address: "166 adelaide street"
postcode: 4000
student: Array
  0: ObjectId("xxxxxxxxxxxxxxxxxxxxxxxxxxx")
  1: ObjectId("yyyyyyyyyyyyyyyyyyyyyyyyy")

Two-way binding and one-way binding in Reference

If you start from document1 (such as a student) and only need to reach document2 (such as an address), and never need to know which document1 (student) a given document2 (address) belongs to, a one-way association is enough.

But if you also want to look up document1’s information starting from document2, you need two-way binding.
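
The difference shows up when going from an address back to its students. Below is a small sketch with assumed data; the function names are hypothetical, and the "scan" stands in for a query against the student collection.

```python
students = [
    {"_id": "s1", "name": "derek", "address": ["a1", "a2"]},
    {"_id": "s2", "name": "alice", "address": ["a2"]},
]

# One-way: the address knows nothing about students.
address_one_way = {"_id": "a2", "city": "brisbane"}
# Two-way: the address also stores the _ids of its students.
address_two_way = {"_id": "a2", "city": "brisbane", "student": ["s1", "s2"]}


def students_via_scan(address_id, student_collection):
    """One-way binding: search the students to find who points at the address."""
    return [s["_id"] for s in student_collection if address_id in s["address"]]


def students_via_back_reference(address_doc):
    """Two-way binding: read the answer straight from the address document."""
    return list(address_doc["student"])
```

Both approaches give the same answer; two-way binding trades an extra field that must be kept in sync for a cheaper reverse lookup.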

Normalization and de-Normalization

De-normalization means duplication, such as copying another document's data into the current document in the embedded form; normalization is the opposite. In practice, it is common to copy only a small piece of another document, such as the city of an address, without copying the other address details.

The advantage is that reads do not need populate (which fetches the associated document in a separate, performance-intensive operation), so queries are faster. The disadvantage is that whenever the address is updated, the copied information must be updated too. The trade-off depends on whether users read or write the data more often.
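
The read and write sides of that trade-off might look like the sketch below. The `address_city` field name and both helper functions are assumptions made for illustration; the data lives in plain Python structures rather than a database.

```python
addresses = {
    "a1": {"_id": "a1", "city": "sydney", "address": "01234 Yesenia Loop", "postcode": 3456},
}
# The student keeps a reference plus a de-normalized copy of the city only.
students = [{"_id": "s1", "name": "derek", "address": "a1", "address_city": "sydney"}]


def city_of(student_doc):
    """Read path: no second lookup is needed for the copied field."""
    return student_doc["address_city"]


def update_city(address_id, new_city, address_collection, student_collection):
    """Write path: update the source document and every copy of the field."""
    address_collection[address_id]["city"] = new_city
    updated_copies = 0
    for s in student_collection:
        if s["address"] == address_id:
            s["address_city"] = new_city
            updated_copies += 1
    return updated_copies
```

Reads touch one document, but each city change now costs one write per copy in addition to the write on the address itself.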

One-to-millions relationship

An extreme example of one-way binding.

MongoDB limits a single document to 16 MB, and some other databases have smaller limits. For example, suppose there is one data document and one million sensor documents. The association cannot be stored as an array inside a single document, because it would take up too much space. Instead, each sensor should be bound one-way to the data document.
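
To get a feel for the numbers, here is a rough estimate of how large an array of ObjectId references becomes. It assumes the standard BSON layout (an array is stored as a document keyed "0", "1", and so on, and an ObjectId takes 12 bytes) and ignores every other field in the document; the function names are hypothetical.

```python
MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024  # MongoDB's 16 MB document limit
OBJECT_ID_BYTES = 12


def objectid_array_bytes(count):
    """Estimate the BSON size of an array holding `count` ObjectId values.

    Each element costs 1 type byte + the index key as a NUL-terminated
    string + 12 bytes for the ObjectId itself.
    """
    elements = sum(1 + len(str(i)) + 1 + OBJECT_ID_BYTES for i in range(count))
    return 4 + elements + 1  # int32 length prefix + elements + trailing NUL


def fits_in_one_document(count):
    """True if the array alone stays under the document size limit."""
    return objectid_array_bytes(count) < MAX_BSON_DOCUMENT_BYTES
```

Under these assumptions a thousand references fit comfortably, while a million do not, even before any other fields are counted. Keeping the reference on the "million" side avoids the problem because each of those documents stores only one _id.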

Embedded and referenced usage scenarios
1. If you can embed, do so, especially for one-to-one relationships.
2. For one-to-few, consider embedding, but it depends on the scenario. If you expect the "few" side never to grow into the hundreds, embedding is fine.
3. For one-to-hundreds or more, consider using references.
4. For one-to-millions, consider one-way references.
5. If reads far outnumber writes, consider de-normalization: copy frequently read data to improve query efficiency.
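
The rules of thumb above can be written down as a small decision helper. This is only a sketch: the function name is hypothetical, and the numeric cut-offs (100 and 1,000,000) are illustrative assumptions standing in for "hundreds" and "millions", not limits defined by MongoDB.

```python
def suggest_model(related_count, read_heavy=False):
    """Return (strategy, denormalize) following the rules of thumb above.

    The numeric cut-offs are illustrative assumptions, not MongoDB limits.
    """
    if related_count <= 1:
        strategy = "embed"
    elif related_count < 100:
        strategy = "embed if the count stays small"
    elif related_count < 1_000_000:
        strategy = "reference"
    else:
        strategy = "one-way reference"
    # Rule 5: read-heavy workloads are candidates for de-normalization.
    return strategy, read_heavy
```

For example, `suggest_model(500, read_heavy=True)` suggests references plus copying the frequently read fields.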

Index

The purpose of an index is to make sorting and lookups faster. Two common kinds are indexes that keep a field in sorted order and unique indexes that enforce uniqueness.

By default, a collection has only the _id index, which guarantees uniqueness.
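
Both ideas can be illustrated with a toy index built on Python's `bisect` module. This is a simplified model, not how MongoDB implements indexes internally; the `SortedIndex` class is hypothetical.

```python
import bisect


class SortedIndex:
    """Keeps (value, _id) pairs ordered so lookups avoid scanning every document."""

    def __init__(self, unique=False):
        self.unique = unique
        self.entries = []  # always sorted list of (value, _id)

    def find(self, value):
        """Return the _ids whose indexed field equals value, using binary search."""
        i = bisect.bisect_left(self.entries, (value,))
        ids = []
        while i < len(self.entries) and self.entries[i][0] == value:
            ids.append(self.entries[i][1])
            i += 1
        return ids

    def insert(self, value, _id):
        """Add an entry, rejecting duplicates when the index is unique."""
        if self.unique and self.find(value):
            raise ValueError(f"duplicate key: {value!r}")
        bisect.insort(self.entries, (value, _id))
```

Because the entries stay sorted, results come back in order without a separate sort, and a unique index can reject a duplicate value at insert time.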

Aggregation

An aggregation pipeline transforms raw data step by step, similar to middleware; each step is called a stage. A stage can filter, modify, or add to the data and then pass the result on to the next stage.

The same operations can be implemented entirely on the application server, but performance is much worse, because databases are designed to manipulate data and Node.js is not. Data manipulation should therefore be done on the database side as much as possible. In addition, if everything is processed on the server, a lot of data has to be transferred, much of it useless. The best approach is to process the data on the database side first and then pass the result to the server.
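
The stage idea can be sketched as a chain of functions, each taking a list of documents and returning a new list. The `match` and `group_count` helpers below are hypothetical and only loosely modelled on MongoDB's `$match` and `$group` stages; everything runs in memory on assumed data.

```python
from collections import defaultdict


def match(predicate):
    """Stage that keeps only the documents satisfying the predicate."""
    def stage(docs):
        return [d for d in docs if predicate(d)]
    return stage


def group_count(key):
    """Stage that counts documents per distinct value of `key`."""
    def stage(docs):
        counts = defaultdict(int)
        for d in docs:
            counts[d[key]] += 1
        return [{"_id": k, "count": n} for k, n in sorted(counts.items())]
    return stage


def run_pipeline(docs, stages):
    """Feed the output of each stage into the next one."""
    for stage in stages:
        docs = stage(docs)
    return docs


addresses = [
    {"city": "sydney", "postcode": 3456},
    {"city": "brisbane", "postcode": 4000},
    {"city": "brisbane", "postcode": 4001},
    {"city": "melbourne", "postcode": 3000},
]
result = run_pipeline(
    addresses,
    [match(lambda d: d["postcode"] >= 4000), group_count("city")],
)
```

Four documents go in and a single summary row comes out, which is why running such stages next to the data saves transferring rows that would only be thrown away.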

Transaction

A transaction combines several data operations into one unit. In a bank transfer, for example, adding money to one account and subtracting it from another belong in a single transaction. This guarantees that the operations either all succeed or all fail together. Transaction support can be added later.
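
The all-or-nothing behaviour can be sketched with an in-memory dictionary of balances and a snapshot used for rollback. This only illustrates the idea; it is not MongoDB's transaction API, and the `transfer` function and account names are hypothetical.

```python
import copy


def transfer(accounts, source, target, amount):
    """Move `amount` between accounts so that both updates apply or neither does."""
    snapshot = copy.deepcopy(accounts)  # remembered state for rollback
    try:
        accounts[target] += amount
        if accounts[source] < amount:
            raise ValueError("insufficient funds")
        accounts[source] -= amount
    except Exception:
        # Roll back: restore the state from before the transaction started.
        accounts.clear()
        accounts.update(snapshot)
        raise
```

The credit is deliberately applied before the balance check so that a failure happens halfway through; the rollback then removes the credit again instead of leaving money added on one side only.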

Derek Zhu
