Detailed Explanation of Docker's Five Storage Driver Principles

Docker originally used AUFS as its file system. Thanks to AUFS's layering concept, multiple containers can share the same image. However, AUFS was never merged into the mainline Linux kernel and was only readily available on Ubuntu, so for compatibility a pluggable storage-driver mechanism was introduced in Docker 0.7. At present, Docker supports five storage drivers: AUFS, Btrfs, Device mapper, OverlayFS, and ZFS. As Docker's official documentation states, no single driver suits every application scenario; Docker's performance can only be improved effectively by selecting the appropriate driver for each scenario, and making that choice well requires understanding how the drivers work. This article explains the principles behind Docker's five storage drivers and compares their application scenarios and IO performance. Before discussing the drivers themselves, let's look at the two techniques they all rely on: copy-on-write and allocate-on-demand.

Copy-on-write (CoW)

Copy-on-write (CoW) is the technique used by all of the drivers: a file is copied only when it needs to be written, i.e. it handles the case of modifying an existing file. For example, when multiple containers are started from one image, giving each container its own full copy of the image's file system would waste a great deal of disk space. With CoW, all containers share the image's file system and read their data directly from the image; only when a container is about to write to a file is that file copied from the image into the container's own file system and modified there. No matter how many containers share the image, writes always go to each container's private replica, and the source file in the image is never modified. If several containers modify the same file, each gets its own replica in its own file system; the replicas are isolated from one another, so the containers do not affect each other. CoW therefore makes very effective use of disk space.
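The mechanism above can be sketched as a toy Python model (the `Container` class and file names here are invented for illustration, not Docker's actual implementation): containers share one read-only image layer and copy a file up into their own writable layer only on first write.

```python
class Container:
    def __init__(self, image):
        self.image = image        # shared, read-only layer (path -> data)
        self.upper = {}           # this container's private writable layer

    def read(self, path):
        # the writable layer shadows the image layer
        return self.upper.get(path, self.image[path])

    def write(self, path, data):
        # copy-up: on first write the file content comes from the image,
        # is modified, and the result lands only in this container's layer
        self.upper[path] = self.read(path) + data

image = {"/etc/app.conf": "color=blue\n"}
c1, c2 = Container(image), Container(image)

c1.write("/etc/app.conf", "size=big\n")
assert c1.read("/etc/app.conf") == "color=blue\nsize=big\n"  # c1's replica
assert c2.read("/etc/app.conf") == "color=blue\n"            # c2 reads the image
assert image["/etc/app.conf"] == "color=blue\n"              # source untouched
```

However many containers share the image, only the writers pay for a copy, and each writer pays only for the files it touches.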

Allocate on demand

Allocate-on-demand applies when a file does not yet exist: space is allocated only when a new file is actually written, which improves the utilization of storage resources. For example, starting a container does not pre-allocate disk space for it; new space is allocated on demand as new files are written.
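The same idea can be observed directly at the file-system level with a sparse file: declaring a large size records the logical size but allocates no blocks until data is actually written. A minimal sketch:

```python
import os
import tempfile

# declare a large file without writing any data: the size is recorded,
# but disk blocks are only allocated when data is actually written
f = tempfile.NamedTemporaryFile(delete=False)
f.truncate(100 * 1024 * 1024)     # logical size: 100 MB
f.close()

st = os.stat(f.name)
allocated = st.st_blocks * 512    # st_blocks counts 512-byte units
print(st.st_size, allocated)      # logical size far exceeds allocated space
os.unlink(f.name)
```

On a typical Linux file system the allocated space stays near zero until real writes land, which is exactly the behavior allocate-on-demand gives containers.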

AUFS

AUFS (AnotherUnionFS) is a union file system and a file-level storage driver. It can transparently overlay one or more existing file systems, merging multiple layers into a single file-system view. Put simply, it mounts different directories under the same virtual file system, where files can be overridden and modified layer by layer. However many read-only layers sit below, only the topmost file system is writable. When a file needs to be modified, AUFS uses CoW to copy it from a read-only layer up into the writable layer and modifies it there; the result is also stored in the writable layer. In Docker, the lower read-only layers are the image and the writable layer is the container. The structure is shown in the figure below:
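The layered lookup and copy-up can be sketched with ordinary directories standing in for AUFS branches (the directory names and `lookup`/`modify` helpers are invented for this toy model):

```python
import os
import shutil
import tempfile

root = tempfile.mkdtemp()
# search order: writable top layer first, then read-only image layers
layers = [os.path.join(root, d) for d in ("container", "image1", "image0")]
for d in layers:
    os.mkdir(d)

def put(layer, name, text):
    with open(os.path.join(layer, name), "w") as fh:
        fh.write(text)

put(layers[2], "lib.conf", "v1")   # bottom image layer
put(layers[1], "lib.conf", "v2")   # higher image layer shadows it

def lookup(name):
    for layer in layers:           # top-down search: first hit wins
        path = os.path.join(layer, name)
        if os.path.exists(path):
            with open(path) as fh:
                return fh.read()
    raise FileNotFoundError(name)

def modify(name, extra):
    # copy-up: copy the file into the writable layer, change it there
    put(layers[0], name, lookup(name) + extra)

assert lookup("lib.conf") == "v2"  # union view shows the highest copy
modify("lib.conf", "+patched")
merged = lookup("lib.conf")        # union view now shows the copy-up
with open(os.path.join(layers[1], "lib.conf")) as fh:
    ro_copy = fh.read()            # read-only layer is untouched
shutil.rmtree(root)
```

The deeper the layer a file lives in, the longer the top-down search takes, which is one reason AUFS can slow down with many layers.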

Overlay

Overlay has been supported by the Linux kernel since 3.18 and is also a union file system. Unlike AUFS, overlay has only two layers: an upper file system and a lower file system, representing Docker's container layer and image layer respectively. When a file needs to be modified, CoW copies it from the read-only lower layer into the writable upper layer for modification, and the result is stored in the upper layer. In Docker, the read-only lower layer is the image and the writable upper layer is the container. The structure is shown in the figure below:
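Beyond copy-up, a two-layer union must also handle deletion: a file in the read-only lower layer cannot be removed, so it is masked with a whiteout marker in the upper layer (real overlayfs uses special character devices for this; a sentinel object stands in here, and the helper names are invented):

```python
WHITEOUT = object()

lower = {"/bin/tool": "v1", "/etc/cfg": "defaults"}   # read-only image layer
upper = {}                                            # writable container layer

def read(path):
    if path in upper:
        if upper[path] is WHITEOUT:
            raise FileNotFoundError(path)             # masked by whiteout
        return upper[path]
    return lower[path]

def unlink(path):
    # can't delete from the read-only lower layer: mask it instead
    upper[path] = WHITEOUT

unlink("/bin/tool")
upper["/etc/cfg"] = "tuned"        # copy-up + modify lands in upper

try:
    read("/bin/tool")
    deleted = False
except FileNotFoundError:
    deleted = True

assert deleted                                        # appears deleted
assert read("/etc/cfg") == "tuned"                    # upper shadows lower
assert lower == {"/bin/tool": "v1", "/etc/cfg": "defaults"}  # lower intact
```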

Device mapper

Device mapper has been supported by the Linux kernel since 2.6.9. It provides a mapping framework from logical devices to physical devices, on top of which users can implement their own storage-resource management strategies. While the AUFS and OverlayFS drivers discussed above are file-level storage, Device mapper is block-level storage: every operation acts directly on blocks rather than files. The Device mapper driver first creates a resource pool on a block device, then creates a base device with a file system on that pool. Images are snapshots of the base device, and containers are snapshots of images, so the file system seen inside a container is a snapshot of the base device's file system, with no space pre-allocated for the container. When a new file is written, a new block is allocated within the container's snapshot and the data is written there; this is allocate-on-demand. When an existing file is modified, CoW allocates block space in the container's snapshot, copies only the blocks to be modified into it, and modifies them there. By default, the Device mapper driver creates a 100 GB sparse file containing all images and containers, and limits each container to a 10 GB volume; both sizes can be configured. The structure is shown in the figure below:
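Block-level CoW is what distinguishes this driver: a snapshot shares the base device's blocks and copies only the individual block being modified, never the whole file spanning it. A toy sketch (the `Snapshot` class and block size are illustrative, not Device mapper's real data structures):

```python
BLOCK = 4096

# "base device": 256 blocks of 4 KB each (~1 MB of shared data)
base = [bytes([i]) * BLOCK for i in range(256)]

class Snapshot:
    def __init__(self, origin):
        self.origin = origin
        self.delta = {}            # block number -> private modified copy

    def read_block(self, n):
        # unmodified blocks are read straight from the shared origin
        return self.delta.get(n, self.origin[n])

    def write(self, n, offset, data):
        if n not in self.delta:
            self.delta[n] = bytearray(self.origin[n])   # copy ONE block
        self.delta[n][offset:offset + len(data)] = data

snap = Snapshot(base)
snap.write(5, 0, b"XY")            # touch 2 bytes of the ~1 MB device

copied = sum(len(b) for b in snap.delta.values())
assert copied == BLOCK                      # only one 4 KB block was copied
assert bytes(snap.read_block(5))[:2] == b"XY"
assert snap.read_block(4) is base[4]        # untouched blocks still shared
```

This is why a tiny modification to a huge file is cheap at the block level: the copy cost is proportional to the blocks touched, not the file size.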

Btrfs

Btrfs, billed as the next-generation copy-on-write file system, is included in the mainline Linux kernel. It is also file-level storage, but like Device mapper it can operate directly on the underlying device. Btrfs can configure part of a file system as a complete sub-file-system, called a subvolume. Using subvolumes, one large file system can be divided into several sub-file-systems that share the underlying device's space, each allocating from the underlying device when it needs disk space, much like an application calling malloc() to allocate memory. To use device space flexibly, Btrfs divides disk space into chunks, and each chunk can use a different allocation policy: for example, some chunks store only metadata while others store only data. This model has many advantages; Btrfs supports dynamically adding devices, for instance, so after installing a new disk you can add it to the file system with a Btrfs command. Btrfs treats a large file system as a resource pool, configures it into multiple complete sub-file-systems, and can also add new sub-file-systems to the pool. The base image is a snapshot of a sub-file-system; each child image and each container has its own snapshot, all of which are subvolume snapshots.
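The chunked, malloc()-like allocation described above can be sketched roughly as follows (the `ChunkPool` class, chunk size, and alternating data/metadata split are simplifications invented for illustration; real Btrfs chunk sizing is more involved):

```python
class ChunkPool:
    CHUNK = 256 * 1024 * 1024                  # illustrative chunk size

    def __init__(self):
        self.chunks = []

    def add_device(self, size):
        # new disk space is carved into typed chunks for the pool
        kinds = ["data", "metadata"]
        for i in range(size // self.CHUNK):
            self.chunks.append({"kind": kinds[i % 2], "free": self.CHUNK})

    def alloc(self, kind, size):
        # malloc()-like: take space from any chunk of the matching type
        for c in self.chunks:
            if c["kind"] == kind and c["free"] >= size:
                c["free"] -= size
                return c
        raise MemoryError(f"no {kind} chunk with {size} bytes free")

pool = ChunkPool()
pool.add_device(1024 * 1024 * 1024)            # 1 GB disk -> 4 chunks
meta = pool.alloc("metadata", 4096)
assert meta["kind"] == "metadata"              # served from a metadata chunk

before = len(pool.chunks)
pool.add_device(512 * 1024 * 1024)             # hot-add a second disk
assert len(pool.chunks) == before + 2          # pool simply gains chunks
```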

When a new file is written, a new data block is allocated within the container's snapshot and the file is written into that space; this is allocate-on-demand. When an existing file is modified, CoW allocates new space, writes the changed data there, and then updates the relevant metadata to point to the new data; the old data and snapshots are no longer referenced and can be overwritten.

ZFS

ZFS is a revolutionary file system that fundamentally changes how file systems are managed. ZFS abandons "volume management" entirely: instead of creating virtual volumes, it gathers all devices into a single storage pool and manages physical storage through this "storage pool" concept. In the past, file systems were built on single physical devices, and "volume management" was introduced to manage multiple devices and provide data redundancy by presenting the image of a single device. ZFS instead builds on virtual storage pools called zpools. Each pool consists of several virtual devices (vdevs), which can be raw disks, RAID 1 mirrors, or multi-disk groups at non-standard RAID levels. The file systems on a zpool can then use the total storage capacity of these virtual devices.
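A quick sketch of how pool capacity aggregates across vdevs (the vdev kinds and sizes here are made-up examples; real zpool accounting also reserves space for metadata):

```python
def vdev_capacity(kind, disks):
    if kind == "mirror":
        return min(disks)          # mirrored disks hold identical copies
    return sum(disks)              # e.g. a plain stripe of raw disks

# one 500 GB raw disk striped with a mirror of two 1000 GB disks
vdevs = [("raw", [500]), ("mirror", [1000, 1000])]

zpool = sum(vdev_capacity(kind, disks) for kind, disks in vdevs)
assert zpool == 1500               # file systems share this pooled total
```

Every file system created on the pool draws from this shared total rather than owning a fixed slice of any one device.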

Now let's look at how Docker uses ZFS. First, a ZFS file system is allocated from the zpool for the base layer of the image; the other image layers are clones of snapshots of that file system (a snapshot is read-only, a clone is writable). When a container is started, a writable layer is created on top of the image, as shown in the figure below:

When a new file is written, allocate-on-demand kicks in: a new data block is allocated from the zpool, the data is written into it, and the new space is tracked by the container's ZFS clone.

When an existing file is modified, copy-on-write allocates new space, copies the original data into it, and the modification is completed there.

AUFS vs. Overlay

Both AUFS and overlay are union file systems, but AUFS has many layers while overlay has only two, so a copy-on-write of a large file that lives in a low layer may be slower on AUFS. Moreover, overlay has been merged into the mainline Linux kernel while AUFS has not, which may also make overlay faster. However, overlay is still young and should be used cautiously in production. As Docker's first storage driver, AUFS has a long history, is relatively stable, has been proven in many production deployments, and has strong community support. The open-source DC/OS currently specifies overlay.

Overlay vs. Device mapper

Overlay is file-level storage and Device mapper is block-level storage. When a file is very large but the modification is small, overlay copies the whole file regardless of how little is changed, and copying a large file obviously takes longer than copying a small one; block-level storage copies only the blocks that need modification, not the entire file, so in this scenario Device mapper is clearly faster. Because block-level storage accesses the logical disk directly, it suits IO-intensive scenarios. Overlay performs relatively better for programs with complex internal logic and high concurrency but little IO.
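The difference is easy to quantify with back-of-envelope arithmetic; assuming Device mapper's default 64 KB thin-pool block size (configurable via dm.blocksize), modifying 2 bytes of a 1 GB file costs:

```python
FILE_SIZE = 1 * 1024**3            # a 1 GB file in the image layer
BLOCK = 64 * 1024                  # assumed 64 KB thin-pool block size

# file-level driver (aufs/overlay): copy-up duplicates the whole file
file_level_copy = FILE_SIZE
# block-level driver (devicemapper): only the touched block is copied
block_level_copy = BLOCK

ratio = file_level_copy // block_level_copy
assert ratio == 16384              # ~16000x more data moved at file level
```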

Device mapper vs. Btrfs vs. ZFS

Both Device mapper and Btrfs operate directly on blocks and do not support shared storage, which means that when multiple containers read the same file, multiple copies of it must be kept, so these drivers are not suitable for high-density container PaaS platforms. Moreover, frequently starting and stopping many containers may fill the disk and leave the host unable to work, so Device mapper is not recommended for production. Btrfs, however, can be very efficient for docker build.

ZFS was originally designed for Solaris servers with large amounts of memory, and it consumes a lot of memory in use, so it suits environments with plenty of RAM. ZFS's CoW also aggravates fragmentation: for a large file produced by sequential writes, later random modifications leave the file's physical layout on disk no longer contiguous, so subsequent sequential reads become slower. ZFS supports multiple containers sharing a single cache block, which suits PaaS and other high-density container scenarios.

IO performance test

Test tool: iozone (a file-system benchmark tool that can test read and write performance on different operating systems).

Test scenario: sequential and random IO performance on file sizes from 4 KB to 1 GB.

Test method: start a container on each storage driver, install iozone in the container, and run the test command.

Definitions of the test items

Write: tests the performance of writing to a new file.

Rewrite: tests the performance of writing to an existing file.

Read: tests the performance of reading an existing file.

Reread: tests the performance of re-reading a recently read file.
