Definition and context
Containers are intended to solve one problem: how to keep software running correctly when it moves from one runtime environment to another.
Containers and Docker remain prominent topics in technology. Containerizing stateless services is a prevailing trend, and it has prompted a debate: should the MySQL database be containerized too?
Overview
Supporters often argue from the general advantages of containers without validating their views against specific business scenarios. Opponents cite factors such as performance and data safety and provide examples of scenarios where containerizing MySQL may not be appropriate. The following summarizes several reasons why Docker may be unsuitable for running MySQL.
Data safety
Do not store data inside a container's own filesystem. Containers can be stopped or removed at any time, and when a container is removed, the data inside it is lost. To avoid this, users can mount volumes so that the data lives outside the container.
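As a minimal sketch of mounting a volume, the Python helper below builds a `docker run` command that keeps the MySQL data directory on a named volume. The volume name `mysql-data`, the container name `mysql-demo`, the placeholder password, and the `mysql:8.0` image tag are illustrative assumptions, not values from the discussion above; `/var/lib/mysql` is the data directory used by the official MySQL image.

```python
import shlex


def mysql_run_command(volume="mysql-data", name="mysql-demo", image="mysql:8.0"):
    """Build a `docker run` argv that stores MySQL data on a named volume.

    All default values are hypothetical and chosen only for illustration.
    """
    return [
        "docker", "run", "-d",
        "--name", name,
        # Mount the named volume at the data directory of the official image,
        # so the data outlives the container itself.
        "-v", f"{volume}:/var/lib/mysql",
        # Placeholder credential (assumption); supply a real secret elsewhere.
        "-e", "MYSQL_ROOT_PASSWORD=change-me",
        image,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(mysql_run_command()))
```

Removing the container with `docker rm` leaves the named volume in place, so a replacement container started with the same `-v` argument sees the same data files.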
However, Docker volumes are designed to provide persistence by keeping data outside the union filesystem's image layers; they do not by themselves guarantee data safety in every failure mode. If a container crashes before the database has shut down cleanly, data can be corrupted. In addition, sharing data volumes among a group of containers may increase wear on the host's physical hardware.
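For planned stops (not crashes), one partial safeguard is to give the database enough time to shut down before Docker kills it. `docker stop` sends the container's stop signal first and force-kills the process only after a grace period, which defaults to 10 seconds. The sketch below builds such a command with a longer grace period; the 120-second value and the container name are assumptions chosen for illustration.

```python
def graceful_stop_command(name, grace_seconds=120):
    """Build a `docker stop` argv with an extended grace period.

    `grace_seconds` is a hypothetical value; a suitable figure depends on
    how long the database needs to flush its state to disk.
    """
    if grace_seconds <= 0:
        raise ValueError("grace_seconds must be positive")
    # -t sets how many seconds Docker waits before force-killing the container.
    return ["docker", "stop", "-t", str(grace_seconds), name]


if __name__ == "__main__":
    print(" ".join(graceful_stop_command("mysql-demo")))
```

This does nothing for a host failure or an out-of-memory kill, which is why the data-safety concern above still stands.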
Performance
MySQL, as a relational database, is I/O intensive. When a single host runs multiple instances, their I/O demands add up and can create I/O bottlenecks, significantly reducing MySQL read/write performance.
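The way I/O demand adds up can be illustrated with simple arithmetic. The sketch below sums the peak IOPS of several instances and compares the total with what one shared device can sustain. All figures are hypothetical and do not come from any measurement.

```python
def io_utilization(instance_iops, device_iops):
    """Return (total demand, utilization ratio) for instances sharing one device."""
    if device_iops <= 0:
        raise ValueError("device_iops must be positive")
    total = sum(instance_iops)
    return total, total / device_iops


if __name__ == "__main__":
    # Hypothetical figures: four MySQL instances on a device rated at 10,000 IOPS.
    demand, utilization = io_utilization([3000, 2500, 4000, 3500], 10_000)
    print(f"demand={demand} IOPS, utilization={utilization:.0%}")
    # prints: demand=13000 IOPS, utilization=130%
```

A ratio above 1.0 means the instances cannot all reach their peak at the same time, and each one sees slower reads and writes than it would on its own device.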
In a panel on Docker application challenges, a bank architect noted: "Database performance bottlenecks typically occur at the I/O layer. If multiple Docker instances follow the same approach, those I/O requests will still converge on the storage. Many internet databases use a share-nothing architecture, which may be a reason not to migrate to Docker."
There are strategies to mitigate this issue, for example:
- Separate database binaries from data: run the database binaries in containers and place data on shared storage. If a container or the MySQL service fails, a new container can be started against the same data. Storing critical data on a host filesystem that is shared directly with containers is generally not recommended, because it increases risk to the host.
- Run lightweight or distributed databases in containers: these databases tolerate the replacement of a single node, which matches how Docker platforms typically handle service failures, namely by starting a new container automatically rather than repeatedly restarting the service inside the same container.
- Place high-I/O databases on dedicated hosts: for applications with high I/O requirements, deploying the database on physical machines or full virtual machines is often more appropriate than using containers. Some large-scale distributed database deployments are run directly on physical machines rather than on Docker.
Statefulness
Horizontal scaling in Docker is mainly suitable for stateless compute services, not for databases. Rapid scaling depends on statelessness: a stateless container can be started or discarded at any time without losing anything. Services that maintain state are therefore not suited to being placed directly inside containers without a separate storage service. If a database is containerized, its storage must be provided independently of the container.
Resource isolation
Docker does not provide the same level of resource isolation as full virtual machines. Docker uses cgroups to limit resource usage; these limits cap how much a container can consume, but they do not reserve resources for it, so other processes on the host can still occupy them. If other applications on the host consume excessive resources, MySQL performance inside a container will be affected.
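To make the distinction concrete, the sketch below assembles a few `docker run` resource flags: `--memory`, `--cpus`, `--device-read-iops`, and `--device-write-iops`. Each one sets an upper bound; none reserves capacity. The device path and every numeric value are assumptions for illustration, and whether the I/O limits take effect depends on the host's cgroup configuration.

```python
def resource_limit_flags(memory="4g", cpus=2.0, device="/dev/sda",
                         read_iops=2000, write_iops=1000):
    """Build `docker run` flags that cap a container's resource usage.

    Every default here is hypothetical. These are ceilings only: they stop
    this container from exceeding the limits, but they do not stop other
    processes on the host from using up the same resources first.
    """
    return [
        "--memory", memory,                               # hard memory ceiling
        "--cpus", str(cpus),                              # CPU time ceiling
        "--device-read-iops", f"{device}:{read_iops}",    # read IOPS ceiling
        "--device-write-iops", f"{device}:{write_iops}",  # write IOPS ceiling
    ]


if __name__ == "__main__":
    print(" ".join(["docker", "run", "-d", *resource_limit_flags(), "mysql:8.0"]))
```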
The more isolation required, the greater the resource overhead. Compared with a dedicated environment, one advantage of Docker is easy horizontal scaling; however, as noted, that scaling is primarily applicable to stateless services rather than databases.
When can MySQL run in containers?
MySQL is not categorically unsuitable for containerization. Possible scenarios where containerized MySQL can be appropriate include:
- Workloads that tolerate occasional data loss (for example, some search or caching services) can be sharded across more container instances to raise throughput.
- Lightweight or distributed databases that are designed for container environments can benefit from container orchestration, which starts new containers when services fail rather than repeatedly restarting services inside the same container.
- Databases managed by middleware and container orchestration systems that provide automatic scaling, failover, and multi-node redundancy can be containerized, provided storage and state management are handled outside the container.
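The sharding mentioned in the first scenario can be sketched as a routing function that maps each key to one of N instances. This is a minimal hash-modulo scheme written for illustration, not the method of any particular middleware; the function name and the example key are hypothetical.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index in the range [0, shard_count)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # Use a stable hash so every process routes a given key the same way
    # (Python's built-in hash() is randomized per process for strings).
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


if __name__ == "__main__":
    # Route a hypothetical key across four containerized instances.
    print(shard_for("user:1001", 4))
```

With plain modulo routing, changing the shard count remaps most keys. That is tolerable for the loss-tolerant workloads described above but is a poor fit for data that must be preserved.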