Alibaba Cloud's Canal Mesh Claims to Outperform Rivals, But Is It Only for Chinese Customers?
Alibaba Cloud, the Chinese cloud giant, has claimed its in-house Kubernetes service mesh, Canal Mesh, significantly outperforms Istio, the open source mesh originally developed by Google, IBM, and Lyft, and other competing tools. The company unveiled Canal Mesh at last week's Association for Computing Machinery SIGCOMM conference in Sydney, Australia, presenting both a detailed paper and a demonstration.
The presentation focused on the challenges of connecting Kubernetes pods in microservices environments, highlighting the reliance on service meshes and their associated proxy "sidecars" for managing network communication and collecting traffic telemetry. Alibaba Cloud argues that sidecars introduce various problems, including intrusion into user pods, excessive resource consumption, overhead in managing numerous sidecars, and performance degradation due to traffic routing through the sidecar.
The company conducted a case study analysing the resource usage of Istio in a Kubernetes cluster with 500 nodes and 15,000 pods, finding it consumed 1,500 cores and 5,000 gigabytes of memory, a substantial 10% of the hardware resources. In other scenarios, Alibaba Cloud observed that the sidecar's CPU and memory demands sometimes even exceeded those of the application itself.
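As a rough check on those numbers, the sketch below works out the per-pod cost they imply. It assumes one sidecar per pod and that the 10% share applies equally to CPU and memory; neither assumption is stated in the case study as reported here, and the function and variable names are illustrative only.

```python
# Figures quoted from Alibaba Cloud's Istio case study.
NODES = 500
PODS = 15_000
SIDECAR_CORES = 1_500
SIDECAR_MEMORY_GB = 5_000
SIDECAR_SHARE_PERCENT = 10  # "10% of the hardware resources"


def sidecar_footprint():
    """Derive per-sidecar and implied cluster-wide figures.

    Assumptions (not from the source): exactly one sidecar per pod, and
    the 10% share holds for both CPU and memory.
    """
    return {
        "pods_per_node": PODS / NODES,
        "cores_per_sidecar": SIDECAR_CORES / PODS,
        "memory_gb_per_sidecar": SIDECAR_MEMORY_GB / PODS,
        # Implied totals: sidecar usage divided by its share of the cluster.
        "implied_cluster_cores": SIDECAR_CORES * 100 / SIDECAR_SHARE_PERCENT,
        "implied_cluster_memory_gb": SIDECAR_MEMORY_GB * 100 / SIDECAR_SHARE_PERCENT,
    }


if __name__ == "__main__":
    for name, value in sidecar_footprint().items():
        print(f"{name}: {value:.2f}")
```

Under those assumptions, each sidecar costs about a tenth of a core and a third of a gigabyte, which looks small per pod but adds up across 15,000 pods.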
Recognising these issues, Google introduced Ambient Mesh in 2022, an Istio data plane mode aimed at reducing the reliance on sidecars. While Ambient Mesh improved performance and resource usage, it still required some proxies to reside within the user cluster.
Alibaba Cloud, however, believes complete decoupling of the service mesh from user clusters is the most effective solution. To demonstrate this, they developed Canal Mesh, positioning it as a superior alternative to Istio and Ambient Mesh.
Their research paper claims Canal Mesh achieved the following impressive results:
Throughput: 12.3x and 2.3x higher than Istio and Ambient, respectively, with latency 1.7x and 1.3x lower.
CPU Consumption: 12x-19x and 4.6x-7.2x lower than Istio and Ambient, respectively.
Configuration Time: 1.5x-2.1x and 1.2x-1.5x faster than Istio and Ambient, respectively, when creating hundreds of pods.
Southbound Bandwidth: 9.8x and 4.6x lower than Istio and Ambient, respectively.
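To put the "x lower" multipliers in more familiar terms, the following sketch converts them to percentage reductions. It assumes "N times lower" means the baseline value divided by N, which is the usual reading but is not spelled out in the claims above; the helper name is hypothetical.

```python
def percent_reduction(factor):
    """Convert an 'N times lower' factor into a percentage reduction.

    Assumption: 'N times lower' means new_value = baseline / N,
    so the reduction is (1 - 1/N) * 100.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return (1 - 1 / factor) * 100


# Multipliers quoted in the paper's claims (Canal Mesh vs. each baseline).
CLAIMED_FACTORS = {
    "latency vs Istio": 1.7,
    "latency vs Ambient": 1.3,
    "CPU vs Istio (low end)": 12,
    "CPU vs Istio (high end)": 19,
    "CPU vs Ambient (low end)": 4.6,
    "CPU vs Ambient (high end)": 7.2,
    "southbound bandwidth vs Istio": 9.8,
    "southbound bandwidth vs Ambient": 4.6,
}

if __name__ == "__main__":
    for label, factor in CLAIMED_FACTORS.items():
        print(f"{label}: {factor}x lower is about {percent_reduction(factor):.1f}% less")
```

Read this way, a 12x cut in CPU consumption is a reduction of roughly 92%, while a 1.3x cut in latency is a reduction of roughly 23%.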
These figures are achieved by moving proxies out of the user cluster while retaining a minimal on-node proxy for security and observability tasks. Canal Mesh also uses eBPF-based kernel bypass and remote mTLS acceleration, and draws on Alibaba Cloud's hyperscale capabilities to place proxies strategically across its resource pools.
While the paper and presentation state that Canal Mesh has been operational within Alibaba Cloud for a year, they do not confirm whether it is currently in production. Furthermore, both sources omit any mention of code availability for review or implementation. However, the presentation does include contact information for those interested in learning more.
It remains to be seen whether Alibaba Cloud will make Canal Mesh available to the wider community. If the company keeps it exclusively for its own cloud, Canal Mesh could give it a significant performance advantage over competitors. Even so, given Alibaba Cloud's focus on the Chinese market, the mesh is unlikely to be widely adopted outside the country, where rivals AWS, Google, and Azure hold a strong presence. Alibaba Cloud does compete with these companies in some markets, but its customer base outside China consists primarily of companies with existing ties to China, which makes them more comfortable with Alibaba Cloud than buyers from other regions.
The future of Canal Mesh remains uncertain, but its impressive claims and innovative approach to service mesh architecture will undoubtedly be closely watched by the industry.