Overview
Today, organizations both state-owned and private talk frequently about large models. According to a research team at Communications First, however, relatively few are actually building them: many are waiting, unsure how to proceed, or merely talking about it.
Major internet companies (Baidu, Alibaba, Tencent) have mature large models and are among the first national-level contributors.
Core requirements for large models
Developing large models requires three core elements: data, models, and compute. Among central state-owned enterprises, telecom operators are well positioned because the three major carriers hold massive, native data; model size can be scaled to the hundreds of billions of parameters; and they have in-house IDC capacity to provide baseline compute. Thus, carriers have structural advantages. The question is not whether they can build models, but how to build, optimize, and scale them effectively.
Organization
China Telecom established China Telecom Artificial Intelligence Technology Co., Ltd. in August 2023 to expand into the artificial intelligence domain. The company is positioned to manage and operate the group’s big data and AI capability development. China Telecom is integrating internal and external resources, pursuing in-house algorithm and model research, and developing layered services including atomic services, industry applications, platform management, and intelligent hardware product systems to support scaled deployment and application of AI products and solutions.
Personnel
According to Communications First’s internal sources at China Telecom, the new AI unit currently has a research team of over 200 people. Although the team size is still growing, the group has allocated additional headcount to support expansion, indicating organizational commitment. China Telecom emphasizes in-house development of high-end technologies rather than full reliance on third parties, while remaining open to partner collaboration for talent and capability complementarity.
Structural advantages
China Telecom’s advantages include: 1) data resources; 2) cloud-network integration and an existing cloud footprint that support model development; and 3) the ability to integrate large models with core services and digital products, starting internally and extending to industry applications. Turning these advantages into competitive outcomes depends on execution.
Progress to date
1. Xingchen model at the hundred-billion parameter scale
China Telecom’s Xingchen large model series began in October 2021 with a city-governance research model and has evolved through semantic, multimodal, speech, and next-generation digital-human stages. In June 2023 a 10-billion-parameter model was released, and four months later a 100-billion-parameter model was announced, reflecting rapid development.
2. Model quality and capabilities
Hallucination rate is one common measure of model quality. China Telecom reports that the Xingchen semantic model introduced a multi-turn hallucination mitigation approach that reduced the hallucination rate by about 40%. The group is also developing the Xingchen speech model and a multimodal generative model, emphasizing more efficient training, improved understanding, richer generation, and better controllability to support a range of digital applications. China Telecom's AI unit continues to advance in-house large-model research and parameter scaling to improve cross-domain generalization and feature learning.
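As a rough illustration only (the article does not describe China Telecom's evaluation method, and the data below is invented), a hallucination rate is typically computed as the fraction of responses a judge marks as containing unsupported claims; a "40% reduction" then means the mitigated rate is 0.6x the baseline rate:

```python
# Toy sketch: estimating hallucination rate on a labeled evaluation set.
# Each entry is 1 if a judge flagged the response as hallucinated, else 0.
# All numbers here are illustrative, not China Telecom's reported data.

def hallucination_rate(judgments):
    """Fraction of responses judged to contain hallucinated content."""
    if not judgments:
        raise ValueError("empty evaluation set")
    return sum(judgments) / len(judgments)

baseline  = [1, 0, 1, 0, 0, 1, 0, 0, 0, 1]  # 4/10 flagged
mitigated = [0, 0, 1, 0, 0, 1, 0, 0, 0, 0]  # 2/10 flagged

# Relative reduction: how much lower the mitigated rate is vs. baseline
reduction = 1 - hallucination_rate(mitigated) / hallucination_rate(baseline)
print(f"relative reduction: {reduction:.0%}")  # 50% in this toy example
```

In practice the judging step (human annotation or an automated fact-checker) dominates the cost; the arithmetic itself is trivial.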
3. Next-generation 3D digital humans
China Telecom’s 3D digital-human generation pipeline supports minute-level fully automatic reconstruction and adaptive skinning transfer. Reported metrics include an average 3D vertex error below 1 mm. A photorealistic 3D digital human can be generated from a few photos, with facial shape, wrinkles, features, and skin texture closely reproduced. Production time can be reduced from about one month to three days, and manual steps are reported to decrease by around 80%. The company has also developed a semi-automatic topology-binding workflow that maintains stable, fine-grained mesh topology at the eye and mouth corners, supporting micro-expression-level motion. Combined with in-house driving and rendering engines, this enables more nuanced motion and emotion expression.
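The "average 3D vertex error" metric cited above is typically the mean Euclidean distance between corresponding vertices of the reconstructed mesh and a ground-truth scan, after alignment. A minimal sketch of that computation (the meshes, coordinates, and correspondence here are invented for illustration):

```python
import math

def mean_vertex_error(reconstructed, ground_truth):
    """Average per-vertex Euclidean distance between two corresponded meshes.

    Assumes the meshes are already aligned and share a one-to-one vertex
    correspondence; units follow the input coordinates (here: millimetres).
    """
    if len(reconstructed) != len(ground_truth):
        raise ValueError("meshes must have the same vertex count")
    total = sum(math.dist(p, q) for p, q in zip(reconstructed, ground_truth))
    return total / len(reconstructed)

# Toy example with three corresponded vertices (coordinates in mm)
recon = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.5)]
truth = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

print(f"mean vertex error: {mean_vertex_error(recon, truth):.3f} mm")
# → mean vertex error: 0.200 mm
```

A sub-millimetre average on a full facial mesh is a tight bound, since errors at high-curvature regions (nose, eyelids, lip corners) usually dominate the mean.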
4. Ecosystem openness
At a recent Digital Ecology Technology Cooperation Conference, China Telecom announced that the Xingchen conversational model is being made externally accessible. The company also indicated plans to release multiple base-model variants at 3B, 7B, and 12B parameter scales, along with over 1T of training data, and stated an intention to open-source a hundred-billion-parameter model in early 2024. These steps are intended to support a more open development and application ecosystem.
Summary
China Telecom has aligned organizational structure, personnel, data, and compute resources to pursue large-model development. It has released scaled models, introduced mitigation measures for hallucination, advanced multimodal and speech capabilities, developed generative 3D digital-human techniques, and announced openings for parts of its model ecosystem. These actions indicate a sustained and organized effort rather than a temporary initiative.