Tag Archive: instruction set architecture



China's homegrown instruction set and CPU

According to reports from various industry sources, the Chinese government has begun the process of picking a national computer chip instruction set architecture (ISA). This ISA would have to be used for any projects backed with government money — which, in a communist country such as China, is a fairly long list of public and private enterprises and institutions, including China Mobile, the largest wireless carrier in the world. The primary reason for this move is to lessen China’s reliance on western intellectual property.

There are at least five existing ISAs on the table for consideration — MIPS, Alpha, ARM, Power, and the homegrown UPU — but the Chinese leadership has also mooted the idea of defining an entirely new architecture. The first meeting to decide on a nationwide ISA, attended by government officials and representatives from academic groups and companies such as Huawei and ZTE, was held in March. According to MIPS vice president Robert Bismuth, a final decision will be made in “a matter of months.”

ShenWei SW1600, Alpha CPU found in Sunway BlueLight MPP

China has a long history with MIPS and Alpha. Loongson processors, which power millions of Chinese school computers, use MIPS, and the ShenWei processors found in China’s first homegrown supercomputer, the Sunway BlueLight MPP, are based on the Alpha ISA. MIPS Technologies (the company) hasn’t been doing very well recently, and it’s rumored that the Sunnyvale-based company could be up for sale, a purchase I’m sure the Chinese government could afford.

According to EE Times, there are some 34 ARM licensees in China, but at $5 million for a single Cortex-A9 core license, it’s unlikely that ARM will be China’s choice. The Power ISA is cheaper, but lacks the software ecosystems that ARM and MIPS enjoy. ShenWei/Alpha is also a possibility, but again it cannot compete with MIPS’ installed base.

The other option, of course, is developing a brand new ISA. That is a daunting task, considering you have to create an entire software (compiler, developer tools, apps) and hardware (CPU, chipset, motherboard) ecosystem from scratch. But there are benefits to building your own CPU architecture. China, for example, could design an ISA (or microarchitecture) with silicon-level monitoring and censorship and, of course, a ubiquitous, always-open backdoor that can be used by Chinese intelligence agencies. The Great Firewall of China is fairly easy to circumvent, but what if China built a DNS and IP address blacklist into the hardware itself?

Taking a leaf out of South Korea’s hardcore gaming scene, what if the Chinese government decided to implement a hardware-level 10pm curfew for video games? Or some code that automatically turns negative mentions of Hu Jintao (the Chinese president) into positives, and inserts a few honorifics at the same time. Or a latent botnet of hundreds of millions of computers that can be activated upon the commencement of World War III. Or, or, or…


Surya R Praveen ICube UPU processor

New CPU architectures don’t come along very often, which is why more details on the Harmony Unified Processing Architecture being built by Chinese developer ICube are so interesting. Historically, instruction set architectures (ISAs) are risky bets. Not only are they exceptionally difficult to design, but it also takes an enormous additional effort to create tools that can leverage new capabilities. Even then, companies face an uphill fight to persuade vendors and software developers to recompile existing software to take advantage of the new design.

ICube is led by Fred Chow and Simon Moy. Chow is primarily a software designer and was chief architect of the Open64 compiler and of the specific PathScale iteration of that product, while Moy was a top-line engineer with Nvidia for seven years and worked on both the first GPUs and the G80. Details on ICube’s silicon are still limited, but the expertise of the two men helps shed a bit of light on what the chip looks like.

The Harmony Unified Processing Architecture (and the first iteration of that architecture, the IC1) are described as consisting of “the Multi-Thread Virtual Pipeline parallel computing core (MVP), an independent instruction set architecture, an optimizing compiler, and the Agile Switch dynamic load balancer.” Elsewhere, the chip is described as a “parallel computing stream processor core.” We also know, based on available literature, that the chip uses both SMP (Symmetric Multi-Processing) and SMT (Simultaneous Multi-Threading).

Surya R Praveen IC1 Comparison

VR-Zone describes the chip as an “elegant 32-bit RISC core, not unlike the original MIPS.” The IC1 implements 4-way SMT; each core can operate on up to four threads. The UPU approach means that execution resources, memory space, and register data are shared across the entire chip, so there’s no such thing as a “CPU workload” versus a “GPU workload.”
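To make the 4-way SMT idea concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions, not a description of ICube's actual design: the function name `simulate_smt`, the issue width of two, and the per-instruction stall counts are all hypothetical. It shows how a core that can pick instructions from several hardware threads in the same cycle keeps issuing work while one thread waits on a slow operation.

```python
from collections import deque


def simulate_smt(threads, issue_width=2, max_contexts=4):
    """Toy SMT core (illustrative only, not ICube's design).

    threads: list of instruction lists; each instruction is a tuple
             (name, stall_cycles), where stall_cycles is how many extra
             cycles that thread must wait after issuing it.
    Returns one list per cycle of the (thread_id, name) pairs issued.
    """
    if len(threads) > max_contexts:
        raise ValueError("more software threads than hardware contexts")

    queues = [deque(t) for t in threads]
    stall = [0] * len(queues)  # cycles left before each thread may issue
    schedule = []
    start = 0  # rotating priority so no thread is permanently favoured

    while any(queues):
        issued = []
        n = len(queues)
        for k in range(n):
            if len(issued) == issue_width:
                break
            tid = (start + k) % n
            # In-order per thread: at most one instruction per thread per cycle.
            if queues[tid] and stall[tid] == 0:
                name, stall_cycles = queues[tid].popleft()
                issued.append((tid, name))
                stall[tid] = stall_cycles + 1
        schedule.append(issued)
        # End of cycle: every waiting thread gets one cycle closer to ready.
        stall = [s - 1 if s > 0 else 0 for s in stall]
        start = (start + 1) % n

    return schedule


if __name__ == "__main__":
    t0 = [("load", 2), ("add", 0)]  # the load stalls its thread for 2 cycles
    t1 = [("mul", 0), ("sub", 0)]
    for cycle, issued in enumerate(simulate_smt([t0, t1])):
        print(cycle, issued)
```

In this toy setup, the second thread's instructions fill the cycles in which the first thread is stalled on its load, so the two threads together finish in fewer cycles than running them one after the other. Real SMT hardware is far more involved (register renaming, shared caches, fetch policies), none of which is modelled here.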

Surya R Praveen iCube Roadmap

The IC1 is designed for handheld and mobile devices and runs Android. The company’s efforts in this area could be seen as the “other” arm of China’s initiative to develop its own competitive CPU architectures. Much of the research to date has focused on the country’s Loongson/Godson-3 processors, but these are chips intended for mainstream PC form factors and homegrown supercomputers. ICube’s IC1 gives China a homegrown alternative for building its own phones and devices rather than being beholden to foreign companies for hardware.

Where are the x86 versions?

In AMD’s case, on the way — but not for a few years. Intel’s plans on this front are less clear. Larrabee, Intel’s onetime GPU project that became the basis for the Knights Corner Many Integrated Core (MIC) co-processor, was a CPU-GPU hybrid. There’s no reason Intel couldn’t eventually integrate a MIC-style design alongside a conventional CPU architecture.

The question of whether or not AMD and Intel would ever adopt a homogeneous approach to CPU and GPU calculations is interesting — but we’re inclined to think they wouldn’t. The entire reason GPUs evolved in the first place is that it makes more sense to do certain types of work with specialized architectures.

Shrinking process technology may have made it cost efficient to reintegrate those functions on-die, but no one has yet designed a traditional x86 CPU that delivered high-end GPU performance. It simply may not make sense to do so.
