2026-01-06 15:21:29

The new-generation AI supercomputing chip architecture has been officially released, with significant gains across its performance metrics. Compared with the previous generation, inference costs drop to one-tenth, a turning point for the economics of large-scale model deployment. The number of GPUs required for training is cut by 75%, so enterprises can complete the same computational tasks with a quarter as many GPUs. Energy efficiency rises fivefold, significantly reducing power consumption and heat dissipation for the same amount of compute.
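To make the three ratios above concrete, here is a minimal sketch that applies them to a baseline. The baseline figures (1,000 GPUs, a cost of 2.0 per million tokens, 500 kW) and the names `Baseline` and `apply_claimed_ratios` are hypothetical and chosen only for illustration; only the ratios themselves (one-tenth, 75% fewer, fivefold) come from the announcement.

```python
from dataclasses import dataclass


@dataclass
class Baseline:
    training_gpus: int      # hypothetical GPU count for one training run
    inference_cost: float   # hypothetical cost per million tokens
    power_kw: float         # hypothetical power draw for a fixed workload


def apply_claimed_ratios(b: Baseline) -> Baseline:
    """Scale a previous-generation baseline by the ratios stated in the post."""
    return Baseline(
        training_gpus=round(b.training_gpus * (1 - 0.75)),  # 75% fewer GPUs
        inference_cost=b.inference_cost / 10,               # cost drops to one-tenth
        power_kw=b.power_kw / 5,                            # fivefold energy efficiency
    )


old = Baseline(training_gpus=1000, inference_cost=2.0, power_kw=500.0)
new = apply_claimed_ratios(old)
print(new)
```

Under these assumed inputs, a 1,000-GPU training run would need 250 GPUs, and the same workload would draw one-fifth of the power. Real savings would depend on hardware pricing and utilization, which the post does not state.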

Innovations at the architecture level are equally notable: this is the first time confidential computing has been achieved at the rack level. Interconnect bandwidth between GPUs reaches 260 TB/s, enough to support ultra-large-scale parallel computing. The entire platform has been thoroughly redesigned, abandoning traditional cables, hoses, and fans in favor of a more compact and efficient hardware layout. The core engine consists of six modular components, offering greater flexibility for customization and expansion. This generation is expected to reshape the cost structure and deployment methods of the AI computing market.


GateUser-6bc33122

· 01-07 00:38

One-tenth of the cost? Now big model startups really have a chance.


RektHunter

· 01-06 15:51

Wow, reducing inference costs to one-tenth? Small businesses can now play with large models too. The previous monopoly on computing power is about to break. 260TB/s is incredible; communication between GPUs is so smooth... But can it really run stably? A 75% reduction in GPUs—what does that mean? The saved electricity and hardware costs... Never mind, I don't want to think about it. It's going to hype up again. If this thing really performs this well, the industry landscape will have to change.


SandwichTrader

· 01-06 15:50

One-tenth of the cost? Now large models are really going to compete intensely.
260TB/s sounds great, but can the cooling really be handled?
GPUs cut by 75%, what does this mean? Small and medium-sized enterprises can finally play with AI?
Both modular and confidential computing, this architecture doesn't seem that simple.
Fivefold increase in energy efficiency? So all that electricity was wasted before, haha.
Reshaping the cost structure, isn't it just to grab market share? Same old story.
Is 260TB/s real? With this speed, it can run anything, right?
I believe one-tenth of the cost, but has the upstream hardware cost really decreased?
Abandoning fan solutions, is the new cooling method reliable? Don't let there be problems again.
Finally, someone is working on costs. The previous solutions were ridiculously expensive.


tokenomics_truther

· 01-06 15:44

260 TB/s? That number sounds unbelievable, but if we can really cut the inference cost down to one-tenth, miners will have a chance.


MEVictim

· 01-06 15:41

One-tenth of the cost? If that's true, it should have appeared long ago. Don't let it be just on paper again.


OnchainArchaeologist

· 01-06 15:39

One-tenth of the cost? This means big model startups are no longer burning money and can finally breathe. GPU costs cut by 75%, is this really true... corporate expenses are directly halved. 260 TB/s bandwidth is outrageous; data flow is no longer a bottleneck. Fivefold improvement in energy efficiency, cooling is finally not so crazy, amazing. The modular design is imaginative, with large customization potential in the future. Inference costs reduced to one-tenth: this update truly rewrites the rules of the game.
