18 May 2024


Processing Machinery

Nvidia’s true face


After GTC 2022, the press rushed to report that Jensen Huang (Huang Renxun) had once again brought out a new “nuclear bomb” to “bomb the streets”.

The new products and technologies have been covered in detail in many articles. In a word: awesome! The most explosive of them is the H100 GPU, which is built on TSMC’s 4-nanometer process and integrates 80 billion transistors. Its floating-point performance is about three times that of the previous-generation A100, and it is regarded as Nvidia’s new-generation “nuclear bomb”.

All of a sudden, amid gongs, drums and firecrackers, the industry is looking forward to a “computing power monster”, and consumers are looking forward to the day graphics card prices finally come down.

But calm down and think about it: is Nvidia really a benefactor that “extends the life of AI computing power and brings warmth to everyone”? Fans and onlookers alike must admit that Nvidia is a business wizard. As the most legendary stock of the digital economy, it has far lower revenue than Intel or Meta, yet its market value is far ahead of theirs. “Technical conviction” alone obviously cannot explain that.

According to Huang Renxun himself, several milestone technologies in Nvidia’s history were really “generalizations” of its own GPU development: the company found that the GPU could do more and different things.


Whether new demand leads to new products or new products activate new demand is a “chicken and egg” question. But a tried-and-true template can be drawn from the correlation between Nvidia’s actions and its results:

While mainstream demand has not yet peaked, expand and enrich the product line, even by “wearing a vest” (relaunching an existing chip under a new name), so that every niche is covered and its value maximized. Once mainstream demand declines and ebbs, throw out a nuclear-bomb-grade product that exceeds current market demand (with power consumption that explodes too, of course). This reignites the imagination of the public and Wall Street about GPUs and pushes Nvidia’s valuation to a new high.

Rather than the “king of involution” who wins by grinding harder than everyone else, Nvidia is more like Jiang Taigong sitting calmly on his fishing platform, with global consumers and the industry firmly on the hook.

Four supply-and-demand “rebounds”: Nvidia’s road to computing supremacy

How did Nvidia achieve its dominance in computing power? Counting the milestones in its history, we find four key “rebounds”.

The first rebound: the personal computer growth storm.

From its founding in 1993 until 1999, Nvidia held no great lead in a graphics card market crowded with strong players.

At that time many manufacturers were developing display chips. Besides semiconductor giants such as IBM, Sony and Toshiba, specialist vendors such as Matrox, 3dfx, Trident and S3 Graphics each led the field at one point. Nvidia’s NV1 and NV2 were not very competitive, and the company nearly went bankrupt. Although the TNT2 (also known as “NV5”), released in 1999, won the performance crown, it was only 10% to 17% faster than the NV4 and did not open a gap over its main competitor, the 3dfx Voodoo3.

So the first “nuclear bomb” arrived: Nvidia launched the NV10, better known as the GeForce 256, which it billed as the first GPU (graphics processing unit), and blew the PC game acceleration market wide open.


Before this, display chips were fixed-function parts. The GeForce 256 was the first “single-chip processor with integrated transform, lighting, triangle setup/clipping and rendering engines”, capable of processing at least 10 million polygons per second. It let the GPU take over a large share of geometry calculation from the CPU, handled work that general-purpose processors did poorly, and greatly boosted demand for GPUs in PC gaming and creative design.
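To make “transform and lighting” concrete, here is a minimal sketch of the kind of per-vertex arithmetic involved. It is an illustration only, not Nvidia’s hardware implementation: the function names, the triangle and the light direction are all hypothetical, and a real pipeline uses 4x4 matrices and many more stages.

```python
import math

def rotate_y(vertex, angle_rad):
    """Transform step: rotate a 3D point around the Y axis."""
    x, y, z = vertex
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x + s * z, y, -s * x + c * z)

def lambert(normal, light_dir):
    """Lighting step: diffuse intensity max(0, N . L) for unit vectors."""
    dot = sum(n * l for n, l in zip(normal, light_dir))
    return max(0.0, dot)

# Hypothetical triangle, surface normal and light direction (illustration only).
triangle = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
normal = (0.0, 0.0, 1.0)
light_dir = (0.0, 0.0, 1.0)

# Every vertex of every polygon needs this work on every frame.
transformed = [rotate_y(v, math.pi / 2) for v in triangle]
intensity = lambert(normal, light_dir)
```

Repeated for millions of polygons per second, this is the workload that a dedicated chip can take off the CPU.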

To make full use of the GeForce 256’s computing potential, Nvidia also launched the Quadro line based on this chip for professional graphics workstations, helping creative and technical staff work more efficiently. Programmable shaders followed, giving developers more creative room on the GPU in 3D rendering, game development, special effects production and more.

In Huang Renxun’s words, the aim was to stir the passion of people around the world, let them know what a 3D graphics processor is, and give them many tools for innovation.


The year after the GeForce 256 was released, Nvidia won an order from Microsoft to develop the graphics hardware for the Xbox game console. After that it relied on a market rhythm of “a refresh every six months, a new generation every year”. The GeForce line was steadily enriched and deployed across the high-end, mid-range and low-end markets. Nvidia also learned to “wear a vest”: it made slight improvements to an existing chip and quickly brought it to market as a new series. This made competitors miserable, and Nvidia took more than 70% of the GPU market.

By 2007, Nvidia’s market value had risen by more than 500%, and it was named Company of the Year by Forbes magazine.

The second rebound: the strong thrust of parallel computing.

As early as 2006, Nvidia launched its revolutionary general-purpose computing architecture, CUDA, and its general-purpose computing hardware, the Tesla GPU. But deep learning was not as popular then as it is now. Only some large enterprises and research institutions needed GPUs for high-performance computing tasks such as drug discovery, weather modeling and financial analysis.

When did Nvidia start to increase its efforts to activate the demand for GPU parallel computing capabilities? The answer is 2009.

That year, Nvidia held the first “GPU Technology Conference” to preach to “developers, engineers, and researchers who use GPUs to solve important computing tasks.”


What changed in the market between 2006 and 2009? Under Moore’s Law, individual consumers’ appetite for ever-faster graphics cards had begun to fade.

During this period Nvidia did release good products, such as the heavyweight Tegra mobile processor, which integrates ARM architecture processors with GeForce GPUs and consumes 30 times less power than an ordinary PC laptop. But however good the product, it was hard to arouse consumers’ enthusiasm. After all, there were still plenty of graphics cards on the market, and anyone willing to wait could buy at a better price.

At the same time, defects in some GPUs that OEMs built into Apple, Dell and HP notebooks caused “abnormal failure rates” and became the subject of class action lawsuits. In the first quarter of 2008 alone, Nvidia’s revenue fell by about $200 million, and its stock price slid from $37 to around $6.


So Nvidia stepped up its push into high-performance computing. At the first GPU Technology Conference it launched its next-generation CUDA GPU architecture, code-named “Fermi”, and vigorously promoted the GPU’s advantages in large-scale parallel computing.

The Fermi architecture deserved the name “nuclear bomb” in both senses. Its performance was very high, and the GeForce 400 series products based on it beat competitors on performance, but its power consumption and heat output were also frightening.


In any case, Nvidia has been a star of the computing field ever since. In 2010, Tianhe-1A, then the world’s fastest supercomputer, used 7,168 Nvidia Tesla M2050 GPUs, combining massively parallel GPUs with multi-core CPUs, and became a showcase for heterogeneous computing.

In 2012, Geoffrey Hinton, one of the three giants of deep learning, and his student Alex Krizhevsky used GPUs to accelerate the training of deep neural networks. Their blockbuster result in the ImageNet competition raised the curtain on the third wave of artificial intelligence and further boosted sales of Nvidia GPUs.


The growth in AI demand also helped Nvidia open up the automotive market. The GeForce GTX Titan, released in 2013 as the top of the Kepler architecture, became a computing foundation for self-driving cars and advanced driver assistance systems, supporting key computer vision capabilities.

Strong demand for artificial intelligence, from academia to industry, pushed up both GPU prices and Nvidia’s stock price, completing a stunning “rebound”.


The third rebound: after the mining crash, Turing breaks the deadlock.

Nvidia kept its lead in deep learning, but don’t forget that the huge retail market is its cash cow.

At SIGGRAPH 2018 on August 14, 2018, Nvidia unveiled its new-generation “Turing” GPU architecture. It supported real-time ray tracing and, for the first time, brought the Tensor Cores built for AI acceleration into consumer 3D graphics products, giving games a new level of realistic lighting and reflections.

Before that, ordinary consumers had long been suffering from the mining boom.


The explosion of digital currencies made mining a good business. “Mining farms” bought GPUs in bulk and supply fell short; with scalpers hoarding as well, graphics cards sold for several times their list price, and ordinary buyers could only look on and sigh. Only in 2018, as the digital currency tide receded, did sales of mining cards drop sharply, graphics card inventories build up, and the shortage ease.

Graphics card makers now faced price cuts that threatened their valuations, and they urgently needed to dispel consumers’ wait-and-see mood as large numbers of used mining cards flowed into retail channels. At this point Huang Renxun stimulated the market once again with “Nvidia’s most important innovation in the field of computer graphics in more than a decade”.

The new technology and architecture have pushed the performance of GPU and the market value of Nvidia to a whole new level.


The fourth rebound: crypto turns bearish, but industrial intelligence is on the rise.

The biggest change this year is that the chip shortage has eased. At the same time, the crypto market has turned bearish again: the prices of Bitcoin and Ethereum have hit new lows, and GPU prices have fallen again. According to foreign media reports, Nvidia has told partners that GPU production costs will be cut by 8% to 12%. If graphics card prices fall further, the market’s wavering confidence in Nvidia could be reflected, even excessively, in its stock price.


However, you can always trust Nvidia’s “nuclear factory”. As GTC 2022 showed, the new generation of products and technologies is aimed squarely at the industrial AI scenarios with the most room for imagination.

The latest “nuclear bomb”, the H100 GPU with its Hopper architecture, is designed specifically for large Transformer models and fits the needs of large-model pre-training almost perfectly.


NVIDIA Quantum-2, whose main chip contains 57 billion transistors, serves the cloud service providers and supercomputing centers that are now building AI infrastructure at full speed.

The AI supercomputer NVIDIA Jetson AGX Orin is simply a godsend for developers of autonomous robots.

Not to mention the open platform NVIDIA Omniverse: which company with metaverse ambitions could resist it? Meta built its first AI supercomputer on Nvidia DGX A100 systems.


With these “big pies” on the table, Huang Renxun confidently declared at GTC that Nvidia can reach trillions of dollars in sales revenue in the future, which neatly defuses GPU market fluctuations (or lets the public ignore them).

Behind each Nvidia “nuclear bomb” is a process of continuously creating demand and stimulating supply as the GPU market changes. Its commercial value has kept soaring, and it has indeed brought huge advances in GPU technology along the way.

Does this mean that Huang Renxun is a “business wizard”? I’m afraid not.

Jiang Taigong goes fishing, and he understands the rebound effect

Thanks to the GPU’s contribution to graphics computing, Huang Renxun has been called the “Godfather of AI”, and he even once sought to declare Moore’s Law dead and promote his own “Huang’s Law” to rule the computing market.

But the four supply-and-demand “rebounds” show that Nvidia has simply understood the “rebound effect” of Moore’s Law very clearly.

The “rebound effect”, also known as the “take-back effect”, was first described by William Stanley Jevons in his book “The Coal Question”. It refers to the way the efficiency gains of a new technology end up delivering less benefit than expected, because demand grows to absorb them.

In the IT field it is summed up as “What Andy gives, Bill takes away”: the CPU performance gains delivered by Andy (Intel CEO Andy Grove) are eventually taken back by Bill (Microsoft CEO Bill Gates) through ever-expanding software and services.


As a result, although Moore’s Law keeps making computing hardware faster and cheaper, the “rebound effect” keeps generating new needs, applications and scenarios that eat up the gains in hardware performance.

Only then are users willing to pay for new machines, so that they can enjoy newer and more resource-hungry services.
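The take-back logic can be put in toy numbers. The sketch below is an illustration under stated assumptions, not a measured model: it assumes both hardware performance and software demand grow by steady doubling, and the function names and doubling periods are hypothetical.

```python
def growth(years, doubling_years):
    """Growth factor after `years` if a quantity doubles every `doubling_years`."""
    return 2.0 ** (years / doubling_years)

def headroom(years, hw_doubling_years, demand_doubling_years):
    """Spare performance the user actually feels: hardware gain / software demand."""
    return growth(years, hw_doubling_years) / growth(years, demand_doubling_years)

# Assumed scenario A: software demand keeps pace with hardware (full take-back).
full_take_back = headroom(6, 1.5, 1.5)

# Assumed scenario B: demand doubles only every 3 years, so old hardware
# still feels fast and users have less reason to upgrade.
weak_take_back = headroom(6, 1.5, 3.0)
```

In scenario A the user’s felt headroom stays flat despite a 16x hardware gain, which is the condition under which people keep buying new machines. In scenario B headroom grows fourfold, and waiting for price cuts becomes the rational choice.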

Looking back, personal computers and smartphones both developed under Moore’s Law and its rebound effect. The GPU market where Nvidia operates is clearly no exception.


On the one hand, Nvidia’s products have improved faster than the “doubling every 18 months” described by Moore’s Law, doubling AI computing performance year after year; this is “Huang’s Law”. Yet the performance gains of the latest H100 GPU clearly still depend on the semiconductor process scaling that has run for more than half a century: it reaches the “nuclear bomb” level by adopting TSMC’s 4nm process (rather than the 5nm the industry had speculated about) together with a new architecture design.
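The gap between the two doubling rates is easy to underestimate. As a rough worked example, the sketch below computes how long each rate takes to reach a given speedup; the function name and the 1,000x target are illustrative assumptions, and real product performance does not follow a smooth curve.

```python
import math

def years_to_reach(speedup, doubling_years):
    """Years needed to reach `speedup` when performance doubles every `doubling_years`."""
    return doubling_years * math.log2(speedup)

# Time to a 1,000x speedup under each rate (target chosen for illustration).
moore_years = years_to_reach(1000, 1.5)  # doubling every 18 months
huang_years = years_to_reach(1000, 1.0)  # doubling every year
```

Under these assumptions the 18-month rate needs about 15 years to reach 1,000x, while yearly doubling needs about 10.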

On the other hand, the performance gains must be “taken back” by new services and applications. Otherwise users need only wait for GPUs of the same performance to get cheaper, with no reason to spend heavily on new ones. This is why large-scale AI model training, autonomous driving, the metaverse, robots and more all appear together as the “full family bundle” at GTC: the intelligent upgrading of every industry consumes the AI computing resources that the new technology brings.

Huang Renxun also mentioned in an interview that Nvidia “even used the marketing budget to help developers market the products they developed with our architecture to create market demand.”


So each time an Nvidia “nuclear bomb” bombs the streets, it is really opening up a larger space of demand at a moment when the take-back of hardware performance has stalled, in order to ease the decline in technical returns.

“What NVIDIA gives, AI takes away.” With this skill Nvidia keeps global gamers and Wall Street investors firmly in hand, rescues its market value again and again, and “sits firmly on the fishing platform”.

The willing take the bait: is there anything wrong with out-competing yourself?

From a certain point of view, Nvidia’s grasp of the “rebound effect” and of public demand is a case of “Jiang Taigong goes fishing, and the willing take the bait”. As long as researchers find that model training time drops from a dozen days to a few hours, as long as gamers find the more realistic rendering irresistible, as long as Meta believes GPUs are indispensable to the metaverse... isn’t that enough?

At first glance, stimulating demand to avoid diminishing technological returns seems to work against everyone. In fact, technology companies’ efforts to counter the rebound effect also bring many derivative benefits.

First, the rebound effect of new technologies is inevitable.

It directly drives down the cost of technology products. Unless you are an enthusiast gamer chasing the latest graphics card, you can always buy a higher-performance GPU at a lower price if you are willing to wait. Who would turn down real savings?

Second, to avoid diminishing returns on new technology, technology companies have to pour effort into developing next-generation products. Overwhelming competitors is not enough; they also have to out-compete themselves. Nvidia now holds an absolutely dominant position in GPUs, yet it still has to keep innovating or be scolded for “squeezing toothpaste” (releasing only token upgrades). Buyers who need to upgrade always have better products to choose from.

At the same time, higher performance and more resources give birth to many unprecedented applications and services. When computers were first born they were extremely expensive, available only to top American universities and research institutions, and could do little more than play chess. Now, with a thousand-yuan mobile phone, farmers in remote mountain areas can sell fruit by live-streaming online. Likewise, looking five to ten years ahead, if the share of self-driving cars rises from less than 1% today to 50%, and people spend as much time in the metaverse as they now do on their phones, how many new technologies, services and businesses will that create?


Huang Renxun himself recalled the development process of the first GPU product. He said: “Maybe no one knows what they need. Because things are often like this, if you haven’t seen something, you won’t know if you need it.”

But think about it: many functions we once considered dispensable, such as video calls, live streaming and high-definition video, have become must-haves today. Demand can indeed be created, but demands also have priorities. Is the “hook” that Nvidia keeps casting attractive to you?