Intel Claims Xeon Cascade Lake-AP 56 Core CPU Up To 84% Faster Than AMD’s 64 Core EPYC Rome 7742 in Real-World HPC Benchmarks

Intel has posted a series of new benchmarks claiming that its Xeon-class Cascade Lake-AP CPUs run much faster than AMD’s 2nd Generation EPYC Rome CPUs. The benchmarks, which Intel claims are representative of ‘real-world’ performance in the HPC segment, compare Intel’s 2S (dual-socket) Xeon Platinum 9282 against AMD’s EPYC 7742 (also in a dual-socket configuration).

The Cascade Lake-AP Xeon Platinum 9282 56 Core CPU Claims To Be 84% Faster Than AMD’s EPYC 7742 64 Core CPU in Intel’s ‘Real-World’ HPC – AI Performance Benchmark Suite

The performance metrics for both processors were posted on Medium, where Intel also recently published an article about core scaling and how much modern applications rely on the number of cores available. According to Intel, 8 cores along with sustained frequencies would result in better scaling than, say, a 12-core or a 16-core chip. Intel may have provided a lot of data to back up its findings, but the same Intel was reluctant to move beyond 4 cores back in 2017, when AMD was offering its Ryzen chips with up to 8 cores and 16 threads. It’s interesting how suddenly 8-core processors have become the next big thing for Intel’s mainstream lineup, and the same is likely to happen with the 10th Gen family, which is expected to receive even more cores.

For the HPC market, Intel says that – More processor cores add compute, but the overall system or workload performance depends on other factors, including:

· The performance of each core
· Software optimizations leveraging specific instructions
· Memory bandwidth to ensure feeding of the cores
· Cluster-level scaling deployed

Anyway, coming back to the topic, Intel’s latest benchmarks compared the Xeon Platinum 9282 against the EPYC 7742. The Xeon Platinum 9282 is one of the elusive Cascade Lake-AP processors, which feature two dies instead of a single monolithic one and pack up to 56 cores and 112 threads. The chip has a base clock of 2.60 GHz and a boost clock of 3.80 GHz along with 77 MB of cache and a TDP of 400W. The Intel Cascade Lake-AP chips feature 12 memory channels compared to AMD’s 8 memory channels per chip.
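To put the channel counts in perspective, here is a rough sketch of theoretical peak memory bandwidth per socket. The channel counts come from the paragraph above; the memory speeds (DDR4-2933 for the Xeon Platinum 9282 and DDR4-3200 for the EPYC 7742) and the 8-byte channel width are assumptions that are not stated in the article, and real sustained bandwidth will be lower than these peak figures.

```python
def peak_bandwidth_gbs(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth per socket in GB/s (decimal units)."""
    return channels * mega_transfers_per_s * bus_bytes / 1000


# Channel counts are from the article; the DDR4 speeds are assumptions.
xeon_9282 = peak_bandwidth_gbs(channels=12, mega_transfers_per_s=2933)
epyc_7742 = peak_bandwidth_gbs(channels=8, mega_transfers_per_s=3200)

print(f"Xeon Platinum 9282: {xeon_9282:.1f} GB/s per socket")
print(f"EPYC 7742:          {epyc_7742:.1f} GB/s per socket")
print(f"Ratio:              {xeon_9282 / epyc_7742:.2f}x")
```

The point of the sketch is only that more channels can outweigh a slower memory clock on paper; it says nothing about how a given workload actually uses that bandwidth.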

The AMD EPYC 7742 is based on a 7nm process node (vs Intel’s 14nm+++) and features 64 cores / 128 threads. The chip has a base clock of 2.25 GHz and a boost clock of 3.4 GHz with 256 MB of L3 cache, 128 PCIe Gen 4 lanes and a TDP of 225W. Pricing plays a huge role too: the EPYC 7742 has a price of $6,950 US, while the Xeon Platinum 9282 is suggested to cost between $25K and $50K.

So right off the bat, we can note that this isn’t a fair comparison as not only does Intel’s chip have a higher running TDP but its cost is at least 3.5x higher than the AMD processor. Yes, the EPYC 7742 is AMD’s flagship 2nd Generation Rome processor for servers but even still, this isn’t an apples to apples comparison in any possible way.
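The gap described above can be checked with simple arithmetic. This is a minimal sketch using only the list price, the suggested price range and the TDP figures quoted in this article; the variable names are illustrative.

```python
# Figures quoted in the article (per CPU).
EPYC_7742_PRICE_USD = 6_950
XEON_9282_PRICE_RANGE_USD = (25_000, 50_000)  # suggested range, not a list price
EPYC_7742_TDP_W = 225
XEON_9282_TDP_W = 400

price_ratio_low = XEON_9282_PRICE_RANGE_USD[0] / EPYC_7742_PRICE_USD
price_ratio_high = XEON_9282_PRICE_RANGE_USD[1] / EPYC_7742_PRICE_USD
tdp_ratio = XEON_9282_TDP_W / EPYC_7742_TDP_W

print(f"Price ratio: {price_ratio_low:.2f}x to {price_ratio_high:.2f}x")
print(f"TDP ratio:   {tdp_ratio:.2f}x")
```

Even at the bottom of the suggested price range, the Xeon costs more than 3.5 times as much per chip, which is the basis for the "at least 3.5x" figure.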

Update: ServeTheHome’s Patrick J Kennedy has found that the GROMACS version used by Intel is an outdated one that doesn’t utilize the 256-bit wide AVX2 SIMD instruction set featured on Zen 2. Intel used GROMACS 2019.3 in what it terms real-world benchmarks; however, the latest version available is 2019.4, which adds support for Zen 2 based EPYC Rome chips like the EPYC 7742 that Intel tested its Xeon Platinum 9282 against. It just goes to show that even Intel’s ‘Real-World’ benchmarks aren’t indicative of actual product performance and may lead to misleading statements about competitor products. And this wouldn’t be the first time Intel has used misleading benchmarks or statements to downplay the competition. It has dismissed several important performance metrics used by tech reviewers as invalid and not indicative of actual product performance, while its own performance metrics supposedly are.

The benchmarks show that the Xeon Platinum 9282 delivers an average performance increase of 31%, going as high as 84%. There are several HPC-specific applications shown which Intel claims are representative of real-world performance metrics in the server market. Dissecting each application reveals the performance breakdown for each individual workload; in the case of the Manufacturing application (ANSYS Fluent workload), Intel has a 13% average performance uplift over AMD’s EPYC Rome chip. Intel also claims that having AVX-512 onboard the new Xeon chips gives them an edge in several applications such as VASP, NAMD, GROMACS, FSI & LAMMPS.

The HPC segment is broad with varying compute requirements by workload. 56 core Xeon Platinum 9282 ranges from 8% to 84% better performance (31% higher geomean) than AMD’s 64 core Rome-based system (7742) on leading real-world HPC workloads across manufacturing, life sciences, financial services and earth sciences.

Some of the applications and results shown above are a geomean of several specific workloads, all with different characteristics and sensitivities. Drilling into the details of these workloads provides further insight into performance. For example, Xeon Platinum 9282 leads AMD Rome 7742 by 13% on a geomean of 14 ANSYS Fluent workloads. Across those 14 different CFD simulations, Xeon’s results range from 2% lower to 36% higher.

Medium (Intel)
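Since the headline numbers are geometric means, it helps to see how one is computed and how a single summary figure can hide a wide spread. The sketch below uses hypothetical per-workload ratios chosen only for illustration; they are not Intel’s published results.

```python
import math


def geomean(ratios):
    """Geometric mean of positive performance ratios (1.0 means parity)."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty list of positive numbers")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical ratios (Xeon result divided by EPYC result), NOT real data.
hypothetical_ratios = [0.98, 1.05, 1.12, 1.36]
summary = geomean(hypothetical_ratios)

print(f"Range:   {min(hypothetical_ratios):.2f}x to {max(hypothetical_ratios):.2f}x")
print(f"Geomean: {summary:.3f}x")
```

A geomean is the usual choice for averaging ratios because it treats a 2x win and a 2x loss symmetrically, but it still compresses a set of quite different workloads into one number.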

Intel further goes on to claim that Xeon Platinum 9200 series processors offer a lower TCO (Total Cost of Ownership). Since the performance of the Xeon Platinum 9200 series is higher, you would need fewer nodes, which should drive down node acquisition cost as well as fabric, switching and cabling costs. It is also mentioned that while the Xeon-AP has a higher TDP and power requirement than AMD’s EPYC Rome (400W vs 225W), this should be offset by the lower number of nodes required to reach the same performance.
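The node-count argument can be sketched with the figures quoted in this article. The baseline of 100 dual-socket EPYC nodes is a hypothetical cluster size, the 1.31x factor is Intel’s claimed geomean, and TDP is used as a stand-in for CPU power draw, which is a simplification; memory, fabric, cooling and everything else in the node are ignored.

```python
import math


def nodes_needed(baseline_nodes: int, relative_perf: float) -> int:
    """Nodes required to match the baseline cluster if each node is relative_perf times faster."""
    return math.ceil(baseline_nodes / relative_perf)


def cpu_tdp_watts(nodes: int, tdp_per_cpu_w: int, sockets: int = 2) -> int:
    """Total CPU TDP across the cluster (CPUs only, not measured power)."""
    return nodes * sockets * tdp_per_cpu_w


BASELINE_EPYC_NODES = 100      # hypothetical cluster size
CLAIMED_GEOMEAN_SPEEDUP = 1.31  # Intel's claimed average from the article

xeon_nodes = nodes_needed(BASELINE_EPYC_NODES, CLAIMED_GEOMEAN_SPEEDUP)
xeon_watts = cpu_tdp_watts(xeon_nodes, 400)
epyc_watts = cpu_tdp_watts(BASELINE_EPYC_NODES, 225)

print(f"Xeon nodes needed: {xeon_nodes}")
print(f"CPU TDP total: Xeon {xeon_watts} W vs EPYC {epyc_watts} W")
print(f"Break-even speedup on CPU TDP alone: {400 / 225:.2f}x")
```

Under these assumptions the node count drops, but total CPU TDP only breaks even when the per-node speedup reaches the TDP ratio, so whether the offset holds depends heavily on the workload and on the non-CPU costs left out here.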

Aside from raw compute power, memory bandwidth is also highlighted as the main performance measurement factor and surprisingly, major industry players are already evaluating replacing their existing Intel-based systems with EPYC processors. Just a day before Intel published their report on performance, it was revealed that Netflix may soon be switching to AMD’s EPYC based platform as the TCO is similar but the EPYC solution may actually offer higher bandwidth than an Intel Xeon based system.

There will be a lot more action next year in the server department, as AMD will launch its energy-efficient 7nm+ EPYC Milan CPUs to tackle both Intel’s 14nm Cooper Lake and 10nm Ice Lake lineups simultaneously.

AMD’s comments on CPU design reflect it is in a very different place than Intel right now

While Intel is making investments to recapture process leadership, AMD says it’s more focused on architecture.

AMD just wrapped up a quarter in which it generated the highest revenue since 2005, and that in and of itself is impressive. The real story is not in the sales numbers, though. What I find more interesting than anything else, as it pertains to AMD’s current situation, is how it finds itself operating in a much different mindset than rival Intel, which for years offered up the most advanced processor technology on the market.

That is no longer the case. Yes, the newly released Core i9-9900KS is the best CPU for gaming, and before that, the regular Core i9-9900K held the distinction. But underneath the hood, it is yet another iteration of Intel’s 14-nanometer lithography (14nm++, if you’re keeping count), which dates all the way back to Intel’s 5th generation Broadwell parts released in 2014. And if you look beyond gaming, AMD is swinging more cores in the mainstream market, to hammer multi-threaded workloads. The gap will widen further when AMD’s 16-core/32-thread Ryzen 9 3950X arrives next month.

AMD’s resurgence began with the introduction of its Zen architecture. Now in its third generation, Zen 2 (which follows Zen and Zen+) is the first to employ a 7nm process node.

Comparing process nodes from one company to another is not exactly an apples-to-apples affair, but 7nm is certainly cutting edge, and it has arrived in earnest. Meanwhile, Intel only recently (and finally) began volume shipments of its 10nm Ice Lake processors, though so far they are only available in laptops.

Process node is not everything, but it’s important. It’s also where AMD and Intel find themselves at very different places, not just in what’s currently shipping, but in where each company’s focus is, at the moment.

I bring this up because of comments made by AMD CEO Dr. Lisa Su during an earnings call with investors. She was asked to break down how much of AMD’s current advantage is really related to process technology versus the underlying architecture.

“The way I would answer that question is, we’ve made a set of choices, and the set of choices include process technology, they include architecture, our chiplet architecture, they include sort of our overall system architecture. And I think we’ve made a set of good choices,” Dr. Su said (a full transcript is available at SeekingAlpha).

“Going forward, we are not relying on process technology as the main driver. We think process technology is necessary. It’s necessary to be sort of at the leading edge of process technology. And so, today, 7-nanometer is a great node, and we’re getting a lot of benefit from it. We will transition to the 5-nanometer node at the appropriate time and get great benefit from that as well. But we’re doing a lot in architecture. And I would say, that the architecture is where we believe the highest leverage is for our product portfolio going forward.”

In a sense, Dr. Su is almost downplaying AMD’s process technology lead. It’s a luxury that would have been difficult to fathom before Zen arrived, and in particular Zen 2. But here we are, with AMD shipping 7nm CPUs and talking about architecture being where its focus lies going forward, rather than relying on process technology to be its main driver.

Contrast that sentiment with Intel, which recently said it’s “investing to recapture process leadership going forward.” I’m sure Intel would say architecture is important too, but unlike AMD, getting back on a consistent cadence with its process node is what Intel is focused on.

“We’re ramping a multitude of products. We have increased confidence in 5-nanometer. And as we mentioned for 7 and 5 getting back to a two-and-a-half, two year cadence is what we’re focused on and we’re confident in the future,” Intel CEO Bob Swan said last week.

I fully expect Intel to rebound on the technology side (it already leads in sales, revenue, and market share). Getting to 10nm has been tough, largely because Intel’s goals were too aggressive at the outset. But I think things will go smoother in the next few years. We’ll see.

Regardless of how it plays out, AMD is in a good spot. Perhaps just as importantly, AMD is exuding the kind of genuine confidence that wasn’t really there in the past. AMD had that swagger in the early Athlon 64 / X2 days, and now it’s back, only this time AMD seems in a better position, having divested itself of the manufacturing side of the business. In retrospect, that move is proving to have been extremely beneficial.

Intel Ghost Canyon NUC teardown reveals removable “The Element” module inside

Intel’s upcoming “Ghost Canyon” NUC is expected to be one of the most powerful mini computers from the company to date. Expected to be powered by a 45 watt, 9th-gen Intel Core H-series processor, the small form-factor computer will be aimed at gamers, and it includes a PCIe x16 slot for a discrete graphics card.

But it turns out the GPU isn’t the only thing that’s removable.

A teardown of a pre-production sample by a member of the Koolshare forum reveals that the heart of the computer is a removable module Intel currently refers to as “The Element,” which means you may be able to upgrade the CPU, memory, storage, and other components while keeping your existing case, power supply, and graphics card.

Intel hasn’t shared many details about the upcoming Ghost Canyon system yet, so the KoolShare post also reveals some interesting details including that:

· The system measures 9.4″ x 8.5″ x 3.8″, making it small for a desktop but large for a NUC.
· It will be available with Core i5-9300H, Core i7-9750H and Core i9-9980HK processor options.
· In addition to a cooling fan on The Element module, there are dual cooling fans on top of the system.
· The computer has a 500 watt internal power supply.
· There are two Thunderbolt 3 ports, six USB 3.1 Gen 2 Type-A ports, dual Gigabit Ethernet jacks, WiFi 6, Bluetooth 5, an SD card reader, an HDMI 2.0a port, a headset jack on the front, and a 3.5mm audio line out on the back.
· According to an Intel spec sheet, the system supports up to three 4K displays if you’re using Intel UHD graphics alone. You may be able to connect additional displays if you add a discrete GPU.
· Speaking of GPUs, the computer supports up to an 8″ long, double-width graphics card that uses up to 225 watts of power.
· Other features include support for up to 32GB of DDR4-2666 memory, three M.2 slots with support for NVMe and/or Intel Optane memory, and a plastic case with a metal frame.
· Intel will offer a 3-year warranty.

Last I’d heard, Intel was planning to bring the Ghost Canyon NUC to market early next year.

Intel to Discontinue Nearly All Desktop Kaby Lake CPUs

Intel has announced an End-of-Life plan for most of its desktop Kaby Lake and remaining Skylake processors. The boxed and tray versions of the chips will be available to interested parties for one more year and then will become history. The move will enable Intel to cut the number of product SKUs it offers to partners and reduce pressure on its factory network, which will help to increase supply of newer products made using various versions of Intel’s 14 nm process technology.

Introduced early in 2017, Intel’s desktop 7th Generation Core processors (Kaby Lake) have been around for nearly three years now. The CPUs certainly served their purpose, but it is time for them to go, and Intel recommends that its partners place their final orders for these products by April 24, 2020. The final shipments will be made by October 9, 2020. Some of Intel’s Kaby Lake and Skylake products will be moved to Internet of Things (IoT) status and will be available for a little longer to IoT customers and probably some PC makers, as there are still previous-generation motherboards on the market that need to be sold.

Intel’s desktop 6th Generation Core CPUs were launched in 2016 and most of them have been in EOL status for a while. This week, Intel said it would stop taking orders on the remaining desktop Skylake products on April 24, 2020, and will cease their shipments by October 9, 2020.

Winding down production of desktop Skylake and Kaby Lake processors in the next few months will free manufacturing capacities for newer Intel products and will enable the company to increase shipments of newer CPUs, such as 8th and 9th Generation Coffee Lake, that are also made using Intel’s 14 nm fabrication technology.

How to improve PC performance by fixing the Intel Rapid Storage Technology driver

Sometimes a Windows 10 update will not replace an old driver like you would expect. In those cases, you must troubleshoot and solve the problem yourself.

Let’s face it, for most users, if a computer boots up and they can start using it, then the computer is working fine—well, at least good enough. However, this apathetic view of computer health and maintenance can lead to serious problems down the road, especially if your computer is using an outdated device driver.

I recently repurposed an old gaming laptop as my new business PC. This process included updating the OS from Windows 7 to Windows 10. After the update, I noticed that the CPU fan was constantly running, and the PC performance was sluggish at best. So, I did some troubleshooting with the Windows Task Manager.

I discovered that, for unexplained reasons and even after a lengthy update process, the computer was still using an outdated and deprecated device driver. This was causing major performance issues and was obviously unacceptable.

This tutorial shows you how to use the Task Manager to identify misbehaving devices, how to troubleshoot the problem, and then how to fix it.

How to troubleshoot and fix a faulty device driver

The Windows Task Manager, best reached by the infamous keyboard shortcut of CTRL-ALT-Delete, is one of the first troubleshooting tools you should turn to when your PC starts behaving badly. By thoroughly examining what processes are running and how much CPU and memory they are using, you can get a general idea of what devices or services are causing problems and which are working properly.

As you can see in Figure A, the driver for Intel’s Rapid Storage Technology chipset is continuously taxing the CPU at close to 20%, even when the PC is sitting idle. This device driver is also continuously claiming almost 7MB of RAM for some reason. This should not be happening; however, it does explain the constant running of the CPU fan and the PC’s overall sluggishness.
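The same check can be expressed as a small script: take a few CPU readings per process while the PC is idle and flag anything that stays busy. This is only a sketch of the idea; the function name, the 5% threshold and the sample readings are hypothetical, and in practice you would read the values off the Task Manager or another monitoring tool.

```python
from statistics import mean


def flag_idle_hogs(samples, cpu_threshold=5.0):
    """Return (name, average CPU %) for processes that stay busy while the PC is idle.

    samples maps a process name to a list of CPU % readings taken at idle.
    """
    flagged = [
        (name, mean(readings))
        for name, readings in samples.items()
        if readings and mean(readings) >= cpu_threshold
    ]
    # Worst offender first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical idle readings, loosely modeled on the ~20% described above.
idle_samples = {
    "Intel Rapid Storage Technology": [19.5, 20.1, 19.8],
    "Windows Explorer": [0.2, 0.0, 0.4],
    "Antivirus service": [1.1, 0.9, 1.3],
}

for name, avg_cpu in flag_idle_hogs(idle_samples):
    print(f"{name}: about {avg_cpu:.1f}% CPU while idle")
```

Averaging several readings rather than trusting a single snapshot avoids flagging a process that only spiked briefly.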

After checking the Intel support website, it turns out there is an updated 64-bit driver for the Intel Rapid Storage Technology chipset available, which I downloaded. Before we can install the new driver, we must first uninstall the old driver.

Type “control panel” into the Windows 10 desktop search box and choose the appropriate Control Panel application from the list of results to reach a screen similar to Figure B, which shows the small icon view. Click the link labeled Programs And Features.

Scroll down the list of programs until you find Intel Rapid Storage Technology (Figure C). Click that entry and then uninstall it. You will have to restart your PC to finish the process.

Upon restart, a generic driver will take care of running the Intel chipset for you. This generic driver will work, but you will sacrifice a noticeable level of performance, so it is best to install the new Intel Rapid Storage Technology driver.

Right-click the Start Menu button and select the Device Manager item from the list of configuration services. Scroll down to Storage controllers, expand the entry, and select the appropriate device (Figure D).

Double-click the Intel Chipset SATA RAID Controller entry to reveal the Properties screen and then click the Driver tab (Figure E). Click the Update Driver button and select the Have Disk option to install the driver downloaded previously.

When the installation process is complete, you will be prompted to restart your PC once again.

Now, when you check the Task Manager (Figure F) after the restart, you will notice that the Intel Rapid Storage Technology process is quiet when your PC is idle, using 0% of the CPU and only 0.8MB of RAM. In other words, it is behaving the way it is supposed to behave. Problem diagnosed and solved.

If your Windows 10 computer is not performing as well as you would like, check the running processes using the Task Manager. That basic built-in app can provide the information you need to troubleshoot the problem and develop a solution.