By William Van Winkle
A common approach with virtualization is to use shadow page tables, whereby two sets of page tables exist: one visible to hardware and maintained by the hypervisor, and one invisible to hardware and used by the guest OS. The trouble with shadow page tables is that page faults are very costly in terms of performance, which is part of why memory issues can consume up to 75% of the hypervisor’s time. The nested page tables employed by early AMD-V virtualize the memory management unit. This helps to cache the mappings between the guest OS and physical hardware, which in turn reduces the overall virtualization overhead. Barcelona’s Rapid Virtualization Indexing takes this one step further.

“About a month ago, I was in Beijing on press tour,” says AMD’s Fruehe. “The reporter asked a question, the translator had to translate into English for me, I gave an answer back to the translator, and he turned around and put it back into Mandarin, a very slow, arduous process, but we got through the interview. If I was talking to you, I could’ve done the exact same interview in about a quarter of the time, because there’s no back and forth. We both speak the same language. Rapid Virtualization Indexing allows the virtual machine to access memory directly, so it doesn’t have to cache every memory call through the hypervisor layer. Instead of saying, ‘Here, hypervisor, I need this from memory,’ then have the hypervisor go get it, bring it back, etc., we now hit the memory directly. It gives you much better performance because the biggest overhead for virtualized systems is memory. It’s all the memory swapping as each machine goes to take focus of the hardware for the split-second that it runs its instructions. It has to take control of the memory, flush out the information in there, load its memory and commands, get the answer, and then somebody else goes in and loads. All of that swapping back and forth reduces performance in the virtual world.
But with Rapid Virtualization Indexing, each virtual machine can carve out an area in memory, and it doesn’t have to translate back and forth and swap in and out.”

One misconception some people have about virtualization is that hardware resources in the server get divided up like a pie, with each guest OS getting its own isolated slice. But the process is actually more dynamic, with resource allocations varying according to guest OS load and resource blocks flipping constantly from one application to another.

It is possible with a virtual machine to specify how much of the host’s resources each guest OS can have. You could allow it to use one processor and 1GB of memory, period. But that doesn’t eliminate the swapping of other guest software through those resources. It’s just that the first virtual machine only has to swap out 1GB of memory with the other virtual machine(s). You’ve limited resource conflicts to a particular area, but you haven’t eliminated them.
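
The difference between shadow page tables and nested page tables described earlier can be sketched in a few lines of Python. This is a toy model under stated assumptions, not AMD's implementation: the dictionaries stand in for single-level page tables, the 4KB page size is assumed, and every name (`guest_pt`, `nested_pt`, `build_shadow`, `nested_translate`) is hypothetical.

```python
PAGE_SIZE = 4096  # assumed 4KB pages for this toy model


def translate(table, addr):
    """Map an address through one page table (page number -> page number)."""
    page, offset = divmod(addr, PAGE_SIZE)
    if page not in table:
        raise LookupError(f"page fault on page {page}")
    return table[page] * PAGE_SIZE + offset


# Guest page table: guest virtual page -> guest physical page (owned by the guest OS).
guest_pt = {0: 5, 1: 7}
# Nested page table: guest physical page -> host physical page (owned by the hypervisor).
nested_pt = {5: 42, 7: 13}


def build_shadow(guest_table, nested_table):
    """Shadow approach: the hypervisor precomputes guest virtual -> host physical
    in software and must redo this whenever the guest edits its own table."""
    return {gv: nested_table[gp] for gv, gp in guest_table.items()}


def nested_translate(guest_virtual_addr):
    """Nested approach: two lookups per access, both of which real hardware
    performs itself, so the hypervisor is not involved."""
    guest_physical_addr = translate(guest_pt, guest_virtual_addr)
    return translate(nested_pt, guest_physical_addr)


shadow_pt = build_shadow(guest_pt, nested_pt)
addr = 1 * PAGE_SIZE + 0x10  # an address in guest virtual page 1
assert translate(shadow_pt, addr) == nested_translate(addr)
```

Both paths reach the same host address. The cost difference is in who does the work: the shadow table has to be rebuilt by hypervisor software each time the guest changes its mappings, while the nested lookup leaves both tables alone and chains them at access time.
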
This is why AMD-V incorporates other commands and capabilities for resource protection, such as Device Exclusion Vectors (DEV). This securely prohibits different parts of the system from being able to access devices, so you don’t have one virtual machine trying to access the memory space of another virtual machine. Essentially, the DEV is an area of memory that grants or denies permission for a device to access a page. The DEV delivers a performance gain through the granting or denying of access to pages in memory through hardware, not software. Additionally, the DEV prevents unauthorized accesses to memory by device drivers within a virtual machine.

The upshot is that the improvements in AMD-V are drawing support from the virtualization community. Rapid Virtualization Indexing is now or will soon be supported by KVM, Microsoft, Novell SUSE Linux with Xen, RHEL 5.1 with Xen, Solaris 10 with Xen, Virtual Iron, VMware Workstation and ESX Server, Xen and XenSource, and others. With virtualization projected to grow by a factor of 10 in the near future, it will pay to have the best hardware for the job on hand and in your knowledge base. Only then will you be able to reap the full profit potential from your future server systems.

LOOKING FORWARD TO PHENOM

Following its tradition, AMD will release its new K10 architecture in the consumer space shortly after delivering it into commercial segments. The K10 will port into the consumer sphere under the Phenom label, knocking the Athlon line down to the mainstream and value-end segments. All Phenom models, like their Barcelona counterparts, will feature 512KB of L2 cache per core and 2MB of L3 cache shared across all cores.

Phenom will arrive in three versions. There will be a dual-core version (code-named Kuma) branded as Phenom X2 that can use both the AM2 and AM2+ sockets. This X2 family will, in turn, have three groups delineated by power envelope: 45W (models starting with GE), 65W (GS), and 89W (GP-6xxx).
Web buzz out of Asia shows initial frequencies for X2 ranging from 1.9 to 2.8 GHz and HyperTransport speeds spanning from 1,400 to 2,100 MHz. As the lowest cost family, Phenom X2 will launch after its higher-end cousins.

Next comes the Phenom X4 (code-named Agena), a quad-core part that also uses Sockets AM2 or AM2+. The “+”, by the way, primarily notes that the socket is compatible with Split Plane power. You can plant a Phenom into an AM2 socket and, provided you update the BIOS, it’ll still work fine. Two launch SKUs are expected: the GP-7000 (2.2 GHz, 1600 MHz HT) and GP-7100 (2.4 GHz, 1800 MHz HT). Both chips have an 89W TDP.

At the top of the Phenom ladder sits Phenom FX (code-named Agena FX). These chips, like the Opterons, are designed for multi-processor support and so work under AMD’s Quad FX architecture. The FX-80 is a 2.6 GHz part with a 1900 MHz HyperTransport bus designed for the 1207+ socket. The FX-90 sports the same specs but shifts into the Socket F format. Finally, the FX-91, also for Socket F, specs out at 2.8 GHz with a 2100 MHz HT link.
TDP numbers for Phenom FX have not been announced yet, which harkens back to our earlier comments about performance and power envelopes. AMD states that it is still tinkering with the inner workings, although an unofficial rating of 125W is already floating about. In theory, Phenom could mimic the power envelopes of Barcelona, but market demands can vary. Gamers may be willing to tolerate a higher power envelope to get that extra nudge in clock speeds. At the other end, a living room media server may benefit from a lower power envelope than an eco-friendly server because low thermals are essential for dropping fan noise.

The Phenom and Barcelona dies are nearly identical, although AMD notes that there are some small differences in the memory controller. The primary differences are (in the non-FX parts) the socket and the power/frequency configurations. This unified design should aid resellers in their sales efforts, since once you master the architectural and competitive benefits of Barcelona, you’ve pretty much got your Phenom pitch in the bag.

ASSESSING THE QUAD OPP

We drew a line early on between environmental and mechanical security. In fact, there is a gray zone between these two in which channel resellers can fit quite comfortably: physical asset protection.

According to Doug Bone, vice president of SYNNEX’s server group, there are three key markets for memory-intensive server apps, meaning those best suited to Barcelona. The first is engineering and scientific. This market codes for large data sets constantly being handled in RAM for modeling and simulations. The second, no surprise here, is databases, which keep as much information in memory as possible so as not to delay reads and writes by accessing disk storage. And third is virtualization for the reasons we outlined above.

The challenge is to match the market to the right chip. Barcelona model naming will follow the traditional path.
According to information published by Digitimes in mid-August, Barcelona will arrive as the Opteron 2300 and 8300 series for 2P and MP systems. AMD will not comment on these numbers until launch, but the 2300 line is expected to open with the 2347 (1.9 GHz, 95W, $320 in 1,000-unit quantities) and 2350 (2.0 GHz, 95W, $390). The highest SKU in this series so far is the 2360 (2.5 GHz, 120W), which has no pricing information yet. Similarly, no pricing is yet available for the 8300 series, although we do have some other information. The 8350 and 8347 will mirror the 23xx products in clock speed and TDP. AMD will simultaneously introduce the 2347 HE (1.9 GHz), 2346 HE (1.8 GHz), 2344 HE (1.7 GHz), and the 8347 HE and 8346 HE, all at 68W. There is no 8344 HE. The 8000 series skews toward the high end, but there will be a few low-bin 8000 models produced.
The Opteron 1xxx line for 1P servers will also receive a Barcelona refresh (code-named Budapest) in the fourth quarter. This line will stay in the AM2+ socket.

“In going to market, the first thing you have to figure out is if quad-core is right for a customer’s application,” says AMD’s John Fruehe. “It would be very disingenuous of us to say that quad-core solves all ills in the computing world and thus everything you purchase should be quad-core. The reality is that most applications are threaded to some degree but may not be threaded to fully take advantage of quad-core processors. You need to choose the right hardware for the application.”

To help push along the transition to multi-threaded software, AMD, like Intel, has released a collection of libraries and compiling tools to help ISVs code for multiple threads. Additionally, AMD’s math libraries will help developers code for 128-bit floating point support. AMD is fully aware that as core counts and processor capabilities increase, software is likely to become the bottleneck of the near future. Without these coding tools, the transition to quad-core and beyond will be slower and more painful, which in turn will mean fewer system sales.

Threads are one big consideration in the quad-core opportunity, but resellers working to design optimal Barcelona servers for clients also need to pay particular attention to memory configuration. We covered the power implications of memory earlier, but how Barcelona balances with memory amounts is no less important.

“When I started out in the server business,” says Fruehe, “512MB was a lot of memory for a server. Now, you’re seeing 8GB being the standard starting point. The scalability is becoming much more critical to servers. And you’re more than likely to outgrow a server in an environment where memory is scaling up quickly. The biggest misconception that people have is that throwing more cores at a problem will solve it.
They think twice as many cores will solve a problem twice as fast. The reality is that if you throw twice as many cores at a problem, you also need to be doubling the memory to keep the core-to-memory ratio consistent, otherwise you just create a new bottleneck. You need to think about it in terms of memory per core. If you’ve got two dual-core processors with 8GB of memory, as you go to quad-core, you have to think about 16GB, keeping a consistent 2GB of memory per core.”

Keep in mind that this sort of balanced memory scaling is a platform issue. Because of its design, Barcelona is more likely to benefit from system memory increases than Xeon. The application should indicate which platform is best, and the platform will dictate how to optimally allocate memory. More memory is not always better.
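
Fruehe’s memory-per-core rule of thumb is simple arithmetic, and a short Python sketch can make it concrete for sizing conversations. The function names below are hypothetical, and the figures are only the ones from his example: two dual-core processors with 8GB, moving to two quad-core processors.

```python
def memory_per_core(sockets, cores_per_socket, memory_gb):
    """Return GB of system memory available per core."""
    return memory_gb / (sockets * cores_per_socket)


def memory_for_ratio(sockets, cores_per_socket, gb_per_core):
    """Return the total GB needed to hold a given memory-per-core ratio."""
    return sockets * cores_per_socket * gb_per_core


# Existing server: two dual-core processors with 8GB of memory.
current_ratio = memory_per_core(sockets=2, cores_per_socket=2, memory_gb=8)

# Upgrade: the same two sockets populated with quad-core processors.
upgraded_memory = memory_for_ratio(sockets=2, cores_per_socket=4,
                                   gb_per_core=current_ratio)

print(f"Current ratio: {current_ratio:.0f}GB per core")
print(f"Memory needed after upgrade: {upgraded_memory:.0f}GB")
```

Leaving the memory at 8GB after the upgrade would halve the ratio to 1GB per core, which is the new bottleneck Fruehe warns about. Whether 2GB per core is the right target still depends on the application and platform.
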
Plenty of manufacturers are revamping their AMD lines for AM2+ and enhanced Socket F support. The general theme is one of face-lifting: same boards, new BIOS and socket capabilities. One of the tricks in taking Barcelona to market is to look for standout platforms that can really make the new core’s benefits pop.

One example would be Supermicro’s new SuperBlade, based on the quad-core Opteron 8000 family. Each SBE-710 series 7U enclosure can fit ten blades. With four sockets per board, that makes 16 cores per blade, 160 cores per enclosure, and 960 cores in a 42U rack. Add to this the fact that Supermicro’s SBA-7141M blade motherboard supports 64GB of memory, making up to 640GB per enclosure. The enclosure’s 3+1 redundant power supplies offer 93% efficiency, which is a great tie-in with Barcelona’s many power-saving advances.

“I think the reseller opportunity with Barcelona is huge,” says Michael Kalodrich, spokesman for Supermicro. “The people who like AMD are certainly anxious for quad-core. Some people’s applications are going to run better on AMD than Intel, and I think AMD is now right up there with Intel on power efficiency. Anybody involved in servers should find the Barcelona proposition very compelling.”

On the Phenom side, the story is just as strong for consumers, if not better. Applications that benefit from AMD’s new architecture span not only multi-threaded games but also content creation titles, such as those from Adobe, high-def video players, compression encoders/recoders (think Nero and Cyberlink), never mind the operating system. Not only do you have the quad-core and memory performance messages to take in front of everyone from gamers to mega-taskers, but you can also add AMD’s ATI Radeon HD 2000 GPU line to the mix. The GPUs may not perform any better on a Phenom platform than a Core 2 Quad, but many customers see value in procuring as many components as possible from a single vendor source.
Consumers can be just as ROI-minded as businesses.

“You’ve got Crysis, Bioshock, a lot of titles out now or coming soon that are going to take advantage of a native quad-core processor,” says Ian McNaughton, AMD senior product manager for desktop CPUs and GPUs. “This release is happening at the right time for our consumers, and it’s the right time for system builders. Schedule-wise, we think we’re bang-on. Plus, we think our backward- and forward-compatibility message is very strong. Customers don’t want to have to buy a new chipset and motherboard for every processor release, and builders don’t want that extra inventory pressure. Someone who spent $1,800 on an AM2 system can simply upgrade the processor if they so choose.”

Critics will say that AMD’s release schedule slipped; AMD will say that its timing meshes with today’s application environment. In the end, who cares? What matters is that resellers now have a new reason to get in front of customers and start pitching fresh solutions. Even if buyers don’t care about quad-core or power savings or floating point ops, AMD’s new processors are conversation starters. Maybe a discussion about memory bandwidth evolves into a network infrastructure upgrade. Maybe a talk about simulation crunching turns into converting 100 desktop stations into multi-monitor deployments. You never know. The migration into higher performance is the door opener. Once that door is open, though, the sky is the limit.
Copyright © 2007 RAM Magazine. All rights reserved.
Do not duplicate or redistribute in any form.