By Chris Angelini
When Intel first unveiled its Hyper-Threading technology, making it possible for one processor with one core and one set of execution resources to operate on two separate threads concurrently, everyone got excited about simultaneous multi-threading and the performance improvements it would introduce. Desktop enthusiasts anticipated their first taste of affordable, hardware-accelerated multitasking, while enterprise customers looked forward to multiplying the compute horsepower of single- and dual-processor servers. But Hyper-Threading wasn't a fire-and-forget feature that yielded immediate results. Instead, software had to be well-optimized.

More recently, dual-core processors have almost completely replaced their single-core predecessors in AMD's and Intel's respective product lineups. At first, those chips came with a caveat. Because dual-core processors doubled up on resources (and, in turn, complexity), they didn't hit the same clock speeds as single-core CPUs built using the same architecture and manufacturing process. That meant single-threaded software, in many cases, executed more slowly on the newer chips. Of course, both vendors have since tweaked their architectures to outpace what any single-core chip could do. But the underlying issue remains: Without software to complement all of the work AMD and Intel put into multi-core designs, the benefit to your customers remains minimal.

What's the Problem?

Perform a task the same way for long enough, over and over, and you'll develop a habit. Habits are often very difficult to break, whether you're talking about smoking, eating fast food, or writing inefficient code. In the past, a software developer could count on AMD and Intel to ratchet up processor performance by tuning their execution pipelines, boosting clock speed, and piling more cache onboard. Critics called that approach a free lunch: developers rode advances in hardware to accelerate increasingly bloated programs.
It turns out that as we usher in dual- and now quad-core CPUs, incremental speed bumps won't mean nearly as much as they once did. Instead, more cores and more cache will propel processor performance forward. Cache is the near-term answer. Cores are the longer-term solution.
"It used to be that everything was a battle of speeds and feeds," says Jim Prelack, channel marketing manager in Intel's developer products division. "More gigahertz bought you additional performance. Developers weren't motivated to open up their code and optimize because better application performance was virtually guaranteed by new hardware. That is no longer the case, really. If you look at the load that is applied these days on computing platforms, it's only going to rise. If you look at the number of threads executing at any given point in time, you're seeing email function calls, antivirus calls, and management calls that are reaching into the system; there could be 20, 40, 80 threads running at a time. Those threads run better the more cores you have present on the platform."

The problem, then, is that CPU vendors are abandoning the serial scaling that promised us 10 GHz chips. We're looking at increasingly parallel designs. Software that would have performed very well given a linear frequency increase isn't scaling as smoothly.

Making the most of today's hardware advancements means writing code in a new and different way. Above all, software must be written with threading in mind if it's to harness the potential of an Athlon 64 X2 or Core 2 Duo processor. Although threaded software has been around for many years, single-threaded apps are still far more common. Developers in the habit of writing them now need to make a transition so that resellers can start leveraging parallelism as a powerful selling point.

If the transition were easy, it would have happened back when Intel unveiled Hyper-Threading, and everyone's software would already be optimized. But it isn't. The most daunting challenge presented by multi-core processors is the sheer difficulty of writing applications that complement them.
Many programmers compare the change in mindset to moving to object-oriented coding from a structured-programming background, but with less forgiving consequences when something goes wrong.
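
To make the shift in mindset concrete, here is a minimal sketch of a threaded sum in standard C++. It is only an illustration, not taken from any Intel tool: the function name `threaded_sum` and the choice of one contiguous slice per worker are assumptions made for this example. It also hints at the "less forgiving consequences": each thread writes only to its own slot, because several threads adding into one shared total without synchronization would be a data race.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Hypothetical helper (not from the article or any Intel tool): sums a vector
// by giving each worker thread its own contiguous slice of the data.
long long threaded_sum(const std::vector<int>& data, unsigned workers) {
    if (workers == 0) workers = 1;  // hardware_concurrency() may report 0
    std::vector<long long> partial(workers, 0);  // one result slot per thread
    std::vector<std::thread> pool;
    const std::size_t chunk = (data.size() + workers - 1) / workers;

    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t begin = std::min(data.size(), w * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        // Each thread writes only partial[w]. Adding into a single shared
        // total here without a lock or atomic would be a data race.
        pool.emplace_back([&data, &partial, w, begin, end] {
            partial[w] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0LL);
        });
    }
    for (auto& t : pool) t.join();  // wait for every slice before combining
    return std::accumulate(partial.begin(), partial.end(), 0LL);
}
```

A caller would typically pass `std::thread::hardware_concurrency()` as `workers` so the number of slices tracks the number of cores the platform reports.
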
Fortunately, software developers don't have to tackle the issue of writing threaded code on their own. In fact, resellers can do their customers a tremendous service by introducing them to a handful of tools specifically written to facilitate concurrent applications. It should come as no surprise that the lion's share of those tools is offered by Intel, the company most intimately familiar with the multi-core hardware foundation you've been advocating. They range from the compilers used to build code, to performance libraries loaded with threading-optimized functions, to the analysis tools that help with fine-tuning.

Stepping Up With Solutions

Software development is a multi-stage process that isn't necessarily linear. One customer might have a proprietary interface used to record the flow through gas pipelines, while another wants to build a property listing service from scratch. Both developers need to be turned on to the benefits of threading so that they can analyze potential gains. But from there, they'll likely take divergent paths as they optimize.

No matter the stage in development, anyone looking to get elbow-deep in code will want to start with the right compiler. Intel's C++ and Fortran compilers support Windows, Linux, and Apple's Mac OS, optimizing for any environment where your customer expects to find Intel hardware. The tools' features aren't specific to the NetBurst or Core architectures, though. Rather, the compilers target the x86-32 and x86-64 instruction sets, so they support CPUs from both AMD and Intel. While system builders may favor one hardware vendor over another, ISVs are more concerned with compatibility, so it's important to remember that development tools are about getting the most out of multi-core hardware above all else.

You can also expect your customers to use more than one compiler package.
It might seem natural to employ Microsoft's Visual Studio package when writing an app that'll run on Windows Server 2003 boxes, for example. After all, you can download the Express versions of Visual Basic, Visual C++, Visual C#, and Visual J# directly from Microsoft's Web site free of charge. At the same time, bundling the Intel solution gives you cross-platform support for Linux and Mac operating systems in addition to enabling the most comprehensive threading optimizations.

"Intel's compiler requires Visual Studio, which includes Microsoft's compiler," says Intel's Prelack. "So while in some ways Intel competes against Microsoft, the two products are more complementary than anything. Developers will use the Microsoft compiler in certain stages of their project and then switch to the Intel compiler when it comes to writing code that touches hardware."

Although the compilers represent a solid first step toward helping ISVs take better advantage of multi-core processors, actually writing thread-friendly code is still very challenging. Intel offers a trio of performance libraries loaded with optimized functions to help cut back on development time while fully exploiting x86 hardware. Each library sits between the silicon and your customer's application, dynamically determining the best function call to make based on the processor it finds. Naturally, the most complete libraries are those able to recognize the newest hardware and handle it accordingly.

Best of all, Intel's libraries are modular. Once an application is adapted for Intel's software tools, it does not need to be rewritten to take advantage of an upcoming processor architecture. The developer simply needs to get the updated library, recompile the app, and link against it.

The Integrated Performance Primitives, for example, zero in on multimedia performance, including functions that optimize audio, video, imaging, speech recognition, and codec components.
Software developer partners get access to downloadable code samples and application updates to keep the Performance Primitives current with the latest hardware architectures.

Intel's Math Kernel Library is similarly set up to help developers optimize for threading. Most of the package's functions center on threaded math routines, which stand to benefit immensely from the compute power of dual- and quad-core processors. Research labs, government organizations, and banks use the Math Kernel Library to ensure top-notch performance in speed-sensitive environments.

Finally, the Threading Building Blocks package targets a third group: developers getting their feet wet in concurrent software. It includes several parallelized algorithm templates that can quickly be customized, sparing developers from rewriting older, non-threaded code from the ground up. So rather than wasting time on low-level thread mechanics, your customer can focus on task-oriented programming.
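
The idea of a parallelized algorithm template can be sketched in standard C++. To be clear, the code below is not the Threading Building Blocks API; `parallel_for_each_index` and `scale` are hypothetical names invented for this illustration, and the fallback of two workers when the core count is unknown is an assumption. The point is only the division of labor: the template owns the thread mechanics, and the caller supplies the per-element task.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a parallel loop template; this is NOT the
// Threading Building Blocks API, only the same idea in standard C++.
// Calls body(i) once for every i in [first, last), spread across threads.
template <typename Body>
void parallel_for_each_index(std::size_t first, std::size_t last, Body body) {
    if (first >= last) return;
    unsigned workers = std::thread::hardware_concurrency();
    if (workers == 0) workers = 2;  // core count unknown: assumed fallback
    const std::size_t count = last - first;
    workers = static_cast<unsigned>(std::min<std::size_t>(workers, count));
    const std::size_t chunk = (count + workers - 1) / workers;

    std::vector<std::thread> pool;
    for (std::size_t begin = first; begin < last; begin += chunk) {
        const std::size_t end = std::min(last, begin + chunk);
        // Each thread handles one contiguous block of indices.
        pool.emplace_back([begin, end, &body] {
            for (std::size_t i = begin; i < end; ++i) body(i);
        });
    }
    for (auto& t : pool) t.join();  // all blocks finish before returning
}

// The caller writes only the per-element task, not the thread handling.
std::vector<double> scale(const std::vector<double>& in, double factor) {
    std::vector<double> out(in.size());
    parallel_for_each_index(0, in.size(), [&](std::size_t i) {
        out[i] = in[i] * factor;  // each index is written by exactly one thread
    });
    return out;
}
```

A production library would typically reuse a pool of threads and balance uneven work rather than spawn fresh threads on every call, which is a large part of what a packaged template library saves a developer from building.
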
All of the applications addressed by Intel's Performance Primitives are suited to concurrency; that is, they're very likely to show handsome gains when they're executed on an Athlon 64 X2, Core 2 Duo, or Core 2 Quad platform. Other applications don't enjoy the same boost. Intel's Thread Checker 3.0 helps ISVs determine where threading can be leveraged most effectively and, once the code is threaded, plays an integral role in catching errors in threaded software. If your customer uses Thread Checker with an Intel compiler, threading errors can be tracked down to a line of code. Of course, compatibility extends to Microsoft's Visual Studio and Visual C++ environments as well, with explicit support for 64-bit processor architectures.

The complement to Thread Checker 3.0 is Intel's Thread Profiler, which takes concurrent code and combs over it, looking for areas where performance might be improved. By comparing the number of cores against processor utilization, the Thread Profiler determines the parallel performance of your customer's software.

Once the libraries have been called, the functions analyzed, and the code compiled, software developers want an in-depth look at how their product is behaving. Intel's VTune is the envelope that wraps around the application and provides feedback on how system resources are putting demands on the application and vice versa. It's designed to accommodate very large programs, so there are several ways to generate reports for easier viewing, from "trapping" through functions individually to viewing a graphical map. VTune is incredibly comprehensive, supporting 32- and 64-bit profiling through Visual Studio 2005, Windows Vista, the Core 2 Duo/Quad architecture, and multiple operating environments, including Linux.

Find Value in Development

Given the complexity of writing software able to benefit from multi-core processors, it's no wonder that many developers are dragging their feet on adoption.
As AMD and Intel put an end to the free lunch of rapid-fire clock speed increases and gravitate toward more scalable parallel architectures, applications need to take better advantage of the gains those processors offer. Writing efficient code and optimizing that code for threading will make the most marked difference in performance moving forward. And without those optimized apps, the dual- and quad-core whiteboxes you sell today won't show their full value.

We've already established that the paradigm shift isn't an easy sell. Games, arguably the most demanding applications your customer will run, are still predominantly single-threaded. An increasing number of multimedia apps do incorporate some heavy multi-core optimizations, and Microsoft has even tuned Excel 2007 for concurrent hardware. But smaller developers, the ones working on custom software in mission-critical environments, could use some help from the reseller community as they evaluate the potential of threading and look for the tools to make their jobs easier.

So even if smaller developers can't justify a migration to threading in today's code, they'll almost certainly want to optimize for it in the future. That means making an investment in the software tools able to make threading easier.

As a means of protecting the investment your customers make in development tools, Intel extends support to each of its offerings. The Performance Libraries, for instance, include a year of support services, highlighted by application notes, documentation, and every product update made available through Intel's site. VTune similarly comes with Intel's Premier Support package.

There's also something in it for resellers. Intel works with its Channel Partner Program members, who provide feedback on the development products. In return, those VARs get access to leads, marketing funds, extra margin, and a dedicated sales support team.
The value in software development is clear: By selling the tools your customers can use to write better multi-threaded applications, you increase the value of both your multi-core platforms and their threaded software. And although learning to write concurrent apps is challenging, the free lunch is a thing of the past, meaning it's time for developers to get more proactive about their code. Be the guiding hand that shows your customers what they have to gain by picking up a few good tools and rolling up their sleeves.
Copyright © 2007 RAM Magazine. All rights reserved.
Do not duplicate or redistribute in any form.