A White Paper

Media Processors
"Software-Driven Multimedia"

by
Chromatic Research, Inc.
Sunnyvale, CA


Editor's Preface: In several issues of 21st, we have been discussing the need for a new kind of multimedia PC architecture. Despite Intel's MMX approach, there is still only so much one can do when working within the confines of a traditional CPU. Digital signal processors (DSPs) have been mentioned as good candidates for use in these new multimedia systems. Some new DSP designs are being cast into even more specialized media processing roles, e.g., the Philips TriMedia. Apple Computer has recently announced that it will be using the TriMedia in some of its systems in 1997. Another specialized media processor is the Chromatic Research "Mpact" chip described in this white paper. Both the Philips device and the Chromatic chip use VLIW (Very Long Instruction Word), a software execution technique that enables on-chip parallelism. Apple's adoption of the TriMedia and, more significantly from Intel's perspective, the selection of the Chromatic Mpact by a number of large PC vendors validate this MMX-alternative approach.--Francis Vale



Multimedia was already difficult enough for PC OEMs to manage when "multimedia" meant taking a standard PC and simply adding a GUI accelerator, a sound card, a FAX/modem, speakers and a CD-ROM drive.

Today, the proliferation of integer operation-hungry multimedia technologies -- DVD (MPEG-2 video, Dolby Surround AC-3(tm) audio), 3D graphics, home movie editing (MPEG encoding), and videoconferencing, to name just a few -- makes it increasingly costly and risky to commit to dedicated silicon or add-in boards for every new technology the market demands. Many multimedia technologies are fast-moving targets due to changing standards, evolving APIs, and shifting consumer tastes, further increasing the risk that today's fixed-function silicon won't meet tomorrow's needs.

Making matters more difficult, the PC is penetrating deeper into the heart of the consumer electronics market, where communications, graphics, audio and video are the key buying factors. Consumer electronics buyers are far less tolerant of the "PC-quality" video and audio offered by mainstream PCs, and even less so of the high prices and incompatibilities forced on PCs by the melange of add-in cards and ASICs used to provide adequate quality. The home PC must compete for the same home entertainment budgets and in the same perceived quality space as a panoply of present and near-future digital-age products, from video game consoles and digital satellite television to DVD players and set-top boxes.

With such large markets at stake, one would expect a race within the semiconductor industry to develop a low-cost, mass-market approach to consumer electronics-quality multimedia for graphics, video, audio and communications, one that allows enough versatility to rapidly adapt to changing standards and consumer demands. In fact, the race is already on: Chromatic Research, Toshiba, LG Semicon, Philips, Mitsubishi, Fujitsu, Samsung, IBM and NEC have all announced plans to bring media processors to market.

A media processor can be defined as a software-programmable processor dedicated to simultaneously accelerating several multimedia data types. The software that enables specific multimedia functionality running on the media processor is called mediaware. The data types accelerated by the media processor and mediaware include graphics, video, audio and communications functions.

Because of their software-driven flexibility and ability to adapt to a wide range of applications, media processors may prove to be as significant for multimedia over the next 25 years as the microprocessor was for PCs 25 years ago. Indeed, MicroDesign Resources, the semiconductor research firm that publishes Microprocessor Report, estimates that the media processor market will quickly grow to 55.5 million units and $1.6 billion by the year 2000.

Supercomputer on a Chip

It is not enough for a media processor to change personalities on the fly, acting as an audio chip one microsecond and a video chip the next. Multimedia applications require multiple data types to be accelerated concurrently. In a likely 1997 scenario, a user may be playing a multi-player 3D game over the Internet, requiring the simultaneous acceleration of 3D graphics, 3D positional audio, GUI acceleration, and modem or other communications functions.

To ensure high-quality performance under a wide range of conditions, a system must be capable of performing thousands of MOPS (millions of integer operations per second). This level of performance enables software-driven acceleration of a wide range of multimedia functions, although some, such as HDTV playback, "Jurassic Park" quality high-end 3D graphics, and virtual reality, elude today's first-generation media processors.

Increasing Integer Operations/s Required For Various Multimedia Technologies

As the market demand for more multimedia quality and functionality continues to increase over time, the multimedia technologies that meet that demand require increasing integer compute power, even taking into account the inevitable improvements in algorithm efficiency.

To obtain this level of sustained MOPS performance with the flexibility of software-driven functionality, a new approach is needed. Media processors are the result of new thinking, a fresh approach to the challenge of bringing high-quality multimedia functionality to the mainstream. Unbridled by legacy requirements of older technology, media processor manufacturers have had the luxury of starting with a clean slate when designing their components and how they integrate into systems.

For example, media processors use architectures akin to supercomputers, including high-bandwidth, parallel processing, single-instruction/multiple-data (SIMD), very long instruction word (VLIW) execution and vector processing. Also critical to media processor performance are off-chip memory bandwidth, a real-time kernel, and a well-integrated software environment. (See Mpact: A New Multimedia Performance Level.)

Small, dedicated logic blocks that perform highly performance-sensitive operations, such as those found in graphics, are selectively used to complement the media processor's programmable core. The media processor's instruction sets also typically contain several specialized instructions aimed at the performance-sensitive "inner-loop" functions used extensively in discrete cosine transforms, bit block transfers, adaptive filtering and other multimedia-related data processing functions.
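As a rough illustration of the SIMD idea mentioned above, the sketch below packs four 8-bit values into one 32-bit word and adds two such words lane by lane, clamping each lane at 255 (a saturating add, common in pixel and audio arithmetic). The function names and the four-lane, 8-bit layout are assumptions chosen for illustration and do not describe any actual media processor instruction. In hardware a single instruction would process all lanes at once; the Python loop only models the result.

```python
LANES = 4          # assumed lane count for this illustration
LANE_BITS = 8      # assumed lane width
LANE_MAX = (1 << LANE_BITS) - 1


def pack4(lanes):
    """Pack four 8-bit unsigned values into one word, lane 0 in the low byte."""
    if len(lanes) != LANES:
        raise ValueError("expected exactly four lanes")
    word = 0
    for i, value in enumerate(lanes):
        if not 0 <= value <= LANE_MAX:
            raise ValueError("lane value out of 8-bit range")
        word |= value << (LANE_BITS * i)
    return word


def unpack4(word):
    """Split a packed word back into its four 8-bit lanes."""
    return [(word >> (LANE_BITS * i)) & LANE_MAX for i in range(LANES)]


def simd_add_saturate(a, b):
    """Add two packed words lane by lane, clamping each lane at 255."""
    out = 0
    for i in range(LANES):
        shift = LANE_BITS * i
        lane_sum = ((a >> shift) & LANE_MAX) + ((b >> shift) & LANE_MAX)
        out |= min(lane_sum, LANE_MAX) << shift
    return out


if __name__ == "__main__":
    a = pack4([10, 200, 255, 0])
    b = pack4([20, 100, 1, 0])
    print(unpack4(simd_add_saturate(a, b)))
```

Saturation matters for media data because a plain wraparound add would turn an over-bright pixel dark instead of leaving it at full brightness.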

The immense power of media processors enables a single subsystem to replace the multiple subsystems currently used to perform all the required multimedia functions, including:

  • VGA, Windows GUI and video acceleration
  • 3D graphics rendering
  • MPEG encoding and decoding
  • Standard and advanced audio functions, e.g. Dolby Digital(tm) AC-3
  • FAX/modem
  • Telephony
  • Videoconferencing

The media processor can perform combinations of these individual multimedia tasks simultaneously.

By using a single, high-powered media processor with re-targetable transistors to handle all multimedia functions, the PC OEM can eliminate much of the expensive hard-wired, function-specific circuitry that sits idle when its particular function is not needed by the application. A programmable approach also eliminates much of the hard-wired silicon that is typically redundant across several subsystems.

Sharing the same memory space among all the multimedia functions provides additional benefits as well. With video and 3D graphics sharing the same memory space in PC applications, for example, texture mapping of real-time video frames onto 3D objects becomes more manageable. It becomes possible for game developers to use MPEG-compressed video sequences to represent characters instead of the more limited number of uncompressed bitmaps they now use, resulting in more realism and a better game experience for the players.

Media Processors And PCs

The significant performance and cost efficiencies made possible by the media processor's unique approach are making multimedia extremely attractive for a wide range of commercial and consumer electronics products. Media processors' broad applicability is reflected in the different markets being targeted by these first-generation devices, such as home and corporate PCs, PC add-in boards, PC/TVs, set-top DVD/gaming/PC machines and more.

Several factors have converged recently to make the media processor practical in today's PC marketplace. First, the availability of 0.50 and 0.35 micron semiconductor fabrication facilities enables the required number of transistors to fit on a die small enough to be economically feasible. Because media processors can be used in so many different applications, their huge volumes will also drive costs down significantly. Second, high-bandwidth memories such as Rambus DRAM provide the media processor with the required data throughput to accelerate multiple data types simultaneously. Third, Microsoft's introduction of DirectX(tm) APIs establishes a common standard for many multimedia functions, relieving the media processor vendor from the chipmaker's traditional burden of evangelizing their proprietary standards to the ISV community to ensure compatibility.

Media processors designed for the personal computer market have some specific characteristics and requirements that may make them differ from media processors designed for other applications. For example, appropriate real-time support is lacking at the Ring 0 (or kernel) level of PC operating systems, so a real-time kernel is required at the Ring 0 level to support concurrent, low-latency multimedia. Thus, a media processor provides maximum benefit when used with its own real-time kernel.

Also, knowing that a Pentium-class or better host CPU will be present allows the media processor designer to simplify the design and keep costs down by letting the host CPU do some of the work. A PC-based media processor must also support legacy hardware standards, such as sound cards and VGA, and adapt well to the PC architecture, including key host bus interfaces. A multimedia subsystem using a media processor could easily be implemented directly on the PC motherboard, or as a plug-in card.

Other Multimedia Acceleration Technologies

A single chip that is capable of performing different multimedia functions is not a new idea. Digital signal processors (DSPs) can be programmed to perform a number of multimedia functions and have been used in audio and telephony applications, including modems, speech processing and sound generation. However, DSPs lack the power and bandwidth to perform more than one multimedia task at a time, and cannot typically be programmed to perform graphics or video-based functions.

Other multifunction solutions simply combine multiple, static functions on a single chip. While these chips have the bandwidth to perform complex graphics and video functions, and can perform more than one task at a time, their functions are hard-wired, with dedicated silicon for each task, and cannot have new functions added or existing functions updated. The static-function approach lacks flexibility and is inefficient in its use of silicon, requiring higher transistor counts and larger, more costly dies as well as larger off-chip memories. Any time a function is not being used, the associated circuitry sits idle. Functions that once had their own private memory space must now be made to share one, requiring some form of additional memory management kernel. Additionally, these chips cannot keep up with the fluid nature of the multimedia market with its constantly changing needs and standards, nor do they provide manufacturers with the efficiency of using a single part across a range of products over time.

Complementing MMX(tm)

The software-driven approach to multimedia acceleration on PCs got a significant boost with the announcement of Intel's plans to build the P55C: a Pentium-class CPU with multimedia extensions to its instruction set (MMX). Multimedia extensions on CPUs can add valuable, incremental multimedia compute cycles to PCs with media processors present. Media processors, because they are software driven, are the most flexible complement to MMX-enhanced CPUs, creating even greater overall multimedia performance by giving the OEM and user access to a combined pool of MOPS integer compute power.

MMX by itself will not be sufficient to replace the other accelerator technologies used to make today's basic multimedia PCs, nor is it likely to single-handedly meet the needs of tomorrow's PCs. User demand for more and better multimedia will continue to outpace improvements in the CPU's multimedia extensions. (Today's average multimedia system with a 28.8Kbps modem, 2D-only graphics, and 16-bit FM synthesis sound will be about as appealing in 1997 as 486-based machines are today.) After all, the CPU's first priority must always be to ensure compatibility and performance for an operating system, user applications and legacy system requirements.

Working together, however, the combined performance of media processors and MMX-enhanced CPUs will be able to meet most user demand for multimedia performance. A good analogy for the value obtained through the pairing of a media processor with an MMX-enhanced CPU is the way a GUI accelerator chip helps a standard x86 processor. While x86 processors are capable of GUI acceleration alone, the quality of the user experience is substantially improved by adding a GUI accelerator for a very small added cost relative to the cost of the x86. For this reason, GUI accelerators are found in virtually every PC manufactured today.

For a small added cost relative to the cost of the MMX-enhanced CPU, a media processor can replace the GUI accelerator in the PC while providing a compelling improvement to the user's total multimedia experience for graphics, audio, video and communications.

A Boon to OEMs

The media processor approach to multimedia must be compelling to convince PC OEMs to alter their five-year-old engineering and business model for building and selling PCs, in which fixed-function silicon is added for each new market requirement. In fact, what makes media processors so interesting is that they appear to have the compelling advantages needed to quickly penetrate the PC market. Consider:

  • Media processors allow PC OEMs to provide more multimedia features at far less cost than traditional fixed-function and add-in card approaches.
  • New technologies like DVD, 3D and videophone are making fixed-function silicon approaches to acceleration less and less cost-effective.
  • Technologies such as DVD and 3D also require interaction between data types (MPEG-2 and Dolby AC-3 synchronization in DVD; 3D graphics and 3D positional audio, for example) that makes piecemeal solutions more difficult to construct.
  • Adding multimedia functions via mediaware gives PC OEMs unprecedented fast time to market for new features.
  • Media processors' power and private multimedia subsystem approach provide better concurrency than combinations of more expensive fixed-function chips.
  • Media processors' price/performance enables new digital entertainment, education and communications opportunities for "PCs".
  • The risk of being stuck with an outdated feature set on motherboard inventory is minimized since new multimedia features can be added quickly through software.
  • Media processors deliver higher, consumer electronics-quality multimedia.
  • A media processor's multimedia features are designed to work with one another, eliminating conflicts and driver inefficiencies as well as many compatibility testing requirements.
  • Preservation of motherboard real estate enables new form factors.
  • Media processors provide OEMs with a potential after-market revenue stream through mediaware upgrades.

The Best Solution for Multimedia

The continuing escalation in demand among users for the latest and/or highest-quality multimedia features will force computer, consumer electronics and other manufacturers to harness all the multimedia processing power they can, as affordably as possible. Media processors -- built specifically for multimedia's natural data types, dedicated to the task, mass manufactured for low cost, and flexible via mediaware -- are the best solution for these manufacturers to keep pace with this demand.

The media processor is perhaps the most significant silicon invention since the microprocessor. Media processors have the potential first to raise the baseline for PC multimedia, then to transform the PC into new form factors for entertainment, education and communications, and finally, to add life to a wide variety of machines that can benefit from a more natural interface.

Limitations To Fixed-Function Approach

Current multimedia PC architectures utilize separate accelerator chips and/or PCI and ISA add-in cards to boost the PC's multimedia performance. This approach falls short of the necessary quality and concurrency requirements. PCI bus latency, bus bandwidth limitations, and bus interrupt contention issues, along with main memory bandwidth limitations, VGA and other legacy hardware support issues, and lack of real time support under Windows, all conspire to make delivery of seamless high quality multimedia a daunting challenge for both the hardware OEM and the applications developer.

Suppose a 3D card does not support VGA. Does that card work well with the existing VGA hardware? If the new 3D card does have VGA support, is that support full-featured enough to support legacy DOS games? Does the audio pop when an incoming fax is received and the hard drive is accessed?

The use of many discrete, fixed-function silicon parts to accelerate the growing number of multimedia standards -- from DVD to 3D graphics to 3D positional audio -- is quickly reaching the point of diminishing returns for performance, as well as cost. A fresh approach to building multimedia subsystems is needed.

The Mpact media processor from Chromatic Research, Inc. solves the multimedia function integration problem by implementing all of the multimedia functions in software called "mediaware" on a single, very powerful VLIW SIMD processor under the umbrella of an efficient real-time operating system running on the media processor.

Gone are multiple interrupts and possible interrupt contention, as there is only one interrupt to the host for all the "devices" implemented by the media processor.

Also gone is the high-bandwidth PCI bus traffic that occurs among add-in cards and between add-in cards and the host and main memory. Instead, in an Mpact-based PC, the high-bandwidth memory traffic is confined to a 500Mbytes/s Rambus memory subsystem that is private to the Mpact processor. This single memory subsystem is used for all of the media data types as well as the software running on the media processor.

The Mpact approach also eliminates the multiple, potentially conflicting drivers from different vendors running at the Ring 0 level under Windows. Instead, a single interrupt handler running at Ring 0 receives the Mpact interrupt, determines which virtual device is the requestor, and signals the responsible Mpact driver code running at the Ring 3 level.
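The single-interrupt arrangement described above can be sketched as a small dispatcher. This is only an illustration under stated assumptions: the paper does not say how Mpact identifies the requesting virtual device, so the status bitmask, the bit positions and the device names below are all hypothetical.

```python
class InterruptDispatcher:
    """Models one shared interrupt that fans out to per-device driver code."""

    def __init__(self):
        # Maps a status bit position to (device name, driver callback).
        self._handlers = {}

    def register(self, bit, name, callback):
        """Associate a hypothetical status bit with a virtual device driver."""
        if bit in self._handlers:
            raise ValueError("status bit already registered")
        self._handlers[bit] = (name, callback)

    def handle(self, status):
        """Signal the driver for every pending bit, lowest bit first.

        Returns the names of the virtual devices that were serviced.
        """
        serviced = []
        for bit in sorted(self._handlers):
            if status & (1 << bit):
                name, callback = self._handlers[bit]
                callback()
                serviced.append(name)
        return serviced


if __name__ == "__main__":
    dispatcher = InterruptDispatcher()
    dispatcher.register(0, "audio", lambda: print("audio driver signaled"))
    dispatcher.register(1, "modem", lambda: print("modem driver signaled"))
    dispatcher.register(2, "video", lambda: print("video driver signaled"))
    # One interrupt arrives with the audio and video bits set.
    print(dispatcher.handle(0b101))
```

The point of the design is that the host sees one interrupt source and one small handler, while the per-device work happens in separate driver code that the handler merely signals.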

These multimedia functions, with their associated drivers and mediaware modules, are all developed and integrated by Chromatic, eliminating the conflicts and inefficiencies typically suffered by the OEM or user when integrating disparate parts.

The Mpact media processor provides more than enough compute resources to implement the needed concurrency. With first-generation Mpact media processors providing 3000 to 3600 MOPS (millions of operations per second) of sustained integer compute power, 9 Gbytes per second of on-chip bandwidth, and a 500 to 600 Mbytes per second private memory subsystem, it appears to the user as if the multimedia functions are implemented by separate dedicated hardware components functioning together flawlessly.

Because of their fresh approach to multimedia integration, media processors offer OEMs a leap in multimedia price/performance at a time when affordable fixed-function silicon is encountering diminishing returns.


The Media Processor

Media processors are defined as programmable processors dedicated to simultaneously accelerating several multimedia data types. Because they are programmable, media processors provide obsolescence insurance against the changing requirements and standards of multimedia data processing. Media processors are dedicated to processing multimedia data, in contrast with standard x86 host processors, so that their architectures can be specialized to processing these data types in the most cost-effective manner possible. Their high bandwidth and fast integer performance allow media processors to simultaneously accelerate different multimedia data types, providing the seamless concurrent functionality required of multimedia PCs.

New products that typify the media processor include Mpact, designed by Chromatic Research and manufactured and sold by Toshiba and LG Semicon; TriMedia from Philips; Mfast from IBM; and the MediaProcessor from MicroUnity.

Media Processor Benefits

The programmability of media processors provides important benefits over discrete fixed-function multimedia solutions:

Lower costs: Using a single programmable device, numerous discrete function devices can be replaced. Less board real estate is required and there are fewer parts for the PC OEM to inventory. There are lower software support costs because the tight integration offered by the media processor minimizes system configuration conflicts, and there are lower hardware support costs because the decrease in component count improves system reliability.

Longer product lifetime: Media processors offer obsolescence insurance because the multimedia functions they implement can be upgraded as standards change, and new functionality can be added as new standards emerge, opening the door to aftermarket upgrade software sales.

Greater manufacturing flexibility: Late binding manufacturing is facilitated because a single hardware platform can be easily targeted for different markets by simply changing software.

Improved baseline performance: Substantially raising the bar on the base PC multimedia performance creates new possibilities for software developers, spurring development of the overall market for multimedia.

What the Media Processor is Not

The media processor is not a general purpose microprocessor, it is not a DSP in the classic sense, and it is not an integration of fixed function chips.

A general purpose microprocessor such as an x86 dedicates significant transistor resources to the execution of large applications requiring virtual addressing with a paged virtual memory addressing system, and to provide a robust protected execution environment for a wide variety of software applications. A general purpose microprocessor may also dedicate significant transistor resources to multiprocessing support.

In contrast, the more controlled and application-specific execution environments of media processors allow them to dedicate their transistor resources to the efficient acceleration of streaming multimedia data types, which are predominantly integer only. Media processors readily adopt architectural techniques usually seen in supercomputers, such as VLIW, SIMD, and vector processing.

The strengths of media processors and those of general purpose microprocessors often complement each other, allowing media processors such as the Chromatic Mpact to adopt a cooperative computing coprocessor approach to multimedia compute needs. With this type of load sharing model, as the processing power of both the media processor and the host processor improves over time, more opportunities will be created for additional multimedia task concurrency, providing more features at lower cost.

Media processors also possess features that differentiate them from DSPs. Standard DSPs typically do not include framebuffers with GUI acceleration and they do not include support for legacy PC needs such as VGA and industry standard soundcard support. These latter two capabilities greatly facilitate integration of the media processor onto the PC motherboard. DSPs typically operate standalone, whereas media processors exhibit tight integration with OS APIs and often implement task load sharing with the host. DSPs traditionally used to implement audio or modem functionality are typically less powerful than media processors by an order of magnitude.

Because media processors are programmable, the same silicon resources are reused when implementing different functions, in contrast to chips that are integrations of discrete function blocks. These latter devices are perhaps more appropriately described as media accelerators, with products such as the NVidia NV1, and GUI accelerator chips that incorporate video support in the form of colorspace conversion and scaling representing typical examples.

New APIs a Key to Media Processors

New APIs such as the Microsoft DirectX APIs, which provide a consistent, abstract view of low-level hardware resources to application software yet are very efficient, are key to the emergence of the media processor. These APIs permit the application developer to develop software without detailed knowledge of the underlying multimedia hardware, and at the same time free the media processor designer to pursue whatever architecture provides the best multimedia compute solution so long as the API interface is implemented.

Private Memory a Common Characteristic

A common characteristic of current media processors is the need for a private memory system to satisfy the high bandwidth and low latency requirements of streaming multimedia data types. Two leading technology choices for these private memories are Rambus DRAM and Synchronous DRAM.

The bandwidth requirements of display refresh, GUI acceleration, 3D acceleration/texture mapping, and video make it unattractive to share the precious memory bandwidth that the host CPU needs for the Windows OS and application software. By utilizing a private memory system, the PCI bus can be insulated from the high bandwidth and low latency requirements of the media processor.
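To give a feel for the numbers, the sketch below estimates the memory bandwidth consumed by display refresh alone and compares it with two budgets: the 500 Mbytes/s private Rambus figure quoted earlier in this paper and the roughly 133 Mbytes/s theoretical peak of a 32-bit, 33 MHz PCI bus. The display mode (1024x768, 16 bits per pixel, 75 Hz) is an assumption chosen for illustration and does not come from the paper.

```python
# Budgets used for comparison.
RAMBUS_BYTES_PER_S = 500_000_000    # private memory figure quoted in this paper
PCI_PEAK_BYTES_PER_S = 133_000_000  # approximate peak of 32-bit, 33 MHz PCI


def refresh_bandwidth(width, height, bytes_per_pixel, refresh_hz):
    """Bytes per second read from the framebuffer just to refresh the display."""
    return width * height * bytes_per_pixel * refresh_hz


if __name__ == "__main__":
    # Assumed display mode: 1024x768, 16 bits per pixel, 75 Hz.
    bw = refresh_bandwidth(1024, 768, 2, 75)
    print(f"display refresh: {bw / 1e6:.1f} Mbytes/s")
    print(f"share of private Rambus budget: {bw / RAMBUS_BYTES_PER_S:.0%}")
    print(f"share of PCI peak: {bw / PCI_PEAK_BYTES_PER_S:.0%}")
```

Under these assumptions, refresh traffic alone would take most of the PCI peak before any GUI, 3D or video traffic is counted, which is the kind of load a private memory system keeps off the shared bus.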


Benefits of Software-Based Approach

The programmable paradigm for multimedia advocated by Intel with MMX and by Chromatic Research with Mpact provides obsolescence protection for both the OEM and consumer through upgradability, and provides economies of scale for PC OEMs by allowing the creation of common hardware platforms that support late binding of product feature sets.

Because Mpact is a complete, integrated solution, its mediaware-driven multimedia functions and software drivers work seamlessly together, eliminating the inefficiencies and conflicts associated with multi-vendor, multi-card solutions.

Because the transistors on the media processor are "soft" (not dedicated to any specific multimedia function) and are retargeted to meet the PC's multimedia compute needs on a real-time demand basis, the resulting implementation is much more efficient from a hardware standpoint than fixed-function solutions, where the hardware dedicated to a specific function sits idle when not in use. This hardware efficiency translates to fewer components, less board area, higher reliability, fewer integration problems and lower cost.

Transparent Acceleration Under Win 95 APIs

Mpact mediaware accelerates the PC's multimedia needs through support of Windows 95 DirectX APIs such as DirectDraw, DirectSound, Direct3D and DirectVideo. The software developer need only conform to the Win 95 API specifications in order to obtain the full benefit of Mpact multimedia acceleration. Mpact's dynamic load sharing capabilities and MMX-savvy mediaware also give ISVs seamless access to MMX compute power.

Mpact System Software

The Mpact system software consists of the Windows 95-resident Mpact Resource Manager (MRM), the media processor-resident Mpact Real-Time Kernel (MRK), associated mediaware modules that implement each of the multimedia functions, and software drivers. (See diagram 1)

One of the responsibilities of the Mpact Resource Manager is to coordinate the load sharing between the host and the Mpact media processor. MMX offers even greater load sharing options for Mpact-based systems because, as application software makes calls on the DirectX APIs, the MRM can dynamically decide whether to hand the work to a mediaware module running on the Mpact processor, to a mediaware module running on the host that can take advantage of MMX, or to both.

Mpact Real-Time OS

Sufficient real-time support is lacking at the Ring 0 (or kernel) level of the Windows OS, which makes concurrent support of multimedia functions difficult regardless of the host processor's capabilities. In a multimedia PC that uses Mpact, the real-time functionality needed at the Ring 0 level to support concurrent low latency multimedia is transferred to the Mpact media processor, where it runs under the Mpact Real-Time Kernel (MRK).

Mpact Mediaware Operation

The operation of the various Mpact software components can be illustrated with a 3D sound example. When an application, such as a game, makes a call to a DirectX API to play a 3D sound, the matching Mpact DirectX driver makes the required resource allocation request to the Mpact Resource Manager. The MRM then verifies that sufficient system resources (compute, memory, bandwidth, etc.) are available. At this point the MRM can service the request in three ways, presented in order of descending preference:

1) Service the request using an Mpact-resident 3D mediaware module. An audio buffer for the application would then be allocated in RDRAM.

2) Service the request using an MMX-savvy, host-resident 3D audio mediaware module.

3) Let the DirectX module handle the request (perhaps by emulation).

Assuming the MRM selects option 1, the following actions occur. If the resource allocation request would cause the Mpact to become oversubscribed, the MRM orchestrates a graceful backoff of one or more other tasks running on the Mpact processor. For example, 2D GUI acceleration performance or the modem data rate can be reduced to accommodate the needs of the new request. Backoff preferences may be specified by the user.
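The preference order and the graceful backoff described above can be sketched as follows. This is a minimal model, not Chromatic's implementation: the class and field names, the abstract "compute units", the rule that tasks are backed off in list order, and the fallback to options 2 and 3 when backoff cannot free enough capacity are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    load: int       # compute units currently granted (hypothetical unit)
    min_load: int   # lowest level this task may be backed off to


@dataclass
class ResourceManager:
    capacity: int                      # total units on the media processor
    tasks: list = field(default_factory=list)
    host_has_mmx: bool = False

    def used(self):
        return sum(t.load for t in self.tasks)

    def _back_off(self, needed):
        """Free `needed` units by reducing tasks in list order.

        List order stands in for the user's backoff preferences.
        Nothing is changed unless the full amount can be reclaimed.
        """
        reclaimable = sum(t.load - t.min_load for t in self.tasks)
        if reclaimable < needed:
            return False
        for t in self.tasks:
            give = min(t.load - t.min_load, needed)
            t.load -= give
            needed -= give
            if needed == 0:
                break
        return True

    def request(self, name, load):
        """Return where the request is serviced, in descending preference."""
        free = self.capacity - self.used()
        if load <= free or self._back_off(load - free):
            # Option 1: media processor resident module.
            # Assumption: a newly admitted task is not itself reducible.
            self.tasks.append(Task(name, load, load))
            return "mpact"
        if self.host_has_mmx:
            return "host-mmx"            # option 2
        return "directx-emulation"       # option 3


if __name__ == "__main__":
    rm = ResourceManager(
        capacity=100,
        tasks=[Task("gui-2d", 50, 20), Task("modem", 40, 30)],
        host_has_mmx=True,
    )
    print(rm.request("audio-3d", 30))          # admitted after backing off gui-2d
    print([(t.name, t.load) for t in rm.tasks])
    print(rm.request("video", 50))             # cannot fit, falls to the host
```

In this model the 3D audio request is admitted by trimming 2D GUI acceleration, while a later request that cannot be accommodated even after backoff is handed to the host.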

The Mpact 3D audio mediaware module processes the audio data placed in the buffer by the application to provide the spatialization effects, and buffers the data for output to the audio codec.

The 3D sound data is then sent to an audio codec by the Mpact Audio Process Manager, which is a real-time mediaware module running on the Mpact processor. The Audio Process Manager itself runs as a real-time thread under the Mpact resident Real-Time Kernel. The Audio Process Manager and the Real-Time Kernel ensure that there are no interruptions in delivering the audio samples to the codec regardless of other Mpact system activities. The application need only keep the audio data buffer reasonably full to ensure quality sound output.
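The division of labor described above, in which the application keeps a buffer reasonably full and a real-time consumer drains it at a steady rate, can be modeled with a simple bounded buffer. The class name, the capacity and the choice to pad an underrun with silence are assumptions for illustration; the paper does not describe the actual buffer format or underrun behavior.

```python
from collections import deque


class AudioBuffer:
    """Bounded sample queue shared by an application and a codec feeder."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._samples = deque()
        self.underruns = 0

    def write(self, samples):
        """Application side: append as many samples as fit.

        Returns the number of samples accepted.
        """
        room = self.capacity - len(self._samples)
        accepted = list(samples)[:room]
        self._samples.extend(accepted)
        return len(accepted)

    def read(self, count):
        """Codec side: take `count` samples, padding with silence if short."""
        available = min(count, len(self._samples))
        out = [self._samples.popleft() for _ in range(available)]
        if available < count:
            # The application fell behind; record it and output silence.
            self.underruns += 1
            out.extend([0] * (count - available))
        return out


if __name__ == "__main__":
    buf = AudioBuffer(capacity=8)
    print(buf.write(range(1, 11)))   # only 8 of the 10 samples fit
    print(buf.read(4))               # steady drain, no underrun
    print(buf.read(6))               # buffer runs dry, padded with zeros
    print(buf.underruns)
```

As long as the application refills the buffer before it empties, the consumer never pads with silence, which corresponds to the "reasonably full" condition in the text.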

Despite the sophistication of the Mpact software operating below the DirectX API layer, the application developer need only code to the Microsoft APIs to realize the full benefit of Mpact's compute power, MMX-savvy mediaware, and real-time load balancing kernel.

Adding Value with the Mpact Media Processor

A good analogy for the value obtained by the pairing of a media processor with an MMX-enhanced x86 is provided by pairing a GUI accelerator chip with an x86. The x86 is capable of GUI acceleration alone, but for a small cost adder relative to the cost of the x86, the quality of the user's experience can be substantially improved. For this reason, the GUI accelerator is found in virtually all PCs today. For a small cost adder relative to the cost of the MMX-enhanced x86, the Mpact media processor replaces the GUI accelerator in the PC while providing a compelling improvement in the user's multimedia experience, bringing it closer than ever before to consumer electronics-quality.

Availability

Chromatic Research will provide final, production-level mediaware modules, as well as other Mpact system software components, to its OEMs this summer.

Mpact is a trademark of Chromatic Research, Inc. All other trademarks or registered trademarks are property of their respective owners.

21st, The VXM Network, http://www.vxm.com
