by Denis Pelli
July 9, 1998
Speed is good. My impression from the benchmarks below is that each computer has two independent dimensions: computing speed and built-in-video speed. Computing speed is determined primarily by processor clock rate (and presence of L2 cache), and only slightly by processor type (601, 603e, 604). [7/9/98 update: floating point is 1.8 times faster on the 604e and G3 processors than on the 601. Upgrades are cheap.] If you're looking for compute power, you should focus on processor clock rate. However, video speed is determined by transfer rate over whatever bus the video is on. Built-in video, which is on the motherboard memory bus, is typically twice as fast as video on an external bus (whether NuBus or PCI). The current memory bus speed limit on Apple's computers is 50 MHz, which is attained by the computers whose processor clock rate is an integer multiple of that: 100, 150, 200, ... MHz. Witness the excellent 36 MB/s video performance attained by the built-in video of the 7500/100. The best PCI-card rate is about half that, a mere 17 MB/s.
Several colleagues have asked how to choose among Apple's new PowerMacs and the (few remaining) clones. As far as doing vision experiments goes, I suggest you make your decision based on the price and the memory bus speed (rather than the more commonly emphasized processor speed). The memory bus speed is nonmonotonically related to processor speed. Apple's older PCI PowerMacs (7300, 7500, 7600, 8500, 8600, 9500, 9600) have a memory bus on the motherboard that can run at any speed up to 50 MHz. The 7200 is the exception: its bus can run at up to 60 MHz (at least when tweaked by Power Computing). The new G3 PowerMacs have a 66 MHz memory bus. All the CPUs run faster than that, so their clock rate is divided down, by a small integer, to produce a memory bus rate no higher than the maximum allowed, e.g. 50 MHz. Thus each Mac has two clock rates: the processor and the memory bus. (I'm not counting the PCI bus, which always runs at 33 MHz.) For ordinary computations I think that the most important rate is the processor rate. For showing movies via the built-in video the memory bus is the limiting rate, as has been thoroughly documented by TimeVideo (see VideoSpeed). (Showing movies via the PCI bus seems to be roughly half as fast, at best, as via built-in video.) For maximum memory bus speed you should buy a PowerMac whose processor clock rate is an exact integer multiple of its maximum memory bus rate, e.g. the no-longer-sold 7500/100 or the 8600/150. Note that the 7500, 7600, 8500, 8600, 9500, and 9600 all have their processor on a removable daughter card, and that you can upgrade to a faster one.
Mostly you'll care about processor clock rate, since computing speed is proportional to it, but note too that the 604e and G3 processors deliver floating point performance that is 1.8 times that of the 601 processor. [You will want to distinguish the processor's clock rate--which determines speed of computation--from the memory bus's clock rate--which determines how big a movie you can show.]
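To make "how big a movie you can show" concrete, here is a rough sketch in Python (chosen only for illustration) that converts a measured blitting rate into the largest frame you could refresh on every video frame. The function name, the 75 Hz frame rate, and the reading of 1 MB as 10^6 bytes are all my assumptions, not figures from TimeVideo; the 36 and 17 MB/s rates are the built-in-video and PCI-card rates quoted above.

```python
import math

def max_pixels_per_frame(video_mb_per_s, frame_rate_hz, bits_per_pixel=8):
    """Largest number of pixels that can be copied to the screen in one frame.

    Assumes 1 MB = 1,000,000 bytes and that the whole frame time is
    available for copying (both are simplifying assumptions).
    """
    bits_per_frame = video_mb_per_s * 1_000_000 * 8 / frame_rate_hz
    return math.floor(bits_per_frame / bits_per_pixel)

full_screen = 640 * 480  # pixels in a 640x480 frame

# Built-in video at 36 MB/s versus a PCI card at 17 MB/s,
# both at a hypothetical 75 Hz frame rate and 8 bits per pixel.
for name, rate in (("built-in video", 36), ("PCI card", 17)):
    pixels = max_pixels_per_frame(rate, 75)
    verdict = "fits" if pixels >= full_screen else "does not fit"
    print(f"{name}: {pixels} pixels per frame; 640x480 {verdict}")
```

Under these assumptions a full 640x480, 8-bit movie fits within the built-in-video rate but not within the PCI-card rate, which is why the factor of two matters in practice.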
MacSpeedZone may still have a good table of processor and bus speeds.
Note that the 9500 and 9600 have no built-in video, so you are limited to whatever video rate you can achieve on a PCI bus card, generally about half of what you could achieve with on-board video.
The 7200 (no longer sold) was an attractively priced, fast machine, for vision work. It is important to buy extra VRAM for it, because unless you do, it only uses half the bandwidth of the bus in accessing video memory. This big speedup (unique to the 7200) is documented by TimeVideo (see VideoSpeed).
Note that the 7300/7500/7600/8500/8600 custom video driver that allows 120 Hz frame rate is only compatible with those models, not the 7200, 9500, or 9600.
VIDEO SPEED
Note that the speed of the memory bus (which determines video speed) has two dimensions: clock rate, and width (i.e. 32 or 64 bits). It's my impression that many of Apple's computers with the 603e PowerPC processor (e.g. the PowerBooks) have disappointingly low video rates and that this is attributable to using that processor with a narrower (32-bit wide) memory bus, unlike the 601 and 604 processors, which have a 64-bit wide memory bus. Motorola's data sheets say the 601 and 604 have a "high performance 64-bit data bus", whereas the "low power" 603 and 603e have a "selectable 32- or 64-bit data bus". (The 603e differs from the 603 in having twice as big a cache.) Apple has used the 603 and 603e in PowerBooks (to save power) and Performas (to save money), not in the business desktop machines: 6100, 7100, 8100, 7200, 7500, 7600, 8500, 8600, 9500, which all use the 601 and 604 processors. For applications that need fast video you want a 64-bit wide bus; only consider a computer with the 603 or 603e processor if it has a 64-bit wide bus.
In a nutshell, across all Macs, the video data rate (MB/s) that you'll attain in showing a pre-computed movie will be about half as fast for a video card as for built-in video, because the built-in video is accessed through a fast memory bus on the motherboard. For the PowerMacs, the most widely quoted "speed" is the processor clock rate (75 to 250 MHz), but the built-in-video rate is determined more by the speed of the memory bus of the motherboard (40 to 60 MHz), which is a fraction of processor clock rate. You can put a faster processor in your PowerMac, or add an external clock to increase the clock rate, but current motherboard designs can run the memory bus at no more than 50 or 60 MHz. The memory bus clock is derived from the processor clock by dividing down the processor clock rate by a small integer to achieve a memory bus rate within the speed limit. (As I understand it, the division takes place within the processor chip itself.) This division has the surprising consequence that increasing the processor clock rate (e.g. from 100 to 132 MHz) may reduce the memory bus rate (from 50 down to 44 MHz in a 7500 or 7600 or 8500). In this example, using an external clock to raise the new processor's speed from 132 up to 150 MHz would preserve the original memory bus speed of 50 MHz. Similarly, the motherboard video clock rate is higher on the 7500/100 (100/2=50 MHz) than on the 8500/120 (120/3=40 MHz).
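The divide-down arithmetic can be sketched in a few lines of Python. This is only my reading of the rule described above: I assume the processor picks the smallest integer divisor that keeps the memory bus at or below the motherboard's limit. The function name and the default 50 MHz limit are illustrative, not taken from Apple or Motorola documentation.

```python
import math

def memory_bus_mhz(processor_mhz, bus_limit_mhz=50.0):
    """Return (divisor, bus rate in MHz) for a given processor clock.

    Assumption: the divisor is the smallest integer that brings the
    memory bus rate down to the limit or below.
    """
    divisor = max(1, math.ceil(processor_mhz / bus_limit_mhz))
    return divisor, processor_mhz / divisor

# The processor clock rates mentioned in the text.
for cpu in (100, 120, 132, 150):
    divisor, bus = memory_bus_mhz(cpu)
    print(f"{cpu} MHz / {divisor} = {bus:.0f} MHz memory bus")
```

With a 50 MHz limit this reproduces the cases in the text: 100/2=50, 120/3=40, 132/3=44, and 150/3=50 MHz. Passing bus_limit_mhz=60 would model the 7200's higher limit.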
It appears that the "64" and "128" in the names of the PCI video cards correlate with big differences in transfer speed over the PCI bus, apparently the "width" of the transfer.
Jack Van Olst asks, "but the MacWeek reviews showed that several cards outpaced internal video by 100%, especially on Photoshop scaling, Word scrolling, etc. Are your tests measuring something else?" The answer is yes: MacWeek is measuring performance of graphics applications that use many features of QuickDraw, some of which are speeded up by on-board processing ability of these cards. TimeVideo measures blitting speed, nothing else, which doesn't benefit at all from on-board processing. For my experiments that's what matters.
Also see ComputerSpeed (above).
Here are data rates for showing movies on a few machines:
The quoted rate is the best, across all pixel depths and across CopyBits and CopyBitsQuickly, which are usually within +/-20% of each other. CopyBits, being part of QuickDraw, always runs native. My homemade blitter, CopyBitsQuickly, uses BlockMoveDataUncached(), if possible (available only on PCI Macs), and otherwise uses BlockMoveData(). BlockMoveDataUncached runs about 10% faster than CopyBits, about 40 MB/s on a 7500/100.
The first four lines in the table above were a big surprise to me. CopyBits is slowest on the 9500, faster on the 7200 and 8500, and yet faster on the 7500. This is confirmed by an engineer at Apple, "Yes, this actually does make sense: The 7500 and 8500's internal video is on a private PCI bus that runs synchronous with the processor bus. Thus, on the 7500/100 it's running at 100/2=50 MHz and on the 8500/120 it's running at 120/3=40 MHz. The 9500's video is on a regular PCI bus that runs at 33 MHz."
Subject: PM 7200 Vram/Acceleration/VM
From Macintosh PCI Discussion List, MACPCI-L@MITVMA.MIT.EDU
Date: 11 Jun 1996
From: Mike Sherrill
I've just added Vram to my 7200/90 to 2MB and noticed a few things in the process of testing it:
- 7200's with 1mb Vram, VM disabled, Modern Mem manager disabled produce ~25mb/sec throughput @ 640x480
- 7200's with 2mb Vram, VM disabled, Modern Mem manager disabled produce ~34 mb/sec throughput @ 640x480
- Turning on Modern Mem manager or VM reduces throughput by 40-60% on any Vram configuration.
All of my testing has been done with the 7200 acceleration extension turned on under system 7.5.3 rev 2 with 24meg ram, I don't use ram doubler or speed doubler.
Based upon this info, I think Apple may need to revise the VM routines so they don't conflict with systems video routines. Anybody know if adding a 512k+ cache card will benefit these tests or minimize the conflicts with VM?
Apple says, "Currently [6/10/96], all PCI-based Power Macintosh computers that are capable of accepting processor upgrade cards will be able to support future Processor Upgrade Cards of up to 200 MHz." Powertools is selling PowerSource 150 MHz 604 upgrade cpu cards for PowerMac 7500, 8500, and 9500. Costs $699 if you trade in your old 601 processor. (Apparently not all L2 cache cards are compatible with 150 MHz processors.) XLR8 is selling their PowerEdge 604 150MHz processor for $899. They offer a rebate "for old working processor daughter cards of $200 for 604 units and $100 for 601".
The PowerMac 7200 and 7500 are very appealing. (The 8500 and 9500 seem overpriced.) All the PCI Macs have two busses: the PCI bus always runs at a fixed 33.3 MHz; the processor bus runs from 40 to 50 MHz, depending on the model or CPU card (Noah Price, noah@apple.com). The 7200 is quite similar to the 7500, having lower clock rate, lower internal SCSI speed, and non-upgradable soldered-in processor chip. The functional difference between the 7500 and the 8500 seems to be just the processor, processor-bus speed, and L2 cache (7500/100 has 601 running at 100 MHz, 50 MHz internal bus, and L2 cache is optional; 8500/120 has 604 running at 120 MHz, 40 MHz internal bus, and L2 cache is included). MacWeek reported that moving the 604 daughterboard and the L2 cache from an 8500 to a 7500 yielded the same performance. All the reviewers love the case of the 7200 and 7500 (easy to access internals) and hate the 8500 minitower case (requires extensive disassembly to add memory). Based on those specs, I think the best buy would be the 7200--add an L2 cache for more video speed--closely followed by the 7500, with the 8500 seeming to be a lot more money for very little more in performance. However, the benchmarks below reveal that the 7500 video speed is significantly higher than all other Macs, presumably because of its 50 MHz internal bus. Furthermore, an Apple engineer has produced for us a custom version of the 7500/8500 video driver supporting a 120 Hz frame rate; it's included in the VideoToolbox.
Ken Alexander, kennalex@uic.edu, reported, and it's confirmed by Apple, that the video driver on the 7500 and 8500 suppresses the VBL interrupt while it's loading the CLUT, so any VBL-interrupt-based task (e.g. a frame counter) will miss a frame each time you call cscSetEntries to load the CLUT. Calling cscSetEntries once per frame to synchronize your code with the display should work fine, but it'll be slightly tricky to check for overrun (i.e. trying to do too much between cscSetEntries calls), since a simple VBL-interrupt-based frame counter will run differently on different Macs. I've just received documentation on extra video driver calls, supported only by this driver, that disable the interrupt suppression. See 7300/7500/7600/8500/8600 driver notes.
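One possible way to check for overrun without relying on a VBL-interrupt frame counter is to time the interval between successive cscSetEntries calls with some clock that is independent of the VBL interrupt. This is my own suggestion, not something from the driver documentation. The sketch below is in Python purely to show the arithmetic; the function name and the sample timestamps are hypothetical, and obtaining suitably precise timestamps on the Mac is left open.

```python
def count_missed_frames(call_times_s, frame_rate_hz):
    """Count frames skipped between successive once-per-frame calls.

    call_times_s: times (in seconds) at which each call returned,
    taken from a clock that does not depend on the VBL interrupt.
    Each interval is rounded to a whole number of frame periods;
    anything beyond one period counts as missed frames.
    """
    period_s = 1.0 / frame_rate_hz
    missed = 0
    for earlier, later in zip(call_times_s, call_times_s[1:]):
        frames_elapsed = round((later - earlier) / period_s)
        if frames_elapsed > 1:
            missed += frames_elapsed - 1
    return missed

# Hypothetical timestamps at 120 Hz: calls land on frames 0, 1, 2, 4, 5,
# so frame 3 was missed because too much was done between calls.
times = [n / 120 for n in (0, 1, 2, 4, 5)]
print(count_missed_frames(times, 120))
```

Rounding to the nearest whole frame tolerates small timing jitter, since calls that are synchronized to the display should be separated by very nearly an integer number of frame periods.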
MacWeek 10/9/95 page 42 said that the PowerMac 7200's video speeds up with more VRAM. As indicated in the table above, Tom Busey tried adding 1 MB of VRAM (costing $85) to the standard 1 MB on his 7200/90, and found a 33% increase of CopyBits speed.
Power Mac 7100 NuBus bug: Michael Eckert (meckert@ee.uts.edu.au) writes, "From what I understand, a bug in the 7100's NuBus interface chip limits the throughput of the NuBus port. This does not affect built-in VRAM video or DRAM video (the AV option, in PDS slot). This bug affects only the 7100 (not 8100), and is supposed to be fixed during the next release of the machines." October 12, 1994.
Also see ComputerSpeed.
CONTRIBUTORS to the VideoSynch, VideoBugs, & VideoSpeed documents.