
MAC TECH


Volume Number: 15 (1999)
Issue Number: 8
Column Tag: Tips & Tidbits

Tips and Tidbits

by Jeff Clites, tips@mactech.com

Blistering Blitting

The June issue of MacTech highlighted some ways to move pixels around, but there was no mention of a little-known pair of PPC assembly instructions, lmw and stmw, that can be used to speed up pixel blitting (and all memory-moving functions, for that matter).

When the data is aligned, a normal memory move takes 4 cycles to execute. These two instructions are very powerful because they take 3 + n cycles to move n words, so each additional word moved costs only 1 cycle. The instructions behave differently on different PPCs: the 601 treats them like a series of individual word loads and stores, while the 603e, 604, and G3 treat them like a multimove.
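
To make the arithmetic concrete, here is a minimal C sketch of the two cycle counts quoted above. The function names are hypothetical, and the per-word reading of the "4 cycles" figure is an assumption; the tip does not say whether that number covers one word or a load/store pair.

```c
#include <assert.h>

/* Cycle model quoted in the tip: one lmw or stmw that moves n words
   costs 3 + n cycles. There are only 32 general-purpose registers,
   so n can never exceed 32. */
static unsigned multiword_cycles(unsigned n_words)
{
    assert(n_words >= 1 && n_words <= 32);
    return 3u + n_words;
}

/* ASSUMPTION: the tip's "4 cycles" is read here as the cost of each
   individual aligned word move, so n words cost 4 * n cycles. */
static unsigned single_word_cycles(unsigned n_words)
{
    assert(n_words >= 1 && n_words <= 32);
    return 4u * n_words;
}
```

Under this reading, a six-word transfer costs 9 cycles with lmw or stmw instead of 24 with individual moves, and the gap widens as more registers are used.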

There are a few restrictions on the following code:

  1. It works fastest when the baseAddrs are aligned.
  2. The code is for 8-bit pixels (tip: change r23 to 12 for 16-bit pixels and to 6 for 32-bit pixels).
  3. The width must be a multiple of 24 for 8-bit pixels (12 for 16-bit pixels, and 6 for 32-bit pixels). If it isn't, the copy runs past the end of each scan line. Most of that overrun is overwritten when the next scan line is copied. You have been warned! (Any destination world should have a few extra words of room in its baseAddr buffer.)
  4. This assumes the same palette for the source and destination worlds (in 8-bit mode).
  5. This assumes the same depth for the source and destination worlds.
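
The width and depth restrictions can be checked from C before calling the routine. The sketch below is one possible pre-flight test, not part of the original tip; pixels_per_chunk, speedcopy_args_ok, and is_word_aligned are hypothetical helper names, and the 24-byte chunk size is taken from the six-word transfer the routine performs on each pass.

```c
#include <assert.h>
#include <stdint.h>

enum { CHUNK_BYTES = 24 };  /* six 32-bit words per loop pass */

/* Pixels moved per loop pass: the value the tip loads into r23
   (24 for 8-bit, 12 for 16-bit, 6 for 32-bit pixels). */
static long pixels_per_chunk(int depth_bits)
{
    assert(depth_bits == 8 || depth_bits == 16 || depth_bits == 32);
    return CHUNK_BYTES / (depth_bits / 8);
}

/* Returns 1 when restrictions 3 and 5 hold, 0 otherwise. */
static int speedcopy_args_ok(long width, int src_depth, int dst_depth)
{
    if (src_depth != dst_depth)
        return 0;                       /* restriction 5: same depth */
    if (width <= 0)
        return 0;
    return width % pixels_per_chunk(dst_depth) == 0;  /* restriction 3 */
}

/* Restriction 1: the copy runs fastest on word-aligned base addresses. */
static int is_word_aligned(const void *p)
{
    return ((uintptr_t)p & 3u) == 0;
}
```

Note that a common width such as 640 pixels fails the 8-bit check (640 = 26 × 24 + 16), so a caller would have to round the copied width to a multiple of 24 or fall back to another blitter.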

The following is an example of how to move pixels three times faster than the fastest method presented in Fast Blit Strategies.

export SpeedCopy[DS]
export .SpeedCopy[PR]
toc
	tc SpeedCopy[TC],	SpeedCopy[DS]	;TOC entry "SpeedCopy" for
																	;transition vector "SpeedCopy"
		csect	SpeedCopy[DS]			 		;Define transition vector "SpeedCopy"
		dc.l		.SpeedCopy[PR]	 				;Pointer to code
		dc.l		TOC[tc0]								;Pointer to TOC
		dc.l		0
# Prolog: SpeedCopy
;void SpeedCopy(long height, long width, long destRowbytes,
;		unsigned long *dest, long srcRowbytes,	unsigned long *src);
		csect	.SpeedCopy[PR]			;Prolog begins here
		;r3	= dest.height
		;r4	= dest.width
		;r5	= dest.rowbytes
		;r6	= dest.baseAddr
		;r7	= src.rowbytes
		;r8	= src.baseAddr
		stmw		r22,-40(SP)					;Save r22 thru r31 (10 words = 40 bytes)
														;below SP
		li			r22, 1							;constant 1 for the line countdown
		li			r23, 24							;pixels moved per pass (8-bit pixels)
@lineLoop:
		mr			r12,r4							;x = dest.width
		mr			r10,r6							;tmpdest = dest
		mr			r11,r8							;tmpsrc = src
@pixelLoop:
		lmw		r26,0(r11)					;Load 4 + 4 + 4 + 4 + 4 + 4 bytes from 
														;tmpsrc into r26 thru r31
		subf.	r12,r23,r12					;Subtract num pixels from total, test against 0
		addi		r11,r11,24					;Advance tmpsrc by 24 bytes, whatever
														;the pixel size
		stmw		r26,0(r10)					;Store r26 thru r31 (24 bytes) to tmpdest
		addi		r10,r10,24					;Advance tmpdest by 24 bytes, whatever
														;the pixel size
		bgt		@pixelLoop					;Loop if the subtraction is greater than 0
		subf.	r3, r22, r3				;Subtract one line from height, test against 0
		add		r8,r8,r7						;Add src.rowbytes to src.baseaddr
		add		r6,r6,r5						;Add dest.rowbytes to dest.baseaddr
		bne		@lineLoop					;Loop if height not equal to 0
		lmw		r22,-40(SP)					;Restore r22 thru r31
		blr
end
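
For readers who want to check results against a known-good copy, the loop structure of the listing can be modeled in portable C. This is a sketch of the same row-by-row, 24-bytes-per-pass copy and not a replacement for the assembly. The name speed_copy_model is hypothetical, the parameters follow the register assignments r3 through r8, and the width is given in bytes, which equals the pixel width only in 8-bit mode.

```c
#include <assert.h>
#include <string.h>

enum { CHUNK_BYTES = 24 };  /* six 32-bit words per pass */

/* Reference model of the copy loop. memcpy stands in for the
   lmw/stmw pair; it does not reproduce their timing. */
static void speed_copy_model(long height, long width_bytes,
                             long dest_row_bytes, unsigned char *dest,
                             long src_row_bytes, const unsigned char *src)
{
    long y, x;

    /* Same restriction as the assembly: whole chunks only. */
    assert(height > 0 && width_bytes > 0);
    assert(width_bytes % CHUNK_BYTES == 0);

    for (y = 0; y < height; y++) {
        unsigned char *d = dest + y * dest_row_bytes;        /* tmpdest */
        const unsigned char *s = src + y * src_row_bytes;    /* tmpsrc  */
        for (x = 0; x < width_bytes; x += CHUNK_BYTES)
            memcpy(d + x, s + x, CHUNK_BYTES);
    }
}
```

Running both routines on the same pair of worlds and comparing the destination buffers is a quick way to confirm that the assembly touches only the intended pixels.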

With a few changes this code can pixel double, copy every other line, or both.
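
As an illustration of the every-other-line variant only, the C sketch below steps through the source two rows at a time. It is an assumption about what such a change might look like, not the author's code, and copy_even_lines is a hypothetical name. In the assembly, a comparable effect would presumably come from advancing the source by twice src.rowbytes per line and halving the line count.

```c
#include <assert.h>
#include <string.h>

/* Copies source lines 0, 2, 4, ... into consecutive destination lines. */
static void copy_even_lines(long src_height, long width_bytes,
                            long dest_row_bytes, unsigned char *dest,
                            long src_row_bytes, const unsigned char *src)
{
    long y;

    assert(src_height > 0 && width_bytes > 0);

    for (y = 0; y < src_height; y += 2) {
        /* Destination line y/2 receives source line y; odd lines are skipped. */
        memcpy(dest + (y / 2) * dest_row_bytes,
               src + y * src_row_bytes,
               (size_t)width_bytes);
    }
}
```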

The best way to make a machine go faster is to make it do less. This is one of only a handful of cases where you can do more with less.

Brad Anderson anderson@rpmmusic.com

 
MacTech Magazine. www.mactech.com
Toll Free 877-MACTECH, Outside US/Canada: 805-494-9797
MacTech is a registered trademark of Xplain Corporation. Xplain, "The journal of Apple technology", Apple Expo, Explain It, MacDev, MacDev-1, THINK Reference, NetProfessional, Apple Expo, MacTech Central, MacTech Domains, MacNews, MacForge, and the MacTutorMan are trademarks or service marks of Xplain Corporation. Sprocket is a registered trademark of eSprocket Corporation. Other trademarks and copyrights appearing in this printing or software remain the property of their respective holders.
All contents are Copyright 1984-2010 by Xplain Corporation. All rights reserved. Theme designed by Icreon.
 