3DNow! Technology Manual
© 2000 Advanced Micro Devices, Inc. All rights reserved.

The contents of this document are provided in connection with Advanced Micro Devices, Inc. ("AMD") products. AMD makes no representations or warranties with respect to the accuracy or completeness of the contents of this publication and reserves the right to make changes to specifications and product descriptions at any time without notice. No license, whether express, implied, arising by estoppel or otherwise, to any intellectual property rights is granted by this publication. Except as set forth in AMD's Standard Terms and Conditions of Sale, AMD assumes no liability whatsoever, and disclaims any express or implied warranty, relating to its products including, but not limited to, the implied warranty of merchantability, fitness for a particular purpose, or infringement of any intellectual property right.

AMD's products are not designed, intended, authorized or warranted for use as components in systems intended for surgical implant into the body, or in other applications intended to support or sustain life, or in any other application in which the failure of AMD's product could create a situation where personal injury, death, or severe property or environmental damage may occur. AMD reserves the right to discontinue or make changes to its products at any time without notice.

Trademarks
AMD, the AMD logo, 3DNow!, Athlon, and combinations thereof, are trademarks, and AMD-K6 is a registered trademark of Advanced Micro Devices, Inc. MMX is a trademark of Intel Corporation. Other product names used in this publication are for identification purposes only and may be trademarks of their respective companies.
21928G/0-March 2000
Contents
  Revision History
  3DNow! Technology
    Introduction
    Functionality
    Feature Detection
    Register Set
    Data Types
    3DNow! Instruction Formats
    Definitions
    Execution Resources on AMD-K6® Processors
    Task Switching
    Exceptions
    Prefixes
  3DNow! Instruction Set
    FEMMS, PAVGUSB, PF2ID, PFACC, PFADD, PFCMPEQ, PFCMPGE, PFCMPGT, PFMAX, PFMIN, PFMUL, PFRCP, PFRCPIT1, PFRCPIT2, PFRSQIT1, PFRSQRT, PFSUB, PFSUBR, PI2FD, PMULHRW, PREFETCH/PREFETCHW
  Division and Square Root
    Division
    Divide Examples
    Square Root
    Square Root Examples
List of Figures
  3DNow!/MMX Registers
  3DNow! Data Type
  Single-Precision, Floating-Point Data Format
  Integer Data Types
  Register X Unit and Register Y Unit Resources
List of Tables
  3DNow! Technology Exponent Ranges
  3DNow! Floating-Point Instructions
  3DNow! Performance-Enhancement Instructions
  3DNow! and MMX Instruction Exceptions
  Numerical Range for the PF2ID Instruction
  Numerical Range for the PFACC Instruction
  Numerical Range for the PFADD Instruction
  Numerical Range for the PFCMPEQ Instruction
  Numerical Range for the PFCMPGE Instruction
  Numerical Range for the PFCMPGT Instruction
  Numerical Range for the PFMAX Instruction
  Numerical Range for the PFMIN Instruction
  Numerical Range for the PFMUL Instruction
  Numerical Range for the PFRCP Instruction
  Numerical Range for the PFRCPIT1 Instruction
  Numerical Range for the PFRCPIT2 Instruction
  Numerical Range for the PFRSQIT1 Instruction
  Numerical Range for the PFRSQRT Instruction
  Numerical Range for the PFSUB Instruction
  Numerical Range for the PFSUBR Instruction
  Summary of PREFETCH Instruction Type Options
Revision History
Dates (in order of revision): 1998, 1998, 1998, 1998, Sept 1998, Sept 1998, Sept 1998, 1998, 1998, 1998, 1999, 1999, 2000
Description (in order of revision):
  Initial Release.
  Clarified CPUID usage in "Feature Detection".
  Revised the description of 3DNow! instructions in "Definitions".
  Revised the function descriptions in the table "3DNow! Floating-Point Instructions".
  Revised the code example in the PFRSQRT instruction.
  Changed the exceptions generated by the PREFETCH/PREFETCHW instructions to none, deleted the exception table, and revised the PREFETCHW description.
  Added the PUNPCKLDQ instruction to the division example (24-bit precision).
  Added sample code that tests for the presence of extended function 8000_0001h.
  Clarified the instruction descriptions of PFRCPIT1, PFRCPIT2, and PFRSQIT1.
  Added the PUNPCKLDQ instruction and clarified comments in the square root examples.
  Changed the variable in the Newton-Raphson recurrence definitions, and swapped the order of the PFMUL and PUNPCKLDQ instructions in the square root example (24-bit precision).
  Added references to the Athlon processor throughout the manual.
  Updated and clarified the PFACC instruction operation description.
3DNow! Technology

Introduction
3DNow! technology is a significant innovation to the x86 architecture that drives today's personal computers. 3DNow! technology is a group of new instructions that opens the traditional processing bottlenecks for floating-point-intensive and multimedia applications. With 3DNow! technology, hardware and software applications can implement more powerful solutions to create a more entertaining and productive platform. Examples of the type of improvements that 3DNow! technology enables are fast frame rates on high-resolution scenes, much better physical modeling of real-world environments, sharper and more detailed imaging, smoother video playback, and near theater-quality audio.

AMD has taken a leadership role in developing these new instructions that enable exciting new levels of performance and realism. 3DNow! technology was defined and implemented in collaboration with independent software developers, including operating system designers, application developers, and graphics vendors. It is compatible with today's existing x86 software and requires no operating system support, thereby enabling 3DNow! applications to work with all existing operating systems. 3DNow! technology is implemented on the AMD-K6®-2, AMD-K6-III, and Athlon processors.

The Athlon processor adds further 3DNow! technology instructions that target streaming and digital signal processing (DSP) technologies. For more information, see the Extensions to the 3DNow! and MMX Instruction Sets Manual, order# 22466.
Functionality
3DNow! technology instructions are intended to open a major processing bottleneck in a graphics application: floating-point operations. Today's applications are facing limitations due to the fact that only one floating-point execution unit exists in the most advanced x86 processors. The front end of a typical graphics software pipeline performs object physics, geometry transformations, and clipping, which are floating-point intensive and often limit the features and functionality of an application.

The source of performance for the 3DNow! instructions originates from the single instruction multiple data (SIMD) implementation. With SIMD, each instruction not only operates on two single-precision, floating-point operands, but the microarchitecture within the processor can execute up to two 3DNow! instructions per clock through two register execution pipelines, which allows for a total of four floating-point operations per clock. In addition, because the 3DNow! instructions use the same floating-point registers as the MMX technology instructions, task switching between MMX and 3DNow! operations is eliminated.

The 3DNow! technology instruction set contains 21 instructions that support SIMD floating-point operations and includes SIMD integer operations, data prefetching, and faster MMX-to-floating-point switching. To improve MPEG decoding, the 3DNow! instructions include a specific SIMD integer instruction created to facilitate pixel-motion compensation. Because media-based software typically operates on large data sets, the processor often needs to wait for this data to be transferred from main memory. The extra time involved with retrieving this data can be avoided by using the 3DNow! instruction called PREFETCH. This instruction can ensure that data is in the level 1 cache when it is needed. To improve the time it takes to switch between MMX and x87 code, the 3DNow! instructions include the FEMMS (fast entry/exit multimedia state) instruction, which eliminates much of the overhead involved with the switch.

The addition of 3DNow! technology expands the capabilities of the processor family and enables a new generation of enriched user applications.
Feature Detection
To properly identify and use the 3DNow! instructions, the application program must determine if the processor supports them. The CPUID instruction gives programmers the ability to determine the presence of 3DNow! technology on the processor. Software applications must first test to see if the CPUID instruction is supported. For a detailed description of the CPUID instruction, see the Processor Recognition Application Note, order# 20734.

The presence of the CPUID instruction is indicated by the ID bit (21) in the EFLAGS register. If this bit is writable, the CPUID instruction is supported. The following code sample shows how to test for the presence of the CPUID instruction.

    pushfd                  ; save EFLAGS
    pop   eax               ; store EFLAGS in EAX
    mov   ebx, eax          ; save in EBX for later testing
    xor   eax, 00200000h    ; toggle bit 21
    push  eax               ; put to stack
    popfd                   ; save changed EAX to EFLAGS
    pushfd                  ; push EFLAGS to top of stack
    pop   eax               ; store EFLAGS in EAX
    cmp   eax, ebx          ; see if bit 21 has changed
    jz    NO_CPUID          ; if no change, no CPUID

Once the software has identified the processor's support for CPUID, it must test for extended functions by executing extended function 8000_0000h (EAX=8000_0000h). The EAX register returns the largest extended function input value defined for the CPUID instruction on the processor. If the value is greater than 8000_0000h, extended functions are supported. The following code sample shows how to test for the presence of extended function 8000_0001h.

    mov   eax, 80000000h    ; query for extended functions
    CPUID                   ; get extended function limit
    cmp   eax, 80000000h    ; is 8000_0001h supported?
    jbe   NO_EXTENDEDMSR    ; if not, 3DNow! tech. not supported

The next step is for the programmer to determine if the 3DNow! instructions are supported. Extended function 8000_0001h of the CPUID instruction provides this information by returning the extended feature bits in the EDX register. If bit 31 in the EDX register is set to 1, 3DNow! instructions are supported. The following code sample shows how to test for 3DNow! instruction support.

    mov   eax, 80000001h    ; setup ext. function 8000_0001h
    CPUID                   ; call the function
    test  edx, 80000000h    ; test bit 31
    jnz   YES_3DNow!        ; 3DNow! technology supported

The processor supports all of the above features. Concatenating the code examples above will produce the basis of a detection software routine. A more comprehensive code example is available on the AMD website.
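The three detection steps can also be read as a small decision procedure. The following Python sketch only models that logic on register values; it does not execute CPUID. The names has_cpuid and supports_3dnow, and the cpuid callable that returns an (EAX, EBX, ECX, EDX) tuple, are hypothetical and introduced purely for illustration.

```python
ID_FLAG = 1 << 21           # EFLAGS bit 21 (ID)
EXT_BASE = 0x80000000       # extended function 8000_0000h
EXT_FEATURES = 0x80000001   # extended function 8000_0001h
BIT_3DNOW = 1 << 31         # EDX bit 31 of function 8000_0001h


def has_cpuid(eflags_before, eflags_after_toggle):
    """CPUID is present if EFLAGS bit 21 kept its toggled value."""
    return ((eflags_before ^ eflags_after_toggle) & ID_FLAG) != 0


def supports_3dnow(cpuid):
    """cpuid is a hypothetical callable: cpuid(eax) -> (eax, ebx, ecx, edx)."""
    max_extended, _, _, _ = cpuid(EXT_BASE)
    if max_extended <= EXT_BASE:
        # Mirrors the jbe branch: no extended functions, so no 3DNow!.
        return False
    _, _, _, edx = cpuid(EXT_FEATURES)
    return (edx & BIT_3DNOW) != 0
```

Passing the register values in as data keeps the sketch testable on any machine; a real routine would obtain them from the assembly sequences shown above.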
Register Set
The complete multimedia units in the processor combine the existing MMX instructions with the new 3DNow! instructions. In addition, by merging 3DNow! with MMX, it becomes possible to write programs containing both integer, MMX, and floating-point graphics instructions with no performance penalty for switching between the multimedia (integer) and 3DNow! (floating-point) units.

The processor implements eight 64-bit 3DNow!/MMX registers. These registers are mapped onto the floating-point registers. As shown in the figure "3DNow!/MMX Registers", the 3DNow! instructions refer to these registers as mm0 to mm7. Mapping the 3DNow!/MMX registers onto the floating-point register stack enables backwards compatibility for the register saving that must occur as a result of task switching.

Figure: 3DNow!/MMX Registers (eight 64-bit registers, mm0 to mm7)

Aliasing the 3DNow!/MMX registers onto the floating-point register stack provides a safe method to introduce 3DNow! technology, because it does not require modifications to existing operating systems. Instead of requiring operating system modifications, 3DNow! technology applications are supported through device drivers, 3DNow! libraries, or Dynamic Link Library (DLL) files. Current operating systems have support for floating-point operations and the floating-point register state. Using the floating-point registers for 3DNow! code is a convenient way of implementing non-intrusive support for 3DNow! instructions.

Every time the processor executes a 3DNow! instruction, all the floating-point register tag bits are set to zero (00b=valid), except for the FEMMS and EMMS instructions, which set all tag bits to one (11b=empty).

Note: Executing the PREFETCH instruction does not change the tag bits.
Data Types
3DNow! technology uses a packed data format. The data is packed in a single, 64-bit 3DNow!/MMX register or quadword memory operand. The figure "3DNow! Data Type" shows the 3DNow! floating-point data type. The two halves of the operand each hold an IEEE 32-bit single-precision, floating-point doubleword.

Figure: 3DNow! Data Type (64 bits holding two packed, single-precision, floating-point doublewords)

The figure "Single-Precision, Floating-Point Data Format" shows the IEEE 32-bit, single-precision, floating-point format.

Figure: Single-Precision, Floating-Point Data Format
  A 32-bit, single-precision, floating-point doubleword consists of a sign bit (S), a biased exponent, and a significand. The value X of the doubleword is defined as follows:
    1. X = (-1)^S * 0                                            if Biased Exponent = 0
    2. X = (-1)^S * 2^(Biased Exponent - 127) * 1.Significand    if 0 < Biased Exponent < FFh
    3. X = Undefined                                             if Biased Exponent = FFh
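The three value definitions can be checked with a short reference decoder. This is a minimal Python sketch, assuming the standard IEEE single-precision layout (sign in bit 31, biased exponent in bits 30:23, significand in bits 22:0); the function name decode_3dnow_single is hypothetical, and None stands in for "Undefined".

```python
def decode_3dnow_single(bits):
    """Decode a 32-bit pattern under the three value definitions above.

    Returns a Python float, or None when the biased exponent is FFh
    (the value is undefined for 3DNow! purposes).
    """
    sign = (bits >> 31) & 1
    biased_exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF

    if biased_exponent == 0xFF:
        return None                      # case 3: undefined
    if biased_exponent == 0:
        return -0.0 if sign else 0.0     # case 1: signed zero
    significand = 1.0 + fraction / 2**23  # hidden integer bit plus fraction
    return (-1.0) ** sign * 2.0 ** (biased_exponent - 127) * significand
```

Note that a pattern with a zero exponent and a nonzero fraction decodes to zero here, which follows case 1 literally rather than the IEEE denormal rule.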
The figure "Integer Data Types" shows the formats of the integer data types.

Figure: Integer Data Types (each 64 bits wide: eight packed bytes, four packed words, two packed doublewords, or one quadword)
3DNow! Instruction Formats
The format of 3DNow! instruction encodings is based on the conventional x86 modR/M instruction format and is similar to the format used by MMX instructions. The assembly language syntax used for the 3DNow! instructions is as follows:

    3DNow! Mnemonic mmreg1, mmreg2/mem64

The destination and source1 operand (mmreg1) must be an MMX register (mm0 to mm7). The source2 operand (mmreg2/mem64) can be either an MMX register or a 64-bit memory value.

The encoding uses an opcode prefix of 0Fh followed by a second opcode byte of 0Fh. To differentiate the various 3DNow! instructions, a third instruction suffix byte is used. This suffix byte occupies the same position in 3DNow! instructions as an imm8 byte would. The opcode format is as follows:

    0Fh 0Fh modR/M [sib] [displacement] 3DNow!_suffix

To determine the values used for modR/M [sib] [displacement], follow the conventional x86 encodings. The 3DNow! suffix is determined by the actual 3DNow! instruction. The 3DNow! suffixes are defined in the table "3DNow! Floating-Point Instructions". For example, the 3DNow! PFMUL instruction can produce different opcodes, depending on its use, for the following instruction forms:

    PFMUL mm1, mm2
    PFMUL mm1, [ebx]
    PFMUL mm1, [ebx+10]
    PFMUL mm1, es:[ebx]
    PFMUL mm1, [ebx+eax*4+10]

The two performance-enhancement instructions (FEMMS and PREFETCH) use a single opcode prefix of 0Fh. The details of the opcodes for these instructions are shown in their respective instruction definitions.
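To make the byte layout concrete, the sketch below assembles the bytes for the simpler PFMUL forms in Python. It is illustrative only: the helper names are hypothetical, the PFMUL suffix value B4h is taken from AMD's published opcode map rather than from the text above, and the displacement 10 is assumed to be decimal.

```python
ESCAPE = bytes([0x0F, 0x0F])  # 0Fh 0Fh opcode bytes shared by these 3DNow! forms
PFMUL_SUFFIX = 0xB4           # assumption: PFMUL suffix from AMD's opcode map
EBX = 0b011                   # x86 register number of EBX
ES_OVERRIDE = 0x26            # ES segment override prefix


def modrm(mod, reg, rm):
    """Pack the three modR/M fields into one byte."""
    return (mod << 6) | (reg << 3) | rm


def encode_reg_reg(suffix, dst_mm, src_mm):
    """Encode 'op mmreg1, mmreg2' (mod = 11b selects a register operand)."""
    return ESCAPE + bytes([modrm(0b11, dst_mm, src_mm), suffix])


def encode_reg_mem_ebx(suffix, dst_mm, disp8=None, seg_prefix=None):
    """Encode 'op mmreg1, [ebx]' or 'op mmreg1, [ebx+disp8]'."""
    out = bytes([seg_prefix]) if seg_prefix is not None else b""
    if disp8 is None:
        out += ESCAPE + bytes([modrm(0b00, dst_mm, EBX)])
    else:
        out += ESCAPE + bytes([modrm(0b01, dst_mm, EBX), disp8 & 0xFF])
    return out + bytes([suffix])  # the suffix sits where an imm8 would be
```

For instance, encode_reg_reg(PFMUL_SUFFIX, 1, 2) builds the bytes for PFMUL mm1, mm2. The scaled-index form needs a SIB byte and is left out to keep the sketch short.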
Definitions
3DNow! technology provides 21 additional instructions to support high-performance, graphics and audio processing. 3DNow! instructions are vector instructions that operate on 64-bit registers. 3DNow! instructions are SIMD: each instruction operates on pairs of 32-bit values.

The definitions for the 3DNow! instructions in the instruction set chapter contain designations classifying each instruction as vectored or scalar. Vector instructions operate in parallel on two sets of 32-bit, single-precision, floating-point words. Instructions that are labeled as scalar instructions operate on a single set of 32-bit operands (from the low halves of the two 64-bit operands).

The 3DNow! single-precision, floating-point format is compatible with the IEEE-754, single-precision format. This format comprises a 1-bit sign, an 8-bit biased exponent, and a 23-bit significand with one hidden integer bit, for a total of 24 bits in the significand. The bias of the exponent is 127, consistent with the IEEE single-precision standard. The significands are normalized to be within the range of [1,2).

In contrast to the IEEE standard that dictates four rounding modes, 3DNow! technology supports one rounding mode: either round-to-nearest or round-to-zero (truncation). The hardware implementation of 3DNow! technology determines the rounding mode; the processor implements round-to-nearest mode. Regardless of the rounding mode used, the floating-point-to-integer and integer-to-floating-point conversion instructions, PF2ID and PI2FD, always use the round-to-zero (truncation) mode.

The largest, representable, normal number in magnitude for this precision in hexadecimal has an exponent of FEh and a significand of 7FFFFFh, with a numerical value of 2^127 * (2 - 2^-23). All results that overflow above the maximum-representable positive value are saturated to either the maximum-representable normal number or to positive infinity. Similarly, all results that overflow below the minimum-representable negative value are saturated to either this minimum-representable normal number or to negative infinity.

The implementation of 3DNow! technology determines how arithmetic overflow is handled: either properly signed maximum- or minimum-representable normal numbers, or properly signed infinities. The processor generates properly signed maximum- or minimum-representable normal numbers. Infinities and NaNs are not supported as operands to 3DNow! instructions.

The smallest representable normal number in magnitude for this precision in hexadecimal has an exponent of 01h and a significand of 000000h, with a numerical value of 2^-126. Accordingly, all results below this minimum representable value in magnitude are held to zero. The table "3DNow! Technology Exponent Ranges" shows the exponent ranges supported by 3DNow! technology.

Table: 3DNow! Technology Exponent Ranges
  Biased Exponent   Description
  FFh               Unsupported
  00h               Zero
  00h<x<FFh         Normal
  01h               2^(1-127), lowest possible exponent
  FEh               2^(254-127), largest possible exponent

Note: Unsupported numbers can be used as operands. The results of operations with unsupported numbers are undefined.

Like MMX instructions, 3DNow! instructions do not generate numeric exceptions nor do they set any status flags. It is the user's responsibility to ensure that in-range data is provided to 3DNow! instructions and that all computations remain within valid ranges (or are held as expected).
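The overflow and underflow behavior described above can be sketched as a clamping step applied to an exact result. This Python sketch assumes the variant in which overflow saturates to the largest normal number rather than to infinity; the name clamp_result is hypothetical, and rounding to a 24-bit significand is deliberately not modeled.

```python
import math

MAX_NORMAL = (2.0 - 2.0**-23) * 2.0**127  # exponent FEh, significand 7FFFFFh
MIN_NORMAL = 2.0**-126                    # exponent 01h, significand 000000h


def clamp_result(x):
    """Apply the saturate-to-normal and hold-to-zero rules to a result x."""
    if abs(x) < MIN_NORMAL:
        return math.copysign(0.0, x)         # underflow is held to signed zero
    if abs(x) > MAX_NORMAL:
        return math.copysign(MAX_NORMAL, x)  # overflow saturates, keeping the sign
    return x
```

Because no exception or status flag reports either case, code that needs to detect saturation has to compare results against these limits itself.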
Execution Resources on AMD-K6® Processors
The 3DNow! instructions can be executed in either the register X unit or the register Y unit. One operation can be issued to each register unit each clock cycle, for a maximum issue and execution rate of two 3DNow! operations per cycle. All 3DNow! operations have an execution latency of two clock cycles and are fully pipelined.

Even though 3DNow! execution resources are not duplicated in both register units (for example, there are not two pairs of 3DNow! multipliers, just one shared pair of multipliers), there are no pairing restrictions. When, for example, a 3DNow! multiply operation starts execution in a register unit, that unit grabs and uses the one shared pair of 3DNow! multipliers. Only when actual contention occurs between two 3DNow! operations starting execution at the same time is one of the operations held up for one cycle in its first execution pipe stage while the other proceeds. The delay is never more than one cycle.

For code optimization purposes, 3DNow! operations are grouped into two categories. These categories are based on execution resources and are important when creating properly scheduled code. As long as two 3DNow! operations that start execution simultaneously do not fall into the same category, both operations will start execution without delay.

The first category of instructions contains the operations of the following 3DNow! instructions: PFADD, PFSUB, PFSUBR, PFACC, PFCMPx, PFMIN, PFMAX, PI2FD, PF2ID, PFRCP, and PFRSQRT.
The second category contains the operations of the following 3DNow! instructions: PFMUL, PFRCPIT1, PFRSQIT1, and PFRCPIT2.

Note: 3DNow! add and multiply operations, among other combinations, can execute simultaneously.

Normally, in high-performance 3DNow! code, all the 3DNow! instructions are properly scheduled apart from each other so as to avoid delays due to execution resource contentions (as well as taking into account dependencies and execution latencies).

For further information regarding code optimization, see the AMD-K6® Processor Code Optimization Application Note, order# 21924. This document provides in-depth discussions of code optimization techniques for the processor. For execution resources information on the Athlon processor, refer to the Athlon Processor Code Optimization Guide, order# 22007.

The 3DNow! instructions for the AMD-K6 processors are summarized in the table "3DNow! Floating-Point Instructions". The dedicated and shared execution resources of the register X unit and register Y unit are shown in the figure "Register X Unit and Register Y Unit Resources". The execution resources for some MMX operations, as well as all 3DNow! operations, are shared between the two register units. For contention-checking purposes, each box in the figure represents a category of operations that cannot start execution simultaneously. In addition, the MMX and 3DNow! multiplies use the same hardware, while the MMX and 3DNow! adds and subtracts do not.

The 3DNow! performance-enhancement instructions for the processors are summarized in the table "3DNow! Performance-Enhancement Instructions". The FEMMS instruction does not use any specific execution resource or pipeline. The PREFETCH instruction is operated on in the Load unit.
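The two-category rule lends itself to a simple lookup when hand-scheduling code. The Python sketch below encodes only what the text states (same category means a one-cycle hold for one operation); the function names are hypothetical, and PFCMPx is expanded into its three mnemonics.

```python
ADD_CATEGORY = {
    "PFADD", "PFSUB", "PFSUBR", "PFACC", "PFCMPEQ", "PFCMPGE", "PFCMPGT",
    "PFMIN", "PFMAX", "PI2FD", "PF2ID", "PFRCP", "PFRSQRT",
}
MUL_CATEGORY = {"PFMUL", "PFRCPIT1", "PFRSQIT1", "PFRCPIT2"}


def category(mnemonic):
    """Return the execution-resource category of a 3DNow! operation."""
    if mnemonic in ADD_CATEGORY:
        return "add"
    if mnemonic in MUL_CATEGORY:
        return "multiply"
    raise ValueError(f"not a categorized 3DNow! operation: {mnemonic}")


def start_delay(op_a, op_b):
    """Cycles one operation is held when both try to start in the same cycle."""
    return 1 if category(op_a) == category(op_b) else 0
```

A scheduler built on this would try to pair each multiply-category operation with an add-category one so that start_delay is zero for every issue pair.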
Figure: Register X Unit and Register Y Unit Resources
  The figure shows the register X execution pipeline and the register Y execution pipeline, each with dedicated register resources, plus a set of shared register resources. Boxes recoverable from the figure: Integer ALU; Integer Shift; Integer Multiply and Divide; Integer Byte Operations; Integer Special Registers; Integer Segment Register Loads; MMX ALU (Add/Subtract, Compare, Logical, Pack, Unpack); MMX Shifter; 3DNow! Add/Subtract, Compare, Integer Conversion, Reciprocal and Reciprocal Square Root Table Lookup; MMX and 3DNow! Multiply, Reciprocal and Reciprocal Square Root Iteration.
Table: 3DNow! Floating-Point Instructions
  Operation   Function                                                                        Opcode Suffix
  PAVGUSB     Packed 8-bit Unsigned Integer Averaging                                         BFh
  PFADD       Packed Floating-Point Addition                                                  9Eh
  PFSUB       Packed Floating-Point Subtraction                                               9Ah
  PFSUBR      Packed Floating-Point Reverse Subtraction                                       AAh
  PFACC       Packed Floating-Point Accumulate                                                AEh
  PFCMPGE     Packed Floating-Point Comparison, Greater or Equal                              90h
  PFCMPGT     Packed Floating-Point Comparison, Greater                                       A0h
  PFCMPEQ     Packed Floating-Point Comparison, Equal                                         B0h
  PFMIN       Packed Floating-Point Minimum                                                   94h
  PFMAX       Packed Floating-Point Maximum                                                   A4h
  PI2FD       Packed 32-bit Integer to Floating-Point Conversion                              0Dh
  PF2ID       Packed Floating-Point to 32-bit Integer                                         1Dh
  PFRCP       Packed Floating-Point Reciprocal Approximation                                  96h
  PFRSQRT     Packed Floating-Point Reciprocal Square Root Approximation                      97h
  PFMUL       Packed Floating-Point Multiplication                                            B4h
  PFRCPIT1    Packed Floating-Point Reciprocal First Iteration Step                           A6h
  PFRSQIT1    Packed Floating-Point Reciprocal Square Root First Iteration Step               A7h
  PFRCPIT2    Packed Floating-Point Reciprocal/Reciprocal Square Root Second Iteration Step   B6h
  PMULHRW     Packed 16-bit Integer Multiply with rounding                                    B7h

Table: 3DNow! Performance-Enhancement Instructions
  Operation            Function                                                            Opcode Second Byte
  FEMMS                Faster entry/exit of the MMX or floating-point state                0Eh
  PREFETCH/PREFETCHW   Prefetch at least a 32-byte line into the L1 data cache (Dcache)    0Dh

Note: The AMD-K6-2 and AMD-K6-III processors execute the PREFETCHW instruction identically to the PREFETCH instruction. On the Athlon processor, PREFETCHW can increase performance by providing a hint to the processor of an intent to modify the cache line.
Task Switching
With respect to task switching, treat the 3DNow! instructions exactly the same as MMX instructions. Operating system design must be taken into account when writing a 3DNow! program. The programmer must know whether the operating system automatically saves the current states when task switching, or if the 3DNow! program has to provide the code to save states.

If a task switch occurs, the Control Register (CR0) Task Switch (TS) bit is set to 1. The processor then generates an interrupt 7 (int 7, Device Not Available) when it encounters the next floating-point, 3DNow!, or MMX instruction, allowing the operating system to save the state of the 3DNow!/MMX/FP registers.

In a multitasking operating system, if there is a task switch when 3DNow!/MMX applications are running with older applications that do not include MMX instructions, the MMX/FP register state is still saved automatically through the int 7 handler.
Exceptions
The table "3DNow! and MMX Instruction Exceptions" contains a list of exceptions that 3DNow! instructions can generate.

Table: 3DNow! and MMX Instruction Exceptions
  Exception                               Description
  Invalid opcode (6)                      The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7)                Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12)                    During instruction execution, the stack segment limit was exceeded.
  General protection (13)                 During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13)                    One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14)                         A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16)   An exception is pending due to the floating-point execution unit.
  Alignment check (17)                    An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode).

The rules for exceptions are the same for both MMX and 3DNow! instructions. In addition, exception detection and handling is identical for MMX and 3DNow! instructions. None of the exception handlers need modification.

Notes:
  1. An invalid opcode exception (interrupt 6) occurs if a 3DNow! instruction is executed on a processor that does not support 3DNow! instructions.
  2. If a floating-point exception is pending and the processor encounters a 3DNow! instruction, FERR# is asserted and, if CR0.NE = 1, an interrupt 16 is generated. (This is the same for MMX instructions.)
Prefixes
The following prefixes can be used with 3DNow! instructions:

  The segment override prefixes (2Eh/CS, 36h/SS, 3Eh/DS, 26h/ES, 64h/FS, and 65h/GS) affect 3DNow! instructions that contain a memory operand.
  The address-size override prefix (67h) affects 3DNow! instructions that contain a memory operand.
  The operand-size override prefix (66h) is ignored.
  The LOCK prefix (F0h) triggers an invalid opcode exception (interrupt 6).
  The REP prefixes (F3h/REP/REPE/REPZ, F2h/REPNE/REPNZ) are ignored.
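The prefix rules above amount to a small lookup table. The following Python sketch classifies a single prefix byte; the function name prefix_effect and the returned strings are hypothetical, and the "no effect" result for a register-only instruction is an inference from the wording, not a statement from the manual.

```python
SEGMENT_OVERRIDES = {
    0x2E: "CS", 0x36: "SS", 0x3E: "DS", 0x26: "ES", 0x64: "FS", 0x65: "GS",
}
ADDRESS_SIZE = 0x67
OPERAND_SIZE = 0x66
LOCK = 0xF0
REP_PREFIXES = (0xF3, 0xF2)


def prefix_effect(prefix, has_memory_operand):
    """Describe how one prefix byte interacts with a 3DNow! instruction."""
    if prefix in SEGMENT_OVERRIDES:
        # Assumption: without a memory operand there is nothing to override.
        if has_memory_operand:
            return "segment override " + SEGMENT_OVERRIDES[prefix]
        return "no effect"
    if prefix == ADDRESS_SIZE:
        return "address-size override" if has_memory_operand else "no effect"
    if prefix == OPERAND_SIZE or prefix in REP_PREFIXES:
        return "ignored"
    if prefix == LOCK:
        return "invalid opcode exception"
    raise ValueError(f"not a listed prefix byte: {prefix:#04x}")
```

A disassembler or emulator could use such a table to decide whether a prefix changes address calculation, is skipped, or faults.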
3DNow! Instruction Set
The following 3DNow! instruction definitions are in alphabetical order according to the instruction mnemonics.
FEMMS
  mnemonic:              FEMMS
  opcode:                0Fh 0Eh
  description:           Faster Enter/Exit of the MMX or floating-point state
  Privilege:             none
  Registers Affected:    MMX
  Flags Affected:        none
  Exceptions Generated:  Invalid opcode (6), Device not available (7), and Floating-point exception pending (16), each under the conditions given in the table "3DNow! and MMX Instruction Exceptions".

Like the EMMS instruction, the FEMMS instruction can be used to clear the MMX state following the execution of a block of MMX instructions. Because the MMX registers and tag words are shared with the floating-point unit, it is necessary to clear the state before executing floating-point instructions. Unlike the EMMS instruction, the contents of the MMX/floating-point registers are undefined after a FEMMS instruction is executed. Therefore, the FEMMS instruction offers a faster context switch at the end of an MMX routine where the values in the MMX registers are no longer required. FEMMS can also be used prior to executing MMX instructions where the preceding floating-point register values are no longer required, which facilitates faster context switching.
PAVGUSB
  mnemonic:              PAVGUSB mmreg1, mmreg2/mem64
  opcode/imm8:           0Fh 0Fh / BFh
  description:           Average of unsigned packed 8-bit values
  Privilege:             none
  Registers Affected:    MMX
  Flags Affected:        none
  Exceptions Generated:  Invalid opcode (6), Device not available (7), Stack exception (12), General protection (13), Segment overrun (13), Page fault (14), Floating-point exception pending (16), and Alignment check (17), each under the conditions given in the table "3DNow! and MMX Instruction Exceptions".

The PAVGUSB instruction produces the rounded averages of the eight unsigned 8-bit integer values in the source operand (an MMX register or a 64-bit memory location) and the eight corresponding unsigned 8-bit integer values in the destination operand (an MMX register). It does so by adding the source and destination byte values and then adding 001h to the 9-bit intermediate value. The intermediate value is then divided by 2 (shifted right one place) and the eight unsigned 8-bit results are stored in the MMX register specified as the destination operand.

The PAVGUSB instruction can be used for pixel averaging in MPEG-2 motion compensation and video scaling operations.
Figure: Functional Illustration of the PAVGUSB Instruction (byte averaging of mmreg2/mem64 and mmreg1 into mmreg1, with values that are rounded up marked; the eight rounded byte averages in the example are FFh, 80h, 80h, 10h, 01h, 5Ah, 7Fh, and A1h)

The equations for byte averaging with rounding are as follows:

    mmreg1[63:56] = (mmreg1[63:56] + mmreg2/mem64[63:56] + 01h)/2
    mmreg1[55:48] = (mmreg1[55:48] + mmreg2/mem64[55:48] + 01h)/2
    mmreg1[47:40] = (mmreg1[47:40] + mmreg2/mem64[47:40] + 01h)/2
    mmreg1[39:32] = (mmreg1[39:32] + mmreg2/mem64[39:32] + 01h)/2
    mmreg1[31:24] = (mmreg1[31:24] + mmreg2/mem64[31:24] + 01h)/2
    mmreg1[23:16] = (mmreg1[23:16] + mmreg2/mem64[23:16] + 01h)/2
    mmreg1[15:8]  = (mmreg1[15:8]  + mmreg2/mem64[15:8]  + 01h)/2
    mmreg1[7:0]   = (mmreg1[7:0]   + mmreg2/mem64[7:0]   + 01h)/2
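The eight byte equations share one pattern, which the following Python reference model applies lane by lane. It is a sketch for checking expected results, with 64-bit operands passed as plain integers; the function name pavgusb is chosen here for illustration.

```python
def pavgusb(mmreg1, mmreg2_mem64):
    """Rounded average of eight unsigned bytes packed in two 64-bit values."""
    result = 0
    for shift in range(0, 64, 8):
        a = (mmreg1 >> shift) & 0xFF
        b = (mmreg2_mem64 >> shift) & 0xFF
        # 9-bit intermediate sum plus 01h, then shifted right one place.
        result |= ((a + b + 1) >> 1) << shift
    return result
```

Adding 01h before the shift is what makes an odd sum round up, so averaging FFh with 00h gives 80h rather than 7Fh.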
PF2ID
  mnemonic:              PF2ID mmreg1, mmreg2/mem64
  opcode/imm8:           0Fh 0Fh / 1Dh
  description:           Converts packed floating-point operand to packed 32-bit integer
  Privilege:             none
  Registers Affected:    MMX
  Flags Affected:        none
  Exceptions Generated:  Invalid opcode (6), Device not available (7), Stack exception (12), General protection (13), Segment overrun (13), Page fault (14), Floating-point exception pending (16), and Alignment check (17), each under the conditions given in the table "3DNow! and MMX Instruction Exceptions".

The PF2ID instruction converts a register containing single-precision, floating-point operands to 32-bit signed integers using truncation. The table "Numerical Range for the PF2ID Instruction" shows the numerical range of the PF2ID instruction.

The PF2ID instruction performs the following operations:

    IF (mmreg2/mem64[31:0] >= 2^31)
        THEN mmreg1[31:0] = 7FFF_FFFFh
    ELSEIF (mmreg2/mem64[31:0] <= -2^31)
        THEN mmreg1[31:0] = 8000_0000h
    ELSE mmreg1[31:0] = int(mmreg2/mem64[31:0])
    IF (mmreg2/mem64[63:32] >= 2^31)
        THEN mmreg1[63:32] = 7FFF_FFFFh
    ELSEIF (mmreg2/mem64[63:32] <= -2^31)
        THEN mmreg1[63:32] = 8000_0000h
    ELSE mmreg1[63:32] = int(mmreg2/mem64[63:32])
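The saturating conversion can be expressed as a short Python reference model. This is a sketch that takes each lane as a Python float and returns the 32-bit result pattern as an unsigned integer; the names pf2id_lane and pf2id are hypothetical, and unsupported inputs (infinities, NaNs) are not handled.

```python
def pf2id_lane(value):
    """Convert one floating-point lane to a 32-bit two's-complement pattern."""
    if value >= 2.0**31:
        return 0x7FFFFFFF            # positive overflow saturates
    if value <= -(2.0**31):
        return 0x80000000            # negative overflow saturates
    return int(value) & 0xFFFFFFFF   # int() truncates toward zero


def pf2id(source):
    """source is a (low, high) pair of floats; returns (low, high) patterns."""
    low, high = source
    return (pf2id_lane(low), pf2id_lane(high))
```

For example, pf2id((1.9, -1.9)) yields the patterns for 1 and -1, since truncation discards the fraction in both directions.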
Table: Numerical Range for the PF2ID Instruction
  Source 2                                   Source 1 and Destination
  0                                          0
  Normal, abs(Source 2) < 1                  0
  Normal, -2147483648 < Source 2 <= -1       round to zero (Source 2)
  Normal, 1 <= Source 2 < 2147483648         round to zero (Source 2)
  Normal, Source 2 >= 2147483648             7FFF_FFFFh
  Normal, Source 2 <= -2147483648            8000_0000h
  Unsupported                                Undefined

Related Instructions
See the PI2FD instruction.
PFACC
  mnemonic:              PFACC mmreg1, mmreg2/mem64
  opcode/imm8:           0Fh 0Fh / AEh
  description:           Floating-point accumulate
  Privilege:             none
  Registers Affected:    MMX
  Flags Affected:        none
  Exceptions Generated:  Invalid opcode (6), Device not available (7), Stack exception (12), General protection (13), Segment overrun (13), Page fault (14), Floating-point exception pending (16), and Alignment check (17), each under the conditions given in the table "3DNow! and MMX Instruction Exceptions".

PFACC is a vector instruction that accumulates the two words of the destination operand and the two words of the source operand and stores the results in the low and high words of the destination operand respectively. Both operands are single-precision, floating-point operands with 24-bit significands. The table "Numerical Range for the PFACC Instruction" shows the numerical range of the PFACC instruction.

The PFACC instruction performs the following operations:

    temp          = mmreg2/mem64
    mmreg1[31:0]  = mmreg1[31:0] + mmreg1[63:32]
    mmreg1[63:32] = temp[31:0] + temp[63:32]
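A reference model makes the operand routing easier to see: the low result comes only from the destination, the high result only from the source. This Python sketch represents each 64-bit operand as a (low, high) pair of floats; the helper f32 and the function name pfacc are hypothetical, and saturation and zero-holding of out-of-range results are not modeled.

```python
import struct


def f32(value):
    """Round a Python float to IEEE single precision."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def pfacc(mmreg1, mmreg2_mem64):
    """Each operand is a (low, high) pair; returns the new mmreg1 pair."""
    temp = mmreg2_mem64
    low = f32(mmreg1[0] + mmreg1[1])   # sum of the destination's two words
    high = f32(temp[0] + temp[1])      # sum of the source's two words
    return (low, high)
```

Applying pfacc to a register and itself leaves the full horizontal sum in both halves, which is a common way to finish a dot product.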
Table: Numerical Range for the PFACC Instruction
                                  Source 2
  Source 1 and Destination        0             Normal                  Unsupported
  0                               +/- 0 (1)     Source 2                Undefined
  Normal                          Source 1      Normal, +/- 0 (2)       Undefined
  Unsupported                     Undefined     Undefined               Undefined

Notes:
  1. The sign of the result is the logical AND of the signs of the source operands.
  2. If the absolute value of the result is less than 2^-126, the result is zero with the sign being the sign of the source operand that is larger in magnitude (if the magnitudes are equal, the sign of source 1 is used). If the absolute value of the result is greater than or equal to 2^128, the result is the largest normal number with the sign being the sign of the source operand that is larger in magnitude.
PFADD
mnemonic                        opcode/imm8    description
PFADD mmreg1, mmreg2/mem64                     Packed, floating-point addition

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFADD is a vector instruction that performs addition of the destination operand and the source operand. Both operands are single-precision, floating-point operands with 24-bit significands. The table "Numerical Range for the PFADD Instruction" shows the numerical range of the PFADD instruction. The PFADD instruction performs the following operations:
  mmreg1[31:0] = mmreg1[31:0] + mmreg2/mem64[31:0]
  mmreg1[63:32] = mmreg1[63:32] + mmreg2/mem64[63:32]
Table. Numerical Range for the PFADD Instruction

Source 1 and          Source 2
Destination           0            Normal              Unsupported
0                     +/-0 (1)     Source 2            Source 2
Normal                Source 1     Normal, +/-0 (2)    Undefined
Unsupported           Source 1     Undefined           Undefined

Notes:
1. The sign of the result is the logical AND of the signs of the source operands.
2. If the absolute value of the result is less than 2^-126, the result is zero with the sign being the sign of the source operand that is larger in magnitude (if the magnitudes are equal, the sign of source 1 is used). If the absolute value of the result is greater than or equal to 2^128, the result is the largest normal number with the sign being the sign of the source operand that is larger in magnitude.
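The range rules for PFADD (results that are too small become zero, results that are too large saturate to the largest normal number) can be sketched for a single lane as follows. The function `pfadd_lane` and the constants are hypothetical names for illustration. The sketch rounds in-range results to nearest, treats anything above the largest normal number as saturating, and returns +0 for exact cancellation of opposite-signed operands, all of which are simplifying assumptions.

```python
import math
import struct

MAX_NORMAL = struct.unpack("<f", struct.pack("<I", 0x7F7FFFFF))[0]  # largest normal single
MIN_NORMAL = 2.0 ** -126                                            # smallest normal single

def f32(x):
    """Round a Python float to the nearest single-precision value (assumed rounding)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def pfadd_lane(a, b):
    """Add one lane and apply the underflow and overflow rules from the notes."""
    total = a + b
    if abs(total) < MIN_NORMAL:
        # Underflow: the result is zero. For a nonzero sum, the sign of the sum
        # is the sign of the operand that is larger in magnitude.
        return math.copysign(0.0, total)
    if abs(total) > MAX_NORMAL:
        # Overflow: saturate to the largest normal number, keeping the sign.
        return math.copysign(MAX_NORMAL, total)
    return f32(total)

tiny = pfadd_lane(2.0 ** -126, -1.5 * 2.0 ** -126)   # magnitude 2^-127, flushed to -0
```

No infinities or denormals are produced in this model, which matches the table above, where every defined result is either zero or a normal number.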
PFCMPEQ
mnemonic                        opcode/imm8    description
PFCMPEQ mmreg1, mmreg2/mem64                   Packed floating-point comparison, equal

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFCMPEQ is a vector instruction that performs a comparison of the destination operand and the source operand and generates all one bits or all zero bits based on the result of the corresponding comparison. The table "Numerical Range for the PFCMPEQ Instruction" shows the numerical range of the PFCMPEQ instruction. The PFCMPEQ instruction performs the following operations:
  IF (mmreg1[31:0] = mmreg2/mem64[31:0])
    THEN mmreg1[31:0] = FFFF_FFFFh
    ELSE mmreg1[31:0] = 0000_0000h
  IF (mmreg1[63:32] = mmreg2/mem64[63:32])
    THEN mmreg1[63:32] = FFFF_FFFFh
    ELSE mmreg1[63:32] = 0000_0000h
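A minimal Python model of the comparison above, with each register treated as a (low, high) pair. The function name `pfcmpeq` is hypothetical, and unsupported operand encodings are not modelled.

```python
def pfcmpeq(dest, src):
    """Return an all-ones or all-zeros 32-bit mask per lane."""
    return tuple(0xFFFFFFFF if d == s else 0x00000000 for d, s in zip(dest, src))

mask = pfcmpeq((1.5, 0.0), (1.5, -0.0))
```

Python's `==` on floats also treats positive zero and negative zero as equal, so the second lane of `mask` is all ones.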
Table. Numerical Range for the PFCMPEQ Instruction

Source 1 and          Source 2
Destination           0                 Normal                         Unsupported
0                     FFFF_FFFFh (1)    0000_0000h                     0000_0000h
Normal                0000_0000h        0000_0000h, FFFF_FFFFh (2)     0000_0000h
Unsupported           0000_0000h        0000_0000h                     Undefined

Notes:
1. Positive zero is equal to negative zero.
2. The result is FFFF_FFFFh if source 1 and source 2 have identical signs, exponents, and mantissas. Otherwise, the result is 0000_0000h.

Related Instructions
See the PFCMPGE instruction.
See the PFCMPGT instruction.
PFCMPGE
mnemonic                        opcode/imm8    description
PFCMPGE mmreg1, mmreg2/mem64                   Packed floating-point comparison, greater than or equal

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFCMPGE is a vector instruction that performs a comparison of the destination operand and the source operand and generates all one bits or all zero bits based on the result of the corresponding comparison. The table "Numerical Range for the PFCMPGE Instruction" shows the numerical range of the PFCMPGE instruction. The PFCMPGE instruction performs the following operations:
  IF (mmreg1[31:0] >= mmreg2/mem64[31:0])
    THEN mmreg1[31:0] = FFFF_FFFFh
    ELSE mmreg1[31:0] = 0000_0000h
  IF (mmreg1[63:32] >= mmreg2/mem64[63:32])
    THEN mmreg1[63:32] = FFFF_FFFFh
    ELSE mmreg1[63:32] = 0000_0000h
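Because the comparison yields all-ones or all-zeros masks, one plausible use (an assumption about usage, not something this section states) is branch-free selection with bitwise AND, AND-NOT, and OR. The sketch below models that idea; `pfcmpge`, `select`, `to_bits`, and `from_bits` are hypothetical helper names.

```python
import struct

def to_bits(x):
    """Single-precision bit pattern of x as an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def from_bits(b):
    """Single-precision value for a 32-bit pattern."""
    return struct.unpack("<f", struct.pack("<I", b))[0]

def pfcmpge(dest, src):
    """Return an all-ones or all-zeros 32-bit mask per lane."""
    return tuple(0xFFFFFFFF if d >= s else 0x00000000 for d, s in zip(dest, src))

def select(mask, a, b):
    """Per lane: take a where the mask is all ones, otherwise b."""
    return tuple(
        from_bits((to_bits(x) & m) | (to_bits(y) & ~m & 0xFFFFFFFF))
        for m, x, y in zip(mask, a, b)
    )

a, b = (1.0, -2.0), (0.5, 3.0)
larger = select(pfcmpge(a, b), a, b)
```

Here `larger` holds the per-lane maximum of `a` and `b`, computed without a conditional on the data path.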
Table. Numerical Range for the PFCMPGE Instruction

Source 1 and          Source 2
Destination           0                              Normal                         Unsupported
0                     FFFF_FFFFh (1)                 0000_0000h, FFFF_FFFFh (2)     Undefined
Normal                0000_0000h, FFFF_FFFFh (3)     0000_0000h, FFFF_FFFFh (4)     Undefined
Unsupported           Undefined                      Undefined                      Undefined

Notes:
1. Positive zero is equal to negative zero.
2. The result is FFFF_FFFFh, if source 2 is negative. Otherwise, the result is 0000_0000h.
3. The result is FFFF_FFFFh, if source 1 is positive. Otherwise, the result is 0000_0000h.
4. The result is FFFF_FFFFh, if source 1 is positive and source 2 is negative, or if they are both negative and source 1 is smaller than or equal in magnitude to source 2, or if source 1 and source 2 are both positive and source 1 is greater than or equal in magnitude to source 2. The result is 0000_0000h in all other cases.

Related Instructions
See the PFCMPEQ instruction.
See the PFCMPGT instruction.
PFCMPGT
mnemonic                        opcode/imm8    description
PFCMPGT mmreg1, mmreg2/mem64                   Packed floating-point comparison, greater than

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFCMPGT is a vector instruction that performs a comparison of the destination operand and the source operand and generates all one bits or all zero bits based on the result of the corresponding comparison. The table "Numerical Range for the PFCMPGT Instruction" shows the numerical range of the PFCMPGT instruction. The PFCMPGT instruction performs the following operations:
  IF (mmreg1[31:0] > mmreg2/mem64[31:0])
    THEN mmreg1[31:0] = FFFF_FFFFh
    ELSE mmreg1[31:0] = 0000_0000h
  IF (mmreg1[63:32] > mmreg2/mem64[63:32])
    THEN mmreg1[63:32] = FFFF_FFFFh
    ELSE mmreg1[63:32] = 0000_0000h
Table. Numerical Range for the PFCMPGT Instruction

Source 1 and          Source 2
Destination           0                              Normal                         Unsupported
0                     0000_0000h                     0000_0000h, FFFF_FFFFh (1)     Undefined
Normal                0000_0000h, FFFF_FFFFh (2)     0000_0000h, FFFF_FFFFh (3)     Undefined
Unsupported           Undefined                      Undefined                      Undefined

Notes:
1. The result is FFFF_FFFFh, if source 2 is negative. Otherwise, the result is 0000_0000h.
2. The result is FFFF_FFFFh, if source 1 is positive. Otherwise, the result is 0000_0000h.
3. The result is FFFF_FFFFh, if source 1 is positive and source 2 is negative, or if they are both negative and source 1 is smaller in magnitude than source 2, or if source 1 and source 2 are positive and source 1 is greater in magnitude than source 2. The result is 0000_0000h in all other cases.

Related Instructions
See the PFCMPEQ instruction.
See the PFCMPGE instruction.
PFMAX
mnemonic                        opcode/imm8    description
PFMAX mmreg1, mmreg2/mem64                     Packed floating-point maximum

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFMAX is a vector instruction that returns the larger of the two single-precision, floating-point operands. Any operation with a zero and a negative number returns positive zero. An operation consisting of two zeros returns positive zero. The table "Numerical Range for the PFMAX Instruction" shows the numerical range of the PFMAX instruction. The PFMAX instruction performs the following operations:
  IF (mmreg1[31:0] > mmreg2/mem64[31:0])
    THEN mmreg1[31:0] = mmreg1[31:0]
    ELSE mmreg1[31:0] = mmreg2/mem64[31:0]
  IF (mmreg1[63:32] > mmreg2/mem64[63:32])
    THEN mmreg1[63:32] = mmreg1[63:32]
    ELSE mmreg1[63:32] = mmreg2/mem64[63:32]
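The zero-handling rule stated above (a zero result is always returned as positive zero) can be added to the basic comparison as in the sketch below. The names `pfmax_lane` and `pfmax` are hypothetical, and unsupported operand encodings are not modelled.

```python
import math

def pfmax_lane(d, s):
    """Larger of two values; any zero result is returned as positive zero."""
    larger = d if d > s else s
    return 0.0 if larger == 0.0 else larger

def pfmax(dest, src):
    """Apply pfmax_lane to (low, high) pairs."""
    return tuple(pfmax_lane(d, s) for d, s in zip(dest, src))

result = pfmax((-0.0, 2.0), (-3.0, 5.0))
low_sign_is_positive = math.copysign(1.0, result[0]) == 1.0
```

In the first lane the larger operand is negative zero, and the model returns positive zero, as the description requires for a zero paired with a negative number.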
Table. Numerical Range for the PFMAX Instruction

Source 1 and          Source 2
Destination           0                   Normal                   Unsupported
0                     +0                  Source 2, +0 (1)         Undefined
Normal                Source 1, +0 (2)    Source 1/Source 2 (3)    Undefined
Unsupported           Undefined           Undefined                Undefined

Notes:
1. The result is source 2, if source 2 is positive. Otherwise, the result is positive zero.
2. The result is source 1, if source 1 is positive. Otherwise, the result is positive zero.
3. The result is source 1, if source 1 is positive and source 2 is negative. The result is source 1, if both are positive and source 1 is greater in magnitude than source 2. The result is source 1, if both are negative and source 1 is lesser in magnitude than source 2. The result is source 2 in all other cases.

Related Instructions
See the PFMIN instruction.
PFMIN
mnemonic                        opcode/imm8    description
PFMIN mmreg1, mmreg2/mem64                     Packed floating-point minimum

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFMIN is a vector instruction that returns the smaller of the two single-precision, floating-point operands. Any operation with a zero and a positive number returns positive zero. An operation consisting of two zeros returns positive zero. The table "Numerical Range for the PFMIN Instruction" shows the numerical range of the PFMIN instruction. The PFMIN instruction performs the following operations:
  IF (mmreg1[31:0] < mmreg2/mem64[31:0])
    THEN mmreg1[31:0] = mmreg1[31:0]
    ELSE mmreg1[31:0] = mmreg2/mem64[31:0]
  IF (mmreg1[63:32] < mmreg2/mem64[63:32])
    THEN mmreg1[63:32] = mmreg1[63:32]
    ELSE mmreg1[63:32] = mmreg2/mem64[63:32]
Table. Numerical Range for the PFMIN Instruction

Source 1 and          Source 2
Destination           0                   Normal                   Unsupported
0                     +0                  Source 2, +0 (1)         Undefined
Normal                Source 1, +0 (2)    Source 1/Source 2 (3)    Undefined
Unsupported           Undefined           Undefined                Undefined

Notes:
1. The result is source 2, if source 2 is negative. Otherwise, the result is positive zero.
2. The result is source 1, if source 1 is negative. Otherwise, the result is positive zero.
3. The result is source 1, if source 1 is negative and source 2 is positive. The result is source 1, if both are negative and source 1 is greater in magnitude than source 2. The result is source 1, if both are positive and source 1 is lesser in magnitude than source 2. The result is source 2 in all other cases.

Related Instructions
See the PFMAX instruction.
PFMUL
mnemonic                        opcode/imm8    description
PFMUL mmreg1, mmreg2/mem64                     Packed floating-point multiplication

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFMUL is a vector instruction that performs multiplication of the destination operand and the source operand. Both operands are single-precision, floating-point operands with 24-bit significands. The table "Numerical Range for the PFMUL Instruction" shows the numerical range of the PFMUL instruction. The PFMUL instruction performs the following operations:
  mmreg1[31:0] = mmreg1[31:0] * mmreg2/mem64[31:0]
  mmreg1[63:32] = mmreg1[63:32] * mmreg2/mem64[63:32]
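Table. Numerical Range for the PFMUL Instruction

Source 1 and          Source 2
Destination           0            Normal              Unsupported
0                     +/-0 (1)     +/-0 (1)            +/-0 (1)
Normal                +/-0 (1)     Normal, +/-0 (2)    Undefined
Unsupported           +/-0 (1)     Undefined           Undefined

Notes:
1. The sign of the result is the exclusive-OR of the signs of the source operands.
2. If the absolute value of the result is less than 2^-126, the result is zero with the sign being the exclusive-OR of the signs of the source operands. If the absolute value of the product is greater than or equal to 2^128, the result is the largest normal number with the sign being the exclusive-OR of the signs of the source operands.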
PFRCP
mnemonic                        opcode/imm8    description
PFRCP mmreg1, mmreg2/mem64                     Floating-point reciprocal approximation

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFRCP is a scalar instruction that returns a low-precision estimate of the reciprocal of the source operand. The single result value is duplicated in both the high and low halves of this instruction's 64-bit result. The source operand is single-precision with a 24-bit significand, and the result is accurate to 14 bits. The table "Numerical Range for the PFRCP Instruction" shows the numerical range of the PFRCP instruction.

Increased accuracy (the full 24 bits of a single-precision significand) requires two additional instructions (PFRCPIT1 and PFRCPIT2). The first stage of this increase or refinement in accuracy (PFRCPIT1) requires that the input and output of the already executed PFRCP instruction be used as input to the PFRCPIT1 instruction. Refer to "Division and Square Root" for an application-specific example of this instruction and related instructions.

The PFRCP instruction performs the following operations:
  mmreg1[31:0] = reciprocal(mmreg2/mem64[31:0])
  mmreg1[63:32] = reciprocal(mmreg2/mem64[31:0])
In the following code example, the first line illustrates the PFRCP instruction in a sequence used to compute q = a/b accurate to 24 bits:
  X0 = PFRCP(b)
  X1 = PFRCPIT1(b,X0)
  X2 = PFRCPIT2(X1,X0)
  q = PFMUL(a,X2)
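The refinement that PFRCPIT1 and PFRCPIT2 carry out is described as one Newton-Raphson iteration. The sketch below shows only the underlying mathematics, x1 = x0 * (2 - b * x0), in double-precision Python. How the hardware divides this step between the two instructions and which intermediate values it produces are not modelled, and `low_precision_reciprocal` is a hypothetical stand-in for PFRCP that simply cuts the significand of 1/b to 14 bits.

```python
import math

def low_precision_reciprocal(b, bits=14):
    """Hypothetical stand-in for PFRCP: 1/b with the significand cut to `bits` bits."""
    m, e = math.frexp(1.0 / b)                      # 1/b = m * 2**e, 0.5 <= |m| < 1
    return math.ldexp(math.floor(m * 2 ** bits) / 2 ** bits, e)

def refine_reciprocal(b, x0):
    """One Newton-Raphson step for the reciprocal: x1 = x0 * (2 - b * x0)."""
    return x0 * (2.0 - b * x0)

a, b = 10.0, 3.0
x0 = low_precision_reciprocal(b)    # rough estimate of 1/b
x1 = refine_reciprocal(b, x0)       # refined estimate
q = a * x1                          # quotient a/b, as in the final PFMUL
```

One step roughly squares the relative error, which is why an estimate good to about 14 bits is enough to reach the 24 bits of a single-precision significand.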
Table. Numerical Range for the PFRCP Instruction

                      Source 2
                      0                         Normal              Unsupported
Source 1 and
Destination           +/- Maximum Normal (1)    Normal, +/-0 (2)    Undefined

Notes:
1. The result has the same sign as the source operand.
2. If the absolute value of the result is less than 2^-126, the result is zero with the sign being the sign of the source operand. Otherwise, the result is a normal with the sign being the same sign as the source operand.

Related Instructions
See the PFRCPIT1 instruction.
See the PFRCPIT2 instruction.
PFRCPIT1
mnemonic                        opcode/imm8    description
PFRCPIT1 mmreg1, mmreg2/mem64                  Packed floating-point reciprocal, first iteration step

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFRCPIT1 is a vector instruction that performs the first intermediate step in the Newton-Raphson iteration to refine the reciprocal approximation produced by the PFRCP instruction (the second and final step completes the iteration and is accurate to 24 bits). The table "Numerical Range for the PFRCPIT1 Instruction" shows the numerical range of the PFRCPIT1 instruction.

The behavior of this instruction is only defined for those combinations of operands such that one source operand was the input to the PFRCP instruction and the other source operand was the output of the same PFRCP instruction. Refer to "Division and Square Root" for an application-specific example of this instruction and related instructions.
In the following code example, the second line illustrates the PFRCPIT1 instruction in a sequence used to compute q = a/b accurate to 24 bits:
  X0 = PFRCP(b)
  X1 = PFRCPIT1(b,X0)
  X2 = PFRCPIT2(X1,X0)
  q = PFMUL(a,X2)
Table. Numerical Range for the PFRCPIT1 Instruction

Source 1 and          Source 2
Destination           0            Normal        Unsupported
0                     +/-0 (1)     +/-0 (1)      +/-0 (1)
Normal                +/-0 (1)     Normal (2)    Undefined
Unsupported           +/-0 (1)     Undefined     Undefined

Notes:
1. The sign of the result is the exclusive-OR of the signs of the source operands.
2. The sign is positive.

Related Instructions
See the PFRCP instruction.
See the PFRCPIT2 instruction.
PFRCPIT2
mnemonic                        opcode/imm8    description
PFRCPIT2 mmreg1, mmreg2/mem64                  Packed floating-point reciprocal/reciprocal square root, second iteration step

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFRCPIT2 is a vector instruction that performs the second and final intermediate step in the Newton-Raphson iteration to refine the reciprocal or reciprocal square root approximation produced by the PFRCP and PFRSQRT instructions, respectively. The table "Numerical Range for the PFRCPIT2 Instruction" shows the numerical range of the PFRCPIT2 instruction.

The behavior of this instruction is only defined for those combinations of operands such that the first source operand (mmreg1) was the output of either the PFRCPIT1 or PFRSQIT1 instructions and the second source operand (mmreg2/mem64) was the output of either the PFRCP or PFRSQRT instructions. Refer to "Division and Square Root" for an application-specific example of this instruction and related instructions.
In the following code example, the third line illustrates the PFRCPIT2 instruction in a sequence used to compute q = a/b accurate to 24 bits:
  X0 = PFRCP(b)
  X1 = PFRCPIT1(b,X0)
  X2 = PFRCPIT2(X1,X0)
  q = PFMUL(a,X2)
Table. Numerical Range for the PFRCPIT2 Instruction

Source 1 and          Source 2
Destination           0            Normal              Unsupported
0                     +/-0 (1)     +/-0 (1)            +/-0 (1)
Normal                +/-0 (1)     Normal, +/-0 (2)    Undefined
Unsupported           +/-0 (1)     Undefined           Undefined

Notes:
1. The sign of the result is the exclusive-OR of the signs of the source operands.
2. If the absolute value of the result is less than 2^-126, the result is zero with the sign being the exclusive-OR of the signs of the source operands. If the absolute value of the product is greater than or equal to 2^128, the result is the largest normal number with the sign being the exclusive-OR of the signs of the source operands.

Related Instructions
See the PFRCPIT1 instruction.
See the PFRSQIT1 instruction.
See the PFRCP instruction.
See the PFRSQRT instruction.
PFRSQIT1
mnemonic                        opcode/imm8    description
PFRSQIT1 mmreg1, mmreg2/mem64                  Packed floating-point reciprocal square root, first iteration step

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFRSQIT1 is a vector instruction that performs the first intermediate step in the Newton-Raphson iteration to refine the reciprocal square root approximation produced by the PFRSQRT instruction (the second and final step completes the iteration and is accurate to 24 bits). The table "Numerical Range for the PFRSQIT1 Instruction" shows the numerical range of the PFRSQIT1 instruction.

The behavior of this instruction is only defined for those combinations of operands such that one source operand was the input to the PFRSQRT instruction and the other source operand is the square of the output of the same PFRSQRT instruction. Refer to "Division and Square Root" for an application-specific example of this instruction and related instructions.
In the following code example, the second and third lines illustrate the PFMUL and PFRSQIT1 instructions in a sequence used to compute a = 1/sqrt(b) accurate to 24 bits:
  X0 = PFRSQRT(b)
  X1 = PFMUL(X0,X0)
  X2 = PFRSQIT1(b,X1)
  a = PFRCPIT2(X2,X0)
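Table. Numerical Range for the PFRSQIT1 Instruction

Source 1 and          Source 2
Destination           0            Normal        Unsupported
0                     +/-0 (1)     +/-0 (1)      +/-0 (1)
Normal                +/-0 (1)     Normal (2)    Undefined
Unsupported           +/-0 (1)     Undefined     Undefined

Notes:
1. The sign of the result is the exclusive-OR of the signs of the source operands.
2. The sign is 0 (positive).

Related Instructions
See the PFRCPIT2 instruction.
See the PFRSQRT instruction.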
PFRSQRT
mnemonic                        opcode/imm8    description
PFRSQRT mmreg1, mmreg2/mem64                   Floating-point reciprocal square root approximation

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFRSQRT is a scalar instruction that returns a low-precision estimate of the reciprocal square root of the source operand. The single result value is duplicated in both the high and low halves of this instruction's 64-bit result. The source operand is single-precision with a 24-bit significand, and the result is accurate to 15 bits. Negative operands are treated as positive operands for purposes of the reciprocal square root computation, with the sign of the result the same as the sign of the source operand. The table "Numerical Range for the PFRSQRT Instruction" shows the numerical range of the PFRSQRT instruction.

Increased accuracy (the full 24 bits of a single-precision significand) requires two additional instructions (PFRSQIT1 and PFRCPIT2). The first stage of this increase or refinement in accuracy (PFRSQIT1) requires that the input and the squared output of the already executed PFRSQRT instruction be used as input to the PFRSQIT1 instruction. Refer to "Division and Square Root" for an application-specific example of this instruction and related instructions.
The PFRSQRT instruction performs the following operations:
  mmreg1[31:0] = reciprocal square root(mmreg2/mem64[31:0])
  mmreg1[63:32] = reciprocal square root(mmreg2/mem64[31:0])

In the following code example, the first line illustrates the PFRSQRT instruction in a sequence used to compute a = 1/sqrt(b) accurate to 24 bits:
  X0 = PFRSQRT(b)
  X1 = PFMUL(X0,X0)
  X2 = PFRSQIT1(b,X1)
  a = PFRCPIT2(X2,X0)
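The same Newton-Raphson idea applies to the reciprocal square root, where one step is x1 = x0 * (3 - b * x0^2) / 2. The sketch below shows only that mathematics in double-precision Python. It does not reproduce how PFRSQIT1 and PFRCPIT2 split the step, and `low_precision_rsqrt` is a hypothetical stand-in for PFRSQRT that cuts the significand of 1/sqrt(b) to 15 bits.

```python
import math

def low_precision_rsqrt(b, bits=15):
    """Hypothetical stand-in for PFRSQRT: 1/sqrt(b) with the significand cut to `bits` bits."""
    m, e = math.frexp(1.0 / math.sqrt(b))           # value = m * 2**e, 0.5 <= m < 1
    return math.ldexp(math.floor(m * 2 ** bits) / 2 ** bits, e)

def refine_rsqrt(b, x0):
    """One Newton-Raphson step for the reciprocal square root."""
    x0_squared = x0 * x0                            # corresponds to X1 = PFMUL(X0,X0)
    return x0 * (3.0 - b * x0_squared) / 2.0

b = 2.0
x0 = low_precision_rsqrt(b)     # rough estimate of 1/sqrt(b)
x1 = refine_rsqrt(b, x0)        # refined estimate
```

Squaring the first estimate before the refinement step mirrors the sequence above, where the squared output of PFRSQRT is an input to PFRSQIT1.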
Table. Numerical Range for the PFRSQRT Instruction

                      Source 2
                      0                      Normal     Unsupported
Source 1 and
Destination           +/- Maximum Normal*    Normal*    Undefined

Note:
* The result has the same sign as the source operand.

Related Instructions
See the PFRSQIT1 instruction.
See the PFRCPIT2 instruction.
PFSUB
mnemonic                        opcode/imm8    description
PFSUB mmreg1, mmreg2/mem64                     Packed floating-point subtraction

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFSUB is a vector instruction that performs subtraction of the source operand from the destination operand. Both operands are single-precision, floating-point operands with 24-bit significands. The table "Numerical Range for the PFSUB Instruction" shows the numerical range of the PFSUB instruction. The PFSUB instruction performs the following operations:
  mmreg1[31:0] = mmreg1[31:0] - mmreg2/mem64[31:0]
  mmreg1[63:32] = mmreg1[63:32] - mmreg2/mem64[63:32]
Table. Numerical Range for the PFSUB Instruction

Source 1 and          Source 2
Destination           0            Normal              Unsupported
0                     +/-0 (1)     -Source 2           -Source 2
Normal                Source 1     Normal, +/-0 (2)    Undefined
Unsupported           Source 1     Undefined           Undefined

Notes:
1. The sign of the result is the logical AND of the sign of source 1 and the inverse of the sign of source 2.
2. If the absolute value of the result is less than 2^-126, the result is zero with the sign being the sign of the source operand that is larger in magnitude (if the magnitudes are equal, the sign of source 1 is used). If the absolute value of the result is greater than or equal to 2^128, the result is the largest normal number with the sign being the sign of the source operand that is larger in magnitude.

Related Instructions
See the PFSUBR instruction.
PFSUBR
mnemonic                        opcode/imm8    description
PFSUBR mmreg1, mmreg2/mem64                    Packed floating-point reverse subtraction

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PFSUBR is a vector instruction that performs subtraction of the destination operand from the source operand. Both operands are single-precision, floating-point operands with 24-bit significands. The table "Numerical Range for the PFSUBR Instruction" shows the numerical range of the PFSUBR instruction. The PFSUBR instruction performs the following operations:
  mmreg1[31:0] = mmreg2/mem64[31:0] - mmreg1[31:0]
  mmreg1[63:32] = mmreg2/mem64[63:32] - mmreg1[63:32]
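The difference between PFSUB and PFSUBR is only the operand order, as the short Python sketch below shows. The helper names are hypothetical, registers are modelled as (low, high) pairs, and rounding to nearest is an assumption.

```python
import struct

def f32(x):
    """Round a Python float to the nearest single-precision value (assumed rounding)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def pfsub(dest, src):
    """Destination minus source, per lane."""
    return tuple(f32(d - s) for d, s in zip(dest, src))

def pfsubr(dest, src):
    """Source minus destination, per lane (reverse subtraction)."""
    return tuple(f32(s - d) for d, s in zip(dest, src))

dest, src = (5.0, 1.0), (2.0, 4.0)
forward = pfsub(dest, src)
reverse = pfsubr(dest, src)
```

The reverse form is convenient when the value to subtract from is in memory, because the memory operand can only be the source.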
Table. Numerical Range for the PFSUBR Instruction

Source 1 and          Source 2
Destination           0            Normal              Unsupported
0                     +/-0 (1)     Source 2            Source 2
Normal                -Source 1    Normal, +/-0 (2)    Undefined
Unsupported           -Source 1    Undefined           Undefined

Notes:
1. The sign of the result is the logical AND of the sign of source 2 and the inverse of the sign of source 1.
2. If the absolute value of the result is less than 2^-126, the result is zero with the sign being the sign of the source operand that is larger in magnitude (if the magnitudes are equal, the sign of source 2 is used). If the absolute value of the result is greater than or equal to 2^128, the result is the largest normal number with the sign being the sign of the source operand that is larger in magnitude.

Related Instructions
See the PFSUB instruction.
PI2FD
mnemonic                        opcode/imm8    description
PI2FD mmreg1, mmreg2/mem64                     Packed 32-bit integer to floating-point conversion

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
PI2FD is a vector instruction that converts a vector register containing signed, 32-bit integers to single-precision, floating-point operands. When PI2FD converts an input operand with more significant digits than are available in the output, the output is truncated. The PI2FD instruction performs the following operations:
  mmreg1[31:0] = float(mmreg2/mem64[31:0])
  mmreg1[63:32] = float(mmreg2/mem64[63:32])

Related Instructions
See the PF2ID instruction.
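A 32-bit integer can carry up to 31 significant bits, while a single-precision significand holds 24, so large inputs lose their low-order bits. The sketch below models that truncation for one lane. The name `pi2fd_lane` is hypothetical, and truncation toward zero is an assumption: the text above only says the output is truncated.

```python
def pi2fd_lane(i):
    """Convert a signed 32-bit integer to a single-precision value, truncating toward zero."""
    magnitude = abs(i)
    extra = magnitude.bit_length() - 24       # bits that do not fit in a 24-bit significand
    if extra > 0:
        magnitude = (magnitude >> extra) << extra   # drop the low-order bits
    return float(-magnitude if i < 0 else magnitude)

exact = pi2fd_lane(16777216)        # 2**24 fits exactly
truncated = pi2fd_lane(16777217)    # 2**24 + 1 loses its lowest bit
largest = pi2fd_lane(0x7FFFFFFF)    # keeps only the top 24 bits
```

Integers with a magnitude of at most 2^24 convert exactly; beyond that, the result steps in multiples of 2, 4, 8, and so on as the magnitude grows.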
PMULHRW
mnemonic                        opcode/imm8    description
PMULHRW mmreg1, mmreg2/mem64    0F0Fh/B7h      Multiply signed packed 16-bit values with rounding and store the high 16 bits.

Privilege: none
Registers Affected: MMX
Flags Affected: none
Exceptions Generated:
  Invalid opcode (6): The emulate instruction bit (EM) of the control register (CR0) is set to 1.
  Device not available (7): Save the floating-point or MMX state if the task switch bit (TS) of the control register (CR0) is set to 1.
  Stack exception (12): During instruction execution, the stack segment limit was exceeded.
  General protection (13): During instruction execution, the effective address of one of the segment registers used for the operand points to an illegal memory location.
  Segment overrun (13): One of the instruction data operands falls outside the address range 00000h to 0FFFFh.
  Page fault (14): A page fault resulted from the execution of the instruction.
  Floating-point exception pending (16): An exception is pending due to the floating-point execution unit.
  Alignment check (17): An unaligned memory reference resulted from the instruction execution, and the alignment mask bit (AM) of the control register (CR0) is set to 1 (in Protected Mode, CPL = 3).
The PMULHRW instruction multiplies the four signed 16-bit integer values in the source operand (an MMX register or a 64-bit memory location) by the four corresponding signed 16-bit integer values in the destination operand (an MMX register). The PMULHRW instruction then adds 8000h to the lower 16 bits of the 32-bit result, which results in the rounding of the high-order, 16-bit result. The high-order 16 bits of the result (including the sign bit) are stored in the destination operand. The PMULHRW instruction provides a numerically more accurate result than the PMULHW instruction, which truncates the result instead of rounding.
Functional Illustration of the PMULHRW Instruction

    mmreg2/mem64        D250h     5321h     7007h     FFFFh
    mmreg1 (source)     8807h     EC22h     7FFEh     FFFFh
    mmreg1 (result)     1569h     F98Ch     3803h*    0000h

    * Indicates a value that was rounded up.
The following list explains the functional illustration of the PMULHRW instruction:

The signed 16-bit negative value D250h (-2DB0h) is multiplied by the signed 16-bit negative value 8807h (-77F9h) to produce the signed 32-bit positive result 1569_4030h. 8000h is then added to the lower 16 bits to produce the final result 1569_C030h. This rounding does not affect the final result of 1569h. The signed high-order 16 bits of the result are stored in the destination operand.

The signed 16-bit positive value 5321h is multiplied by the signed 16-bit negative value EC22h (-13DEh) to produce the signed 32-bit negative result F98C_7662h (-0673_899Eh). 8000h is then added to the lower 16 bits, producing the final result F98C_F662h. This rounding does not affect the final result of F98Ch. The signed high-order 16 bits of the result are stored in the destination operand.

The signed 16-bit positive value 7007h is multiplied by the signed 16-bit positive value 7FFEh to produce the signed 32-bit positive result 3802_9FF2h. 8000h is then added to the lower 16 bits to produce the final result 3803_1FF2h. This result has been rounded up, and the signed high-order 16 bits of the result (3803h) are stored in the destination operand.

The signed 16-bit negative value FFFFh (-1) is multiplied by the signed 16-bit negative value FFFFh (-1) to produce the signed 32-bit positive result 0000_0001h. 8000h is then added to the lower 16 bits to produce the final result 0000_8001h. This rounding does not affect the final result of 0000h. The signed high-order 16 bits of the result are stored in the destination operand.
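
The four cases above can be reproduced with a short Python model of one 16-bit lane. This is a sketch of the arithmetic as described in the text (signed multiply, add 8000h, keep the high-order 16 bits); the function names are illustrative, and pmulhw_lane is included only as the truncating counterpart for comparison.

```python
def to_signed16(word):
    """Interpret a 16-bit pattern as a two's-complement signed integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def pmulhrw_lane(a, b):
    """Signed 16x16 multiply, add 8000h, return the high-order 16 bits."""
    product = to_signed16(a) * to_signed16(b)  # signed 32-bit product
    rounded = product + 0x8000                 # rounding constant from the text
    return (rounded >> 16) & 0xFFFF            # arithmetic shift keeps the sign


def pmulhw_lane(a, b):
    """Truncating variant: keep the high-order 16 bits without rounding."""
    return ((to_signed16(a) * to_signed16(b)) >> 16) & 0xFFFF


def pmulhrw(dst, src):
    """Apply pmulhrw_lane to four corresponding 16-bit words."""
    return [pmulhrw_lane(d, s) for d, s in zip(dst, src)]


if __name__ == "__main__":
    dst = [0x8807, 0xEC22, 0x7FFE, 0xFFFF]   # mmreg1 before the instruction
    src = [0xD250, 0x5321, 0x7007, 0xFFFF]   # mmreg2/mem64
    print([f"{w:04X}h" for w in pmulhrw(dst, src)])
```

Only the third lane differs between the two variants: rounding yields 3803h, while truncation yields 3802h.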
PREFETCH/PREFETCHW
Mnemonic:     PREFETCH(W) mem8
Opcode:       0Fh 0Dh
Description:  Prefetch processor cache line into data cache (Dcache)

Privilege:             none
Registers Affected:    none
Flags Affected:        none
Exceptions Generated:  none
The PREFETCH instruction loads a processor cache line into the data cache. The address of this line is specified by the mem8 value. The line size is processor-specific; in future processors, the size of the line that is loaded by the PREFETCH instruction will be at least 32 bytes. The PREFETCH instruction loads a cache line even if the mem8 address is not aligned with the start of the line (although some implementations, including the AMD-K6 family of processors, may perform the cache fill starting from the cache miss or mem8 address). If a cache hit occurs (the line is already in the Dcache) or a memory fault is detected, no bus cycle is initiated and the instruction is treated as a NOP.

In applications where a large number of data sets must be processed, the PREFETCH instruction can pre-load the next data set into the Dcache while, simultaneously, the processor is operating on the present set of data. This instruction allows the programmer to explicitly code operation concurrency. When the present set of data values is completed, the next set is already available in the Dcache. An example of concurrent operation is vertices processing in transformations, where the next set of vertices can be prefetched into the data cache while the present set is being transformed.

The PREFETCH instruction format for the processor is defined to allow extensions in future K86 processors. The instruction mnemonic for the PREFETCH instruction includes the modR/M byte. Only the memory form of modR/M is valid (use of the register form results in an invalid opcode exception). Because there is no destination register, the three destination register field bits of the modR/M byte are used to define the type of prefetch to be performed. The PREFETCH and PREFETCHW instructions are defined by the bit patterns 000b and 001b, respectively. All other bit patterns are reserved for future use.

The PREFETCHW instruction loads the prefetched line and sets the cache line MESI state to modified (in anticipation of subsequent data writes to the line), unlike the PREFETCH instruction, which typically sets the state to exclusive. If the data that is prefetched into the Dcache is to be modified, the PREFETCHW instruction will save the cycle that the PREFETCH instruction requires for modifying the Dcache line state. The PREFETCHW instruction should be used when the programmer expects that the data in the cache line will be modified. Otherwise, the PREFETCH instruction should be used.

Note: The AMD-K6-2 and AMD-K6-III processors execute the PREFETCHW instruction identically to the PREFETCH instruction. However, the AMD Athlon and future processors that support PREFETCHW as described above will be able to take advantage of the performance benefit provided by this instruction. For more information, see the AMD Athlon Processor Code Optimization Guide, order# 22007.

The following table summarizes the PREFETCH type options:

Summary of PREFETCH Instruction Type Options
    modR/M byte       Type
    mm-000-xxx        PREFETCH
    mm-001-xxx        PREFETCHW
    mm-010-xxx        Reserved
    mm-011-xxx        Reserved
    mm-100-xxx        Reserved
    mm-101-xxx        Reserved
    mm-110-xxx        Reserved
    mm-111-xxx        Reserved
    11-xxx-xxx        Results in Invalid Opcode
Note: "Reserved" PREFETCH types do not result in an Invalid Opcode Exception if executed. Instead, for forward compatibility with future processors that may implement additional forms of the PREFETCH instruction, all "Reserved" PREFETCH types are implemented as synonyms for the basic PREFETCH type (for example, the PREFETCH instruction with type 000b).
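
The decoding rules above can be summarized as a small Python function. This is a sketch for illustration only: it classifies a modR/M byte by its mod and reg fields as the text describes, and the function name and return strings are hypothetical.

```python
def prefetch_type(modrm):
    """Classify the modR/M byte of a PREFETCH-family instruction."""
    if not 0 <= modrm <= 0xFF:
        raise ValueError("modR/M must be a single byte")
    mod = modrm >> 6            # top two bits: addressing form
    reg = (modrm >> 3) & 0b111  # middle three bits: prefetch type
    if mod == 0b11:
        # Register form is not valid for these instructions.
        return "invalid opcode"
    if reg == 0b000:
        return "PREFETCH"
    if reg == 0b001:
        return "PREFETCHW"
    # Reserved patterns behave as synonyms of the basic PREFETCH type.
    return "PREFETCH (reserved encoding)"
```

For example, a modR/M byte of 08h (mod = 00b, reg = 001b) selects PREFETCHW, while C0h (mod = 11b) is the register form and raises an invalid opcode exception.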
Division and Square Root
Division
3DNow! instructions can be used to compute a very fast, highly accurate reciprocal or quotient. Consider the quotient q = a/b. An on-chip, ROM-based table lookup can be used to quickly produce a 14-to-15-bit precision approximation of 1/b (using just one two-cycle latency instruction, PFRCP). A full-precision reciprocal can then be quickly computed from this approximation using a Newton-Raphson algorithm. The general Newton-Raphson recurrence for the reciprocal is as follows:

    Z_{i+1} = Z_i (2 - b Z_i)

Given that the initial approximation is accurate to at least 14 bits, and that a full IEEE single precision contains 24 bits of mantissa, just one Newton-Raphson iteration is required. The following shows the 3DNow! instruction sequence to produce the full-precision reciprocal from this and, lastly, to complete the required division of a/b.
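    X0 = PFRCP(b)
    X1 = PFRCPIT1(b, X0)
    X2 = PFRCPIT2(X1, X0)
    q  = PFMUL(a, X2)

The 24-bit final reciprocal value is X2. In the processor's 3DNow! implementation, the estimate contains the correct round-to-nearest value for most arguments. The remaining arguments differ from the correct round-to-nearest value for the reciprocal by 1 unit-in-the-last-place (ulp). The quotient is formed in the last step by multiplying the reciprocal by the dividend a.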
Divide Examples
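These examples illustrate how 3DNow! instructions can be used to perform divides.

(14-Bit Precision)

    MOVD       MM0, [mem]    ; load the divisor
    PFRCP      MM0, MM0      ; reciprocal of the divisor (approx.)
    MOVQ       MM2, [mem]    ; load the dividends
    PFMUL      MM2, MM0      ; quotients (approx.)

(24-Bit Precision)

    MOVD       MM0, [mem]    ; load the divisor
    PFRCP      MM1, MM0      ; reciprocal of the divisor (approx.)
    PUNPCKLDQ  MM0, MM0      ; copy the divisor to both halves (MMX instruction)
    PFRCPIT1   MM0, MM1      ; first iteration step (intermed.)
    MOVQ       MM2, [mem]    ; load the dividends
    PFRCPIT2   MM0, MM1      ; second iteration step, reciprocal (full prec.)
    PFMUL      MM2, MM0      ; quotients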
Note: For a description of the PUNPCKLDQ instruction, see the AMD-K6® Processor Multimedia Technology Manual, order# 20726.
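
The effect of a single Newton-Raphson step on a 14-bit starting value can be checked numerically. The Python sketch below models only the mathematics, in double precision; it does not reproduce the bit-exact results of PFRCP, PFRCPIT1, or PFRCPIT2. The function table_reciprocal is a hypothetical stand-in for the ROM lookup: it simply truncates the true reciprocal to 14 significant bits.

```python
import math


def table_reciprocal(b, bits=14):
    """Stand-in for the table lookup: 1/b truncated to `bits` significant bits."""
    mantissa, exponent = math.frexp(1.0 / b)
    scale = 1 << bits
    return math.ldexp(math.trunc(mantissa * scale) / scale, exponent)


def refine_reciprocal(b, x0):
    """One Newton-Raphson step for the reciprocal: Z_{i+1} = Z_i * (2 - b * Z_i)."""
    return x0 * (2.0 - b * x0)


def divide(a, b):
    """Form a/b as a * (refined reciprocal of b)."""
    x0 = table_reciprocal(b)        # coarse approximation
    x2 = refine_reciprocal(b, x0)   # refined reciprocal
    return a * x2


if __name__ == "__main__":
    x0 = table_reciprocal(3.0)
    print("relative error before:", abs(3.0 * x0 - 1.0))
    print("relative error after: ", abs(3.0 * refine_reciprocal(3.0, x0) - 1.0))
```

If the starting value has relative error e, one step leaves a relative error of e squared, which is why an approximation good to roughly 14 bits is sufficient for a 24-bit mantissa.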
Square Root
3DNow! instructions can also be used to compute a reciprocal square root or a square root with high performance. The general Newton-Raphson reciprocal square root recurrence is as follows:

    Z_{i+1} = (1/2) Z_i (3 - b Z_i^2)

To reduce the number of iterations, the initial approximation is read from a table. The 3DNow! reciprocal square root approximation is accurate to at least 15 bits. Accordingly, to obtain a single-precision 24-bit reciprocal square root of an input operand b, one Newton-Raphson iteration is required using the following 3DNow! instructions:

    X0 = PFRSQRT(b)
    X1 = PFMUL(X0, X0)
    X2 = PFRSQIT1(b, X1)
    X3 = PFRCPIT2(X2, X0)
    X4 = PFMUL(b, X3)

The 24-bit final reciprocal square root value is X3. In the processor's 3DNow! implementation, the estimate contains the correct round-to-nearest value for most arguments. The remaining arguments differ from the correct round-to-nearest value by 1 ulp. The square root (X4) is formed in the last step by multiplying X3 by the input operand b.
Square Root Examples
These examples illustrate how 3DNow! technology can be used to perform square roots.

(15-Bit Precision)

    MOVD       MM0, [mem]    ; load the operand b
    PFRSQRT    MM1, MM0      ; 1/(sqrt b) (approx.)
    PUNPCKLDQ  MM0, MM0      ; copy b to both halves (MMX instr.)
    PFMUL      MM0, MM1      ; (sqrt b) = b * 1/(sqrt b)
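(24-Bit Precision)

    MOVD       MM0, [mem]    ; load the operand b
    PFRSQRT    MM1, MM0      ; 1/(sqrt b) (approx.)
    MOVQ       MM2, MM1      ; keep a copy of the approximation
    PFMUL      MM1, MM1      ; square of the approximation (approx.), step 1
    PUNPCKLDQ  MM0, MM0      ; copy b to both halves (MMX instr.)
    PFRSQIT1   MM1, MM0      ; (intermediate), step 2
    PFRCPIT2   MM1, MM2      ; 1/(sqrt b) (full prec.), step 3
    PFMUL      MM0, MM1      ; (sqrt b) = b * 1/(sqrt b)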
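
The reciprocal square root recurrence can be checked numerically in the same way. The Python sketch below models only the mathematics, in double precision, and is not a bit-exact model of PFRSQRT, PFRSQIT1, or PFRCPIT2. The function table_rsqrt is a hypothetical stand-in for the table lookup: it truncates the true value of 1/sqrt(b) to 15 significant bits.

```python
import math


def table_rsqrt(b, bits=15):
    """Stand-in for the table lookup: 1/sqrt(b) truncated to `bits` significant bits."""
    mantissa, exponent = math.frexp(1.0 / math.sqrt(b))
    scale = 1 << bits
    return math.ldexp(math.trunc(mantissa * scale) / scale, exponent)


def refine_rsqrt(b, x0):
    """One Newton-Raphson step: Z_{i+1} = (1/2) * Z_i * (3 - b * Z_i^2)."""
    return 0.5 * x0 * (3.0 - b * x0 * x0)


def square_root(b):
    """Form sqrt(b) as b * (refined reciprocal square root of b)."""
    x0 = table_rsqrt(b)        # coarse approximation
    x3 = refine_rsqrt(b, x0)   # refined reciprocal square root
    return b * x3              # final multiply by the input operand


if __name__ == "__main__":
    x0 = table_rsqrt(2.0)
    print("relative error before:", abs(x0 * math.sqrt(2.0) - 1.0))
    print("relative error after: ", abs(refine_rsqrt(2.0, x0) * math.sqrt(2.0) - 1.0))
```

With a starting relative error e, one step leaves an error of about 1.5 times e squared, so a 15-bit starting value is enough to reach 24-bit precision in this model.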