This article is rated C-class on Wikipedia's content assessment scale.
The intro doesn't explain what AVX is for, i.e. its purpose. It refers only vaguely to "new features".
The section on AVX-512 looks like it has been copied from a news release. Maybe it should be a separate article. It needs to point out the distinction between the unnamed instruction set of Knights Corner and the AVX-512 instruction set of Knights Landing. The former uses an MVEX prefix and the latter uses an EVEX prefix. These two prefixes differ by a single bit, even for otherwise identical instructions. Therefore the two instruction sets are not mutually compatible, but both are backwards compatible with AVX2. Does anybody have info about the fate of the Knights Corner instruction set? Is it obsolete or will both lines be continued? Afog 09:48, 2 October 2013 (UTC)
I have rewritten the whole page to fix the issues discussed below and because the article had the tag:
"This article reads like a press release or a news article and may be largely based on routine coverage. (March 2009)"
AES, PCLMUL and FMA are separate instruction sets which I have put into separate articles. I have added information about AMD support, operating system support and many technical details. Afog (talk) 11:50, 4 June 2009 (UTC)
I want to clarify that AVX does 4x64-bit (double precision) and 8x32-bit (single precision) but NOT 2x128-bit (extended precision). The docs seem to indicate this, as nothing is mentioned about 128-bit floating-point numbers. Can an expert please verify this statement, as it is critical to understanding AVX?
Gene Thomas (talk) 03:08, 22 May 2009 (UTC)
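For what it's worth, the lane counts in the question follow from the register width alone. A minimal sketch, assuming only the 256-bit YMM width; the helper name `lanes` is made up for illustration, and nothing here says which element types AVX actually defines:

```python
import struct

YMM_BITS = 256  # width of one AVX YMM register


def lanes(element_bits: int) -> int:
    """Number of elements of the given width that fit in one YMM register."""
    return YMM_BITS // element_bits


print(lanes(64))  # double precision: 4 lanes
print(lanes(32))  # single precision: 8 lanes

# Cross-check with struct: 4 doubles and 8 floats each occupy 32 bytes.
assert struct.calcsize("<4d") == YMM_BITS // 8
assert struct.calcsize("<8f") == YMM_BITS // 8
```

Two 128-bit values would also fit by width, but that only shows the arithmetic, not that such a floating-point format is supported.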
IBM calls AltiVec VMX, and actually never believed in AltiVec before Apple and Nintendo insisted on it for the Gekko and the later G5 design, upgraded to a 256-bit SIMD on the G5 and a 128-bit SIMD on the GameCube and Wii. Personally I think Apple might have had something to do with AVX, but that's just speculation. Markthemac (talk) 01:45, 9 June 2008 (UTC)
The article states that 1) the published AVX instruction set includes FMA instructions, and 2) FMA will appear in a future extension of the instruction set. There's a contradiction here, please clarify. --85.140.239.250 (talk) 20:22, 5 December 2008 (UTC)
This is supposedly due to the new instructions? I think a link to source material is needed here, or at least a rewrite, as I generally understand the term "idle" to imply that the processor is doing nothing. As I understand it, power usage during idle NOP instructions is a function of the power control unit within the CPU, not of a particular instruction, which in an Intel CPU is a microcode op or series of microcode ops.
I would suggest that the intended meaning is that a future CPU implementing AVX will have enhanced power control over these new instructions, shutting off unused units during AVX instruction execution. —Preceding unsigned comment added by 86.13.171.234 (talk) 00:16, 10 January 2009 (UTC)
The phrase "idle power usage is insignificant" refers to the power usage caused by leakage current of the extra transistors added to implement the AVX logic in the processor. Transistors use power even when they are not switching. All modern x86 processors shut off idle units by gating the clock signal into the unit. When the clock signal is turned off, the unit's power consumption is due solely to transistor leakage current. The power consumed by the AVX unit when it is idle is an insignificant portion of the total power consumed by the processor. Typically, this would mean below 1% of total power. Ksavage9 (talk) 20:50, 20 April 2009 (UTC)
Please incorporate in article info from http://forums.amd.com/devblog/blogpost.cfm?catid=208&threadid=112934 —Preceding unsigned comment added by 83.167.112.66 (talk) 14:11, 6 May 2009 (UTC)
Free Pascal is developing support for AVX within their SVN repository. I don't know if that's applicable or not to mention. PrincessSchala (talk) 08:09, 14 March 2012 (UTC)
Maybe someone can explain what sqrt with 3 operands means? Is it xmm1 = xmm2 = sqrt(xmm3)?
vsqrtsd xmm1, xmm2, xmm3
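On the three-operand question: as far as I understand the scalar form, the square root is taken of the low double of the last operand only, and the upper double of the result is copied from the middle operand. A plain-Python model of that reading follows. It is a sketch, not real SIMD code: registers are modelled as (low, high) tuples, the function name just mirrors the mnemonic, and the zeroing of the bits above 128 in the destination is left out.

```python
import math


def vsqrtsd(src1, src2):
    """Model of `vsqrtsd dst, src1, src2` on the low 128 bits.

    Each register is a (low, high) tuple of doubles (illustrative layout).
    """
    low = math.sqrt(src2[0])  # square root of the low element of the second source
    high = src1[1]            # upper element copied unchanged from the first source
    return (low, high)


xmm2 = (100.0, 7.0)
xmm3 = (9.0, 123.0)
xmm1 = vsqrtsd(xmm2, xmm3)
print(xmm1)  # (3.0, 7.0)
```

So it is not xmm1 = xmm2 = sqrt(xmm3). The middle operand only supplies the part of the result that the scalar operation does not compute.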
I'm disappointed - no fast exp(), cos() etc. Unlike CUDA. Intel really has a problem. Oh, and have to get a new operating system to use the thing. Oh, really. — Preceding unsigned comment added by 113.190.231.252 (talk) 11:51, 21 July 2012 (UTC)
Does Windows XP support the AVX instructions? If not, what is the minimum Microsoft OS needed to support AVX? — Preceding unsigned comment added by 222.165.42.62 (talk) 10:49, 25 October 2012 (UTC)
Would it be correct to call AVX the successor to SSE (more precisely SSE4)? — Preceding unsigned comment added by 222.165.42.62 (talk) 04:01, 30 October 2012 (UTC)
I plan to split AVX-512 off to a separate article. The EVEX based AVX-512 has many new details and consists of several new separate extensions, so it will be better dealt with separately. This can leave this article to deal with 128/256-bit VEX encoded AVX extensions. Any comments or objections? Carewolf (talk) 20:18, 23 February 2014 (UTC)
"Knights Landing processor scheduled to ship in 2015"
The reference provided for this sentence doesn't talk about the shipping date. I couldn't find this date yet on the Internet. — Preceding unsigned comment added by 2A00:FE00:4103:1:0:0:0:300 (talk) 08:28, 22 August 2014 (UTC)
This article states that it's safe to use AVX-128 on OSes that support only SSE and don't support AVX.
In Intel(R) Advanced Vector Extensions Programming Reference or Intel® Architecture Instruction Set Extensions Programming Reference, section "3.2: YMM STATE MANAGEMENT" it is clearly stated that
"An OS must enable its YMM state management to support AVX and FMA extensions. Otherwise, an attempt to execute an instruction in AVX or FMA extensions (including an enhanced 128-bit SIMD instructions using VEX encoding) will cause a #UD exception."
Also, according to AMD64 Architecture Programmer’s Manual Volume 6: 128-Bit and 256-Bit XOP and FMA4 Instructions, even 128-bit XOP instructions cause a #UD exception if the OS doesn't support YMM save/restore.
So I believe that the statement that "AVX-128 instructions that do not use YMM registers are also safe to use on operating systems without AVX-support[citation needed], since AVX-support in operating systems refers to handling YMM register state.[3]" is totally incorrect. — Preceding unsigned comment added by 109.188.120.144 (talk) 00:46, 25 October 2015 (UTC)
P.S. it makes sense, since AVX-128 instructions clear the upper half of a destination YMM register, so even AVX-128 touches all 256 bits of YMM reg. — Preceding unsigned comment added by 109.188.120.144 (talk) 09:22, 25 October 2015 (UTC)
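The usual detection sequence reflects the same point: the CPU feature bit alone is not enough, because the OS must also have enabled YMM state. A minimal sketch of the decision logic in Python, assuming the caller has already obtained the CPUID leaf 1 ECX value and the XCR0 value by some other means; the function name and the sample values are hypothetical, and real code would read them with CPUID and XGETBV.

```python
CPUID_ECX_OSXSAVE = 1 << 27  # OS has enabled XSAVE/XGETBV
CPUID_ECX_AVX = 1 << 28      # CPU implements AVX
XCR0_SSE = 1 << 1            # OS manages XMM state
XCR0_AVX = 1 << 2            # OS manages the upper halves of the YMM registers


def avx_usable(cpuid1_ecx: int, xcr0: int) -> bool:
    """True only if both the CPU and the OS support AVX (hypothetical helper)."""
    if not (cpuid1_ecx & CPUID_ECX_OSXSAVE):
        return False  # XCR0 cannot even be read, so YMM state is not managed
    if not (cpuid1_ecx & CPUID_ECX_AVX):
        return False  # CPU has no AVX
    required = XCR0_SSE | XCR0_AVX
    return (xcr0 & required) == required


# Made-up sample values: AVX-capable CPU on an OS that enabled only x87 and SSE state.
print(avx_usable(CPUID_ECX_OSXSAVE | CPUID_ECX_AVX, 0b011))  # False
# Same CPU on an OS that also enabled YMM state.
print(avx_usable(CPUID_ECX_OSXSAVE | CPUID_ECX_AVX, 0b111))  # True
```

The check deliberately does not distinguish 128-bit from 256-bit VEX instructions, which matches the quoted manual text.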
Using AVX2 with 256-bit registers, or AVX-512, can slow down a program because the overheating protection lowers the clock frequency when certain heavy AVX2 and AVX-512 instructions are used. See this article for more information. This should be added to the Wikipedia article. --91.89.138.29 (talk) 22:53, 6 October 2019 (UTC)
The article claims "AVX's three-operand format is limited to the instructions with SIMD operands (YMM), and does not include instructions with general purpose registers (e.g. EAX). Such support will first appear in AVX2.[5]". Is this valid? Does AVX2 support instructions in 3-operand format with general purpose registers? --91.89.138.29 (talk) 22:57, 6 October 2019 (UTC)
Indeed, neither AVX nor AVX2 provides 3-operand format instructions for general purpose registers. I have removed the last sentence to correct it. 110.22.247.167 (talk) 09:20, 21 April 2021 (UTC)
Do XMM and YMM stand for something? Jimw338 (talk) 22:04, 19 August 2020 (UTC)
I saw on Steam a comment re AVX being required for the final boss fight (???) for Death Stranding on PC. Anyone know more about this? E.g. should it be included in the list, and if so for which version?
The percentages listed in that section aren't a reflection of how the processors actually handle things. A fixed ratio multiplier (multiplied by the processor BCLK reference, so usually 100 MHz unless somebody messed with it) is used to calculate the frequency drop, so any percentages are only relevant for whatever specific speed processor the person testing them tested on and at unmodified clock settings. See the link to the XTU guide I posted in that section. I'm not sure how to re-word it correctly because I don't feel like digging through somebody's findings for their specific model and figuring out the pre-set ratios it used, but for example on Broadwell-E the AVX2 drop ratio is 2x by default, which results in a drop of 200 MHz below the TB3 frequency, and it can be changed in either the BIOS or XTU given specific cooling and possibly a slight core voltage increase if the processor is being overclocked to begin with. There is no built-in curve of "more cores = lower speed" with AVX. The 5117 has the same 105 W TDP but is clocked at 2.0 / 2.8 instead of 2.2 / 3.2, which is kind of suspicious.
Anyway, on an i7-6950x Broadwell-E with the AVX2 ratio set to the default of 2x, the processor TDP ignored because of sufficient water-cooling, and all cores set to a 4.2GHz turbo ratio, every core runs at 4.0GHz because there's no TDP to exceed. Even with TDP left on the defaults there's no real drop because the turbo max defaults to 185W and is heat-limited at that clock speed. Raising it even slightly would produce different results either from heat or exceeding max turbo TDP since voltages don't scale linearly with clock increase. Changing that ratio to 0x results in instability unless core voltages are raised, but since running all cores at 4.2GHz is much higher than the intended one core max turbo for that processor in the first place there's no real point in doing so. Likewise I suspect if I lowered my TDP to 105W via BIOS or XTU I'd immediately start seeing that kind of rapid tanking of performance with high all-core AVX2 usage regardless of whether the ratio offset was set or not. I'm too lazy to do it and original research is useless here anyway (unless you post it on stackexchange apparently), but that's just how the TDP and thermal limiting work on these things. It's a huge part of why everyone can sell these processors with massive turbo clocks and unlocked ratios and not worry about anyone frying them even though (mostly) nobody really understands what exactly it is they're changing or how any of it works. If they do, well, running your memory at the XMP speed it and the processor itself in the case of AMD was advertised at instead of its base frequency already voided the warranty on the processor so Intel / AMD don't really need to worry too much.
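To make the arithmetic concrete, here is a minimal sketch of the fixed-offset model described above, assuming a 100 MHz BCLK and no power or thermal limit kicking in. The function names are made up for illustration, and the ratio of 30 in the last line is a hypothetical second chip, not a measured part.

```python
def avx_clock_mhz(turbo_ratio: int, avx_offset: int, bclk_mhz: float = 100.0) -> float:
    """Clock under AVX load when a fixed ratio offset is subtracted from the turbo ratio."""
    return (turbo_ratio - avx_offset) * bclk_mhz


def drop_percent(turbo_ratio: int, avx_offset: int) -> float:
    """The same fixed offset expressed as a percentage of the turbo clock."""
    return 100.0 * avx_offset / turbo_ratio


# 4.2 GHz all-core turbo (ratio 42) with the default AVX2 offset of 2
print(avx_clock_mhz(42, 2))           # 4000.0
print(round(drop_percent(42, 2), 2))  # 4.76

# Same offset on a hypothetical chip with a turbo ratio of 30
print(round(drop_percent(30, 2), 2))  # 6.67
```

The offset is identical in both cases while the percentage is not, which is why a percentage only describes the one chip and clock setting it was measured on.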
It doesn't have AVX512 so I can't test that, but the behavior of both ratios is the same: a constant offset from the max ratio. This kind of test can't really be done on a "true" Xeon aside from some earlier v4 and possibly Skylake parts which could be forced to allow overclocking if their microcode wasn't updated, but Broadwell-E was a Xeon anyway and the same applies.
The defaults can be different between processors. On the linked Ice Lake, the information is almost certainly wrong in that the L1 downclock still exists but has the ratio set to 0x for that model and the L2 ratio is set to 1x. I suspect in XTU if someone was overclocking it they could adjust both ratios to be non-zero to keep voltages and thermals within a reasonable range while AVX code was running or set the L2 ratio to zero and increase allowed power / TDP a bit if they really needed the extra 100MHz of speed in AVX512 code.
AFAIK the whole notion of downclocking came about when the unlocked higher core count Haswell-Es were released without implementing it. Someone turned off both the thermal and power limits in their BIOS, then tried to run heavy AVX2 code and fried their processor, which now had every mechanism that would have told it to shut down before it caught fire disabled. Then they whined about it up and down the entire internet for 2 years, so Intel implemented the downclocking in Broadwell-E to avoid that. It became more necessary when power management changed with Skylake, TDP was followed less strictly anyway, and AVX512 was even more of a power hog.
In any case my point is that if any hard numbers are to be left in that section they should probably be specified as examples, specific to those single processors, and in terms of base clock multipliers, since percentages make little / no sense even on similar speed chips, as seen from the 14C Skylake server chip downclocking immensely to keep within the low TDP. --A Shortfall Of Gravitas (talk) 08:46, 18 July 2021 (UTC)
It says "AVX provides new features..."
Would it be possible to provide one or two examples of the functions that get the most benefit from this instruction set(s)? Thanks Nei1 (talk) 15:17, 23 February 2023 (UTC)
The OS has apparently supported it since the release of the "Game Porting Toolkit 2"; however, the real addition to the OS that allows AVX2 instructions to execute was a set of modifications to Rosetta 2, as evidenced by this GitHub thread using it and finding that there is also an issue with its implementation. Sussis Amogus (talk) 21:32, 27 February 2025 (UTC)