[GTALUG] example of why RISC was a good idea
Lennart Sorensen
lsorense at csclub.uwaterloo.ca
Sun May 22 12:00:39 EDT 2016
On Sat, May 21, 2016 at 01:33:50PM -0400, D. Hugh Redelmeier wrote:
> <https://software.intel.com/en-us/articles/google-vp9-optimization>
>
> Intel describing how they improved the performance of the VP9 decoder for
> Silvermont, a recent Atom core.
>
> The meat is several not-really-obvious changes to the code to overcome
> limitations of the instruction decoder. The optimizations seem particular
> to Silvermont but the article says:
> Testing against the future Intel Atom platforms, codenamed Goldmont and
> Tremont, the VP9 optimizations delivered additional gains.
>
> These optimizations did nothing for Core processors as far as I can tell.
> I don't know if it affects any AMD processors.
>
> A RISC processor would not have a complex instruction decoder so this kind
> of hacking would not apply. I will admit that there are "hazards" in RISC
> processors that are worth paying attention to when selecting and ordering
> instructions but these tend to be clearer.
>
> Another thing in the paper:
>
> The overall results were outstanding. The team improved user-level
> performance by up to 16 percent (6.2 frames per second) in 64-bit
> mode and by about 12 percent (1.65 frames per second) in 32-bit
> mode. This testing included evaluation of 32-bit and 64-bit GCC
> and Intel® compilers, and concluded that the Intel compilers
> delivered the best optimizations by far for Intel® Atom™
> processors. When you multiply this improvement by millions of
> viewers and thousands of videos, it is significant. The WebM team
> at Google also recognized this performance gain as extremely
> significant. Frank Gilligan, a Google engineering manager,
> responded to the team’s success: “Awesome. It looks good. I can’t
> wait to try everything out.” Testing against the future Intel Atom
> platforms, codenamed Goldmont and Tremont, the VP9 optimizations
> delivered additional gains.
>
> Consider 64-bit. If 16% improvement is 6.2 f/s, then the remaining 84%
> would be 32.55 f/s. Not great, but OK.
>
> For 32-bit, 12% is 1.65 f/s; the remaining 88% would be 12 f/s. Totally
> useless, I think.
>
> Quite interesting how different these two are.
64-bit mode has twice the general-purpose registers (16 instead of 8),
which for a lot of code is a huge difference. That is the biggest
improvement AMD made to x86. Scrapping x87 in favour of SSE2 for floating
point is probably number 2.
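
For reference, the frame-rate arithmetic quoted above can be checked with a
few lines of Python. This is only a rough sketch: it assumes the article's
percentages are measured relative to the pre-optimization frame rate, which
the article does not state explicitly, and frame_rates is a made-up helper
name. On that reading, 38.75 f/s is the 64-bit rate before the change, and
the 32.55 f/s figure quoted above is 84% of that same number.

```python
def frame_rates(gain_fps, gain_fraction):
    """Return (baseline, optimized) frame rates in f/s.

    Assumption: gain_fraction is relative to the baseline, so
    baseline = gain_fps / gain_fraction.
    """
    baseline = gain_fps / gain_fraction
    return baseline, baseline + gain_fps


# Figures taken from the quoted Intel article.
for label, gain_fps, gain_fraction in (("64-bit", 6.2, 0.16),
                                       ("32-bit", 1.65, 0.12)):
    baseline, optimized = frame_rates(gain_fps, gain_fraction)
    print(f"{label}: baseline {baseline:.2f} f/s, "
          f"optimized {optimized:.2f} f/s")
```

Either way you read the percentages, the 32-bit build ends up at roughly a
third of the 64-bit frame rate.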
--
Len Sorensen