

The uOp cache was a major discussion piece during the pre-launch press briefing, so seeking more detail on the role of a uOp cache in Zen was the first objective: our first question to Clark was about the Zen op-cache and Ryzen’s micro-op cache.

Michael Clark: “One of the hardest problems of trying to build a high-frequency x86 processor is that the instructions are a variable length. That means to try to get a lot of them to dispatch in a wide form, it’s a serial process. To do that, generally we’ve had to build deep pipelines, very power hungry to do that.”

He added: “I mean, guys make their careers doing this sort of thing.”
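The serial dependency Clark describes is easy to show in code. Below is a minimal sketch in Python, assuming a made-up single-byte length table rather than real x86 encoding rules (which involve prefixes, opcode maps, ModRM, SIB, and displacement/immediate bytes); the point is only that the start of instruction i+1 is unknown until instruction i has been length-decoded, so finding boundaries is a loop-carried, serial computation.

```python
# Toy illustration of why variable-length instruction decode is serial.
# LENGTH_OF is a hypothetical table: first byte -> total instruction length.
# Real x86 length decoding is far more involved.
LENGTH_OF = {0x90: 1, 0xB8: 5, 0x0F: 3, 0xE8: 5}

def find_boundaries(code: bytes) -> list[int]:
    """Return the start offset of each instruction in `code`."""
    starts = []
    pc = 0
    while pc < len(code):
        starts.append(pc)
        # Loop-carried dependency: the next start needs this length.
        pc += LENGTH_OF[code[pc]]
    return starts

print(find_boundaries(bytes([0x90, 0xB8, 0, 0, 0, 0, 0x90])))  # [0, 1, 6]
```

Hardware designs typically attack this by decoding at many byte offsets in parallel and then selecting the valid ones, which is part of why the pipelines Clark mentions get deep and power hungry.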

Clark then explained what the op-cache does about it: “We actually call it an op-cache because it stores [the operations] in a more dense format than in the past. What it does is, having seen [the instructions] once, we store them in this op-cache with those boundaries removed. When you find the first one, you find all its neighbors with it. We can actually put them in that cache 8 at a time, so we can pull 8 out per cycle, and we can actually cut two stages off that pipeline of trying to figure out the instructions.”
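A rough software analogy, assuming a simplified model in which decoded micro-ops are cached keyed by their fetch address (the class, capacity, and LRU policy below are illustrative inventions, not Zen’s actual organization; the 8-per-entry figure mirrors Clark’s “8 out per cycle”):

```python
# Minimal op-cache sketch: on a hit, the stored micro-ops come out with
# instruction boundaries already removed, skipping the heavy decode.
from collections import OrderedDict

OPS_PER_ENTRY = 8  # matches the "pull 8 out per cycle" figure

class OpCache:
    def __init__(self, capacity: int = 512):
        self.entries: OrderedDict[int, list[str]] = OrderedDict()
        self.capacity = capacity

    def fetch(self, addr: int, decode_fn) -> list[str]:
        if addr in self.entries:            # hit: no decode needed
            self.entries.move_to_end(addr)
            return self.entries[addr]
        micro_ops = decode_fn(addr)[:OPS_PER_ENTRY]  # miss: heavy-weight decode
        self.entries[addr] = micro_ops
        if len(self.entries) > self.capacity:        # simple LRU eviction
            self.entries.popitem(last=False)
        return micro_ops

# A hot loop fetches the same address repeatedly: decode runs once.
calls = 0
def slow_decode(addr):
    global calls
    calls += 1
    return [f"uop_{addr:x}_{i}" for i in range(OPS_PER_ENTRY)]

cache = OpCache()
for _ in range(10):
    cache.fetch(0x1000, slow_decode)
print(calls)  # 1 -- nine of the ten fetches hit the cache
```

The loop at the end is the story in miniature: the first trip through a code loop pays for decode, and every later iteration pulls finished micro-ops straight out of the cache, which is where the high hit rate Naffziger cites below comes from.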
Sam Naffziger made the same point from the power side: “X86 decode, the variable length instructions, are very complex - requires a ton of logic. You have this expensive logic block chunking away. So you pump all these x86 instructions in there, burns a lot of power to decode them all, and in our prior designs every time you encounter that code loop you have to go do it again. Now we just stuff those micro-ops into the op-cache, all the decoding done, and the hit-rate there is really high, so that means we’re only doing that heavy-weight decode 10% of the time. It gives us that double-whammy of a power savings and a huge performance uplift.”

He continued: “The other thing we did is the write-back L1 cache. We aren’t consistently pushing the data through to the L2; there are some simplifications if you do that, but we added the complexity of a write-back, so now we keep stuff way more local.”
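To see what the write-back choice buys, here is a toy model (a direct-mapped cache with one word per line, purely illustrative): stores land in L1 and set a dirty bit, and the next level only sees the data when a dirty line is evicted. A write-through design would instead push every store to L2 immediately.

```python
# Toy write-back L1: writes stay local until a dirty line is evicted.
class WriteBackL1:
    def __init__(self, num_lines: int, l2: dict):
        self.lines = [None] * num_lines  # each line: (tag, value, dirty)
        self.num_lines = num_lines
        self.l2 = l2                     # stand-in for the next cache level

    def store(self, addr: int, value: int) -> None:
        idx = addr % self.num_lines
        line = self.lines[idx]
        if line is not None and line[0] != addr and line[2]:
            self.l2[line[0]] = line[1]   # evict: dirty data reaches L2 at last
        self.lines[idx] = (addr, value, True)  # write stays in L1, marked dirty

l2 = {}
l1 = WriteBackL1(num_lines=4, l2=l2)
l1.store(0, 11)
l1.store(0, 22)   # repeated stores to the same line never touch L2
print(l2)         # {} -- no L2 traffic yet
l1.store(4, 33)   # conflicts with address 0 in a 4-line cache: eviction
print(l2)         # {0: 22}
```

The trade-off matches the quote: write-through is simpler (L2 always holds current data, so there is no dirty tracking or eviction write-back), while write-back keeps repeated stores local and saves L1-to-L2 traffic and power.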
