Memory part 2: CPU caches

It should have been noted in the text that much of the description of multi-cache interaction is specific to x86 and similarly "sequentially-consistent" architectures. Most modern architectures are not sequentially consistent, and threaded programs must be extremely careful about one thread relying on data written by another thread becoming visible in the order in which it was written. This applies to Alpha, PPC, Itanium, and (sometimes) SPARC, but not x86, AMD, or MIPS. The consequence of the requirement to maintain sequential consistency is poor performance and/or horrifyingly complex cache interaction machinery on machines with more than (about) four CPUs, so we can expect to see more non-x86 multi-core chips in use soon.
I think your criticism is misdirected. The text does not touch on memory consistency at all - it is entirely out of its scope. Besides, you need a cache coherency protocol on any multiprocessor system. With regard to memory consistency, there are different opinions.
A while ago there was a very interesting discussion on RealWorldTech where Linus Torvalds made the point that explicit memory barriers can be argued to be more expensive than what the CPU has to do to create the illusion of sequential memory consistency, because explicit MBs are by necessity more general and actually have stronger guarantees.
Sorry, not true. It describes how the caches of different x86 CPUs interact, but does not say that it describes only x86, falsely suggesting that this is how every other machine does it too. It leaves the reasonable reader under the impression that programmers need not know anything about memory consistency. That is not entirely true even on x86, and is simply false on most non-x86 platforms. If Ulrich is writing for people programming only x86, the article should say so without quibbling. If not, it should call out the places where it describes x86-specific behavior.
To the best of my knowledge, the description in the article applies to all cache-coherent systems, including those listed in your earlier post.
It has nothing to do with memory consistency, which is an issue largely internal to the CPU. I am very possibly wrong, of course - I am not a hardware system designer - so I am glad to discuss it. Can you describe how the cache/memory behavior in an Alpha (for instance; or any other weak consistency system) differs from the article?
I agree that coding with memory barriers (etc.!) is a big subject, and beyond the scope of this installment. It would have sufficed, though, to mention that (and where) it is a matter for concern, and why.
x86 and x86-64 actually aren't sequentially consistent, because this would result in a huge performance hit. They implement "processor consistency", which means loads can pass stores but no other reordering is allowed (except for some special instructions). Or to put it another way, loads have an acquire barrier and stores have a release barrier.
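The acquire/release pairing described above can be written portably with C++11 atomics, so the same source is correct on weakly ordered machines as well as on x86. The following is only a minimal sketch: the function name `run_message_passing` and the variables `payload`, `ready`, and `seen` are made up for illustration and do not come from the article or the discussion.

```cpp
#include <atomic>
#include <thread>

// Hypothetical illustration: one thread publishes data, another consumes it.
int run_message_passing() {
    int payload = 0;                  // ordinary, non-atomic data
    std::atomic<bool> ready{false};   // flag used to publish the payload
    int seen = -1;                    // what the consumer observed

    std::thread producer([&] {
        payload = 42;                                  // 1. write the data
        ready.store(true, std::memory_order_release);  // 2. publish it
    });

    std::thread consumer([&] {
        // The acquire load pairs with the release store above: once it
        // returns true, the earlier write to payload is visible here.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

If both operations on `ready` were `std::memory_order_relaxed`, the language would no longer guarantee that the consumer sees the write to `payload`, which is the kind of hazard the weak-consistency discussion is about.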
Implementations can issue loads to the bus out of order, but will invalidate early loads if necessary to achieve the same effect as if all loads had been done in order. Explicit memory barrier instructions may be necessary or useful even on x86 and x86-64. Ideally, however, programmers will use portable locking or lockless abstractions instead.
The idea of disabling hyperthreading (SMT) in the BIOS as a way to reduce cache misses and possibly improve performance is interesting (and pertinent to me, as I run a system with such a CPU and motherboard). After all, my CPU seems to use this feature about 10% of the time, and even then it is usually (99.99% of the time) with two distinct, non-threaded applications. It does seem logical that, if the hyperthreaded CPU shows up as two CPUs to the OS (I get two penguins at boot time, and cat /proc/cpuinfo shows two processors), but each virtual CPU shares the same 512K of L2 cache, then maybe my PC is sucking rocks in performance due to the cache miss rate alone.
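Whether the two "processors" in /proc/cpuinfo are really hyperthreads of one core can be checked roughly by comparing two fields of that file. The sketch below assumes the x86 Linux layout, where each processor entry has a `siblings` line (logical CPUs per package) and a `cpu cores` line (physical cores per package); the helper name `smt_active` is hypothetical, and other architectures or older kernels may not provide these fields.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical helper: takes the text of /proc/cpuinfo and reports whether
// the first processor entry lists more logical CPUs ("siblings") than
// physical cores ("cpu cores"), which suggests that SMT is enabled.
bool smt_active(const std::string& cpuinfo_text) {
    std::istringstream in(cpuinfo_text);
    std::string line;
    int siblings = 0;
    int cores = 0;

    while (std::getline(in, line)) {
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos) {
            continue;  // blank separator line or unexpected format
        }
        std::string key = line.substr(0, colon);
        key.erase(key.find_last_not_of(" \t") + 1);  // trim padding before ':'
        const std::string value = line.substr(colon + 1);

        // Keep only the first occurrence of each field; atoi yields 0 on
        // malformed input, which is treated as "unknown" below.
        if (key == "siblings" && siblings == 0) {
            siblings = std::atoi(value.c_str());
        } else if (key == "cpu cores" && cores == 0) {
            cores = std::atoi(value.c_str());
        }
    }
    return siblings > 0 && cores > 0 && siblings > cores;
}
```

To try it on a real machine, read /proc/cpuinfo into a string (for example with `std::ifstream` and a `std::stringstream`) and pass it to the function. A result of `true` only says that logical CPUs share cores; whether the shared L2 actually hurts performance still has to be measured.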
