On making better use of cores

Started by BenYeeHua, October 23, 2012, 03:47:49 AM


BenYeeHua

I hope there will also be a way to fully use multiple cores without rewriting the code :)

Jeremy Collake

That isn't very likely, though performance may improve as the OS improves its own multi-core use. Normally programs must be redesigned to take better advantage of multiple cores. Sometimes you can just recompile them with a newer or better compiler, or perhaps using different options, and have a positive effect on core utilization via compiler optimizations or runtime library changes - BUT the normal blocker is simple logic. Operation Y needs the output of operation X, thus it can't be started until operation X finishes.

WinRAR, for example, finally started making better use of multiple cores, but in the only way it really can: by breaking up workloads (e.g. compressing one file on one core and another file on another core). Decompression of LZ77-derived algorithms like the one WinRAR uses is so rapid that it is a single-core operation and is fine being one. Compressing a single file, though... still a single-core operation (afaik, unless it does something drastic and breaks the file up into blocks, which *would* hurt the compression ratio, since LZ-based compression relies on a sliding-window dictionary).
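
A rough sketch of that trade-off, assuming Python's zlib (DEFLATE, an LZ77-derived format with a 32 KiB window) as a stand-in; WinRAR's own algorithm and block handling are not shown in this thread, and the block size and test data below are made up for the demo. Compressing independent blocks lets each block go to a different core, but no block can refer back to data in another block:

```python
import random
import zlib
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 16 * 1024  # hypothetical block size chosen for this demo

# 128 KiB of input: one 16 KiB pseudo-random chunk repeated 8 times.
# Each chunk is incompressible on its own, but every repeat can be encoded
# as back-references to the previous copy, which lies inside the 32 KiB window.
rng = random.Random(0)
chunk = bytes(rng.getrandbits(8) for _ in range(BLOCK_SIZE))
data = chunk * 8

# Single stream: sequential, but the dictionary spans block boundaries.
whole = zlib.compress(data, 9)

# Independent blocks: can be compressed concurrently, but each block
# starts with an empty dictionary.
blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
with ThreadPoolExecutor() as pool:
    compressed_blocks = list(pool.map(lambda b: zlib.compress(b, 9), blocks))

block_total = sum(len(c) for c in compressed_blocks)
restored = b"".join(zlib.decompress(c) for c in compressed_blocks)

print("single stream:     ", len(whole), "bytes")
print("independent blocks:", block_total, "bytes")
```

The input here is deliberately a worst case for block splitting; on real files the penalty depends on how much redundancy crosses block boundaries.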
Software Engineer. Bitsum LLC.

BenYeeHua

OK
So do the Transactional Synchronization Extensions (TSX) make better use of cores? :)

And is it true that developers can design software to make better use of Intel processors with HT cores? :)

Jeremy Collake

Quote from: BenYeeHua on October 23, 2012, 12:52:48 PM
So do the Transactional Synchronization Extensions (TSX) make better use of cores? :)

We have communication problems. This sentence covers that: ".. performance may improve as the OS improves its own multi-core use" . Yes, improvements in the underlying OS do indeed affect the performance, but they can't drastically alter it in most cases. Very marginal.

Quote
And is it true that developers can design software to make better use of Intel processors with HT cores? :)

It depends on the application, but the idea is simply to increase parallelism, not looking any deeper (not caring about HT cores, etc.; that is for the CPU scheduler). Parallelism is what application developers work on: making their application do more work simultaneously by using multiple threads, when possible.
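
As a minimal sketch of what "more simultaneously, when possible" looks like in code, assuming the work splits into pieces that do not depend on each other: the buffers and the `digest` helper below are hypothetical, and hashing is used only because CPython's hashlib releases the interpreter lock on large inputs, so the threads can really run on separate cores.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def digest(buf: bytes) -> str:
    """Hash one buffer; no call depends on the result of any other call."""
    return hashlib.sha256(buf).hexdigest()

# Eight independent 1 MiB work items (made-up data for the demo).
buffers = [bytes([i]) * (1024 * 1024) for i in range(8)]

# One core: items are processed one after another.
sequential = [digest(b) for b in buffers]

# Several cores: the pool hands items to worker threads. Which core each
# thread lands on (HT sibling or not) is left to the OS scheduler.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(digest, buffers))

print(parallel == sequential)
```

The application only expresses that the items are independent; it does not pick cores itself.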
Software Engineer. Bitsum LLC.

BenYeeHua

OK
What other OS functions affect multi-core performance besides core parking?
For example, treating the second core inside an AMD Bulldozer module like an HT core?

If the OS treats an AMD Bulldozer module as a unit with a shared L2 cache, can it increase performance by putting related threads on the same module, so they read from L2 instead of L3? :)

And what else can the OS do to increase multi-core performance?

Jeremy Collake

I will let you know if I discover any area of the OS that you should tweak, or that is tweakable, and that matters. The OS is a highly complex series of abstraction layers, so there is no easy answer to your questions. You are not going to find any magic bullet that does anything more than increase performance marginally, I can bet on that. Putting 'like threads' on the same core defeats the purpose of parallelism, and unless the threads share identical code it wouldn't be of any assistance, if it helps at all, as I believe the cache is tossed out when a new thread context is put on the CPU (it may or may not be fully tossed from L1 to L3, but either way, same result).
Software Engineer. Bitsum LLC.

BenYeeHua

Ya, but 1+1+1+1+1+n can add up to a lot of performance.
For example, disabling core parking and dynamic tick in Windows 8 increases performance a lot.
Although it only solves the stuttering, that helps a lot for gamers, and for real-time music editing too.
----
Yes, there is the harm of switching threads, so the L1, L2, and L3 caches are important for multitasking?
----
If the number of active threads (or processes) decreases, will performance increase because fewer threads are switching and jumping between cores? :)

edkiefer

Most of the issue of how well an app can multi-thread is what kind of app it is; developers can only do so much, depending on the app.

Things like Photoshop, 3ds Max, and many other utility-type apps can be multi-threaded well. This is because they can run parallel threads easily.

Now jump to games and you're in a totally different type of app. Here the game/sim is waiting on user inputs or user-defined areas: movement, damage, etc. So it can't work in parallel too far ahead, as it needs to know what to do.
Bitsum QA Engineer

Jeremy Collake

+1 to Ed. That's what I've been trying to express several times. There are a lot of operations that simply must be done in a linear fashion. The output of operation X must be known to be used for the input of operation Y, to put what Ed said in confusing language ;p. I got tired of trying to explain this to be honest.
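
One common way to put a number on this is Amdahl's law, which is not mentioned elsewhere in this thread: if only a fraction p of a program's run time can be spread across n cores, the best possible speedup is 1 / ((1 - p) + p / n). A minimal sketch, with a made-up 50% parallel fraction:

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only part of the work can run in parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

# Hypothetical program: half of its run time is a chain of dependent steps
# (Y needs the output of X), the other half parallelizes perfectly.
for cores in (2, 4, 8, 16):
    print(cores, "cores:", round(amdahl_speedup(0.5, cores), 2), "x")
```

With half the work stuck in a linear chain, the speedup stays below 2x no matter how many cores are added.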

BenYeeHua, I appreciate your efforts to tune every last bit of performance out of your OS, but I honestly believe your efforts will be counter-productive if you continue down this path. Too much tweaking is a bad thing and can have unpredictable results. In a *best case* scenario, how much could you even gain? 1% speed improvement? That's the issue here. That's why Process Lasso focuses on responsiveness more than performance, because for any software to say that it can magically boost performance - well, that's an iffy statement to say the least. No registry operations, no memory operations, and no CPU operations are going to make your system run *faster*, at least not at any meaningful level. They can change the behavior of your PC to what you desire, get background apps out of the way, etc., but they aren't going to make your PC faster.
Software Engineer. Bitsum LLC.

BenYeeHua

Ya, Process Lasso reduces the performance given to CPU eaters in the background and gives it to the other apps.
Increasing the responsiveness is also like increasing the performance for yourself (for example, on a netbook).

That's why so many Firefox users moved to Chrome: they feel Chrome is more responsive (and faster), so they can do more because they see the website earlier.
There are some Firefox tweaks that can be done too, like changing nglayout.initialpaint.delay to a smaller or larger value according to the network, so it can show the website sooner (after pressing Enter) at the cost of a longer total load time.
A longer total load time sounds like decreased performance, but since the page shows up faster, I can see/read the page first.
----
And I think I am trying to increase the responsiveness by decreasing the time spent waiting for things to process.
What I am doing is just reducing the stuttering of the computer, like defragmenting the hard disk and reducing the seek time by putting the most frequently accessed files ($MFT, for example) under where the read/write head parks (not the parking zone).

There are some ways to decrease stuttering/increase responsiveness, like decreasing the time spent waiting to process.
For example, OneFrameThreadLag in Batman: Arkham City makes two threads, a game thread and a rendering thread.
So while a frame is rendering, the game thread doesn't need to wait for rendering to finish; it keeps processing while the rendering happens.
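
The general pattern behind a game thread and a rendering thread can be sketched as a small producer/consumer pipeline. This is only an illustration of the idea, not how Batman: Arkham City or its engine implements OneFrameThreadLag; the frame count and the fake game state are made up.

```python
import queue
import threading

FRAMES = 5
frame_queue = queue.Queue(maxsize=1)  # at most one finished frame may wait
rendered = []

def game_thread():
    """Simulate frames and hand each one to the renderer."""
    for n in range(FRAMES):
        state = {"frame": n, "player_x": n * 10}  # hypothetical game state
        # put() only blocks when a finished frame is still waiting, so the
        # game thread cannot run arbitrarily far ahead of the renderer.
        frame_queue.put(state)
    frame_queue.put(None)  # sentinel: no more frames

def render_thread():
    """Draw frames in order while the game thread works on the next one."""
    while True:
        state = frame_queue.get()
        if state is None:
            break
        rendered.append(state["frame"])  # stand-in for the real drawing work

threads = [threading.Thread(target=game_thread),
           threading.Thread(target=render_thread)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(rendered)
```

The bounded queue is the design choice that matters: it lets the two threads overlap their work while keeping the displayed frame only slightly behind the simulated one.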

And I also disable core parking to get the background apps out of the way by putting them on the other, unused cores.
Otherwise the OS puts them on the core that is processing the foreground app to save power, which reduces responsiveness...
----
That is what I mean by making better use of cores: using the cores more effectively, not saving power, except when I am on battery.
I am trying to increase the performance for responsiveness, not reduce the responsiveness :)
Except when the computer is not in use. ;D