New feature request - IO priority

Started by adx, June 29, 2009, 12:28:28 AM


adx

First - great app, I've been looking for something like this for years. Windows 2000+ was a huge improvement in stability, but still remains just as susceptible to "pointless freezing" where some process freezes the GUI to the point where it's impractical to kill the offending process. Process Lasso is the first utility I've tried that actually works. So I just bought it.

Anyway, my computer still freezes, but because of a completely different problem: disk IO overload. I've noticed Process Lasso can actually help when the offending process also hogs CPU, but not by design.

The best example is the email client I use - Pegasus Mail. When it starts up it opens every file in the inbox to read the headers, so when the inbox grows to 2000 messages it takes an eternity to open the first time (but is almost instant once cached). Now that's fair enough if I want to keep using that app and have let my inbox get that full... The problem is that the computer is effectively unresponsive during that time: I can't open a browser window or do anything else that needs access to the disk. Other apps that use almost no CPU but open a ton of small files (or issue a ton of IO requests) have a similar effect.

So my suggestion would be for something to patch this behaviour - not full on IO prioritisation, perhaps just deny an IO hogging app any CPU cycles until its "hoggingness" reduces to a point where other apps can get a look in. But I don't know if the performance info is available (which app is causing the IO delays), or whether this approach would cause lock-ups (might need to deny it CPU only in bursts).


Jeremy Collake

#1
Thanks for your purchase and suggestion.

Your idea is a good one for sure, and has been requested by others as well.

There are some implementation challenges, but it can certainly be done. The I/O performance information is available. The tougher part will be in making sure the algorithm activates at appropriate times, and in an appropriate way. The algorithm will work best in Vista+ (where I/O prioritization exists), but will be effective in 2000 and XP as well.

I will see what I can do in a future version. This is something I think many people would benefit from. I had perpetually deferred this feature request in the past, mostly due to implementation difficulty, but the time has come I think. I can't say for sure when I'll get to it, but I'll keep you updated via this thread.

If you have any other suggestions, or need anything, don't hesitate to let me know.
Software Engineer. Bitsum LLC.

adx

That's great!

I had another thought after posting. The simplistic approach I suggested would slow down legitimate disk IO. In fact, now that I think about it, it would effectively prevent simultaneous CPU and disk usage (marketing name "16 bit disk access"). So it would need to know whether apps are currently "fighting" for IO time, ideally only ever acting on background tasks, or just on certain known-bad tasks. (IO is starting to look really silly, I'll use I/O.)

Enabling some sort of throttling on a per-task basis might be a good way to get it introduced, then people (like me) can turn it on progressively (task by task) and do some testing while still getting benefit. Just something very simple might work fine, as the goal is only to stop the computer becoming completely unresponsive rather than worry about optimum sharing.

Anyway I'd better get back to work!

Jeremy Collake

#3
LOL, "16-bit disk access".

The approach I have in mind re-prioritizes the processes making the I/O requests -- the assumption being that the unresponsive system state you experience results from other threads being forced to wait for their own I/O requests (i.e. page-in operations) to be fulfilled on the monopolized device. This may not be a valid assumption, but it seems like the most likely explanation.

In Vista+, the I/O priority is derived from the priority of the thread that originated the IRP (I/O Request Packet). In XP, even though there is no I/O priority, the rate of IRPs will typically be reduced during high loads for processes with less than normal priority, due to the decrease in CPU time slices granted to the threads of that process (the same effect is also seen in Vista+, of course).

This is a little less restrictive than any actual I/O throttling, as it will allow for continued I/O even during 'restraint'. However, even this approach will require manual configuration by the user to be effective. The problem is ensuring fair I/O access to all processes (well, threads) while preventing monopolization. It's not easy to determine when a thread is overly utilizing an I/O resource, nor is it easy to determine which threads require immediate I/O access and which don't.

I'll indeed have to force the user to make the decision on which processes should be given I/O precedence, and which should be run in the background. Then there is the next question about how to deal with differing I/O types...

Also, in some cases, an IRP may be delayed for reasons other than the resource being monopolized by other threads. The algorithm here is just difficult... to even know when a process is having I/O problems requires keeping track of the typical IRP latency (aka response time) and checking for deviation from the norm on that computer, for that device... since it will, of course, vary between systems and devices. 
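
The latency tracking described above could be sketched roughly as follows. This is only an illustration, not Process Lasso's implementation: the class name LatencyBaseline, the window size, the minimum sample count, and the 3x threshold are all assumptions chosen for the example. It keeps a rolling record of response times for one device and flags a sample that sits far above that device's recent norm.

```python
from collections import deque
from statistics import mean, pstdev


class LatencyBaseline:
    """Rolling record of I/O response times for one device (hypothetical helper)."""

    def __init__(self, window=50, min_samples=10, threshold=3.0):
        self.samples = deque(maxlen=window)  # most recent "normal" latencies, in ms
        self.min_samples = min_samples       # do not judge until we have a baseline
        self.threshold = threshold           # how many spreads above the mean is abnormal

    def is_abnormal(self, latency_ms):
        """Return True if latency_ms is far above this device's recent norm."""
        abnormal = False
        if len(self.samples) >= self.min_samples:
            avg = mean(self.samples)
            spread = pstdev(self.samples)
            # Guard against a zero spread when all samples are identical.
            limit = avg + self.threshold * max(spread, 0.1 * avg)
            abnormal = latency_ms > limit
        # Only normal samples update the baseline, so a long stall
        # does not gradually become the new "norm".
        if not abnormal:
            self.samples.append(latency_ms)
        return abnormal
```

One baseline per device would be needed, since the norm varies between systems and devices. A high reading on its own still does not say why the request was delayed.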

Anyway, you see why I avoided this for so long.. Once you get to thinking about the implementation, lots of questions arise. I'll have to keep thinking about it, and wait for some epiphany that greatly simplifies the algorithm. Even with the user deciding which processes get I/O precedence for certain devices, the technology is still difficult to implement... and I fear doing more harm than good.

I was kind of thinking aloud in this reply, so I apologize for its rambling nature..

Jeremy Collake

I thought about this a little more in my sleep. As a first step, I'll do what you suggest and implement some manual I/O throttling code. However, actual throttling of I/O will require a lot of work to implement, and require a kernel mode device driver. I think I can tweak the CPU throttling code so that I/O throttling is also induced, as an inherent side-effect of the CPU throttling. That may solve the problems of most users, and be simple to implement at the same time.
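
As a rough illustration of how coarse CPU throttling can also starve a process of the chance to issue I/O, here is a minimal sketch that alternately pauses and resumes a target in bursts. The thread does not say how Process Lasso actually throttles, so suspending and resuming is an assumption here, and the suspend and resume callables, the function name, and the timings are all hypothetical placeholders for whatever mechanism is really used.

```python
import time


def throttle_in_bursts(suspend, resume, pause_s, run_s, cycles, sleep=time.sleep):
    """Alternately pause and release a process in coarse bursts.

    suspend/resume: caller-supplied callables that stop and restart the
    target process (hypothetical; the mechanism is not specified here).
    pause_s: how long the target is held each cycle; while held it cannot
    issue new I/O requests, which is the side effect being relied on.
    run_s: how long the target runs freely between pauses.
    """
    for _ in range(cycles):
        suspend()
        try:
            sleep(pause_s)
        finally:
            # Always resume, even if the wait is interrupted.
            resume()
        sleep(run_s)
```

Longer pauses give other processes a wider window to get their own requests serviced, at the cost of slowing the throttled process more.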


knolle

Quote from: jeremy.collake on July 03, 2009, 11:38:26 AM
I think I can tweak the CPU throttling code so that I/O throttling is also induced, as an inherent side-effect of the CPU throttling. That may solve the problems of most users, and be simple to implement at the same time.

Definitely, and let users choose which processes this new I/O throttling applies to.

For example, some HD recording software uses a lot of resources, but you don't want to restrain it.

Jeremy Collake

#6
Quote from: knolle on July 03, 2009, 11:52:34 AM
Definitely, and let users choose which processes this new I/O throttling applies to.
For example, some HD recording software uses a lot of resources, but you don't want to restrain it.

Yes, I will.. no process will be throttled unless the user specifically chooses to throttle it.

I added to v3.61.3 beta a new 'lowest' CPU throttling tier. I think application of this throttle to an I/O intensive background process would mitigate its effects on system responsiveness. However, I'm still testing that hypothesis. If it works, then I can set up a way to automatically invoke it at appropriate times for processes the user indicates are problematic.

Additionally, I am going to revisit a different idea I abandoned last month. It is specific to Vista+, but *IF* I can make it work then it will be a great solution to I/O related responsiveness problems... My prior testing of this idea didn't go well, but perhaps a fresh look at it will yield better results. In theory, it should work.

I've been all over the map with my thoughts on this subject.. but that's a good thing I suppose. I really feel like there is a perfect solution that I'll stumble upon if I keep hacking at this issue. Of course, as adx suggested, the solution won't require completely fair I/O sharing.. we simply need to prevent total monopolization of certain devices (usually just the hard drive).

Of course, the fact that the I/O of a single process at normal priority can cause responsiveness problems, particularly in Vista, is somewhat disturbing.. it doesn't seem like this should occur.

adx

I seem to have triggered a flurry of activity!

I have also been trying to get my head around how one process can monopolize disk access. My theory was that anything which uses almost no CPU but does a lot of disk access (the "evil" task) can fire off requests at such a rate that it effectively stalls access to the disk for everything else. It doesn't matter what the "evil" task's CPU priority is, or whether the "good" tasks are using CPU or not: If not then the "evil" task will still get all the CPU it can eat at idle priority and thus still saturate the disk. If the "good" task is using a lot of CPU, then presumably the "evil" task can still saturate the disk even with the tiny fraction of CPU that it gets allocated, because it requires so little to do so (eg firing off reads in a tight loop).

But CPU is allocated in time slices, as far as I know in round robin style, so if I/O requests are processed on a first come first served basis then all applications should get a look in? Within one cycle of the scheduler too, especially if the "good" task is the foreground task. Maybe the "evil" task can bypass the CPU scheduler (eg interrupt/event driven chaining of read requests).

Anyway, I noticed that PL's throttling works in quite coarse bursts, as would be needed to throttle disk I/O. I can confirm that using it on an "evil" app (Pegasus Mail) stops it monopolizing I/O very well. BTW, opening the new mail folder takes ~5000 I/O reads (according to Process Explorer). Another "evil" app (Mailwasher) does 50000 when it opens, but they're not to separate files, so it presumably benefits from read-ahead caching because it's not 10 times slower - but it still prevents anything else from accessing the disk for the duration. So I think the bursty CPU throttling technique will work - a kernel mode driver sounds like a lot of trouble.

Yes, I think latency is the key to the decision making process, because that is what the user experiences. If a background task has built up 20 seconds of I/O latency to do 5000 I/O reads, while the foreground task has been waiting 20 seconds and has yet to have its first read return any data, then something is obviously out of balance.

Disk spin up is another big cause of delays on my system (Seagate external drive came formatted with NTFS = mandatory recycle bin access any time a file is deleted). But that and Vista's problems are a whole separate issue!

adx

No, I'm still a bit confused about the sharing criteria. What's important is the way I/O "time" is shared between tasks, ie you need to know how long the I/O channel is spending on each request for each task, not the total latency for each task's I/O. In my example above the foreground task may have been waiting 20 seconds but has probably received 0 seconds of I/O time. It may be that Windows doesn't provide this info?

I know it provides bytes/sec, but that's even more useless: Reading 1 byte from the beginning of 5000 random files on the disk (close to what Pegasus Mail does) will transfer less than 5K of data in 20 seconds or however long it takes. Whereas Word might want to open a 50MB file which the disk can provide in 1 second. If sharing is based on data rate, then this won't even remotely work (possibly why Vista has a problem).
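
To put rough numbers on that comparison, here is a small worked example using only the figures quoted above. The function name is made up for the illustration, and 50MB is taken as 50 * 1024 * 1024 bytes, which is an assumption about the unit.

```python
def data_rate(total_bytes, seconds):
    """Average throughput in bytes per second."""
    return total_bytes / seconds


# Figures taken from the post above; both workloads keep the disk fully busy.
pegasus_rate = data_rate(5000 * 1, 20)       # 5000 one-byte reads over 20 seconds
word_rate = data_rate(50 * 1024 * 1024, 1)   # one 50MB file read in 1 second

print(f"Pegasus Mail: {pegasus_rate:.0f} bytes/sec")
print(f"Word:         {word_rate:.0f} bytes/sec")
print(f"Ratio:        {word_rate / pegasus_rate:.0f}x")
```

The task that ties up the disk for 20 seconds shows by far the lower data rate, so a scheduler that shares on bytes/sec would see it as the lighter user.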

Perhaps some guesses can be made based on latency, reads, data rate, and possibly whether the requests are for different files (implying a lot of time spent seeking)?

Jeremy Collake

#9
Quote
But CPU is allocated in time slices, as far as I know in round robin style, so if I/O requests are processed on a first come first served basis then all applications should get a look in? Within one cycle of the scheduler too, especially if the "good" task is the foreground task. Maybe the "evil" task can bypass the CPU scheduler (eg interrupt/event driven chaining of read requests).

Yes, one would imagine that both the CPU scheduler and I/O scheduler would be more fair than they are. I don't believe the cause is pre-emptive interrupts. It just seems to be the way Windows is designed. You're right that it doesn't take many CPU cycles to fire off a bunch of I/O requests.. which may be part of the issue. The other, I think, is the size of the I/O requests...

With Vista, I/O is supposed to be prioritized.. but I really haven't seen this work at all. In fact, it seems to me that XP works better... perhaps due to a limit on the IRP size. In XP and below, there was a limit on the size of a single I/O request - 64KB, iirc. With Vista, there is no limit. So, a process could fire off a request to read 10MB... this single request will keep the drive busy servicing it, leaving other I/O requests to wait. At least, that's a possibility...

I just googled to check my facts, here's a good summary of the changes to the Vista I/O scheduler: http://en.wikipedia.org/wiki/Windows_Vista_I/O_technologies

UPDATE: I'm going to see if it's possible to artificially limit the maximum I/O request size in Vista+ to test my hypothesis.

Quote from: adx on July 03, 2009, 11:46:15 PM
Perhaps some guesses can be made based on latency, reads, data rate, and possibly whether the requests are for different files (implying a lot of time spent seeking)?

Yes, that may be possible, but still difficult to implement. I still can't envision a good way to determine when an I/O intensive process needs to be temporarily throttled. There doesn't seem to be a perfect way to determine this... as best I can tell. With CPU utilization, it's pretty easy.. with I/O, it's not at all.

I'm glad the CPU throttling works to reduce the I/O monopolization, as I suspected it would. Process Lasso does throttle in very coarse bursts... though I've been experimenting with throttling in very short bursts as well (higher granularity, more throttles per second I mean). You're right that the coarser bursts probably do better at I/O throttling anyway.

Quote
No, I'm still a bit confused about the sharing criteria. What's important is the way I/O "time" is shared between tasks, ie you need to know how long the I/O channel is spending on each request for each task, not the total latency for each task's I/O. In my example above the foreground task may have been waiting 20 seconds but has probably received 0 seconds of I/O time. It may be that Windows doesn't provide this info?

Yes, an important metric is how much time is dedicated to servicing the requests of each calling thread. I don't know how to calculate this off the top of my head, though there may be a way. I can track I/O response time pretty easily, along with anything else that you can track with the performance monitor. I'll have to look and see if there are any other metrics that might help in determining when a process needs throttling.

Data rate/throughput definitely won't work, as you concluded.

I can determine when the hard drive is being monopolized, then infer the most likely culprit. This can be accomplished simply by periodically issuing I/O requests on the drive, and tracking the response time. Of course, CPU utilization and other system factors can also affect the IRP response time, but we'll ignore that for now.
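
A minimal sketch of such a probe, assuming a small timed read is an acceptable stand-in for "issuing an I/O request on the drive". The function name and the 4096-byte read size are made up for the illustration.

```python
import os
import time


def probe_read_latency_ms(path, size=4096):
    """Time one small read from path and return the elapsed milliseconds."""
    start = time.perf_counter()
    fd = os.open(path, os.O_RDONLY)
    try:
        # Caveat: the OS file cache may satisfy this read without touching
        # the disk, so a real probe would need to bypass caching somehow.
        os.read(fd, size)
    finally:
        os.close(fd)
    return (time.perf_counter() - start) * 1000.0
```

Calling this every few seconds on a file that lives on the drive of interest would give a series of response times to watch. The caching caveat matters: repeated reads of the same file are likely to be served from memory, which would hide exactly the stalls the probe is meant to detect.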

So, if Process Lasso were to track the I/O response time for the primary hard drive, and saw that response time increase, then it could infer which process(es) are causing this and throttle them. I could infer this based on simple CPU utilization.. while it wouldn't be perfect, I think it's likely the monopolizing process(es) are going to be utilizing more CPU cycles than other processes. So, it's probably a reasonably reliable way to determine which processes are causing the responsiveness issue.
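
The inference step could look something like this sketch. It combines two ideas from this thread: only processes the user has marked as throttle-able are considered, and among those the heaviest CPU user is the prime suspect. The function name, the parameter names, and the 1% floor are assumptions for the example.

```python
def pick_suspects(cpu_percent, flagged, min_cpu=1.0, max_suspects=1):
    """Rank user-flagged processes by CPU use and return the top candidates.

    cpu_percent: dict mapping process name to recent CPU usage in percent.
    flagged: names the user has marked as allowed to be throttled.
    """
    candidates = [
        (usage, name)
        for name, usage in cpu_percent.items()
        if name in flagged and usage >= min_cpu
    ]
    # Highest CPU usage first.
    candidates.sort(reverse=True)
    return [name for usage, name in candidates[:max_suspects]]
```

For example, with usage of {"pmail.exe": 4.0, "excel.exe": 1.5, "recorder.exe": 30.0} and only pmail.exe and excel.exe flagged, the recorder is never considered and pmail.exe is returned. A process that saturates the disk while using very little CPU would still slip past this heuristic.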

The problem with this approach is that we really only want to throttle at certain times, not every time the disk is heavily utilized; throttling that often would decrease performance.

Ugh, anyway.. my brain is broken again this morning. I'll think more about this as time passes.. I am still waiting on an epiphany ;).

adx

An update - I did a bit more testing and noticed that apps like Pegasus Mail (reading an oversized new mail folder with thousands of files on startup) don't always monopolize the disk. Often Excel won't load at all until Pmail has finished reading its files (takes a good 20 seconds or more), but other times it seems to load slowly as you'd expect. I can't explain it. This is after a reboot, nothing in the cache. BTW Pmail uses about 4% CPU while doing its thing.

Another possible idea for gauging an active but I/O hogging task might simply be to look at its bytes per sec while I/O is saturated. If the rate is low (but not zero) then it's very likely that the task is making the disk seek a lot while transferring very little. It might even be possible to pre-detect this situation when no other tasks are waiting, so the task is already positively identified as a problem. While it's possible that a non-hogging task (reading sequential bytes from a big file) could get caught up in this detection (due to its requests being delayed by a hogging task), most software will have a decent request size and therefore be unlikely to show consistently low bytes/sec. Not foolproof, but perhaps enough to point the finger at some background task and apply a little bit of CPU throttling?
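
That heuristic is simple enough to write down as a sketch. The function name and the 64KB/sec cut-off are assumptions picked for the illustration, and "saturated" is taken as something measured elsewhere.

```python
def looks_like_seek_hog(bytes_per_sec, disk_saturated, low_rate=64 * 1024):
    """Flag a task that is active on a saturated disk but moving very little data.

    bytes_per_sec: the task's recent I/O data rate.
    disk_saturated: True if the disk is currently fully busy.
    low_rate: hypothetical cut-off below which the rate counts as "low".
    """
    # Zero means the task is not doing I/O at all, so it cannot be the hog.
    return disk_saturated and 0 < bytes_per_sec < low_rate
```

With the Pegasus Mail figures from earlier (about 250 bytes/sec while the disk is busy) this returns True, while a 50MB/sec sequential read returns False. Requiring the low rate to persist over several samples would reduce the chance of flagging a task that is merely being delayed by the real hog.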

eman6628

I'm not sure what the development status of the feature requested in this thread is. Has it been added to the current version of Lasso, and I just can't find where to activate it? My main purpose is to determine which process is hogging the IO.

Jeremy Collake

@adx: Good ideas and feedback, sorry I didn't see it until now ;o. I'll consider that for v5.

As for I/O priorities: it is a very complex subject, and you'll not find a perfect solution. The I/O priority is going to be derived from the calling thread's priority, so in a way Process Lasso already addresses I/O priorities. It doesn't do so directly, but does indirectly. Specifically trying to adjust I/O priorities is difficult to do *right* in an effort to boost performance. It will take a lot of testing and review, much more than it took with ProBalance even.

With Windows Vista and Windows 7 the new Resource Monitor is great for determining what is hogging the I/O of whatever device you are concerned about (e.g. hard drive). It's too bad nobody has written one for XP that is that cool, at least not that I've seen.


Jeremy Collake

#13
Implemented as of v4.09.07 beta. Some work remains, but finally done. However, the I/O priority is derived from the calling thread's priority, so unless you're in the odd situation of needing a different CPU priority than I/O priority, it may not be as useful as you hoped ;o.