Disk fragmentation: the bright side.
Disk fragmentation is the phenomenon of a file being scattered across different parts of the disk instead of stored contiguously. It happens after many read/write operations: when you need to append data to a file, the next block may already be taken, or there may not be enough contiguous free space even though the sum of the free gaps is enough to hold the data. In both cases we end up with a file spread across the hard disk that takes much longer to read, since the drive head has to move between all the different locations (a slow operation compared with just reading without repositioning the head).
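The situation above can be sketched with a toy allocator. This is purely illustrative (the `allocate` function and the block numbers are invented for this post, not a real filesystem API): the file fits in the total free space, but no single gap is big enough, so it ends up split into several extents.

```python
def allocate(free_gaps, file_blocks):
    """Place a file of `file_blocks` blocks into free gaps given as
    (start, length) pairs, splitting the file across gaps when no
    single gap is large enough. Returns the (start, length) extents used."""
    extents = []
    remaining = file_blocks
    for start, length in free_gaps:
        if remaining == 0:
            break
        used = min(length, remaining)   # fill this gap as far as possible
        extents.append((start, used))
        remaining -= used
    if remaining > 0:
        raise OSError("disk full")
    return extents

# Free space: three gaps of 4, 3 and 5 blocks. Total free = 12 blocks,
# but the largest contiguous gap is only 5 blocks.
gaps = [(10, 4), (30, 3), (70, 5)]
print(allocate(gaps, 9))   # a 9-block file becomes 3 scattered extents
```

Reading that file back now requires two extra head seeks, one per additional extent, which is exactly the cost described above.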
This problem is more common on Microsoft file systems, but not exclusive to them. Other operating systems can suffer disk fragmentation too, but they try to deal with it. To work around the problem, the OS combines two strategies: first, it does not care whether a file is going to be fragmented and just writes it as quickly as possible; second, it provides a program that tries to relocate all the fragmented files into contiguous form.
This fragmentation-removal process is usually called defragmentation, so users commonly (and wrongly) describe a fragmented disk as "defragmented".
In any case, since read/write operations are quite slow and, as explained above, repositioning the read head is time consuming, defragmentation is a slow process that takes a long time and makes the machine almost unusable while it runs (that's why people tend to defragment their disks when they are not using their machines).
After these thoughts you may start to think that a fragmented disk is quite bad and that the operating system should do a better job. But that is not true in every case. A proactive approach would detect disk fragmentation before it becomes a problem, relocating files on the fly so that fragmentation never hurts.
In theory the proactive approach sounds amazing, but in practice it would not involve many fewer read/write operations than a full defragmentation does. It merely delays the problem, and in the worst-case scenario the slowdown could strike at unexpected moments, even when you need a fast response from the machine. That's why operating systems tend to ignore the fragmentation problem: it is too complicated, and the gain is too small to be worth the extra work.
A combined delayed/proactive defragmentation strategy would be more appropriate, since it could start moving files when the machine is idle. There could be an agent or daemon monitoring hard drive status and processor usage, working in a similar way to the Seti@home project.
The other point is that not all fragmentation is bad; in some scenarios it can actually be helpful. That may sound crazy, but I will try to explain myself in the following lines.
The main problem with disk fragmentation is that you need to reposition the read head in order to read the complete file. And that's true, but it is not true that you need to read the whole file at once.
On modern operating systems and machines, multitasking/multiprocessing is normal: people run more than one process, and most of the time they have fewer processors than running processes. Execution is shared among the processes, so each one gets a small fraction of a second to do its job. A process gains control, starts executing, realizes it needs to read some data from disk, and asks the OS to read the file; the OS signals the hard drive to move the head to a given position, because until then the head was reading the previous process's data. The read begins, but the process's time slice runs out and control passes to another process, which will also need to move the read head.
So, what if the files needed by the different processes were fragmented in an intelligent way, so that whenever a process gains control the read head is sitting right above the data it is about to read? Then, in this hypothetical case, we would get better performance than if the files were unfragmented. But the preconditions are so demanding that it is impractical in most cases: you would need to know, before execution, which processes will run, and also the order and the data they are going to read. Moreover, that layout depends on the clock speed of the processor and the speed of the hard drive.
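A rough back-of-the-envelope model makes the intuition concrete. Assume (all numbers and the one-block-per-slice scheduling are invented for illustration) two processes reading their files in round-robin time slices: with a conventional contiguous layout the head ping-pongs between the two files, while with the "intelligently fragmented" interleaved layout the next block needed is always adjacent to the head.

```python
def head_travel(reads):
    """Total head movement (in blocks) for a sequence of block positions."""
    travel = 0
    pos = reads[0]
    for nxt in reads[1:]:
        travel += abs(nxt - pos)   # distance of each seek
        pos = nxt
    return travel

# Process A's file occupies blocks 0-3, process B's blocks 100-103.
# Contiguous layout, round-robin scheduling (one block per slice):
contiguous = [0, 100, 1, 101, 2, 102, 3, 103]

# Interleaved layout: the two files alternate on disk, so the read
# order under the same scheduling is simply sequential:
interleaved = [0, 1, 2, 3, 4, 5, 6, 7]

print(head_travel(contiguous))    # the head seeks back and forth
print(head_travel(interleaved))   # the head barely moves
```

Of course, this only works because the model fixes the process order, the read order, and the slice size in advance, which is exactly the demanding precondition described above.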
Does that sound cumbersome? Yes, it is. But there is one case (maybe more) in which you know beforehand how the processes will be started and which data they will need: when the system is booting up, it loads and executes various processes and services, and most of the time they are the same ones, invoked in the same order, every time.
So in that particular case it would be possible to organize the data on the disk in a fragmented layout that minimizes head movement while those processes load. The performance gain would probably not be noticeable enough to justify the data reordering and the complexity of the operation, but it is an interesting research area all the same.
In a nutshell, this post explained the concept of disk fragmentation, how it affects system performance, and why it is usually undesirable. It also presented an alternative way to manage fragmentation, a scenario where fragmentation is actually useful, and an interesting area for further study.
