Quote:
Originally Posted by
Zero4549 
@ Kramy - I was pretty sure it was possible as you have confirmed, but the question remains - does any program actually have this function?
Not to the level you want. Defraggler and MyDefrag do a far better job of getting related data close together, but they don't follow that rule to the letter - if something doesn't fit, it ends up elsewhere. Forcing it to do it perfectly is impossible while your OS is running, so defragmenters generally make a "best effort" attempt.
Quote:
Originally Posted by
Zero4549 
@ Grayfox99 - I understand that the OS would have difficulty defining how much additional space should be reserved for a file as it is being originally written, and that over time even this system would result in fragmentation and non-contiguous folders. That is certainly a hard problem to solve, if it's even possible.
It results in fragmentation extremely quickly. That's what plagues Java (heap fragmentation; time for a garbage collect!) and SSDs. NTFS's solution is to leave gaps all over, then fill them up as it needs to. Data usually gets pretty close, but it does get fragmented as it lands. Even small files can be dumped into a dozen or so pieces.
Basically, if you prioritize low fragmentation you'll have trouble getting the data down quickly; the drive may have to seek further. (We're talking a few extra milliseconds here, but if you're going after TPM/TPS benchmarks, I suppose that matters.) If you prioritize getting it down faster (like NTFS), you end up with more fragments. (Hopefully they land on the same track, where they don't affect access times much, but eventually NTFS spirals out of control and fragments end up splattered all across the drive.) Ext4 manages to do both well because of extents and delayed allocation, which give it more information about where to place files. Even that isn't perfect - no filesystem is perfect - but I'm fairly confident that Ext4 is superior to NTFS.
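To make that trade-off concrete, here's a toy sketch in Python. It is not how NTFS or Ext4 actually allocate; the `allocate` function, the policy names and the gap layout are all made up for illustration. The "fast" policy fills whatever gaps it reaches first, while the "contiguous" policy looks for one gap that holds the whole file before falling back.

```python
def allocate(gaps, size, policy="fast"):
    """Place a file of `size` blocks into free gaps.

    gaps   -- list of [start, length] free regions in disk order (mutated)
    policy -- "fast": fill gaps in the order they appear (hypothetical
              stand-in for "get the data down quickly")
              "contiguous": prefer the smallest single gap that fits
    Returns the list of (start, length) fragments the file ended up in.
    """
    if size <= 0:
        return []
    if sum(length for _, length in gaps) < size:
        raise ValueError("not enough free space")

    if policy == "contiguous":
        fitting = [g for g in gaps if g[1] >= size]
        if fitting:
            gap = min(fitting, key=lambda g: g[1])  # tightest gap that fits
            fragment = (gap[0], size)
            gap[0] += size
            gap[1] -= size
            return [fragment]
        # No single gap is big enough: fall through and split the file.

    fragments = []
    for gap in gaps:
        if size == 0:
            break
        used = min(gap[1], size)
        if used == 0:
            continue
        fragments.append((gap[0], used))
        gap[0] += used
        gap[1] -= used
        size -= used
    return fragments


if __name__ == "__main__":
    layout = [[0, 4], [10, 3], [20, 16]]
    print(allocate([g[:] for g in layout], 8, "fast"))
    print(allocate([g[:] for g in layout], 8, "contiguous"))
```

With those three gaps, an 8-block file lands in three pieces under "fast" and in one piece under "contiguous", at the cost of skipping past the nearer gaps (the extra seek mentioned above).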
Quote:
Originally Posted by
Zero4549 
In fact, I'm sure that is the entire reason that experts have focused so much effort into defrag software in the first place. So why stop at just traditional defrags? Once the data is already written to the drive and a defrag has been initiated and the files themselves have been made contiguous, why not go the extra step and organize the folders in a contiguous arrangement as well?
I'm pretty sure I've seen Defraggler doing that. But again, it doesn't follow the rule to the letter - if it thinks something is better off somewhere else, it puts it somewhere else.
Quote:
Originally Posted by
Zero4549 
Sure, new files added to the folder later will fragment the folder, but then the folder will be in just a few fragments rather than scattered randomly throughout the entire drive. Even then, another defrag a week later will solve that issue until new items are once again added.
Actually, your proposal results in extremely rapid fragmentation. Writes are constantly happening, so it's quite possible that every future write will create many fragments. That's because NTFS tries to get data down fast, which means it doesn't care much about where the data lands on the platter. Right beyond your folder with music in it, there will be... log files, a patched DLL, some web browser cache, more music, more other crud, etc. If it tried to keep data closer together like Linux filesystems do, that would not be the case, but then you'd have to sacrifice seek times. (Linux generally puts data near other data that is related or in the same folder. This can cause larger seeks than Windows as your drive fills up, but because it has less fragmentation and better caching algorithms, it generally maintains a higher performance level.) Anyway, what I'm trying to say is that, because of how NTFS actually works, your defrag idea will make fragmentation worse (the drive will fragment faster). If NTFS were smarter, it would not, but NTFS is not that smart.
Quote:
Originally Posted by
Zero4549 
Perhaps I'm missing something, but based on what I understand of my own (and presumably other people's) disk usage habits and file organization, this extra step, while perhaps more time consuming in the short term, would result in substantially less fragmentation over time, more consistent and faster performance within any given application (if perhaps at the expense of slightly longer start-up and shut-down times for said applications, but even that would be improved in the long run due to less file fragmentation), and more consistent performance across the entire length of larger drives.
But very little is actually user-created content on most computers. The vast majority is system files, programs, etc. Microsoft has studied them and found that most programs only read files in their own folders sporadically/randomly rather than sequentially. Dumping all ~30GB in Program Files in sequence will result in much slower access than monitoring which files are most used and dumping those files together on the very edge (with gaps to accommodate being written to). So that's what the Windows Defrag tries to do.
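As a rough illustration of "most used files go together at the edge", here's a minimal Python sketch. It's only the idea, not Windows Defrag's actual logic; `layout_order`, `hot_count` and the access counts are invented for the example.

```python
def layout_order(access_counts, hot_count=2):
    """Split files into a 'hot' group (placed together at the fast edge
    of the disk) and the rest, based on how often each file is read.

    access_counts -- dict of file name -> number of reads (hypothetical data)
    hot_count     -- how many files get the prime spot (made-up knob)
    """
    # Most-read first; ties are broken by name so the order is stable.
    ranked = sorted(access_counts, key=lambda name: (-access_counts[name], name))
    return ranked[:hot_count], ranked[hot_count:]


if __name__ == "__main__":
    counts = {"a.dll": 50, "b.exe": 200, "c.txt": 1, "d.dat": 3, "e.log": 7}
    hot, rest = layout_order(counts)
    print("edge of disk:", hot)
    print("everything else:", rest)
```

Note that the folder a file lives in never enters into it, which is the point: usage data, not directory structure, decides placement.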
I don't believe Defraggler has access to the same info, so it tries to prioritize keeping similar data together, and keeping folders together. (I read that on a forum post several years back; no idea if it's still true.) Also, it does a better job than Windows defrag at keeping fragmentation low; something to do with the gap sizes it chooses to leave open, and where it places them.
Quote:
Originally Posted by
Zero4549 
Perhaps it's a problem that has simply not been thought about extensively due to simply not having existed long enough. Until very recently, hard drives were typically quite small, and systems with large storage space usually arrived at that end through multiple hard drives (and thus multiple partitions, which almost automatically means less scattering of related files).
Today, with it being fairly easy, inexpensive, and even somewhat practical to build a 4+TB RAID array of already large disks (1+ TB) with a single large partition, and with people accumulating significantly more files, older file management techniques are seeming pretty inadequate IMO (at least within a Windows environment).
I think the solution lies in providing the filesystem with more input, rather than trying to design smarter algorithms (which ultimately fail in one use case or another).
Microsoft recently added the ability to customize a folder view (always show thumbnails for pictures/videos, never do, etc.). This is on a folder-by-folder basis - why not have some filesystem options available for folders? Maybe make it a PowerToy so newbies don't screw up their systems? The info could be read by defragmenters and the filesystem, to help them make smarter decisions.
Here are some example options... although I'd recommend giving it more than 5 minutes of thought before implementing such a thing.

[x] Keep folder contiguous when possible.
[x] Allocate in 64MB fragments when possible.
[x] Allow spare/wasted space after folder.
Option 1 is what you want if you've got music or pictures in a folder, and you're playing/viewing them in sequence. With all pictures stored sequentially, you could have SSD-level responsiveness when viewing picture folders. It would not be good for video data (too large to be contiguous on your drive), so Microsoft would have to set a sane limit, like exempting any file over 32-64MB.
Option 2 tells it to prioritize writing to large gaps, because there's very active data in that folder which will expand further. If you were just dumping pictures off a camera, this would be less important - but if you have a video editing program rapidly creating dozens of mini thumbnails for sections of video (and then deleting or overwriting them), it would be highly important.
Option 3 would tell it that it's fine to keep lots of empty space after the folder, and to avoid using it for other stuff (like log files). In conjunction with Option 1, it would prioritize keeping that folder's data close together with newly written data.
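Here's a rough Python sketch of how an allocator could consume those three options. Everything in it is hypothetical: no filesystem exposes a `FolderHints` structure like this, and `pick_gap`, `large_gap_blocks` and `reserved` are names made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FolderHints:
    keep_contiguous: bool = False         # Option 1
    prefer_large_gaps: bool = False       # Option 2
    reserve_trailing_space: bool = False  # Option 3 (creates reservations)
    large_gap_blocks: int = 64            # made-up threshold for "large"


def pick_gap(gaps, size, folder_end, hints, reserved=frozenset()):
    """Choose where a new file of `size` blocks should start.

    gaps       -- list of (start, length) free regions
    folder_end -- block where this folder's existing data ends
    reserved   -- starts of gaps that OTHER folders claimed via Option 3
    Returns the chosen gap's start, or None if nothing fits in one piece.
    """
    candidates = [g for g in gaps if g[1] >= size and g[0] not in reserved]
    if not candidates:
        return None
    if hints.prefer_large_gaps:
        big = [g for g in candidates if g[1] >= hints.large_gap_blocks]
        if big:
            candidates = big  # leave room for the folder's data to grow
    if hints.keep_contiguous:
        # Land as close as possible to the folder's existing data.
        return min(candidates, key=lambda g: abs(g[0] - folder_end))[0]
    # No hint: just take the first gap on the disk that fits.
    return min(candidates, key=lambda g: g[0])[0]


if __name__ == "__main__":
    gaps = [(0, 10), (100, 200), (500, 80)]
    print(pick_gap(gaps, 8, 480, FolderHints()))
    print(pick_gap(gaps, 8, 480, FolderHints(keep_contiguous=True)))
    print(pick_gap(gaps, 8, 480, FolderHints(prefer_large_gaps=True)))
```

In this sketch Option 3 shows up from the other side: a folder that sets it would add the gap after its data to `reserved`, so writes belonging to other folders (like log files) skip that gap.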
More input is the best way for filesystems to become smarter - and if Microsoft insists on making the algorithms better rather than relying on input... do both. It doesn't take a genius to write a daemon that detects picture folders and flips some options so they load faster.
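The detection half of such a daemon could be as simple as the Python sketch below. The extension list, the 80% threshold and the minimum file count are arbitrary assumptions, and actually "flipping the options" is left out because no such filesystem setting exists today.

```python
from pathlib import Path

# Assumed list of picture extensions; extend to taste.
PICTURE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".tif", ".tiff"}


def looks_like_picture_folder(folder, threshold=0.8, min_files=5):
    """Guess whether `folder` is mostly pictures.

    threshold -- fraction of files that must be pictures (arbitrary)
    min_files -- ignore tiny folders where the ratio means little (arbitrary)
    Only looks at files directly inside the folder, not subfolders.
    """
    files = [p for p in Path(folder).iterdir() if p.is_file()]
    if len(files) < min_files:
        return False
    pictures = sum(1 for p in files if p.suffix.lower() in PICTURE_EXTENSIONS)
    return pictures / len(files) >= threshold


if __name__ == "__main__":
    import sys

    for arg in sys.argv[1:]:
        print(arg, looks_like_picture_folder(arg))
```

A real daemon would run this on folder change notifications and then set the "keep contiguous" hint for folders that match.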

Anyway, there's my thoughts for the day.