> Towards A Single Folder Filesystem (30/07/2004)
Towards A Single Folder Filesystem
This article was written after much irritation with the present filesystem used in almost all computer operating systems today. Excluding Windows stuff, I currently have around 80,000 files on my hard drive - over half of which are sound files of various kinds and formats. After organizing such a vast collection with so many different file formats, it's clear that the present way of finding and storing files is much less efficient than a meta-tagged single folder filesystem (or database file system) for various reasons I have discussed below.
Microsoft and Apple have shown signs that the upcoming Windows Longhorn OS and Mac OS X will add some emphasis to file metadata, but I have my doubts that they will fully realize its potential, and utilize a single 'folder' for practically all files.
The metadata solution
"I forgot where I put it..."
Advantages and disadvantages
If we are going to have folders, it's a shame we can't navigate by single clicking instead of double clicking (renaming and moving folders could be done via the use of the RMB or Ctrl click). Also, why is the bottom right corner so small when we need to resize the window? (It's okay when there's a left/right slider, but it should be larger at all times.)
We've all experienced the problem - having to search through countless directories and sub-directories to find the file you're after. In fact, after just a decade of using your computer, you might have wasted hundreds of hours searching, organizing, and simply navigating the various folders of your hard drive. This problem is compounded if you tend to store dozens of gigabytes of data, or hundreds of thousands of files.
To make things worse, how often have you wondered where to save a certain kind of file? Very often, you'll find files which happily belong in two or more folders. For example, take music. Say you have lots of MP3s that you'd like to sort into folders. You might make folders based on group, composer, or the year the song was made. Or maybe you'd like to create folders categorised by quality.
The thing is, in the current folder hierarchy paradigm, you can't have all these categorisations. You have to pick one kind of grouping and stick with it. Sure, you can make two categorisation types (say by group and by date), but then the file will need to be copied to both directories - wasting disk space. Okay, the space issue can be partially solved if the filesystem is 'intelligent' and recognises that the file is identical and stored twice in different places. But there are still problems with this approach. Firstly, if you delete one of the two files, does the other get deleted? In some cases you would want both to be gone, but in others you might want just one of them to be removed. Secondly, it means tediously copying the same file to any other categorisations you might have. These other folders could be located in a completely different directory branch on the hard drive.
Clearly, the current way of doing things is far from perfect.
The metadata solution
But imagine a filesystem with no folders at all. Yes, you heard right, every single file is stored in the root folder! Does that sound like a mess to you?
Have you heard of metatags or metadata? They've been around on the web for a long time now, though relatively few web pages use them. But they could well be the key to freeing our filesystems. The idea is that instead of wading through folders to cut and paste or locate a file, I can give files metadata by typing (or selecting from an intelligent dropdown menu) specific keywords which relate to that file.
When you want to find it later, you can enter these words, and all files containing them instantly appear in a list (quicker than Microsoft's search at the very least I hope!). They could be ordered by date, file size, how popular the file is (how many times it's been accessed by the user), or some other kind of attribute. Of course, that doesn't mean you need to painfully give each file the same keywords. It would be simple to give the same set of keywords to many files at once.
Let's say I want to store the song "All over the world" by Electric Light Orchestra. I would add these keywords:
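To make the idea concrete, here's a rough Python sketch of tagging several files at once and finding them again by keyword. The `TagIndex` class and the file names are made up for illustration; this is a toy in-memory model, not how any real filesystem does it.

```python
from collections import defaultdict


class TagIndex:
    """Toy in-memory keyword index (hypothetical design, for illustration only)."""

    def __init__(self):
        self.tags_by_file = defaultdict(set)   # file name -> its keywords
        self.files_by_tag = defaultdict(set)   # keyword -> files carrying it

    def tag(self, files, keywords):
        """Give the same set of keywords to many files at once."""
        for name in files:
            for word in keywords:
                word = word.lower()
                self.tags_by_file[name].add(word)
                self.files_by_tag[word].add(name)

    def find(self, *keywords):
        """Return the files carrying every given keyword, sorted by name."""
        sets = [self.files_by_tag.get(word.lower(), set()) for word in keywords]
        if not sets:
            return []
        return sorted(set.intersection(*sets))


index = TagIndex()
# Two hypothetical files share a band keyword and the extension keyword.
index.tag(["all_over_the_world.mp3", "another_song.mp3"],
          ["Electric Light Orchestra", "mp3"])
index.tag(["all_over_the_world.mp3"], ["80s"])
print(index.find("electric light orchestra", "80s"))
```

Keeping an index in both directions means a search never has to scan every file, which is the sort of thing that would make the list appear instantly.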
- Electric Light Orchestra
Of course, it's not compulsory to add all of those, though naturally, the more info you enter, the easier it will be when you want to locate the file at a later date.
I'll go through some of these tags as there are important points to make:
"mp3" could be added by default, since it forms the extension of the file. Yep, why not treat the file extension as another keyword? I see no disadvantage because you wouldn't usually want the name for other purposes.
"Electric Light Orchestra": Similar to how the Opera browser handles bookmarks, one could set up the filesystem to change "elo" to the longer "Electric Light Orchestra" on the spot (think of Explorer's use of word completion).
"80s": Or should you put 1980s? Doesn't matter either way really. If you couldn't find the file, you could ask the intelligent filesystem to look for part-word matches (or it would do this by default making sure to let you know what it's doing if there are no matches). Similar to what Google does in fact when you type a misspelling.
Also, it would ask if you wanted to find every file whose metadata contains "80s", and change that keyword to "1980s" (or vice versa).
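The three tricks just described (shortcut expansion, part-word matching, and renaming a keyword everywhere) could be sketched roughly like this. All of the function names and the `ALIASES` table are my own inventions for illustration, and the fuzzy matching leans on Python's standard `difflib` module rather than anything Google-like.

```python
import difflib

# Hypothetical user-defined shortcuts, in the spirit of browser nicknames.
ALIASES = {"elo": "Electric Light Orchestra"}


def expand(keyword, aliases=ALIASES):
    """Swap a short alias for its full keyword; leave other keywords alone."""
    return aliases.get(keyword.lower(), keyword)


def suggest_keywords(query, known_keywords, cutoff=0.6):
    """Return an exact match, else part-word matches, else close spellings."""
    query = query.lower()
    known = [k.lower() for k in known_keywords]
    if query in known:
        return [query]
    partial = [k for k in known if query in k or k in query]
    if partial:
        return partial
    return difflib.get_close_matches(query, known, n=3, cutoff=cutoff)


def rename_keyword(tags_by_file, old, new):
    """Replace keyword `old` with `new` in every file; return how many changed."""
    changed = 0
    for keywords in tags_by_file.values():
        if old in keywords:
            keywords.discard(old)
            keywords.add(new)
            changed += 1
    return changed


library = {
    "all_over_the_world.mp3": {"mp3", "80s"},
    "another_song.mp3": {"mp3", "1980s"},
}
print(expand("elo"))
print(suggest_keywords("80s", ["1980s", "mp3"]))
print(rename_keyword(library, "80s", "1980s"))
```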
"q=43": This is a cool 'keyword' to add. I like the idea of giving all the music on my hard drive an aesthetic quality rating out of 60 to show how 'good' it is. When you come to search for it later, you could easily define a range from x to y, to enable the filter to show you a certain 'quality range' of files.
Here's one more example. Say I want to save a fractal picture that also happened to come from a television show such as the Simpsons. I have one folder on my hard drive where all my pictures go (maybe having to create another sub-folder for fractals), and another where everything on Simpsons trivia/multimedia is stored. Where do I put it? Storing it in both isn't a solution due to the inefficiency problems mentioned earlier. So I use keyword metadata:
- picture (added automatically)
- png (added automatically)
I've already mentioned briefly how you'll be able to retrieve the file. But now I'll talk about it in more detail. Say I wanted to search for that fractal picture. This brings us to a special window. It isn't a folder though :) It's an advanced filter with a field at the top so you can enter certain words. The filter would look through the keyword metatag data of all the files, and display any that matched (or nearly matched - giving a percentage of error). Like Google, if you wanted to search for a phrase, then you'd put the words in double quotes. Standard boolean operators can be used, such as NOT (use the minus symbol), AND (no symbol required here), or OR (the "|" symbol would be appropriate), along with brackets. When saving out the metadata in the first place, you could use commas to separate each keyword/phrase. This would ensure that if two keywords were next to each other, but you wanted them to be thought of as separate (not a phrase), then searching for them as a phrase would not return the file.
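Here's a minimal sketch of how such a filter might evaluate a query against one file's keywords. It assumes AND binds tighter than OR, that "|" is typed with spaces around it, and it leaves out brackets and near-matches entirely. The `matches` function is hypothetical; quoting is handled by Python's standard `shlex` module.

```python
import shlex


def matches(query, keywords):
    """Check one file's keywords against a query.

    Supported syntax (a simplified subset): words are ANDed together,
    a leading "-" means NOT, " | " means OR, and double quotes make a phrase.
    """
    keywords = {k.lower() for k in keywords}
    tokens = shlex.split(query.lower())   # keeps "quoted phrases" together

    # Split the token list into OR-groups at each "|" token.
    groups, current = [], []
    for token in tokens:
        if token == "|":
            groups.append(current)
            current = []
        else:
            current.append(token)
    groups.append(current)

    def group_ok(terms):
        for term in terms:
            if term.startswith("-") and len(term) > 1:
                if term[1:] in keywords:      # excluded keyword is present
                    return False
            elif term not in keywords:        # required keyword is missing
                return False
        return True

    return any(group_ok(group) for group in groups if group)


picture = {"simpsons", "fractal", "picture", "png"}
song = {"Electric Light Orchestra", "mp3"}
print(matches("simpsons fractal", picture))
print(matches('"electric light orchestra"', song))
print(matches("electric light", song))
```

Because each stored keyword or phrase is kept as one unit (the comma-separated idea), the last query finds nothing: "electric" and "light" on their own were never saved as keywords.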
Saving a file would be similar to navigating/loading a file. As you type in keywords, the list is updated dynamically and instantaneously so that you can see which 'collection' of files you'll be saving that one with. You'll also be able to choose from a selection of recently saved keywords (from a dropdown menu) to potentially save having to type out the same ones again and again. Similarly, 'auto-complete' could be used for quick access to keywords/phrases.
Naturally, the field/s would use a flexible, advanced and easy to use wildcard system (Google take note!). Search for phrases with a missing character (one star), or a number of missing characters in a word (two stars), or you can use three stars for any number of characters or words missing off the beginning/middle/end of a phrase. If you want, put phrases in quotes like Google lets you do, otherwise the keywords can be in any order. Treat numbers as a variable using a special "#" character so you can search for a number range.
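One way the star wildcards could work is by translating them into regular expressions. This is only a sketch under my own assumptions: one star stands for exactly one character, two stars for any run of non-space characters, three stars for anything at all, and the "#" number variable isn't covered. The function names are hypothetical.

```python
import re

# Assumed meaning of each wildcard, expressed as a regex fragment.
_STAR_TO_REGEX = {
    "*": ".",        # exactly one missing character
    "**": r"\S*",    # any number of missing characters within a word
    "***": ".*",     # any number of missing characters or words
}


def wildcard_to_regex(pattern):
    """Compile a star-wildcard pattern into a case-insensitive regex."""
    parts = re.split(r"(\*{1,3})", pattern)   # keep the star runs as tokens
    pieces = []
    for part in parts:
        if part in _STAR_TO_REGEX:
            pieces.append(_STAR_TO_REGEX[part])
        else:
            pieces.append(re.escape(part))    # literal text, safely escaped
    return re.compile("".join(pieces), re.IGNORECASE)


def wildcard_match(pattern, text):
    """True if the whole of `text` fits the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(text) is not None


print(wildcard_match("fract*l", "fractal"))
print(wildcard_match("sim**", "simpsons trivia"))
print(wildcard_match("all ***world", "All over the world"))
```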
To extend the concept further, the filter could optionally search the file's 'title' and 'description' too. Metadata like this failed with search engines such as Google because it's open to spam abuse. But this is your hard drive - you can put whatever metadata you like.
To display by date (created/modified/accessed), filetype, number of times the file has been accessed, or file size is easy. All of this metadata could be automatically added at file creation. By default, all of these special attributes are shown in the filter view, unless otherwise specified via a range filter. As these functions are so useful, it might be appropriate to add these to the main filter window in a prominent position (along with field/s for the entering of keywords). A nice idea would also be for the system at file creation to add the keyword "picture" to any files that the filesystem recognises as being a picture type (PNG, IFF, GIF, JPG etc. etc.).
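As a rough illustration, the automatic part could be gathered from information the operating system already keeps. The sketch below uses Python's standard `pathlib` and `tempfile`; the `automatic_metadata` and `filter_by_range` names, the record layout, and the list of picture extensions are my own assumptions. Creation time is left out because it isn't reported consistently across platforms.

```python
import tempfile
from pathlib import Path

# Assumed list of picture types; a real system would know many more.
PICTURE_EXTENSIONS = {"png", "iff", "gif", "jpg", "jpeg"}


def automatic_metadata(path):
    """Collect the metadata that needs no typing from the user."""
    path = Path(path)
    info = path.stat()
    extension = path.suffix.lstrip(".").lower()
    keywords = set()
    if extension:
        keywords.add(extension)            # the extension is just a keyword
    if extension in PICTURE_EXTENSIONS:
        keywords.add("picture")            # added for recognised picture types
    return {
        "keywords": keywords,
        "size": info.st_size,
        "modified": info.st_mtime,
        "accessed": info.st_atime,
        "access_count": 0,                 # would be updated by the filesystem
    }


def filter_by_range(records, attribute, low, high):
    """Keep records whose numeric attribute lies between low and high inclusive."""
    return [r for r in records if low <= r[attribute] <= high]


with tempfile.TemporaryDirectory() as folder:
    sample = Path(folder) / "simpsons_fractal.png"
    sample.write_bytes(b"demo")            # 4 bytes of placeholder content
    meta = automatic_metadata(sample)

print(sorted(meta["keywords"]), meta["size"])
```

The same range filter would work for a date, a file size, or an access count, since they are all just numbers in the record.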
I forgot where I put it...
In this new system, if you forget something, it's not going to be the actual directory location, it's going to be the actual keyword/s. What happens in this situation? What happens if I forget I used the keywords "simpsons", "fractal" and "picture" for that file? Well, in this situation, the OS could give you a full list of keywords that you have stored in files so far.
But there might be thousands of keywords that you've used?
That's okay, you can filter by the list of keywords you've used - using the same filter window and technique that you used for the files! If you forget the word "fractal", but remember the word "simpsons", just go into 'Find Keywords' mode, enter "simpsons", and the window will display all combinations of keyword sets which have "simpsons". You'd be filtering and searching the metadata itself! I'm sure there will be very few; perhaps just the one you want. Just like with the viewing of files, you can have all the power of the wildcard system (partial keyword matches etc.) along with date filtering, and number of keywords etc.
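A 'Find Keywords' mode could be as simple as the following sketch, which lists every distinct keyword set that includes the words you do remember. The function name and sample data are hypothetical.

```python
def keyword_sets_containing(tags_by_file, *known):
    """Return each distinct keyword set that includes all the known keywords."""
    known = {k.lower() for k in known}
    found = set()
    for keywords in tags_by_file.values():
        lowered = frozenset(k.lower() for k in keywords)
        if known <= lowered:               # every remembered word is present
            found.add(lowered)
    return sorted(sorted(keyword_set) for keyword_set in found)


tags = {
    "homer_fractal.png": {"simpsons", "fractal", "picture", "png"},
    "trivia.txt": {"simpsons", "trivia"},
    "song.mp3": {"mp3"},
}
for keyword_set in keyword_sets_containing(tags, "simpsons"):
    print(keyword_set)
```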
Also, you could ask it to search for semantic similarities by associating certain words with other words ('audio' with 'sound' with 'music' with 'tune' etc. etc.), using an integrated thesaurus. If you've accidentally used 'sound' as a keyword for some files, and 'audio' for others, then to tidy up things, you could get the filesystem to change all files to using just one of the words.
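Here's a small sketch of the thesaurus idea, assuming a hand-written table that maps each word onto one preferred keyword. The `SYNONYMS` table and function names are invented for the example; a real integrated thesaurus would be far larger.

```python
# Hypothetical thesaurus: each word points at the one keyword we prefer.
SYNONYMS = {"sound": "audio", "music": "audio", "tune": "audio"}


def canonical(keyword):
    """Map a keyword onto its preferred form (itself if it has no synonym)."""
    keyword = keyword.lower()
    return SYNONYMS.get(keyword, keyword)


def find_semantic(tags_by_file, keyword):
    """Find files tagged with the keyword or any word associated with it."""
    target = canonical(keyword)
    return sorted(
        name for name, keywords in tags_by_file.items()
        if target in {canonical(k) for k in keywords}
    )


def tidy_up(tags_by_file):
    """Return a copy in which every file uses only the preferred keywords."""
    return {
        name: {canonical(k) for k in keywords}
        for name, keywords in tags_by_file.items()
    }


files = {"a.wav": {"sound"}, "b.mp3": {"audio", "mp3"}, "c.png": {"picture"}}
print(find_semantic(files, "music"))
print(tidy_up(files)["a.wav"])
```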
Finally, the Opera browser has a neat system where you can associate a URL with a word of your choice. For example, I have it set up so that I type "s" into the URL field, and it goes to www.skytopia.com, or "sl", and it reaches slashdot.org etc. The metadata filesystem effectively already incorporates this idea. You'd just add a specific keyword which is exclusive to that file.
Could this whole new filesystem work for all types of files?
There may be cases where it would be nice to store a program's files in another folder. But in the long run, a single folder style filesystem would probably be better (programs and their associated files would include the program's name and possibly the keyword 'program' in their metadata). But of course, if you desperately need folders as well, that's no problem. Folders can happily co-exist with this single folder metadata system.
I suppose a potential problem might arise when the filename of a newly installed program (the main exe or an associated file) coincidentally uses the same filename as an existing file on the HD. Imagine you, or the program then trying to call the file - which one would it choose? If the filesystem sees two files with the same name, and can't decide which to use, it could look at the file creation date, and see which time is nearest. That won't always work though, especially if you create a file associated with the program at a later date. A neater solution is to include the keyword "program" along with the program's name. Any calls to the program's files can then look for that keyword and the program's name to make sure that it's the right file.
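The keyword-based lookup might look something like this sketch, where a call for a file can optionally name the program it belongs to. The record layout and the `resolve` function are assumptions of mine, not an existing API.

```python
def resolve(files, filename, program=None):
    """Pick the one file with this name, narrowing by program if given.

    Returns None when the name is missing or still ambiguous.
    """
    candidates = [f for f in files if f["name"] == filename]
    if program is not None:
        wanted = {"program", program.lower()}
        candidates = [f for f in candidates if wanted <= f["keywords"]]
    if len(candidates) == 1:
        return candidates[0]
    return None


# Two hypothetical files that happen to share a name.
stored = [
    {"name": "readme.txt", "keywords": {"program", "editor"}},
    {"name": "readme.txt", "keywords": {"notes", "holiday"}},
]
print(resolve(stored, "readme.txt"))             # ambiguous without a hint
print(resolve(stored, "readme.txt", "Editor"))   # narrowed by program keywords
```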
Similarly, stored web pages and associated files could contain the website's name and sub-topic. I'm not sure how resource intensive this is, but with computers of the future, I shouldn't imagine it will be too much of a problem.
Where are non-metatagged files stored before they join the others?
Usually, the file system would encourage you to give files metadata at the point they were created. But without this data, recently downloaded/created files are shown separately in a 'limbo' area. They remain there until they have been given keyword and/or description/title metadata.
I have a folder open and ready, and I just want to move/copy numerous files into it instead of giving each and every file the same bunch of metadata keywords. How can this new filesystem work?
Yes - you don't need to type in the same keywords over and over again for a bunch of files. You have two options. Either keep the files in the aforementioned 'limbo' folder, then select them all and give them all the same metadata at once. At this point, they join the millions of other files in the main jumbo folder.
Or you could use something similar to a dropdown menu or RMB menu where you can select from a list of keyword sets - sets which you have recently used. Like before, each one joins the jumbo folder as metadata is assigned.
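Both routes boil down to the same operation: untagged files wait in limbo, and a batch of them moves into the main collection once a keyword set is applied. Here's a minimal sketch, with a made-up `Filesystem` class standing in for the real thing.

```python
class Filesystem:
    """Toy model of the limbo area and the single main 'jumbo folder'."""

    def __init__(self):
        self.limbo = set()   # files that have no metadata yet
        self.main = {}       # file name -> keywords

    def create(self, name, keywords=()):
        """New files go straight to main if tagged, otherwise into limbo."""
        if keywords:
            self.main[name] = set(keywords)
        else:
            self.limbo.add(name)

    def tag_from_limbo(self, names, keywords):
        """Give many limbo files the same keyword set and move them to main."""
        keywords = set(keywords)
        if not keywords:
            raise ValueError("at least one keyword is needed to leave limbo")
        for name in names:
            if name in self.limbo:
                self.limbo.remove(name)
                self.main[name] = set(keywords)   # each file gets its own copy


fs = Filesystem()
for name in ["track1.mp3", "track2.mp3", "notes.txt"]:
    fs.create(name)
fs.tag_from_limbo(["track1.mp3", "track2.mp3"], ["mp3", "80s"])
print(sorted(fs.main), sorted(fs.limbo))
```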
Is a single directory filesystem for everybody?
Yes. In the long run, it's the best way to go. I'm not sure whether it can completely replace the use of directories without some initial problems however.
Advantages and disadvantages
Disadvantages:
- Will take a while to transfer everything over from our current hierarchy filesystem.
- Identical file names may be a problem unless some standards are implemented.
Advantages:
- No more cutting, pasting, or moving ever again.
- Massive reduction in the time taken to locate a file.
- No more hunting throughout directories when saving a file either.
- Powerful filter and metadata model allows flexible sorting and viewing of files.
A nice short read to say why metadata filesystems are good.
DBFS - Database File System. Proof of concept file system for KDE/Gnome.
Call for a MetaData-Enabled Filesystem
Database File System - Another name for a metadata filesystem is the 'database filesystem'. DBFS is an implementation for KDE. Here's a slashdot thread on it.
Explanation of Database File Systems
- A thesis on the subject, with some good links in the bibliography.
All pictures and text on this page are copyright 2004 onwards Daniel White.
If you wish to duplicate any of the information from this page, please contact me for permission.