... at least, not that tree-oriented, hierarchical model we've all gotten so accustomed to.
Sure, there'll be mechanisms for managing information objects, but calling them "file systems" can really hold back our thinking in some subtle yet powerful ways.
A Bit Of Personal History
Back when I was at UC Santa Cruz in the late 1970s, I had to take those obligatory learn-to-program classes as part of my computer science degree.
I was told to write my programs on paper, go to a punch-card machine and type them out, walk them over to a card reader, and then wait an indeterminate amount of time for a stack of green-bar paper to appear in a cubby hole that told me I had made one sort of careless mistake or another.
Now, that struck me as very counterproductive at the time, and it ran against my burning desire to spend as little effort as possible doing useless work.
I got access to the campus UNIX system, and figured out a way to use a text editor ("vi") to create my card images and submit them over the "net" (actually, a serial interface) to a "virtual card reader" on the campus mainframe. Later, I figured out how to redirect the print stream back to a UNIX file.
Needless to say, I got a lot more programming done in far less time once I did this, and was able to spend my time on -- ahem -- more worthwhile pursuits. I was also struck by how much work was needed to bridge the old and the new environments, instead of just going with the new.
Traditional file systems are fast becoming the punch cards and green-bar paper of the information age, in my humble opinion. And there will be those who spend an inordinate amount of effort to bridge the old and the new, and those who simply embrace the new.
Why are traditional file systems rapidly becoming obsolete?
First, there's the "lots of objects" problem.
I did a quick scan around my house, and estimate that I've got about 75,000 "information objects" sloshing around. Let's say I put that on some sort of cloud-like information service, along with 10 million other people.
Bingo -- that's 750 billion managed information objects in a single infrastructure, closing in on a trillion and far beyond the hairy edge of traditional filesystem technology. And that's just one simple, easy-to-imagine use case. Or, if you work in a good-sized corporate environment, how many discrete information objects are floating around?
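For anyone who wants to check the back-of-the-envelope math, here is a tiny Python sketch. The two inputs are simply the rough figures quoted above; the variable names are mine and purely illustrative.

```python
# Rough estimate using the figures from the text (not measured data).
objects_per_person = 75_000     # the author's estimate for one household
people = 10_000_000             # "10 million other people"

total_objects = objects_per_person * people
print(f"{total_objects:,} information objects")
print(f"{total_objects / 1e12:.2f} trillion")
```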
Second, there's the metadata problem.
Just like bar codes, metadata is incredibly useful stuff, and a traditional file system offers no real place to put it where it stays tightly bound to the object itself. Tags, revision history, format, security, physical location, relationships to other objects -- all of that has to be created and managed separately from the information object itself.
Third, there's the policies and services problem.
Information -- like money! -- is more useful if I can specify policies and services I want applied to it, and with a traditional filesystem, there's very limited metadata to build these things off of. Policies and services can range from the pedestrian "please back this up" to the more esoteric "please cache this on my iPhone". Similar examples exist in corporate settings, e.g. "please store this in a compliant fashion".
There are other, more esoteric limitations to traditional filesystem thinking, but I wanted to keep things approachable here.
Object-Based Information Stores
This isn't an official or sanctioned industry term, but I needed something to describe "not a filesystem".
Object-based information stores are different from filesystems in several important ways.
First, you use a token or other uniform identifier to get your information. File system paths imply location; tokens don't -- no such thing as a broken link or a moved file system. Not to mention, tokens can uniquely identify gazillions of information objects.
Second, they have the ability to associate all sorts of metadata with the object itself. As the information object goes, its metadata travels with it. A very useful property indeed.
Third, the ability to hang metadata off the object gives us the ability to create all sorts of useful policies and services around the information without having to put everything in some sort of database or repository.
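To make those three properties a bit more concrete, here is a minimal in-memory sketch in Python. Everything in it is hypothetical: the class and method names (`ObjectStore`, `put`, `get`, `select`) are invented for illustration, and deriving the token from a SHA-256 digest of the content is just one assumed way to get a location-free identifier. It is not meant to describe how any particular product works.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class InfoObject:
    """An information object: the bytes plus the metadata that travels with it."""
    data: bytes
    metadata: dict = field(default_factory=dict)


class ObjectStore:
    """Toy store with a flat token namespace: no directories, no paths."""

    def __init__(self):
        self._objects = {}  # token -> InfoObject

    def put(self, data: bytes, **metadata) -> str:
        # Assumption: the token is a SHA-256 digest of the content.
        token = hashlib.sha256(data).hexdigest()
        self._objects[token] = InfoObject(data, dict(metadata))
        return token

    def get(self, token: str) -> InfoObject:
        return self._objects[token]

    def select(self, **criteria) -> list:
        """Return tokens whose metadata matches every given key/value pair."""
        return [
            token
            for token, obj in self._objects.items()
            if all(obj.metadata.get(k) == v for k, v in criteria.items())
        ]


store = ObjectStore()
photo = store.put(b"beach photo bytes", kind="photo", backup=True)
memo = store.put(b"quarterly memo", kind="document", backup=False)

# A "please back this up" policy becomes a simple query over metadata.
to_back_up = store.select(backup=True)
print(to_back_up == [photo])
```

Note that nothing here says where the bytes live: the caller holds a token, and a policy is just a selection over the metadata attached to each object.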
An early example of this thinking was EMC Centera. All competitive sniping aside, most people would agree that it was very different from what came before it, and it has been reasonably successful in its primary use case: creating large archival stores of information objects in corporate settings.
A more recent example is EMC Atmos. So far, it's exceeded our expectations in providing object-based information stores with rich metadata, policies and services in its primary use case: creating network-based (e.g. cloud) information services.
Bridging The Gap
Of course, it's easy to superimpose a simpler presentation model (e.g. a traditional file system) on top of a richer object store. Sure, there's no easy way to expose the richness of the metadata (nor the associated policies and services) through a traditional file system, but it's comforting for all of us who are expecting to find a tree-structured hierarchy of directories and files.
Applications are the real issue, though. They're not in the habit of generating or using rich metadata around information objects, or they try to do it with their own private metadata stores. No one's come up with a useful way of teaching a legacy application new tricks without changing code.
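As a rough illustration of that layering, here is a small Python sketch of a path-style view sitting on top of a token-addressed store. The `PathView` class and its methods are hypothetical names made up for this example, and random UUIDs stand in for whatever tokens a real object store would issue.

```python
import uuid


class PathView:
    """Hypothetical file-system facade over a token-addressed object store."""

    def __init__(self):
        self._objects = {}  # token -> (data, metadata)
        self._paths = {}    # path -> token; the only place "location" exists

    def write(self, path: str, data: bytes, **metadata) -> str:
        token = uuid.uuid4().hex  # assumption: opaque random token
        self._objects[token] = (data, metadata)
        self._paths[path] = token
        return token

    def read(self, path: str) -> bytes:
        # The file-style call returns bytes only; metadata stays hidden.
        return self._objects[self._paths[path]][0]

    def rename(self, old: str, new: str) -> None:
        # Moving a "file" only rewrites the path map; the token is unchanged.
        self._paths[new] = self._paths.pop(old)

    def listdir(self, directory: str) -> list:
        prefix = directory.rstrip("/") + "/"
        names = {
            path[len(prefix):].split("/", 1)[0]
            for path in self._paths
            if path.startswith(prefix)
        }
        return sorted(names)


view = PathView()
token = view.write("/home/notes/todo.txt", b"buy milk", tags=["personal"])
view.rename("/home/notes/todo.txt", "/home/archive/todo.txt")

print(view.read("/home/archive/todo.txt"))
print(view.listdir("/home"))
```

The tree is nothing more than a lookup table in front of the store. That is why it is easy to build, and also why the metadata has no natural way to show through a plain `read`.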
There are a few interesting approaches where certain metadata can be added externally by looking at context, but these are generally unsatisfying in the broader case.
One interesting aspect of certain layered EMC products (such as EMC Documentum and others) is their ability to provide a richer abstraction to target "up the stack" while getting more comfortable with the newer approaches, e.g. being able to hide the differences between legacy information stores and newer, object-based ones, as well as expose services that application developers can take advantage of directly.
Where Does This Leave Us?
Once again, I feel like a UNIX programmer staring at a punch card machine with a ton of homework to do.
Sure, I can invest a bunch of effort bridging between the old and new paradigm, but wouldn't it be far easier starting with the new stuff?