Handling Files With CoreData

March 12, 2008. Filed under cocoa, objc, coredata.

Recently I ran into a pretty common problem while developing a Cocoa program: I was using CoreData for data persistence and management, but I needed to store files as well. That prompted the question: how can you efficiently store files using CoreData? (The zip containing the source we'll develop is available here.)

My initial thought was to open the file with an NSFileWrapper, extract its content as an NSData, save the filename as an NSString and the icon as an NSImage, and push all of them into the CoreData store as binary fields. However, there are a few reasons why literally storing the files in your CoreData store is a poor choice.

  1. First, the overall size of your store might be 50k, or in a large dataset it might be 500k, but consider the impact of storing a single image inside that dataset. Your 50k file might become 450k. Your 500k file might become 1.1 megabytes. Now consider adding fifty files, and your store is now a 25 (or 100) megabyte file, but 99% of that is bulk from the files. That's a lot of irrelevant bytes that CoreData will have to parse.

  2. Second, depending on which data store you chose, you may be loading the entire store file into memory, and you really don't want every file your application knows about to be unceremoniously thrown into memory each time it runs. Imagine iTunes trying to stuff your thirty gigabyte music collection into memory each time you launch it. It might be slightly problematic.

  3. Third, this is a bad solution because you are going to duplicate a lot of NSFileWrapper functionality, but in a less considered implementation.

So we go back to the drawing board. I knew I didn't really need to store the entire file in the store. I only needed to store an NSString containing the path to the file, and all of a sudden the problems with the first solution fade away.

So, imagine loading your program up and wanting to display your files for the user. You'd retrieve an NSArray filled with NSManagedObjects that would each contain one crucial key: @"path". Then you'd have to map a conversion onto that NSArray of NSManagedObjects and convert it into an NSArray filled with NSFileWrapper instances.

In Objective-C the code is going to look something like this:

NSString *aFilePath = [someFiles objectAtIndex:0];
NSWorkspace *workspace = [NSWorkspace sharedWorkspace];
[workspace openFile:aFilePath];

Transpose that code to a slightly more functional implementation in Python and we see that what we are doing is quite simple:
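
A minimal sketch of that functional version, assuming plain dictionaries stand in for the NSManagedObjects and a hypothetical FileWrapper class stands in for NSFileWrapper (neither is real Cocoa API):

```python
class FileWrapper:
    """Hypothetical stand-in for NSFileWrapper: it only remembers its path."""

    def __init__(self, path):
        self.path = path


# Plain dicts stand in for NSManagedObjects that expose a "path" key.
managed_objects = [{"path": "/tmp/a.txt"}, {"path": "/tmp/b.png"}]

# Map a conversion over the stored records: one wrapper per stored path.
file_wrappers = list(map(lambda obj: FileWrapper(obj["path"]), managed_objects))
```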


Idiomatic Python is, perhaps, a bit more legible:
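
Here is a sketch of the same conversion as a list comprehension, again assuming dictionaries in place of NSManagedObjects and a hypothetical FileWrapper class in place of NSFileWrapper:

```python
class FileWrapper:
    """Hypothetical stand-in for NSFileWrapper: it only remembers its path."""

    def __init__(self, path):
        self.path = path


# Plain dicts stand in for NSManagedObjects that expose a "path" key.
managed_objects = [{"path": "/tmp/a.txt"}, {"path": "/tmp/b.png"}]

# One wrapper per stored record, written as a list comprehension.
file_wrappers = [FileWrapper(obj["path"]) for obj in managed_objects]
```

Either way, the result is a second list that has to be kept in step with the first one.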


All in all, it's pretty easy to make that conversion. But things do get a bit more complex as we start adding functionality to our program. Each time we add a new file, we'll have to add information about that file both to the CoreData store and to the list that is keeping track of our current file wrappers. Maintaining both arrays isn't immensely difficult, but it does introduce a lot of room for mistakes.

Fortunately, with a little work we can merge the two arrays into one while retaining functionality, and we'll even get lazy instantiation of the NSFileWrappers [1].

How We'll Do It

Before we implement the code, let's take a few moments to outline how our solution will work.

  1. First we are going to subclass NSManagedObject as LEManagedFile. This will contain one extra field, which will hold an NSFileWrapper.
  2. Then we are going to add a few messages that LEManagedFile will respond to. Those will correspond to a subset of the functionality provided by NSFileWrapper. In particular we're interested in messages for accessing the icon, filename and data from the NSFileWrapper.
  3. Next, we'll create a CoreData model named "OurFile", which has only one attribute, an NSString named "path".
  4. Finally, we tell CoreData that the "OurFile" model isn't simply an NSManagedObject, but is actually an LEManagedFile.

And that's all there is to it. If you've been playing with Cocoa for a few months then that should be enough to get you on the train heading towards a workable solution. However, we're about to take a more in-depth look at these steps, so feel free to stick around.

Setting Up Our Project

First, open up Xcode. Then head up to the File menu and open up a new project (Apple-Shift-N). Under the Applications header look for the Core Data Application. Select that and then click Next. I named mine CocoaFilesInCoreData and stored it at ~/Programming/ObjC/CocoaFilesInCoreData, but it really isn't important where you store it or what you name it.

Next we're going to create the files for LEManagedFile. Go to the File menu and select New File (Apple-N). Name it "LEManagedFile", and then click Finish.

Subclassing NSManagedObject

Now open up LEManagedFile.h in Xcode, and let's get to work. First we want to change it to subclass NSManagedObject instead of just NSObject.


Next we want to add one piece of data to LEManagedFile, an NSFileWrapper.


And then we need to specify the messages we will be using:


This means all in all our LEManagedFile.h file is going to look like this:


Then we move on to the actual implementation.

Implementing LEManagedFile

We know we'll have to implement the messages we declared in the LEManagedFile.h file, but there are a few other messages we'll have to handle. Those additional messages are:


dealloc is a staple of writing Objective-C code (it tells memory management how to recycle the object), and we'll be overriding valueForKey in order to create a unified way of interacting with both the NSManagedObject and NSFileWrapper functionality in LEManagedFile.

All in all, the stub version of LEManagedFile.m is going to look like this:


The first message whose response we're going to implement is fileWrapper.
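
To illustrate the lazy-initialization idea behind that accessor, here is a minimal Python sketch. The FileWrapper and ManagedFile classes are hypothetical stand-ins for NSFileWrapper and LEManagedFile, not Cocoa code, and the counter exists only to show when a wrapper is actually built:

```python
class FileWrapper:
    """Hypothetical stand-in for NSFileWrapper; counts how many get built."""

    built = 0

    def __init__(self, path):
        FileWrapper.built += 1
        self.path = path


class ManagedFile:
    """Hypothetical stand-in for LEManagedFile."""

    def __init__(self, path):
        self.path = path
        self._file_wrapper = None  # nothing is built up front

    @property
    def file_wrapper(self):
        # Build the wrapper on first access only, then reuse it.
        if self._file_wrapper is None:
            self._file_wrapper = FileWrapper(self.path)
        return self._file_wrapper


managed = ManagedFile("/tmp/report.pdf")
built_before_access = FileWrapper.built  # still 0: no wrapper exists yet
first = managed.file_wrapper             # the wrapper is created here
second = managed.file_wrapper            # the cached wrapper is returned
```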


It's a pretty simple lazy initialization accessor. Next, let's implement icon and filename.


Taking advantage of the fileWrapper accessor, these are simple as well. The path and setPath accessor/mutator pair is a bit more complex though, so let's examine it next.


We are relying on CoreData to store and maintain the path value, so we can't simply write a standard accessor and be done with it. Instead we write a wrapper around the NSManagedObject functionality that LEManagedFile is extending.

The mutator works along the same lines:


setPath is also a wrapper around the NSManagedObject functionality, but it also needs to invalidate the fileWrapper, if it exists. Normally it is best to access values via the accessor, but here we can't use the fileWrapper accessor, because the accessor will initialize the fileWrapper value if it hasn't already been initialized. Thus, if fileWrapper wasn't already initialized, we would be initializing it just to release it.
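
The accessor/mutator pair and the invalidation rule can be sketched in Python as follows. This is an illustration under stated assumptions rather than the Cocoa implementation: the `_stored` dictionary plays the role of the values CoreData maintains, and FileWrapper and ManagedFile are hypothetical stand-ins for NSFileWrapper and LEManagedFile.

```python
class FileWrapper:
    """Hypothetical stand-in for NSFileWrapper; counts how many get built."""

    built = 0

    def __init__(self, path):
        FileWrapper.built += 1
        self.path = path


class ManagedFile:
    """Hypothetical stand-in for LEManagedFile."""

    def __init__(self, path):
        # _stored plays the role of the values CoreData keeps for us.
        self._stored = {"path": path}
        self._file_wrapper = None

    def path(self):
        # The accessor is only a thin wrapper around the stored value.
        return self._stored["path"]

    def set_path(self, new_path):
        self._stored["path"] = new_path
        # Reset the raw attribute directly. Going through file_wrapper()
        # would build a wrapper just so we could throw it away.
        self._file_wrapper = None

    def file_wrapper(self):
        if self._file_wrapper is None:
            self._file_wrapper = FileWrapper(self.path())
        return self._file_wrapper


managed = ManagedFile("/tmp/old.txt")
managed.set_path("/tmp/new.txt")
built_after_set = FileWrapper.built  # 0: changing the path built nothing
wrapper = managed.file_wrapper()     # built now, from the new path
```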

Now a little rest break to implement dealloc before tackling initFromPath:withContext and valueForKey.


Then, let's take a look at valueForKey:


valueForKey intercepts incoming key lookups and checks whether the key is icon or filename. If it is, the request goes to the accessors we already wrote (which query the fileWrapper); otherwise it is simply passed on to the valueForKey implemented by NSManagedObject. This is the little trick that allows us to maintain KVC compliance (and thus keep Cocoa bindings working) and maintain the ruse that the LEManagedFile is actually stored in CoreData.
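
The dispatch itself is small. Here is a Python sketch of the idea, assuming a hypothetical ManagedObject base class whose value_for_key reads stored values, and with filename and icon returning simple placeholder strings instead of querying a real file wrapper:

```python
import os


class ManagedObject:
    """Hypothetical stand-in for NSManagedObject: looks up stored values."""

    def __init__(self, **stored):
        self._stored = dict(stored)

    def value_for_key(self, key):
        return self._stored[key]


class ManagedFile(ManagedObject):
    """Hypothetical stand-in for LEManagedFile."""

    # Keys answered by the file-backed accessors instead of the store.
    WRAPPER_KEYS = ("icon", "filename")

    def filename(self):
        return os.path.basename(self._stored["path"])

    def icon(self):
        # Placeholder value; a real implementation would return an image.
        return "icon for " + self.filename()

    def value_for_key(self, key):
        if key in self.WRAPPER_KEYS:
            # Route the lookup to the matching accessor method.
            return getattr(self, key)()
        # Everything else falls through to the stored values.
        return super().value_for_key(key)


managed = ManagedFile(path="/tmp/photo.png")
stored_path = managed.value_for_key("path")
derived_name = managed.value_for_key("filename")
derived_icon = managed.value_for_key("icon")
```

Callers use one lookup method for everything and never need to know which keys are stored and which are derived from the file.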

The last thing we need to do is implement initFromPath:withContext. Step one is to go to the top of LEManagedFile.m and import the application delegate for our project:


Now let's consider what initFromPath:withContext needs to do. First, it will copy the file at the specified path into the application support folder (we won't handle file collisions in this example, but you probably should; look into NSFileManager). Then it needs to create an instance of the "OurFile" model in the application's CoreData context.
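
Those two steps can be sketched in Python like this. It is an illustration only: the Context class is a hypothetical stand-in for NSManagedObjectContext, a dictionary stands in for the new "OurFile" instance, and a temporary directory plays the role of the application support folder. Like the example it mirrors, it does not handle filename collisions.

```python
import os
import shutil
import tempfile


class Context:
    """Hypothetical stand-in for NSManagedObjectContext."""

    def __init__(self):
        self.pending = []  # inserted records that have not been saved yet

    def insert(self, record):
        self.pending.append(record)


def init_from_path(source_path, context, support_dir):
    # Step 1: copy the file into the support folder (no collision handling).
    os.makedirs(support_dir, exist_ok=True)
    destination = os.path.join(support_dir, os.path.basename(source_path))
    shutil.copy(source_path, destination)
    # Step 2: create a record that stores only the path to the copy.
    record = {"entity": "OurFile", "path": destination}
    context.insert(record)
    return record


with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "notes.txt")
    with open(source, "w") as handle:
        handle.write("hello")
    support = os.path.join(workdir, "Application Support")
    context = Context()
    record = init_from_path(source, context, support)
    copy_exists = os.path.exists(record["path"])
```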


Keep in mind that initFromPath:withContext does not send processPendingChanges or save: to aContext, so this new LEManagedFile won't be written to the store until the context is saved.

Anyhow, we're finally done implementing LEManagedFile.m, and it should look like:


Now it's time to move on to setting up CoreData to take advantage of our freshly minted code.

Creating the "OurFile" CoreData model

In the project window, double click on the CocoaFilesInCoreData_DataModel.xcdatamodel file. You'll see the familiar CoreData model editing GUI. Add a new entity, and name it "OurFile".

Image of the entity in CoreData management GUI.

Instead of having its class be NSManagedObject, set it to LEManagedFile.

Image of changing the class in CoreData management GUI.

And add one attribute, named "path", which stores a string.

Image of an added attribute in CoreData management GUI.

And that's all there is to it. We've hooked our LEManagedFile class up to the "OurFile" model in CoreData, and now it'll let us mimic storing files in CoreData while only needing to actually store the path to the files.

Ending Thoughts

You can grab a zipped copy of the project here. I set up the Cocoa bindings in the MainMenu.nib as a rather large hint about how to implement a program using this approach, but you'll need to use Interface Builder to create the files for the OurFileController class and implement (at minimum) the addFile: method before the example will actually work. Essentially what the addFile: method will need to do is use an NSOpenPanel along with the initFromPath:withContext method. You can get a copy of the context from the application delegate, via [[[NSApplication sharedApplication] delegate] managedObjectContext]. (Or you can just do [[NSApp delegate] managedObjectContext], which is admittedly a bit more concise.)

All in all, I think this is a fairly pleasing solution to the problem of how to store files using CoreData. It takes some time to implement, but it is both efficient and flexible. I'd be curious to know if anyone has a cleaner way to achieve this result. As always, let me know about any problems with the code, and I'll do my best to help out.

  1. Meaning that an NSFileWrapper for a particular path/file will only be instantiated when it is needed, and not before. This is an important technique for minimizing memory usage and improving overall efficiency.