Extracting Metadata from Documents

Spotlight can only find files for which it has metadata. While some metadata (modification dates, display name, path name) is easy to gather for a given file, most of the interesting data is embedded inside the file. To gather this embedded information, you must provide a Spotlight importer.

What Is a Spotlight Importer?

A Spotlight importer is a small plug-in bundle that you create to extract information from files created by your application. Spotlight importers are used by the Spotlight engine to gather information about new and existing files.

Note: It is imperative that developers provide metadata importers for their own custom document formats. Spotlight metadata importers improve the user experience greatly by making sure your documents can be found during searches.

Spotlight importers parse your document format for relevant information and assign that information to the appropriate metadata keys. Keys help index the content in the data store and facilitate searches. Xcode includes a project template that provides the required CFPlugin support, as well as templates for the required schema file.

Spotlight importers typically reside within your application’s bundle in the subdirectory MyApp.app/Contents/Library/Spotlight. They can also be installed in ~/Library/Spotlight, /Library/Spotlight, and Framework/PlugIn. System-provided importers reside in /System/Library/Spotlight.
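As an illustration of the locations above, the following is a minimal sketch that lists importer bundles found in those directories. It assumes importer bundles use the .mdimporter extension; the function names standard_importer_dirs and find_importers are hypothetical helpers, not part of any Apple API, and the Framework/PlugIn location is left out of the sketch.

```python
from pathlib import Path

# Assumption: Spotlight importer bundles carry the ".mdimporter" extension.
IMPORTER_SUFFIX = ".mdimporter"


def standard_importer_dirs(app_bundle=None, home=None):
    """Return the install locations named in the text, most specific first."""
    home = Path(home) if home is not None else Path.home()
    dirs = []
    if app_bundle is not None:
        # Importers shipped inside the application bundle.
        dirs.append(Path(app_bundle) / "Contents" / "Library" / "Spotlight")
    dirs.append(home / "Library" / "Spotlight")       # per-user importers
    dirs.append(Path("/Library/Spotlight"))           # machine-wide importers
    dirs.append(Path("/System/Library/Spotlight"))    # system-provided importers
    return dirs


def find_importers(search_dirs):
    """Return the importer bundles present in the given directories."""
    found = []
    for directory in search_dirs:
        directory = Path(directory)
        if directory.is_dir():  # skip locations that do not exist on this machine
            found.extend(sorted(p for p in directory.iterdir()
                                if p.suffix == IMPORTER_SUFFIX))
    return found


for bundle in find_importers(standard_importer_dirs()):
    print(bundle)
```

On a machine without any of these directories the loop simply prints nothing.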

Associating a Spotlight Importer With Documents

Spotlight importers are associated with document types by specifying the uniform type identifiers (UTIs) from which they extract data. For more information on UTIs, see Uniform Type Identifiers Overview.

The supported UTIs are specified in the importer’s Info.plist file, contained within the plug-in bundle. An importer can support a single document type or multiple document types. The function in the importer that is called for each file is passed the UTI of the file and can adjust how it extracts data accordingly.
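To make the Info.plist declaration concrete, the following sketch generates the document-types fragment of an importer's Info.plist. It assumes the usual layout, in which a CFBundleDocumentTypes entry with the role MDImporter lists the supported UTIs under LSItemContentTypes. The two UTIs and the helper importer_document_types are hypothetical, and the CFPlugin keys supplied by the Xcode template are omitted.

```python
import plistlib

# Hypothetical UTIs for illustration; use the types your application declares.
SUPPORTED_UTIS = ["com.example.myapp.document", "com.example.myapp.template"]


def importer_document_types(utis):
    """Build the CFBundleDocumentTypes value that lists the supported UTIs."""
    return [{
        "CFBundleTypeRole": "MDImporter",   # marks the entry as a Spotlight importer
        "LSItemContentTypes": list(utis),   # one importer may list several types
    }]


# Only the document-types fragment is shown; a real Info.plist has more keys.
info_fragment = {"CFBundleDocumentTypes": importer_document_types(SUPPORTED_UTIS)}
plist_xml = plistlib.dumps(info_fragment).decode("utf-8")
print(plist_xml)
```

Listing both types in one entry is what lets a single importer serve multiple document formats.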

Additional Guidelines

Avoid the use of external files to store metadata content. All critical metadata should be in the same file as the data. The system store of metadata should be considered volatile.

A Spotlight importer must run entirely without interaction. You should not attempt to present any user interface or expect that the window server is running.

You should not expect your application to be running when your metadata importer is called. Importers can be called at any time to extract metadata from a file. Your metadata importer should be able to extract the information without any assistance from the application that created the file.
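The shape of such an importer can be sketched as follows. This is a Python model of the logic only, since a real importer is a compiled plug-in: the entry point mirrors the arguments an importer receives (an attribute dictionary to fill, the file's UTI, and its path), reads nothing but the file itself, and never prompts the user. The two file formats and their UTIs are hypothetical; the attribute names follow the standard kMDItem keys.

```python
import os
import tempfile


def import_note(path, attributes):
    """Hypothetical 'note' format: first line is the title, the rest is body text."""
    with open(path, encoding="utf-8", errors="replace") as f:
        lines = f.read().splitlines()
    attributes["kMDItemTitle"] = lines[0].strip() if lines else ""
    attributes["kMDItemTextContent"] = "\n".join(lines[1:])


def import_card(path, attributes):
    """Hypothetical 'card' format: one 'Author: name' line per author."""
    authors = []
    with open(path, encoding="utf-8", errors="replace") as f:
        for line in f:
            key, sep, value = line.partition(":")
            if sep and key.strip().lower() == "author":
                authors.append(value.strip())
    attributes["kMDItemAuthors"] = authors


# One importer, several document types: choose the extractor by UTI.
EXTRACTORS = {
    "com.example.myapp.note": import_note,
    "com.example.myapp.card": import_card,
}


def get_metadata_for_file(attributes, content_type_uti, path_to_file):
    """Fill `attributes` from the file alone; return False if nothing was imported."""
    extractor = EXTRACTORS.get(content_type_uti)
    if extractor is None:
        return False          # unsupported type
    try:
        extractor(path_to_file, attributes)
    except OSError:
        return False          # unreadable file: fail quietly, never show UI
    return True


# Usage: index a temporary "note" file with no help from the authoring application.
with tempfile.NamedTemporaryFile("w", suffix=".note", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("Quarterly Report\nRevenue grew in the third quarter.\n")
demo_attributes = {}
demo_ok = get_metadata_for_file(demo_attributes, "com.example.myapp.note", tmp.name)
os.remove(tmp.name)
print(demo_ok, demo_attributes["kMDItemTitle"])
```

Returning False instead of raising keeps a failed import silent, which matches the requirement that the importer run without any interaction.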

It is important to let users know what metadata you include in your file formats and what information you extract for searching. For example, users may not want their user ID or other personal information embedded in files they distribute externally. Consider giving the user an option to save a copy of the file without metadata for external distribution, or disable the extraction of metadata that has security implications.
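One way to honor such a user preference is to filter the extracted attributes before they are saved or indexed. The sketch below assumes that author names and email addresses are the sensitive items; which keys belong in that set is a product decision, and the helper filter_attributes is hypothetical.

```python
# Assumption: these two attributes are treated as personal information.
SENSITIVE_KEYS = frozenset({"kMDItemAuthors", "kMDItemEmailAddresses"})


def filter_attributes(attributes, include_personal_info):
    """Return a copy of `attributes`, dropping sensitive keys unless allowed."""
    if include_personal_info:
        return dict(attributes)
    return {key: value for key, value in attributes.items()
            if key not in SENSITIVE_KEYS}


extracted = {
    "kMDItemTitle": "Quarterly Report",
    "kMDItemAuthors": ["A. Author"],
    "kMDItemEmailAddresses": ["author@example.com"],
}
# The user chose to distribute a copy without personal metadata.
print(filter_attributes(extracted, include_personal_info=False))
```

The same filter could back either option mentioned above: saving a stripped copy for external distribution, or disabling extraction of those keys.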




Last updated: 2009-10-11
