Media Garden Pipeline

From the beginning of development we knew we needed to support a concept of "Media Sources". Media that you want to display on your website can come from lots of places: it can be embedded from another site (e.g. YouTube), you can upload it yourself, or it could be created in situ, for instance a thumbnail extracted from a video, or even different-sized thumbnail versions of an existing image. We wanted a system that allowed you to treat all media items in a consistent manner, irrespective of where they actually came from.
The various concerns involved in pulling content from these different sources and translating it into media items in your content database meant that, during the course of development, requirements evolved from a simple IMediaSource into a series of filters, each defined by an interface, which combine to form a complete Pipeline. The pipeline represents the entire flow and all the necessary entry points that media might have.
Because this number of components potentially made for a very complex UI, it became apparent that further steps were needed to abstract all this away from the user. The ideal interface we wanted was a simple text box into which you could input anything that could be translated into a media item.
It should be noted that the pipeline has ended up so extensible that it could be used for many purposes beyond just media, for instance pulling in text content from RSS feeds. We haven't really explored these possibilities yet, and for now the focus is purely on media.
Here I'll describe the various parts of the pipeline, what they're for, and why you might want to extend them.

Media Input

These represent the entry points for media in the User Interface.
Two implementations are provided by default: TextQueryInput and FileUploadInput. We do not anticipate anyone needing to add more, and we do not recommend it! File Upload is clearly necessary for users to upload their own media, and anything else can be provided for via the text query. The text query will be interpreted by the next component I describe, and that is the extension point you should use for any media queries not covered by our default implementations (although even that will be a rare requirement).
The File Upload input itself maps to a simple text query (containing the uploaded file location) and is also processed by the next component.
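To illustrate how both inputs can collapse into a single text query, here is a minimal Python sketch. It is not the module's actual code: the class names mirror the ones mentioned above, but the `to_query` method, the `saved_path` field, and the example paths are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Protocol


class MediaInput(Protocol):
    """Hypothetical contract: every input reduces to one text query."""

    def to_query(self) -> str: ...


@dataclass
class TextQueryInput:
    text: str  # whatever the user typed into the text box

    def to_query(self) -> str:
        return self.text.strip()


@dataclass
class FileUploadInput:
    saved_path: str  # assumed: where the uploaded file was stored

    def to_query(self) -> str:
        # An upload becomes an ordinary query holding the file location,
        # so the next stage needs no special case for uploads.
        return self.saved_path


queries = [
    TextQueryInput("  http://example.com/clip.mp4 ").to_query(),
    FileUploadInput("Media/uploads/photo.jpg").to_query(),
]
```

Because both inputs end in a plain string, everything downstream only has to understand text queries.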

Media Query Filter

These filters take the text query and parse it to discover a way to access media.
There are two default implementations: HttpQueryFilter for http(s) URLs and FileSystemQueryFilter for local media paths (which should be in the form path/to/media/folder or path/to/media/file.jpg).
For adding your own filters it's recommended you use a protocol: pattern to avoid any conflict with existing or third-party queries. For instance, if you wanted to implement a Google image search, you might want to claim a dedicated prefix for those queries.
The query filter should produce a simple location descriptor which will be handled by the next stage.
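The dispatch described in this section might look roughly like the Python sketch below. The Python classes are stand-ins that borrow the names of the two default filters; the `LocationDescriptor` shape, the `parse` method, the `resolve` helper, and especially the `gimg:` prefix are invented here for illustration and are not taken from the module.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LocationDescriptor:
    kind: str   # e.g. "http", "file", or a custom kind
    value: str  # URL, path, or search terms


class HttpQueryFilter:
    def parse(self, query: str) -> Optional[LocationDescriptor]:
        if query.lower().startswith(("http://", "https://")):
            return LocationDescriptor("http", query)
        return None


class ImageSearchQueryFilter:
    PREFIX = "gimg:"  # hypothetical prefix, chosen only for this example

    def parse(self, query: str) -> Optional[LocationDescriptor]:
        if query.startswith(self.PREFIX):
            terms = query[len(self.PREFIX):].strip()
            return LocationDescriptor("image-search", terms)
        return None


class FileSystemQueryFilter:
    def parse(self, query: str) -> Optional[LocationDescriptor]:
        # Anything without a "protocol:" part is treated as a media path.
        if query and ":" not in query:
            return LocationDescriptor("file", query)
        return None


FILTERS = [HttpQueryFilter(), ImageSearchQueryFilter(), FileSystemQueryFilter()]


def resolve(query: str, filters: List = FILTERS) -> Optional[LocationDescriptor]:
    """Return the descriptor from the first filter that recognises the query."""
    for query_filter in filters:
        descriptor = query_filter.parse(query)
        if descriptor is not None:
            return descriptor
    return None
```

Reserving a prefix means a custom filter can never be mistaken for a URL or a local path, which is the conflict the recommendation above is meant to avoid.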

Media Location Filter

This takes the output of the query filter and works out how to probe the described location for further information.
For an HTTP location this would involve downloading the HTTP headers; for a local file we inspect the file on disk to find out more about it.
It will also provide an accessor so that components further along the pipeline can read the actual file data for additional probing.
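As a rough sketch of the local-file case, the Python below probes a path for basic header information and exposes an accessor for the file data. The `MediaHeaders` type, its fields, and `probe_local_file` are assumptions for illustration; the content type is guessed from the file extension only, which is a simplification.

```python
import mimetypes
import os
from dataclasses import dataclass
from typing import BinaryIO, Callable, Optional


@dataclass
class MediaHeaders:
    content_type: Optional[str]          # None when the type cannot be guessed
    length: int                          # size in bytes
    open_stream: Callable[[], BinaryIO]  # accessor used by later stages


def probe_local_file(path: str) -> MediaHeaders:
    """Collect header-like information about a file on disk."""
    content_type, _encoding = mimetypes.guess_type(path)
    return MediaHeaders(
        content_type=content_type,
        length=os.path.getsize(path),
        # The file is only opened when a later stage asks for the data.
        open_stream=lambda: open(path, "rb"),
    )
```

Handing over a callable rather than an open stream keeps probing cheap: stages that only need the headers never touch the file contents.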

Media Header Filter

Once headers have been generated from a location, the header filters perform the job of inspecting those headers to start determining how to handle the specific content type.
If the headers describe a known media format then we have a media item that we can offer as Media Source Data.
If the headers describe an HTML page or feed URL, we'll need to do some further parsing to discover other Media Source Data contained within that document.
An interesting possibility is raised here: additional URLs we discover might be considered as new Media Locations, and we could feed these back into an earlier stage of the pipeline. Clearly this creates a danger of recursion and for now we want to avoid this scenario, but it's something I'm looking at to see if it could be useful.
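The two cases above could be sketched in Python as follows. The function name, its parameters, and the choice to look only at `img` tags are assumptions for illustration; discovered URLs are simply returned and not fed back into the pipeline, in line with the wish to avoid recursion for now.

```python
from html.parser import HTMLParser
from typing import List


class _ImgCollector(HTMLParser):
    """Collects the src attribute of every <img> tag in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def sources_from_headers(content_type: str, url: str, body: str = "") -> List[str]:
    """Return the media source URLs implied by a location's headers."""
    base_type = content_type.split(";")[0].strip().lower()
    if base_type.split("/")[0] in ("image", "video", "audio"):
        # A known media format: the location itself is the media source.
        return [url]
    if base_type == "text/html":
        # A page: parse it to discover media embedded in the document.
        collector = _ImgCollector()
        collector.feed(body)
        return collector.sources
    return []  # unknown content: nothing to offer
```

A feed would need its own branch with an XML parser; it is left out here to keep the sketch short.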

Media Source Filter

The header filters will describe a number of actual media sources. In normal operation those sources will be persisted to the database and displayed as a list. The user can then pick which sources they want to actually use (via an Import operation). They can always click "Import All" if they just want everything. At this stage there's still no new content in the database, just metadata about potential content, and now we want to convert that data into actual usable media. There's an exception to this, as there are times when you want to quickly input some media and display it straight away (for instance if you're working in the Media Picker) so sometimes there's a shortcut to this stage.
So the final job is for Media Source Filters to perform the task of translating those media sources into actual content items. These filters will create the content (using ContentManager.New) and add to it any data they can understand from the media source metadata.
Filters might also want to generate background tasks, such as queuing videos for thumbnail extraction or recoding.
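A source filter along these lines might be sketched in Python as below. Everything here is a stand-in: `FakeContentManager.new` only imitates the role of ContentManager.New, and the `VideoSourceFilter` class, the "Video" content type, the field names, and the task string format are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ContentItem:
    content_type: str
    fields: Dict[str, str] = field(default_factory=dict)


class FakeContentManager:
    """Stand-in for the CMS content manager; only new() is modelled."""

    def new(self, content_type: str) -> ContentItem:
        return ContentItem(content_type)


@dataclass
class VideoSourceFilter:
    content_manager: FakeContentManager
    task_queue: List[str] = field(default_factory=list)

    def can_handle(self, source: Dict[str, str]) -> bool:
        return source.get("content_type", "").startswith("video/")

    def import_source(self, source: Dict[str, str]) -> ContentItem:
        # Create the content item and copy over the metadata we understand.
        item = self.content_manager.new("Video")
        item.fields["Url"] = source["url"]
        item.fields["MimeType"] = source["content_type"]
        # Slow work is queued as a background task instead of done inline.
        self.task_queue.append(f"extract-thumbnail:{source['url']}")
        return item
```

Queuing the thumbnail job keeps the import itself fast, which matters when the user clicks "Import All" on a long list of sources.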

Further Information

This describes the complete process but there are some implementation details missing and things certainly aren't finalized at this stage.
TODO: Create separate documentation page for each interface

Last edited Apr 21, 2011 at 10:57 AM by randompete, version 2

