Apache Cocoon


Apache Cocoon, usually just called Cocoon, is a web application framework built around the concepts of pipeline, separation of concerns and component-based web development. The framework focuses on XML and XSLT publishing and is built using the Java programming language. The flexibility afforded by relying heavily on XML allows rapid content publishing in a variety of formats including HTML, PDF, and WML. The content management systems Apache Lenya and Daisy have been created on top of the framework. Cocoon is also commonly used as a data warehousing ETL tool or as middleware for transporting data between systems.

Sitemap

The sitemap is at the core of Cocoon. It's here that the web site developer configures the different Cocoon components, and defines the client–server interactions in what Cocoon refers to as the Pipelines.

Components

The components within Cocoon are grouped by function.

Matchers

Matchers are used to match user requests such as URLs or cookies against wildcard or regular expression patterns. Each user request is tested against matchers in the sitemap until a match is made. It is within a matcher that the response to a particular request is specified.

Generators

Generators create a stream of data for further processing. This stream can be generated from an existing XML document or there are generators that can create XML from scratch to represent something on the server, such as a directory structure or image data.

XSP

One type of generator is an XML Server Page, an XML document containing tag-based directives that specify how to generate dynamic content at request time. Upon Cocoon processing, these directives are replaced by generated content so that the resulting, augmented XML document can be subject to further processing. XSPs are transformed into Cocoon producers, typically as Java classes, though any scripting language for which a Java-based processor exists could also be used.
Directives can be either built-in or user-defined processing tags, both of which are defined in logicsheets. Tags are defined using XSLT templates that describe how the tags are transformed into other XML nodes or into procedural code such as Java. The tags are used to embed procedural logic, substitute expressions, retrieve information from the web server environment, and other operations.
Note that XSP is deprecated in recent releases of Cocoon.

Transformers

Transformers take a stream of data and change it in some way. The most common transformations are performed with XSLT to change one xml format into another. But there are also transformers that take other forms of data.

Serializers

A serializer turns an XML event stream into a sequence of bytes that can be returned to the client. There are serializers that allow you to send the data in many different formats including HTML, XHTML, PDF, RTF, SVG, WML and plain text, for example.

Selectors

Selectors offer the same capabilities as a switch statement. They are able to select particular elements of a request and choose the correct pipeline part to use.

Views

Views are mainly used for testing. A view is an exit point in a pipeline. You can put out the XML-Stream which is produced till this point. So you can see if the application is working right.

Readers

Publish content without parsing it. Used for images and such.

Actions

Actions are Java classes that execute some business logic or manage new content production.

The Pipeline

A pipeline is used to specify how the different Cocoon components interact with a given request to produce a response. A typical pipeline consists of a generator, followed by zero or more transformers, and finally a serializer.