Developers with XML Content Skills in Demand

An explosion in demand for developers who know how to build, manage and update dynamic XML-based content is coming -- starting this year, at least according to a study by ZapThink. See why developers who know how to use XML to deliver smart and inexpensive content solutions for end users will find new job opportunities.


In 2003, enterprise developers and end users are spending more time trying to find and format content than they spend creating it, according to a study released earlier this month by ZapThink, an XML and web services consultancy.

Proper use of XML and web services tools and techniques will emerge as a top enterprise strategy for reversing this trend, ZapThink concluded in its report entitled "XML in the Content Lifecycle". One key reason, ZapThink said, is this: "The primary challenge in the enterprise for producers of content -- information that is intended for human consumption -- is content reuse: the ability to integrate content from disparate sources."

"Content processes are currently where distributed computing applications were in the mid-1980s," said Ronald Schmelzer, senior analyst at ZapThink. "Content today is frequently out of context, hard to reuse, constantly changing with multiple versions in multiple languages, and insecure. Content solutions that leverage XML promise to improve the economics of working with content considerably."

Among ZapThink's key findings:

  1. Producers of content in the enterprise spend over 60% of their time locating, formatting and structuring content -- and less than 40% actually creating it.

  2. This imbalance between production and management of content will fuel the market for XML content lifecycle solutions by almost 10x. For technology alone, not counting personnel and training, spending will be $1.8 billion in 2003, exploding to more than $11.6 billion by 2008.

  3. By 2008, about 60% of all content lifecycle products will be XML-enabled.

  4. To date, efforts to improve content processes have been slowed by inability to extract and manipulate content from disparate data sources.

  5. Among the solutions to content confusion will be the evolution of an XML-powered Service-Oriented Architecture for content, which will allow greater and more cost-efficient reuse by users and richer aggregation from disparate systems.

In short, ZapThink's report examines how current ad-hoc content management processes are moving toward models where content is componentized using XML, and then turned into "services" that are discoverable on the network, Schmelzer added.

Content Problems in Context

The basic rule of thumb for content, ZapThink said, is that companies invest considerable resources in creating strong content, but take very little time to create a "content lifecycle" road map that would spell out how, where, by whom and in what format that content would be used once created. XML and web services capabilities, now in place or quickly emerging, will enable a wide array of affordable and standards-based strategies for the entire content lifecycle -- creation, management, publishing, syndication and protection/security.

ZapThink's report makes an important distinction between "content" (for humans) and "data" for machines, stating that "What differentiates human-oriented content from machine-oriented data is that people must create, manage, publish and distribute content so that it can be represented in a variety of different ways, all the while maintaining the same overall meaning."

For example, "content" represents information such as news, facts, fiction, charts, illustrations, photos, opinions -- anything that communicates something to someone.

  • Enterprises typically disperse content in unmanaged, isolated "islands" of information, while application functionality is frequently locked in proprietary systems, requiring integration technology to extricate it.

  • Line-of-business content users require content aggregated from multiple content sources in the enterprise. Similarly, line-of-business application users require high-level business functionality aggregated from multiple enterprise application sources.

  • Portals provide universal access to content. Portals also provide universal access to application functionality. Distribution of content has serious security and rights management issues... and so does distributed computing.

  • Content process applications are currently where distributed computing applications were in the mid-1980s -- out of context, unstructured and hard to locate.

XML, Web Services Provide "Component" Capability

A necessary first step to enabling more flexible content management and reuse, ZapThink said, will be the componentizing or "chunking" of content into discrete blocks that contain metadata describing their contents.

This component approach won't require inventing a whole new array of technologies or standards, the research firm added. "Everything we have learned about how to componentize application functionality and abstract it to the level where we can access it anywhere on the network can be applied to content. All that's required is a shift in the way we architect, implement and manage content processes."

However, the rub will be in how architects, LOB (line of business) managers and developers define these components. Companies must describe these content components in an abstract manner so they can be used in many different ways by many different users, ZapThink recommended. This "many-to-many" paradigm parallels the goal of many emerging web services standards, particularly for on-demand service-oriented architectures (SOA), the report added.

To get to what ZapThink calls a "Service-Oriented Content" (SOC) model, companies will need to take two key steps:

  • Add a layer of abstraction on top of enterprise content to isolate the content consumer from the content producer. In this way, consumers gain the flexibility they require to locate content -- and producers obtain the agility they need to change it as needed.

  • Encapsulate their content into discrete chunks and compose it into usable information. Encapsulation is important because it breaks up large documents into content chunks that can be assigned to different content creators. At its most basic, the rearchitecting process enterprises must undertake to offer SOC involves encapsulating content components with web services interfaces and then composing (virtualizing) these fine-grained content components into coarse-grained business-level documents.
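
The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ZapThink's design: the names `ContentChunk`, `ContentStore`, `InMemoryStore` and `compose` are hypothetical, and an in-memory dictionary stands in for whatever web services interface an enterprise would actually expose.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class ContentChunk:
    """A fine-grained content component (hypothetical structure)."""
    chunk_id: str
    author: str
    body: str


class ContentStore(Protocol):
    """Abstraction layer: consumers depend on this interface only."""
    def get(self, chunk_id: str) -> ContentChunk: ...


class InMemoryStore:
    """Stand-in producer; a real one might sit behind a web service."""
    def __init__(self, chunks: List[ContentChunk]) -> None:
        self._chunks: Dict[str, ContentChunk] = {c.chunk_id: c for c in chunks}

    def get(self, chunk_id: str) -> ContentChunk:
        return self._chunks[chunk_id]


def compose(store: ContentStore, chunk_ids: List[str]) -> str:
    """Compose fine-grained chunks into one coarse-grained document."""
    return "\n\n".join(store.get(cid).body for cid in chunk_ids)


# Each chunk has its own author, so different creators can own different parts.
store = InMemoryStore([
    ContentChunk("intro", "alice", "Quarterly results improved."),
    ContentChunk("risks", "bob", "Currency exposure remains a risk."),
])
report = compose(store, ["intro", "risks"])
print(report)
```

Because `compose` sees only the `ContentStore` interface, the producer side could be swapped for a different backend without touching the consumer, which is the isolation the first step describes.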

As a result, the anatomy of the new "Service-Oriented Content" will look like this:

  • Each content "chunk" or component will have a content metadata wrapper that describes it in much the same way that a service description describes application functionality. The metadata will enable the content to self-describe a variety of contexts and meanings, which can help reduce the number of times the same content needs to be created.

  • For locating, storing and managing this multi-contextual content, ZapThink said, developers and architects can use today's WSDL standards to describe and define content chunks, in conjunction with directories and/or registry technologies (such as UDDI), to store the descriptions. Combining metadata, WSDL and registry technologies will also provide a flexible security and access policy for content.

  • Using URL/URI technologies, in conjunction with componentized content, creates a more granular and responsive view of content. Specifically, this approach will let users "drill down" as far as needed, without the need to review or retrieve an entire document.

  • Using XML schema and standard browser APIs, the content can be separated from the presentation layer and all presentation-specific requirements, such as format.
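
As a rough illustration of the anatomy listed above, the following Python sketch uses only the standard library. Everything specific in it is an assumption made for the example: the `<chunk>`, `<meta>` and `<body>` element names, the `content.example` address, and the plain dictionary standing in for a registry. It does not implement WSDL or UDDI; it only shows the shape of the idea.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Hypothetical chunk format: each <chunk> carries a <meta> wrapper and a <body>.
DOCUMENT = """
<document id="annual-report">
  <chunk id="summary">
    <meta><topic>finance</topic><lang>en</lang></meta>
    <body>Revenue grew in every region.</body>
  </chunk>
  <chunk id="outlook">
    <meta><topic>strategy</topic><lang>en</lang></meta>
    <body>We expect continued investment in services.</body>
  </chunk>
</document>
"""

root = ET.fromstring(DOCUMENT)

# A plain dict stands in for a registry: it maps a document path to its
# parsed content so that every lookup goes through one place.
REGISTRY = {"/docs/annual-report": root}


def resolve(uri: str) -> ET.Element:
    """Drill down to one chunk: the path picks the document, the fragment the chunk."""
    parts = urlsplit(uri)
    document = REGISTRY[parts.path]
    chunk = document.find(f"chunk[@id='{parts.fragment}']")
    if chunk is None:
        raise KeyError(uri)
    return chunk


def find_by_topic(document: ET.Element, topic: str) -> list:
    """Locate chunks through the metadata wrapper rather than the body text."""
    return [c.get("id") for c in document.findall("chunk")
            if c.findtext("meta/topic") == topic]


def render_html(chunk: ET.Element) -> str:
    """Presentation lives here, outside the stored content."""
    return f"<p>{chunk.findtext('body')}</p>"


chunk = resolve("http://content.example/docs/annual-report#outlook")
print(render_html(chunk))
print(find_by_topic(root, "finance"))
```

The stored XML holds no formatting at all, so the same chunk could be passed to a different renderer (plain text, print layout) without being rewritten.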
