From Word to XML in Four Steps

by Mark Schubert on March 02, 2012

Many companies start producing their documentation in Word, and this works well for small manuals. But as products change and grow, manuals that started small swell to hundreds of pages and become increasingly difficult to manage. At this point, companies look for other solutions, such as migrating from Word to XML-based documentation.

Structured FrameMaker is a popular choice for an XML editor. But making this transition requires knowledge, discipline, and attention to detail.

Step 1: Use styles in Word

The only way to maintain control over your Word documents is to use styles. Styles are essential for technical documentation, where accuracy and consistency matter. Following a style guide saves time because writers do not have to wonder which style to use. However, inconsistent use of styles often occurs in documents that have evolved over time or that were written by multiple technical writers who applied different styles.

To take control over your Word documents, define a style guide and consistently apply the styles. Documents that are formatted this way make the transition to XML smoother. 
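Before defining the style guide, it can help to see which styles a document actually uses. The following is a minimal sketch, not part of the original process: it assumes the document is saved as .docx and that the main document part sits at the usual location, word/document.xml. The function names are made up for illustration. Note that Word stores style IDs (for example "Heading1"), which can differ from the names shown in the user interface.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML namespace used inside word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_paragraph_styles(document_xml: bytes) -> Counter:
    """Count how often each paragraph style ID occurs in document.xml."""
    root = ET.fromstring(document_xml)
    counts = Counter()
    for para in root.iter(f"{{{W_NS}}}p"):
        style = para.find(f"{{{W_NS}}}pPr/{{{W_NS}}}pStyle")
        # Paragraphs without an explicit style use the document default.
        style_id = style.get(f"{{{W_NS}}}val") if style is not None else "(default)"
        counts[style_id] += 1
    return counts


def count_styles_in_docx(path: str) -> Counter:
    """Open a .docx file (a ZIP archive) and count the styles it uses.

    Assumption: the main document part is stored as word/document.xml.
    """
    with zipfile.ZipFile(path) as docx:
        return count_paragraph_styles(docx.read("word/document.xml"))
```

Styles that appear only once or twice in a long manual are often candidates for cleanup, because they tend to be one-off variations of a standard style.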

Step 2: Convert Word documents to FrameMaker

Now that you have consistently styled your Word documents, create the same styles—with the same names—in your FrameMaker template. Setting up identical styles in FrameMaker enables you to easily import your Word content into FrameMaker.

As a result, you get unstructured but consistently formatted FrameMaker documents.

Step 3: Map styles to XML elements

To convert unstructured FrameMaker documents to XML, you first need to decide which XML document type definition (DTD) or schema to use. Making this decision requires solid knowledge of XML. For example, do you want to use DITA, DocBook, PI-Mod, or a customized DTD?

After you decide which DTD or schema to use, you must map your FrameMaker styles to XML elements. FrameMaker provides an integrated conversion tables tool for generating structured documents from unstructured documents. For example, based on the mapping, the "Heading 1" style is converted to the corresponding XML element. If you define the rules in the right sequence, the conversion tables can even handle more complex mappings such as lists within other lists.

With the conversion tables, you can generate XML documents directly from your styles.
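The idea behind such a mapping can be illustrated outside FrameMaker. The sketch below is not FrameMaker's conversion table format. It is a hypothetical Python model in which the style names, element names, and the build_xml function are all assumptions chosen for illustration. It shows why sequence matters: each paragraph is first mapped to an element, and consecutive list items are then grouped under a shared parent.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from paragraph style names to XML element names.
# The element names are placeholders, not taken from a specific DTD.
STYLE_TO_ELEMENT = {
    "Heading 1": "title",
    "Body": "p",
    "Bullet": "li",
}

# Elements that must be wrapped in a parent when they occur in a run.
WRAPPERS = {"li": "ul"}


def build_xml(paragraphs, root_name="topic"):
    """Turn (style, text) pairs into an element tree using the mapping."""
    root = ET.Element(root_name)
    current_wrapper = None
    for style, text in paragraphs:
        element_name = STYLE_TO_ELEMENT.get(style)
        if element_name is None:
            # An unmapped style means the source was not styled consistently.
            raise ValueError(f"No mapping for style {style!r}")
        wrapper_name = WRAPPERS.get(element_name)
        if wrapper_name is None:
            # A non-list paragraph ends any open list.
            current_wrapper = None
            parent = root
        else:
            # Start a new wrapper only at the beginning of a run.
            if current_wrapper is None or current_wrapper.tag != wrapper_name:
                current_wrapper = ET.SubElement(root, wrapper_name)
            parent = current_wrapper
        ET.SubElement(parent, element_name).text = text
    return root


paragraphs = [
    ("Heading 1", "Install"),
    ("Body", "Do this:"),
    ("Bullet", "A"),
    ("Bullet", "B"),
    ("Body", "Done"),
]
print(ET.tostring(build_xml(paragraphs), encoding="unicode"))
```

Raising an error for unmapped styles is a deliberate choice in this sketch: it makes inconsistent styling visible early instead of silently producing invalid structure.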

Step 4: Convert to XML

You are almost there now. Using the conversion tables and a few mouse clicks, you migrate the Word documents to XML. Ideally, because the original Word document was properly formatted using styles, almost no post-editing of the XML files is required. However, you will have to redefine cross-references and links to images. You can also use XSLT scripts to automate post-editing.

This four-step process enables you to convert documents to XML easily and safely.
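As an illustration of automated post-editing, here is a minimal sketch that repoints image links to a common folder. It uses Python's standard library rather than XSLT, and the element name "image", the attribute name "href", and the function relink_images are assumptions, so adapt them to your DTD. Be aware that ElementTree does not preserve the DOCTYPE declaration or comments when it writes the document back out.

```python
import posixpath
import xml.etree.ElementTree as ET


def relink_images(xml_text, image_dir, tag="image", attr="href"):
    """Point every image reference at image_dir, keeping the file name.

    Returns the updated XML as a string and the number of changed links.
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for image in root.iter(tag):
        old = image.get(attr)
        if old is None:
            continue
        # Normalize Windows separators so the file name can be extracted.
        file_name = posixpath.basename(old.replace("\\", "/"))
        new = posixpath.join(image_dir, file_name)
        if new != old:
            image.set(attr, new)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed
```

Calling relink_images(xml_text, "images") would, under these assumptions, rewrite a link such as C:\docs\pics\pump.png to images/pump.png and leave links that already point into images/ untouched.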

When MS Word is an integral part of your documentation workflow ...

Even after you make the transition to producing your documentation in XML, you might still receive new content from subject matter experts in MS Word format. No problem!

As long as content providers use styles that map to your XML elements, you can use the same conversion process. To ensure consistent formatting, your company can enforce the use of common Word templates that restrict styles to those that can be converted to XML.

Example: The Software Engineering group uses restrictive Word templates to document change requests. The Technical Writing group processes these documents by means of the conversion tables.
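A restrictive template can be backed up by a quick check on incoming files. The sketch below flags paragraphs whose style is not on an allowed list. The style IDs in ALLOWED_STYLE_IDS and the function name are hypothetical, and the input is assumed to be the content of word/document.xml from a .docx file.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace, in the {uri} form ElementTree expects
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

# Hypothetical list of style IDs that have a mapping to XML elements.
ALLOWED_STYLE_IDS = {"Heading1", "BodyText", "ListBullet"}


def find_disallowed_paragraphs(document_xml, allowed=ALLOWED_STYLE_IDS):
    """Return (paragraph number, style ID, text) for each paragraph
    whose style is missing or not in the allowed set."""
    problems = []
    root = ET.fromstring(document_xml)
    for number, para in enumerate(root.iter(f"{W}p"), start=1):
        style = para.find(f"{W}pPr/{W}pStyle")
        style_id = style.get(f"{W}val") if style is not None else None
        if style_id not in allowed:
            # Collect the visible text so the writer can locate the paragraph.
            text = "".join(t.text or "" for t in para.iter(f"{W}t"))
            problems.append((number, style_id, text))
    return problems
```

An empty result suggests the document can go straight into the conversion. Anything else can be returned to the content provider with the paragraph numbers attached.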

Thus, you can use this conversion process for both one-time migrations and recurring conversions.
