From Word to DITA. Questions, Answers, and Some Facts

The other day, a colleague asked via XING how she could switch to a DITA-based authoring system. Mark Schubert and Ulrike Parson answered her questions.


The colleague described her current documentation as follows:

  • User manuals, written in Word as linear documents. They are usually read cover to cover.
  • Reference manuals, created in Word. These are collections of individual texts, organized by different criteria, for example, by program command.
  • Online help, built from the user manuals: manually shortened and stripped of cross-references and style sheets.

As the software has a modular design, she wants to create project-based and customer-specific documentation. She also asked if she could rescue the hundreds of pages of legacy text when switching to DITA.

DITA (Darwin Information Typing Architecture) is a document format based on XML for the design, creation, management, and publication of information. DITA allows reuse of topic-oriented content as modules (topics). It is suitable for managing comprehensive documentation.

parson communication:
The good news first: Usually, the technical implementation is no problem.

So what’s the bad news?

parson communication:
Reworking your old content could be costly. DITA content is divided into modules called topics, and each topic belongs to one of three main information types: task, concept, or reference. If these information types are not clearly separated in the original text, adapting your old content to the DITA standard takes considerably more effort. You may have to rewrite content so that it fits the semantic DITA structure. Even though you already create different types of manuals, their structure may not yet comply with the DITA standard. If the old content is already clearly structured, the effort will be much lower.

Topics are blocks of information that describe a particular issue. Topics are self-contained and independent of other topics. This way, a topic can be used wherever its issue comes up in the documentation. Topics are kept small for flexible use. They are divided into three main types of information:

  • Tasks describe how the user accomplishes a task (how do I …).
  • Concepts describe basic relationships or architecture (what is …).
  • References contain facts, parameters, technical data, etc. and are usually displayed as lists or tables.
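To make the idea of a typed topic more concrete, here is a minimal sketch of what a task topic might look like. The topic ID, the title, and the wording of the steps are invented for illustration; only the element names come from the DITA standard.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE task PUBLIC "-//OASIS//DTD DITA Task//EN" "task.dtd">
<!-- Hypothetical task topic: ID and content are illustrative only. -->
<task id="export_report">
  <title>Exporting a report</title>
  <shortdesc>Export a report to share it outside the application.</shortdesc>
  <taskbody>
    <!-- What must be true before the user starts -->
    <prereq>A report is open.</prereq>
    <steps>
      <step><cmd>Open the export dialog.</cmd></step>
      <step><cmd>Choose a file format and confirm.</cmd></step>
    </steps>
    <!-- What the user sees when the task is done -->
    <result>The report is saved to the selected folder.</result>
  </taskbody>
</task>
```

Concept and reference topics follow the same pattern with their own root and body elements (concept with conbody, reference with refbody), which is why mixed legacy chapters usually have to be split by information type before they fit.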

How should I proceed?

parson communication:

  1. Make an inventory. Check the content for similarities and differences, and do it both for the information product as a whole (the book) and the individual chapters. List the different versions of your documentation, for example, according to target groups, products, or output formats.
  2. State your requirements. As there are certainly differences between the individual information products and their versions, you need to identify a common basis.
  3. Model your preferred target documentation in DITA. Usually, DITA offers much more than you need. Most authors use only a small subset of the DITA elements or adjust some of them to their needs. Experiment with the old chapters. Try to break them down and rebuild them as DITA topics.

What’s the ideal topic size? What if the size I choose in the beginning turns out to be too big? Should I design smaller topics rather than bigger ones?

parson communication:
I recommend designing smaller topics. Organize them in a DITA map. You can easily break down a big topic into smaller ones and create a DITA map out of it. But it will take time to do this for many big topics.

Maps let you organize the topics that you want to build into an output such as web output or PDF. You can also generate navigation files based on the map structure and generate links that are added to the topics.
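As a rough illustration, a map for one customer-specific manual could look like the sketch below. All file names, folder names, and the map title are assumptions made up for this example.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">
<!-- Hypothetical map: file paths and title are illustrative only. -->
<map>
  <title>User Manual, Customer A</title>
  <!-- Nesting topicref elements defines the hierarchy of the output -->
  <topicref href="concepts/reporting_overview.dita" type="concept">
    <topicref href="tasks/create_report.dita" type="task"/>
    <topicref href="tasks/export_report.dita" type="task"/>
  </topicref>
  <topicref href="reference/report_commands.dita" type="reference"/>
</map>
```

A second map for another customer or project can reference the same topic files in a different selection or order, which is how small topics pay off for project-based documentation.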

Do I have to move the pictures separately?

parson communication:
Pictures are stored in a separate directory and referenced from the topic with the DITA <image> element. In the output, each picture is displayed at the location of its reference in the topic.
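A minimal sketch of such a reference inside a topic might look like this. The directory name, the file name, the figure ID, and the texts are assumptions for illustration.

```xml
<!-- Fragment from a topic body; path and texts are illustrative only. -->
<fig id="fig_export_dialog">
  <title>Export dialog</title>
  <!-- href points to the picture file, relative to the topic file -->
  <image href="../images/export_dialog.png" placement="break">
    <alt>Export dialog with format selection</alt>
  </image>
</fig>
```

Because the topic only stores the path, the legacy pictures can be copied into the image directory once and then referenced from as many topics as needed.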

The current documentation contains screenshots for print and PDF. They should not appear in the online help as the customer looks at the screen already when opening the help. Do I delete references to these screenshots in the text? Or can I take phrases like “see image 13” along when creating a PDF, for example, and omit them for others like HTML?

parson communication:
You can assign version information within a topic. An XML attribute specifies, for example, that an image should appear only in print. These so-called attribute-based conditions make blocks of text or parts of sentences appear or disappear. For parts of sentences, you simply wrap the text in a <ph> (phrase) element and set the corresponding attribute on that element.
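The following sketch shows one possible way to set this up. It assumes that the otherprops attribute with the value "print" marks print-only content; this naming is a project convention chosen for the example, and other profiling attributes such as audience, platform, or product work the same way. The file name online.ditaval is likewise hypothetical.

```xml
<!-- Inside a topic: the phrase and the figure are tagged as print-only.
     The attribute value "print" is an assumed convention. -->
<p>Select the export format<ph otherprops="print"> (see image 13)</ph>.</p>
<fig otherprops="print">
  <title>Export dialog</title>
  <image href="../images/export_dialog.png"/>
</fig>

<!-- Separate filter file (for example online.ditaval) applied to the
     online help build: everything tagged as print is excluded. -->
<val>
  <prop att="otherprops" val="print" action="exclude"/>
</val>
```

With this filter, the online help would read "Select the export format." without the screenshot, while a print build without the filter keeps both the reference and the figure.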

Well, I once learned that images should always be referred to in the text.

parson communication:
In single-source publishing, we weigh the amount of work for the technical author against readability. We also like to use screenshots, for example in instruction manuals. These screenshots have no title, and the text does not refer to them. Yet they fit in the context and can easily be updated. We also distinguish between screenshots and images. Images to which we refer in the documentation are also useful in online help.

Do I need a content management system for DITA?

parson communication:
No. You do not necessarily need a content management system to use DITA. You could, for example, use an XML editor like Oxygen and organize modules in maps. XML files and maps can be managed in a source-code management system like Perforce, Subversion, etc. However, for larger numbers of topics and versions, you need sophisticated content management, which only XML-based content management systems offer.



