Checking and fixing HTML conversion outcomes

Background

The first step in word2canvas is converting your Word document to HTML. To do this, word2canvas uses:

  1. the Mammoth .docx to HTML converter; and,
  2. a custom style map.

After attempting to convert your Word document, word2canvas displays both the Messages and the HTML generated by Mammoth. Each is shown in an accordion that can be opened and closed, so you can check the outcome of the conversion and work out how to fix any issues.

Checking the conversion outcomes

The following image shows the conversion outcomes of the sample w2c.docx file. The two most common checks are:

  1. Do the Messages include anything of concern?
  2. Are there any problems with the generated HTML?

Check the Messages

Mammoth messages follow a common format, typically including a type and a message.

The type will be either a warning or an error. Warnings tend to mean the conversion happened, but perhaps with unexpected results.

As illustrated above, the most common warning indicates that the document includes a Word style that Mammoth does not recognise.
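A long document can produce many repeated messages, so it can help to summarise them before deciding what to fix. The following is a minimal Python sketch and is not part of word2canvas. It assumes each message is available as a dictionary with type and message keys, and that the unrecognised-style warning reads like "Unrecognised paragraph style: 'Poem' (Style ID: Poem)". Check that wording against your own Messages accordion. The name summarise_messages is hypothetical.

```python
import re
from collections import Counter

# Assumed wording of the unrecognised-style warning; adjust if yours differs.
UNRECOGNISED = re.compile(
    r"Unrecognised (paragraph|run|table) style: '?(.+?)'? \(Style ID: [^)]*\)"
)


def summarise_messages(messages):
    """Group conversion messages into errors, unrecognised styles and other warnings."""
    errors = []
    other_warnings = []
    unrecognised = Counter()
    for item in messages:
        if item["type"] == "error":
            errors.append(item["message"])
            continue
        match = UNRECOGNISED.search(item["message"])
        if match:
            # Key is (kind of style, style name); value counts how often it was reported.
            unrecognised[(match.group(1), match.group(2))] += 1
        else:
            other_warnings.append(item["message"])
    return {
        "errors": errors,
        "unrecognised_styles": dict(unrecognised),
        "other_warnings": other_warnings,
    }
```

Any errors deserve attention first. The unrecognised_styles counts show which Word styles appear most often and are therefore most worth handling.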

Handling unrecognised Word styles

There are three ways you can handle a Word style that Mammoth does not recognise:

  1. Ignore it - generally the text will still appear in the HTML. However, it may have lost some of the intended styling.
  2. Remove the style from the Word document - you can search a Word document for a specific style and then either apply a different style or remove the style entirely.
  3. Add the style to the word2canvas custom map - the custom map is in the word converter model. If you're not comfortable coding, you can request an update of the custom style map via the word2canvas Issues.
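For the third option, it may help to see what a style map entry looks like. Mammoth style maps are lines of the form p[style-name='Some Style'] => html-target, where p matches paragraph styles and r matches run (character) styles. The following Python sketch builds such a line. The helper name style_map_rule and the example targets are hypothetical, and the rule you actually need depends on how the word2canvas custom map is organised.

```python
def style_map_rule(kind, style_name, html_target):
    """Return one Mammoth style map line for a paragraph or run style.

    kind:        "paragraph" or "run", as reported in the warning message
    style_name:  the Word style name, e.g. "Poem"
    html_target: the HTML path to generate, e.g. "p.poem:fresh"
    """
    prefixes = {"paragraph": "p", "run": "r"}
    if kind not in prefixes:
        raise ValueError(f"unsupported style kind: {kind!r}")
    if "'" in style_name:
        # Assumption: quoting rules are not handled here, so reject such names.
        raise ValueError("style names containing a quote are not handled by this sketch")
    return f"{prefixes[kind]}[style-name='{style_name}'] => {html_target}"


# Example: map a paragraph style called "Poem" to <p class="poem">.
print(style_map_rule("paragraph", "Poem", "p.poem:fresh"))
```

The generated line is what would be added to the custom style map, or quoted when requesting an update via the word2canvas Issues.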

Handling problems with the generated HTML

If the generated HTML does not meet your expectations, the only real solution is to modify the Word document (e.g. change the style used or modify an image) and test the conversion again.
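When repeating the convert-and-check cycle, a small script can flag some problems faster than reading the HTML by eye. The following Python sketch uses only the standard library and looks for two illustrative problems: images with no alt text and headings with no text. These two checks are assumptions about what might count as a problem, not a list defined by word2canvas, and the name check_html is hypothetical.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class _HtmlCheck(HTMLParser):
    """Collect images without alt text and headings that contain no text."""

    def __init__(self):
        super().__init__()
        self.images_without_alt = 0
        self.empty_headings = []
        self._heading = None  # heading tag currently open, if any
        self._text = []       # text seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag == "img" and not dict(attrs).get("alt"):
            self.images_without_alt += 1
        elif tag in HEADINGS:
            self._heading = tag
            self._text = []

    def handle_data(self, data):
        if self._heading:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._heading and tag == self._heading:
            if not "".join(self._text).strip():
                self.empty_headings.append(tag)
            self._heading = None


def check_html(html):
    """Return a small report on the generated HTML."""
    parser = _HtmlCheck()
    parser.feed(html)
    parser.close()
    return {
        "images_without_alt": parser.images_without_alt,
        "empty_headings": parser.empty_headings,
    }
```

To use it, copy the HTML from the accordion into a string and pass it to check_html. Any finding still has to be fixed in the Word document, followed by another conversion.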