WORKING WITH XML

I recently saw a demonstration of using XML import into InDesign. For me, the manual process looked laborious. This turned my attention to using scripts to import XML.

Using a script to import XML is easy enough once you have an XML file. Most databases have an XML option. But that’s another subject. For the purpose of this discussion, we will just assume you have an XML file from a client who promises a series of projects.

The advantage of using a script for importing an XML file is that it can be much simpler than the manual method. The script can also take care of related tasks:

  • file handling: choose file, open, and save
  • make sure XML import preferences are set correctly as needed by the import
  • create a document from a template
  • save the document, making sure that the proper file naming convention and file extension are used
  • automate text styling by mapping XML tags to InDesign paragraph styles

A BRIEF INTRODUCTION TO XML

In case you are not familiar with XML, there are just a few things you need to know:

  • XML is a method for exchanging information between applications, the web, databases, and so on.
  • XML is a text-based markup system.
  • Like InDesign, it has a hierarchical structure.
  • It is case-sensitive.

Hierarchical Structure

As with a tree, the XML hierarchical structure starts with a root. Inside the root can be one or more trunks. And within each trunk there can be one or more branches. And branches can have leaves. I am sure you have the idea. Now, turn the tree upside down so the root is at the top and you have a structure that resembles that of InDesign and XML. In fact, the parent element at the top of InDesign’s XML structure is named “Root”.

Similar to HTML, XML identifies elements in the structure using tags. A tag is simply a pair of angle brackets surrounding an XML identifier, as in <trunk>. There is a tag at the beginning of a text element and one at the end. The ending tag is the same as the beginning tag with the exception that its identifier begins with a slash.

<text>My actual text here.</text>

Case Sensitivity

With case sensitivity, a capital A is not the same as a lowercase a. So just stick with lowercase in most situations in naming tags for XML and you will be fine. You define your own tag identifiers, which is what makes XML “extensible.” An identifier can combine letters, digits, underscores, hyphens, and periods, but it must start with a letter or an underscore and cannot contain spaces. Try to use a tag that describes the role of the text within the document, such as <headline> or <text>.

For simplicity, our demonstration will use tags that match the names of our paragraph styles.
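
Because the tags are meant to match paragraph style names, it helps to know that AppleScript ignores case when comparing text unless told otherwise, while XML does not. The following is a minimal sketch in plain AppleScript; the values of tagName and styleName are made up for illustration.

```applescript
--hypothetical values: a tag and a style name that differ only in case
set tagName to "Headline"
set styleName to "headline"

--by default AppleScript ignores case when comparing text
set looseMatch to (tagName is styleName)

--a considering case block makes the test as strict as XML
considering case
   set strictMatch to (tagName is styleName)
end considering

{looseMatch, strictMatch}
--returns: {true, false}
```

A test written this way will catch a tag that differs from its paragraph style name only by capitalization.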

File Requirements

An XML file should start with a prolog. Although this is optional, it tells the world what kind of a file it is.

<?xml version="1.0" encoding="UTF-8"?>

Lastly, the file must be saved with the .xml extension.

Actual XML

With all of the above in mind, the contents of the XML file could look like the following:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
    <page>
        <image href = "images/cover.jpg"></image>
        <text>Text by Jane Doe
</text></page>
    <page>
        <text>It was a dark and mysterious night...
</text></page>
...
</Root>

The next task is to have a template set up for the documents.

With this in place, all that is needed is a script.

THE SCRIPT

Create Document from Template

The script begins by creating a document from a template selected by the user. Here we will have a template located in a folder named “Templates” inside InDesign’s application folder.

The path to the application folder is returned from the following AppleScript statement block that places the result in the variable appPath:

   tell application "Adobe InDesign CC 2015"
      set appPath to file path
   end tell
   --returns: a file reference

To add the Templates folder to the result it is necessary to convert the file reference (appPath) to a string (text) before adding (concatenating).

   set folderPath to appPath as string & "Templates:"
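
If the Templates folder is missing, the later steps fail with a less helpful message. One way to check early is to coerce the path to an alias, which only succeeds when something exists at that path. This is a sketch only: so that it runs without InDesign, it substitutes the system Applications folder for the appPath value returned by InDesign.

```applescript
--stand-in for appPath so the sketch runs without InDesign
set appPath to path to applications folder
set folderPath to (appPath as string) & "Templates:"

try
   --coercing to alias fails if nothing exists at the path
   set folderRef to folderPath as alias
on error
   error "No Templates folder found at " & folderPath
end try
```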

The script can now ask the user to select a file from the list of files in the Templates folder:

   set fileList to list folder folderPath
   set userChoice to choose from list fileList with prompt "Choose template for project" without multiple selections allowed
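
One detail worth noting is that choose from list returns a list of the chosen items, or false if the user cancels. A minimal sketch of handling both results follows; the template names are placeholders rather than files from the project.

```applescript
--placeholder names standing in for the contents of the Templates folder
set fileList to {"Booklet.indt", "Flyer.indt"}
set userChoice to choose from list fileList with prompt "Choose template for project" without multiple selections allowed

if userChoice is false then
   --treat Cancel the same as canceling any other dialog
   error number -128
end if

--the result is a list, even when only one item can be chosen
set templateName to item 1 of userChoice
```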

At this point, all the script has is the name of the template file to open. Before the document is created, the script needs to make sure that XML import preferences are set as anticipated for the project. The call to the handler is placed after the user selects the template to be used. This ensures that XML import preferences will not be altered without a document being created. Add this to the above:

   my xmlImportPrefs (true)

This calls the following handler to set the preferences. The value of true is passed on to the repeat text elements preference, indicating that the XML import can process multiple records.

   (*Sets up xml import preferences for merge import*)
   on xmlImportPrefs(doRepeat)
	tell application "Adobe InDesign CC 2015"
	   tell XML import preferences
		set create link to XML to false
		set allow transform to false
		set import style to merge import
		set repeat text elements to doRepeat
		set import to selected to false
		set ignore whitespace to false
		set ignore unmatched incoming to false
		set import CALS tables to false
		set import text into tables to false
		set remove unmatched existing to false
	   end tell
	end tell
   end xmlImportPrefs

To create the document, another handler can be used. The handler will take two arguments: the path to the template file as a string (templatePath) and a true or false to indicate whether the InDesign document is to be closed after saving. If false, the document remains open.

To take care of any errors, all of this needs to be placed within a try/on error block. One such error will occur if the user cancels out of the choose file name dialog. The docFromTemplate handler returns a reference to the document created.

With all of this, the top part of the script should read similar to the following:

   try
      tell application "Adobe InDesign CC 2015"
         set appPath to file path
      end tell
      set folderPath to (appPath as string) & "Templates"
      set fileList to list folder folderPath without invisibles
      set userChoice to choose from list fileList with prompt "Choose template for project" without multiple selections allowed
      --choose from list returns false if the user cancels
      if userChoice is false then error number -128
      set templatePath to folderPath & ":" & (item 1 of userChoice)
      my xmlImportPrefs(true)
      set docRef to docFromTemplate(templatePath, false)
   on error errStr
      activate
      display alert errStr
   end try

The docFromTemplate handler opens the template chosen and allows the user to choose the folder and filename for saving. Additionally, the script checks to make sure the file name extension for the file is “.indd”.

   (*Opens template named and saves as document to path chosen*)
   on docFromTemplate(templatePath, doClose)
      tell application "Adobe InDesign CC 2015"
         set docRef to open file templatePath
         activate
         set savePath to choose file name with prompt "Select file and folder for save"
         --tests to make sure name of file ends with ".indd"
         if (savePath as string) does not end with ".indd" then
            set savePath to (savePath as string) & ".indd"
         end if
         if doClose then
            close docRef saving yes saving in savePath without force save
         else
            save docRef to savePath without force save
         end if
      end tell
      return docRef
   end docFromTemplate
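
The extension test can also be pulled out into its own small handler. This is only a sketch: ensureExtension is a hypothetical name and the sample paths are made up. Note that does not end with ignores case by default, so a name that already ends in .INDD is left alone.

```applescript
--hypothetical helper: appends ext unless pathStr already ends with it
on ensureExtension(pathStr, ext)
   if pathStr does not end with ext then
      return pathStr & ext
   end if
   return pathStr
end ensureExtension

set firstPath to ensureExtension("Macintosh HD:Projects:Book", ".indd")
set secondPath to ensureExtension("Macintosh HD:Projects:Book.INDD", ".indd")
{firstPath, secondPath}
--returns: {"Macintosh HD:Projects:Book.indd", "Macintosh HD:Projects:Book.INDD"}
```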

It may be a good idea to run the script at this point just to make sure no errors have been made.

Now that there is a document for the XML, the script needs to place the file. For this the rest of the script will require the following information: 

  • a reference to the document. Here the script will use the value stored in the variable docRef which was received as the result of the docFromTemplate handler.
  • a reference to the page onto which the file will be placed
  • a reference to the XML element that will be used to place the file
  • an x-y coordinate for where on the page the file will be placed
  • a value of true or false to indicate whether the placed text should autoflow

The template that will be used has a text frame on master page A-Master designed to hold the XML text. The script will get the X-Y coordinates for placing the file from the geometric bounds of this frame. The next segment of the script can be tested with the document created from above.

For the sake of testing, the remaining script sections will be written as stand-alones.

   --getPlacePoint
   tell application "Adobe InDesign CC 2015"
      set docRef to document 1
      tell docRef
         set masterPage to page 1 of master spread "A-Master"
         set masterFrame to text frame 1 of masterPage
         --geometric bounds are listed as {top, left, bottom, right}
         copy geometric bounds of masterFrame to {y0, x0, y1, x1}
         --the place point is the top left corner of the frame as {x, y}
         set placePt to {x0, y0}
      end tell
   end tell
   placePt
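
The order of the values in geometric bounds is easy to mix up, since they run {top, left, bottom, right} while a point is written {x, y}. The following sketch unpacks a made-up set of bounds in plain AppleScript, without InDesign; the numbers and the names topLeft, bottomRight, and frameSize are for illustration only.

```applescript
--made-up bounds in the order InDesign reports them: {top, left, bottom, right}
set frameBounds to {36, 48, 756, 564}
copy frameBounds to {y0, x0, y1, x1}

--corners are written as {x, y}
set topLeft to {x0, y0}
set bottomRight to {x1, y1}
--width and height of the frame
set frameSize to {x1 - x0, y1 - y0}

{topLeft, bottomRight, frameSize}
--returns: {{48, 36}, {564, 756}, {516, 720}}
```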

IMPORT XML

For this segment the script will have the user choose the XML file for import and place the XML. To make sure the user chooses an XML file, the script will test its filename extension. Remember that an XML file needs to have the .xml file extension. Also, be aware that if show all filename extensions is not checked in Finder’s preferences, InDesign will not recognize the extension.

The code for this segment follows:

   set fileRef to choose file with prompt "Choose XML file for placing"
   set fileInfo to info for fileRef
   if name extension of fileInfo is not "XML" then
      error "Wrong file type chosen"
   end if

If an error is not thrown, the script can then import the XML. Again, this will be a standalone script for testing.

   activate
   set fileRef to choose file with prompt "Choose XML file for placing" without invisibles and multiple selections allowed
   set fileInfo to info for fileRef
   if name extension of fileInfo is not "XML" then
	error "Wrong file type chosen"
   end if
   tell application "Adobe InDesign CC 2015"
	set docRef to document 1
	tell docRef
	   import XML from fileRef
	end tell
   end tell

At this point you should be able to import an XML file into Adobe InDesign. When an XML file is imported, its structure is shown in InDesign’s Structure panel. Learn to use the keyboard shortcut Command + Option + 1 to open this panel. It is also accessed from InDesign’s View menu: View > Structure.

InDesign’s Structure Panel

INDESIGN XML TEMPLATE

For some projects, creating an InDesign template for XML import can be quite complex. This project starts out very simple: a document with primary text frame set to true. The document has nothing more than a single text frame on the A-Master page to define where the text will be placed. Margins for the page are set to correspond to the text frame. This technique works for a document where all pages of the document will be similar with all text for each page contained within one text frame or a series of linked text frames.

ON YOUR OWN

Now that you have the pieces of code, see if you can put them together into a single working script. Next, try to create an XML file just for fun. In doing so, make sure that you use a plain text editor such as TextEdit to create the file as plain text. If quotation marks are used within the text of the file, they need to be straight quotes (inch marks). InDesign will convert these to typographer’s quotes (that is, if you have convert straight quotes to typographer quotes set to true for text import).

If you get your script running without error, try using your script to import the XML file into an InDesign file created from your template.

 

UPWARD AND ONWARD

In future posts for this series we will complete this project and introduce you to more exciting ways templates and XML files can be set up for even more eye-opening ways to take advantage of XML in your workflows.