Your prescription for increased productivity and profitability
In our previous post we introduced a sample of an XML file. The text of the file served its purpose in illustrating a CALS table. But there remains the need to give a tip-of-the-iceberg explanation of what XML markup is all about.
XML (Extensible Markup Language) is one of the ways that plain text can be marked up to identify its elements. The whole idea of a markup language is to surround text (or other elements) with an established code that identifies it. As in HTML, the code identifies the various text elements of a document using terminology that describes what the element IS. Unlike HTML, XML separates the document (data) from style.
If you are familiar with HTML, which is probably the best known markup language, you know that the tags that define a document element begin with a word enclosed in angle brackets (<>). For example, for a top-level heading, which browsers typically style the largest, the beginning tag is written <h1> and the ending tag is written </h1>.
<h1>This is a heading</h1>
With HTML all of the available tags are predefined and recognized by web browsers. How the text is styled depends on the browser (unless styling is defined by CSS).
XML is used for many aspects of document and web development. Because it identifies what the element is but not how it is to be styled, the same XML data can be used in many different types of presentations.
The XML tags that define the document elements are similar to HTML tags in that the tag names are placed inside angle brackets. The difference is that there is no predefined list of tags that need to be used. The tags can be defined as needed by the user (the eXtensible part of eXtensible Markup Language). To establish a standard set of tags, publishers create a document known as a document type definition (DTD). One of the most used standards is XMLNews, which is used for exchanging news and other information. Its root element is <nitf> and its tags include <head>, <body>, <headline>, <byline>, and <dateline>.
Although users can define their own tags for XML, there are a number of rules that define how XML tags can be written:
<bodyText>Text to be used as body text.</bodyText>
A common practice is to use the same naming rule as used in the source database. With InDesign documents, paragraph styles are often created using a naming convention that conforms to the XML tags. We will be using this convention in the following example.
An XML document must have a single root element that opens and closes the document. Its markup tag is often named <Root>. Within that element all other document elements must be properly nested to indicate child/parent relationships.
<Root>
    <h1>This is a headline</h1>
    <bodyText><bold><italic>This text is bold and italic</italic></bold></bodyText>
</Root>
Notice in the example above that both the beginning and ending tags for italic are inside (nested within) the beginning and ending tags for bold, and both are nested within the bodyText tags.
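The matching and nesting rules can also be checked outside InDesign with any XML parser. The following is only a sketch, assuming Python and its standard xml.etree.ElementTree module are available; the helper name is_well_formed and the sample strings are hypothetical and not part of the script built in this post.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, otherwise False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Tags match in spelling and capitalization and are properly nested.
good = (
    "<Root><h1>This is a headline</h1>"
    "<bodyText><bold><italic>Bold and italic</italic></bold></bodyText></Root>"
)

# The ending tag differs in capitalization, so it does not match <h1>.
bad_case = "<Root><h1>This is a headline</H1></Root>"

# The bold and italic tags overlap instead of nesting.
bad_nesting = "<Root><bold><italic>text</bold></italic></Root>"

# Two top-level elements, so there is no single root element.
two_roots = "<Root></Root><Root></Root>"

for label, sample in [
    ("good", good),
    ("bad_case", bad_case),
    ("bad_nesting", bad_nesting),
    ("two_roots", two_roots),
]:
    print(label, is_well_formed(sample))
```

Only the first sample passes. A check like this on a hand-typed XML file can catch a mismatched tag before the file is imported.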
The XML file we will use for our demonstration is shown below. You may want to enter it into a plain text file to follow along. If you do, watch the beginning and ending tags to make sure they match in spelling as well as capitalization.
The interesting thing about working with Adobe InDesign is that you can import an XML file and associate its tags with the names of paragraph styles in the document. Paragraph styles can also be associated with HTML tags used for exporting documents in HTML format. We will demonstrate importing XML with the following sample script. We will leave exporting as HTML for later.
Our sample script will create a document from a template that has paragraph styles named the same as the XML tags used in the XML file. It also has a predefined table style named “GrayHead”. The template will be an InDesign template (.indt) stored in a folder named Templates inside the InDesign application folder. (If you do not have administration privileges, you will need to use another location.) If you are following along, make sure your template has the following styles:
Paragraph Styles:
Head, Text_10, Label, Table_Head, Table_Body.
Table Style: GrayHead.
Cell Styles: Head (for the first row), Body (for all cells except the left region and right region), LeftCol (for the left region), RightCol (for the right region).
How you define these styles is up to you.
We will take the script one step at a time. In case you have problems we will post a download of the files on the AppleScript page of this web site.
Define variables that can be changed by the user at the top of a script. Later, when the script is working as designed, these can be replaced by a user interface using InDesign’s custom dialog.
set elementName to "table"
set tableStyleName to "GrayHead"
set doLink to false --should XML file be linked
This part of the script will be taken care of (handled) by a handler getTemplate. The call to the handler places the result passed back by the handler into the variable templateRef. The handler uses a choose file command.
--the call to the handler
set templateRef to getTemplate()

(*Template file selected by user is returned from handler;
if the user clicks Cancel, choose file itself throws error -128 (User canceled)*)
on getTemplate()
    tell application "Adobe InDesign CC 2015"
        set appPath to file path as string
    end tell
    set templatePath to (appPath & "Templates:") as alias
    set templateRef to (choose file with prompt "Choose template for document" default location templatePath without invisibles)
    return templateRef
end getTemplate
The script will use a choose file command to have the user identify the XML file for import. Notice how both choose file commands incorporate a default location parameter. This requires an alias reference to the folder where the file will be found. For simplicity we used the path to the user’s desktop, but you may wish to use another path.
--establish folder path for XML file
set homePath to path to desktop from user domain
--define XML file to import
set fileRef to choose file with prompt "Select XML file for import" default location homePath without invisibles
Once the script has a reference to the template and the XML file to place, the document is created. A reference to the document is placed in the variable docRef. The script then tells the document (docRef) to import the XML file using the reference to the XML file (fileRef).
tell application "Adobe InDesign CC 2015"
    tell XML import preferences
        set import style to merge import
        set import CALS tables to false
        set repeat text elements to false
        set create link to XML to doLink
    end tell
    set docRef to open templateRef
    tell docRef
        import XML from fileRef
    end tell
end tell
If you run the script at this point and open the document created, you will see the structure of the XML file in InDesign’s Structure pane (View > Structure > Show Structure). You will need to click the small arrows next to the Root element and its child elements to disclose the elements (and their contents) in the structure.
Root
    Head
    Text_10
    table
        td
        td
        ...
    Text_10
To place an XML structure on a page, there are a number of conventions that can be used in InDesign. Our example will use the XML root element as part of a place XML command. The statement places the imported XML on the first page of the document. As a consequence of the place XML statement, the XML will be placed in a text frame positioned one-half inch from the left and top of the page (place point). A reference to the text frame created is placed in the variable frameRef.
--after the import statement inside the tell docRef code block
        set rootElement to XML element 1
    end tell --ends tell docRef block
    tell page 1 of docRef
        set frameRef to place XML using rootElement place point {".5 in", ".5 in"}
    end tell
To style the document text, the script will associate paragraph style names with the text contained by the XML elements using the name of their tags. Another handler will accomplish this task. It takes advantage of the fact that the names of the paragraph styles in the document are the same as the XML tags.
Notice that the call to the handler begins with the reserved word my. This is required as the call is inside a tell statement to the application. The handler associates the paragraph styles to the XML element contents by creating an XML import map. Once the map is created, the script executes the map XML tags to styles method.
--call mapTags handler passing a reference to the document
my mapTags(docRef)

(*Associates tags in document's list of XML tags with corresponding Paragraph styles
Uses XML tag to style to map styles.*)
on mapTags(docRef)
    tell application "Adobe InDesign CC 2015"
        tell docRef
            set tagList to XML tags
            repeat with i from 1 to length of tagList
                set tagName to name of item i of tagList
                if (exists paragraph style tagName) then
                    set styleRef to paragraph style tagName
                    make XML import map with properties {mapped style:styleRef, markup tag:item i of tagList}
                end if
            end repeat
            map XML tags to styles
        end tell
    end tell
end mapTags
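The matching step inside mapTags does not depend on InDesign: a tag is mapped only when a paragraph style of exactly the same name exists. As a rough illustration in Python (not part of the AppleScript), the same logic might look like the following. The function name match_tags_to_styles is hypothetical, and the tag list is an assumption based on the structure and template styles described in this post.

```python
def match_tags_to_styles(tag_names, style_names):
    """Pair each XML tag name with the paragraph style of the same name.

    Tags with no identically named style are skipped, mirroring the
    'if (exists paragraph style tagName)' test in the handler.
    """
    available = set(style_names)
    return {tag: tag for tag in tag_names if tag in available}


# Assumed tag names, taken from the structure shown earlier.
tags = ["Root", "Head", "Text_10", "table", "td"]

# Paragraph styles the template is expected to contain.
styles = ["Head", "Text_10", "Label", "Table_Head", "Table_Body"]

print(match_tags_to_styles(tags, styles))
# {'Head': 'Head', 'Text_10': 'Text_10'}
```

Because the comparison is an exact name match, a tag spelled or capitalized differently from its paragraph style is silently left unmapped.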
All that is left for the script to do is to create and style the table. To do so, the script first needs to get a reference to the XML element whose markup tag is named “table”. For this we will use a handler to identify the element. It is important to note here that XML elements do not have a name property. XML elements are referenced in order as child elements in the XML structure. This is similar to items within a list (or list of lists). The following handler accomplishes this task by looping through the elements found in the first level of XML elements (child elements of Root). When the required element is found, the loop exits. If the table element were nested at another level in the structure, the handler would need to be written much differently. Remember that the name to associate with our table XML element (elementName) and the name of the table style (tableStyleName) were defined at the top of the script.
--place this code before the last end tell statement
    tell docRef
        set tableElement to my getXMLElement(rootElement, elementName)
    end tell
    set tableStyleRef to table style tableStyleName of docRef
    --get width of table's container to set width of the table
    copy geometric bounds of frameRef to {fy0, fx0, fy1, fx1}
    set frameWidth to fx1 - fx0
    --convert the contents of the XML "table" element to a table
    tell tableElement to convert element to table row tag "tr" cell tag "td"
    tell tables of frameRef
        set row type of row 1 to header row
        set applied table style to tableStyleRef
        set width to frameWidth
        tell cells to clear cell style overrides
    end tell

(*Returns reference to XML element tagged with value of variable elementName*)
on getXMLElement(rootElement, elementName)
    set foundElement to missing value
    tell application "Adobe InDesign CC 2015"
        repeat with i from 1 to count of XML elements of rootElement
            if name of markup tag of XML element i of rootElement is elementName then
                set foundElement to XML element i of rootElement
                exit repeat
            end if
        end repeat
        if foundElement = missing value then
            error "Element " & elementName & " was not found"
        else
            return foundElement
        end if
    end tell
end getXMLElement
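To make the difference between a first-level search and a nested search concrete, here is a small sketch in Python using the standard xml.etree.ElementTree module. It is an illustration only, not InDesign code. The function names and the sample XML string are hypothetical stand-ins for the structure discussed above.

```python
import xml.etree.ElementTree as ET


def get_first_level_child(root, tag_name):
    """Return the first direct child of root tagged tag_name.

    Mirrors the getXMLElement handler: only first-level children are
    examined, and an error is raised when nothing matches.
    """
    for child in root:
        if child.tag == tag_name:
            return child
    raise LookupError("Element " + tag_name + " was not found")


def find_at_any_depth(root, tag_name):
    """Return the first element tagged tag_name at any nesting level, or None."""
    return next(root.iter(tag_name), None)


# Hypothetical stand-in for the imported XML structure.
sample = (
    "<Root><Head>Title</Head><Text_10>Intro</Text_10>"
    "<table><tr><td>A</td><td>B</td></tr></table>"
    "<Text_10>End</Text_10></Root>"
)
root = ET.fromstring(sample)

print(get_first_level_child(root, "table").tag)  # table
print(find_at_any_depth(root, "td").text)        # A

try:
    get_first_level_child(root, "td")  # td is nested, not a direct child
except LookupError as err:
    print(err)                         # Element td was not found
```

The first function behaves like the handler above and fails for anything below the first level. The second walks the whole tree, which is the kind of change a nested table element would call for.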
Final document
Now that you have the pieces to the puzzle, see if you can put them together to create a real working script. Be sure to add a try/on error statement block to catch errors that will occur if the user clicks Cancel in response to the choose file methods. You will also need to trap the error generated in the getXMLElement handler (if the XML element is not found).
Yes, scripts such as this can get a little involved, but if you let handlers take care of commonly used functionality, your efforts will be amply rewarded as you are able to use these handlers for any number of scripts. When working with databases, nothing beats working with XML (except maybe JSON but that works with JavaScript and is not supported by InDesign). If you do have data stored as JSON, you can use JavaScript (and a number of other scripting languages) to convert it to XML.
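As one possible illustration of the JSON-to-XML conversion mentioned above, here is a minimal sketch in Python using only the standard json and xml.etree.ElementTree modules. The function name records_to_xml, the record tag, and the sample data are all hypothetical. The sketch assumes the JSON is a flat list of objects whose keys are valid XML tag names.

```python
import json
import xml.etree.ElementTree as ET


def records_to_xml(json_text, root_tag="Root", record_tag="record"):
    """Convert a JSON list of flat objects into an XML string.

    Each object becomes one record element, and each key becomes a
    child element whose text is the corresponding value.
    """
    records = json.loads(json_text)
    root = ET.Element(root_tag)
    for record in records:
        record_el = ET.SubElement(root, record_tag)
        for key, value in record.items():
            field = ET.SubElement(record_el, key)
            field.text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical data whose keys match the paragraph style names used in this post.
sample_json = '[{"Head": "Widgets", "Text_10": "Price list"}]'

print(records_to_xml(sample_json))
# <Root><record><Head>Widgets</Head><Text_10>Price list</Text_10></record></Root>
```

Naming the JSON keys after the paragraph styles means the resulting XML could be mapped to styles in the same way as the tags in the sample file.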