Your prescription for increased productivity and profitability
I have always been an avid supporter of InCopy. I love its workflow options with InDesign. But over the past year or so, I have been presented with a number of workflow situations that involved Google Docs. Google Docs is cool and it’s free. What’s more, people are using it. So I decided I had better ramp up on this side of the publishing equation. And, believe me, I’m still working on it and don’t pretend to have all the answers.
Recently I had the opportunity to work on a book that was written in Google Docs. In the process of working on the book, I experimented with a few ideas to automate processes with scripts. Some ideas I felt were worth sharing.
It looks like Google Docs is HTML (or XML) pure and simple; and it’s live. That is the fun part: You never “save” your document.
If you have the opportunity, export a document (File > Download as Web Page > HTML, zipped). Unzip the download and open the HTML file in a text editor. This is not for the “Learn HTML in 24 Hours” student. But with patience, and a mental filter that blocks out the “spans” before you go blind, you can get an idea of how Google Docs does it–at least for HTML.
The neat thing about the HTML download is that all of the inline images get downloaded into their own separate folder. So, no matter what you do next with the document, you have the images. But, of course, the user most likely did not optimize the images (resampled to the correct dpi, cropped, or processed otherwise).
It would be nice if you could just change the file type designation for the file downloaded (change HTML to XML) and simply import it into InDesign. According to many articles you find on the web, it should just “work.”
Problem is, File > Import XML… will probably raise an error. Getting a message like “Expected end of tag ‘meta’ Line:3, Column 4574” makes this process look a little unfavorable. Even for those who know HTML, I don’t think hunting down character 4574 in the third line of text is something even the most astute HTML jockey would want to tackle. (Maybe this person also has the tools that will spot and fix the errors.)
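One likely reason for that error is that HTML allows tags such as meta to stay unclosed, while XML does not. The following is a minimal Python sketch of that difference, not anything InDesign itself runs; the two sample strings and the xml_error helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

def xml_error(markup):
    """Return None if markup parses as XML, otherwise the parser's message."""
    try:
        ET.fromstring(markup)
        return None
    except ET.ParseError as err:
        return str(err)

# Acceptable HTML, but <meta> is never closed, so it is not well-formed XML.
html_style = '<html><head><meta charset="utf-8"></head><body><p>Hi</p></body></html>'
# The same markup with a self-closed <meta/> parses cleanly.
xml_style = '<html><head><meta charset="utf-8"/></head><body><p>Hi</p></body></html>'

print(xml_error(html_style))  # a parser message describing the tag mismatch
print(xml_error(xml_style))   # None
```

Renaming the file does nothing to the markup inside it, which is why the import can still fail.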
Short of this, it is possible for a small shop to put together a fairly nice workflow using a script or two.
This is what I did for my book project.
A template was created in InDesign for the final document.
A document was created from the template (I used my “docFromTemplate” script).
The document, downloaded as Microsoft Word, was imported into InDesign. For this there are two options: set the import options by hand, or use a script.
Make sure Show Import Options is checked in the system Open dialog (File > Open in InDesign).
Set import options as shown below, or use your own (or maybe a Preset).
Setting Microsoft Word import options
If the styles used in the Google Doc don’t match the styles used in your template, make sure to check Custom Style: Import Style Mapping… so you can map the styles.
Mapping styles in InDesign
I decided to use a script. From my AppleScript “PlaceMicrosoftWord_Style” script the handler that takes care of import preferences is shown below:
on setMWordPrefs()
    tell application "Adobe InDesign CC 2015"
        tell word RTF import preferences
            set convert bullets and numbers to text to false
            set remove formatting to false
            --set convert tables to Unformatted Table
            set import endnotes to true
            set import footnotes to true
            set import index to true
            set import TOC to true
            set import unused styles to false
            set preserve graphics to false
            set preserve local overrides to false
            set resolve character style clash to resolve clash use existing
            set resolve paragraph style clash to resolve clash use existing
            set use typographers quotes to true
        end tell
    end tell
end setMWordPrefs
The idea here is that you don’t want the images to import with the document. It is better to make sure the images are optimized before they get placed. If you have set up the styles correctly in InDesign and you mapped the style names to those in Google Docs, the text for the document should end up looking near to perfect.
The dread Missing Fonts dialog
Chances are, even with all your preparation, you may get the Missing Fonts dialog when you import. You could have your script disable this dialog, but it is a good idea to take note of what is reported missing. Close the dialog if the missing fonts are not part of the original Google Doc styles. You will probably need to do some tweaking in the document, as users hardly ever get it one hundred percent right. Besides, things like tables and tabbed columns will definitely require some additional work.
Since the document I had to work with was one long text flow, I decided to import one long document into InDesign and let a script cut the resulting document into chapters after import. This was a fairly easy script to write since the first line of every chapter was set in a unique paragraph style (Title).
The script first searched for each occurrence of the Title paragraph style in the document. This produced a list of text index references from the back of the document forward. Then, within a repeat loop the list was parsed:
The next challenge was to place the images.
The one step that is absolutely necessary for this workflow is to resample and/or resize images to fit the document’s criteria. Because I wanted to optimize the image quality as part of this step, I left this as pretty much a manual method. The images would be:
You can make minor changes to image files once placed in the InDesign document, but for the most part, images should be page-ready when imported.
Now for the main reason for this blog: the script for placing the images.
The major modules for this script were as follows:
Basic script setup – establish values for script. Variable values should be obtained from user with a custom dialog. Here they are just hard-coded.
set foundSet to {}
--minimum characters allowed to consider text a story
set minText to 1
--arbitrary number to add to top margin for testing if image is at top of page
set topTest to 20
--character style for image tags
set noPrintName to "NoPrint"
--objectStyles
set objStyleName to "ImageStyle"
set topObjName to "TopObjStyle"
--use paragraph style to align images horizontally
set paraAlignName to "AlignParagraph"
--will return list of stories from getStoryRef handler if set to true
set allowMultiple to false
Choose Folder – Have the user choose the folder that contains the image files:
try
    set dPath to choose folder with prompt "Select folder for images to place"
    set folderPath to dPath as string
on error errStr
    --give the user a message and exit the script
    display alert "No folder selected" message errStr
    return
end try
Document setup – Set application preferences and get document margins (right hand page)
tell application "Adobe InDesign CC 2015"
    set measurement unit of script preferences to points
    set preserve bounds of image preferences to true
    set docRef to document 1
    tell margin preferences of docRef
        copy {top, right, bottom, left} to {py0, px0, py1, px1}
    end tell
Story reference – Get a reference to your story
Your Google document should be one story. For some reason, when a Word document is placed in InDesign you may end up with more than one story. Only one of these stories will have any amount of text, so our handler should be able to take care of this.
Call to the handler:
--"my" is needed because the call sits inside the tell application block
set storyRef to my getStoryRef(docRef, minText, allowMultiple)
(*Parses through list of stories to get those not on master spread and whose length is greater than minText. If allowMultiple, a list is returned, otherwise item 1 of the test list is returned*)
on getStoryRef(docRef, minText, allowMultiple)
    tell application "Adobe InDesign CC 2015"
        set testList to {}
        set storyList to stories of docRef
        repeat with i from 1 to length of storyList
            try
                set storyRef to item i of storyList
                set testIt to text containers of storyRef
                set theTest to class of parent of parent page of item 1 of testIt
                if theTest is spread then
                    --keep the story itself, not the class that was tested
                    set end of testList to storyRef
                end if
            end try
        end repeat
        set lastTest to {}
        repeat with i from 1 to length of testList
            --check each candidate story, not just the last one from the loop above
            set storyRef to item i of testList
            set storyLength to length of storyRef
            if storyLength > minText then
                set end of lastTest to storyRef
            end if
        end repeat
        if allowMultiple then
            return lastTest
        else
            return item 1 of lastTest
        end if
    end tell
end getStoryRef
For simplicity, the script assumes that allowMultiple is false, so only one story is returned from the handler. Otherwise you would need to repeat through the stories if a list of more than one is returned.
Image tag References – Get a list of references for the image tags in the document
You can use find grep to get this list. For this I used the following:
set find grep preferences to nothing
set change grep preferences to nothing
set find what of find grep preferences to "\\[image[0-9]*(\\.[a-z]+)\\]"
tell storyRef
    set foundSet to find grep with reverse order
end tell
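Before running the search in InDesign, it can help to check what the pattern will and will not catch. This is a small Python sketch for that purpose; Python's re module is not InDesign's GREP engine, but for a pattern this simple the two should agree. The sample text and tag names are invented.

```python
import re

# Same pattern as the find grep string, without AppleScript's doubled backslashes.
TAG_PATTERN = re.compile(r"\[image[0-9]*(\.[a-z]+)\]")

# Hypothetical story text with four bracketed tags.
sample = "Intro [image1.png] more text [image12.jpg] and [image.PNG] and [figure3.png]"

matches = [m.group(0) for m in TAG_PATTERN.finditer(sample)]
print(matches)
```

Only the first two tags match. The pattern expects a lowercase extension and the literal word "image", so writers need to type the tags consistently in the Google Doc.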
Notice that the list is returned in reverse order. This is important. Any time a script relies on object indexes, always work the document in reverse order. Otherwise, indexes will change and you will most likely end up with a mess.
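To see why the order matters, here is a rough Python sketch that swaps tags for a marker using character offsets recorded up front, much as a script holds on to text indexes. The replace_tags function, the marker, and the sample string are all illustrative assumptions.

```python
import re

TAG = re.compile(r"\[image[0-9]*\.[a-z]+\]")

def replace_tags(text, reverse):
    """Replace each tag with a marker, using offsets collected before any change."""
    spans = [m.span() for m in TAG.finditer(text)]
    if reverse:
        spans.reverse()  # work from the back of the text forward
    for start, end in spans:
        text = text[:start] + "<IMG>" + text[end:]
    return text

story = "One [image1.png] two [image2.png] three"
back_to_front = replace_tags(story, reverse=True)
front_to_back = replace_tags(story, reverse=False)
print(back_to_front)
print(front_to_back)
```

Working back to front, both tags are replaced cleanly. Working front to back, the first replacement shortens the text, so the stored offsets for the second tag point at the wrong characters.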
If foundSet contains at least one item, repeat through the set and place the images. At first I was going to delete the image tags after placing the images, but after some trial and error I decided to just change the color of the text to nothing. This way you can use the style attributes of the image placement tags to tweak the positioning of the image in relation to the caption or the text that follows.
Insertion Point – Establish Insertion Point for Placement
Use the index of the image tags to get a reference to the insertion point to use as the object for placing:
repeat with j from 1 to length of foundSet
    try
        set testIt to item j of foundSet
        set applied character style of testIt to charStyleRef
        set insIdx to index of testIt
        --storyRef is the story returned by getStoryRef
        set insertRef to insertion point (insIdx) of storyRef
Get the name of the file to place from the image tag
        set imageText to contents of testIt
        set textLength to length of imageText
        set fileName to text 2 thru -2 of imageText
        set filePath to folderPath & fileName
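The file name step simply drops the opening and closing brackets from the tag and appends what is left to the folder path. A Python sketch of the same idea follows; the image_path function and the folder path are hypothetical, and the path is written in the colon-delimited style you get when a chosen folder is coerced to a string.

```python
def image_path(folder_path, tag_text):
    """Strip the surrounding brackets from the tag and append the name to the folder path."""
    file_name = tag_text[1:-1]  # same idea as "text 2 thru -2"
    return folder_path + file_name

# Hypothetical folder path; a folder coerced to a string ends with a colon.
folder = "Macintosh HD:Users:me:BookImages:"
result = image_path(folder, "[image12.jpg]")
print(result)
```

Because the folder string already ends with a colon, plain concatenation gives a usable file path.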
Place the image – Have the insertion point defined above place the image
        tell insertRef
            set placedList to place file filePath
            set imageRef to item 1 of placedList
        end tell
Apply object styles – (you may have other tests you will want to add)
        set containerRef to parent of imageRef
        set applied object style of containerRef to objStyleRef
        if item 1 of geometric bounds of containerRef ≤ (py0 + topTest) then
            set applied object style of containerRef to topStyleRef
        else
            set applied object style of containerRef to objStyleRef
        end if
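The test above boils down to one comparison: if the top edge of the frame falls within topTest points of the top margin, the image is treated as sitting at the top of the page. Here is a minimal Python sketch of that decision; the pick_object_style function and the measurements are illustrative assumptions, with the style names borrowed from the setup values.

```python
def pick_object_style(top_edge, top_margin, top_test=20,
                      top_style="TopObjStyle", default_style="ImageStyle"):
    """Return the top-of-page style when the frame starts near the top margin."""
    if top_edge <= top_margin + top_test:
        return top_style
    return default_style

# Hypothetical measurements in points, with a 36 pt top margin.
near_top = pick_object_style(top_edge=50, top_margin=36)
mid_page = pick_object_style(top_edge=200, top_margin=36)
print(near_top, mid_page)
```

Raising or lowering topTest changes how close to the margin a frame must be before it gets the top-of-page style.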
End the repeat
Setting image placement tags to no print was key
This is quite the script. Putting it together should not be too hard for you to do as the majority of the code is supplied. The next time your client says that the text will come from Google Docs, you will have a workflow that should be pretty much automated. And, since you know how to write scripts, you are in control should you need to add some more functionality.