WordPress to Indesign: The final countdown

We’ve had a lot of requests to open source the WP Browser, the final piece of getting a post from Google Docs to WordPress and into InDesign. We hadn’t done so until now because it was a plugin that only worked on Windows and, well, was pretty bad.

Over the last few weeks, I’ve been rewriting WP Browser using InDesign’s JavaScript API. It runs as a script, not as a plugin, but it works on both Windows and Mac, it’s lightweight and it doesn’t need anything extra (like Adobe AIR) to run. And now it’s open source.

If you want to skip to the end, you can download the Indesign plugin as well as an associated WordPress plugin at https://github.com/bangordailynews/WordPress-to-InDesign.

Also, shameless plug: If this looks cool and you want to build more things like it, we’re hiring.

We’ve been using the WordPress plugin for quite a while now, and testing the Indesign plugin for a little while. They work pretty well.

The WordPress plugin allows you to map various HTML tags to InDesign paragraph and character style tags. The easiest way to start is to create a text box in InDesign containing all the styles you want to map, then export the text in that box as InDesign Tagged Text (File -> Export, then select InDesign Tagged Text. Abbreviated is fine.)

Tagged text is just a super easy way to tell InDesign which paragraph style each block of text should use. A paragraph style generally appears as <pstyle:Paragraph Style Name>. Things like bolding and italics generally appear as <ct:Bold> and <ct:Italic>, with an empty <ct:> switching the character style back off. A really simple tagged text export looks like this (I deleted the paragraph style definitions at the top for clarity):

<ASCII-MAC>
<pstyle:Body Text>This is some test text
<pstyle:Article Subheading>This is an article heading
<pstyle:Article Subheading>Still an article heading
<pstyle:Body Text>Some test text with <ct:Bold>some bolding<ct:> and <ct:Italic>italics<ct:>.

Pretty simple, right?
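
To make the mapping concrete, here is a minimal JavaScript sketch that turns a few simple HTML tags into tagged text like the example above. It is an illustration only, not the plugin’s code (the plugin does this work in PHP). The function name, the two lookup tables and the choice of "\n" as the line separator are all assumptions, and the style names are just the ones from the example.

```javascript
// Hypothetical mapping from HTML block tags to InDesign paragraph style names.
// The style names come from the example above; yours will differ.
var PARAGRAPH_STYLES = { p: "Body Text", h2: "Article Subheading" };

// Hypothetical mapping from inline HTML tags to character styles.
var CHARACTER_STYLES = { b: "Bold", strong: "Bold", i: "Italic", em: "Italic" };

function htmlToTaggedText(html) {
  var lines = [];
  // Match each simple, attribute-free block element: <p>...</p> or <h2>...</h2>.
  var blockPattern = /<(p|h2)>([\s\S]*?)<\/\1>/gi;
  var match;
  while ((match = blockPattern.exec(html)) !== null) {
    var tag = match[1].toLowerCase();
    var body = match[2]
      // Opening inline tag becomes <ct:Style>.
      .replace(/<(b|strong|i|em)>/gi, function (whole, name) {
        return "<ct:" + CHARACTER_STYLES[name.toLowerCase()] + ">";
      })
      // Closing inline tag becomes an empty <ct:>, which ends the character style.
      .replace(/<\/(b|strong|i|em)>/gi, "<ct:>");
    lines.push("<pstyle:" + PARAGRAPH_STYLES[tag] + ">" + body);
  }
  // Assumption: one paragraph per line, separated by "\n", after the header line.
  return ["<ASCII-MAC>"].concat(lines).join("\n");
}
```

Calling htmlToTaggedText("<p>Some <b>bold</b> text.</p>") gives the <ASCII-MAC> header line followed by <pstyle:Body Text>Some <ct:Bold>bold<ct:> text. Real posts contain attributes, nested markup and entities, so a regex pass like this is only good for showing the idea.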

By default, the WordPress plugin will save off each post as tagged text in /wp-content/uploads/indesign/. The files are named post_id.txt, and you can use rsync and a cron job to sync them locally if you wish. The tagged text is also available through a simple JSON API that the InDesign plugin uses.

The thing I keep calling an Indesign plugin is, as I mentioned, actually an Indesign script. To install, open up the scripts pane in Indesign. (If it’s hidden you can show it by going to Window -> Utilities -> Scripts.) Right click on the Application folder, then click Reveal in Finder and navigate into the Scripts Panel folder in the window that opens up. Then drag and drop WP Browser.jsx into the folder. When you go back to Indesign, WP Browser will show up under the Application folder in the scripts pane. Double click to open the browser.

WP Browser lets you run a full-text search on your WordPress install. You can modify the script either to read the post_id.txt files we talked about above from a local location or to dynamically create a local file with the tagged text each time you hit import.

[Screenshot: WordPress_Browser]

By default, the filter list is populated by categories. I’ll tell you how to modify it below (we’ve modified it to populate from a list of the paper’s sections, and the stories are categorized to go into each section).

Before you click import, you must have a text box selected. If you have the entire text box selected, the story will replace everything in that box. If you have a range of text selected, it will replace that selection. Otherwise it will insert the text where your cursor is.
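
Those three cases can be sketched as a small decision function. This is a hedged illustration rather than the code in WP Browser.jsx: the function name and the returned labels are made up, and it assumes you pass in the constructor name of the first selected item (inside InDesign that would typically be something like app.selection[0].constructor.name).

```javascript
// InDesign text object types that count as "a range of text is selected".
var TEXT_RANGE_KINDS = [
  "Character", "Word", "Line", "Paragraph",
  "TextColumn", "Text", "TextStyleRange"
];

// Decide how an imported story should be placed for a given selection kind.
// Returns one of three hypothetical mode labels, or null if nothing usable
// is selected and the import should be refused.
function importModeFor(selectionKind) {
  if (selectionKind === "TextFrame") {
    return "replace-frame-contents"; // whole text box selected
  }
  if (selectionKind === "InsertionPoint") {
    return "insert-at-cursor"; // blinking cursor, nothing highlighted
  }
  for (var i = 0; i < TEXT_RANGE_KINDS.length; i++) {
    if (TEXT_RANGE_KINDS[i] === selectionKind) {
      return "replace-selection"; // some text highlighted
    }
  }
  return null; // no text box involved, so there is nowhere to import to
}
```

For example, importModeFor("Word") returns "replace-selection", while importModeFor("Rectangle") returns null.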

Both the WordPress plugin and Indesign script require a bit of modification:

  • WordPress plugin
    • Set an API key at the top of the file. This will be used in the Indesign script and ensures no unauthorized access to all your unpublished posts.
    • in function do_tagged_text:
      • You can set “formats” for the story to decide which paragraph styles stories are generated with. By default, the post meta key for that format is _format. You can change that if you want.
      • We also use a post meta field to override the author attached to the post (for example, for one-time contributors). By default, that field is _byline.
      • The plugin integrates with Co-Authors Plus. It also allows you to identify a meta entry for the user that will display as their “title” (ours is BDN Staff, by default).
      • We strip out all HTML that’s not a heading, a paragraph tag, a list, or text styling. You can be more or less strict.
      • Where we start replacing p, b, em, etc. tags, the plugin by default maps bold and italic text to the character styles <ct:Bold> and <ct:Italic>. If you use custom fonts, you might have to change this.
      • We also strip the state out of datelines if it’s local. (See line ~180)
      • Around line 200, we start converting headings to paragraph styles. After we’re done with that style, you need to change back to the default style.
      • Then, around line 275, you’ll want to set the paragraph styles for byline and the default paragraph style. There’s also an example of how to change the styles based on format. (This could be coded better.)
    • in function wp_browser_search
      • By default, you can filter posts in the WP Browser based on category. To change this to a different taxonomy, you’ll need to change the get_categories() call ~310 as well as the category_name arg in get_posts ~335.
      • By default, we only query for posts with status publish, draft and pending. If you want to expand or limit that, you can do so ~330.
  • WP Browser
    • On line 3, set the API key you set in the WordPress plugin
    • On line 4, set the domain of your website, no leading http:// (just example.com). The plugin doesn’t currently query over https.
    • On line 8, decide whether you want to import from post_id.txt files saved locally or from a file created dynamically.
    • On lines 13 and 15, set the path to the files above on Mac and Windows, respectively.
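
As a rough picture of what the WP Browser settings amount to, here is a JavaScript sketch. Every name in it (config, useSyncedFiles, macPath, windowsPath, normalizeDomain, localFilePath) and every path is hypothetical; check the top of WP Browser.jsx for the real variable names. The only things taken from this post are the post_id.txt naming and the rule that the domain has no leading http://.

```javascript
// Hypothetical settings object; the real script sets plain variables near the top.
var config = {
  apiKey: "change-me",            // must match the key set in the WordPress plugin
  domain: "example.com",          // no leading http://
  useSyncedFiles: true,           // true: read post_id.txt files synced locally
  macPath: "/Volumes/indesign/",  // placeholder path on Mac
  windowsPath: "Z:\\indesign\\"   // placeholder path on Windows
};

// Strip a leading http:// or https:// and any trailing slashes from a domain.
function normalizeDomain(domain) {
  return domain.replace(/^https?:\/\//i, "").replace(/\/+$/, "");
}

// Build the path to the tagged text file for one post on the current platform.
function localFilePath(settings, postId, isWindows) {
  var base = isWindows ? settings.windowsPath : settings.macPath;
  return base + postId + ".txt";
}
```

With these placeholder values, localFilePath(config, 42, false) is "/Volumes/indesign/42.txt", and normalizeDomain("http://example.com/") is "example.com".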

One last thing:

When InDesign makes a call to the server, it does so by opening a socket connection and then requesting the path, and as far as I can tell it doesn’t pass along your site’s hostname.

In short, your server will see a request come in for localhost/wp-admin/admin-ajax.php?etc

So, especially on multisite and possibly on regular WordPress, you’ll need to set the host for it to work.

I did this by adding the following line to wp-config.php. There’s probably a better way to do it:

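// Requests from WP Browser arrive without a usable host, so set it for those actions only.
$wp_browser_actions = array( 'wp-browser-search', 'wp-browser-notify' );
if ( ( empty( $_SERVER['HTTP_HOST'] ) || 'localhost' === $_SERVER['HTTP_HOST'] ) && ! empty( $_GET['action'] ) && in_array( $_GET['action'], $wp_browser_actions, true ) ) {
    $_SERVER['HTTP_HOST'] = 'mysite.com';
}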
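
For background on why the host can go missing: over a raw socket, the web server only learns which site you want from a Host header in the request text itself. The JavaScript sketch below builds such a request by hand. It is an illustration under assumptions, not what WP Browser.jsx does: the function name is hypothetical, and only the admin-ajax.php path and the wp-browser-search action come from this post.

```javascript
// Build the text of a minimal HTTP/1.0 GET request that names the site explicitly.
// Without the Host line, a server with several virtual hosts has to guess
// which site the request is for.
function buildRequest(host, path) {
  return "GET " + path + " HTTP/1.0\r\n" +
         "Host: " + host + "\r\n" +
         "Connection: close\r\n" +
         "\r\n"; // a blank line ends the request headers
}

// Example: the search call, addressed to a placeholder domain.
var exampleRequest = buildRequest(
  "example.com",
  "/wp-admin/admin-ajax.php?action=wp-browser-search"
);
```

If a request like this reaches the server with the right Host line, the wp-config.php workaround above should not be needed, but that has not been tested against the plugin.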

I just ripped a lot of this out of the BDN site and took a lot of our customization out. It will definitely require customization. It might break. Leave a comment below if you have a question. Email me at wdavis@bangordailynews.com if I did something really stupid.


16 Responses to WordPress to Indesign: The final countdown

  1. Christian D. says:

    Gonna be giving this a try this week if I can carve out some time!

    BTW what do you guys use for hosting? We are currently using a third-party newspaper host company and I’ve been thinking hard about us moving off to our own hosting but don’t want the sys. admin issues. Just curious.

    Next on your list should be classifieds and re-thinking that pig. I won’t name the company but we’ve been on Quark 4.1 (circa. 1999) for our liners layout. For nearly 3 years now they have been promising flowing classifieds in InDesign…well they got it now but it’s highway robbery IMO.

    I’ll let you know if I have any issues with this…probably a few questions more than anything.

    • William P. Davis says:

      Christian,
      We host with Firehost. They’re quite good, but you still have to manage the servers yourself in a lot of cases. If you’re on WordPress, I would recommend WP Engine — they’ll handle everything for you, from basic server stuff all the way down to WordPress upgrades and site caching.

      Definitely interested in rethinking classifieds — not just the software, but the model, too.

      Will

  2. Kevin says:

    Hey Will,

    I’m getting this error:
    JavaScript Error!
    Error Number: 61
    Error String: Unclosed token
    Engine: Session
    File: /Volumes/25
    Line: 10
    Source: )
    Offending Text: <

    Any idea what might be causing this? The search endpoint seems to be correct, when going to it from a web browser.

    • William P. Davis says:

      Kevin,
      It looks like you might be missing a semicolon in your script. Specifically, make sure there is a semicolon at the end of the line that starts var placeFromServer

      • Kevin says:

        Hi Will,

        I went through it and couldn’t find any issues with semicolons.

        Here’s my script – I don’t think I really modified anything in it.

        http://pastebin.com/PLp2QGpV

        • William P. Davis says:

          Ah, I see what’s going on.

          One of the annoying things about this plugin I haven’t been able to figure out a fix for is that InDesign apparently makes a socket connection to the server (in this case, ndsmcobserver.com), and then requests localhost/wp-admin/admin-ajax.php…

          You can test whether it’s going to work by running curl localhost on your server — it should return the HTML for the website. Right now it seems to be returning a 404 page (Apache doesn’t know what it’s looking for).

          You can solve this by either making the virtualhost serving your website the default connection if no host is defined, or by letting apache listen on a separate port and using that port only for WP Browser calls. This guy had the same problem and proposed that fix, which has now been merged back into the plugin.

          • Kevin says:

            Hi Will,

            Makes sense! It’s working now. Thank you so much for your help.

            What I did is I set that vhost to be default, and I actually had one explicitly defined for localhost that goes nowhere, which I’ve removed.

            Also, I changed the port to :8080, because that’s what Apache’s running as – I’m running varnish on 80.

            Thanks again, Will! Plugin looks great.

          • William P. Davis says:

            Excellent, glad you got it working!

          • Kevin says:

            Actually, it appears that actually importing isn’t working. When I have placeFromServer set to true, it says that “does not exist. Ensure the drive is mapped correctly”. I assume this is the setting you’re talking about above about setting up rsync or something to copy files locally.

            Is there a way to have it pull directly from the server in real time? Setting placeFromServer to false (which seems counter-intuitive) does nothing when I press import.

          • William P. Davis says:

            Kevin,
            I have rsync set up to run every minute, which is pretty close to real-time. My crontab file looks like this:

            */1 * * * * /path/to/indesign.sh

            And indesign.sh looks like this:

            #!/bin/bash
            # Sync the whole directory (trailing slash, no wildcard) so --delete can remove stale files
            rsync -av -e "ssh -i /path/to/rsync-key" --delete me@myserver:/path/to/wp-content/uploads/indesign/ /path/to/local/files/

            That works pretty well.

            I would like to know why the import isn’t working from the website, though. Can you shoot me your api key in an email? wdavis@bangordailynews.com.

            Will

          • Kevin says:

            Hey Will,

            I sent you my API key in an email. I was trying to debug the InDesign script, I found that it doesn’t get into the “for( var index in searchResults.data ) {” loop. That is, an alert that’s put just before that shows up, but one put inside it or anywhere after does not.

            Can’t really see why, but perhaps that might help?

          • William P. Davis says:

            Thanks Kevin. Good catch — the searchResults object wasn’t global, so it couldn’t be accessed to place the file. I’ve updated WP Browser.jsx in the repo.

            One note: In my test, the file came in as plain text, as opposed to InDesign Tagged Text. That’s because I’m on a Mac, so when the script creates the local file it’s created with Mac line endings, rather than Windows line endings. If you’re using Macs to lay out with InDesign, you’ll probably want to change the line endings declaration on line 221 of bdn-indesign.php to <ASCII-MAC>. If you’re using both platforms, it would probably be better to save the files to a local server and use that option.

          • Kevin says:

            Hey Will,

            Thanks for your help! Problem I’m having now is that it just inserts “null” instead of the actual content. Any ideas?

            Also, I think your line ending thing might have been filtered out by the comment system – what should it be for Macs? I changed it to “\ n” without the space — is that right?

          • William P. Davis says:

            Just updated the comment to fix the encoding issue. What was the null problem?

      • Kevin says:

        Nevermind, I fixed the problem!

  3. mike Smith says:

    I’m getting this same error:
    JavaScript Error!
    Error Number: 61
    Error String: Unclosed token
    Engine: Session
    File: user folder on computer
    Line: 5
    Source: )
    Offending Text: <

    am wondering why is it looking on the computer for the file? what dont i have set right? i can send java code too if necessary
