Automatically tagging existing documents using a Regular Expression

EasyCatalog can automatically tag fields in existing documents by looking for patterns in the text defined by a regular expression.

This feature – Auto Pickup – is designed to mark up documents that have already been created (either by hand or by another automation method) and is useful, for example, for updating prices. Auto Pickup works with documents where fields have been placed into a text flow. For documents created using tables, use the Cell Finder function instead.

To use Auto Pickup, the document needs to have been constructed with a consistent, identifiable pattern. For example, in the following document each code can be identified by its format (three characters, a dash, three characters) and the price is always preceded by a dollar sign:

Auto Pickup can be accessed using a keyboard shortcut:

  • Define a keyboard shortcut by using the Edit > Keyboard Shortcuts… menu option in InDesign.
  • On the Keyboard Shortcuts dialog, change the Product Area pop-up to be ‘EasyCatalog’.
  • Select Auto Pickup in the Commands list.
  • Click in the New Shortcut text box at the bottom of the dialog.
  • Press the key – or combination of keys – you would like to assign to this action, such as F5.
  • Press the Assign button.  You may be warned about modifying the default set, so answer ‘Yes’ on the dialog that appears to create a new keyboard shortcut set.
  • Click OK to close the dialog.

To run Auto Pickup, select any field in your data panel (so that EasyCatalog knows which data source you need to link to) and press the keyboard shortcut. The Auto Pickup dialog should appear:

The regular expression which defines the text pattern in the document (and where fields appear) should now be entered into the top part of the dialog, prefixed with ‘REGEX:’.  The format of this regular expression may be different for each type of document you’re attempting to mark up.

Notes:

  • The regular expression must be prefixed with REGEX:
  • Fields are identified inside of regular expression ‘capturing groups’
  • The name of the field is specified using a named capturing group, written as (?<name of field>pattern)
  • If the text in your document contains reserved regular expression characters, such as $, remember to escape them by prefixing them with a backslash.
  • To accurately determine the record to link to, all fields that are defined as key fields in your data source need to be specified somewhere in the regular expression. If your key field is not in the document, another field can be used to identify the record instead. The field to search for should be prefixed with ‘SF:’ – e.g. (?<SF:SKU>[0-9A-Z]{5}).
  • By default, EasyCatalog runs the regular expression against the entire text content. To evaluate one paragraph at a time, append ‘/m’ to the regular expression.
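
As an illustration of the notes above, the following Python sketch builds an expression for the example layout described earlier (a code of three characters, a dash and three characters, followed later on the line by a price with a dollar sign) and tries it against some sample text. The field names ‘Code’ and ‘Price’, the sample text and the helper functions are all hypothetical, and the character classes are only one reading of ‘three characters’. Python's re module is not the engine EasyCatalog uses and writes named groups as (?P<name>…), so the helper converts the syntax and the result should be treated as an approximate check only.

```python
import re

# Hypothetical expression as it might be typed into the Auto Pickup dialog.
# 'Code' and 'Price' are assumed field names; use the names from your own data source.
# Note the escaped dollar sign (\$), since $ is a reserved regex character.
EASYCATALOG_EXPRESSION = (
    r"REGEX:(?<Code>[A-Z0-9]{3}-[A-Z0-9]{3}).*?\$(?<Price>[0-9]+\.[0-9]{2})"
)


def to_python_pattern(expression: str) -> str:
    """Strip the 'REGEX:' prefix and rewrite (?<name> groups as Python's (?P<name>."""
    prefix = "REGEX:"
    body = expression[len(prefix):] if expression.startswith(prefix) else expression
    # Leave lookbehinds such as (?<= and (?<! untouched.
    return re.sub(r"\(\?<(?![=!])", "(?P<", body)


def find_fields(expression: str, text: str) -> list:
    """Return one dict of field name -> matched text for every match in the text."""
    pattern = re.compile(to_python_pattern(expression))
    return [match.groupdict() for match in pattern.finditer(text)]


# Made-up sample content standing in for the text flow of a document.
sample_text = "ABC-123 Blue widget $12.99\nXYZ-789 Red widget $4.50\n"

for fields in find_fields(EASYCATALOG_EXPRESSION, sample_text):
    print(fields)
# {'Code': 'ABC-123', 'Price': '12.99'}
# {'Code': 'XYZ-789', 'Price': '4.50'}
```

Each dictionary corresponds to one place in the text where Auto Pickup would be expected to find a pair of fields. Group names containing the ‘SF:’ prefix are not valid group names in Python, so this sketch cannot be used to check expressions that rely on search fields.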

Sites such as regexr.com are useful when defining the regular expression format as they allow you to see the effect of the expression in real time.

Once you click OK on the dialog, EasyCatalog will run the regular expression on the selected document content (or on the entire document if there is no selection).