Extending ctags with xcmd

Maintainer: Masatake YAMATO <yamato@redhat.com>

xcmd means “External parser command”.

WARNING: To avoid running an unwanted COMMAND unexpectedly, the --xcmd-<LANG>=COMMAND option cannot be used in ./.ctags or ~/.ctags. However, this restriction is inconvenient when testing and developing an xcmd driver as described here. If you understand the risk, you can use --_allow-xcmd-in-homedir. By putting this option in /etc/ctags.conf or /usr/local/etc/ctags.conf, you can use --xcmd-<LANG>=COMMAND in ~/.ctags or ~/.ctags/*.

Basic usage

There are commands that generate a tags file specialized for a single language. CoffeeTags is an example: it handles CoffeeScript sources. It is written in Ruby, so we cannot merge its parser into ctags directly (remember that ctags is written in C). However, the tags file generated by CoffeeTags conforms to FORMAT <http://ctags.sourceforge.net/FORMAT>. This means we can reuse its output instead of reusing its parser source code.

With the new --xcmd-<LANG>=COMMAND option, ctags invokes COMMAND as an external parser command (xcmd) for input files written in LANG. ctags merges the output of COMMAND into the tags file.

By default, ctags searches for the xcmd COMMAND in the following locations, in this order:

  1. ~/.ctags.d/drivers/COMMAND
  2. /usr/libexec/ctags/drivers/COMMAND

These are called the built-in search path.

On GNU/Linux, more directories can be added with the CTAGS_LIBEXEC_PATH environment variable. As with CTAGS_DATA_PATH, several directories can be listed, separated by colons (:). When searching for COMMAND, ctags visits these directories before the built-in search path.

More search paths can be added with the --libexec-dir=DIR option. ctags visits DIR/drivers before the directories specified with CTAGS_LIBEXEC_PATH and before the built-in search path. If ctags cannot find COMMAND in any of these places, it treats COMMAND as an executable file and tries to run it.

To specify an executable file as COMMAND explicitly, use an absolute path (starting with /) or a relative path (starting with .).
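The lookup order described above can be sketched in sh. This is only an illustration, not the code ctags runs: find_driver is a hypothetical helper name, its second argument stands in for the DIR of --libexec-dir=DIR, and it assumes the directories listed in CTAGS_LIBEXEC_PATH are searched exactly as given, which the text above does not spell out.

```shell
#!/bin/sh
# Illustration only: mimics the documented lookup order.
# Usage: find_driver COMMAND [LIBEXEC_DIR]
find_driver () {
    cmd=$1
    libexec_dir=${2:-}

    # A path starting with / or . names the executable explicitly.
    case "$cmd" in
    /*|.*) printf '%s\n' "$cmd"; return 0 ;;
    esac

    # 1. DIR/drivers, where DIR comes from --libexec-dir=DIR.
    if [ -n "$libexec_dir" ] && [ -f "$libexec_dir/drivers/$cmd" ] \
        && [ -x "$libexec_dir/drivers/$cmd" ]; then
        printf '%s\n' "$libexec_dir/drivers/$cmd"
        return 0
    fi

    # 2. Directories in CTAGS_LIBEXEC_PATH, separated by ':'.
    #    Assumption: each directory is searched as given.
    old_ifs=$IFS
    IFS=:
    for dir in ${CTAGS_LIBEXEC_PATH:-}; do
        if [ -n "$dir" ] && [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$cmd"
            return 0
        fi
    done
    IFS=$old_ifs

    # 3. The built-in search path.
    for dir in "$HOME/.ctags.d/drivers" /usr/libexec/ctags/drivers; do
        if [ -f "$dir/$cmd" ] && [ -x "$dir/$cmd" ]; then
            printf '%s\n' "$dir/$cmd"
            return 0
        fi
    done

    # 4. Nothing found: COMMAND is treated as an executable file itself.
    printf '%s\n' "$cmd"
}
```

For example, find_driver coffeetags prints the first matching driver path, or coffeetags unchanged if no driver is found.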

Generally, an executable file should not be specified directly as COMMAND, because ctags requires very specific behavior (a protocol) from it. Generally available tags generators such as CoffeeTags do not conform to the expected protocol. Executables under the built-in search path are expected to fill the gap between such tags generators and universal-ctags. This is why the name drivers is used as part of the built-in search path.

To write a driver for a tags generator, please read “xcmd v2.1 protocol and writing a driver”.

xcmd v2.1 protocol and writing a driver

This is still experimental. The v1 protocol is obsolete.

list-kinds enumeration

ctags invokes the COMMAND specified with --xcmd-<LANG>=COMMAND at two stages: first when ctags starts and registers COMMAND for LANG in its internal database, and again when ctags asks COMMAND to parse an input file and generate tag information.

The first time, ctags calls COMMAND with the following command line:

$ COMMAND --list-kinds=LANG

ctags captures the exit status and standard output to learn the following two things:

  1. Whether COMMAND is available.
  2. The list of tag kinds recognized by COMMAND.

Availability is determined by the exit status of the COMMAND process; 0 means available. If the status is anything other than 0, the LANG parser is treated as disabled and warning messages are printed. 127 is a special value: the LANG parser is treated as disabled without warning messages. You can replace 127 with your own value using the notAvailableStatus flag, like:

... \
--xcmd-foo=./foo.sh{notAvailableStatus=42}
... \
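The way the exit status is interpreted can be summarized in a small sh function. It is a sketch of the rules stated above, not code from ctags: classify_status and the three result words are hypothetical names, and it assumes that once notAvailableStatus is given, 127 is no longer special.

```shell
#!/bin/sh
# Usage: classify_status EXIT_STATUS [NOT_AVAILABLE_STATUS]
# NOT_AVAILABLE_STATUS defaults to 127, as in the text above.
classify_status () {
    status=$1
    not_available=${2:-127}
    if [ "$status" -eq 0 ]; then
        echo available
    elif [ "$status" -eq "$not_available" ]; then
        # Parser is disabled and no warning is printed.
        echo disabled-quietly
    else
        # Parser is disabled and warnings are printed.
        echo disabled-with-warning
    fi
}
```

With --xcmd-foo=./foo.sh{notAvailableStatus=42}, the corresponding call would be classify_status "$?" 42 right after running the driver.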

The list of kinds is read from standard output. ctags expects each line to match the following pattern:

^([^ \t])[ \t]+([^\t]+)([ \t]+(\[off\]))?$

Output lines matching the above pattern are interpreted as follows:

\1
    kind letter

\2
    kind name

[off] given after a kind name means the kind is disabled by default.

Here is an example command line and the output of the coffeetags driver:

$ libexec/drivers/coffeetags --list-kinds=coffee
f  function
c  class
o  object
v  var
p  proto
b  block
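To make the kind-list format concrete, here is a rough sh reader for such output. It is an approximation, not the parser ctags uses: parse_kinds is a hypothetical name, it relies on the word splitting of read rather than on the pattern itself, so it assumes kind names contain no spaces (the pattern allows them) and it tolerates leading blanks (the pattern does not).

```shell
#!/bin/sh
# Reads --list-kinds output on stdin and prints "LETTER NAME on|off"
# for every line that looks like a kind definition.
parse_kinds () {
    while read -r letter name flag; do
        # The kind letter must be exactly one character,
        # and a kind name must follow it.
        [ "${#letter}" -eq 1 ] && [ -n "$name" ] || continue
        case "$flag" in
        '[off]') state=off ;;   # disabled by default
        '')      state=on ;;
        *)       continue ;;    # unexpected trailing text: skip the line
        esac
        printf '%s %s %s\n' "$letter" "$name" "$state"
    done
}
```

It would be used as libexec/drivers/coffeetags --list-kinds=coffee | parse_kinds.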

Generating tags

After getting the kind-related information from COMMAND, ctags calls COMMAND with one argument, the name of the input file:

$ COMMAND input-file

ctags expects COMMAND to print the result to standard output. ctags reads it via a pipe connected to the COMMAND process.

ctags expects COMMAND to generate tag entries with all kinds enabled. Entries of disabled kinds are filtered out on the ctags side when the tags file is generated.
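Both calls of the protocol can be seen together in a toy driver. Everything about it is made up for illustration: toy_driver is a hypothetical name, and the "toy" language, in which a line of the form "def NAME" defines a function, does not exist. It emits every tag it finds regardless of which kinds are enabled, and it uses line numbers as tag addresses to avoid escaping search patterns.

```shell
#!/bin/sh
# A toy xcmd driver written as a function so it can be tried in place.
toy_driver () {
    case "$1" in
    --list-kinds*)
        # First call: report the kinds this driver can produce.
        printf 'f  function\n'
        ;;
    -*)
        echo "unknown option: $1" 1>&2
        return 1
        ;;
    *)
        # Second call: print pseudo-tags, then one tag line per definition.
        printf '!_TAG_FILE_FORMAT\t2\t/extended format/\n'
        printf '!_TAG_FILE_SORTED\t0\t/0=unsorted, 1=sorted, 2=foldcase/\n'
        lineno=0
        while read -r keyword name rest; do
            lineno=$((lineno + 1))
            if [ "$keyword" = def ] && [ -n "$name" ]; then
                # name <TAB> file <TAB> address;" <TAB> kind letter
                printf '%s\t%s\t%s;"\tf\n' "$name" "$1" "$lineno"
            fi
        done < "$1"
        ;;
    esac
}
```

Running toy_driver --list-kinds=toy and then toy_driver some-file mirrors the two invocations ctags performs.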

Note for tags format

Read FORMAT for the expected format of xcmd output. Format 2 is expected. Sorting is done on the ctags side if necessary.

Tag lines in the output are merged into the final tags file with filtering; some fields in the tag lines may be dropped if the user specifies the --fields=- option.

In addition to real tag information, pseudo-tag lines starting with !_TAG_ are expected.

The following example is taken from CoffeeTags:

$ libexec/drivers/coffeetags /dev/null
!_TAG_FILE_FORMAT       2       /extended format/
!_TAG_FILE_SORTED       0       /0=unsorted, 1=sorted, 2=foldcase/
!_TAG_PROGRAM_AUTHOR    Łukasz Korecki  [email protected]/
!_TAG_PROGRAM_NAME      CoffeeTags      //
!_TAG_PROGRAM_URL       https://github.com/lukaszkorecki/CoffeeTags     /GitHub repository/
!_TAG_PROGRAM_VERSION   0.5.0   //

ctags merges the pseudo-tag lines, adding a !LANG suffix to their names:

$ ./ctags   --language-force=coffee foo.coffee; cat tags | grep '^!'
!_TAG_FILE_FORMAT       2       /extended format; --format=1 will not append ;" to lines/
!_TAG_FILE_SORTED       1       /0=unsorted, 1=sorted, 2=foldcase/
!_TAG_PROGRAM_AUTHOR    Darren Hiebert  [email protected]/
!_TAG_PROGRAM_AUTHOR!coffee     Łukasz Korecki  [email protected]/
!_TAG_PROGRAM_NAME      Exuberant Ctags //
!_TAG_PROGRAM_NAME!coffee       CoffeeTags      //
!_TAG_PROGRAM_URL       https://github.com/fishman/ctags        /official site/
!_TAG_PROGRAM_URL!coffee        https://github.com/lukaszkorecki/CoffeeTags     /GitHub repository/
!_TAG_PROGRAM_VERSION   Development     //
!_TAG_PROGRAM_VERSION!coffee    0.5.0   //
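The renaming visible in this output can be imitated with a short sh filter. This is a guess at the observable behavior, not the ctags implementation: suffix_pseudo_tags is a hypothetical name, and dropping the !_TAG_FILE_ lines of the xcmd is an assumption drawn only from the fact that they do not appear with a suffix in the example above.

```shell
#!/bin/sh
# Usage: suffix_pseudo_tags LANG < xcmd-output
# Appends !LANG to the name of each pseudo-tag; other lines pass through.
suffix_pseudo_tags () {
    lang=$1
    tab=$(printf '\t')
    while IFS= read -r line; do
        case "$line" in
        '!_TAG_FILE_'*)
            # Assumption: file-level pseudo-tags of the xcmd are not merged.
            ;;
        '!_TAG_'*"$tab"*)
            name=${line%%"$tab"*}   # text before the first tab
            rest=${line#*"$tab"}    # text after the first tab
            printf '%s!%s\t%s\n' "$name" "$lang" "$rest"
            ;;
        *)
            printf '%s\n' "$line"
            ;;
        esac
    done
}
```

For example, libexec/drivers/coffeetags /dev/null | suffix_pseudo_tags coffee would print the !_TAG_PROGRAM_ lines with the !coffee suffix.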

Integration to the source tree

Put your xcmd driver under libexec/drivers. It must be executable; don’t forget to run chmod a+x on it.

Currently each driver is written as an sh script; I assumed a driver would only do a few very small things. sh has enough functionality for this purpose and is portable enough. If you need a compiled language such as C to write a driver, targets for building and installing the driver must be added to Makefile.in. Remember that sh does not mean bash.

Here is an example taken from libexec/drivers/coffeetags:

#!/bin/sh
# <<... copyright notices are snipped ...>>
#
#
# This is an xcmd driver for CoffeeTags.
# CoffeeTags is developed at https://github.com/lukaszkorecki/CoffeeTags .
#
#
case "$1" in
--list-kinds*)
        coffeetags --list-kinds
        exit $?
        ;;
-*)
        echo "unknown option: $1" 1>&2
        exit 1
        ;;
*)
        coffeetags --include-vars "$1"
        exit $?
        ;;
esac

An optlib file is also needed to let ctags know about the driver. Here is an example taken from data/optlib/coffee.ctags:

#
# <<... copyright notices are snipped ...>>
#
--langdef=coffee
--map-coffee=+.coffee
--xcmd-coffee=coffeetags

Finally, you have to add these two new files to Makefile.in. Add the name of the driver file to the DRIVERS variable, like:

DRIVERS = coffeetags

Then add the name of the optlib file to PRELOAD_OPTLIB or OPTLIB, like:

PRELOAD_OPTLIB =    \
        \
        coffee.ctags \
        ...

If you add the optlib file to OPTLIB, it will not be loaded automatically when ctags starts.

NOTE for writing a test case for xcmd

You may want to test the output merged from an xcmd. Such a test should be run only if the xcmd is available.

Consider a system where the coffeetags command is not installed: running test cases for coffeetags there is meaningless. This means a stage that checks the availability of the xcmd is needed before running a test case.

Units/TEST/languages serves this purpose. See “How to write a test case” in “Using Units”.
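The idea behind such an availability check can be shown with the first call of the xcmd protocol: a driver that exits with status 0 for --list-kinds=LANG is usable. The helper below only illustrates that idea; xcmd_available is a hypothetical name, and this is not how the Units framework implements Units/TEST/languages.

```shell
#!/bin/sh
# Usage: xcmd_available DRIVER LANG
# Succeeds only if DRIVER reports itself usable for LANG.
xcmd_available () {
    driver=$1
    lang=$2
    # Exit status 0 from --list-kinds=LANG means the xcmd is available.
    "$driver" --list-kinds="$lang" >/dev/null 2>&1
}

# A test runner could then guard a test case like this:
#   if xcmd_available libexec/drivers/coffeetags coffee; then
#       ... run the test case ...
#   else
#       echo "skipped: coffeetags is not available"
#   fi
```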