Keyferret layout file (.kflf) format
====================================

Keyferret loads its keystrokes from a layout file.  This document explains how
that file works.  It's a long way from being robust documentation, so I suggest
you read it alongside the supplied global.kflf in the program run directory.
In most cases this is at: C:\Program Files (x86)\Keyferret\global.kflf

This is the version 3 format. Versions 1 and 2 had different formats.

I suggest you understand the basics of how Unicode works before tackling
anything too complicated.  Many tutorials exist.

You can ask Keyferret to load a different layout file by passing it as
a command-line parameter when you start Keyferret. Otherwise, it loads:
1. global.kflf from the program run directory
2. keyferret\*.kflf from the user's Documents directory


General
=======

Command parameters are delimited by either spaces or tabs. 
Parameters described as <param...> consume the rest of the line.

A hash (#) indicates a comment.  It and everything after it in the
line is ignored.

A backslash is used for the following sequences:
\# : Hash character
\\ : Backslash character
\s : Space character
\uXXXX : Unicode character with the 4-digit hex code point XXXX.
\UXXXXXX : Unicode character with the 6-digit hex code point XXXXXX.
\{abc} : Replaced by define abc (see below).
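
The comment and backslash rules above can be sketched in Python as follows.
This is an illustration only, not Keyferret's actual parser: the function names,
the `defines` dictionary and the order of processing (strip the comment first,
then decode the escapes) are all my assumptions.

```python
import re

# Sketch only: Keyferret's real parser may differ in order and edge cases.
_ESCAPE = re.compile(
    r"\\(?:(#)|(\\)|(s)|u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{6})|\{([^}]*)\})"
)


def strip_comment(line):
    """Drop an unescaped # and everything after it."""
    i = 0
    while i < len(line):
        if line[i] == "\\":
            i += 2          # skip the backslash and the character it escapes
        elif line[i] == "#":
            return line[:i]
        else:
            i += 1
    return line


def decode_escapes(text, defines=None):
    """Replace each backslash sequence with the text it stands for."""
    defines = defines or {}     # hypothetical store for DEF values

    def repl(m):
        hash_, backslash, space, hex4, hex6, name = m.groups()
        if hash_:
            return "#"
        if backslash:
            return "\\"
        if space:
            return " "
        if hex4 or hex6:
            return chr(int(hex4 or hex6, 16))
        return defines[name]    # KeyError if the define does not exist

    return _ESCAPE.sub(repl, text)
```

For example, `decode_escapes(strip_comment(r"-= ^~ =\u2248 # ≈"))` leaves the
fields `-= ^~ =≈` (plus a trailing space where the comment was).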


Replacement definitions
=======================

Most lines define "replacements". A replacement replaces zero or more characters
with an output string when a particular keystroke sequence is typed.

Replacement definitions comprise a number of space-separated fields, each
beginning with a character indicating the type of field. The fields, which
should be in order, are as follows:

------

! : Insert before other replacements with matching prefix and keystroke sequence.

Normally when multiple replacements are defined with the same prefix and
keystroke sequence, the one that is defined first has higher priority. This
field causes the new replacement to be inserted just before the existing one,
if it exists. This is useful in user-defined files that override the default
layout.

Example:
    ! ^i =ï

------

> : Categories of letter which this can follow.

The replacement will only be available immediately following characters matching
the specified categories. Those characters are untouched by the keystroke
sequence. 

Categories are specified by two-letter codes defined by Unicode, such as "Ll".
See https://en.wikipedia.org/wiki/Unicode_character_property. Categories can be
combined, e.g. "LuLl" for uppercase and lowercase letters. Major categories can
be specified with a star, e.g. "L*" for all letters. If a prefix is specified,
then this field looks at the character before the prefix.

Example:
    >L* ^' =\u0301 # ´

This causes RAlt+' to generate an acute accent, but is available only following
a letter.
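
The category test can be sketched in Python using the standard `unicodedata`
module. The function names are made up for illustration, and I am assuming a
spec is always a run of two-character codes; Keyferret's own matching may differ
in detail.

```python
import unicodedata


def parse_categories(spec):
    """Split a spec such as "LuLl" or "L*" into two-character codes."""
    if len(spec) % 2:
        raise ValueError(f"category spec has odd length: {spec!r}")
    return [spec[i:i + 2] for i in range(0, len(spec), 2)]


def follows_category(ch, spec):
    """True if character ch belongs to any category named in spec."""
    cat = unicodedata.category(ch)      # e.g. "Ll" for "a", "Nd" for "1"
    return any(
        code == cat or (code[1] == "*" and code[0] == cat[0])
        for code in parse_categories(spec)
    )
```

With this, `follows_category("a", "L*")` is true and
`follows_category("1", "L*")` is false, so a replacement guarded by `>L*` would
be offered after a letter but not after a digit.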

------

- : Prefix string to be replaced.

The replacement will only be available if the prefix string is the most recent
set of characters output. If the keystroke sequence is input, the prefix is
deleted.

Example:
    -= ^~ =\u2248 # ≈

This causes the = character to be replaced by ≈ when RAlt+~ is pressed.

------

~ : Substituted prefix to be replaced.

Specifies a prefix string, but the characters are treated as if they had been
typed and replaced by the characters that would have been output. This is mostly
useful in non-base layouts where the main keys have been remapped.

For example, in the Cyrillic layout (where RAlt+7 is used to create superscript
characters):

    ~d ^7 =\u2de3 # ⷣ

can be used instead of having to enter

    -\u0434 ^7 =\u2de3 # ⷣ

or

    -д ^7 =\u2de3 # ⷣ

This mechanism is also used by the multiple ^ field feature described below.

------

_ : Bare (non-RAlt) key.

Key to be typed. 

Only a single key is supported. However, multiple replacements can be defined
for the same key, in which case subsequent ones are reached by pressing the same
key multiple times in quick succession.

Each replacement must specify a key sequence by including either a _ or ^ field,
but not both.

Example:
    _p =\u03c0 # π

Used in the Greek layout, this causes the p key to be remapped to π.

------

^ : RAlt key sequence.

Key sequence to be typed with the right Alt (AltGr) key pressed.

Each replacement must specify a key sequence by including either a _ or ^
field, but not both.

Example:
    ^ae =æ

Multiple ^ fields can be specified. A sequence:

    ^x ^y =z

is expanded into two commands:

    ~x ^y =z
    ^xy =z

Note the use of ~ instead of - to describe the prefix – this is almost certainly
the behaviour you want for non-base layouts.

See the section on "Discoverability" below for information on when this is
useful.
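
The expansion of two ^ fields can be sketched in Python like this. It works on a
line already split into fields, handles only the two-field case shown above, and
passes anything else through unchanged; the function name is hypothetical and
the behaviour with three or more ^ fields is not covered here.

```python
def expand_ralt_fields(fields):
    """Expand ['^x', '^y', '=z'] into the two replacements described above."""
    ralt = [i for i, f in enumerate(fields) if f.startswith("^")]
    if len(ralt) != 2:
        return [list(fields)]           # nothing to expand in this sketch

    first, second = ralt

    # Form 1: the first key becomes a substituted (~) prefix.
    prefixed = list(fields)
    prefixed[first] = "~" + fields[first][1:]

    # Form 2: both keys are merged into one RAlt sequence.
    combined = list(fields)
    combined[first] = "^" + fields[first][1:] + fields[second][1:]
    del combined[second]

    return [prefixed, combined]
```

Calling `expand_ralt_fields("^x ^y =z".split())` returns
`[['~x', '^y', '=z'], ['^xy', '=z']]`, matching the two commands listed above.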

------

= : Output string.

When the key sequence occurs, the output string is typed, after deleting any
prefix.

Each replacement must specify an output string by including a = field.

Multiple = fields can be defined on the same line. This is equivalent to
defining several lines, one for each = field, with all the other fields the
same. Defining multiple replacements with the same RAlt key sequence is
generally discouraged as they are awkward to type. However, defining multiple
replacements for non-RAlt sequences can occasionally be useful.

You can cause a key to be ignored by specifying a null output string. For
example, in the Hebrew layout, a number of capital letters such as G are not
mapped to any character. If no replacement were defined for such a key, it would
default to the underlying keyboard mapping, outputting G when G is pressed. To
prevent this (without using the INTERCEPT_ALL command), the following is
sufficient:

    _G =


Other commands
==============

H <help_text...>

For the next keystroke command, display <help_text> in the help window when the
keystroke is used.

This is only useful for RAlt keystrokes - non-RAlt sequences don't have a
mechanism to see any help added in this way.

When a keystroke command is expanded into multiple keystroke sequences, such as
when there are multiple ^ fields, the help text applies to all such sequences.

------

D <char> <display_text>

When character <char> is displayed in the help, show it as <display_text>
instead. Useful for non-printing characters.

Example:
    D \u00a0 <nbsp>

This associates the string "<nbsp>" with the character 0xA0 (nonbreaking space)
in the help window.

------

C <out_str>

Auto-generate combining character sequence for <out_str> based on keystrokes
already defined.

While this doesn't normally affect the characters that can be typed, it provides
a convenient way to add explicit sequences to the layout database so that they
are more discoverable in the help.

Example:
    C \u00e1 # á

This:

1. Decomposes the character á into normalisation form NFD, giving the letter a
   followed by the acute accent combining character.

2. Finds key sequences which generate the final code point, in this case the
   acute accent, which can be generated by RAlt+'.

3. Creates key sequences which act upon the remainder of the string as a prefix.
   In this case that gives:
       -a ^' =á
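
The three steps can be sketched in Python as follows. The `ralt_for` mapping
(from a code point to the RAlt key sequences that generate it) stands in for
Keyferret's layout database and is purely hypothetical, as is the function name.

```python
import unicodedata


def combining_sequences(out_str, ralt_for):
    """Return replacement lines that build out_str from a prefix plus one key."""
    # Step 1: decompose, e.g. U+00E1 becomes "a" followed by U+0301.
    decomposed = unicodedata.normalize("NFD", out_str)
    if len(decomposed) < 2:
        return []                       # nothing to split off

    # Step 2: look up key sequences for the final code point.
    prefix, last = decomposed[:-1], decomposed[-1]

    # Step 3: the rest of the string becomes the prefix.
    return [f"-{prefix} ^{keys} ={out_str}" for keys in ralt_for.get(last, [])]
```

With `ralt_for = {"\u0301": ["'"]}`, `combining_sequences("\u00e1", ralt_for)`
gives the single line `-a ^' =á`.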

------

CR <char1> <char2>

Equivalent to a C command on every character between <char1> and <char2>.

------

LAYOUT <key> <name...>

Select the layout.  "BASE" selects the layout used when no other layout is
selected.  Otherwise <key> is the single-character value corresponding to the
key to press with CapsLock to select that layout.

<name> gives the name for the layout to show in the help window.

------

LAYOUT (BASE|<key>)

After the first definition of a layout, selects the layout again so you can add
more keystrokes to it.

------

INTERCEPT_ALL

Causes the layout not to pass keystrokes to the keyboard driver if they are not
defined in the layout. This means that any keys not defined in the layout will
not do anything.

------

HL <help_text...>

Provide help text for the current layout.

------

FONT <font...>

Define a font to use to display characters.  The first defined font is used to
display each character when help is displayed, provided that the character is in
the font.  If not, the subsequent fonts are tried in order.

Note that this is just for the help window - it has no effect on the font used
by the application you are typing in!

Keystroke sequences resulting in characters that are not in any of the listed
fonts are ignored.

------

FPM <window_regex...>

<window_regex> is a regular expression which, if it matches a window title,
causes keystrokes typed in that window to use font preservation measures. This
is a somewhat ugly hack to try to make sure that the application doesn't change
the font to something unexpected when cycling through characters that are not in
the currently selected font: keystrokes are injected by the sequence
<sp><left><key><right><bs>, and deleted by <sp><left><bs><right><bs>.

This is necessary for Microsoft Office applications which use the Word editor.


Conditional parsing
===================

OPT <opt> <description...>

Creates an option <opt> which can be enabled or disabled in the system tray
menu. It will be shown with description <description>. By default these options
are disabled, but once selected, they are saved in the registry and are
persistent across sessions.

------

DEF <def> <value...>

Causes instances of \{<def>} to be replaced by <value>.

------

UNDEF <def>

Deletes define <def>.

------

IF <cond>
ELSE IF <cond>
ELSE
ENDIF

Flow control statements for conditional inclusion. <cond> can be:
OPT <opt>  : True if option <opt> is selected
DEF <def>  : True if define <def> exists
NOT <cond> : True if <cond> is false
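
The three condition forms can be evaluated recursively. The Python sketch below
assumes the condition has already been split into tokens, and that the selected
options and current defines are held in a set and a dictionary; those names and
containers are illustrative, not Keyferret's internals.

```python
def evaluate(cond, options, defines):
    """Evaluate a tokenised condition such as ['NOT', 'OPT', 'greek']."""
    if len(cond) < 2:
        raise ValueError(f"incomplete condition: {cond!r}")
    head, rest = cond[0], cond[1:]
    if head == "NOT":
        return not evaluate(rest, options, defines)
    if head == "OPT":
        return rest[0] in options       # option currently selected?
    if head == "DEF":
        return rest[0] in defines       # define currently exists?
    raise ValueError(f"unknown condition keyword: {head!r}")
```

For instance, with option `x` selected, `evaluate("NOT OPT x".split(), {"x"}, {})`
is false.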


Duplicate keystrokes
====================

When a RAlt sequence is typed, it is possible for multiple keystroke commands to
match, with or without prefixes. When this happens, they can all be selected by
continuing to hold down RAlt and pressing <space>. The order of precedence is:

1. Longer prefixes match before shorter prefixes.

2. Keystrokes with the same prefix length match in the order that they are
   defined.
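
The two precedence rules amount to a stable sort on prefix length. A minimal
Python sketch, assuming each candidate is a (prefix, output) pair listed in
definition order (a made-up representation) and ignoring the effect of the !
field:

```python
def order_matches(candidates):
    """Order matching replacements: longest prefix first, then definition order."""
    # sorted() is stable, so candidates with equal prefix length keep
    # the order in which they were defined.
    return sorted(candidates, key=lambda c: -len(c[0]))
```

Given `[("", "a"), ("xy", "b"), ("x", "c"), ("zz", "d")]`, this yields
`[("xy", "b"), ("zz", "d"), ("x", "c"), ("", "a")]`; repeated presses of
<space> would step through the alternatives in that order.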

When a non-RAlt sequence has multiple alternatives, ones after the first can be
reached by pressing the same key multiple times without pausing.


Magic
=====

"Transitive replacements" are created automatically, meaning if you 
create replacements "^a =b" and "-b ^c =d" then Keyferret will also create
"^ac =d". Such pairs can be declared in either order.

If you print a combining character, Keyferret will automatically try
to make a precomposed character out of it and the character before it
if one exists.  E.g. if you type the letter 'a' followed by a
keystroke generating a combining acute accent, Keyferret will delete
the previous a and replace it with an 'a acute' character.  In this
way the output is of Unicode Normalisation Form C (NFC), which is what
Windows expects keyboard drivers to produce, and is important if
you're typing into e.g. a code editor that can't handle combining
characters.

If you didn't understand the previous paragraph then it largely
translates to: to create characters with a diacritic, just create key
combinations for the combining character for the diacritic, and ignore
any Unicode characters which combine a letter with a diacritic --
they'll be generated automatically.
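
The composition step can be sketched in Python with the standard `unicodedata`
module. This only illustrates the idea of normalising to NFC; the function name
and return convention are my own, and Keyferret's actual logic may handle more
cases.

```python
import unicodedata


def emit_combining(previous, combining):
    """Return (characters to delete, text to type) for a combining character.

    previous is the character just before the cursor ("" if unknown).
    """
    composed = unicodedata.normalize("NFC", previous + combining)
    if previous and len(composed) == 1:
        # A precomposed form exists: delete the base letter, type the result.
        return 1, composed
    # No precomposed form: just type the combining character.
    return 0, combining
```

So `emit_combining("a", "\u0301")` returns `(1, "\u00e1")`: one character is
deleted and á is typed. For `emit_combining("q", "\u0301")` there is no
precomposed character, so the combining accent is typed as it is.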


Discoverability
===============

Keystroke sequences should be discoverable from the help window. The help window
only displays keystroke sequences that are currently available, so keystroke
sequences which require a prefix are less discoverable than keystroke sequences
without prefixes. However, long sequences of RAlt keys are awkward to type. If a
character can only be reached after first typing a prefix, it should be really
intuitive that the prefix should be typed first.

Layouts should adhere to the following guidance where possible:

* Key sequences which place a diacritic on an unmodified letter should assume
  that letter as a prefix. It is reasonable to expect the user to first type the
  letter and then look for ways to add the diacritic.
  
  For example, RAlt+` adds the grave accent to the previous character; there is
  no need for RAlt+e` to generate è.

* Key sequences which modify a letter should be available as *both* a sequence
  with that letter as a prefix (for easy typing) and a sequence where the letter
  is typed while holding RAlt (for discoverability). 

  For example, RAlt+/ will replace o with ø, and RAlt+o/ will generate ø without
  a prefix.

Generation of the second category of sequences is generally done with commands
of the form:

    ^o ^/ =\u00f8 # ø

This is expanded into the following two commands:

    ~o ^/ =\u00f8
    ^o/ =\u00f8

which has the desired effect.

When modifying a character which itself needs a RAlt sequence to generate it, a
regular prefix is normally sufficient. For example:

    -ə ^+ =\u1d4a # ᵊ

This takes the schwa character as a prefix, and turns it into a superscript when
RAlt++ is pressed. The schwa is created by RAlt+es ("e schwa"). There is no need
to create an additional sequence RAlt+es+, for discoverability, because of the
"transitive replacements" mechanism described above – this is created
automatically.
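
The transitive replacements mechanism referred to here can be sketched in Python
as follows. Each replacement is modelled as a (prefix, RAlt keys, output) tuple,
which is a made-up representation. I am also assuming that only prefix-free
replacements act as the first step and that a single pass is made; the real
mechanism may be more general.

```python
def transitive(replacements):
    """Derive prefix-free sequences by chaining pairs of replacements."""
    derived = []
    for prefix1, keys1, out1 in replacements:
        if prefix1:
            continue                    # first step must need no prefix
        for prefix2, keys2, out2 in replacements:
            # The second step must take the first step's output as its prefix.
            if prefix2 and prefix2 == out1:
                derived.append(("", keys1 + keys2, out2))
    return derived
```

Given the schwa definitions `("", "es", "ə")` and `("ə", "+", "ᵊ")` in either
order, this returns `[("", "es+", "ᵊ")]`, i.e. the RAlt+es+ sequence.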
