Liblouis Table Specification
Opcodes
space
- associates character with dots
- for replacing character with dots
- for mapping dots to character in display phase if dots is a single-cell dot pattern
- defines character (and dots if single-cell) as "space"
- for using $s wildcard in multipass scripts (correct, context, pass2, pass3, pass4)
- for finding word boundaries (largesign, joinword, joinnum, contraction, lowword, sufword, prfword, begword, begmidword, midendword, endword, prepunc, postpunc, firstwordital, firstwordbold, firstwordunder, lastworditalbefore, lastwordboldbefore, lastwordunderbefore, lastworditalafter, lastwordboldafter, lastwordunderafter)
- for finding number boundaries (begnum)
- for dropping space (largesign, joinnum, joinword)
- for using
punctuation
- associates character with dots
- for replacing character with dots
- for mapping dots to character in display phase if dots is a single-cell dot pattern
- defines character (and dots if single-cell) as "punctuation mark"
digit
- associates character with dots
- for replacing character with dots
- for mapping dots to character in display phase if dots is a single-cell dot pattern
- defines character (and dots if single-cell) as "digit"
- for using $d wildcard in multipass scripts (correct, context, pass2, pass3, pass4)
- for finding (the absence of) word boundaries (prepunc, postpunc, firstwordital, firstwordbold, firstwordunder, lastworditalbefore, lastwordboldbefore, lastwordunderbefore, lastworditalafter, lastwordboldafter, lastwordunderafter)
- for finding (the absence of) number boundaries (numsign, begnum, midnum, endnum, decpoint)
- for finding one-letter words and letters following a digit (letsign)
- for joining a word and a digit (joinnum)
- for using
letter
- associates character with dots
- for replacing character with dots
- for mapping dots to character in display phase if dots is a single-cell dot pattern
- defines character (and dots if single-cell) as "letter"
- for using $l wildcard in multipass scripts (correct, context, pass2, pass3, pass4)
- for finding (the absence of) word boundaries (repword, largesign, partword, sufword, prfword, begword, begmidword, midendword, endword, prepunc, postpunc, singleletterital, singleletterbold, singleletterunder, firstletterital, firstletterbold, firstletterunder, lastletterital, lastletterbold, lastletterunder, firstwordital, firstwordbold, firstwordunder, lastworditalbefore, lastwordboldbefore, lastwordunderbefore, lastworditalafter, lastwordboldafter, lastwordunderafter)
- for finding one-letter words and letters following a digit (letsign)
- for joining a word and a letter (joinword)
- for using
lowercase
- associates character with dots
- for replacing character with dots
- for mapping dots to character in display phase if dots is a single-cell dot pattern
- defines character (and dots if single-cell) as both "letter" and "lowercase"
uppercase
- associates character with dots
- for replacing character with dots
- for mapping dots to character in display phase if dots is a single-cell dot pattern
- defines character (and dots if single-cell) as both "letter" and "uppercase"
uplow
- associates first character with first dots (uppercase)
- defines first character and first dots as "uppercase letter" (uppercase)
- associates second character with last dots (lowercase)
- defines second character and last dots as "lowercase letter" (lowercase)
- associates letters as each other's case-counterparts
- for mapping uppercase character to lowercase dots when prefixed with a capital sign (capsign)
- for matching translation rules written in lowercase letters on input strings containing uppercase letters
    lowercase f 124
    uppercase F 1247
    uplow Oo 1357,135
    letter u 136
    always foo 124-136
    always OOf 136-1247
    # translation rules with uppercase letters
    # defined with uplow don't work

    { "input": "foo", "output": "fu" },
    { "input": "FOO", "output": "FOO" },
    { "input": "fOO", "output": "fu" },
    { "input": "OOf", "output": "OOf" }
litdigit
- associates character with dots
- for replacing character with dots
- has precedence over space, digit, punctuation, math, sign, letter, uppercase and lowercase
- for replacing dots with character during backward translation if they're part of a number (numsign)
- has precedence over space, digit, punctuation, math, sign, letter, uppercase and lowercase
- for mapping dots to character in display phase if dots is a single-cell dot pattern
- for replacing character with dots
- defines character (and dots if single-cell) as "literary digit"
    letter a 1
    litdigit 1 1
    sign # 3456
    numsign 3456

    /* TODO: backward translation */
    { "input": "#a", "output": "1" },
    { "input": "a", "output": "a" }
sign
- associates character with dots
- for replacing character with dots
- for mapping dots to character in display phase if dots is a single-cell dot pattern
- defines character (and dots if single-cell) as "sign (without special meaning)"
math
- associates character with dots
- for replacing character with dots
- for mapping dots to character in display phase if dots is a single-cell dot pattern
- defines character (and dots if single-cell) as "mathematical symbol"
capsign
- defines dots as "capital sign"
    uplow Ff 1247,124
    lowercase o 135
    uppercase O 1357
    uplow Uu 1367,136
    punctuation , 6
    capsign 6
    always uu 1367

    { "input": "Foo", "output": ",foo" },
    { "input": "FOO", "output": ",f,O,O" },
    { "input": "Fuu", "output": ",fU" },
    { "input": "FUU", "output": ",f,U" },
    { "input": "FuU", "output": ",fu,u" },
    { "input": "FUu", "output": ",f,U" }
begcaps
- defines dots as sign that announces a block of uppercase letters (uppercase)
  lookbehind != uppercase && lookahead = ( uppercase uppercase )
endcaps
- defines dots as sign that closes a block of uppercase letters within a word (uppercase, lowercase)
  lookbehind = ( uppercase uppercase ) && lookahead = lowercase

    letter f 124
    uplow Oo 135
    punctuation , 6
    punctuation ' 3
    begcaps 6-6
    endcaps 6-3

    { "input": "foo", "output": "foo" },
    { "input": "fOO", "output": "f,,oo" },
    { "input": "fOOo", "output": "f,,oo,'o" }
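The begcaps and endcaps conditions can be illustrated with a small Python model. This is only a sketch and not liblouis code: the function name `mark_caps_blocks` and its parameters are hypothetical, and the indicators are written with the display characters used in the test above (`,,` for dots 6-6 and `,'` for dots 6-3).

```python
def _two_upper(s):
    """True when s is exactly two uppercase letters."""
    return len(s) == 2 and all(c.isupper() for c in s)


def mark_caps_blocks(text, begcaps=",,", endcaps=",'"):
    """Toy model of the begcaps/endcaps conditions (hypothetical helper)."""
    out = []
    for i, ch in enumerate(text):
        prev_upper = i > 0 and text[i - 1].isupper()
        # begcaps: lookbehind != uppercase && lookahead = ( uppercase uppercase )
        if not prev_upper and _two_upper(text[i:i + 2]):
            out.append(begcaps)
        # endcaps: lookbehind = ( uppercase uppercase ) && lookahead = lowercase
        if i >= 2 and _two_upper(text[i - 2:i]) and ch.islower():
            out.append(endcaps)
        # uplow maps the uppercase letter to its lowercase dots
        out.append(ch.lower())
    return "".join(out)


print(mark_caps_blocks("fOOo"))
```

The model reproduces the three test cases listed for this example, but it ignores capsign and every other opcode.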
letsign
- defines dots as "letter sign"
- for inserting before a one-letter word
- for inserting between a digit and a letter, except in case of endnum (letter, digit, endnum)
- for inserting before a word that is also a contraction (contraction)

    letter f 124
    letter o 135
    digit 0 356
    punctuation ; 56
    letsign 56

    { "input": "f", "output": ";f" },
    { "input": "foo", "output": "foo" },
    { "input": "0foo", "output": "0;foo" }
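As a rough illustration of the first two letsign cases (one-letter words and letters following a digit), here is a minimal Python sketch. The function `insert_letsign` is hypothetical, uses `;` as the display character for dots 56, and deliberately leaves out the endnum exception and the contraction case.

```python
def insert_letsign(text, letsign=";"):
    """Toy model: add a letter sign before one-letter words and after digits."""
    out = []
    for i, ch in enumerate(text):
        if ch.isalpha():
            # Assumption: start and end of input behave like a space.
            prev = text[i - 1] if i > 0 else " "
            nxt = text[i + 1] if i + 1 < len(text) else " "
            one_letter_word = not prev.isalnum() and not nxt.isalnum()
            after_digit = prev.isdigit()
            if one_letter_word or after_digit:
                out.append(letsign)
        out.append(ch)
    return "".join(out)


print(insert_letsign("0foo"))
```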
noletsign
- inhibits the use of a letter sign when any of the characters occurs as a one-letter word or after a digit

    letter f 124
    letter o 135
    punctuation ; 56
    letsign 56
    contraction foo
    noletsign f

    { "input": "f", "output": "f" },
    { "input": "foo", "output": ";foo" }
noletsignbefore
- inhibits the use of a letter sign when any of the characters precedes a one-letter word
noletsignafter
- inhibits the use of a letter sign when any of the characters follows a one-letter word

    letter f 124
    punctuation ; 56
    punctuation ( 12356
    punctuation ) 23456
    letsign 56
    noletsignbefore (

    { "input": "f", "output": ";f" },
    { "input": "(f", "output": "(f" },
    { "input": "f)", "output": ";f)" }
numsign
- defines dots as "number sign"
    digit 0 356
    sign # 3456
    numsign 3456

    { "input": "0", "output": "#0" }

    space \s 0
    letter a 1
    letter b 12
    letter c 14
    letter d 145
    letter f 124
    letter l 123
    letter o 135
    letter r 1235
    letter u 136
    letter z 1356
    digit 0 356
    digit 1 2
    punctuation . 256
    punctuation - 36
    punctuation ; 56
TODO comp6
TODO replace
always
- matches characters
- replaces matched characters with dots
    include chardefs.cti
    always bar 1356

    { "input": "bar", "output": "z" },
    { "input": "foobar", "output": "fooz" }
begmidword
- matches characters when they are either at the beginning or in the middle of a word (space, punctuation, letter)
  lookbehind = ( space | punctuation | letter ) && lookahead = letter
- replaces matched characters with dots
begnum
- matches characters when they are at the beginning of a number (space, punctuation, digit)
  lookbehind = ( space | punctuation ) && lookahead = digit
- replaces matched characters with dots
begword
- matches characters when they are at the beginning of a word (space, punctuation, letter)
  lookbehind = ( space | punctuation ) && lookahead = letter
- replaces matched characters with dots
contraction
- matches characters when they are a word (space, punctuation)
  lookbehind = ( space | punctuation ) && lookahead = ( space | punctuation )
- replaces each matched character with its associated dot patterns
- inserts letter sign before word (letsign)

    include chardefs.cti
    letsign 56
    word could 14-145
    contraction cd

    { "input": "could", "output": "cd" },
    { "input": "cd", "output": ";cd" }
endnum
- matches characters when they are at the end of a number (digit)
  lookbehind = digit
- replaces matched characters with dots
- inhibits the use of a letter sign (letsign)

    letter t 2345
    letter h 125
    letter s 234
    digit 5 15
    punctuation ? 1456
    punctuation ; 56
    letsign 56
    endnum th 1456

    { "input": "th", "output": "th" },
    { "input": "5th", "output": "5?" },
    { "input": "5ths", "output": "5?s" },
    { "input": "5t", "output": "5;t" }
endword
- matches characters when they are at the end of a word (space, punctuation, letter)
  lookbehind = letter && lookahead = ( space | punctuation )
- replaces matched characters with dots

    include chardefs.cti
    endword oo 136

    { "input": "foo", "output": "fu" },
    { "input": "foo ", "output": "fu " },
    { "input": "foo.", "output": "fu." },
    { "input": "foobar", "output": "foobar" }
joinnum
- matches characters when they are a word and a space and a digit follow (space, punctuation, digit)
  lookbehind = ( space | punctuation ) && lookahead = ( space+ digit )
- replaces matched characters with dots
- drops space between characters and digit

    include chardefs.cti
    joinnum foo 124

    { "input": "foo", "output": "foo" },
    { "input": "foo 0", "output": "f0" }
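The joinnum condition can be approximated with a regular expression. The sketch below is an illustration only: `apply_joinnum` is a hypothetical helper, the lookbehind is simplified to "not preceded by an ASCII letter or digit" (so the start of input counts as a boundary), and the replacement is given as display characters instead of dot patterns.

```python
import re


def apply_joinnum(text, word, replacement):
    """Toy model of joinnum: replace `word` and drop the space before a digit."""
    pattern = re.compile(
        r"(?<![A-Za-z0-9])"   # lookbehind = ( space | punctuation ), simplified
        + re.escape(word)
        + r"\s+(?=\d)"        # lookahead = ( space+ digit ); the space is consumed
    )
    # A function is used as replacement so that it is inserted literally.
    return pattern.sub(lambda match: replacement, text)


print(apply_joinnum("foo 0", "foo", "f"))
```

Input that does not satisfy the condition, such as a bare `foo`, is returned unchanged by this model, whereas liblouis would go on to translate it with the remaining rules.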
joinword
- matches characters when they are a word and a space and a letter follow (space, punctuation, letter, litdigit)
  lookbehind = ( space | punctuation ) && lookahead = ( space+ ( letter | litdigit ) )
- replaces matched characters with dots
- drops space between characters and following letter

    include chardefs.cti
    joinword foo 124

    { "input": "foo", "output": "foo" },
    { "input": "foo bar", "output": "fbar" }
largesign
- matches characters
- replaces matched characters with dots
- drops space between adjacent largesign words (space, punctuation, letter)
  lookbehind = ( ( space | punctuation ) largesign space+ ) && lookahead != letter

    include chardefs.cti
    largesign foo 124
    largesign bar 12

    { "input": "foobar", "output": "fb" },
    { "input": "foo bar", "output": "fb" },
    { "input": "foo barr", "output": "f br" }
lowword
- matches characters when they are a word preceded and followed by whitespace (space)
  lookbehind = space && lookahead = space
- replaces matched characters with dots

    include chardefs.cti
    lowword foo 124-136

    { "input": "foo", "output": "fu" },
    { "input": "foo ", "output": "fu " },
    { "input": "foo.", "output": "foo." }
midendword
- matches characters when they are either in the middle or at the end of a word (space, punctuation, letter)
  lookbehind = letter && lookahead = ( space | punctuation | letter )
- replaces matched characters with dots
midnum
- matches characters when they are in the middle of a number (digit)
  lookbehind = digit && lookahead = digit
- replaces matched characters with dots

    digit 0 245
    punctuation . 256
    punctuation , 6
    sign # 3456
    numsign 3456
    midnum , 256

    { "input": "0,0", "output": "#0.0" },
    { "input": "0.0", "output": "#0.#0" }
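The interaction between numsign and midnum in this example can be sketched in Python as follows. The helper `translate_number` and its `midnum` mapping are hypothetical; the sketch works on display characters, assumes a number sign is emitted whenever a digit starts a new number, and assumes only a midnum match keeps the number going.

```python
def translate_number(text, midnum=None, numsign="#"):
    """Toy model: number sign before each number, midnum rules inside numbers."""
    if midnum is None:
        midnum = {",": "."}   # models "midnum , 256" with 256 displayed as "."
    out = []
    in_number = False
    for i, ch in enumerate(text):
        next_is_digit = i + 1 < len(text) and text[i + 1].isdigit()
        if ch.isdigit():
            if not in_number:
                out.append(numsign)
                in_number = True
            out.append(ch)
        elif ch in midnum and in_number and next_is_digit:
            # lookbehind = digit && lookahead = digit: the number continues
            out.append(midnum[ch])
        else:
            # any other character ends the number
            in_number = False
            out.append(ch)
    return "".join(out)


print(translate_number("0,0"))
```

With the default mapping the model agrees with both test cases: the comma is treated as part of the number, while the period ends it and triggers a second number sign.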
midword
- matches characters when they are in the middle of a word (letter)
  lookbehind = letter && lookahead = letter
- replaces matched characters with dots
nocross
- matches characters when they do not cross syllable boundaries
- replaces matched characters with dots
foo1bar
    include chardefs.cti
    include hyph.dic
    nocross foob 124-136-12
    nocross bar 12

    { "input": "foobar", "output": "foob" }
partword
- matches characters when they are part of a word but not the whole word (letter)
  lookbehind = letter || lookahead = letter
- replaces matched characters with dots

    include chardefs.cti
    partword oo 136

    { "input": "oo", "output": "oo" },
    { "input": "foo", "output": "fu" },
    { "input": "foobar", "output": "fubar" }
postpunc
- matches characters when they are part of punctuation at the end of a word (space, punctuation, letter, digit)
  characters(1) = punctuation && lookbehind = ( ( letter | digit ) ( !space )* ) && lookahead != letter
- replaces matched characters with dots

    letter f 124
    letter o 135
    sign < 126
    sign > 345
    sign # 3456
    punctuation ' 5
    punctuation ( 12356
    punctuation ) 23456
    prepunc ( 5-126
    postpunc ) 5-345

    { "input": "(foo)", "output": "'<foo'>" },
    { "input": "( foo )", "output": "( foo )" },
    { "input": "(#foo)", "output": "'<#foo'>" },
    { "input": "#(foo)", "output": "#'<foo'>" }

    include chardefs.cti
    prepunc .foo 124-136

    { "input": ".foobar", "output": "fubar" }
prepunc
- matches characters when they are part of punctuation at the beginning of a word (space, punctuation, letter, digit)
  characters(1) = punctuation && lookbehind != letter && lookahead = ( ( !space )* ( letter | digit ) )
- replaces matched characters with dots
prfword
- matches characters when they are either a word or at the end of a word (space, punctuation, letter)
  lookbehind = ( space | punctuation | letter ) && lookahead = ( space | punctuation )
- replaces matched characters with dots
repeated
- matches characters
- replaces matched characters with dots for first match
  lookbehind != repeated
- drops characters for consecutive repetitions
  lookbehind = repeated

    punctuation - 36
    repeated --- 36-36-36

    { "input": "---", "output": "---" },
    { "input": "------", "output": "---" },
    { "input": "-------", "output": "----" }
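The two cases of the repeated opcode (translate the first match, drop the immediately following ones) can be sketched in Python like this. The function `apply_repeated` is a hypothetical helper that works on display characters and passes everything that does not match the rule through unchanged.

```python
def apply_repeated(text, pattern, replacement):
    """Toy model of repeated: keep the first match, drop consecutive repeats."""
    out = []
    pos = 0
    prev_was_match = False
    while pos < len(text):
        if text.startswith(pattern, pos):
            if not prev_was_match:
                # lookbehind != repeated: translate the first occurrence
                out.append(replacement)
            # lookbehind = repeated: consume the match without output
            pos += len(pattern)
            prev_was_match = True
        else:
            out.append(text[pos])
            pos += 1
            prev_was_match = False
    return "".join(out)


print(apply_repeated("-------", "---", "---"))
```

For seven hyphens the first three are translated, the next three are dropped, and the last one is handled as an ordinary character, which gives four cells in total.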
repword
- matches characters when the word before it equals the word after it
- replaces matched characters with dots and drops word after it
    include chardefs.cti
    repword - 1356

    { "input": "foo-foo", "output": "fooz" },
    { "input": "foo-foo-foo", "output": "fooz" }
sufword
- matches characters when they are either a word or at the beginning of a word (space, punctuation, letter)
  lookbehind = ( space | punctuation ) && lookahead = ( space | punctuation | letter )
- replaces matched characters with dots
syllable
- matches characters
- replaces matched characters with dots
- inhibits other contractions across boundaries either from left or right
    include chardefs.cti
    syllable bar =
    always foob 124-136-12
    always foobar 124-12

    { "input": "fooba", "output": "fuba" },
    { "input": "foobar", "output": "foobar" }
word
- matches characters when they are a word (space, punctuation)
  lookbehind = ( space | punctuation ) && lookahead = ( space | punctuation )
- replaces matched characters with dots

    include chardefs.cti
    word foo 124-136

    { "input": "foo", "output": "fu" },
    { "input": "foo.", "output": "fu." },
    { "input": "foobar", "output": "foobar" }
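The lookbehind/lookahead conditions given for word, begword, endword, midword, begmidword, midendword, sufword and prfword all have the same shape, so they can be collected in one table. The Python sketch below is an illustration under stated assumptions: `char_class`, `CONDITIONS` and `rule_matches` are hypothetical names, characters are classified with Python's own string predicates instead of the table's character definitions, and the start and end of the input are assumed to count as space.

```python
def char_class(ch):
    """Classify one character; None stands for start or end of input."""
    if ch is None or ch.isspace():
        return "space"        # assumption: input boundaries behave like space
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "punctuation"


# opcode -> (allowed lookbehind classes, allowed lookahead classes)
CONDITIONS = {
    "word":       ({"space", "punctuation"}, {"space", "punctuation"}),
    "begword":    ({"space", "punctuation"}, {"letter"}),
    "endword":    ({"letter"}, {"space", "punctuation"}),
    "midword":    ({"letter"}, {"letter"}),
    "begmidword": ({"space", "punctuation", "letter"}, {"letter"}),
    "midendword": ({"letter"}, {"space", "punctuation", "letter"}),
    "sufword":    ({"space", "punctuation"}, {"space", "punctuation", "letter"}),
    "prfword":    ({"space", "punctuation", "letter"}, {"space", "punctuation"}),
}


def rule_matches(opcode, text, pos, chars):
    """True when `chars` occurs at `pos` and the opcode's condition holds."""
    if not text.startswith(chars, pos):
        return False
    end = pos + len(chars)
    before = text[pos - 1] if pos > 0 else None
    after = text[end] if end < len(text) else None
    behind, ahead = CONDITIONS[opcode]
    return char_class(before) in behind and char_class(after) in ahead


print(rule_matches("word", "foo.", 0, "foo"))
```

For instance, `rule_matches("word", "foobar", 0, "foo")` is false because a letter follows, which is consistent with `foobar` being left uncontracted in the test above.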
decpoint
- matches when character precedes a digit (digit)
  lookahead = digit
- replaces matched characters with dots

    digit 0 245
    punctuation . 46
    punctuation ; 56
    sign # 3456
    numsign 3456
    decpoint . 46

    { "input": ".0", "output": "#.0" },
    { "input": "0.0", "output": "#0.0" },
    { "input": ";0", "output": ";#0" }
context
- matches characters expressed by test
- replaces matched characters with dot patterns expressed by action
pass2
- considered in the second pass only
- matches dot patterns expressed by test
- replaces matched characters with dot patterns expressed by action
pass3
- considered in the third pass only
- matches dot patterns expressed by test
- replaces matched characters with dot patterns expressed by action
pass4
- considered in the fourth pass only
- matches dot patterns expressed by test
- replaces matched characters with dot patterns expressed by action
correct
- considered in the corrections phase only
- matches characters expressed by test
- replaces matched characters with characters expressed by action
TODO display
TODO hyphen
TODO class
TODO before
TODO after
TODO grouping
TODO swapcd
TODO swapdd
TODO swapcc
TODO exactdots
TODO nofor
TODO noback
Appendix I: Translation algorithm in pseudo-code
(def table (compile-table))

(defn translate [input typeform]
  ""
  {:pre [(= (count input) (count typeform))]}
  (let [;; pass 0
        tmp (make-corrections input)
        ;; pass 1
        syl-info (mark-syllables tmp)
        cap-info (mark-capitals tmp)
        tmp (loop [pos 0
                   tmp2 []
                   prev-rules []
                   emph-info []]
              (if (>= pos (count tmp))
                tmp2
                (if-let [[pos tmp2] (maybe-translate-comp-braille tmp typeform pos tmp2)]
                  (recur pos tmp2 prev-rules emph-info)
                  (let [[tmp2 emph-info] (maybe-insert-emph-indicator tmp typeform emph-info pos tmp2)
                        rule (select-rule tmp syl-info cap-info pos prev-rules)]
                    (if (= (rule-type rule) :compbrl)
                      (let [[pos tmp2] (do-compbrl tmp pos tmp2)]
                        (recur pos tmp2 prev-rules emph-info))
                      (let [tmp2 (or (maybe-insert-num-indicator tmp pos rule prev-rules tmp2)
                                     (maybe-insert-let-indicator tmp pos rule tmp2)
                                     (maybe-insert-cap-indicator tmp pos tmp2)
                                     tmp2)
                            tmp2 (if (= (rule-type rule) :largesign)
                                   (trim-trailing-space tmp2)
                                   tmp2)]
                        (let [[pos tmp2] (do-translation tmp pos rule tmp2)
                              prev-rules (conj prev-rules rule)]
                          (recur pos tmp2 prev-rules emph-info))))))))
        ;; pass 2 to 4
        tmp (loop [n 2
                   tmp tmp]
              (if (> n 4)
                tmp
                (recur (inc n)
                       (loop [pos 0
                              tmp2 []]
                         (if (>= pos (count tmp))
                           tmp2
                           (if-let [rule (select-multipass-rule tmp pos n)]
                             (let [[pos tmp2] (do-translation tmp pos rule tmp2)]
                               (recur pos tmp2))
                             (let [tmp2 (conj tmp2 (tmp pos))
                                   pos (inc pos)]
                               (recur pos tmp2))))))))]
    tmp))

(defn make-corrections [input]
  "(See opcode correct.)")

(defn mark-syllables [input]
  "(See opcode syllable.)")

(defn mark-capitals [input]
  "(See opcode uplow.)")

(defn maybe-translate-comp-braille [input typeform pos output]
  "Maybe translate part of the `input' starting at position `pos' as
  computer braille, based on the `typeform' parameter. (See opcodes
  begcomp and endcomp.)"
  {:pre [(= (count input) (count typeform))
         (< pos (count input))]
   :post [(if-let [[new-pos new-output] %]
            (and (> new-pos pos)
                 (only-appended? output new-output)
                 (> (count new-output) (count output)))
            true)]})

(defn maybe-insert-emph-indicator [input typeform emph-info pos output]
  "Maybe insert an emphasis indicator based on the `typeform' parameter
  at position `pos' and info `emph-info' from previous iterations. Maybe
  mark positions in the input for future insertion of emphasis
  indicators. (See opcodes singleletter__, firstletter__, lastletter__,
  firstword__, lastword__before, lastword__after and len__phrase.)"
  {:pre [(= (count input) (count typeform))
         (< pos (count input))]
   :post [(let [[new-output new-emph-info] %]
            (and (only-appended? output new-output)
                 (only-appended? emph-info new-emph-info)))]})

(defn select-rule [input syl-info cap-info pos prev-rules]
  "Select a translation rule that matches the `input' at position `pos',
  based on info about syllable and capital positions, and a list of
  previously applied rules.")

(defn do-compbrl [input pos output]
  "")

(defn maybe-insert-num-indicator [input pos rule prev-rules output]
  ""
  {:post [(if-let [new-output %]
            (and (only-appended? output new-output)
                 (> (count new-output) (count output)))
            true)]})

(defn maybe-insert-let-indicator [input pos rule output]
  ""
  {:post [(if-let [new-output %]
            (and (only-appended? output new-output)
                 (> (count new-output) (count output)))
            true)]})

(defn maybe-insert-cap-indicator [input pos output]
  ""
  {:post [(if-let [new-output %]
            (and (only-appended? output new-output)
                 (> (count new-output) (count output)))
            true)]})

(defn trim-trailing-space [output]
  "Trim trailing space characters from `output'."
  {:post [(= % (take (count %) output))]})

(defn do-translation [input pos rule output]
  "Apply the translation rule on `input' at position `pos'."
  {:post [(let [[new-pos new-output] %]
            (and (> new-pos pos)
                 (only-appended? output new-output)))]})

(defn select-multipass-rule [input pos n]
  "Select a multipass translation rule for pass `n' that matches the
  `input' at position `pos'. (See opcodes pass2, pass3 and pass4)."
  {:pre [(#{2 3 4} n)]
   :post [(if-let [rule %]
            (= (rule-type %) (case n 2 :pass2 3 :pass3 4 :pass4))
            true)]})

(defn only-appended? [old new]
  (= old (take (count old) new)))
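The "pass 2 to 4" loop of the pseudo-code can be mirrored in a few lines of runnable Python. This is a simplified sketch, not the liblouis implementation: `run_multipass` and the `rules_by_pass` structure are hypothetical, a rule is reduced to a literal (test, action) pair of strings, and the first matching rule wins, whereas real multipass rules use a much richer test and action syntax.

```python
def run_multipass(cells, rules_by_pass):
    """Toy model of passes 2 to 4: each pass rewrites the previous pass's output.

    cells         -- string standing in for the output of pass 1
    rules_by_pass -- dict mapping pass number (2, 3 or 4) to a list of
                     (test, action) string pairs (hypothetical representation)
    """
    for n in (2, 3, 4):
        rules = rules_by_pass.get(n, [])
        out = []
        pos = 0
        while pos < len(cells):
            for test, action in rules:
                # Empty tests are skipped so that the position always advances.
                if test and cells.startswith(test, pos):
                    out.append(action)
                    pos += len(test)
                    break
            else:
                # No rule matched: copy one cell unchanged, as in the pseudo-code.
                out.append(cells[pos])
                pos += 1
        cells = "".join(out)
    return cells


rules = {2: [("ab", "x")], 3: [("x", "yy")]}
print(run_multipass("abc", rules))
```

In this example pass 2 rewrites `ab` to `x`, pass 3 rewrites that `x` to `yy`, and pass 4 has no rules, so the input `abc` ends up as `yyc`.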