Modulo:mrooteo
MODULE
A self-test is available on the page Ŝablono:kat-elemento-eo.
- This module does the work of the template {{kat-elemento-eo}}, and is additionally called by the module "mfarado", so it is a dual-mode module.
- Depends on:
- template {{radikoj}}, read by the module
- template read by the module
--[===[
MODULE "MROOTEO" (root eo ie Esperanto root)
"eo.wiktionary.org/wiki/Modulo:mrooteo" <!--2024-Nov-18-->
Purpose: describes and categorizes an Esperanto morpheme according to a
list stored in a separate template, to be used on category pages
Utilo: priskribas kaj enkategoriigas esperantan morfemon laux
listo konservita en aparta sxablono, uzinda sur kategoriaj pagxoj
Manfaat: ...
Syfte: beskriver och kategoriserar morfem paa esperanto ...
Used by templates / Uzata far sxablonoj:
* "kat-elemento-eo"
Used by modules:
* "mfarado"
Required submodules / Bezonataj submoduloj:
* none / neniuj
Required templates:
* "SXablono:radikoj"
Incoming: * named obligatory "in=" one of 3 types:
* bare root (I: -il-, M: nul, N: mov, P: fi-, U: -j) only lowercase !!!FIXME!!! NOT yet
ASCII plus 6 lowercase -eo- letters, identified by NOT
containing any space
* new-style pagename "Vorto -eo- enhavanta morfemon N (kapt)"
identified by containing at least one space and last character
being a bracket ")"
* legacy-style pagename, identified by containing at least one
space and last character NOT being a bracket ")"
colons ":" and underscores "_" prohibited in order to prevent
things like "Kategorio:Radiko arb'" (fullpagename) or
"Radiko_arb'" (URL-style)
* named optional "vs=" word class "o" "a" "i" one or two letters !!!FIXME!!! NOT yet
* 1 special parameter
"givetable=" value "true" if called from a module
* 2 hidden parameters
* "nocat=" no error possible
* "detrc=" no error possible
The pagename is never accessed, since this module is ultimately meant to be
called from another module rather than from a template. For the same
reason there is no point in peeking into the caller's frame. No
anonymous parameters are tolerated. !!!FIXME!!! not yet checked
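
Sketch of the three-way distinction of "in=" described above. The helper
"classifyinput" is hypothetical and NOT part of this module; it only
illustrates the rules (no space -> bare root, space and final ")" ->
new-style, space and no final ")" -> legacy-style, colon or underscore
-> rejected).

```lua
-- Hypothetical helper (assumption, not part of the module): classify
-- an "in=" value into one of the 3 types, or return nil if prohibited.
local function classifyinput (strin)
  if (string.find(strin, ':', 1, true) or string.find(strin, '_', 1, true)) then
    return nil -- colon or underscore prohibited (#E02)
  end
  if (not string.find(strin, ' ', 1, true)) then
    return 'bare' -- NOT containing any space
  end
  if (string.sub(strin, -1) == ')') then
    return 'new' -- at least one space and last char is ")"
  end
  return 'legacy' -- at least one space and last char is NOT ")"
end

print (classifyinput ('mov'))                                    -- bare
print (classifyinput ('Vorto -eo- enhavanta morfemon N (kapt)')) -- new
print (classifyinput ("Radiko arb'"))                            -- legacy
print (classifyinput ("Kategorio:Radiko arb'"))                  -- nil
```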
Reads template "SXablono:radikoj":
* empty lines skipped
* length of a nonempty line 2 ... 100'000 octet:s !!!FIXME!!! NOT yet checked
* nonempty lines must alternate level-2 wiki headings with root lists
* leading spaces, trailing spaces and multiple spaces are prohibited
* headings
* 7 ... 200 octet:s !!!FIXME!!! NOT yet checked
* headings must be formed like "== " ... " ==" with exactly ONE
separation space at every side
* legal char:s in headings: !!!FIXME!!! NOT yet checked
* dash "-"
* 10 ASCII numbers
* 26 ASCII UPPERCase
* 26 ASCII lowercase
* 6 -eo- UPPERCase
* 6 -eo- lowercase
* sorting of the headings is not required but dupes are prohibited
* root lists
* 2 ... 100'000 octet:s (checked already before)
* root lists contain a list of roots, all in one line, separated by spaces
* legal char:s in root lists: !!!FIXME!!! NOT yet checked
* dash "-" earliest or last in a root only
* 26 ASCII lowercase
* 6 -eo- lowercase
* root length 2...40 octet:s
* sorting of the roots is not required but dupes are prohibited
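
Sketch of the heading rule above ("== " ... " ==" with exactly ONE
separation space at every side). The helper "stripheading" is hypothetical
and NOT part of this module, and it does NOT check the legal char:s; the
sample lines are made up.

```lua
-- Hypothetical helper (assumption, not part of the module): strip a
-- level-2 heading like "== OA1 ==" down to "OA1", nil if malformed.
local function stripheading (strline)
  if (string.len(strline) < 7) then
    return nil -- shortest legal heading is "== X ==" ie 7 octet:s
  end
  if ((string.sub(strline, 1, 3) ~= '== ') or (string.sub(strline, -3) ~= ' ==')) then
    return nil -- bad equal signs or missing separation space
  end
  local strinner = string.sub(strline, 4, -4)
  if (string.find(strinner, '[ =]')) then
    return nil -- extra space or extra equal sign inside
  end
  return strinner
end

print (stripheading ('== OA1 =='))  -- OA1
print (stripheading ('==  OA1 ==')) -- nil (double space)
print (stripheading ('=== OA1 ===')) -- nil (level-3 heading)
```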
Strategy:
* identify the type of incoming root from "in=" by dashes and a constant
table with nonstandalone roots !!!FIXME!!! NOT yet
* read the template and convert the raw text block into single non-empty
lines (at least 2 char:s, prohibit leading and trailing spaces) in a table
* we have 3 more tables (besides the one for lines): one for all headings,
one for hits by headings, and one for roots collected from a line
* every new heading is added into the table for all headings with
the stripped heading being the key/index, and value always "true"
avoiding dupes that way
* every new heading is also temporarily stored in a string
* walk through the roots putting them all into a table (emptied at the
beginning of the line) with the root being the key/index, and value
always "true" avoiding dupes that way; if the root from the list equals
the incoming root, then store the heading into the table for hits by
headings, but do NOT abort the search at a hit
* for categories translate the found headings into names by means of
a constant table, pass unchanged if no translation found
* add an extra category based on the number of hits ZERO or non-ZERO
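
Condensed sketch of the strategy above, under simplifying assumptions: the
lines are already split and validated, the sample data and the helper
"findhits" are made up and NOT part of this module, and only the dupe
checks #E20 and #E26 are shown.

```lua
-- Made-up sample data: headings alternate with root lists
local tablines = { '== OA0 ==', 'arb dom kat', '== OA1 ==', 'mov kapt', '== OA2 ==', 'dom' }
local tabnames = { OA0 = 'Fundamenta elemento' } -- partial translation table

-- Hypothetical helper: returns table with headings hit, or nil + error code
local function findhits (tabsrc, strroot)
  local tabseenheadings = {}
  local tabhits = {}
  for numi = 1, #tabsrc, 2 do
    local strheading = string.sub(tabsrc[numi], 4, -4) -- strip "== " and " =="
    if (tabseenheadings[strheading]) then
      return nil, 20 -- #E20 dupe heading
    end
    tabseenheadings[strheading] = true
    local tabseenroots = {} -- emptied at the beginning of every root list
    for strone in string.gmatch(tabsrc[numi+1], '%S+') do
      if (tabseenroots[strone]) then
        return nil, 26 -- #E26 dupe root in one line
      end
      tabseenroots[strone] = true
      if (strone == strroot) then
        tabhits[#tabhits+1] = strheading -- do NOT abort search at a hit
      end
    end
  end
  return tabhits
end

local tabfound = findhits (tablines, 'dom')
for _, strheading in ipairs(tabfound) do
  print (tabnames[strheading] or strheading) -- pass unchanged if no translation
end
print ((#tabfound == 0) and 'Ne-AV-elemento' or 'AV-elemento')
```

With the made-up data this lists "Fundamenta elemento" and "OA2" followed
by "AV-elemento".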
Error codes:
* #E01 internal
* #E02 "in=" bad (wrong length, or contains colon or underscore)
* #E03 "in=" bad (later type identification failed)
* #E04 template obviously bad (not found or empty or quasi-empty)
* #E05 empty or bad line (too short (<2) or leading or trailing space
or double space)
* #E16 number of lines bad (must be even and at least 2)
* #E17 overall pattern bad (alternating lines required)
* #E19 heading bad (too short or bad equal signs or bad char)
* #E20 dupe heading
* #E21 root list bad use of spaces ie empty root
* #E22 invalid length of root, must be 2...40 octet:s including possible dash
* #E23 invalid char in single root (other than eo lowercase)
* #E24 in single root illegal use of "-" (two consecutive
dashes, dash in other position than earliest and last)
* #E26 dupe root in one line (2 details follow)
]===]
local exporttable = {}
require('strict')
------------------------------------------------------------------------
---- CONSTANTS [O] ----
------------------------------------------------------------------------
local constrtemplate = string.char(0xC5,0x9C) .. "ablono:radikoj"
local contabvisi = {}
contabvisi [0] = 'La elemento "'
contabvisi [1] = '" (tipo ' -- needs terminating ") " from elsewhere
contabvisi [2] = 'troveblas en '
contabvisi ['netr'] = 'ne troveblas en iu listo kaj do estas neoficiala' -- NO dot at end
local contabcats = {}
contabcats [0] = "Ne-AV-elemento" -- ZERO hits
contabcats [1] = "AV-elemento" -- non-ZERO hits
contabcats [ 'OA0'] = "Fundamenta elemento"
contabcats [ 'OA1'] = "Elemento de la 1-a Oficiala Aldono"
contabcats [ 'OA2'] = "Elemento de la 2-a Oficiala Aldono"
contabcats [ 'OA3'] = "Elemento de la 3-a Oficiala Aldono"
contabcats [ 'OA4'] = "Elemento de la 4-a Oficiala Aldono"
contabcats [ 'OA5'] = "Elemento de la 5-a Oficiala Aldono"
contabcats [ 'OA6'] = "Elemento de la 6-a Oficiala Aldono"
contabcats [ 'OA7'] = "Elemento de la 7-a Oficiala Aldono"
contabcats [ 'OA8'] = "Elemento de la 8-a Oficiala Aldono"
contabcats [ 'OA9'] = "Elemento de la 9-a Oficiala Aldono"
contabcats ['OA10'] = "Elemento de la 10-a Oficiala Aldono"
------------------------------------------------------------------------
---- SPECIAL STUFF OUTSIDE MAIN [B] ----
------------------------------------------------------------------------
-- SPECIAL VAR:S
local qboodetrc = true -- from "detrc=true" but default is "true" !!!
local qstrtrace = '<br>' -- for main & functions, debug report req by "detrc="
------------------------------------------------------------------------
---- DEBUG FUNCTIONS [D] ----
------------------------------------------------------------------------
-- Local function LFDTRACEMSG
-- Enhance upvalue "qstrtrace" with fixed text.
-- for variables the other sub "lfdshowvar" is preferable but in exceptional
-- cases it can be justified to send text with values of variables to this sub
-- no size limit
-- upvalue "qstrtrace" must NOT be type "nil" on entry (is inited to "<br>")
-- uses upvalue "qboodetrc"
local function lfdtracemsg (strshortline)
if (qboodetrc and (type(strshortline)=='string')) then
qstrtrace = qstrtrace .. strshortline .. '.<br>' -- dot added !!!
end--if
end--function lfdtracemsg
------------------------------------------------------------------------
-- Local function LFDMINISANI
-- Input : * strdangerous -- must be type "string", empty legal
-- * numlimitdivthree
-- Output : * strsanitized -- can happen to be quasi-empty with <<"">>
-- To be called from "lfdshowvcore" <- "lfdshowvar" only.
-- * we absolutely must disallow: cross "#" 35 | apo "'" 39 |
-- star "*" 42 | dash 45 | colon 58 | "<" 60 | ">" 62 | "[" 91 | "]" 93
-- * spaces are shown as "{32}" if repetitive or at begin or at end
local function lfdminisani (strdangerous, numlimitdivthree)
local strsanitized = '"' -- begin quot
local num38len = 0
local num38index = 1 -- ONE-based
local num38signo = 0
local num38prev = 0
local boohtmlenc = false
local boovisienc = false
num38len = string.len (strdangerous)
while true do
boohtmlenc = false -- % reset on
boovisienc = false -- % every iteration
if (num38index>num38len) then -- ONE-based
break -- done string char after char
end--if
num38signo = string.byte (strdangerous,num38index,num38index)
if ((num38signo<43) or (num38signo==45) or (num38signo==58) or (num38signo==60) or (num38signo==62) or (num38signo==91) or (num38signo==93) or (num38signo>122)) then
boohtmlenc = true
end--if
if ((num38signo<32) or (num38signo>126)) then
boovisienc = true -- overrides "boohtmlenc"
end--if
if ((num38signo==32) and ((num38prev==32) or (num38index==1) or (num38index==num38len))) then
boovisienc = true -- overrides "boohtmlenc"
end--if
if (boovisienc) then
strsanitized = strsanitized .. '{' .. tostring (num38signo) .. '}'
else
if (boohtmlenc) then
strsanitized = strsanitized .. '&#' .. tostring (num38signo) .. ';'
else
strsanitized = strsanitized .. string.char (num38signo)
end--if
end--if
if ((num38len>(numlimitdivthree*3)) and (num38index==numlimitdivthree)) then
num38index = num38len - numlimitdivthree -- jump forwards
strsanitized = strsanitized .. '" ... "'
else
num38index = num38index + 1 -- ONE-based
end--if
num38prev = num38signo
end--while
strsanitized = strsanitized .. '"' -- don't forget final quot
return strsanitized
end--function lfdminisani
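--[==[
Sketch of the HTML-encoding step only, wrapped in a comment so that the
module itself stays unchanged. The helper "htmlencode" is hypothetical and
deliberately reduced: it covers just the 9 char:s listed as "absolutely
must disallow" and uses a Lua pattern instead of the octet-by-octet loop;
it does NOT do the "{32}" space marking nor the truncation.

```lua
-- Hypothetical reduced helper (assumption, not part of the module)
local function htmlencode (strin)
  return (string.gsub(strin, "[#'%*%-:<>%[%]]", function (strc)
    return '&#' .. tostring(string.byte(strc)) .. ';'
  end))
end

print (htmlencode ('[[a:b]]')) -- &#91;&#91;a&#58;b&#93;&#93;
```
]==]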
------------------------------------------------------------------------
-- Local function LFDSHOWVCORE
-- Prebrew report about content of a variable including optional full
-- listing of a table with numeric and string keys. !!!FIXME!!!
-- Input : * vardubious -- content (any type including "nil" is acceptable)
-- * str77name -- name of the variable (string)
-- * vardescri -- optional comment, default empty, begin with "@" to
-- place it before name of the variable, else after
-- * varlim77tab -- optional limit, limits both string keys and
-- numeric keys, default ZERO no listing
-- Depends on functions :
-- [D] lfdminisani
local function lfdshowvcore (vardubious, str77name, vardescri, varlim77tab)
local taballkeystring = {}
local strtype = ''
local strreport = ''
local numindax = 0
local numlencx = 0
local numkeynumber = 0
local numkeystring = 0
local numkeycetera = 0
local numkey77min = 999999
local numkey77max = -999999
local boobe77fore = false
if (type(str77name)~='string') then
str77name = '??' -- bite the bullet
else
str77name = '"' .. str77name .. '"'
end--if
if (type(vardescri)~='string') then
vardescri = '' -- omit comment
end--if
if (string.len(vardescri)>=2) then
boobe77fore = (string.byte(vardescri,1,1)==64) -- prefix "@"
if (boobe77fore) then
vardescri = string.sub(vardescri,2,-1) -- CANNOT become empty
end--if
end--if
if (type(varlim77tab)~='number') then
varlim77tab = 0 -- deactivate listing of a table
end--if
if ((vardescri~='') and (not boobe77fore)) then
str77name = str77name .. ' (' .. vardescri .. ')' -- now a combo
end--if
strtype = type(vardubious)
if (strtype=='table') then
for k2k,v2v in pairs(vardubious) do
if (type(k2k)=='number') then
numkey77min = math.min (numkey77min,k2k)
numkey77max = math.max (numkey77max,k2k)
numkeynumber = numkeynumber + 1
else
if (type(k2k)=='string') then
taballkeystring [numkeystring] = k2k
numkeystring = numkeystring + 1
else
numkeycetera = numkeycetera + 1
end--if
end--if
end--for
strreport = 'Table ' .. str77name
if ((numkeynumber==0) and (numkeystring==0) and (numkeycetera==0)) then
strreport = strreport .. ' is empty'
else
strreport = strreport .. ' contains '
if (numkeynumber==0) then
strreport = strreport .. 'NO numeric keys'
end--if
if (numkeynumber==1) then
strreport = strreport .. 'a single numeric key equal ' .. tostring (numkey77min)
end--if
if (numkeynumber>=2) then
strreport = strreport .. tostring (numkeynumber) .. ' numeric keys ranging from ' .. tostring (numkey77min) .. ' to ' .. tostring (numkey77max)
end--if
strreport = strreport .. ' and ' .. tostring (numkeystring) .. ' string keys and ' .. tostring (numkeycetera) .. ' other keys'
end--if
if ((numkeynumber~=0) and (varlim77tab~=0)) then -- !!!FIXME!!!
strreport = strreport .. ' ### content num keys :'
numindax = numkey77min
while true do
if ((numindax>varlim77tab) or (numindax>numkey77max)) then
break -- done table
end--if
strreport = strreport .. ' ' .. tostring(numindax) .. ' -> ' .. lfdminisani(tostring(vardubious[numindax]),30)
numindax = numindax + 1
end--while
end--if
if ((numkeystring~=0) and (varlim77tab~=0)) then -- !!!FIXME!!!
strreport = strreport .. ' ### content string keys :'
end--if
else
strreport = 'Variable ' .. str77name .. ' has type "' .. strtype .. '"'
if (strtype=='string') then
numlencx = string.len (vardubious)
strreport = strreport .. ' and length ' .. tostring (numlencx)
if (numlencx~=0) then
strreport = strreport .. ' and content ' .. lfdminisani (vardubious,30)
end--if
else
if (strtype~='nil') then
strreport = strreport .. ' and content "' .. tostring (vardubious) .. '"'
end--if
end--if (strtype=='string') else
end--if (strtype=='table') else
if ((vardescri~='') and boobe77fore) then
strreport = vardescri .. ' : ' .. strreport -- very last step
end--if
return strreport
end--function lfdshowvcore
------------------------------------------------------------------------
-- Local function LFDSHOWVAR
-- Enhance upvalue "qstrtrace" with report about content of a
-- variable including optional full listing of a table with numerical
-- and string indexes. !!!FIXME!!!
-- Depends on functions :
-- [D] lfdminisani lfdshowvcore
-- upvalue "qstrtrace" must NOT be type "nil" on entry (is inited to "<br>")
-- uses upvalue "qboodetrc"
local function lfdshowvar (varduubious, strnaame, vardeskkri, vartabljjm)
if (qboodetrc) then
qstrtrace = qstrtrace .. lfdshowvcore (varduubious, strnaame, vardeskkri, vartabljjm) .. '.<br>' -- dot added !!!
end--if
end--function lfdshowvar
------------------------------------------------------------------------
---- MATH FUNCTIONS [E] ----
------------------------------------------------------------------------
local function mathmod (xdividendo, xdivisoro)
local resultmod = 0 -- the MOD operator is "%", whereas a bitwise AND operator is lacking
resultmod = xdividendo % xdivisoro
return resultmod
end--function mathmod
------------------------------------------------------------------------
---- LOW LEVEL STRING FUNCTIONS [G] ----
------------------------------------------------------------------------
-- Local function LFGSTRINGRANGE
local function lfgstringrange (varvictim, nummini, nummaxi)
local nummylengthofstr = 0
local booveryvalid = false -- preASSume guilt
if (type(varvictim)=='string') then
nummylengthofstr = string.len(varvictim)
booveryvalid = ((nummylengthofstr>=nummini) and (nummylengthofstr<=nummaxi))
end--if
return booveryvalid
end--function lfgstringrange
------------------------------------------------------------------------
local function lfgtestuc (numkode)
local booupperc = false
booupperc = ((numkode>=65) and (numkode<=90))
return booupperc
end--function lfgtestuc
local function lfgtestlc (numcode)
local boolowerc = false
boolowerc = ((numcode>=97) and (numcode<=122))
return boolowerc
end--function lfgtestlc
------------------------------------------------------------------------
---- UTF8 FUNCTIONS [U] ----
------------------------------------------------------------------------
-- Local function LFULNUTF8CHAR
-- Evaluate length of a single UTF8 char in octet:s.
-- Input : * numbgoctet -- beginning octet of a UTF8 char
-- Output : * numlen1234x -- unit octet, number 1...4, or ZERO if invalid
-- Does NOT thoroughly check the validity, looks at ONE octet only.
local function lfulnutf8char (numbgoctet)
local numlen1234x = 0
if (numbgoctet<128) then
numlen1234x = 1 -- $00...$7F -- ANSI/ASCII
end--if
if ((numbgoctet>=194) and (numbgoctet<=223)) then
numlen1234x = 2 -- $C2 to $DF
end--if
if ((numbgoctet>=224) and (numbgoctet<=239)) then
numlen1234x = 3 -- $E0 to $EF
end--if
if ((numbgoctet>=240) and (numbgoctet<=244)) then
numlen1234x = 4 -- $F0 to $F4
end--if
return numlen1234x
end--function lfulnutf8char
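--[==[
Usage sketch, wrapped in a comment so that the module itself stays
unchanged: counting the char:s of a UTF8 string by looking at beginning
octets only. Both helpers are hypothetical, standalone restatements for
illustration; like the function above they do NOT thoroughly check the
validity (continuation octets are skipped unseen).

```lua
-- Hypothetical standalone copy of the range test above
local function utf8leadlen (numoctet)
  if (numoctet < 128) then return 1 end
  if ((numoctet >= 194) and (numoctet <= 223)) then return 2 end
  if ((numoctet >= 224) and (numoctet <= 239)) then return 3 end
  if ((numoctet >= 240) and (numoctet <= 244)) then return 4 end
  return 0 -- continuation octet or invalid
end

-- Hypothetical helper: number of char:s, or nil on invalid beginning octet
local function countchars (strin)
  local numpos = 1 -- ONE-based
  local numcount = 0
  while (numpos <= string.len(strin)) do
    local numlen = utf8leadlen (string.byte(strin, numpos))
    if (numlen == 0) then
      return nil
    end
    numpos = numpos + numlen
    numcount = numcount + 1
  end
  return numcount
end

local strsxa = string.char(0xC5,0x9D) .. 'ablono' -- 2 octet:s + 6 octet:s
print (string.len(strsxa), countchars(strsxa)) -- 8 octet:s but 7 char:s
```
]==]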
------------------------------------------------------------------------
-- Local function LFUSEARCHCASEPAIR
-- Search pair of undecoded values (UPPER, lower) based on ONE
-- value (unrolled loop).
-- Input : * strmyset -- "eo" or "sv" or "GENE" (no "ASCII" here) !!!FIXME!!! GENE does NOT work yet
-- * num16valin -- undecoded UINT16BE value, $C200... $DEFF
-- Output : * num16upper, num16lower -- ZERO if nothing found
-- Called from "lfucomparecibegin" and "lficaseadvjaem".
local function lfusearchcasepair (strmyset, num16valin)
local num16upper = 0
local num16lower = 0
local function xxxdicompare (xxxupper, xxxlower) -- only 3 upvalues used
local xxxfound = false
if ((num16valin==xxxupper) or (num16valin==xxxlower)) then
num16upper = xxxupper
num16lower = xxxlower
xxxfound = true
end--if
return xxxfound -- "true" lets the caller leave the fake loop on a hit
end--function xxxdicompare
while true do -- fake loop
if (strmyset~='eo') then
break -- do NOT check
end--if
if (xxxdicompare(0xC488,0xC489)) then
break -- found it
end--if
if (xxxdicompare(0xC49C,0xC49D)) then
break -- found it
end--if
if (xxxdicompare(0xC4A4,0xC4A5)) then
break -- found it
end--if
if (xxxdicompare(0xC4B4,0xC4B5)) then
break -- found it
end--if
if (xxxdicompare(0xC59C,0xC59D)) then
break -- found it
end--if
if (xxxdicompare(0xC5AC,0xC5AD)) then
break -- found it
end--if
break -- finally
end--while -- fake loop
while true do -- fake loop
if (strmyset~='sv') then
break -- do NOT check
end--if
if (xxxdicompare(0xC384,0xC3A4)) then
break -- found it
end--if
if (xxxdicompare(0xC385,0xC3A5)) then
break -- found it
end--if
if (xxxdicompare(0xC389,0xC3A9)) then
break -- found it
end--if
if (xxxdicompare(0xC396,0xC3B6)) then
break -- found it
end--if
break -- finally
end--while -- fake loop
return num16upper, num16lower
end--function lfusearchcasepair
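--[==[
Alternative sketch, wrapped in a comment so that the module itself stays
unchanged: the same lookup driven by a table instead of an unrolled loop.
The names "tabcasepairs" and "searchcasepair" are hypothetical; the pair
values are the ones used above.

```lua
-- Pairs of undecoded UINT16BE values { UPPER, lower } per charset
local tabcasepairs = {
  eo = { {0xC488,0xC489}, {0xC49C,0xC49D}, {0xC4A4,0xC4A5},
         {0xC4B4,0xC4B5}, {0xC59C,0xC59D}, {0xC5AC,0xC5AD} },
  sv = { {0xC384,0xC3A4}, {0xC385,0xC3A5}, {0xC389,0xC3A9}, {0xC396,0xC3B6} }
}

-- Hypothetical helper: returns UPPER and lower, or ZERO and ZERO if nothing found
local function searchcasepair (strset, numvalin)
  for _, tabpair in ipairs(tabcasepairs[strset] or {}) do
    if ((numvalin == tabpair[1]) or (numvalin == tabpair[2])) then
      return tabpair[1], tabpair[2]
    end
  end
  return 0, 0
end

print (string.format ('%04X %04X', searchcasepair ('eo', 0xC59D))) -- C59C C59D
```

An unknown charset or an unknown value gives ZERO and ZERO, same as above.
]==]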
------------------------------------------------------------------------
-- Local function LFUTRISTLETR
-- Evaluate char to tristate result (no letter vs uppercase letter
-- vs lowercase letter) within defined charset (ASCII + selectable
-- extra subset of UTF8).
-- Input : * strsel4set : "ASCII" (default, empty string or type "nil"
-- will do too) "eo" "sv" (value "GENE" NOT here) !!!FIXME!!! is GENE supposed to be supp or not ??
-- * strin4trist : single unicode char (1 or 2 octet:s) or
-- longer string
-- Output : * numtype4x : 0 no letter or invalid UTF8 -- 1 upper -- 2 lower
-- Depends on functions : (this is LFUTRISTLETR)
-- [U] lfulnutf8char lfusearchcasepair
-- [G] lfgtestuc lfgtestlc
-- Possible further char:s or fragments of such are disregarded, the
-- question answered is "Is there one uppercase or lowercase letter
-- available at begin?".
local function lfutristletr (strsel4set, strin4trist)
local numtype4x = 0 -- final result to be returned, preASSume invalid
local numlong4den = 0 -- actual length of input string
local numlong4bor = 0 -- expected length of single char
local numcha4r = 0 -- UINT8 beginning char
local numcha4s = 0 -- UINT8 later char (BIG ENDIAN, lower value here above)
local numcxa4unde = 0
local numw4upper = 0
local numw4lower = 0
while true do -- fake loop -- this is LFUTRISTLETR
numlong4den = string.len (strin4trist)
if (numlong4den==0) then
break -- bad string length
end--if
numcha4r = string.byte (strin4trist,1,1)
numlong4bor = lfulnutf8char(numcha4r)
if ((numlong4bor==0) or (numlong4den<numlong4bor)) then
break -- truncated char or invalid
end--if
if (numlong4bor==1) then
if (lfgtestuc(numcha4r)) then
numtype4x = 1
end--if
if (lfgtestlc(numcha4r)) then
numtype4x = 2
end--if
break -- success ASCII
end--if
if (numlong4bor==2) then
numcha4s = string.byte (strin4trist,2,2) -- only $80 to $BF cannot ovrfl
numcxa4unde = numcha4r * 256 + numcha4s -- UINT16BE
numw4upper, numw4lower = lfusearchcasepair (strsel4set,numcxa4unde)
if (numcxa4unde==numw4upper) then
numtype4x = 1
end--if
if (numcxa4unde==numw4lower) then
numtype4x = 2
end--if
end--if
break -- finally
end--while -- fake loop -- join mark
return numtype4x
end--function lfutristletr
------------------------------------------------------------------------
---- HIGH LEVEL STRING FUNCTIONS [I] ----
------------------------------------------------------------------------
-- Local function LFIDEBRACKET
-- Separate bracketed part of a string and return the inner or outer
-- part. On failure the string is returned complete and unchanged.
-- There must be exactly ONE "(" and exactly ONE ")" in correct order.
-- Input : * strde31br, boooutside
-- * numxminlencz -- minimal length of inner part, must be >= 1 !!!
-- Note that for length of hit ZERO ie "()" we have "begg" + 1 = "endd"
-- and for length of hit ONE ie "(x)" we have "begg" + 2 = "endd".
-- Example: "crap (NO)" -> len = 9
-- 123456789
-- "begg" = 6 and "endd" = 9
-- Expected result: "NO" or "crap " (note the trailing space)
-- Example: "(XX) YES" -> len = 8
-- 12345678
-- "begg" = 1 and "endd" = 4
-- Expected result: "XX" or " YES" (note the leading space)
local function lfidebracket (strde31br, boooutside, numxminlencz)
local numindoux = 1 -- ONE-based
local numdlong = 0
local num31wesel = 0
local numbegg = 0 -- ONE-based, ZERO invalid
local numendd = 0 -- ONE-based, ZERO invalid
numdlong = string.len (strde31br)
while true do
if (numindoux>numdlong) then
break -- ONE-based -- if both "numbegg" "numendd" non-ZERO then maybe
end--if
num31wesel = string.byte(strde31br,numindoux,numindoux)
if (num31wesel==40) then -- "("
if (numbegg==0) then
numbegg = numindoux -- pos of "("
else
numbegg = 0
break -- damn: more than 1 "(" present
end--if
end--if
if (num31wesel==41) then -- ")"
if ((numendd==0) and (numbegg~=0) and ((numbegg+numxminlencz)<numindoux)) then
numendd = numindoux -- pos of ")"
else
numendd = 0
break -- damn: more than 1 ")" present or ")" precedes "("
end--if
end--if
numindoux = numindoux + 1
end--while
if ((numbegg~=0) and (numendd~=0)) then
if (boooutside) then
strde31br = string.sub(strde31br,1,(numbegg-1)) .. string.sub(strde31br,(numendd+1),numdlong)
else
strde31br = string.sub(strde31br,(numbegg+1),(numendd-1)) -- separate substring
end--if
end--if
return strde31br -- same string variable
end--function lfidebracket
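--[==[
Alternative sketch, wrapped in a comment so that the module itself stays
unchanged: the two examples above reproduced with a single Lua pattern.
The helper "splitbracket" is hypothetical; unlike the function above it
returns both parts at once, returns "nil" on failure, and has the minimal
inner length hardwired to 1.

```lua
-- Hypothetical helper: exactly ONE "(" and ONE ")" in correct order required
local function splitbracket (strin)
  local strbefore, strinner, strafter = string.match (strin, '^([^()]*)%(([^()]+)%)([^()]*)$')
  if (not strinner) then
    return nil -- no bracket, several brackets, wrong order or empty inner part
  end
  return strinner, (strbefore .. strafter)
end

print (splitbracket ('crap (NO)'))  -- inner "NO", outer "crap " (trailing space)
print (splitbracket ('(XX) YES'))   -- inner "XX", outer " YES" (leading space)
print (splitbracket ('a (b) (c)'))  -- nil
```
]==]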
------------------------------------------------------------------------
-- Local function LFIKATALDIGU
-- Brew cat insertion (no extra colon ":") or link to
-- appendix from 3 elements.
local function lfikataldigu (strprefixx, strkataldnomo, strhintvisi)
local strrbkma = ''
if (type(strhintvisi)=='string') then
strrbkma = '[[' .. strprefixx .. ':' .. strkataldnomo .. '|' .. strhintvisi .. ']]'
else
strrbkma = '[[' .. strprefixx .. ':' .. strkataldnomo .. ']]'
end--if
return strrbkma
end--function lfikataldigu
------------------------------------------------------------------------
---- HIGH LEVEL FUNCTIONS [H] ----
------------------------------------------------------------------------
-- Local function LFHVALIDATEROOT !!!FIXME!!! incomplete
-- Depends on functions :
-- [U] lfulnutf8char lfusearchcasepair lfutristletr
-- [G] lfgtestuc lfgtestlc
-- #E22 len -- #E23 char -- #E24 dash !!!FIXME!!! incomplete
local function lfhvalidateroot (strcrassroot)
return 0
end--function lfhvalidateroot
------------------------------------------------------------------------
-- Local function LFHFINDROOT
-- Output : * numstattus -- ZERO NOT found | -1 found | >=1 error code
-- * stroneroot -- villain on #E22 ... #E26 only
-- Depends on functions :
-- [H] lfhvalidateroot
-- [U] lfulnutf8char lfusearchcasepair lfutristletr
-- [G] lfgtestuc lfgtestlc
-- #E01 inte -- #E21 spaces -- #E22 len -- #E23 char -- #E24 dash -- #E26 dupe !!!FIXME!!! incomplete
local function lfhfindroot (strpickedline, strinkommenrot)
local varoden = 0
local tablistwithrootsnodupe = {} -- reset on every line
local stroneroot = ''
local numstattus = 0 -- preASSume NOT found
local numstakkus = 0
local numlinnlen = 0
local numrootiex = 1 -- ONE-based read octet position in line
strpickedline = strpickedline .. ' ' -- must have always a termination space
numlinnlen = string.len(strpickedline) -- now at least 3
while true do -- loop over roots in the line
if (numrootiex>=numlinnlen) then
break -- no chance for a root anymore
end--if
varoden = string.find(strpickedline,' ',numrootiex,true)
if (type(varoden)~='number') then
numstattus = 1 -- #E01 internal
break
end--if
if (numrootiex>=varoden) then
numstattus = 21 -- #E21 spaces or empty root
break
end--if
stroneroot = string.sub(strpickedline,numrootiex,(varoden-1)) -- avoid the space
numrootiex = varoden + 1 -- later look for next root here
numstakkus = lfhvalidateroot (stroneroot)
if (numstakkus~=0) then
numstattus = numstakkus
break -- abort search on #E22 len -- #E23 char -- #E24 dash
end--if
if (tablistwithrootsnodupe[stroneroot]) then
numstattus = 26 -- #E26 dupe root
break -- abort search on dupe
end--if
tablistwithrootsnodupe[stroneroot] = true -- we can't do this twice !!!
if (stroneroot==strinkommenrot) then -- !!! HERE WE GOT A HIT !!!
numstattus = -1 -- do NOT abort search, still risk of error
end--if
end--while -- loop over roots in the line
if ((numstattus<22) or (numstattus>26)) then -- #E22 #E23 #E24 #E26
stroneroot = '' -- do NOT return any f**king root
end--if
return numstattus, stroneroot
end--function lfhfindroot
------------------------------------------------------------------------
---- VARIABLES [R] ----
------------------------------------------------------------------------
function exporttable.ek (arxframent)
-- general unknown type
local vartymp = 0 -- temp variable without type
local varret = 0 -- final result string or table
-- special type "args" AKA "arx"
local arxourown = 0 -- metaized "args" from our own "frame"
local arxexxtra = 0 -- for methods via mw.getCurrentFrame()
-- general table
local tabinput = {} -- all non-empty lines from template
local taballheadingsnodupe = {}
local tabheadingshit = {}
-- general str
local strinrootin = '' -- from named 'in='
local strtext = '' -- huge string from the source template
local strtypofrut = '' -- from "numtyperoot" converted to string
local strguiltyhead = ''
local strguiltyroot = ''
local strtymp = ''
local strvisgud = '' -- visible good output
local strvisred = '' -- reduced visible good output for output table
local strinvkat = '' -- invisible category part
local strviserr = '' -- visible error message
local strtrakat = '' -- invisible tracking categories
-- general num
local numlung = 0
local numerr = 0 -- 0 OK | 1 internal | 2 and 3 "in=" bad | 4 template bad ...
local numdcba = 0
local numtypeinpu = 0 -- 0 raw | 1 new | 2 leg
local numtyperoot = 0 -- 0 unknown | (67 C) 73 I 77 M 78 N 80 P 85 U
local numlines3mi = 0 -- number of lines from template
local numhitshits = 0 -- number of hits (max ONE per line)
-- general boo
local boonocat = false -- from "nocat=true"
local boogivet = false -- from "givetable=true"
---- WELL ---
lfdtracemsg ('This is "mrooteo", requested "detrc" report')
---- GET THE ARX (OUR OWN) ----
-- must be seized independently on "numerr" even if we already suck
arxourown = arxframent.args -- "args" from our own "frame"
if (type(arxourown)~='table') then
arxourown = {} -- guard against indexing error
numerr = 1 -- #E01 internal
end--if
---- SEIZE ONE NAMED AND OBLIGATORY PARAM ----
strinrootin = ''
if (numerr==0) then
vartymp = arxourown['in']
if (lfgstringrange(vartymp,1,200)) then -- empty or too long ignored
strinrootin = vartymp
end--if
if (strinrootin=='') then
numerr = 2 -- #E02 missing or bad length
else
if ((string.find(strinrootin,":",1,true)) or (string.find(strinrootin,"_",1,true))) then
numerr = 2 -- #E02 colon or underscore
end--if
end--if (strinrootin=='') else
end--if
lfdshowvar (numerr,'numerr','after seizure of named param')
lfdshowvar (constrtemplate,'constrtemplate')
---- PROCESS 1 SPECIAL AND 2 HIDDEN NAMED PARAMS ----
-- "detrc=" and "nocat=" must be seized independently on "numerr"
-- even if we already suck, but type "table" must be ensured above !!!
boogivet = (arxourown['givetable']=='true')
boonocat = (arxourown['nocat']=='true')
if (arxourown["detrc"]=="true") then
lfdtracemsg ('Param "detrc=true" seized')
else
qboodetrc = false -- was preassigned to "true"
qstrtrace = '' -- shut up now
end--if
lfdshowvar (numerr,'numerr','done with special&hidden params')
lfdshowvar (boogivet,'boogivet')
lfdshowvar (boonocat,'boonocat')
---- FIND OUT THE TYPE OF INPUT AND ROOT ----
-- "Vorto -eo- enhavanta morfemon N (kapt)"
-- "Postfiksajxo ar'" -- "Radiko ide'" -- "Finajxo as"
-- "Memstara elemento da" -- "Liternomo co" -- "Antauxfiksajxo fi'"
-- note that minimal length of "Postfiksajxo" is 1 due to "-i-" and "-t-"
-- apo:s in the legacy patterns are troublesome in many ways, stupid
-- {{PAGENAME}} encodes apo to "&#39;" and we must use "mw.text.decode"
numtypeinpu = 0 -- 0 raw | 1 new | 2 leg
numtyperoot = 0 -- 0 unknown | (67 C) 73 I 77 M 78 N 80 P 85 U
if (numerr==0) then
vartymp = string.find(strinrootin," ",1,true) -- space means more than bare
if (type(vartymp)=="number") then
if (string.byte(strinrootin,-1,-1)==41) then -- ")"
numtypeinpu = 1 -- new
else
numtypeinpu = 2 -- leg
end--if
end--if
lfdshowvar (numtypeinpu,'numtypeinpu')
end--if
if ((numerr==0) and (numtypeinpu==1)) then -- new
strtymp = lfidebracket(strinrootin,true,2) -- extract outer packaging
while true do -- fake loop
if (string.len(strtymp)~=32) then
numerr = 3 -- #E03
break -- to join mark
end--if
if (string.sub(strtymp,1,30)~="Vorto -eo- enhavanta morfemon ") then
numerr = 3 -- #E03
break -- to join mark
end--if
if (string.byte(strtymp,32,32)~=32) then
numerr = 3 -- #E03
break -- to join mark
end--if
numtyperoot = string.byte(strtymp,31,31)
if ((numtyperoot~=73) and (numtyperoot~=77) and (numtyperoot~=78) and (numtyperoot~=80) and (numtyperoot~=85)) then
numerr = 3 -- #E03
break -- to join mark
end--if
strinrootin = lfidebracket(strinrootin,false,2) -- all OK -> extract root
break -- finally to join mark
end--while -- fake loop -- join mark
end--if ((numerr==0) and (numtypeinpu==1)) then
if ((numerr==0) and (numtypeinpu==2)) then -- leg
strinrootin = mw.text.decode(strinrootin) -- fix possible apo
while true do -- fake loop
numlung = string.len(strinrootin)
if (numlung>=15) then
if (string.sub(strinrootin,1,9)=="Postfiksa") then
strinrootin = string.sub(strinrootin,14,-2) -- cut off apo too
strinrootin = '-' .. strinrootin .. '-'
numtyperoot = 73 -- "I" minimal length 1 (dubious "-i-" "-t-")
break -- to join mark -- success
end--if
end--if
if (numlung>=10) then
if (string.sub(strinrootin,1,7)=="Radiko ") then
strinrootin = string.sub(strinrootin,8,-2) -- cut off apo too
numtyperoot = 78 -- "N" minimal length 2
break -- to join mark -- success
end--if
end--if
if (numlung>=9) then
if (string.sub(strinrootin,1,4)=="Fina") then
strinrootin = string.sub(strinrootin,9,-1) -- no apo here
numtyperoot = 85 -- "U" minimal length 1
strinrootin = '-' .. strinrootin
break -- to join mark -- success
end--if
end--if
if (numlung>=20) then
if (string.sub(strinrootin,1,18)=="Memstara elemento ") then
strinrootin = string.sub(strinrootin,19,-1) -- no apo here
numtyperoot = 77 -- "M" minimal length 2
break -- to join mark -- success
end--if
end--if
if (numlung>=11) then
if (string.sub(strinrootin,1,10)=="Liternomo ") then
strinrootin = string.sub(strinrootin,11,-1) -- no apo here
numtyperoot = 77 -- "M" minimal length 1
break -- to join mark -- success
end--if
end--if
if (numlung>=18) then
if (string.sub(strinrootin,1,4)=="Anta") then
strinrootin = string.sub(strinrootin,16,-2) -- cut off apo too
strinrootin = strinrootin .. '-'
numtyperoot = 80 -- "P" minimal length 2, nothing like "a-" in "abiotic" exists in -eo-
break -- to join mark -- success
end--if
end--if
numerr = 3 -- #E03 -- nothing found, invalid string legacy type fed in
break -- finally to join mark
end--while -- fake loop -- join mark
end--if ((numerr==0) and (numtypeinpu==2)) then
if (numerr==0) then
if (numtyperoot==0) then
strtypofrut = '??' -- unknown
else
strtypofrut = string.char(numtyperoot) -- 73 I 77 M 78 N 80 P 85 U
end--if
lfdshowvar (numerr,'numerr','done with root analysis')
lfdshowvar (numtyperoot,'numtyperoot')
lfdshowvar (strtypofrut,'strtypofrut')
lfdshowvar (strinrootin,'strinrootin','the isolated root')
end--if
---- CHECK WHETHER THE POINTED TEMPLATE EXISTS AT ALL AND EXPAND IT ----
-- "constrtemplate" is expected to hold the FULLPAGENAME of the template
-- note that "mw.text.unstrip" is required due to wiki headings in the text
-- note that we need a "frame" of our own when called from a module, so
-- pick it using "mw.getCurrentFrame()"
strtext = ''
if (numerr==0) then
arxexxtra = mw.getCurrentFrame()
vartymp = arxexxtra:callParserFunction ('#ifexist:'..constrtemplate,'1','0')
if (vartymp=='1') then
vartymp = arxexxtra:expandTemplate { title = constrtemplate }
if ((type(vartymp))=='string') then
strtext = mw.text.unstrip (vartymp) -- result may be empty
end--if
end--if (vartymp=='1') then
if (strtext=='') then
numerr = 4 -- #E04 template missing or empty
end--if
end--if
lfdshowvar (numerr,'numerr','done with template expansion')
lfdshowvar (strtext,'strtext')
---- COPY INTO TABLE STRIPPING OFF ALL BLANK LINES ----
-- * note that incoming "strtext" may be quasi-empty (empty lines and EOL:s)
-- * we always add an EOL to the text
if (numerr==0) then
do -- scope
local varkernel = 0
local strsingle = ''
local numsrclen = 0
local numsrcpos = 1 -- ONE based
local numtabinx = 0 -- ONE based -- INC-before-write
strtext = strtext .. string.char(10)
numsrclen = string.len(strtext) -- at least 2
while true do
if (numsrcpos>=numsrclen) then
break -- no chance for a non-empty line anymore
end--if
varkernel = string.find(strtext,string.char(10),numsrcpos,true)
if (type(varkernel)~="number") then
numerr = 1 -- #E01 internal
break -- should never happen since an LF was appended above
end--if
strsingle = ''
if (numsrcpos<varkernel) then
strsingle = string.sub(strtext,numsrcpos,(varkernel-1)) -- omit the LF
end--if
numsrcpos = varkernel + 1
if (strsingle~='') then
if (string.len(strsingle)==1) then
numerr = 5 -- #E05 empty or bad line -- too short (<2)
break
end--if
if ((string.byte(strsingle,1,1)==32) or (string.byte(strsingle,-1,-1)==32)) then
numerr = 5 -- #E05 empty or bad line -- leading or trailing space
break
end--if
numtabinx = numtabinx + 1
tabinput[numtabinx] = strsingle
end--if
end--while
numlines3mi = numtabinx
end--do scope
if ((numlines3mi<3) or (mathmod(numlines3mi,2)==1)) then
numerr = 16 -- #E16 -- fewer than 3 lines or odd count, so in practice at least 4 lines
end--if
end--if
lfdshowvar (numerr,'numerr','done with tabelization')
lfdshowvar (numlines3mi,'numlines3mi','at least 4 lines required and must be even')
---- CORE HARD WORK WALKING THROUGH OUR TABLE AND PROCESSING LINES ----
if (numerr==0) then
do -- scope
local strmajlajn = ''
local strprevhed = '' -- stripped previous heading used for hit & err
local numindelx = 1 -- ONE-based read index in table
local numhitindx = 0 -- number of found hits -- INC-before-write
local numstaatus = 0
local boonowhead = true -- on "false" it is list with roots
while true do -- loop over alternating lines
if (numindelx>numlines3mi) then
break -- done
end--if
strmajlajn = tabinput [numindelx] -- pick line
if ((string.byte(strmajlajn,1,1)==61)~=boonowhead) then -- "="
numerr = 17 -- #E17 overall pattern bad (alternating lines needed)
break
end--if
if (boonowhead) then
if (string.len(strmajlajn)<7) then
numerr = 19 -- #E19 heading too short
break
end--if
if ((string.sub(strmajlajn,1,3)~="== ") or (string.sub(strmajlajn,-3,-1)~=" ==")) then
numerr = 19 -- #E19 bad equal signs
break
end--if
strmajlajn = string.sub(strmajlajn,4,-4) -- at least ONE char left
if (taballheadingsnodupe[strmajlajn]) then
strguiltyhead = strmajlajn
numerr = 20 -- #E20 dupe heading
break
end--if
taballheadingsnodupe[strmajlajn] = true -- we can't do this twice !!!
strprevhed = strmajlajn -- maybe we will need it later
else
numstaatus, strguiltyroot = lfhfindroot(strmajlajn,strinrootin)
if (numstaatus>=1) then
numerr = numstaatus -- positive is an error code, -1 (hit) and ZERO are both OK
strguiltyhead = strprevhed
break -- CRAP
end--if
if (numstaatus==-1) then -- only ONE hit in ONE line is legal
numhitindx = numhitindx + 1
tabheadingshit [numhitindx] = strprevhed -- do NOT abort
end--if
end--if (boonowhead) else
numindelx = numindelx + 1 -- ONE-based read index in table
boonowhead = not boonowhead -- alternate
end--while -- loop over alternating lines
numhitshits = numhitindx -- copy to more durable variable
end--do scope
end--if (numerr==0) then
lfdshowvar (numerr,'numerr','done with processing lines')
lfdshowvar (tabheadingshit,'tabheadingshit')
lfdshowvar (numhitshits,'numhitshits')
---- BREW VISIBLE TEXT ----
-- "strinrootin" contains the raw morpheme or word but dashes can persist
-- based on "tabheadingshit" and "numhitshits"
-- here we fill "strvisred" and "strvisgud"
if (numerr==0) then
if (numhitshits==0) then
strvisred = contabvisi ['netr'] -- "ne troveblas" ...
else
strvisred = contabvisi [2] -- "troveblas en " follows list
numdcba = 1
while true do -- at least ONE iteration
if (numdcba>numhitshits) then
break
end--if
if (numdcba~=1) then
strvisred = strvisred .. ' kaj '
end--if
strvisred = strvisred .. tabheadingshit [numdcba]
numdcba = numdcba + 1
end--while
end--if (numhitshits==0) else
strvisgud = contabvisi [0] .. strinrootin .. contabvisi [1] .. strtypofrut .. ') ' .. strvisred .. '.'
end--if (numerr==0) then
---- BREW ROOT CAT:S ----
-- "strinrootin" contains the raw morpheme or word but dashes can persist
-- based on "tabheadingshit" and "numhitshits"
-- all will receive the same sorting hint ... if the morpheme or word
-- begins with a dash then remove it so that "-o" falls under "O" not
-- under "-" ... still keep possible irrelevant trailing dash
if ((numerr==0) and (not boonocat)) then
do -- scope
local strcathint = ''
local strfoundheading = ''
strcathint = strinrootin
if (string.byte(strcathint,1,1)==45) then
strcathint = string.sub (strcathint,2,-1) -- remove Boulder Dash
end--if
if (numhitshits==0) then
strinvkat = lfikataldigu('Kategorio',contabcats [0],strcathint)
else
strinvkat = lfikataldigu('Kategorio',contabcats [1],strcathint)
numdcba = 1
while true do -- at least ONE iteration
if (numdcba>numhitshits) then
break
end--if
strfoundheading = tabheadingshit [numdcba]
vartymp = contabcats [strfoundheading] -- risk of type "nil"
if (type(vartymp)=='string') then
strfoundheading = vartymp -- use the translated one
end--if
strinvkat = strinvkat .. lfikataldigu('Kategorio',strfoundheading,strcathint)
numdcba = numdcba + 1
end--while
end--if (numhitshits==0) else
end--do scope
end--if ((numerr==0) and (not boonocat)) then
---- BREW TRACKING CAT:S #E02...#E99 ----
-- no tracking cat:s for #E01
-- "nocat=true" suppresses even tracking cat:s
if ((numerr>1) and (not boonocat)) then
strtrakat = lfikataldigu('Kategorio','Eraro (mrooteo)') -- !!!FIXME!!!
end--if
---- WHINE IF YOU MUST #E01...#E99 ----
if (numerr~=0) then
strviserr = '<br><b>Eraro #E' .. tostring(numerr) .. '</b>' -- !!!FIXME!!!
end--if
if (numerr==20) then -- #E20
strviserr = strviserr .. '<br>Dupe heading "' .. strguiltyhead .. '".'
end--if
if (numerr==21) then -- #E21
strviserr = strviserr .. '<br>Root list bad char (other than eo lowercase)' -- !!!FIXME!!! NOT yet detected
end--if
if (numerr==26) then -- #E26
strviserr = strviserr .. '<br>Dupe in section "' .. strguiltyhead .. '" with root "' .. strguiltyroot .. '".'
end--if
---- RETURN THE STRING OR INNER TABLE ----
-- on #E02 and higher we risk partial results in "strvisgud" and "strinvkat"
lfdtracemsg ('Ready to return string glued together from 1 + 1 + 4 parts or table')
lfdshowvar (strvisgud,'strvisgud')
lfdshowvar (strvisred,'strvisred','table only')
lfdshowvar (strinvkat,'strinvkat')
lfdshowvar (strviserr,'strviserr')
lfdshowvar (strtrakat,'strtrakat')
if (boogivet) then
varret = { [0]=numerr, strvisgud, strinvkat, strvisred, qstrtrace }
else
if (numerr==0) then
varret = strvisgud .. strinvkat
else
varret = strviserr .. strtrakat
end--if
if (qboodetrc) then -- "qstrtrace" declared separately outside main function
varret = "<br>" .. qstrtrace .. "<br><br>" .. varret
end--if
end--if
return varret
end--function
---- RETURN THE JUNK OUTER TABLE ----
return exporttable