Modulo:mtbllingvoj
--[===[
MODULE "MTMPLLOADDATA" (multiple private template to load data)
"eo.wiktionary.org/wiki/Modulo:mtbllingvoj" <!--2022-Jan-21-->
"id.wiktionary.org/wiki/Modul:mtblbahasa"
Purpose: translate (the transcludable part of) the wikitext of a template
         (name is hardcoded: "tbllingvoj" on eo.wiktionary, "tblbahasa" on
         id.wiktionary) into 4 LUA tables that can be used repeatedly via
         the infamous "mw.loadData" command
Incoming: * nothing (imported via "mw.loadData", no ordinary caller)
Returned: * LUA table containing 2 ... 4 inner LUA tables and up to
            7 items of status data :
            * [0] (status, integer) status code
            * [1] (status, integer) full size in octets of the source text
                                    from template
            * [2] (status, integer) size in octets of the source text from
                                    template after removing both outer areas
                                    but still keeping all excessive
                                    whitespace between
            * [3] (status, integer) octet position of error (excessive
                                    whitespace does count, two outer areas
                                    don't)
            * [4] (status, integer) number of done lines, or number of line
                                    where an error occurred (empty lines do
                                    NOT count)
            * [5] (status, string) error string on some
                                   errors, otherwise empty string
                  * on #E06 raw complete faulty line
                  * on #E07 early participant in sorting crime
                  * on #E08...#E10 "FORWARD DUPE" "REVERSE DUPE" "EXTRA DUPE"
            * [6] (status, string) error string on some
                                   errors, otherwise empty string
                  * on #E06 report with string len and earliest and last
                    char and reverse and extra settings
                  * on #E07 latter participant in sorting crime
                  * on #E08...#E10 offending dupe string
            * [7] (main, table) list table with key/index ZERO-based
                                and value "cy"
            * [8] (main, table) forward table with key/index "cy"
                                and value complete line
            * [9] (main, table) reverse table with key/index "c0"
                                and value "cy"
            * [10] (main, table) extra table with key/index "c1"
                                 and value "cy"
            (tables [7]...[10] are returned only with status #E00)
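A minimal sketch of how a caller might consume the returned table, assuming
the layout listed above. The helper name "lfgetline" and the sample data are
hypothetical; inside Scribunto the table would come from
mw.loadData ("Module:mtbllingvoj") rather than from the literal used here.

```lua
-- Hypothetical helper: return the complete line stored for "strcy",
-- or nil plus a short status string if the data is unusable
local function lfgetline (tabdata, strcy)
  if (type(tabdata)~='table') or (type(tabdata[0])~='number') then
    return nil, 'no data'
  end--if
  if (tabdata[0]~=0) then
    -- the main tables are absent on error, report the status code only
    return nil, string.format ('#E%02d', tabdata[0])
  end--if
  local strline = tabdata[8][strcy] -- forward table, key/index "cy"
  if (not strline) then
    return nil, 'unknown code'
  end--if
  return strline
end--function lfgetline

-- Made-up sample standing in for the result of "mw.loadData"
local tabsample = {
  [0] = 0, [1] = 60, [2] = 40, [3] = 40, [4] = 2, [5] = '', [6] = '',
  [7] = { [0] = 'aa', [1] = 'bb' },
  [8] = { aa = '[[aa]] A-lingvo, Q1', bb = '[[bb]] B-lingvo, Q2' },
  [9] = { ['A-lingvo'] = 'aa', ['B-lingvo'] = 'bb' },
  [10] = { Q1 = 'aa', Q2 = 'bb' },
}

local strfound = lfgetline (tabsample, 'aa')
```

Checking [0] before touching [7]...[10] matters because this sketch assumes,
as the code further down suggests, that the main tables are attached only
when the status code is ZERO.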
Note that this module is NOT generic and CANNOT be made generic due to the
principle that data received by "mw.loadData" must be static and thus it is
not possible to submit parameters (namely name of the template to be seized)
to this module.
The name of this module is the name of the addressed template prefixed
by "m", for example "Template:tblgods" -> "Module:mtblgods".
It is permissible to read several templates by one module and merge the
content in order to facilitate editing, for example "Module:mtblgods" reads
"Template:tblgodsaf" + "Template:tblgodsgr" + "Template:tblgodssz".
Status codes:
* #E00: OK
* #E01: code reserved for caller, cannot occur here
* #E02: template not found
* #E03: bad template size, must be 10...1'000'000 octets
* #E04: failure removing two areas or needed "[[" not found
* #E05: bad length of line, must be 6...10'000 octets
* #E06: failed to extract elements "cy" and "c0" and "c1" from the CSV line
* #E07: sorting crime in "cy"
* #E08: forward dupe in "cy" (note that the sorting requirement makes
a forward dupe mostly impossible, since the issue is usually caught
by the sort check, but the earliest and last entry may be excluded
from sorting, thus we must check for dupes nevertheless)
* #E09: reverse dupe in "c0"
* #E10: extra dupe in "c1"
* #E11: bad number of lines, must be 2...10'000
Empty lines are trimmed off and multiple spaces are reduced to single ones.
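The whitespace rule above can be illustrated with a short pattern-based
sketch. It is NOT the code used below (which walks the text octet by octet);
the helper name "lfsplitlines" is hypothetical, and only LF (10) and
space (32) are treated as whitespace, as in the module.

```lua
-- Hypothetical helper: split text at LF, drop empty lines, trim each line
-- and reduce runs of spaces to a single space
local function lfsplitlines (strtext)
  local tablines = {}
  for strline in string.gmatch (strtext, "[^\n]+") do
    strline = string.gsub (strline, " +", " ")     -- reduce multiple spaces
    strline = string.match (strline, "^ ?(.-) ?$") -- trim remaining outer space
    if (strline~='') then
      table.insert (tablines, strline) -- ONE-based here, unlike "tbllist"
    end--if
  end--for
  return tablines
end--function lfsplitlines

local tabdemo = lfsplitlines ("  [[aa]]   A ,  Q1  \n\n   \n[[bb]] B, Q2")
```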
Note that the "#ifexist:" parser function used here expects a wikilink, not
a URL. This means, among other things, that percent-encoding CANNOT be used;
UTF8 is required.
]===]
------------------------------------------------------------------------
---- CONSTANTS [O] ----
------------------------------------------------------------------------
-- uncommentable constant strings EO vs ID
local construstm = string.char(0xC5,0x9C) .. "ablono:tbllingvoj" -- EO -- "SXablono"
-- local construstm = "Templat:tblbahasa" -- ID
-- control flags
local conboohavreverse = true -- change to "false" to disable reverse table
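local conboohavextra = true -- change to "false" to disable extra table
local conbootwocynoso = true -- "true" excludes earliest and last "cy" from sort check, change to "false" to enforce sort order for all "cy"
local conbooc1unique = false -- "false" allows multiple "-" in "c1" (they are not stored), change to "true" to store "-" too and thus require uniqueness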
------------------------------------------------------------------------
---- HIGH LEVEL FUNCTIONS [H] ----
------------------------------------------------------------------------
-- Local function LFHEXTRACT
-- Extract 3 substrings ("cy", "c0", "c1") from a CSV line beginning with the
-- "cy" part in double square brackets WITHOUT a following comma. Note that
-- excessive spaces have already been reduced, but an optional space before
-- and after every comma and after the "cy" part can still occur. There is
-- NO EOL char at the end.
-- Example of minimal imaginable line : "[[a]]0"
local function lfhextract (strline)
  local strcky = ''
  local strck0 = ''
  local strck1 = ''
  local strtump = ''
  local numpanjang = 0
  local numindxe = 0
  local numfas = 0 -- 0:cy -- 1:c0 -- 2:c1 -- 3:abort
  local numchch = 0
  local numchnx = 0
  local boocrap = false
  local boopopospace = false
  local boononempty = false
  numpanjang = string.len (strline)
  if (numpanjang<6) then
    boocrap = true
  else
    boocrap = (string.sub(strline,1,2)~="[[")
  end--if
  if (not boocrap) then
    numindxe = 2 -- ZERO-based and skipping "[["
    while (true) do
      if (numindxe==numpanjang) then
        break -- end of string, LF EOL not used here
      end--if
      numchnx = 0
      numchch = string.byte(strline,(numindxe+1),(numindxe+1)) -- pick
      numindxe = numindxe + 1
      if (numindxe<numpanjang) then
        numchnx = string.byte(strline,(numindxe+1),(numindxe+1)) -- prepeek
      end--if
      if ((numchch==44) or ((numchch==93) and (numchnx==93))) then -- "," or "]]"
        if (numchch~=44) then
          numindxe = numindxe + 1 -- skip both
        end--if
        numfas = numfas + 1
        if (numfas==3) then
          break
        end--if
        boopopospace = false -- !!!CRUCIAL!!!
        boononempty = false -- !!!CRUCIAL!!!
      else
        if (numchch==32) then
          boopopospace = boononempty -- here trim away leading spaces too
        else
          if (boopopospace) then
            strtump = string.char(32,numchch)
            boopopospace = false -- !!!CRUCIAL!!!
          else
            strtump = string.char(numchch)
          end--if
          if (numfas==0) then
            strcky = strcky .. strtump -- this is slow
          end--if
          if (numfas==1) then
            strck0 = strck0 .. strtump -- this is slow
          end--if
          if (numfas==2) then
            strck1 = strck1 .. strtump -- this is slow
          end--if
          boononempty = true -- !!!CRUCIAL!!!
        end--if (numchch==32) else
      end--if
    end--while
    if (strcky=='') then
      strck0 = '' -- broken "strcky" ruins "strck0" but NOT vice-versa
    end--if
  end--if
  return strcky, strck0, strck1
end--function lfhextract
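--[===[
For comparison, the same extraction could be sketched with LUA patterns
instead of the octet-by-octet loop. This is a hypothetical alternative, NOT
the code used by this module: the name "lfhextractpat" and the sample line
are assumptions, there is no minimum length check, and edge cases (for
example a comma inside the "cy" part, or an empty "cy") may be handled
differently from "lfhextract".

```lua
-- Hypothetical pattern-based alternative to "lfhextract"
local function lfhextractpat (strline)
  -- "cy" is the text between the leading "[[" and the earliest "]]"
  local strcy, strrest = string.match (strline, "^%[%[(.-)%]%](.*)$")
  if (not strcy) then
    return '', '', '' -- no leading "[[...]]" at all
  end--if
  -- "c0" and "c1" are the two earliest comma-separated fields after "cy"
  local strc0, strc1 = string.match (strrest, "^([^,]*),?([^,]*)")
  local function lftrim (strin) -- trim outer spaces, reduce inner ones
    return (string.gsub (string.match (strin, "^ *(.-) *$"), " +", " "))
  end--function lftrim
  return lftrim (strcy), lftrim (strc0), lftrim (strc1)
end--function lfhextractpat

local strdemocy, strdemoc0, strdemoc1 = lfhextractpat ("[[aa]] A-lingvo , Q1")
-- strdemocy is "aa", strdemoc0 is "A-lingvo", strdemoc1 is "Q1"
```

The loop above avoids patterns altogether, which keeps every octet under
explicit control; the pattern version is shorter but hides the trimming
rules inside two expressions.
]===]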
------------------------------------------------------------------------
---- VARIABLES [R] ----
------------------------------------------------------------------------
-- general table --
local tmplloaddata = {} -- outer table
local tbllist = {}
local tblforward = {}
local tblreverse = {}
local tblextra = {}
-- general unknown type
local vartmp = 0 -- variable without type
-- general type "frame"
local arxframent = 0
-- general str
local strbig = '' -- big string
local strdupearl = '' -- early participant in sort crime or ...
local strsortlat = '' -- latter participant in sort crime or ...
local strprevious = ''
local strcy = ''
local strc0 = ''
local strc1 = ''
local strttmp = '' -- temp
-- general num
local numerr = 0 -- 0: OK -- 2: template not found ... (see status codes above)
local numbiglen = 0 -- full
local numbeg = 0
local numend = 0
local numtrmlen = 0 -- after trimming away 2 areas but not more
local numtrmpos = 0
local numlnjlen = 0
local numchv = 0
local numchw = 0
local numcntlin = 0 -- processed lines
-- general boo
local boogotchar = false
local boopostpone = false
local boochecksort = false
------------------------------------------------------------------------
---- MAIN [Z] ----
------------------------------------------------------------------------
---- SEIZE THE INFAMOUS "FRAME" OBJECT (THERE IS NO ORDINARY CALLER) ----
arxframent = mw.getCurrentFrame () -- use this if no main function exists
---- CHECK WHETHER THE POINTED TEMPLATE EXISTS AT ALL & EXPAND IT IF SO ----
strttmp = arxframent:callParserFunction ('#ifexist:'..construstm,'1','0')
if (strttmp=='1') then
  vartmp = arxframent:expandTemplate { title = construstm }
  if ((type(vartmp))=='string') then
    strbig = vartmp -- may be empty
  end--if
else
  numerr = 2 -- #E02
end--if
---- CHECK LENGTH ----
if (numerr==0) then
  numbiglen = string.len(strbig)
  if ((numbiglen<10) or (numbiglen>1000000)) then
    numerr = 3 -- #E03
  end--if
end--if
---- TRIM AWAY TWO AREAS ----
if (numerr==0) then
  vartmp = 0 -- ONE-based or type "nil"
  numbeg = 0 -- ONE-based thus ZERO is invalid
  numend = 0 -- ONE-based thus ZERO is invalid
  while (true) do -- search for all "[["
    vartmp = string.find(strbig, "[[" , (vartmp+1), true)
    if (not vartmp) then
      break -- no more hit
    end--if
    if (numbeg==0) then
      numbeg = vartmp
    else
      numend = vartmp
    end--if
  end--while
  if ((numbeg==0) or (numend==0) or ((numbeg+5)>numend)) then -- HALF size
    numerr = 4 -- #E04 failure removing two areas
  else
    while (true) do -- search for next LF EOL
      if (numend>numbiglen) then -- "numend" is ONE-based
        break -- "numend" is ONE-based and OFF-BY-ONE now
      end--if
      numchw = string.byte(strbig,numend,numend) -- pick
      numend = numend + 1
      if (numchw==10) then
        break -- "numend" is ONE-based and OFF-BY-ONE now
      end--if
    end--while
    if ((numbeg+10)>numend) then
      numerr = 4 -- #E04 failure removing two areas
    else
      strbig = string.sub (strbig,numbeg,(numend-1))
      numtrmlen = string.len (strbig)
    end--if
  end--if ((numbeg==0) or (numend==0) or ((numbeg+5)>numend)) else
end--if (numerr==0) then
---- PARSE THE BIG TEXT TABLE AND FILL 2 ... 4 LUA TABLES ----
if (numerr==0) then
  strprevious = '' -- used to detect sorting crimes in "cy"
  numtrmpos = 0 -- ZERO-based: position in the text
  numcntlin = 0 -- ZERO-based: number of processed lines, index in "tbllist"
  while (true) do -- outer loop over all lines
    strttmp = ''
    boogotchar = false
    while (true) do -- upper inner loop to find a line
      if (numtrmpos>=numtrmlen) then
        break -- upper inner loop, EOF
      end--if
      numchw = string.byte(strbig,(numtrmpos+1),(numtrmpos+1)) -- pick
      numtrmpos = numtrmpos + 1
      if ((numchw~=10) and (numchw~=32)) then -- trim leading spaces too
        boogotchar = true
        break -- upper inner loop, found line
      end--if
    end--while -- upper inner
    if (not boogotchar) then
      break -- outer loop, EOF
    end--if
    boopostpone = false
    while (true) do -- inner loop to seize a line
      if (numchw==10) then
        break -- inner loop, captured an EOL, do NOT store it anywhere
      end--if
      if (numchw==32) then
        boopostpone = true -- no need to trim away leading spaces here
      else
        if (boopostpone) then
          strttmp = strttmp .. string.char(32,numchw)
          boopostpone = false -- !!!CRUCIAL!!!
        else
          strttmp = strttmp .. string.char(numchw)
        end--if
      end--if
      if (numtrmpos>=numtrmlen) then
        break -- inner loop, EOF
      end--if
      numchw = string.byte(strbig,(numtrmpos+1),(numtrmpos+1)) -- pick
      numtrmpos = numtrmpos + 1
    end--while
    numlnjlen = string.len(strttmp)
    if ((numlnjlen<6) or (numlnjlen>10000)) then
      numerr = 5 -- #E05 -- bad length of single line
      break -- outer loop
    end--if
    strcy, strc0, strc1 = lfhextract (strttmp) -- !!! 3 results !!!
    if ((strcy=='') or ((strc0=='') and conboohavreverse) or ((strc1=='') and conboohavextra)) then
      numchv = string.byte(strttmp,1,1)
      numchw = string.byte(strttmp,numlnjlen,numlnjlen)
      strdupearl = strttmp -- raw line
      strsortlat = "len=" .. tostring(numlnjlen) ..
                   " beg=" .. tostring(numchv) ..
                   " end=" .. tostring(numchw) ..
                   " rev=" .. tostring(conboohavreverse) ..
                   " ext=" .. tostring(conboohavextra)
      numerr = 6 -- #E06 -- failed to extract elements
      break -- outer loop
    end--if
    boochecksort = (not conbootwocynoso) or ((numtrmpos<numtrmlen) and (numcntlin>1))
    if (boochecksort and (strcy<=strprevious)) then
      strdupearl = strprevious
      strsortlat = strcy -- "strcy" should be bigger but is not :-(
      numerr = 7 -- #E07 -- sorting crime in "cy"
      break -- outer loop
    end--if
    if (tblforward [strcy]) then
      strdupearl = "FORWARD DUPE"
      strsortlat = strcy
      numerr = 8 -- #E08 -- forward dupe in "cy"
      break -- outer loop
    end--if
    tbllist [numcntlin] = strcy -- always store at key/index ZERO-based
    tblforward [strcy] = strttmp -- always store at key/index "cy"
    if (conboohavreverse) then
      vartmp = tblreverse [strc0]
      if (vartmp) then
        strdupearl = "REVERSE DUPE" -- absolutely prohibited
        strsortlat = vartmp
        numerr = 9 -- #E09 -- reverse dupe in "c0"
        break -- outer loop
      end--if
      tblreverse [strc0] = strcy -- store at key/index "c0"
    end--if
    if (conboohavextra) then
      vartmp = tblextra [strc1]
      if (vartmp) then
        strdupearl = "EXTRA DUPE" -- storing "-" may be skipped, thus no risk for dupe then
        strsortlat = vartmp
        numerr = 10 -- #E10 -- extra dupe in "c1"
        break -- outer loop
      end--if
      if ((strc1~="-") or (conbooc1unique)) then
        tblextra [strc1] = strcy -- conditionally store at key/index "c1"
      end--if
    end--if
    strprevious = strcy -- sorting is OBLIGATORY (with exception, see above)
    numcntlin = numcntlin + 1 -- successfully stored one line
  end--while -- outer loop over all lines
end--if (numerr==0) then
if (numerr==0) then
  if ((numcntlin<2) or (numcntlin>10000)) then
    numerr = 11 -- #E11 -- bad number of lines
  end--if
end--if
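--[===[
The sort check and the forward dupe check above can be studied in isolation
with a small sketch. The helper name "lfcheckkeys", its ONE-based input list
and the sample keys are hypothetical; the two returned numbers merely mimic
status codes #E00, #E07 and #E08 plus the ONE-based position. It shows why a
dupe check is still needed when the earliest and last entry are excluded
from the sort check.

```lua
-- Hypothetical helper: check that keys are strictly ascending and unique;
-- with "booexempt" the earliest and last key are excluded from the sort check
local function lfcheckkeys (tabkeys, booexempt)
  local tabseen = {}
  local strprev = ''
  local numcnt = #tabkeys
  for numidx = 1, numcnt do
    local strkey = tabkeys[numidx]
    -- comparing entry 2 against entry 1 is what would constrain the earliest
    -- entry, thus it is skipped too when "booexempt" is set
    local boochk = (not booexempt) or ((numidx>2) and (numidx<numcnt))
    if (boochk and (strkey<=strprev)) then
      return 7, numidx -- sorting crime
    end--if
    if (tabseen[strkey]) then
      return 8, numidx -- forward dupe
    end--if
    tabseen[strkey] = true
    strprev = strkey
  end--for
  return 0, numcnt
end--function lfcheckkeys

local numdemoa = lfcheckkeys ({ "zz", "aa", "bb", "cc", "--" }, true)
local numdemob = lfcheckkeys ({ "zz", "aa", "bb", "cc", "--" }, false)
local numdemoc = lfcheckkeys ({ "zz", "aa", "zz" }, true)
-- numdemoa is 0, numdemob is 7, numdemoc is 8
```
]===]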
---- RETURN THE JUNK ----
tmplloaddata[0] = numerr
tmplloaddata[1] = numbiglen
tmplloaddata[2] = numtrmlen
tmplloaddata[3] = numtrmpos -- equal "numtrmlen" on success
tmplloaddata[4] = numcntlin
tmplloaddata[5] = strdupearl
tmplloaddata[6] = strsortlat
if (numerr==0) then
  tmplloaddata[7] = tbllist -- always on success
  tmplloaddata[8] = tblforward -- always on success
  if (conboohavreverse) then
    tmplloaddata[9] = tblreverse -- can be disabled
  end--if
  if (conboohavextra) then
    tmplloaddata[10] = tblextra -- can be disabled
  end--if
end--if
return tmplloaddata -- only allowed data type is "table"