Modulo:mtbllingvoj


--[===[

MODULE "MTMPLLOADDATA" (multiple private template to load data)

"eo.wiktionary.org/wiki/Modulo:mtbllingvoj" <!--2022-Jan-21-->
"id.wiktionary.org/wiki/Modul:mtblbahasa"

Purpose: translate (the transcludable part of) the wikitext of a template
         (name is hardcoded, "tbllingvoj" on "eo" or "tblbahasa" on "id")
         to 2 ... 4 LUA tables that can be used repeatedly via the
         infamous "mw.loadData" command

Utilo: traduki na (la transkluzivigebla parto de) vikiteksto de sxablono
       (nomo fiksita al "tbllingvoj") al 4 LUA-tabeloj kiuj povas esti
       uzataj ripete per la famacxa ordono "mw.loadData"

Manfaat: menerjemahkan (bagian yang bisa mentransklusikan) wikiteks templat
         (namanya tetap "tblbahasa") menjadi 4 tabel LUA yang bisa digunakan
         beberapa kali melalui perintah "mw.loadData" yang terkenal buruk

Syfte: oeversaetta (den transkluderingsbara delen av) wikitext fraan en mall
       (namn fastslaget till "tbllingvoj") till 4 LUA-tabeller som kan
       anvaendas upprepade gaanger medelst det oekaenda
       kommandot "mw.loadData"

Incoming: * nothing (imported via "mw.loadData", no ordinary caller)

Returned: * LUA table containing 2 ... 4 inner LUA tables and up to
            7 items of status data :
            * [0] (status, integer) status code
            * [1] (status, integer) full size (in octets) of the source text
                  from the template
            * [2] (status, integer) size (in octets) of the source text after
                  removing both outer areas but still keeping all excessive
                  whitespace in between
            * [3] (status, integer) octet position of error (excessive
                  whitespace does count, two outer areas don't)
            * [4] (status, integer) number of done lines, or number of line
                  where an error occurred (empty lines do NOT count)
            * [5] (status, string) error string on some
                                   errors, otherwise empty string
                   * on #E06 raw complete faulty line
                   * on #E07 early participant in sorting crime
                   * on #E08...#E10 "FORWARD DUPE" "REVERSE DUPE" "EXTRA DUPE"
            * [6] (status, string) error string on some
                                   errors, otherwise empty string
                   * on #E06 report with string len and earliest and last
                             char and reverse and extra settings
                   * on #E07 latter participant in sorting crime
                   * on #E08 the duplicated "cy"
                   * on #E09...#E10 the "cy" of the earlier line that
                                    already holds the same "c0" or "c1"
            * [7] (main, table) list table with key/index ZERO-based
                                and value "cy"
            * [8] (main, table) forward table with key/index "cy"
                                and value complete line
            * [9] (main, table) reverse table with key/index "c0"
                                and value "cy"
            * [10] (main, table) extra table with key/index "c1"
                                 and value "cy"
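
A caller might unpack the returned table roughly as sketched below. This is
only an illustration: the table "sample" is hand-built to stand in for what
"mw.loadData" would deliver for this module, all numbers and lines in it are
invented, and the helper name "lookupbycy" is hypothetical.

```lua
-- Hand-built stand-in for the returned table (hypothetical sample data).
local sample = {
  [0] = 0,                          -- status code, 0 means #E00 (OK)
  [1] = 60, [2] = 40, [3] = 40,     -- sizes and position (invented numbers)
  [4] = 2,                          -- number of done lines
  [5] = "", [6] = "",               -- error strings, empty on success
  [7] = { [0] = "eo", [1] = "sv" },                   -- list table
  [8] = { eo = "[[eo]] Esperanto, epo",
          sv = "[[sv]] Sveda, swe" },                 -- forward, key "cy"
  [9] = { Esperanto = "eo", Sveda = "sv" },           -- reverse, key "c0"
  [10] = { epo = "eo", swe = "sv" },                  -- extra, key "c1"
}

-- Return the complete line stored for a "cy", or nil plus a reason.
local function lookupbycy (tabdata, strcy)
  if (tabdata[0] ~= 0) then
    return nil, "load failed with status " .. tostring (tabdata[0])
  end
  local strline = tabdata[8][strcy]
  if (not strline) then
    return nil, "unknown code " .. strcy
  end
  return strline
end

print (lookupbycy (sample, "eo"))
```

Checking [0] first matters because the inner tables [7] ... [10] are
absent whenever the status code is not ZERO.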

Note that this module is NOT generic and CANNOT be made generic due to the
principle that data received by "mw.loadData" must be static and thus it is
not possible to submit parameters (namely name of the template to be seized)
to this module.

The name of this module is the name of the addressed template prefixed
by "m", for example "Template:tblgods" -> "Module:mtblgods".

It is permissible to read several templates by one module and merge the
content in order to facilitate editing, for example "Module:mtblgods" reads
"Template:tblgodsaf" + "Template:tblgodsgr" + "Template:tblgodssz".

Status codes:
* #E00: OK
* #E01: code reserved for caller, cannot occur here
* #E02: template not found
* #E03: bad template size, must be 10...1'000'000 octets
* #E04: failure removing two areas or needed "[[" not found
* #E05: bad length of line, must be 6...10'000 octets
* #E06: failed to extract elements "cy" and "c0" and "c1" from the CSV line
* #E07: sorting crime in "cy"
* #E08: forward dupe in "cy" (note that the sorting requirement makes
        a forward dupe mostly impossible, the issue is usually caught
        by the sort check, but we allow the earliest and last entry to
        be excluded from sorting and thus must check for dupes nevertheless)
* #E09: reverse dupe in "c0"
* #E10: extra dupe in "c1"
* #E11: bad number of lines, must be 2...10'000
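
A caller that wants to show the status code in readable form could keep a
small lookup table like the one sketched here. The names "tabstatus" and
"describestatus" are hypothetical and the short texts are merely condensed
from the list above, they are not part of this module.

```lua
-- Short texts for the status codes listed above (wording is illustrative).
local tabstatus = {
  [0]  = "OK",
  [1]  = "reserved for caller",
  [2]  = "template not found",
  [3]  = "bad template size",
  [4]  = "failure removing two areas",
  [5]  = "bad length of line",
  [6]  = "failed to extract elements",
  [7]  = "sorting crime",
  [8]  = "forward dupe",
  [9]  = "reverse dupe",
  [10] = "extra dupe",
  [11] = "bad number of lines",
}

-- Format an integer status code as "#Enn: text".
local function describestatus (numcode)
  local strtext = tabstatus[numcode] or "unknown status"
  return string.format ("#E%02d: %s", numcode, strtext)
end

print (describestatus (2))
```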

Empty lines are trimmed off and multiple spaces are reduced to single ones.
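
The effect of this whitespace handling can be pictured with the following
sketch. It uses Lua patterns instead of the octet-by-octet loops of this
module, and the function name "normalizelines" is hypothetical. As in the
module, only the space (32) and LF (10) are treated as whitespace.

```lua
-- Split text at LF, collapse runs of spaces, trim, and drop empty lines.
local function normalizelines (strtext)
  local tablines = {}
  for strraw in string.gmatch (strtext, "[^\n]+") do
    local strline = string.gsub (strraw, " +", " ") -- runs to single space
    strline = string.gsub (strline, "^ ", "")       -- leading space
    strline = string.gsub (strline, " $", "")       -- trailing space
    if (strline ~= "") then
      tablines[#tablines + 1] = strline
    end
  end
  return tablines
end

local tabdemo = normalizelines ("  [[a]]  x ,  y \n\n \n[[b]] z")
for numidx, strline in ipairs (tabdemo) do
  print (numidx, strline)
end
```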

Note that the "#ifexist:" function used here expects a wikilink, not a URL.
This means among other things that percent-encoding CANNOT be used, UTF8 is
required.

]===]

------------------------------------------------------------------------

---- CONSTANTS [O] ----

------------------------------------------------------------------------

  -- uncommentable constant strings EO vs ID

  local construstm = string.char(0xC5,0x9C) .. "ablono:tbllingvoj"  -- EO -- "SXablono"
  -- local construstm = "Templat:tblbahasa"                            -- ID

  -- control flags

  local conboohavreverse = true -- change to "false" to disable reverse table
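  local conboohavextra = true -- change to "false" to disable extra table
  local conbootwocynoso = true -- change to "false" to enforce sort order for all "cy" (no exemption for earliest and last)
  local conbooc1unique = false -- "false" allows multiple "-" in "c1", change to "true" to store "-" too and thus report it as dupe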

------------------------------------------------------------------------

---- HIGH LEVEL FUNCTIONS [H] ----

------------------------------------------------------------------------

-- Local function LFHEXTRACT

-- Extract 3 substrings ("cy", "c0", "c1") from a CSV line beginning with the
-- "cy" part in double rectangular brackets WITHOUT following comma. Note that
-- excessive spaces have already been reduced, but an optional space before
-- and after every comma and after the "cy" part still can occur. There is
-- NO EOL char at the end.

-- Example of minimal imaginable line : "[[a]]0"
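
--[==[
The line format described above can also be pictured with Lua patterns.
The sketch below is NOT the routine used by this module and is kept inside
a comment: the name "extract3" is hypothetical, and unlike "lfhextract" it
does not treat a comma inside the "cy" part as a separator and does not
blank "c0" when "cy" is empty.

```lua
-- Pattern-based sketch: split "[[cy]] c0 , c1" into 3 trimmed strings.
local function extract3 (strline)
  local function trim (strin)
    return (strin:gsub ("^ +", ""):gsub (" +$", ""))
  end
  -- "cy" sits between "[[" and the first "]]", the rest follows it
  local strcy, strrest = strline:match ("^%[%[(.-)%]%](.*)$")
  if (not strcy) then
    return "", "", "" -- line does not begin with "[[...]]"
  end
  -- "c0" runs up to the first comma, "c1" up to the next one (if any)
  local strc0, strc1 = strrest:match ("^([^,]*),?([^,]*)")
  return trim (strcy), trim (strc0), trim (strc1)
end

print (extract3 ("[[eo]] Esperanto , epo"))
print (extract3 ("[[a]]0"))
```
]==]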

local function lfhextract (strline)

  local strcky = ''
  local strck0 = ''
  local strck1 = ''
  local strtump = ''
  local numpanjang = 0
  local numindxe = 0
  local numfas = 0 -- 0:cy -- 1:c0 -- 2:c1 -- 3:abort
  local numchch = 0
  local numchnx = 0
  local boocrap = false
  local boopopospace = false
  local boononempty = false

  numpanjang = string.len (strline)
  if (numpanjang<6) then
    boocrap = true
  else
    boocrap = (string.sub(strline,1,2)~="[[")
  end--if

  if (not boocrap) then

    numindxe = 2 -- ZERO-based and skipping "[["

    while (true) do
      if (numindxe==numpanjang) then
        break -- end of string, LF EOL not used here
      end--if
      numchnx = 0
      numchch = string.byte(strline,(numindxe+1),(numindxe+1)) -- pick
      numindxe = numindxe + 1
      if (numindxe<numpanjang) then
        numchnx = string.byte(strline,(numindxe+1),(numindxe+1)) -- prepeek
      end--if
      if ((numchch==44) or ((numchch==93) and (numchnx==93))) then -- "," or "]]"
        if (numchch~=44) then
          numindxe = numindxe + 1 -- skip both
        end--if
        numfas = numfas + 1
        if (numfas==3) then
          break
        end--if
        boopopospace = false -- !!!CRUCIAL!!!
        boononempty = false -- !!!CRUCIAL!!!
      else
        if (numchch==32) then
          boopopospace = boononempty -- here trim away leading spaces too
        else
          if (boopopospace) then
            strtump = string.char(32,numchch)
            boopopospace = false -- !!!CRUCIAL!!!
          else
            strtump = string.char(numchch)
          end--if
          if (numfas==0) then
            strcky = strcky .. strtump -- this is slow
          end--if
          if (numfas==1) then
            strck0 = strck0 .. strtump -- this is slow
          end--if
          if (numfas==2) then
            strck1 = strck1 .. strtump -- this is slow
          end--if
          boononempty = true -- !!!CRUCIAL!!!
        end--if (numchch==32) else
      end--if
    end--while

    if (strcky=='') then
      strck0 = '' -- broken "strcky" ruins "strck0" but NOT vice-versa
    end--if

  end--if

  return strcky, strck0, strck1

end--function lfhextract

------------------------------------------------------------------------

---- VARIABLES [R] ----

------------------------------------------------------------------------

  -- general table --

  local tmplloaddata = {}  -- outer table

  local tbllist    = {}
  local tblforward = {}
  local tblreverse = {}
  local tblextra   = {}

  -- general unknown type

  local vartmp = 0         -- variable without type

  -- general type "frame"

  local arxframent = 0

  -- general str

  local strbig = ''        -- big string
  local strdupearl = ''    -- early participant in sort crime or ...
  local strsortlat = ''    -- latter participant in sort crime or ...
  local strprevious = ''
  local strcy = ''
  local strc0 = ''
  local strc1 = ''
  local strttmp = ''       -- temp

  -- general num

  local numerr = 0         -- 0: OK -- 2: template not found ...
  local numbiglen = 0      -- full
  local numbeg = 0
  local numend = 0
  local numtrmlen = 0      -- after trimming away 2 areas but not more
  local numtrmpos = 0
  local numlnjlen = 0
  local numchv = 0
  local numchw = 0
  local numcntlin = 0      -- processed lines

  -- general boo

  local boogotchar = false
  local boopostpone = false
  local boochecksort = false

------------------------------------------------------------------------

---- MAIN [Z] ----

------------------------------------------------------------------------

  ---- SEIZE THE INFAMOUS "FRAME" OBJECT (THERE IS NO ORDINARY CALLER) ----

  arxframent = mw.getCurrentFrame () -- use this if no main function exists

  ---- CHECK WHETHER THE POINTED TEMPLATE EXISTS AT ALL & EXPAND IT IF SO ----

  strttmp = arxframent:callParserFunction ('#ifexist:'..construstm,'1','0')
  if (strttmp=='1') then
    vartmp = arxframent:expandTemplate { title = construstm }
    if ((type(vartmp))=='string') then
      strbig = vartmp -- may be empty
    end--if
  else
    numerr = 2 -- #E02
  end--if

  ---- CHECK LENGTH ----

  if (numerr==0) then
    numbiglen = string.len(strbig)
    if ((numbiglen<10) or (numbiglen>1000000)) then
      numerr = 3 -- #E03
    end--if
  end--if

  ---- TRIM AWAY TWO AREAS ----

  if (numerr==0) then

    vartmp = 0 -- ONE-based or type "nil"
    numbeg = 0 -- ONE-based thus ZERO is invalid
    numend = 0 -- ONE-based thus ZERO is invalid

    while (true) do -- search for all "[["
      vartmp = string.find(strbig, "[[" , (vartmp+1), true)
      if (not vartmp) then
        break -- no more hit
      end--if
      if (numbeg==0) then
        numbeg = vartmp
      else
        numend = vartmp
      end--if
    end--while

    if ((numbeg==0) or (numend==0) or ((numbeg+5)>numend)) then -- HALF size
      numerr = 4 -- #E04 failure removing two areas
    else
      while (true) do -- search for next LF EOL
        if (numend>numbiglen) then -- "numend" is ONE-based
          break -- "numend" is ONE-based and OFF-BY-ONE now
        end--if
        numchw = string.byte(strbig,numend,numend) -- pick
        numend = numend + 1
        if (numchw==10) then
          break -- "numend" is ONE-based and OFF-BY-ONE now
        end--if
      end--while
      if ((numbeg+10)>numend) then
        numerr = 4 -- #E04 failure removing two areas
      else
        strbig = string.sub (strbig,numbeg,(numend-1))
        numtrmlen = string.len (strbig)
      end--if
    end--if ((numbeg==0) or (numend==0) or ((numbeg+5)>numend)) else

  end--if (numerr==0) then

  ---- PARSE THE BIG TEXT TABLE AND FILL 2 ... 4 LUA TABLES ----

  if (numerr==0) then

    strprevious = '' -- used to detect sorting crimes in "cy"
    numtrmpos = 0 -- ZERO-based: position in the text
    numcntlin = 0 -- ZERO-based: number of processed lines, index in "tbllist"

    while (true) do -- outer loop over all lines
      strttmp = ''
      boogotchar = false
      while (true) do -- upper inner loop to find a line
        if (numtrmpos>=numtrmlen) then
          break -- upper inner loop, EOF
        end--if
        numchw = string.byte(strbig,(numtrmpos+1),(numtrmpos+1)) -- pick
        numtrmpos = numtrmpos + 1
        if ((numchw~=10) and (numchw~=32)) then -- trim leading spaces too
          boogotchar = true
          break -- upper inner loop, found line
        end--if
      end--while -- upper inner
      if (not boogotchar) then
        break -- outer loop, EOF
      end--if
      boopostpone = false
      while (true) do -- inner loop to seize a line
        if (numchw==10) then
          break -- inner loop, captured an EOL, do NOT store it anywhere
        end--if
        if (numchw==32) then
          boopostpone = true -- no need to trim away leading spaces here
        else
          if (boopostpone) then
            strttmp = strttmp .. string.char(32,numchw)
            boopostpone = false -- !!!CRUCIAL!!!
          else
            strttmp = strttmp .. string.char(numchw)
          end--if
        end--if
        if (numtrmpos>=numtrmlen) then
          break -- inner loop, EOF
        end--if
        numchw = string.byte(strbig,(numtrmpos+1),(numtrmpos+1)) -- pick
        numtrmpos = numtrmpos + 1
      end--while
      numlnjlen = string.len(strttmp)
      if ((numlnjlen<6) or (numlnjlen>10000)) then
        numerr = 5 -- #E05 -- bad length of single line
        break -- outer loop
      end--if
      strcy, strc0, strc1 = lfhextract (strttmp) -- !!! 3 results !!!
      if ((strcy=='') or ((strc0=='') and conboohavreverse) or ((strc1=='') and conboohavextra)) then
        numchv = string.byte(strttmp,1,1)
        numchw = string.byte(strttmp,numlnjlen,numlnjlen)
        strdupearl = strttmp -- raw line
        strsortlat = "len="  .. tostring(numlnjlen) ..
                     " beg=" .. tostring(numchv) ..
                     " end=" .. tostring(numchw) ..
                     " rev=" .. tostring(conboohavreverse) ..
                     " ext=" .. tostring(conboohavextra)
        numerr = 6 -- #E06 -- failed to extract elements
        break -- outer loop
      end--if
      boochecksort = (not conbootwocynoso) or ((numtrmpos<numtrmlen) and (numcntlin>1))
      if (boochecksort and (strcy<=strprevious)) then
        strdupearl = strprevious
        strsortlat = strcy -- "strcy" should be bigger but is not :-(
        numerr = 7 -- #E07 -- sorting crime in "cy"
        break -- outer loop
      end--if
      if (tblforward [strcy]) then
        strdupearl = "FORWARD DUPE"
        strsortlat = strcy
        numerr = 8 -- #E08 -- forward dupe in "cy"
        break -- outer loop
      end--if
      tbllist [numcntlin] = strcy -- always store at key/index ZERO-based
      tblforward [strcy] = strttmp -- always store at key/index "cy"
      if (conboohavreverse) then
        vartmp = tblreverse [strc0]
        if (vartmp) then
          strdupearl = "REVERSE DUPE" -- absolutely prohibited
          strsortlat = vartmp
          numerr = 9 -- #E09 -- reverse dupe in "c0"
          break -- outer loop
        end--if
        tblreverse [strc0] = strcy -- store at key/index "c0"
      end--if
      if (conboohavextra) then
        vartmp = tblextra [strc1]
        if (vartmp) then
          strdupearl = "EXTRA DUPE" -- "-" may be left unstored below and then cannot cause a dupe
          strsortlat = vartmp
          numerr = 10 -- #E10 -- extra dupe in "c1"
          break -- outer loop
        end--if
        if ((strc1~="-") or (conbooc1unique)) then
          tblextra [strc1] = strcy -- conditionally store at key/index "c1"
        end--if
      end--if
      strprevious = strcy -- sorting is OBLIGATORY (with exception, see above)
      numcntlin = numcntlin + 1 -- successfully stored one line
    end--while -- outer loop over all lines

  end--if (numerr==0) then

  if (numerr==0) then
    if ((numcntlin<2) or (numcntlin>10000)) then
      numerr = 11 -- #E11 -- bad number of lines
    end--if
  end--if

  ---- RETURN THE JUNK ----

  tmplloaddata[0] = numerr
  tmplloaddata[1] = numbiglen
  tmplloaddata[2] = numtrmlen
  tmplloaddata[3] = numtrmpos -- equal "numtrmlen" on success
  tmplloaddata[4] = numcntlin
  tmplloaddata[5] = strdupearl
  tmplloaddata[6] = strsortlat
  if (numerr==0) then
    tmplloaddata[7] = tbllist -- always on success
    tmplloaddata[8] = tblforward -- always on success
    if (conboohavreverse) then
      tmplloaddata[9] = tblreverse -- can be disabled
    end--if
    if (conboohavextra) then
      tmplloaddata[10] = tblextra -- can be disabled
    end--if
  end--if

  return tmplloaddata -- only allowed data type is "table"