Wikifunctions:Suggest a function
Do you have an idea for a new function? Suggest it here! It may help to refer to our glossary.
If you have the user rights, you can create a function right away, provided it aligns with other work.
Note that for now we only support strings, Booleans, Natural numbers, and lists as input and output types of functions. More types are coming in the next few months.
Once created, consider adding new Functions to the catalogue.
Proposed functions requiring only available types (string, Boolean, Natural number, list)
String
String character discard functions
- remove emoticons/emoji (Z11553): remove all emoticons and emoji from the input string
- remove stereochemical specificity in SMILES string
- simplify SMILES string according to some basic simplifications
- remove html tags (Z12815): Given a string containing html tag(s), returns a string without html tags
- remove characters not suitable for markup in XML or HTML (Z15855): https://www.w3.org/TR/unicode-xml/#Suitable
- remove U+FEFF (Z14145): should not be part of html or xml, see https://www.w3.org/TR/unicode-xml/#Suitable
String character replacement functions
String search functions
String escaping and unescaping functions
String encoding and decoding functions
- Unicode normalising functions (there are several types of normalisation)
- Done String to codepoint list (Z868): produces a list of characters from the input string
- Backslash-U with delimiters ASCII encoding of Unicode encode
- Can someone elaborate on this? No example cases were given on the document, and backslash-U with delimiters is anyway not that prevalent as far as I have seen. BrightSunMan (talk) 15:24, 26 December 2023 (UTC)
- Backslash-U with delimiters ASCII encoding of Unicode decode
- XML and HTML ASCII encoding of Unicode encode
- XML and HTML ASCII encoding of Unicode decode
- HTML named character encode - HTML named character escape (Z10987)
- Done HTML named character decode - unescape named HTML characters (Z10938)
- Punycode encode - Punycode encode (Z10178) (part only, not whole url); see also IDNA encode (Z10185)
- Punycode decode - Punycode decode (Z10181) (part only, not whole url); see also IDNA decode (Z10188)
- Unified English Braille encode (discarding invalid characters?)
- Unified English Braille decode
- ASCII Braille encode (Z15838): see https://en.wikipedia.org/wiki/Braille_ASCII ASCII Braille encode (discarding invalid characters?)
- ASCII Braille decode (Z15840): see https://en.wikipedia.org/wiki/Braille_ASCII ASCII Braille decode
- Done NATO phonetic alphabet code word decode - decode NATO phonetic alphabet code (Z10970) ("ALFA BRAVO CHARLIE NINE" ⇒ "ABC9")
- Done NATO phonetic alphabet ICAO pronunciations encode - NATO phonetic alphabet ICAO pronunciations encode (Z11642) ("ABC9" ⇒ "AL FAH. BRAH VOH. CHAR LEE. NIN-er.") (discarding invalid characters?)
- Done NATO phonetic alphabet ICAO pronunciations decode - NATO phonetic alphabet ICAO pronunciations decode (Z11668) ("AL FAH. BRAH VOH. CHAR LEE. NIN-er." ⇒ "ABC9")
- NATO phonetic alphabet ICAO IPA transcription encode - NATO phonetic alphabet ICAO IPA transcription encode (Z11670) ("ABC9" ⇒ "ˈælfa ˈbraːˈvo ˈtʃɑːli ˈnaɪnə") (discarding invalid characters?)
- Done NATO phonetic alphabet ICAO IPA transcription decode - NATO phonetic alphabet ICAO IPA transcription decode (Z11672) ("ˈælfa ˈbraːˈvo ˈtʃɑːli ˈnaɪnə" ⇒ "ABC9")
- NATO phonetic alphabet DIN 5009 IPA transcription encode - NATO phonetic alphabet DIN 5009 IPA transcription encode (Z11674) ("ABC9" ⇒ "ˈalfa ˈbravo ˈtʃali ˈnaɪnə") (discarding invalid characters?)
- Done NATO phonetic alphabet DIN 5009 IPA transcription decode - NATO phonetic alphabet DIN 5009 IPA transcription decode (Z11676) ("ˈalfa ˈbravo ˈtʃali ˈnaɪnə" ⇒ "ABC9")
String presentation functions
- add locale-specific quotation marks to string
- Done format large natural number strings by adding commas (Z13473): formats a natural number that has many digits by adding commas.
- Shouldn't the output depend on the locale? See mw.language:formatNum. —Dexxor (talk) 17:15, 4 September 2023 (UTC)
String colour notation functions
- complementary colour in RGB colour model ("#FF0000" ⇒ "#00FFFF")
- Any specification on invalid inputs? MilkyDefer 11:22, 5 August 2023 (UTC)
- Great question. I don't think there is a position documented on Wikifunctions for how to handle invalid input to a function. Can we throw exceptions? Return null? Dhx1 (talk) 13:23, 6 August 2023 (UTC)
- This shouldn't be a string function. This should be a type that represents a RGB color (with corresponding validation function (hopefully it can just be three unsigned 8bit integers)) and a function that returns the complementary color. 0xDeadbeef (talk) 12:38, 7 August 2023 (UTC)
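One possible answer to the invalid-input question above, as a minimal Python sketch: validate the "#RRGGBB" form first and raise an error otherwise. The function name `complementary_colour` and the choice to raise `ValueError` are assumptions made for illustration, not an agreed Wikifunctions convention.

```python
import re

# Exactly six hexadecimal digits after a leading '#'.
_HEX_COLOUR = re.compile(r"#[0-9A-Fa-f]{6}")


def complementary_colour(colour: str) -> str:
    """Return the RGB complement of a '#RRGGBB' string, in upper case."""
    if not _HEX_COLOUR.fullmatch(colour):
        # Assumption: invalid input is rejected rather than passed through.
        raise ValueError(f"not a #RRGGBB colour: {colour!r}")
    value = int(colour[1:], 16)
    # Subtracting from white inverts every channel at once.
    return f"#{0xFFFFFF - value:06X}"
```

With this sketch, "#FF0000" gives "#00FFFF", matching the example in the proposal. A dedicated RGB colour type, as suggested in the discussion, would move the validation out of the function.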
String notation validation checks
- Done check if string is a nucleic acid notation - is DNA nucleic acid notation (Z11342)
- check if string is a simplified molecular-input line-entry system (SMILES) notation - is SMILES notation (Z11208)
- check if string is an International Chemical Identifier (en:International_Chemical_Identifier)
- check if string is a SMILES arbitrary target specification (SMARTS) notation
- check if string is an ABC notation
- check if string is a LilyPond notation
- Doing... check if string is a portable game notation for a chess game (is portable game notation (Z15867), figuring out how to add newlines to the test input)
- is Forsyth–Edwards Notation (Z14643) check if string is Forsyth–Edwards Notation for a chess position
- Done check if string is a Whyte notation - is Whyte notation (Z10524)
- check if string is a UIC classification of locomotive axle arrangements notation
- Done check if string is an AAR wheel arrangement notation - is AAR wheel arrangement notation (Z14226)
- Done check if string is an IPv4 address - is IPv4 (Z10476)
- Done check if string is an IPv6 address - is IPv6 (Z10786)
- Done check if a string is a valid ISBN-10 - is ISBN-10 (Z11705)
- check if a string is a valid ISBN-13 (probably just a simple variant of is EAN (Z10821), dropping/validating the hyphens)
- Done check if a string is a valid ISSN - is ISSN (Z10765)
- check if a string is a valid DOI
- Something about implementation difficulties: https://stackoverflow.com/questions/27910/finding-a-doi-in-a-document-or-page Alexander-Mart-Earth (talk) 14:28, 21 December 2023 (UTC)
- check if a string is a valid ISWN
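For the ISBN-13 proposal in this list, a minimal Python sketch of the check-digit test might look as follows. It assumes that hyphens and spaces are simply dropped and does not verify the 978/979 prefix or the hyphen positions; the name `is_isbn13` is hypothetical.

```python
def is_isbn13(text: str) -> bool:
    """Check the length and the weighted checksum of an ISBN-13 candidate."""
    # Assumption: separators are ignored instead of being validated.
    digits = text.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return False
    # Weights alternate 1, 3, 1, 3, ... starting from the first digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0
```

This is the same weighting used by EAN-13, which is why the proposal describes it as a variant of is EAN (Z10821).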
String validation checks
- Done check if string A is an anagram of string B - is anagram (simple) (Z10973)
- Done is heterogram (Z11573) check if string is a heterogram
- is tautogram (Z11577) check if string is a tautogram
- Doing... check if string is in lower camel case
- Done check if string is a valid ISO 639-1 language code - is ISO 639-1 language code (Z13482)
- Done check if string is a valid ISO 639-2 language code - is ISO 639-2 language code (Z14083)
- check if string is a valid ISO 3166 country code
- check if string is a valid ISO 8601 date/time (2023-08-03 ⇒ true; 2023-02-30 ⇒ false; 2023-08-03 15:00:00.000 ⇒ true; 2023-08-03 25:00:00.000 ⇒ false)
- check if string is a valid EDTF date/time
- Doing... check if string is a valid email address (watch out: email addresses are more complicated than they seem; see this list of falsehoods about email addresses when creating unit tests) - is valid email address (Z10410). Creating test cases is in progress. It is currently stuck on working out what exactly a valid email address is; nearly every erratum for RFC:3696 is about that.
- Doing... check if string is a valid Wikidata item — Is valid Qid (Z10696) (possibly stuck on phab:T343593?)
- check if string is a valid ISO 6709 coordinates
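The four ISO 8601 examples given in this list can be reproduced with a short Python sketch built on the standard library. Note the assumption: `datetime.fromisoformat` accepts only a subset of ISO 8601, and that subset differs between Python versions, so this is a starting point for test cases and not a full validator. The name `is_iso8601_datetime` is hypothetical.

```python
from datetime import datetime


def is_iso8601_datetime(text: str) -> bool:
    """Return True if the standard library can parse text as a date or date-time."""
    try:
        # Rejects impossible calendar dates and out-of-range clock values.
        datetime.fromisoformat(text)
    except ValueError:
        return False
    return True
```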
String analysis functions
- tokenize on white space (Z13407): break the initial string into a list of tokens, divided by whitespace
- words from string (Z13402): Extract a list of words from a string
- Perhaps make a variant of the above for each monolingual text, to deal with different concepts of punctuation/words.
- distinct words from string (Z13411): return a list of distinct words from the initial string, in the preserved order, with case-sensitive duplicates removed
- distinct lowercased words from string (Z13415): return a list of distinct words (turned to lowercase) from the initial string, in the preserved order, with case-insensitive duplicates removed
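The word-extraction functions above could be sketched in Python as below. The definition of a word as a run of `\w` characters is an assumption (it splits on apostrophes and hyphens, for instance), which is exactly where a per-language variant would differ. Both function names are hypothetical.

```python
import re


def words_from_string(text: str) -> list:
    """Extract words, assuming a word is a run of letters, digits or underscores."""
    return re.findall(r"\w+", text)


def distinct_words(text: str, lowercase: bool = False) -> list:
    """Return words in first-seen order with duplicates removed."""
    seen = set()
    result = []
    for word in words_from_string(text):
        # With lowercase=True, duplicates are detected case-insensitively.
        key = word.lower() if lowercase else word
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result
```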
Monolingual text
String Wikitext operations
...
Natural number
- rectified linear unit (ReLU) - https://www.wikifunctions.org/view/en/Z13909
List
Basic list/iterable functions
- Done are all elements of the list the same type (Z13220): true if all elements of the list have the same type, or if empty
- fold/reduce
- right fold (Z12753): combine the first element with the result of recursively combining the rest, according to a combining function
- left fold (Z12781): combine the result of recursively combining the rest, with the last element, according to a combining function
- group
Complex list functions
- all distinct permutations (Z12745): returns a list of all distinct permutations of the input list, each is itself a list
- zip lists together: for [ A .. Z ] and [ 1 .. 26 ] return [ [ A, 1 ], [ B, 2 ], .. ]
- Unsure what happens if input lists are of different lengths.
- If possible this function should be able to zip more than 2 lists together... 3, 4, n? Perhaps the input should be list(list, list, list, list, ..).
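The two open points about zipping (unequal lengths, more than two lists) can be made concrete with a Python sketch. It takes a list of lists, as suggested above, and by assumption truncates to the shortest input unless `strict` is set; both the name `zip_lists` and the `strict` parameter are illustrative choices.

```python
def zip_lists(lists, strict=False):
    """Zip any number of lists into a list of lists."""
    if not lists:
        return []
    if strict and len({len(item) for item in lists}) > 1:
        # Assumption: in strict mode, unequal lengths are an error.
        raise ValueError("input lists have different lengths")
    # zip() stops at the shortest input, so extra elements are dropped.
    return [list(group) for group in zip(*lists)]
```

For example, `zip_lists([["A", "B"], [1, 2]])` yields `[["A", 1], ["B", 2]]`.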
CSV list operations
- Done csv to list of strings (Z12794): Converts a validly formatted (RFC 4180) comma-separated value series into a valid list of strings (not including the commas or row start or row end characters), where any whitespace and quotes are unchanged. Be careful to validly render CSV with quoted fields.
- list of strings to csv
Functions with functions as arguments
- compose (Z10111): returns the composition of two functions
- fold/reduce
- Reduce Function (Z876): iterates the application of a two-parameter function, the first parameter uses the initial object or the previous result, the second parameter uses the next item on the list
- right fold (Z12753): combine the first element with the result of recursively combining the rest, according to a combining function
- left fold (Z12781): combine the result of recursively combining the rest, with the last element, according to a combining function
- is sorted (Z13322): checks if a list is in order such that a comparator function returns true when each sequential pair of items are passed in as arguments
- sort, by a given function
- test whether certain functions have specific properties of homogeneous relations for particular lists/sets
- is identity relation over elements of list (Z13419): I = {(x, x) | x ∈ X}; that is, x1Ix2 holds if and only if x1 = x2. See https://en.wikipedia.org/wiki/Homogeneous_relation
- is empty relation over elements of list (Z14112): E = ∅; that is, E(x1,x2) is always false. See https://en.wikipedia.org/wiki/Homogeneous_relation
- is function commutative for these arguments (Z14762): tests if applying the function to the arguments is independent of which order the arguments are in: f(a,b) = f(b,a)
- is function associative for these arguments (Z14765): tests if applying the function twice to the arguments is independent of which grouping of arguments is processed first: f(a,f(b,c)) = f(f(a,b),c)
- remove first element matching filter from list
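To illustrate how the higher-order functions in this section relate, here is a minimal Python sketch of the folds and the property checks. The signatures (argument order, an explicit initial value) are assumptions and may not match the Wikifunctions definitions.

```python
from functools import reduce


def left_fold(function, initial, items):
    """Combine from the left: f(f(f(initial, x1), x2), x3)."""
    return reduce(function, items, initial)


def right_fold(function, initial, items):
    """Combine from the right: f(x1, f(x2, f(x3, initial)))."""
    return reduce(lambda acc, item: function(item, acc), reversed(items), initial)


def is_sorted(items, comparator):
    """True if comparator(a, b) holds for every adjacent pair."""
    return all(comparator(a, b) for a, b in zip(items, items[1:]))


def is_commutative_for(function, a, b):
    """Test f(a, b) == f(b, a) for these two arguments only."""
    return function(a, b) == function(b, a)


def is_associative_for(function, a, b, c):
    """Test f(a, f(b, c)) == f(f(a, b), c) for these three arguments only."""
    return function(a, function(b, c)) == function(function(a, b), c)
```

Subtraction shows the difference between the folds: over [1, 2, 3] with initial value 0, the left fold gives -6 and the right fold gives 2.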
Morphological functions
Morphology is the part of linguistics that studies how language parts are 'shaped' and how they change diachronically and when inflected. Hausa, Igbo, Malayalam, Bangla/Bengali and Dagbani are focus languages for Wikidata's lexicographic dataset, which is an important aspect of Abstract Wikipedia.
mul - Multiple languages
- inputs: natural number (new numeric type) and language Z-number; output: 'singular', 'dual', 'paucal', 'plural', etc. as string
- Doing...: use Z15977 instead (Z13899)
ase - American Sign Language
bn - Bangla
cy - Welsh
w:en:colloquial Welsh morphology
dag - Dagbani
de - German
- tense * person * number for each verb
- tenses: present, past, ...?
- person: first, second, third
- number: singular, plural
- Done regular German First person singular present verb (Z11256)
- Done regular German Second person singular present verb (Z11264)
- Doing... third person singular present
- Done regular German First person plural present verb (Z11268)
- Doing... regular German verb in the second person plural present (Z11272)
- second person singular preterite
en - English
- English verb to agent noun (Z11390) Verb -> agent noun, e.g. "dance"->"dancer"
- Done English nominative to accusative (Z11651): converts a nominative (subject) pronoun to the accusative (objective) case
- Done suffix English word (Z13254): Add any suffix to an English word with regular changes to spelling
- Join English morphemes (extends suffix English word (Z13254) to cases like re + en + able + er + s → re-enablers). suffix English word (Z13254) will correctly join re-enable + ers or re- + enablers, but re + enablers → “renablers” (incorrect). English morpheme agglutination (Z13275) tests the Reduce function to produce “detoxification” from a list of four morphemes (orchestrator limit exceeded with five). I doubt we’ll want to derive “toxify” from “toxic”, however.
- Derive lemmas from a form. This is envisaged as the converse of Join English morphemes. The focus would be identifying the base form (the lexeme’s lemma) rather than further segmenting the lemma. For example, “underlay” should return “underlie” (for which it is the past participle) and the noun “underlay” (for which it is the lemma) and (perhaps) the verb “underlay”, which might be the tendency of an unproductive hen or the activity of a carpet-fitter. As this is a purely functional converse, every string will have itself as a possible lemma.
- Generate Numerical prefixes of various kinds from a natural number input.
fr - French
- French masculine adjective to feminine (Z11590) Masculine adjective -> feminine, e.g. "exact"->"exacte"
- Conjugated verb => Infinitive, e.g. "alla" => "aller", "mordit" => "mordre"
ha - Hausa
A notated demo sentence ("Aishà taa jeefar dà kàren Indoo" ― "Aisha threw away Indo's dog") is available at http://intent.xigt.org
ig - Igbo
ldn - Láadan
section moved to WF:human languages/Z1882
ml - Malayalam
Proposed functions requiring future types
Note these functions cannot be implemented properly until the needed types are requested and approved.
If one wishes to nevertheless attempt to define and implement them,
- the functions and implementations should indicate prominently in their labels that their input/output types must be adjusted once support for the appropriate replacement types becomes available; and
- the functions should not be used in the implementations of any other functions, as the later adjustment of input/output types to appropriate replacements will break those implementations.
String manipulation functions
- Done final N characters of string (Z14460): return only the last N characters of the initial string
- Done first N characters of string (Z14592): returns a substring from the beginning of a specified string up to a number of characters
- Done Replicate string n-times (Z12624): Replicates a string n times: (e.g. f("a",5) -> "aaaaa")
String analysis functions
- count distance between two letters in a given alphabet (defaults to the 26-character western alphabet; case insensitive; e.g. "a" & "A" ⇒ 0; "K" & "N" ⇒ 3)
- Done Hamming distance between two strings of equal length, e.g. "Wikipedia" & "Wikimedia" ⇒ 1. - hamming distance between two strings (Z11328)
- Levenshtein distance between two strings (e.g. "kitten" to "sitting" => 3 : kitten > sitten > sittin > sitting)
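The two distance examples above can be checked with a short Python sketch. It compares Unicode code points (not grapheme clusters), which is an assumption about what counts as a character; the function names are illustrative.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]
```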
String encoding and decoding functions
(would be better with types representing a stream of bytes)
- Done BASE16 encode - Base16 Encode (Z11003)
- Done BASE16 decode - Base16 Decode (Z11007)
- Done BASE32 encode - Base32 Encode (Z14189)
- Done BASE32 decode - Base32 Decode (Z14195)
- BASE45 encode
- BASE45 decode
- Done BASE64 encode - Base64 Encode (Z10057)
- Done BASE64 decode - Base64 decode (Z10062)
- Hexadecimal UTF-8 encode ("ABC ₤" ⇒ "41 42 43 20 E2 82 A4")
- Hexadecimal UTF-8 decode ("41 42 43 20 E2 82 A4" ⇒ "ABC ₤")
- Decimal UTF-8 encode ("ABC ₤" ⇒ "65 66 67 32 226 130 164")
- Decimal UTF-8 decode ("65 66 67 32 226 130 164" ⇒ "ABC ₤")
- Octal UTF-8 encode ("ABC ₤" ⇒ "101 102 103 40 342 202 244")
- Octal UTF-8 decode ("101 102 103 40 342 202 244" ⇒ "ABC ₤")
- Binary UTF-8 encode ("ABC ₤" ⇒ "01000001 01000010 01000011 00100000 11100010 10000010 10100100")
- Binary UTF-8 decode ("01000001 01000010 01000011 00100000 11100010 10000010 10100100" ⇒ "ABC ₤")
- Unicode code point encode ("ABC ₤" ⇒ "41 42 43 20 20A4") - Unicode code point encode hex (Z10785)
- Unicode code point decode ("41 42 43 20 20A4" ⇒ "ABC ₤")
- Done chr of codepoint (Z11534) Unicode code point decimal decode - single character ("65" ⇒ "A")
- Create regular expression object/string (i.e: "test" & "i" to /test/i)
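The hexadecimal, decimal, octal and binary UTF-8 proposals in this list differ only in the number base, so one Python sketch can cover all of them. The function names and the `base` parameter are assumptions; the output formats follow the examples given above (space-separated, two hex digits or eight binary digits per byte).

```python
# Format specification per base, matching the examples in the list.
_FORMATS = {2: "08b", 8: "o", 10: "d", 16: "02X"}


def utf8_encode(text: str, base: int = 16) -> str:
    """Render each UTF-8 byte of text in the given base, separated by spaces."""
    spec = _FORMATS[base]
    return " ".join(format(byte, spec) for byte in text.encode("utf-8"))


def utf8_decode(encoded: str, base: int = 16) -> str:
    """Inverse of utf8_encode; raises ValueError on malformed input."""
    data = bytes(int(token, base) for token in encoded.split())
    return data.decode("utf-8")
```

A byte-stream type, as noted at the top of this section, would let the encoding step and the rendering step become separate functions.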
Natural language functions
- Choose singular or plural based on number (e.g. singularOrPlural("person", 6) -> "people")
- Note that there are also dual and other grammatical numbers in other languages. 魔琴 (talk) 18:54, 26 October 2023 (UTC)
- relevant interwiki link: d:WD:property proposal/plural forms Arlo Barnes (talk) 04:15, 9 February 2024 (UTC)
Cryptographic hash functions
(would be better with types representing a stream of bytes)
- To do MD2 - MD2 (Z10135)
- To do MD4 - MD4 (Z10136)
- To do MD5 - MD5 (Z10137)
- To do RIPEMD-128 - RIPEMD-128 (Z10138)
- To do RIPEMD-160 - RIPEMD-160 (Z10139)
- To do BLAKE2b-160 - BLAKE2b-160 (Z10140)
- To do BLAKE2b-256 - BLAKE2b-256 (Z10141)
- To do BLAKE2b-384 - BLAKE2b-384 (Z10142)
- To do BLAKE2b-512 - BLAKE2b-512 (Z10143)
- To do BLAKE2s-128 - BLAKE2s-128 (Z10144)
- To do BLAKE2s-160 - BLAKE2s-160 (Z10145)
- To do BLAKE2s-224 - BLAKE2s-224 (Z10146)
- To do BLAKE2s-256 - BLAKE2s-256 (Z10147)
- To do SHA-224 - SHA-224 (Z10149)
- To do HMAC-SHA-256
- To do SHAKE-128 - SHAKE-128 (Z10150)
- To do SHAKE-256 - SHAKE-256 (Z10151)
Colour functions
- return colour contrast ratio (per [1]) of two RGB colours (provided as strings e.g. "#FF0000")
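The reference marked [1] is not reproduced on this page. The Python sketch below assumes it points to the WCAG 2.x definition of contrast ratio, which is based on relative luminance; if another definition is intended, the formula would differ. Function names are hypothetical.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula, assumed)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(colour: str) -> float:
    """Relative luminance of a '#RRGGBB' colour string."""
    r, g, b = (int(colour[i:i + 2], 16) for i in (1, 3, 5))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(first: str, second: str) -> float:
    """Contrast ratio between two colours, from 1 (identical) up to 21."""
    a, b = relative_luminance(first), relative_luminance(second)
    lighter, darker = max(a, b), min(a, b)
    return (lighter + 0.05) / (darker + 0.05)
```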
Date, time, and calendric functions
Note: 'time' type not yet supported, use 'string' (or for strictly numeric values, 'natural number')
Bengali calendar
Gregorian to Bangla (Z12926): Returns the Bangla equivalent date of a Gregorian date format. The input fields represent the Year, month, and day, respectively.
Chinese calendar
French Republican Calendar
decimalises and secularises the Gregorian
- day names: is the day name part of the French Republican Calendar 'rural' naming? (Z13006): tests if input is one of the French-language names for days in the FRC. Not done yet
Gregorian
widely used calendar derived from the Julian, basis for ISO 8601
- check if year is leap year
- Done is leap year (Gregorian calendar) (Z10996) - (2020 ⇒ true; 2023 ⇒ false; 2100 ⇒ false)
- return weekday (2023-08-03 ⇒ Thursday (or should this return Q129, for Thursday (Q129)?))
- date to weekday number (0-6)
- date to ISO week number; see ISO week date (Q2110154)
- advance n days (2023-08-03 & "69" ⇒ 2023-10-11)
- go back n days
- string to date
- date to ISO 8601 string
- date to year (yyyy)
- date to month of the year (1-12)
- date to month name (January-December)
- date to day of the month (1-31)
- date to hour of the day (0-23)
- date to minutes (0-59)
- date to seconds (0-59)
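Until a date type exists, several of the Gregorian proposals above can be prototyped on ISO 8601 date strings. The Python sketch below assumes string input and output in the form YYYY-MM-DD and English weekday names; all function names are hypothetical.

```python
from datetime import date, timedelta

# date.weekday() numbers Monday as 0, so the list starts with Monday.
_WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def is_leap_year(year: int) -> bool:
    """Gregorian rule: every 4th year, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def weekday_name(iso_date: str) -> str:
    """English weekday name for a YYYY-MM-DD string."""
    return _WEEKDAYS[date.fromisoformat(iso_date).weekday()]


def iso_week_number(iso_date: str) -> int:
    """ISO 8601 week number (1 to 53) for a YYYY-MM-DD string."""
    return date.fromisoformat(iso_date).isocalendar()[1]


def advance_days(iso_date: str, days: int) -> str:
    """Move forward by days (or back, if negative) and return YYYY-MM-DD."""
    return (date.fromisoformat(iso_date) + timedelta(days=days)).isoformat()
```

Passing a negative number to `advance_days` covers the "go back n days" proposal as well.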
Holocene calendar
Indian national calendar
Islamic
a lunar calendar, also called Hijri
Julian
mostly used by astronomers, some historians, and some Orthodox Christian denominations
- check if year is leap year
- Done is leap year (Julian calendar) (Z11015) - (2020 ⇒ true; 2023 ⇒ false; 2100 ⇒ true)
Mesoamerican calendars
including civil and clerical forms
Persian
also called Jalali
- check if year is leap year
- Done is leap year (Jalali calendar) (Z11011) - (1399 ⇒ true; 1400 ⇒ false)
Thai calendar
Hebrew calendar
Basic numerical functions
- round up ("1.289" & "2" ⇒ "1.29"; "5678" & "2" ⇒ "5700")
- So if the number is floating point, round to n decimal places, and if not, round to n significant figures. Is that right? BrightSunMan (talk) 19:36, 24 December 2023 (UTC)
- round down
- return integer value (5678.678 ⇒ 5678)
- decode Roman numerals ("X" ⇒ 10; "MMXXIII" ⇒ 2023)
- Done : Roman to Arabic numeral (Z11023): Convert a Roman numeral to an Arabic numeral
- encode as Roman numerals (10 ⇒ "X"; 2023 ⇒ "MMXXIII")
- Done : Arabic to Roman numeral (Z11022): Convert a natural number [1, 4999] to a Roman numeral
- English cardinal (Z13587): expresses a natural number in English words (23 ⇒ "twenty-three")
- English ordinal (23 ⇒ "twenty-third")
- Done natural number to English ordinal (Z14526) (Composition only, with English cardinal (Z13587) and English cardinal to ordinal (Z14523))
- Natural number with English ordinal suffix (Z14531)
- Done (Python only)
- English cardinal to ordinal (Z14523): Converts standard English cardinal forms like “twenty-three” to the corresponding ordinal form, like “twenty-third”.
- Done (Python only)
- Done: Body Mass Index (80kg and 1.80m ⇒ 24)
- Done: (!) Body Mass Index (metric) (Z12526): Calculate a BMI given a mass in kilograms and height in meters
- Done: (!) Body Mass Index (imperial) (Z12572): Calculate a BMI given a mass in pounds and height in inches
- Convert money from US$ to anything else
- requires source of conversion rates, which is a hole in function-likeness
- Done Kronecker delta (Z15849): returns 1 if the two natural number inputs are equal, and 0 if they are unequal
- Arabic numeral to Etruscan numeral
- Etruscan numeral to Arabic numeral
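As a reference point for the numeral conversions in this list, here is a minimal Python sketch of Roman numeral encoding and decoding. It follows the [1, 4999] range mentioned above and, by assumption, accepts only the canonical subtractive spelling when decoding. It is not the implementation behind Z11022 or Z11023.

```python
_ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
          (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
          (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def to_roman(number: int) -> str:
    """Encode a natural number in [1, 4999] as a Roman numeral."""
    if not 1 <= number <= 4999:
        raise ValueError("number must be between 1 and 4999")
    parts = []
    for value, symbol in _ROMAN:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def from_roman(text: str) -> int:
    """Decode a canonical Roman numeral; raise ValueError otherwise."""
    total, previous = 0, 0
    for symbol in reversed(text):
        if symbol not in _VALUES:
            raise ValueError(f"invalid symbol: {symbol!r}")
        value = _VALUES[symbol]
        # A smaller value to the left of a larger one is subtracted.
        total += -value if value < previous else value
        previous = value
    # Round-trip check rejects non-canonical spellings such as "IIII".
    if not 1 <= total <= 4999 or to_roman(total) != text:
        raise ValueError(f"not a canonical Roman numeral: {text!r}")
    return total
```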
Data serialization functions
- parse a string as JSON
- extract string from JSON object based on JSONPath ({"name":"Alice"}, "$.name" ⇒ "Alice")
- Why not first convert a JSON string to an object, and then have a function that extracts fields based on JSONPath? Doing Stringly-typed things like this proposal as defined isn't a good idea. 0xDeadbeef (talk) 16:16, 5 August 2023 (UTC)
- This seems to be a good idea, thanks! I moved and splitted the proposal accordingly. --1-Byte (talk) 09:51, 6 August 2023 (UTC)
- is it okay to go ahead to create this 'extract string from JSON object based on JSONPath' as a function ? Dolphyb (talk) 16:14, 15 February 2024 (UTC)
Basic list/iterable functions requiring numeric types
- Sum the elements of a numeric list - sum the elements of a numeric list (Z14038)
- Product of the elements of a numeric list
- Done length of a list (Z12681): number of objects on a list
- flat a list (Z12676): flatten a list to limited depth
- Slice of list elements: for the supplied list, return a list of elements that are at indexes between a supplied range n:m
- Zero indexing is used (first element is index 0)?
- n and m are included in the range?
- What happens if n and/or m are invalid indexes?
- Remove slice of elements from list: return the supplied list with elements between a supplied range of indexes removed
- Zero indexing is used (first element is index 0)?
- n and m are included in the range?
- What happens if n and/or m are invalid indexes?
- Done First n elements of list: return a list of the first n elements of a supplied list - get the first n elements of a list (Z13366)
- Done Remove first n elements of list: returns the supplied list without the first n elements - remove first n elements of list (Z13369)
- Done Last n elements of list: return a list of the last n elements of a supplied list - get the last n elements of a list (Z13362)
- Done Remove last n elements of list: returns the supplied list without the last n elements - remove last n elements of a list (Z13373)
- Every nth element of list: returns every nth element of the supplied list
- Done - get the nth element of a list (Z13397): When given a valid index (1-based) return the nth element of the supplied list otherwise return nothing
- Remove every nth element of list: removes every nth element of the supplied list
- Done remove the nth element from a list (Z13429): When given a valid index remove the element at the position and return the supplied list otherwise return nothing
- sample n objects from list (return up to n random objects from the list)
- Jaccard similarity coefficient (see https://en.wikipedia.org/wiki/Jaccard_index)
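For the Jaccard similarity coefficient, a minimal Python sketch treats each list as a set, so duplicates and order are ignored. Returning 1.0 when both lists are empty is a convention chosen here as an assumption; the proposal does not specify that case. Elements must be hashable.

```python
def jaccard_similarity(first, second) -> float:
    """Size of the intersection divided by the size of the union."""
    set_a, set_b = set(first), set(second)
    union = set_a | set_b
    if not union:
        # Assumption: two empty lists are considered identical.
        return 1.0
    return len(set_a & set_b) / len(union)
```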
Geodetics functions
w:en:planetary coordinate system, w:en:well-known text representation of coordinate reference systems
Earth
- convert coordinates outside of the ranges (-180, 180) for longitude and (-90, 90) for latitude to a canonical form
Mars
- convert coordinates outside of the ranges [0, 360) for longitude and (-90, 90) for latitude to a canonical form
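Both canonicalisation proposals above can be sketched with one Python function. The sketch assumes that a latitude beyond a pole continues down the opposite meridian (so the longitude shifts by 180 degrees), that Earth longitudes are mapped to [-180, 180), and that Mars longitudes are mapped to [0, 360). The boundary handling and the parameter name `east_positive_360` are illustrative choices, not part of the proposals.

```python
def canonical_coordinates(latitude, longitude, east_positive_360=False):
    """Return (latitude, longitude) folded into a canonical range."""
    # First bring latitude into [-90, 270).
    latitude = (latitude + 90) % 360 - 90
    if latitude > 90:
        # Past a pole: reflect the latitude and switch to the opposite meridian.
        latitude = 180 - latitude
        longitude += 180
    if east_positive_360:
        longitude %= 360                            # [0, 360), e.g. for Mars
    else:
        longitude = (longitude + 180) % 360 - 180   # [-180, 180), e.g. for Earth
    return latitude, longitude
```

For example, latitude -100 with longitude 10 becomes latitude -80 with longitude -170.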
Unit conversion functions
- Fahrenheit to Celsius (Z15560): Converts Fahrenheit (°F) to Celsius (°C)