Formal Language

In mathematics, logic and computer science, a formal language is a set of finite-length words (i.e. character strings) drawn from some finite alphabet. The scientific theory that deals with these entities is known as formal language theory. The term "formal language" is also used in many other contexts (scientific, legal, linguistic and so on) to mean a mode of expression that is more careful and accurate, or more mannered, than everyday speech. This article deals only with the precise sense studied in formal language theory.

An alphabet might be {a, b}, and a string over that alphabet might be ababba. A typical language over that alphabet, containing that string, would be the set of all strings which contain the same number of symbols a and b. The empty word (that is, the length-zero string) is allowed and is often denoted by e, \epsilon or \Lambda. While the alphabet is a finite set and every string has finite length, a language may very well have infinitely many member strings (because the length of the words in it may be unbounded).

Some examples of formal languages:
  • the set of all words over {a, b};
  • the set { a^{n} : n is a prime number }, where a^{n} means a repeated n times;
  • the set of syntactically correct programs in a given programming language; or
  • the set of inputs upon which a certain Turing machine halts.
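For the simpler examples above, membership can be checked directly. The following is a minimal Python sketch, not a definitive implementation; the function names (has_equal_a_and_b, is_prime, in_prime_power_language) are hypothetical and chosen only for illustration.

```python
def has_equal_a_and_b(word: str) -> bool:
    """Test membership in the language of strings over {a, b}
    that contain the same number of a's and b's."""
    if any(ch not in "ab" for ch in word):
        return False  # not a string over the alphabet {a, b}
    return word.count("a") == word.count("b")


def is_prime(n: int) -> bool:
    """Trial division; adequate for the short words used here."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def in_prime_power_language(word: str) -> bool:
    """Test membership in { a^n : n is a prime number }."""
    return all(ch == "a" for ch in word) and is_prime(len(word))
```

With these definitions, has_equal_a_and_b("ababba") is true, since the string has three a's and three b's, and the empty word also belongs to that language. The last two examples in the list are different in kind: no comparably simple test exists for them, and the halting set cannot be decided by any program at all.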
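A formal language can be specified in a great variety of ways, for example by a formal grammar that generates its strings, by a regular expression, or by an automaton (such as a finite-state machine or a Turing machine) that accepts them.

Several operations can be used to produce new languages from given ones. Suppose L_{1} and L_{2} are languages over some common alphabet.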
  • The concatenation L_{1}L_{2} consists of all strings of the form vw where v is a string from L_{1} and w is a string from L_{2}.
  • The intersection of L_{1} and L_{2} consists of all strings which are contained in L_{1} and also in L_{2}.
  • The union of L_{1} and L_{2} consists of all strings which are contained in L_{1} or in L_{2} (or in both).
  • The complement of the language L_{1} consists of all strings over the alphabet which are not contained in L_{1}.
  • The right quotient L_{1}/L_{2} of L_{1} by L_{2} consists of all strings v for which there exists a string w in L_{2} such that vw is in L_{1}.
  • The Kleene star L_{1}^{*} consists of all strings which can be written in the form w_{1}w_{2}...w_{n} with strings w_{i} in L_{1} and n \ge 0. Note that this includes the empty string \epsilon because n = 0 is allowed.
  • The reverse L_{1}^{R} contains the reversed versions of all the strings in L_{1}.
  • The shuffle of L_{1} and L_{2} consists of all strings which can be written in the form v_{1}w_{1}v_{2}w_{2}...v_{n}w_{n} where n \ge 1 and v_{1},...,v_{n} are strings such that the concatenation v_{1}...v_{n} is in L_{1} and w_{1},...,w_{n} are strings such that w_{1}...w_{n} is in L_{2}.
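For finite languages, the operations above can be tried out directly by modelling a language as a Python set of strings. This is only a sketch under that assumption: all function names are hypothetical, and because the Kleene star and the complement are in general infinite, the versions here are truncated at a caller-supplied max_length. Union and intersection are simply the built-in set operators | and &.

```python
from itertools import product


def concatenation(l1, l2):
    """All strings vw with v in l1 and w in l2."""
    return {v + w for v in l1 for w in l2}


def right_quotient(l1, l2):
    """All strings v such that vw is in l1 for some w in l2."""
    return {x[:len(x) - len(w)] for x in l1 for w in l2 if x.endswith(w)}


def reverse(l1):
    """The reversed versions of all strings in l1."""
    return {w[::-1] for w in l1}


def kleene_star(l1, max_length):
    """The strings of l1* whose length is at most max_length."""
    result = {""}          # n = 0 gives the empty string
    frontier = {""}
    while frontier:
        frontier = {u + w for u in frontier for w in l1
                    if len(u + w) <= max_length} - result
        result |= frontier
    return result


def words_up_to(alphabet, max_length):
    """All strings over the alphabet of length at most max_length."""
    return {"".join(t)
            for n in range(max_length + 1)
            for t in product(alphabet, repeat=n)}


def complement(l1, alphabet, max_length):
    """The strings of length at most max_length that are not in l1."""
    return words_up_to(alphabet, max_length) - set(l1)


def shuffle_words(v, w):
    """All interleavings of the two strings v and w."""
    if not v:
        return {w}
    if not w:
        return {v}
    return ({v[0] + s for s in shuffle_words(v[1:], w)}
            | {w[0] + s for s in shuffle_words(v, w[1:])})


def shuffle(l1, l2):
    """All interleavings of a string from l1 with a string from l2."""
    return set().union(*(shuffle_words(v, w) for v in l1 for w in l2))
```

For example, with L_{1} = {"a", "ab"} and L_{2} = {"b", ""}, concatenation(L1, L2) yields {"a", "ab", "abb"}, and shuffle({"ab"}, {"c"}) yields {"abc", "acb", "cab"}. Interleaving symbol by symbol matches the definition of the shuffle given above, because the pieces v_{i} and w_{i} are allowed to be empty.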
A question often asked about formal languages is "how difficult is it to decide whether a given word belongs to the language?" This is the domain of computability theory and complexity theory.
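As an illustration of an easy case, membership in some languages can be decided in a single left-to-right pass over the word using a fixed, finite amount of memory. The sketch below simulates a deterministic finite automaton; the example language (strings over {a, b} with an even number of a's), the table EVEN_A and the function name accepts are assumptions made purely for illustration.

```python
def accepts(transitions, start, accepting, word):
    """Run a deterministic finite automaton on word.

    transitions maps (state, symbol) pairs to the next state.
    A symbol with no transition means the word is rejected.
    """
    state = start
    for symbol in word:
        key = (state, symbol)
        if key not in transitions:
            return False
        state = transitions[key]
    return state in accepting


# Hypothetical example automaton: tracks the parity of the number of a's.
EVEN_A = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}
```

Calling accepts(EVEN_A, "even", {"even"}, "ababba") returns False, because ababba contains three a's. The running time is proportional to the length of the word. At the other extreme, for the set of inputs on which a Turing machine halts, no algorithm can decide membership in general.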

 
