Pipeline (Unix)

In UNIX and other Unix-like operating systems, a pipeline is a set of processes chained by their standard streams, so that the output of each process ("stdout") feeds directly into the input ("stdin") of the next one. Filter programs are often used in this way. The concept was named by analogy to a physical pipeline.
   
This feature of UNIX was borrowed by other operating systems, such as Taos and MS-DOS, and eventually became the pipes and filters design pattern of software engineering. Unix pipelines should not be confused with other data processing pipelines found in modern computer systems, although the general concept is quite similar.

Creating pipelines from the shell

Most Unix shells have a special syntax construct for the creation of pipelines. Typically, one simply writes the filter commands in sequence, separated by the ASCII vertical bar character "|" (which, for this reason, is often called the "pipe character" by Unix users). The shell starts the processes and arranges for the necessary connections between their standard streams (including some amount of buffer storage).
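As a minimal sketch of this syntax, the following chains three filters with the vertical bar. The input words are made up for illustration, and the result is captured in a variable only so that it can be printed afterwards.

```shell
#!/bin/sh
# Three commands chained with "|": each reads the previous one's stdout.
# printf produces four lines, sort orders them, uniq drops the repeated line.
result=$(printf '%s\n' pear apple pear fig | sort | uniq)
printf '%s\n' "$result"
```

The shell starts printf, sort and uniq together and connects them; none of the commands needs to know that it is part of a pipeline.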

Error stream

By default, the standard error streams ("stderr") of the processes are not passed on through the pipe; instead, they are merged and directed to the console. However, many shells have additional syntax for changing this behaviour. In the csh shell, for instance, using "|&" instead of "|" signifies that the standard error stream too should be merged with the standard output and fed to the next process.
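The difference can be sketched in a Bourne-style shell, where the usual spelling for merging the error stream into the pipe is "2>&1 |" rather than the csh "|&". The function name emit is a hypothetical helper introduced only for this illustration.

```shell
#!/bin/sh
# emit is a hypothetical helper: one line to stdout, one line to stderr.
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Plain pipe: only stdout reaches wc, so one line is counted;
# the stderr line bypasses the pipe and appears on the console.
plain=$(emit | wc -l | tr -d ' ')

# With 2>&1, stderr is merged into stdout before the pipe, so wc sees both lines.
merged=$(emit 2>&1 | wc -l | tr -d ' ')

printf 'plain: %s, merged: %s\n' "$plain" "$merged"
```

The trailing tr only strips the padding that some wc implementations put in front of the count.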

Example

Below is an example of a pipeline that implements a kind of spell checker for the web resource indicated by the URL http://www.wikipedia.org/wiki/Pipeline.
  curl http://www.wikipedia.org/wiki/Pipeline | \
  sed 's/[^a-zA-Z ]//g' | \
  tr 'A-Z ' 'a-z\n' | \
  grep '[a-z]' | \
  sort -u | \
  comm -23 - /usr/dict/words
Here is an explanation of the pipeline:
  • First, the curl program obtains the HTML contents of a web page.
  • The contents of this page are piped through sed, which removes all characters that are not spaces or letters.
  • tr then changes all of the uppercase letters into their lowercase counterparts and converts the spaces in the lines of text to newlines, so that each 'word' ends up on a separate line.
  • grep keeps only the lines that contain at least one letter, which removes the empty lines.
  • sort sorts the list of 'words' into alphabetical order and removes duplicates.
  • Finally, comm finds which of the words in the list are not in the given dictionary file (in this case, /usr/dict/words). With -23 it prints only the lines unique to its first input; both inputs must be sorted.
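The same filter chain can be tried without network access or a system dictionary. The following is a sketch under stated assumptions: the sample sentence, the five-word dictionary and the function name spellcheck are all made up for illustration, and the curl step is replaced by printf.

```shell
#!/bin/sh
# Offline sketch of the spell-check pipeline with made-up input data.
export LC_ALL=C   # make sort and comm agree on the collation order

# Hypothetical five-word dictionary, sorted as comm requires.
dict=$(mktemp)
printf '%s\n' a is pipeline simple this | sort > "$dict"

# The filter stages from the example, reading text on standard input.
spellcheck() {
    sed 's/[^a-zA-Z ]//g' |
    tr 'A-Z ' 'a-z\n' |
    grep '[a-z]' |
    sort -u |
    comm -23 - "$dict"
}

misspelled=$(printf 'This is a simpel Pipeline, a simple pipelne.\n' | spellcheck)
printf '%s\n' "$misspelled"
rm -f "$dict"
```

Only "pipelne" and "simpel" are reported, since every other word of the sample sentence appears in the small dictionary.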

Creating pipelines by program

Pipelines can also be created under program control, for example with the pipe() system call, which is what the shell itself uses.

Implementation

In most Unix-like systems, all processes of a pipeline are started at the same time, with their streams appropriately connected, and are managed by the scheduler together with all other processes running on the machine. An important aspect of this, setting Unix pipes apart from other pipe implementations, is buffering: a sending program may produce 1000 bytes per second while a receiving program can accept only 100 bytes per second, but the operating system holds the data in a buffer, or queue, so that the receiving program need not worry about dropping data because it is too busy to receive it. When the buffer fills up, the sender is simply blocked until the receiver has consumed some of the data. Buffers also collect data from their senders as soon as it is made available, so a sender need not finish its job, or exit, before a receiver can start working on its output.

Other implementations have provided pipe-like functionality without multitasking. Under MS-DOS, for example, only one process could run at a time; it could start another process, but that process then had to complete before the initial process regained control. So when pipes were used, the command.com shell would first create a temporary buffer file and make it its own standard output. It would then start the "sending" process, which inherited the buffer file as its standard output and wrote its entire output to it. Once the sending process had terminated, the shell would close the buffer file, reopen it in read mode as its standard input, and run the second, "receiving" process, which in turn inherited it as standard input. This provided functionality similar to Unix shell pipes, but required each process to complete its work before handing its output off to the receiver.

This was impractical for long-running processes, so MS-DOS programs that wrote a lot of text to standard output often offered their own output "pagination" rather than relying on the user to pipe them through the standard more.exe utility. Moreover, many, if not most, non-trivial MS-DOS programs did not use MS-DOS file handles for either input or output; instead they read the keyboard directly at the BIOS level, performed BIOS calls for output, or even wrote directly to video memory. Such programs could not be redirected through pipes in either direction.

Tools like netcat can connect pipes to TCP/IP sockets, following the Unix philosophy of "everything is a file".
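The contrast between concurrent Unix pipes and the temporary-file approach described above for MS-DOS can be sketched in a Unix shell. This is an illustration only, not a reproduction of what command.com did: the variable names are hypothetical, and the second half merely emulates the "run the sender to completion, then the receiver" order with an ordinary file.

```shell
#!/bin/sh
# Concurrent pipe: "yes" would run forever on its own, yet the pipeline
# ends as soon as "head" has read three lines and exits.
first_three=$(yes ok | head -n 3)
printf '%s\n' "$first_three"

# Temporary-file emulation: the sender must finish before the receiver
# starts, so it only works with a sender that terminates by itself.
tmp=$(mktemp)
printf '%s\n' 1 2 3 4 5 > "$tmp"   # sender runs to completion first
via_file=$(head -n 3 < "$tmp")     # receiver starts only afterwards
rm -f "$tmp"
printf '%s\n' "$via_file"
```

With a never-ending sender such as yes, the temporary-file variant would not reach the receiver at all, which is the limitation noted for long-running processes.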

History

The pipeline concept and the vertical-bar notation were invented by Douglas McIlroy, one of the authors of the early command shells, after he noticed that much of the time users were processing the output of one program as the input to another. The idea was eventually ported to other operating systems, such as DOS, OS/2, Windows NT, BeOS, and Mac OS X, often with the same notation.
