A computer program is a sequence or set of instructions in a programming language for a computer to execute. It is one component of software, which also includes documentation and other intangible components.[1]
A computer program in its human-readable form is called source code. Source code needs another computer program to execute because computers can only execute their native machine instructions. Therefore, source code may be translated to machine instructions using a compiler written for the language. (Assembly language programs are translated using an assembler.) The resulting file is called an executable. Alternatively, source code may execute within an interpreter written for the language.[2]
If the executable is requested for execution, then the operating system loads it into memory and starts a process.[3] The central processing unit will soon switch to this process so it can fetch, decode, and then execute each machine instruction.[4]
If the source code is requested for execution, then the operating system loads the corresponding interpreter into memory and starts a process. The interpreter then loads the source code into memory to translate and execute each statement. Running the source code is slower than running an executable.[5] Moreover, the interpreter must be installed on the computer.
The "Hello, World!" program is used to illustrate a language's basic syntax. The syntax of the language BASIC (1964) was intentionally limited to make the language easy to learn.[6] For example, variables are not declared before being used.[7] Also, variables are automatically initialized to zero.[7] Here is an example computer program, in Basic, to average a list of numbers:[8]
Once the mechanics of basic computer programming are learned, more sophisticated and powerful languages are available to build large computer systems.[9]
See also: History of computing, History of programming languages and History of software.
Improvements in software development are the result of improvements in computer hardware. At each stage in hardware's history, the task of computer programming changed dramatically.
In 1837, Jacquard's loom inspired Charles Babbage to attempt to build the Analytical Engine.[10] The names of the components of the calculating device were borrowed from the textile industry. In the textile industry, yarn was brought from the store to be milled. The device had a store which consisted of memory to hold 1,000 numbers of 50 decimal digits each.[11] Numbers from the store were transferred to the mill for processing. It was programmed using two sets of perforated cards. One set directed the operation and the other set inputted the variables.[12] However, the thousands of cogged wheels and gears never fully worked together.[13]
Ada Lovelace worked for Charles Babbage to create a description of the Analytical Engine (1843). The description contained Note G which completely detailed a method for calculating Bernoulli numbers using the Analytical Engine. This note is recognized by some historians as the world's first computer program.[13]
In 1936, Alan Turing introduced the Universal Turing machine, a theoretical device that can model every computation.[14] It is a finite-state machine that has an infinitely long read/write tape. The machine can move the tape back and forth, changing its contents as it performs an algorithm. The machine starts in the initial state, goes through a sequence of steps, and halts when it encounters the halt state.[15] All present-day computers are Turing complete.[16]
The Electronic Numerical Integrator And Computer (ENIAC) was built between July 1943 and Fall 1945. It was a Turing complete, general-purpose computer that used 17,468 vacuum tubes to create the circuits. At its core, it was a series of Pascalines wired together.[17] Its 40 units weighed 30 tons, occupied 1,800 square feet, and consumed $650 per hour (in 1940s currency) in electricity when idle. It had 20 base-10 accumulators. Programming the ENIAC took up to two months. Three function tables were on wheels and needed to be rolled to fixed function panels. Function tables were connected to function panels by plugging heavy black cables into plugboards. Each function table had 728 rotating knobs. Programming the ENIAC also involved setting some of the 3,000 switches. Debugging a program took a week.[18] It ran from 1947 until 1955 at Aberdeen Proving Ground, calculating hydrogen bomb parameters, predicting weather patterns, and producing firing tables to aim artillery guns.[19]
Instead of plugging in cords and turning switches, a stored-program computer loads its instructions into memory just like it loads its data into memory.[20] As a result, the computer could be programmed quickly and perform calculations at very fast speeds.[21] Presper Eckert and John Mauchly built the ENIAC. The two engineers introduced the stored-program concept in a three-page memo dated February 1944.[22] Later, in September 1944, John von Neumann began working on the ENIAC project. On June 30, 1945, von Neumann published the First Draft of a Report on the EDVAC, which equated the structures of the computer with the structures of the human brain.[21] The design became known as the von Neumann architecture. The architecture was simultaneously deployed in the construction of the EDVAC and EDSAC computers in 1949.[23]
The IBM System/360 (1964) was a family of computers, each having the same instruction set architecture. The Model 20 was the smallest and least expensive. Customers could upgrade and retain the same application software.[24] The Model 195 was the most powerful and most expensive. Each System/360 model featured multiprogramming[24], that is, having multiple processes in memory at once. When one process was waiting for input/output, another could compute.
IBM planned for each model to be programmed using PL/1.[25] A committee was formed that included COBOL, Fortran and ALGOL programmers. The purpose was to develop a language that was comprehensive, easy to use, extendible, and would replace COBOL and Fortran.[25] The result was a large and complex language that took a long time to compile.[26]
Computers manufactured until the 1970s had front-panel switches for manual programming.[27] The computer program was written on paper for reference. An instruction was represented by a configuration of on/off settings. After setting the configuration, an execute button was pressed. This process was then repeated. Computer programs could also be loaded automatically from paper tape, punched cards, or magnetic tape. After the medium was loaded, the starting address was set via switches, and the execute button was pressed.[27]
A major milestone in software development was the invention of the Very Large Scale Integration (VLSI) circuit (1964).[28] Following World War II, tube-based technology was replaced with point-contact transistors (1947) and bipolar junction transistors (late 1950s) mounted on a circuit board.[28] During the 1960s, the aerospace industry replaced the circuit board with an integrated circuit chip.[28]
Robert Noyce, co-founder of Fairchild Semiconductor (1957) and Intel (1968), achieved a technological improvement to refine the production of field-effect transistors (1963).[29] The goal is to alter the electrical resistivity and conductivity of a semiconductor junction. First, naturally occurring silicate minerals are converted into polysilicon rods using the Siemens process.[30] The Czochralski process then converts the rods into a monocrystalline silicon boule.[31] The crystal is then thinly sliced to form a wafer substrate. The planar process of photolithography then integrates unipolar transistors, capacitors, diodes, and resistors onto the wafer to build a matrix of metal–oxide–semiconductor (MOS) transistors.[32][33] The MOS transistor is the primary component in integrated circuit chips.[29]
Originally, integrated circuit chips had their function set during manufacturing. During the 1960s, controlling the electrical flow migrated to programming a matrix of read-only memory (ROM). The matrix resembled a two-dimensional array of fuses.[28] The process to embed instructions onto the matrix was to burn out the unneeded connections.[28] There were so many connections that firmware programmers wrote a computer program on another chip to oversee the burning.[28] The technology became known as Programmable ROM. In 1971, Intel installed the computer program onto the chip and named it the Intel 4004 microprocessor.[34]
The terms microprocessor and central processing unit (CPU) are now used interchangeably. However, CPUs predate microprocessors. For example, the IBM System/360 (1964) had a CPU made from circuit boards containing discrete components on ceramic substrates.[35]
The Intel 4004 (1971) was a 4-bit microprocessor designed to run the Busicom calculator. Five months after its release, Intel released the Intel 8008, an 8-bit microprocessor. Bill Pentz led a team at Sacramento State to build the first microcomputer using the Intel 8008: the Sac State 8008 (1972).[36] Its purpose was to store patient medical records. The computer supported a disk operating system to run a Memorex 3-megabyte hard disk drive.[28] It had a color display and a keyboard, packaged in a single console. The disk operating system was programmed using IBM's Basic Assembly Language (BAL). The medical records application was programmed using a BASIC interpreter.[28] However, the computer was an evolutionary dead-end because it was extremely expensive. Also, it was built at a public university lab for a specific purpose.[36] Nonetheless, the project contributed to the development of the Intel 8080 (1974) instruction set.[28]
In 1978, the modern software development environment began when Intel upgraded the Intel 8080 to the Intel 8086. Intel simplified the Intel 8086 to manufacture the cheaper Intel 8088.[37] IBM embraced the Intel 8088 when they entered the personal computer market (1981). As consumer demand for personal computers increased, so did Intel's microprocessor development. The succession of development is known as the x86 series. The x86 assembly language is a family of backward-compatible machine instructions. Machine instructions created in earlier microprocessors were retained throughout microprocessor upgrades. This enabled consumers to purchase new computers without having to purchase new application software. The major categories of instructions are:
VLSI circuits enabled the programming environment to advance from a computer terminal (until the 1990s) to a graphical user interface (GUI) computer. Computer terminals limited programmers to a single shell running in a command-line environment. During the 1970s, full-screen source code editing became possible through a text-based user interface. Regardless of the technology available, the goal is to program in a programming language.
Programming language features exist to provide building blocks to be combined to express programming ideals.[38] Ideally, a programming language should:[38]
The style in which a programming language provides these building blocks may be categorized into programming paradigms.[39] For example, different paradigms may differentiate:[39]
Each of these programming styles has contributed to the synthesis of different programming languages.[39]
A programming language is a set of keywords, symbols, identifiers, and rules by which programmers can communicate instructions to the computer.[40] These elements are combined according to a set of rules called a syntax.[40]
Programming languages get their basis from formal languages.[41] The purpose of defining a solution in terms of its formal language is to generate an algorithm to solve the underlying problem.[41] An algorithm is a sequence of simple instructions that solve a problem.[42]
See main article: Programming language generations. The evolution of programming languages began when the EDSAC (1949) used the first stored computer program in its von Neumann architecture.[43] Programming the EDSAC was done in the first generation of programming languages.
The key characteristic of an assembly language program is that it forms a one-to-one mapping to its corresponding machine language target.[47]
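For example, the x86 assembly-language instruction mov eax, 1 corresponds to exactly one machine instruction, encoded as the bytes B8 01 00 00 00.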
See main article: Imperative programming.
Imperative languages specify a sequential algorithm using declarations, expressions, and statements:[51]
A declaration introduces a variable name and assigns it a datatype, for example: var x: integer;
An expression yields a value, for example: 2 + 2 yields 4.
A statement might assign an expression to a variable or use the value of a variable to alter the program's control flow, for example: x := 2 + 2; if x = 4 then do_something;
FORTRAN (1958) was unveiled as "The IBM Mathematical FORmula TRANslating system". It was designed for scientific calculations, without string handling facilities. Along with declarations, expressions, and statements, it supported:
It succeeded because:
However, non-IBM vendors had also written Fortran compilers whose syntax would likely fail IBM's compiler.[53] The American National Standards Institute (ANSI) developed the first Fortran standard in 1966. In 1978, Fortran 77 became the standard until 1991. Fortran 90 supports:
COBOL (1959) stands for "COmmon Business Oriented Language". Fortran manipulated symbols. It was soon realized that symbols did not need to be numbers, so strings were introduced.[54] The US Department of Defense influenced COBOL's development, with Grace Hopper being a major contributor. The statements were English-like and verbose. The goal was to design a language so managers could read the programs. However, the lack of structured statements hindered this goal.[55]
COBOL's development was tightly controlled, so dialects requiring ANSI standardization did not emerge. As a consequence, the language was not changed for 15 years, until 1974. The 1990s version did make consequential changes, like object-oriented programming.[55]
ALGOL (1960) stands for "ALGOrithmic Language". It had a profound influence on programming language design.[56] Emerging from a committee of European and American programming language experts, it used standard mathematical notation and had a readable, structured design. Algol was first to define its syntax using the Backus–Naur form.[56] This led to syntax-directed compilers. It added features like:
Algol's direct descendants include Pascal, Modula-2, Ada, Delphi and Oberon on one branch. On another branch the descendants include C, C++ and Java.[56]
BASIC (1964) stands for "Beginner's All-Purpose Symbolic Instruction Code". It was developed at Dartmouth College for all of their students to learn.[8] If a student did not go on to a more powerful language, the student would still remember Basic.[8] A Basic interpreter was installed in the microcomputers manufactured in the late 1970s. As the microcomputer industry grew, so did the language.[8]
Basic pioneered the interactive session.[8] It offered operating system commands within its environment:
However, the Basic syntax was too simple for large programs.[8] Recent dialects added structure and object-oriented extensions. Microsoft's Visual Basic is still widely used and produces a graphical user interface.[7]
C programming language (1973) got its name because the language BCPL was replaced with B, and AT&T Bell Labs called the next version "C". Its purpose was to write the UNIX operating system.[49] C is a relatively small language, making it easy to write compilers. Its growth mirrored the hardware growth in the 1980s.[49] Its growth also was because it has the facilities of assembly language, but uses a high-level syntax. It added advanced features like:
C allows the programmer to control in which region of memory data is to be stored. Global variables and static variables require the fewest clock cycles to store. The stack is automatically used for the standard variable declarations. Heap memory is returned to a pointer variable from the malloc function.

Global variables are declared outside of the main function.[58] Global variables are visible to main and every other function in the source code. On the other hand, variable declarations inside of main, other functions, or within { } block delimiters are local variables. Local variables also include formal parameter variables. Parameter variables are enclosed within the parentheses of a function definition.[59] Parameters provide an interface to the function.

Local variables declared with the static prefix are also stored in the global and static data region.[57] Unlike global variables, static variables are only visible within the function or block. Static variables always retain their value. An example usage would be the function int increment_counter() { static int counter = 0; counter++; return counter; }

Local variables declared without the static prefix, including formal parameter variables,[61] are called automatic variables[58] and are stored in the stack.[57] They are visible inside the function or block and lose their scope upon exiting the function or block.
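The following complete C program is a minimal sketch (the identifiers are illustrative) showing one variable of each storage kind:

#include <stdio.h>
#include <stdlib.h>

int global_count;                          /* global variable: global and static data region */

int increment_counter(void)
{
    static int counter = 0;                /* static local variable: global and static data region */
    counter++;
    return counter;
}

int main(void)
{
    int automatic_value = 4;               /* automatic (local) variable: stack region */
    int *heap_value = malloc(sizeof(int)); /* malloc returns a pointer to heap memory */

    *heap_value = automatic_value + increment_counter();
    global_count = *heap_value;
    printf("%d\n", global_count);          /* prints 5 */

    free(heap_value);                      /* heap memory is released by the programmer */
    return 0;
}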
The programmer uses the malloc library function to allocate heap memory.[63] Populating the heap with data is an additional copy function. Variables stored in the heap are economically passed to functions using pointers. Without pointers, the entire block of data would have to be passed to the function via the stack.

In the 1970s, software engineers needed language support to break large projects down into modules.[64] One obvious feature was to decompose large projects physically into separate files. A less obvious feature was to decompose large projects logically into abstract data types.[64] At the time, languages supported concrete (scalar) datatypes like integer numbers, floating-point numbers, and strings of characters. Abstract datatypes are structures of concrete datatypes, with a new name assigned. For example, a list of integers could be called integer_list.
In object-oriented jargon, abstract datatypes are called classes. However, a class is only a definition; no memory is allocated. When memory is allocated to a class and bound to an identifier, it is called an object.[65]
Object-oriented imperative languages developed by combining the need for classes and the need for safe functional programming.[66] A function, in an object-oriented language, is assigned to a class. An assigned function is then referred to as a method, member function, or operation. Object-oriented programming is executing operations on objects.[67]
Object-oriented languages support a syntax to model subset/superset relationships. In set theory, an element of a subset inherits all the attributes contained in the superset. For example, a student is a person. Therefore, the set of students is a subset of the set of persons. As a result, students inherit all the attributes common to all persons. Additionally, students have unique attributes that other people do not have. Object-oriented languages model subset/superset relationships using inheritance.[68] Object-oriented programming became the dominant language paradigm by the late 1990s.[64]
C++ (1985) was originally called "C with Classes".[69] It was designed to expand C's capabilities by adding the object-oriented facilities of the language Simula.[70]
An object-oriented module is composed of two files. The definitions file is called the header file. Here is a C++ header file for the GRADE class in a simple school application:
// Used to allow multiple source files to include
// this header file without duplication errors.
// ----------------------------------------------
class GRADE ;
A constructor operation is a function with the same name as the class name.[71] It is executed when the calling operation executes the new statement.
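A minimal GRADE class declaration, consistent with the constructor and the method defined in the source file below (the data member name is an illustrative assumption), might look like:

class GRADE {
public:
    // Constructor: converts a letter into a GRADE object.
    GRADE(const char letter);

    // Method: converts a letter into its numeric value.
    int grade_numeric(const char letter);

    // Data member holding the letter grade.
    char letter;
};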
A module's other file is the source file. Here is a C++ source file for the GRADE class in a simple school application:
GRADE::GRADE(const char letter)
int GRADE::grade_numeric(const char letter)
Here is a C++ header file for the PERSON class in a simple school application:
class PERSON ;
Here is a C++ source file for the PERSON class in a simple school application:
PERSON::PERSON (const char *name)
Here is a C++ header file for the STUDENT class in a simple school application:
// A STUDENT is a subset of PERSON.
// --------------------------------
class STUDENT : public PERSON;
Here is a C++ source file for the STUDENT class in a simple school application:
STUDENT::STUDENT(const char *name) :
    // Execute the constructor of the PERSON superclass.
    // -------------------------------------------------
    PERSON(name)
Here is a driver program for demonstration:
int main(void)
Here is a makefile to compile everything:
all: student_dvr
clean:
    rm student_dvr *.o

student_dvr: student_dvr.cpp grade.o student.o person.o
    c++ student_dvr.cpp grade.o student.o person.o -o student_dvr

grade.o: grade.cpp grade.h
    c++ -c grade.cpp

student.o: student.cpp student.h
    c++ -c student.cpp

person.o: person.cpp person.h
    c++ -c person.cpp
See main article: Declarative programming.
Imperative languages have one major criticism: assigning an expression to a non-local variable may produce an unintended side effect.[72] Declarative languages generally omit the assignment statement and the control flow. They describe what computation should be performed and not how to compute it. Two broad categories of declarative languages are functional languages and logical languages.
The principle behind a functional language is to use lambda calculus as a guide for a well defined semantic.[73] In mathematics, a function is a rule that maps elements from an expression to a range of values. Consider the function:
times_10(x) = 10 * x
The expression 10 * x
is mapped by the function times_10
to a range of values. One value happens to be 20. This occurs when x is 2. So, the application of the function is mathematically written as:
times_10(2) = 20
A functional language compiler will not store this value in a variable. Instead, it will push the value onto the computer's stack before setting the program counter back to the calling function. The calling function will then pop the value from the stack.[74]
Imperative languages do support functions. Therefore, functional programming can be achieved in an imperative language, if the programmer uses discipline. However, a functional language will force this discipline onto the programmer through its syntax. Functional languages have a syntax tailored to emphasize the what.[75]
A functional program is developed with a set of primitive functions followed by a single driver function.[72] Consider the snippet:
function max(a, b){/* code omitted */}
function min(a, b){/* code omitted */}
function range(a, b, c) {
return max(a, max(b, c)) - min(a, min(b, c));
}
The primitives are max and min. The driver function is range. Executing:
put(range(10, 4, 7));
will output 6.
Functional languages are used in computer science research to explore new language features.[76] Moreover, their lack of side effects has made them popular in parallel programming and concurrent programming.[77] However, application developers prefer the object-oriented features of imperative languages.[77]
Lisp (1958) stands for "LISt Processor".[78] It is tailored to process lists. A full structure of the data is formed by building lists of lists. In memory, a tree data structure is built. Internally, the tree structure lends nicely for recursive functions.[79] The syntax to build a tree is to enclose the space-separated elements within parentheses. The following is a list of three elements. The first two elements are themselves lists of two elements:
((A B) (HELLO WORLD) 94)
Lisp has functions to extract and reconstruct elements.[80] The function head returns a list containing the first element in the list. The function tail returns a list containing everything but the first element. The function cons returns a list that is the concatenation of other lists. Therefore, the following expression will return the list x:
cons(head(x), tail(x))
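For example, if x is bound to the three-element list shown above, then head(x) yields ((A B)), tail(x) yields ((HELLO WORLD) 94), and concatenating the two with cons reassembles the original list.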
One drawback of Lisp is that when many functions are nested, the parentheses may look confusing.[75] Modern Lisp environments help ensure parentheses match. As an aside, Lisp does support the imperative language operations of the assignment statement and goto loops.[81] Also, Lisp is not concerned with the datatype of the elements at compile time.[82] Instead, it assigns (and may reassign) the datatypes at runtime. Assigning the datatype at runtime is called dynamic binding.[83] Whereas dynamic binding increases the language's flexibility, programming errors may linger until late in the software development process.[83]
Writing large, reliable, and readable Lisp programs requires forethought. If properly planned, the program may be much shorter than an equivalent imperative language program.[75] Lisp is widely used in artificial intelligence. However, its usage has been accepted only because it has imperative language operations, making unintended side-effects possible.[77]
ML (1973)[84] stands for "Meta Language". ML checks to make sure only data of the same type are compared with one another.[85] For example, this function has one input parameter (an integer) and returns an integer:
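In Standard ML syntax, such a function might be written as:

fun times_10 (n : int) : int = 10 * n;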
ML is not parenthesis-eccentric like Lisp. The following is an application of times_10:
times_10 2
It returns "20 : int". (Both the results and the datatype are returned.)
Like Lisp, ML is tailored to process lists. Unlike Lisp, each element is the same datatype.[86] Moreover, ML assigns the datatype of an element at compile-time. Assigning the datatype at compile-time is called static binding. Static binding increases reliability because the compiler checks the context of variables before they are used.[87]
Prolog (1972) stands for "PROgramming in LOGic". It is a logic programming language, based on formal logic. The language was developed by Alain Colmerauer and Philippe Roussel in Marseille, France. It is an implementation of Selective Linear Definite clause resolution, pioneered by Robert Kowalski and others at the University of Edinburgh.[88]
The building blocks of a Prolog program are facts and rules. Here is a simple example:
animal(X) :- cat(X). % each cat is an animal
animal(X) :- mouse(X). % each mouse is an animal
big(X) :- cat(X). % each cat is big
small(X) :- mouse(X). % each mouse is small
eat(X,Y) :- mouse(X), cheese(Y). % each mouse eats each cheese
eat(X,Y) :- big(X), small(Y). % each big animal eats each small animal
After all the facts and rules are entered, a question can be asked:
Will Tom eat Jerry?
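Assuming facts that introduce the two animals named in the question (the fact names are illustrative), the query succeeds:

cat(tom). % tom is a cat
mouse(jerry). % jerry is a mouse

?- eat(tom, jerry).
true.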
The following example shows how Prolog will convert a letter grade to its numeric value:
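A sketch of such a conversion (the predicate names are illustrative) is a set of facts that can then be queried:

numeric_grade('A', 4).
numeric_grade('B', 3).
numeric_grade('C', 2).
numeric_grade('D', 1).
numeric_grade('F', 0).

?- numeric_grade('B', N).
N = 3.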
Here is a comprehensive example:[89]
1) All dragons billow fire, or equivalently, a thing billows fire if the thing is a dragon.
4) A thing is a creature if the thing is a dragon.
5) Norberta is a dragon, and Puff is a creature. Norberta is the mother of Puff.
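Written as Prolog clauses, statements (1), (4), and (5) might read as follows (the predicate names are illustrative; rules (2) and (3) are not shown):

billows_fire(X) :- dragon(X). % (1) a thing billows fire if the thing is a dragon
creature(X) :- dragon(X). % (4) a thing is a creature if the thing is a dragon
dragon(norberta). % (5) Norberta is a dragon
creature(puff). % Puff is a creature
mother(norberta, puff). % Norberta is the mother of Puff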
Rule (2) is a recursive (inductive) definition. It can be understood declaratively, without the need to understand how it is executed.
Rule (3) shows how functions are represented by using relations. Here, the mother and father functions ensure that every individual has only one mother and only one father.
Prolog is an untyped language. Nonetheless, inheritance can be represented by using predicates. Rule (4) asserts that a creature is a superclass of a dragon.
Questions are answered using backward reasoning: starting from the question, Prolog works backward through the rules until it reaches the facts that prove or fail the goal.
Practical applications for Prolog are knowledge representation and problem solving in artificial intelligence.
Object-oriented programming is a programming method to execute operations (functions) on objects.[90] The basic idea is to group the characteristics of a phenomenon into an object container and give the container a name. The operations on the phenomenon are also grouped into the container.[90] Object-oriented programming developed by combining the need for containers and the need for safe functional programming.[91] This programming method need not be confined to an object-oriented language.[92] In an object-oriented language, an object container is called a class. In a non-object-oriented language, a data structure (which is also known as a record) may become an object container. To turn a data structure into an object container, operations need to be written specifically for the structure. The resulting structure is called an abstract datatype.[93] However, inheritance will be missing. Nonetheless, this shortcoming can be overcome.
Here is a C programming language header file for the GRADE abstract datatype in a simple school application:
/* Used to allow multiple source files to include */
/* this header file without duplication errors. */
/* ---------------------------------------------- */
typedef struct GRADE;
/* Constructor */
/* ----------- */
GRADE *grade_new(char letter);
int grade_numeric(char letter);
The grade_new function performs the same algorithm as the C++ constructor operation.
Here is a C programming language source file for the GRADE abstract datatype in a simple school application:
GRADE *grade_new(char letter)
int grade_numeric(char letter)
In the constructor, the function calloc is used instead of malloc because each memory cell will be set to zero.
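A sketch of such a constructor, assuming the GRADE structure holds a single letter field, is:

#include <stdlib.h>

/* The abstract datatype: a structure of concrete datatypes with a new name assigned. */
typedef struct {
    char letter;
} GRADE;

/* Constructor: allocates a zero-initialized GRADE object on the heap. */
GRADE *grade_new(char letter)
{
    GRADE *grade = calloc(1, sizeof(GRADE));
    grade->letter = letter;
    return grade;
}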
Here is a C programming language header file for the PERSON abstract datatype in a simple school application:
typedef struct PERSON;
/* Constructor */
/* ----------- */
PERSON *person_new(char *name);
Here is a C programming language source file for the PERSON abstract datatype in a simple school application:
PERSON *person_new(char *name)
Here is a C programming language header file for the STUDENT abstract datatype in a simple school application:
typedef struct STUDENT;
/* Constructor */
/* ----------- */
STUDENT *student_new(char *name);
Here is a C programming language source file for the STUDENT abstract datatype in a simple school application:
STUDENT *student_new(char *name)
Here is a driver program for demonstration:
int main(void)
Here is a makefile to compile everything:
all: student_dvr
clean:
    rm student_dvr *.o

student_dvr: student_dvr.c grade.o student.o person.o
    gcc student_dvr.c grade.o student.o person.o -o student_dvr

grade.o: grade.c grade.h
    gcc -c grade.c

student.o: student.c student.h
    gcc -c student.c

person.o: person.c person.h
    gcc -c person.c
The formal strategy to build object-oriented objects is to:[94]
For example:
The syntax of a computer program is a list of production rules which form its grammar.[95] A programming language's grammar correctly places its declarations, expressions, and statements.[96] Complementing the syntax of a language are its semantics. The semantics describe the meanings attached to various syntactic constructs.[97] A syntactic construct may need a semantic description because a production rule may have an invalid interpretation.[98] Also, different languages might have the same syntax; however, their behaviors may be different.
The syntax of a language is formally described by listing the production rules. Whereas the syntax of a natural language is extremely complicated, a subset of the English language can have this production rule listing:[99]
The words in bold-face are known as non-terminals. The words in 'single quotes' are known as terminals.[100]
From this production rule listing, complete sentences may be formed using a series of replacements.[101] The process is to replace non-terminals with either a valid non-terminal or a valid terminal. The replacement process repeats until only terminals remain. One valid sentence is:
However, another combination results in an invalid sentence:
Therefore, a semantic is necessary to correctly describe the meaning of an eat activity.
One production rule listing method is called the Backus–Naur form (BNF).[102] BNF describes the syntax of a language and itself has a syntax. This recursive definition is an example of a meta-language.[97] The syntax of BNF includes:
::= which translates to "is made up of a[n]" when a non-terminal is to its right. It translates to "is" when a terminal is to its right.
| which translates to "or".
< and > which surround non-terminals.
Using BNF, a subset of the English language can have this production rule listing:
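One possible listing (the terminal words are illustrative) is:

<sentence> ::= <noun-phrase> <verb-phrase>
<noun-phrase> ::= <article> <noun>
<verb-phrase> ::= <verb> <noun-phrase>
<article> ::= 'the' | 'a'
<noun> ::= 'dog' | 'cat' | 'bird'
<verb> ::= 'chased' | 'ate'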
Using BNF, a signed-integer has the production rule listing:[103]
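A sketch of such a listing (illustrative) is:

<signed-integer> ::= <sign> <integer>
<sign> ::= '+' | '-'
<integer> ::= <digit> | <digit> <integer>
<digit> ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'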
In the sketch above, notice the recursive production rule: an integer is a digit, or a digit followed by an integer. Notice also that the production rules allow an integer to begin with a leading zero, such as 007.
Two formal methods are available to describe semantics. They are denotational semantics and axiomatic semantics.[104]
Software engineering is a variety of techniques to produce quality computer programs.[105] Computer programming is the process of writing or editing source code. In a formal environment, a systems analyst will gather information from managers about all the organization's processes to automate. This professional then prepares a detailed plan for the new or modified system.[106] The plan is analogous to an architect's blueprint.[106]
The systems analyst has the objective to deliver the right information to the right person at the right time.[107] The critical factors to achieve this objective are:[107]
Achieving performance objectives should be balanced with all of the costs, including:[108]
Applying a systems development process will mitigate the axiom: the later in the process an error is detected, the more expensive it is to correct.[109]
The waterfall model is an implementation of a systems development process.[110] As the waterfall label implies, the basic phases overlap each other:[111]
A computer programmer is a specialist responsible for writing or modifying the source code to implement the detailed plan.[106] A programming team is likely to be needed because most systems are too large to be completed by a single programmer.[113] However, adding programmers to a project may not shorten the completion time. Instead, it may lower the quality of the system.[113] To be effective, program modules need to be defined and distributed to team members.[113] Also, team members must interact with one another in a meaningful and effective way.[113]
Computer programmers may be programming in the small: programming within a single module.[114] Chances are a module will execute modules located in other source code files. Therefore, computer programmers may be programming in the large: programming modules so they will effectively couple with each other.[114] Programming-in-the-large includes contributing to the application programming interface (API).
Modular programming is a technique to refine imperative language programs. Refined programs may reduce the software size, separate responsibilities, and thereby mitigate software aging. A program module is a sequence of statements that are bounded within a block and together identified by a name.[115] Modules have a function, context, and logic:[116]
The module's name should be derived first from its function, then from its context. Its logic should not be part of the name.[116] For example, function compute_square_root(x) or function compute_square_root_integer(i : integer) are appropriate module names. However, function compute_square_root_by_division(x) is not.
The degree of interaction within a module is its level of cohesion.[116] Cohesion is a judgment of the relationship between a module's name and its function. The degree of interaction between modules is the level of coupling.[117] Coupling is a judgment of the relationship between a module's context and the elements being performed upon.
The levels of cohesion from worst to best are:[118]
Coincidental cohesion: the module performs multiple, completely unrelated functions, for example, function read_sales_record_print_next_line_convert_to_float. Coincidental cohesion occurs in practice if management enforces silly rules. For example, "Every module will have between 35 and 50 executable statements."[118]
Logical cohesion: the module performs one of several functions, selected by a flag, for example, function perform_arithmetic(perform_addition, a, b).
Temporal cohesion: the module performs functions related only by time, for example, function initialize_variables_and_open_files. Another example, stage_one, stage_two, ...
Procedural cohesion: the module performs multiple loosely related functions, for example, function read_part_number_update_employee_record.
Communicational cohesion: the module performs multiple closely related functions that operate on the same data, for example, function read_part_number_update_sales_record.
Functional cohesion: the module performs a single, well-defined function; this is the level to aim for.
The levels of coupling from worst to best are:[117]
Content coupling: one module modifies or relies upon the internal workings of another module.
Common coupling: modules share the same global data.
Control coupling: one module passes a flag to direct the logic of another module, for example, perform_arithmetic(perform_addition, a, b). Instead, control should be on the makeup of the returned object.
Stamp coupling: modules share a composite data structure but use only parts of it.
Data coupling: modules share data only through parameters, and every parameter is used.

Data flow analysis is a design method used to achieve modules of functional cohesion and data coupling.[119] The input to the method is a data-flow diagram. A data-flow diagram is a set of ovals representing modules. Each module's name is displayed inside its oval. Modules may be at the executable level or the function level.
The diagram also has arrows connecting modules to each other. Arrows pointing into modules represent a set of inputs. Each module should have only one arrow pointing out from it to represent its single output object. (Optionally, an additional exception arrow points out.) A daisy chain of ovals will convey an entire algorithm. The input modules should start the diagram. The input modules should connect to the transform modules. The transform modules should connect to the output modules.[120]
Computer programs may be categorized along functional lines. The main functional categories are application software and system software. System software includes the operating system, which couples computer hardware with application software. The purpose of the operating system is to provide an environment where application software executes in a convenient and efficient manner.[121] Both application software and system software execute utility programs. At the hardware level, a microcode program controls the circuits throughout the central processing unit.
See main article: Application software. Application software is the key to unlocking the potential of the computer system.[122] Enterprise application software bundles accounting, personnel, customer, and vendor applications. Examples include enterprise resource planning, customer relationship management, and supply chain management software.
Enterprise applications may be developed in-house as one-of-a-kind proprietary software.[123] Alternatively, they may be purchased as off-the-shelf software. Purchased software may be modified to provide custom software. If the application is customized, then either the company's resources are used or the resources are outsourced. Outsourced software development may be from the original software vendor or a third-party developer.[124]
The potential advantages of in-house software are features and reports may be developed exactly to specification.[125] Management may also be involved in the development process and offer a level of control.[126] Management may decide to counteract a competitor's new initiative or implement a customer or vendor requirement.[127] A merger or acquisition may necessitate enterprise software changes. The potential disadvantages of in-house software are time and resource costs may be extensive.[123] Furthermore, risks concerning features and performance may be looming.
The potential advantages of off-the-shelf software are upfront costs are identifiable, the basic needs should be fulfilled, and its performance and reliability have a track record.[123] The potential disadvantages of off-the-shelf software are it may have unnecessary features that confuse end users, it may lack features the enterprise needs, and the data flow may not match the enterprise's work processes.[123]
One approach to economically obtaining a customized enterprise application is through an application service provider.[128] Specialty companies provide hardware, custom software, and end-user support. They may speed the development of new applications because they possess skilled information system staff. The biggest advantage is it frees in-house resources from staffing and managing complex computer projects.[128] Many application service providers target small, fast-growing companies with limited information system resources.[128] On the other hand, larger companies with major systems will likely have their technical infrastructure in place. One risk is having to trust an external organization with sensitive information. Another risk is having to trust the provider's infrastructure reliability.[128]
See also: Operating system. An operating system is the low-level software that supports a computer's basic functions, such as scheduling processes and controlling peripherals.[121]
In the 1950s, the programmer, who was also the operator, would write a program and run it. After the program finished executing, the output may have been printed, or it may have been punched onto paper tape or cards for later processing.[27] More often than not the program did not work. The programmer then looked at the console lights and fiddled with the console switches. If less fortunate, a memory printout was made for further study. In the 1960s, programmers reduced the amount of wasted time by automating the operator's job. A program called an operating system was kept in the computer at all times.[129]
The term operating system may refer to two levels of software.[130] The operating system may refer to the kernel program that manages the processes, memory, and devices. More broadly, the operating system may refer to the entire package of the central software. The package includes a kernel program, command-line interpreter, graphical user interface, utility programs, and editor.[130]
The kernel's main purpose is to manage the limited resources of a computer:
Originally, operating systems were programmed in assembly; however, modern operating systems are typically written in higher-level languages like C, Objective-C, and Swift.
A utility program is designed to aid system administration and software execution. Operating systems execute hardware utility programs to check the status of disk drives, memory, speakers, and printers.[139] A utility program may optimize the placement of a file on a crowded disk. System utility programs monitor hardware and network performance. When a metric is outside an acceptable range, a trigger alert is generated.[140]
Utility programs include compression programs so data files are stored on less disk space.[139] Compressed programs also save time when data files are transmitted over the network.[139] Utility programs can sort and merge data sets.[140] Utility programs detect computer viruses.[140]
See main article: Microcode. A microcode program is the bottom-level interpreter that controls the data path of software-driven computers.[141] (Advances in hardware have migrated these operations to hardware execution circuits.)[141] Microcode instructions allow the programmer to more easily implement the digital logic level[142] —the computer's real hardware. The digital logic level is the boundary between computer science and computer engineering.[143]
A logic gate is a tiny circuit, built from one or more transistors, that can output one of two signals: on or off.[144]
The NOT, AND, OR, NAND, and NOR gates form the building blocks of binary algebra, the digital logic functions of the computer.
Microcode instructions are mnemonics programmers may use to execute digital logic functions instead of forming them in binary algebra. They are stored in a central processing unit's (CPU) control store.[145] These hardware-level instructions move data throughout the data path.
The micro-instruction cycle begins when the microsequencer uses its microprogram counter to fetch the next machine instruction from random-access memory.[146] The next step is to decode the machine instruction by selecting the proper output line to the hardware module.[147] The final step is to execute the instruction using the hardware module's set of gates.
Instructions to perform arithmetic are passed through an arithmetic logic unit (ALU).[148] The ALU has circuits to perform elementary operations to add, shift, and compare integers. By combining and looping the elementary operations through the ALU, the CPU performs its complex arithmetic.
Microcode instructions move data between the CPU and the memory controller. Memory controller microcode instructions manipulate two registers. The memory address register is used to access each memory cell's address. The memory data register is used to set and read each cell's contents.[149]
Microcode instructions move data between the CPU and the many computer buses. The disk controller bus writes to and reads from hard disk drives. Data is also moved between the CPU and other functional units via the peripheral component interconnect express bus.[150]