Data (computer science) explained

In computer science, data (treated as singular, plural, or as a mass noun) is any sequence of one or more symbols; a datum is a single symbol of data. Data requires interpretation to become information. Digital data is data represented using the binary number system of ones (1) and zeros (0), rather than in analog form. In modern (post-1960) computer systems, all data is digital.
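
The point that data requires interpretation can be illustrated with a minimal sketch. The four byte values below are arbitrary and chosen only for illustration; the same bytes yield different information depending on the interpretation applied to them.

```python
import struct

# Four arbitrary bytes: on their own they are just a sequence of symbols.
raw = bytes([0x42, 0x28, 0x00, 0x00])

# Interpretation 1: a big-endian unsigned 32-bit integer.
as_int = struct.unpack(">I", raw)[0]

# Interpretation 2: a big-endian IEEE 754 single-precision float.
as_float = struct.unpack(">f", raw)[0]

# Interpretation 3: the underlying binary digits, byte by byte.
as_bits = " ".join(f"{byte:08b}" for byte in raw)

print(as_int)    # 1109917696
print(as_float)  # 42.0
print(as_bits)   # 01000010 00101000 00000000 00000000
```

Nothing in the bytes themselves says which reading is intended; that knowledge has to come from outside the data.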

Data exists in three states: data at rest, data in transit, and data in use. Within a computer, data in most cases moves as parallel data; to or from a computer, it in most cases moves as serial data. Data sourced from an analog device, such as a temperature sensor, may be converted to digital form using an analog-to-digital converter. Data representing quantities, characters, or symbols on which a computer performs operations is stored and recorded on magnetic, optical, electronic, or mechanical recording media, and is transmitted in the form of digital electrical or optical signals.[1] Data passes in and out of computers via peripheral devices.
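
As a rough sketch of the conversion and transmission steps described above, the following models an idealized analog-to-digital converter and then emits the resulting code one bit at a time, as a serial link would. The 3.3 V reference, the 10-bit resolution, and the function names are assumptions made for the example; real converters differ in many details.

```python
from typing import List


def quantize(voltage: float, v_ref: float = 3.3, bits: int = 10) -> int:
    """Map an analog voltage in [0, v_ref] to an integer code (idealized ADC)."""
    max_code = (1 << bits) - 1                 # e.g. 1023 for 10 bits
    clamped = min(max(voltage, 0.0), v_ref)    # out-of-range inputs saturate
    return round(clamped / v_ref * max_code)


def to_serial_bits(code: int, bits: int = 10) -> List[int]:
    """Return the code as a list of bits, most significant bit first."""
    return [(code >> i) & 1 for i in reversed(range(bits))]


print(quantize(0.0))          # 0
print(quantize(3.3))          # 1023
print(to_serial_bits(5, 4))   # [0, 1, 0, 1]
```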

Physical computer memory elements consist of an address and a byte/word of data storage. Digital data is often stored in relational databases, in the form of tables or SQL databases, and can generally be represented as abstract key/value pairs. Data can be organized in many different types of data structures, including arrays, graphs, and objects. Data structures can store data of many different types, including numbers, strings, and even other data structures.
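
A small sketch may make the relationship between tables, key/value pairs, and nested data structures concrete. The column names and rows are invented for illustration.

```python
# A relational-style table: each row is a tuple whose first field is the primary key.
columns = ("id", "name", "tags")
rows = [
    (1, "thermometer", ["sensor", "analog"]),
    (2, "hard disk", ["storage"]),
]

# The same data viewed as key/value pairs: primary key -> record,
# where each record is itself a mapping of column name -> value.
table = {row[0]: dict(zip(columns, row)) for row in rows}

print(table[1]["name"])     # thermometer
print(table[2]["tags"][0])  # storage
```

Here a dictionary holds dictionaries, which in turn hold numbers, strings, and lists: data structures storing other data structures.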

Characteristics

Metadata, which is data about data, helps translate data into information. It may be implied, specified, or given.

Data relating to physical events or processes has a temporal component, which may be implied. This is the case when a device such as a temperature logger receives data from a temperature sensor: when a temperature is received, the data is assumed to have a temporal reference of now, so the device records the date, time, and temperature together. When the data logger communicates temperatures, it must also report the date and time as metadata for each reading.
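
The temperature logger described above might be sketched as follows. The class and field names are hypothetical, and the clock is injectable only so that the example produces a repeatable result.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List


@dataclass(frozen=True)
class Reading:
    timestamp: datetime  # metadata: when the value was received
    celsius: float       # the measured value itself


class TemperatureLogger:
    def __init__(self, clock: Callable[[], datetime] = lambda: datetime.now(timezone.utc)):
        self._clock = clock
        self.readings: List[Reading] = []

    def receive(self, celsius: float) -> Reading:
        # The sensor sends only a number; "now" is the implied temporal reference.
        reading = Reading(timestamp=self._clock(), celsius=celsius)
        self.readings.append(reading)
        return reading

    def report(self) -> List[str]:
        # Each temperature is reported together with its date and time.
        return [f"{r.timestamp.isoformat()} {r.celsius:.1f}" for r in self.readings]


fixed_clock = lambda: datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
logger = TemperatureLogger(clock=fixed_clock)
logger.receive(21.5)
print(logger.report())  # ['2024-01-01T12:00:00+00:00 21.5']
```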

Fundamentally, computers follow a sequence of instructions they are given in the form of data. A set of instructions to perform a given task (or tasks) is called a program. A program is data in the form of coded instructions to control the operation of a computer or other machine.[2] In the nominal case, the program, as executed by the computer, will consist of machine code. The elements of storage manipulated by the program, but not actually executed by the central processing unit (CPU), are also data. At its most essential, a single datum is a value stored at a specific location. Therefore, it is possible for computer programs to operate on other computer programs, by manipulating their programmatic data.
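
One way to see a program operating on another program is to parse source text into a data structure, alter it, and run the result. The sketch below uses Python's standard `ast` module; the `greet` function and the `Shout` transformer are invented for the example.

```python
import ast

# The "other program", held as ordinary text data.
source = "def greet():\n    return 'hello'\n"


class Shout(ast.NodeTransformer):
    """Rewrite every string constant in the program to upper case."""

    def visit_Constant(self, node: ast.Constant) -> ast.AST:
        if isinstance(node.value, str):
            return ast.copy_location(ast.Constant(value=node.value.upper()), node)
        return node


tree = ast.parse(source)                  # the program as a tree of objects
tree = ast.fix_missing_locations(Shout().visit(tree))

namespace = {}
exec(compile(tree, filename="<rewritten>", mode="exec"), namespace)
print(namespace["greet"]())               # HELLO
```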

To be stored in a file, data bytes have to be serialized in a file format. Typically, programs are stored in special file types, different from those used for other data. Executable files contain programs; all other files are data files. However, executable files may also contain data that is built into the program and used by it. In particular, some executable files have a data segment, which nominally contains constants and initial values for variables, both of which can be considered data.
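
Serialization into a file format can be sketched with a made-up binary record layout. The layout below (a 32-bit identifier, a 64-bit float, and a length-prefixed UTF-8 label) is purely an assumption for illustration and does not correspond to any standard format; an in-memory buffer stands in for a file opened in binary mode.

```python
import io
import struct

# Hypothetical record header: little-endian uint32 id, float64 value,
# and a one-byte length for the UTF-8 label that follows.
HEADER = struct.Struct("<IdB")


def write_record(stream, record_id: int, value: float, label: str) -> None:
    encoded = label.encode("utf-8")
    if len(encoded) > 255:
        raise ValueError("label too long for a one-byte length field")
    stream.write(HEADER.pack(record_id, value, len(encoded)))
    stream.write(encoded)


def read_record(stream):
    record_id, value, length = HEADER.unpack(stream.read(HEADER.size))
    label = stream.read(length).decode("utf-8")
    return record_id, value, label


buffer = io.BytesIO()        # stands in for a file opened in binary mode
write_record(buffer, 7, 21.5, "°C")
buffer.seek(0)
print(read_record(buffer))   # (7, 21.5, '°C')
```

A reader can recover the record only because it shares the writer's knowledge of the layout, which is what a file format provides.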

The line between program and data can become blurry. An interpreter, for example, is a program. The input data to an interpreter is itself a program, just not one expressed in native machine language. In many cases, the interpreted program will be a human-readable text file, which is manipulated with a text editor program. Metaprogramming similarly involves programs manipulating other programs as data. Programs like compilers, linkers, debuggers, program updaters, virus scanners and such use other programs as their data.
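
The interpreter case can be illustrated with a toy example. The stack language below (`push`, `add`, `mul`, `print`) is invented for this sketch; the program it runs is nothing more than a human-readable string handed to the interpreter as input data.

```python
def run(program_text: str) -> list:
    """Interpret a tiny, hypothetical stack language given as plain text."""
    stack, output = [], []
    for line in program_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        op, args = parts[0], parts[1:]
        if op == "push":
            stack.append(int(args[0]))
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == "print":
            output.append(stack[-1])
        else:
            raise ValueError(f"unknown instruction: {op}")
    return output


# To the interpreter this is input data; to its author it is a program.
program = """
push 2
push 3
add
push 4
mul
print
"""
print(run(program))  # [20]
```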

For example, a user might first instruct the operating system to load a word processor program from one file, and then use the running program to open and edit a document stored in another file. In this example, the document would be considered data. If the word processor also features a spell checker, then the dictionary (word list) for the spell checker would also be considered data. The algorithms used by the spell checker to suggest corrections would be either machine code data or text in some interpretable programming language.

In an alternate usage, binary files (which are not human-readable) are sometimes called data as distinguished from human-readable text.[3]

The total amount of digital data in 2007 was estimated to be 281 billion gigabytes (281 exabytes).[4][5]

Data keys and values, structures and persistence

Keys in data provide the context for values. Regardless of how data is structured, a key component is always present, because keys are what give meaning to data values. Without a key that is directly or indirectly associated with a value, or with a collection of values in a structure, the values become meaningless and cease to be data. In other words, a value component must be linked to a key component in order to be considered data.
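
A brief sketch of the idea, using invented measurement names: the bare values carry no meaning until keys are associated with them, whether directly (named keys) or indirectly (an agreed position in a sequence).

```python
# Bare values: without context there is no way to tell what they mean.
values = [21.5, 1013, 40]

# Direct association: each value is linked to a named key.
keys = ["temperature_c", "pressure_hpa", "humidity_pct"]
record = dict(zip(keys, values))
print(record["pressure_hpa"])         # 1013

# Indirect association: the position in the sequence acts as the key,
# and a separate agreement says what each position stands for.
TEMPERATURE, PRESSURE, HUMIDITY = range(3)
print(values[HUMIDITY])               # 40
```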

Data can be represented in computers in multiple ways, as per the following examples:

RAM

Keys

Organised recurring data structures

Sorted or ordered data

Peripheral storage

Indexed data

Abstraction and indirection

Object-oriented programming uses two basic concepts for understanding data and software:

  1. The taxonomic rank-structure of classes, which is an example of a hierarchical data structure; and
  2. at run time, the creation of references to in-memory data-structures of objects that have been instantiated from a class library.

It is only after instantiation that an object of a specified class exists. After the last reference to an object is cleared, the object also ceases to exist. The memory locations where the object's data was stored are then garbage and are reclassified as unused memory available for reuse.
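
The object lifecycle described above can be observed in a small sketch. The `Sensor` class is hypothetical, and the behaviour shown assumes CPython, whose reference counting reclaims an object as soon as its last strong reference disappears; other runtimes may delay reclamation.

```python
import gc
import weakref


class Sensor:
    def __init__(self, name: str):
        self.name = name


sensor = Sensor("probe-1")      # instantiation: the object now exists
watcher = weakref.ref(sensor)   # a weak reference does not keep it alive
assert watcher() is sensor

sensor = None                   # clear the only strong reference
gc.collect()                    # nudge collectors that do not free immediately
print(watcher())                # None in CPython: the object no longer exists
```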

Database data

Parallel distributed data processing

Notes and References

  1. "Data". Lexico. 14 January 2022. Archived at https://web.archive.org/web/20190623094330/https://www.lexico.com/en/definition/data on 2019-06-23.
  2. "Computer program". The Oxford Pocket Dictionary of Current English. 11 October 2012. Archived at https://web.archive.org/web/20111128202415/http://www.encyclopedia.com/topic/computer_program.aspx#2 on 28 November 2011.
  3. "file(1)". OpenBSD manual pages. 24 December 2015. 4 February 2018. Archived at https://web.archive.org/web/20180205000843/https://man.openbsd.org/file.1 on 5 February 2018.
  4. Paul, Ryan. "Study: amount of digital info > global storage capacity". Ars Technica. 12 March 2008. 13 March 2008. Archived at https://web.archive.org/web/20080313111238/http://arstechnica.com/news.ars/post/20080312-study-amount-of-digital-info-global-storage-capacity.html on 13 March 2008.
  5. Gantz, John F.; et al. "The diverse and exploding digital universe". International Data Corporation via EMC. 2008. 12 March 2008. Archived at https://web.archive.org/web/20080311234210/http://www.emc.com/leadership/digital-universe/expanding-digital-universe.htm on 11 March 2008.