The following is provided as an overview of and topical guide to databases:
Database – organized collection of data, today typically in digital form. The data are typically organized to model relevant aspects of reality (for example, the availability of rooms in hotels), in a way that supports processes requiring this information (for example, finding a hotel with vacancies).
What type of things are databases?
Databases can be described as all of the following:
- Information – sequence of symbols that can be interpreted as a message. Information can be recorded as signs, or transmitted as signals.
- Data – values of qualitative or quantitative variables, belonging to a set of items. Data in computing (or data processing) are often represented by a combination of items organized in rows and multiple variables organized in columns. Data are typically the results of measurements and can be visualised using graphs or images.
- Computer data – information in a form suitable for use with a computer. Data is often distinguished from programs. A program is a sequence of instructions that detail a task for the computer to perform. In this sense, data is everything in software that is not program code.
Types of databases
- Active database – includes an event-driven architecture (often in the form of ECA rules) which can respond to conditions both inside and outside the database.
- Animation database – stores fragments of animations or human movements, which can be accessed, analyzed and queried to develop and assemble new animations.
- Back-end database – accessed by users indirectly through an external application rather than by application programming stored within the database itself or by low level manipulation of the data (e.g. through SQL commands).
- Bibliographic database – database of bibliographic records, an organized digital collection of references to published literature, including journal and newspaper articles, conference proceedings, reports, government and legal publications, patents, books, etc.
- Centralized database – database located and maintained in one location, unlike a distributed database.
- Cloud database – runs on a cloud computing platform, such as Amazon EC2, GoGrid and Rackspace.
- Collection database – collection catalog of a museum or archive implemented using a computerized database, in which the institution's objects or material are catalogued.
- Collective Optimization Database – open repository that enables sharing of benchmarks, data sets and optimization cases from the community, and provides web services and plugins to analyze optimization data and predict program transformations or better hardware designs for multi-objective optimizations. The predictions are based on statistical and machine learning techniques, provided enough information has been collected in the repository from multiple users.
- Configuration management database –
- Cooperative database – holds information on customers and their transactions.
- Correlation database – database management system (DBMS) that is data model independent and designed to efficiently handle unplanned, ad hoc queries in an analytical system environment.
- Current database – conventional database that stores data that is valid now.
- Directory – repository or database of information which is optimized for reading, under the assumption that data updates are very rare compared to data reads. Commonly, a directory supports search and browsing in addition to simple lookups.
- Distributed database – database in which storage devices are not all attached to a common CPU.
- Document-oriented database – computer program designed for storing, retrieving, and managing document-oriented (semi-structured) information.
- EDA database – database specialized for the purpose of electronic design automation.
- Endgame tablebase – computerized database that contains precalculated exhaustive analysis of a chess endgame position.
- Food composition database (FCDB) – provides detailed information on the nutritional composition of foods.
- Full-text database – database that contains the complete text of books, dissertations, journals, magazines, newspapers or other kinds of textual documents. Also called a "complete-text database".
- Government database – collects personal information for various reasons (mass surveillance, Schengen Information System in the European Union, social security, statistics, etc.).
- Graph database – uses graph structures with nodes, edges, and properties to represent and store data.
- Knowledge base – special kind of database for knowledge management. A knowledge base provides a means for information to be collected, organised, shared, searched and utilised.
- Mobile database – can be connected to by a mobile computing device over a mobile network.
- Navigational database – database in which objects (or records) are found primarily by following references from other objects.
- Non-native speech database – speech database of non-native pronunciations of English.
- Online database – database accessible from a network, including from the Internet.
- Operational database – accessed by an Operational System to carry out regular operations of an organization.
- Parallel database – improves performance through parallelization of various operations, such as loading data, building indexes and evaluating queries.
- Probabilistic database – uncertain database in which the possible worlds have associated probabilities.
- Real-time database – processing system designed to handle workloads whose state is constantly changing (Buchmann).
- Relational database – collection of data items organized as a set of formally described tables from which data can be accessed easily.
- Spatial database – database that is optimized to store and query data that is related to objects in space, including points, lines and polygons.
- Temporal database – database with built-in time aspects, for example a temporal data model and a temporal version of Structured Query Language (SQL).
- Time series database – a time series is an associative array of numbers indexed by a datetime or a datetime range. These time series are often called profiles or curves, depending upon the market. A time series of stock prices might be called a price curve, or a time series of energy consumption might be called a load profile. Despite the disparate naming, the operations performed on them are sufficiently common as to demand special database treatment.
- Triplestore – purpose-built database for the storage and retrieval of triples, a triple being a data entity composed of subject-predicate-object, like "Bob is 35" or "Bob knows Fred".
- Very large database (VLDB) – contains an extremely high number of tuples (database rows), or occupies an extremely large physical filesystem storage space.
- Virtual private database (VPD) – masks data in a larger database so that security allows only the use of apparently private data.
- Vulnerability database – platform aimed at collecting, maintaining, and disseminating information about discovered vulnerabilities targeting real computer systems.
- XLDB – stands for "eXtremely Large Data Base".
- XML database – data stored in XML format, where it can be queried, exported and serialized into the desired format.
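The triplestore entry above can be illustrated with a minimal in-memory sketch. The class name `TinyTripleStore`, its methods, and the predicate names `age` and `knows` are hypothetical choices made for this illustration; production triplestores add indexes, persistence and a query language, none of which is shown here.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class TinyTripleStore:
    """Toy in-memory triplestore (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(self,
              subject: Optional[str] = None,
              predicate: Optional[str] = None,
              obj: Optional[str] = None) -> List[Triple]:
        """Return triples matching the pattern; None acts as a wildcard."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple for triple in self._triples
            if all(p is None or p == v for p, v in zip(pattern, triple))
        )


store = TinyTripleStore()
store.add("Bob", "age", "35")        # "Bob is 35"
store.add("Bob", "knows", "Fred")    # "Bob knows Fred"
store.add("Fred", "knows", "Alice")

bob_facts = store.match(subject="Bob")
who_knows = store.match(predicate="knows")
print(bob_facts)
print(who_knows)
```

Each query is a pattern over the three positions, which is the basic retrieval operation a triplestore is built around.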
History of databases
Database use
- Database usage requirements –
- Database theory – encapsulates a broad range of topics related to the study and research of the theoretical realm of databases and database management systems.
- Database machine – computer or special hardware that stores and retrieves data from a database. Also called a "back end processor".
- Database server – computer program that provides database services to other computer programs or computers, as defined by the client-server model.
- Database application – computer program whose primary purpose is entering and retrieving information from a computer-managed database.
- Database management system (DBMS) – software package with computer programs that control the creation, maintenance, and use of a database.
- Database connection – facility in computer science that allows client software to communicate with database server software, whether on the same machine or not.
- Datasource – name given to the connection set up to a database from a server. The name is commonly used when creating a query to the database. The data source name (DSN) does not have to be the same as the filename of the database. For example, a database file named "friends.mdb" could be set up with a DSN of "school"; the DSN "school" would then be used to refer to the database when performing a query.
- Data Source Name (DSN) – data structure used to describe a connection to a data source. Sometimes known as a database source name, though data sources are not limited to databases.
- Database administrator (DBA) – person responsible for the installation, configuration, upgrade, administration, monitoring and maintenance of physical databases.
- Lock –
- Comparison of database tools – tables comparing general and technical information for a number of available database administrator tools.
- Database-centric architecture – software architectures in which databases play a crucial role. Also called "data-centric architecture".
- Intelligent database – system that manages information (rather than data) in a way that appears natural to users and that goes beyond simple record keeping.
- Two-phase locking (2PL) – concurrency control method that guarantees serializability.
- Locks with ordered sharing – several variants of the two-phase locking (2PL) concurrency control protocol, generated by changing the blocking semantics of locks upon conflicts.
- Load file – in the litigation community, the file used to import data (coded, captured or extracted data from ESI processing) into a database, or the file used to link images.
- Database publishing – area of automated media production in which specialized techniques are used to generate paginated documents from source data residing in traditional databases.
- Halloween Problem – a phenomenon in databases in which an update operation causes a change in the physical location of a row, potentially allowing the row to be visited more than once during the operation.
- Log shipping – process of automating the backup of a database and transaction log files on a primary (production) database server, and then restoring them onto a standby server.
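The two-phase rule behind the 2PL entry above can be sketched as follows: a transaction first only acquires locks (growing phase) and, once it has released any lock, may not acquire another (shrinking phase). The class `TwoPhaseTransaction` and its methods are hypothetical names for this illustration. The sketch assumes exclusive locks only, runs single-threaded, reports conflicts instead of blocking, and does no deadlock handling, so it shows the rule itself rather than a usable lock manager.

```python
class TwoPhaseTransaction:
    """Tracks one transaction's locks and enforces the two-phase rule."""

    def __init__(self, name, lock_table):
        self.name = name
        self.lock_table = lock_table  # shared dict: item -> owning transaction
        self.held = set()
        self.shrinking = False        # becomes True after the first release

    def acquire(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: acquire after release")
        owner = self.lock_table.get(item)
        if owner is not None and owner != self.name:
            return False              # conflict: caller must wait or abort
        self.lock_table[item] = self.name
        self.held.add(item)
        return True

    def release(self, item):
        self.shrinking = True         # the growing phase is over
        self.held.discard(item)
        if self.lock_table.get(item) == self.name:
            del self.lock_table[item]


locks = {}
t1 = TwoPhaseTransaction("T1", locks)
t2 = TwoPhaseTransaction("T2", locks)

got_a = t1.acquire("A")
got_b = t1.acquire("B")
blocked = t2.acquire("A")   # T1 still holds A, so this reports a conflict
t1.release("A")             # T1 enters its shrinking phase
retry = t2.acquire("A")     # A is free now

try:
    t1.acquire("C")         # not allowed: T1 has already released a lock
    violation = False
except RuntimeError:
    violation = True

print(got_a, got_b, blocked, retry, violation)
```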
Database languages
- Data definition language –
- Data manipulation language –
- Query language –
- Information retrieval query language – query language used to query a database, where the semantics of the query are defined not by a precise rendering of a formal syntax, but by an interpretation of the most suitable results of the query.
- SQL (Structured Query Language) – special-purpose programming language designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS).
- XQuery – a query and functional programming language that queries and transforms collections of structured and unstructured data.
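The difference between data definition, data manipulation and querying can be shown with a short SQL session. This sketch uses Python's standard `sqlite3` module with an in-memory database; the `hotel` table and its rows are made up for illustration, echoing the hotel vacancy example from the introduction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition language (DDL): describe the structure of the data.
conn.execute(
    "CREATE TABLE hotel ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " vacancies INTEGER NOT NULL)"
)

# Data manipulation language (DML): insert and change rows.
conn.executemany(
    "INSERT INTO hotel (name, vacancies) VALUES (?, ?)",
    [("Seaview", 3), ("Central", 0), ("Parkside", 5)],
)
conn.execute(
    "UPDATE hotel SET vacancies = vacancies - 1 WHERE name = ?",
    ("Seaview",),
)

# Query: find hotels that still have vacancies.
rows = conn.execute(
    "SELECT name, vacancies FROM hotel WHERE vacancies > 0 ORDER BY name"
).fetchall()
print(rows)

conn.close()
```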
Database security
- Database activity monitoring (DAM) – database security technology for monitoring and analyzing database activity that operates independently of the database management system (DBMS) and does not rely on any form of native (DBMS-resident) auditing or native logs such as trace or transaction logs.
- Database audit –
- Database forensics – branch of digital forensic science relating to the forensic study of databases and their related metadata.
- Negative database – credit card terminology for a list of cardholders who frequently initiate chargebacks.
Database design
- Entity–relationship model (ER model) – abstract and conceptual representation of data.
- Database normalization – process of organizing the fields and tables of a relational database to minimize redundancy and dependency.
- Database refactoring – simple change to a database schema that improves its design while retaining both its behavioral and informational semantics.
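As a rough illustration of normalization, the sketch below splits a flat list of order rows, in which customer details are repeated on every row, into a customer table and an order table linked by a key. The data and names are invented, and the sketch assumes that a customer's name identifies the customer, which real schemas would not rely on.

```python
# Denormalized input: customer name and city repeat on every order row.
flat_rows = [
    # (order_id, customer_name, customer_city, item)
    (1, "Ada", "London", "keyboard"),
    (2, "Ada", "London", "monitor"),
    (3, "Lin", "Taipei", "mouse"),
]

customer_ids = {}     # customer_name -> customer_id (assumed unique by name)
customer_table = []   # (customer_id, name, city), one row per customer
order_table = []      # (order_id, customer_id, item)

for order_id, name, city, item in flat_rows:
    if name not in customer_ids:
        customer_ids[name] = len(customer_ids) + 1
        customer_table.append((customer_ids[name], name, city))
    order_table.append((order_id, customer_ids[name], item))

print(customer_table)
print(order_table)
```

After the split, each customer's city is stored once, so changing it means updating a single row rather than every order.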
Database programming
- Database abstraction layer – application programming interface which unifies the communication between a computer application and databases such as SQL Server, DB2, MySQL, PostgreSQL, Oracle or SQLite.
- Object–relational mapping (ORM, O/RM, and O/R mapping) – programming technique for converting data between incompatible type systems in object-oriented programming languages.
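A hand-written mapper gives a feel for what an object–relational mapping layer does: it converts between objects in the program and rows in a table. The `Hotel` class and `HotelMapper` below are hypothetical and deliberately minimal; real ORM libraries generate this kind of code and handle relationships, sessions and schema changes.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class Hotel:
    name: str
    vacancies: int


class HotelMapper:
    """Maps Hotel objects to rows of a 'hotel' table and back."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS hotel ("
            " name TEXT PRIMARY KEY,"
            " vacancies INTEGER NOT NULL)"
        )

    def save(self, hotel):
        # Object -> row (insert, or overwrite the row with the same name).
        self.conn.execute(
            "INSERT OR REPLACE INTO hotel (name, vacancies) VALUES (?, ?)",
            (hotel.name, hotel.vacancies),
        )

    def find(self, name):
        # Row -> object; None when no such row exists.
        row = self.conn.execute(
            "SELECT name, vacancies FROM hotel WHERE name = ?", (name,)
        ).fetchone()
        return Hotel(*row) if row else None


conn = sqlite3.connect(":memory:")
mapper = HotelMapper(conn)
mapper.save(Hotel("Seaview", 3))
loaded = mapper.find("Seaview")
missing = mapper.find("Nowhere")
print(loaded, missing)
```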
Database management
- Database virtualization – decoupling of the database layer, which lies between the storage and application layers within the application stack.
- Database tuning – describes a group of activities used to optimize and homogenize the performance of a database.
- Database caching – effective approach to achieve high scalability and performance.
- Data migration § Database migration –
- Database preservation – usually involves converting the information stored in a database, without losing the characteristics (Context, Content, Structure, Appearance and Behaviour) of the data, to a format which can be used in the long term, even if the technology and daily life knowledge changes.
- Database integrity – ensures that data entered into the database is accurate, valid, and consistent.
Database management systems
- Database model –
- Database normalization – organizing tables based on their attributes so that the data presented can avoid having redundancy and dependency.
- Database storage structures –
- Distributed database management system –
- Federated database system – type of meta-database management system (DBMS), which transparently maps multiple autonomous database systems into a single federated database.
- Referential integrity –
- Relational algebra – offshoot of first-order logic (and of algebra of sets), deals with a set of finitary relations (see also relation (database)) that is closed under certain operators.
- Relational calculus – consists of two calculi, the tuple relational calculus and the domain relational calculus, that are part of the relational model for databases and provide a declarative way to specify database queries.
- Relational database – collection of data items organized as a set of formally described tables from which data can be accessed easily.
- Relational database management system (RDBMS) – database management system (DBMS) that is based on the relational model as introduced by E. F. Codd.
- Relational model – database model based on first-order predicate logic, first formulated and proposed in 1969 by Edgar F. Codd.
- Object–relational database (ORD) – database management system (DBMS) similar to a relational database, but with an object-oriented database model: objects, classes and inheritance are directly supported in database schemas and in the query language. Also called object–relational database management system (ORDBMS).
- Transaction processing –
Concepts
- Database – organized collection of data, today typically in digital form.
- ACID (atomicity, consistency, isolation, durability) – set of properties that guarantee that database transactions are processed reliably.
- Create, read, update and delete (CRUD) – four basic functions of persistent storage.
- Null –
- Candidate key – minimal superkey for a relation.
- Foreign key – referential constraint between two tables.
- Primary key –
- Superkey – set of attributes of a relation variable for which it holds that in all relations assigned to that variable, there are no two distinct tuples (rows) that have the same values for the attributes in this set.
- Surrogate key – unique identifier in a database for either an entity in the modeled world or an object in the database.
- Armstrong's axioms – set of axioms (or, more precisely, inference rules) used to infer all the functional dependencies on a relational database.
- NoSQL – class of database management system identified by its non-adherence to the widely used relational database management system (RDBMS) model.
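Atomicity, the first of the ACID properties listed above, can be demonstrated with a small sketch: a transfer either applies both of its updates or neither. The example uses Python's standard `sqlite3` module; the `account` table, the names and the `transfer` helper are invented for illustration, and the credit is deliberately applied before the debit so that a failing debit has something to roll back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; return success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit is undone too.
        return False


ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

The second transfer fails on the debit, and the already-applied credit to the other account is rolled back with it.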
Objects
- Relation –
- View –
- Database transaction –
- Transaction log – history of actions executed by a database management system to guarantee ACID properties over crashes or hardware failures. Also called "transaction journal", "database log" or "binary log".
- Database trigger – procedural code that is automatically executed in response to certain events on a particular table or view in a database.
- Index –
- Stored procedure – subroutine available to applications that access a relational database system.
- Cursor –
- Partition –
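Several of the objects listed above (index, view and trigger) can be created in a few statements. The sketch below uses SQLite through Python's standard `sqlite3` module; the table, view, index and trigger names are made up, and other database systems use somewhat different trigger syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE room (
    id INTEGER PRIMARY KEY,
    hotel TEXT NOT NULL,
    booked INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE audit_log (room_id INTEGER, note TEXT);

-- Index: speeds up lookups by hotel name.
CREATE INDEX idx_room_hotel ON room (hotel);

-- View: a stored query that can be read like a table.
CREATE VIEW vacant_room AS
    SELECT id, hotel FROM room WHERE booked = 0;

-- Trigger: runs automatically when a room becomes booked.
CREATE TRIGGER log_booking AFTER UPDATE OF booked ON room
WHEN NEW.booked = 1 AND OLD.booked = 0
BEGIN
    INSERT INTO audit_log (room_id, note) VALUES (NEW.id, 'booked');
END;

INSERT INTO room (hotel) VALUES ('Seaview'), ('Seaview'), ('Central');
""")

conn.execute("UPDATE room SET booked = 1 WHERE id = 2")

vacant = conn.execute(
    "SELECT id, hotel FROM vacant_room ORDER BY id"
).fetchall()
log = conn.execute("SELECT room_id, note FROM audit_log").fetchall()
print(vacant)
print(log)
```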
Components
- Concurrency control – ensures that correct results for concurrent operations are generated, while getting those results as quickly as possible.
- Data dictionary – as defined in the IBM Dictionary of Computing, is a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format." Also called a "metadata repository".
- Java Database Connectivity –
- Open Database Connectivity –
- Query language –
- Query optimizer – component of a database management system that attempts to determine the most efficient way to execute a query.
- Query plan – ordered set of steps used to access or modify information in a SQL relational database management system. Also called a "query execution plan".
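Many database systems can display the query plan chosen by the optimizer. As a sketch, SQLite's `EXPLAIN QUERY PLAN` statement is used below through Python's standard `sqlite3` module to compare the plan for the same query before and after an index is created. The `guest` table, the index name and the `plan` helper are invented for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE guest (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

query = "SELECT id, name, city FROM guest WHERE city = ?"


def plan(conn, sql, params):
    """Return the human-readable detail column of each plan step."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


before = plan(conn, query, ("Oslo",))   # no index yet: expect a table scan
conn.execute("CREATE INDEX idx_guest_city ON guest (city)")
after = plan(conn, query, ("Oslo",))    # the optimizer can now use the index

print(before)
print(after)
```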
Functions
- Database administration – work done by a database administrator, some of which may be automated.
- Query optimization – function of many relational database management systems in which multiple query plans for satisfying a query are examined and a good query plan is identified.
- Database replication –
Database products
- List of object-oriented database management systems –
- List of relational database management systems –
- Document-oriented database – computer program designed for storing, retrieving, and managing document-oriented (semi-structured) information.
Database models
- Database model – theoretical foundation of a database and fundamentally determines in which manner data can be stored, organized, and manipulated in a database system. It thereby defines the infrastructure offered by a particular database system. The most popular example of a database model is the relational model.
Models
- Flat file database – various means to encode a database model (most commonly a table) as a single file.
- Hierarchical database model – data model in which the data is organized into a tree-like structure.
- Database model § Dimensional model –
- Network model – database model conceived as a flexible way of representing objects and their relationships.
- Relational model –
- Entity–relationship model –
- Graph database – uses graph structures with nodes, edges, and properties to represent and store data.
- Object database – database management system in which information is represented in the form of objects as used in object-oriented programming. Also called an "object-oriented database management system".
- Entity–attribute–value model –
Other models
- Online analytical processing § Multidimensional databases –
- Semantic data model –
- Star schema – simplest style of data warehouse schema. Also called "star-join schema", "data cube", or "multi-dimensional schema".
- XML database –
Implementations
- Flat file database –
- Deductive database – database system that can make deductions.
- Document-oriented database –
- Object–relational database –
- Temporal database – database with built-in time aspects, for example a temporal data model and a temporal version of Structured Query Language (SQL).
- XML database –
- Triplestore – purpose-built database for the storage and retrieval of triples, a triple being a data entity composed of subject-predicate-object, like "Bob is 35" or "Bob knows Fred".
Data warehouse
Creating the data warehouse
Concepts
- Dimension –
- Dimensional modeling (DM) – set of techniques and concepts used in data warehouse design.
- Fact –
- Online analytical processing (OLAP) – approach to swiftly answering multi-dimensional analytical (MDA) queries.
- Star schema –
- Aggregate –
Variants
- Anchor Modeling –
- Column-oriented DBMS – database management system (DBMS) that stores data tables as sections of columns of data rather than as rows of data, like most relational DBMSs.
- Data Vault Modeling –
- HOLAP –
- MOLAP – stands for Multidimensional Online Analytical Processing.
- ROLAP – stands for Relational Online Analytical Processing.
- Operational data store (ODS) – database designed to integrate data from multiple sources for additional operations on the data.
Elements
- Data dictionary/Metadata –
- Data mart – access layer of the data warehouse environment that is used to get data out to the users.
- Sixth normal form (6NF) – term in relational database theory, used in two different ways.
- Surrogate key –
Fact
- Fact table – consists of the measurements, metrics or facts of a business process.
- Early-arriving fact –
- Measure –
Dimension
- Dimension table – one of the set of companion tables to a fact table.
- Degenerate dimension – dimension key in the fact table that does not have its own dimension table, because all the interesting attributes have been placed in analytic dimensions.
- Slowly changing dimension –
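The relationship between a fact table and its dimension tables can be sketched as a tiny star schema: one central table of measurements whose keys point at the surrounding dimension tables. The table names, columns and figures below are invented for illustration, using SQLite through Python's standard `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes.
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);

-- Fact table: measurements, keyed by the dimensions.
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date (date_key),
    product_key INTEGER REFERENCES dim_product (product_key),
    units INTEGER,
    revenue INTEGER
);

INSERT INTO dim_date VALUES (20240101, 2024, 1), (20240201, 2024, 2);
INSERT INTO dim_product VALUES
    (1, 'tea', 'drinks'), (2, 'coffee', 'drinks'), (3, 'scone', 'food');
INSERT INTO fact_sales VALUES
    (20240101, 1, 10, 50),
    (20240101, 3, 4, 12),
    (20240201, 2, 6, 42),
    (20240201, 3, 5, 15);
""")

# A typical analytical query: aggregate a measure by a dimension attribute.
revenue_by_category = conn.execute("""
    SELECT p.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(revenue_by_category)
```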
Filling
- Extract-Transform-Load (ETL) –
- Data extraction – act or process of retrieving data out of (usually unstructured or poorly structured) data sources for further data processing or data storage (data migration).
- Data transformation – converts data from a source data format into destination data.
- Data loading –
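The three filling steps listed above can be sketched as three small functions chained together. Everything here is an assumption made for illustration: the CSV text stands in for a poorly structured source, the cleaning rules are arbitrary, and the `customer_spend` table is a made-up target in an in-memory SQLite database.

```python
import csv
import io
import sqlite3

# Stand-in for a messy source: stray spaces, mixed case, a row with no name.
RAW = """name,city,spend
 Ada ,london,12.50
Lin,TAIPEI,7
,oslo,3
"""


def extract(text):
    """Extract: read raw records from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: clean values and convert to the destination format."""
    clean = []
    for row in rows:
        name = row["name"].strip()
        if not name:
            continue  # drop records that lack a name
        city = row["city"].strip().title()
        spend_cents = round(float(row["spend"]) * 100)
        clean.append((name, city, spend_cents))
    return clean


def load(conn, rows):
    """Load: write the transformed records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer_spend"
        " (name TEXT, city TEXT, spend_cents INTEGER)"
    )
    conn.executemany("INSERT INTO customer_spend VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW)))
loaded = conn.execute(
    "SELECT name, city, spend_cents FROM customer_spend ORDER BY name"
).fetchall()
print(loaded)
```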
Using the data warehouse
Concepts
- Business intelligence (BI) – ability of an organization to take all its capabilities and convert them into knowledge, ultimately getting the right information to the right people, at the right time, via the right channel.
- Dashboard –
- Data mining – process that results in the discovery of new patterns in large data sets. It is the analysis step of the "Knowledge Discovery in Databases" (KDD) process.
- Decision support system (DSS) –
- OLAP cube – set of data, organized in a way that facilitates non-predetermined queries for aggregated information, or in other words, online analytical processing.
Languages
- Data Mining Extensions (DMX) –
- MultiDimensional eXpressions (MDX) –
- XML for Analysis (XMLA) –
Tools
People
- Edgar F. Codd – English computer scientist who introduced the relational database model.
- Bill Inmon –
- Ralph Kimball (born 1944) – author on the subject of data warehousing and business intelligence.
Products
Database-related organizations
Database-related publications
- Ling Liu and Tamer M. Özsu (Eds.) (2009). Encyclopedia of Database Systems, 4100 p. 60 illus. ISBN 978-0-387-49616-0. Table of contents available at http://refworks.springer.com/mrw/index.php?id=1217
- Beynon-Davies, P. (2004). Database Systems. 3rd Edition. Palgrave, Houndmills, Basingstoke.
- Connolly, Thomas and Carolyn Begg. Database Systems. New York: Harlow, 2002.
- Date, C. J. (2003). An Introduction to Database Systems, Fifth Edition. Addison Wesley. ISBN 0-201-51381-1.
- Gray, J. and Reuter, A. Transaction Processing: Concepts and Techniques, 1st edition, Morgan Kaufmann Publishers, 1992.
- Kroenke, David M. and David J. Auer. Database Concepts. 3rd ed. New York: Prentice, 2007.
- Lightstone, S.; Teorey, T.; Nadeau, T. (2007). Physical Database Design: the database professional's guide to exploiting indexes, views, storage, and more. Morgan Kaufmann Press. ISBN 978-0-12-369389-1.
- Teorey, T.; Lightstone, S. and Nadeau, T. Database Modeling & Design: Logical Design, 4th edition, Morgan Kaufmann Press, 2005. ISBN 0-12-685352-5
Database scholars
See also
References
External links
- DB File extension – information about files with the DB extension