Information Impact International Information Quality
Other Resources - Glossary

 



λ The Greek letter "lambda" used to represent the mean of a Poisson distribution.
μ The Greek letter "mu" used to represent the mean of a population.
Accessibility The characteristic of being able to access data when it is required.
Accuracy to reality A characteristic of information quality measuring the degree to which a data value (or set of data values) correctly represents the attributes of the real-world object or event.
Accuracy to surrogate source A measure of the degree to which data agrees with an original, acknowledged authoritative source of data about a real world object or event, such as a form, document, or unaltered electronic data received from outside the organization. See also Accuracy.
Aggregation The process of associating objects of different types together in a meaningful whole. Also called composition.
Algorithm A set of statements or a formula to calculate a result or solve a problem in a defined set of steps.
Alias A secondary and non-standard synonym or alternate name of an enterprise standard business term, entity type or attribute name, used only for cross reference of an official name to legacy or software package data name.
ANSI Acronym for American National Standards Institute, the U.S. body that sets standards.
Application A collection of computer hardware, computer programs, databases, procedures, and knowledge workers that work together to perform a related group of services or business processes.
Application architecture A graphic representation of a system showing the process, data, hardware, software, and communications components of the system across a business value chain.
Archival database A copy of a database saved in its exact state for historical purposes, recovery, or restoration.
Artificial Intelligence (AI) The capability of a system to perform functions normally associated with human intelligence, such as reasoning, learning, and self-improvement.
Association See Relationship.
Associative entity type An entity type that describes the relationship of a pair of entity types that have a many-to-many relationship or cardinality. For example, COURSE COMPLETION DATE has meaning only in the context of the relationship of the STUDENT and COURSE OFFERING entity types.
Asynchronous replication Replication in which a primary data copy is considered complete once the update transaction completes, and secondary replicated data copies are queued to be updated as soon as possible or on a predefined schedule.
Atomic value An individual data value representing the lowest level of meaningful fact.
Attribute An inherent property, characteristic, or fact that describes an entity or object. A fact that has the same format, interpretation, and domain for all occurrences of an entity type. An attribute is a conceptual representation of a type of fact that is implemented as a field in a record or data element in a database file.
Attributive entity type An entity type that cannot exist on its own and contains attributes describing another entity. An attributive entity type resolves a one-to-many relationship between an entity type and a descriptive attribute that may contain multiple values. Also called characteristic or dependent entity type.
Audit trail Data that can be used to trace activity such as database transactions.
Authentication The process of verifying that a person requesting a resource, such as data or a transaction, has authority or permission to access that resource.
Availability A percentage measure of the reliability of a system indicating the percentage of time the system or data is accessible or usable, compared to the amount of time the system or data should be accessible or usable.
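As a rough illustration of the availability measure, the sketch below divides the time a system was actually accessible by the time it should have been accessible. The function name and the choice of hours as the unit are assumptions made for this example only.

```python
def availability_percent(accessible_hours: float, scheduled_hours: float) -> float:
    """Percentage of scheduled time during which the system or data was usable."""
    if scheduled_hours <= 0:
        raise ValueError("scheduled_hours must be positive")
    return 100.0 * accessible_hours / scheduled_hours


# A system scheduled for 720 hours in a month that was down for 6 of them.
monthly_availability = availability_percent(714, 720)
```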
Backup To restore a database to its state at a previous point in time. Backup is achieved (1) from an archived or snapshot copy of the database at a specified time; or (2) from an archived copy of a database, by applying the logged update activity of changes since that archived copy was made.
Benchmarking The process of analyzing and comparing an organization’s processes to those of other organizations to identify best practices.
Best practice A process, standard or component that is generally recognized to produce superior results when compared with similar processes, standards or components.
Bias A vested interest, or strongly held paradigm or condition that may skew the results of sampling, measuring, or reporting the findings of a quality assessment. For example, if information producers audit their own data quality, they will have a bias to overstate its quality. If data is sampled in such a way that it does not reflect the entire population sampled, the sample result will be biased.
Biased sampling Sampling procedures that result in a sample that is not truly representative of the population sampled.
Bounds See Confidence interval.
Boyce/Codd Normal Form (BCNF) (1) A relation R is in Boyce/Codd normal form (BCNF) if and only if every determinant is a candidate key. (2) A table is in BCNF if every attribute that is a unique identifier of attributes describing an entity is a candidate key of that entity.
Business application model A graphic illustration of the conceptual application systems, both manual and automated, including their dependencies, required to perform the processes of an organization.
Business information resource data The set of information resource data that must be known to information producers and knowledge workers in order to understand the meaning of information, the business rules that govern its quality, and the stakeholders who create or require it.
Business information steward A business subject-matter expert designated and accountable for overseeing some parts of data definition for a collection of data for the enterprise, such as data definition integrity, legal restriction compliance standards, data quality standards, and authorization security.
Business process A synonym for value chain, the term is used to differentiate a value chain of activities from a functional process or functional set of activities.
Business process model A graphic and descriptive representation of business processes or value chains that cut across functions and organizations. The model may be expressed in different levels of detail, including decomposition into successive lower levels of activities.
Business process reengineering The process of analyzing, redefining, and redesigning business activities to eliminate or minimize activities that add cost and to maximize activities that add value.
Business resource category A business classification of data about a resource the enterprise must manage across business functions and organizations, used as a basis for high-level information modeling. The internal resource categories are human resource, financial, materials and products, facilities and tangible assets, and information. External resources include business partners, such as suppliers and distributors; customers; and external environment, such as regulation and economic factors. Also called subject area.
Business rule A statement expressing a policy or condition that governs business actions and establishes data integrity guidelines.
Business rule conformance See Validity.
Business term A word, phrase, or expression that has a particular meaning to the enterprise.
Business value chain See Value chain.
Candidate key A key that can serve to uniquely identify occurrences of an entity type. A candidate key must have two properties: (1) Each occurrence or record must have a different value of the key, so that a key value identifies only one occurrence; and (2) No attribute in the key can be eliminated without nullifying the first property.
Cardinality The number of occurrences that may exist between occurrences of two related entity types. The cardinalities between a pair of related entity types are one to one, one to many, or many to many. See Relationship.
CASE Acronym for Computer-Aided Systems Engineering, the application of automated technologies to business and information modeling and software engineering.
CASS (Coding Accuracy Support System) A system for verifying the integrity of United States addresses against a USPS-maintained database containing every mailing address in the United States. The system is concerned with just the addresses, not the people or organizations residing at these addresses.
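The two candidate-key properties (uniqueness and irreducibility) can be checked against a set of sample records, as in the sketch below. The function names and the dictionary-per-record layout are assumptions for illustration. Passing this check only shows that the key holds for the data examined; whether it is a true candidate key depends on the business rules for all possible occurrences.

```python
def is_unique(rows, attrs):
    """Property 1: every record has a different value for the key."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def is_candidate_key(rows, attrs):
    """True if attrs is unique and no attribute can be dropped (property 2)."""
    attrs = tuple(attrs)
    if not is_unique(rows, attrs):
        return False
    # Dropping one attribute at a time is enough: if any smaller subset were
    # unique, some subset with exactly one attribute removed would be too.
    for i in range(len(attrs)):
        reduced = attrs[:i] + attrs[i + 1:]
        if reduced and is_unique(rows, reduced):
            return False
    return True


students = [
    {"id": 1, "email": "ann@example.com", "name": "Lee"},
    {"id": 2, "email": "bob@example.com", "name": "Lee"},
]
```

With this sample, `("id",)` passes both properties, `("id", "email")` is unique but reducible, and `("name",)` is not unique.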
Catalog The component of a Database Management System (DBMS) where physical characteristics about the database are stored, such as its physical design schema, table or file names, primary keys, foreign key relationships, and other data required for the DBMS to manage the data.
Cause-and-effect diagram A chart in the shape of a "fishbone" used to analyze the relationship between error cause and error effect. The diagram, invented by Ishikawa, shows a specific effect and possible causes of error. The causes are drawn in four categories, each a bone on the fish. The categories are: (1) Human (Ishikawa called this manpower), (2) Methods, (3) Machines, and (4) Materials.
Central tendency The phenomenon that data measured from a process generally aggregates around a value somewhere between the high and low values.
Champion In Six Sigma, the executive or manager who "owns" a process to be improved. The champion advocates for the improvement project, oversees and manages its critical elements, reports project success to up-line management, and removes barriers to enable project improvement success.
Checklist A technique for quality improvement to identify steps to perform or items to check before work is complete.
Class word See Domain type.
Cleansing See Data cleansing.
Cluster (1) A way of storing records or rows from one or more tables together physically, based on a common key or partial key value (ER). (2) Groups of objects that have similar characteristics or behaviors that are significantly different from other objects that are discovered through data analysis or mining (Stat).
Cluster sampling Sampling a population by taking samples from a smaller number of subgroups (such as geographic areas) of the population. The subsamples from each cluster are combined to make up the final sample. For example, in sampling sales data for a chain of stores, one may choose to take a subsample of a representative subset of stores (each a cluster) into a cluster sample rather than randomly select sales data from every store.
Code (1) To represent data in a form that can be accepted by an application program. (2) A shorthand representation or abbreviation of a specific value of an attribute.
Commit A DML command that signals a successful end of a transaction and confirms that a record(s) inserted, updated, or deleted in the database is complete.
Common cause A source of unacceptable variation or defect caused by the process or system itself. See also Special cause.
Completeness A characteristic of information quality measuring the degree to which all required data is known. (1) Fact completeness is a measure of data definition quality expressed as a percentage of the attributes about an entity type that need to be known that are defined in the model and implemented in a database. For example, "80 percent of the attributes required to be known about customers have fields in a database to store the attribute values." (2) Value completeness is a measure of data content quality expressed as a percentage of the columns or fields of a table or file that should have values and in fact do. For example, "95 percent of the columns for the customer table have a value in them." Also referred to as Coverage. (3) Occurrence completeness is a measure of the percentage of the records an information collection should have, in order to represent all occurrences of the real-world objects it should know, that it actually has. For example, does a Department of Corrections have a record for each Offender it is responsible to know about? (IQ).
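Value completeness, as defined above, can be sketched as a simple ratio of filled fields to required fields. The function name, the dictionary-per-record layout, and the decision to treat `None` and the empty string as "no value" are assumptions for this example.

```python
def value_completeness(records, required_fields):
    """Percent of required fields, across all records, that hold a value."""
    total = len(records) * len(required_fields)
    if total == 0:
        raise ValueError("need at least one record and one required field")
    filled = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) not in (None, "")  # assumption: these mean "no value"
    )
    return 100.0 * filled / total


customers = [
    {"name": "Ann", "phone": "555-0100"},
    {"name": "Bob", "phone": ""},
]
score = value_completeness(customers, ["name", "phone"])  # 3 of 4 fields filled
```

A real assessment would also have to exclude fields that are legitimately empty for a given record, which this sketch does not attempt.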
Conceptual data model See Data model.
Concurrency A characteristic of information quality measuring the timing of equivalence of data stored in redundant or distributed database files. The measure of data concurrency may describe the minimum, maximum, and average information float time from when data is available in one data source to when it becomes available in another data source. Or it may consist of the relative percent of data from a data source that is propagated to the target within a specified time frame.
Concurrency assessment An audit of the timing of equivalence of data stored in redundant or distributed database files. See Equivalence.
Concurrency control A DBMS mechanism of locking records, used to manage multiple transactions' access to shared data.
Conditional relationship An association that is optional depending on the nature of the related entities or on the rules of the business environment.
Confidence interval, or confidence interval of the mean The upper and lower limits or values, or bounds on either side of a sample mean for which a confidence level is valid.
Confidence level The degree of certainty, expressed as a percentage, of being sure that the value for the mean of a population is within a specific range of values around the mean of a sample. For example, a 95 percent confidence level indicates that one is 95 percent sure that the estimate of the mean is within a desired precision or range of values called a confidence interval. Stated another way, a 95 percent confidence level means that out of 100 samples from the same population, the mean of the population is expected to be contained within the confidence interval in 95 of the 100 samples.
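The relationship between a confidence level and its confidence interval can be illustrated with a small calculation. This is a sketch that assumes a normal approximation (a z-value rather than a t-value), which is reasonable for larger samples and gives slightly too narrow an interval for small ones. The function name and sample data are made up for the example.

```python
import math
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) bounds around the sample mean."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    # Two-sided z-value, e.g. about 1.96 for a 95 percent confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(sample) / math.sqrt(n)
    center = mean(sample)
    return center - half_width, center + half_width


errors_per_batch = [10, 12, 11, 13, 9, 11, 12, 10]
lower, upper = confidence_interval(errors_per_batch, confidence=0.95)
```

Raising the confidence level widens the interval: the 99 percent bounds for the same sample lie outside the 95 percent bounds.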
Confidence limits See Confidence interval.
Configuration management The process of identifying and defining configurable items in an environment by controlling their release and any subsequent changes throughout the development life cycle; recording and reporting the status of those items and change requests; and verifying the completeness and correctness of configurable items.
Consensus The agreement of a group with a judgment, decision, or data definition in which the stakeholders have participated and can say, "I can live with it."
Consistency A measure of information quality expressed as the degree to which a set of data is equivalent in redundant or distributed databases.
Constraint A business rule that places a restriction on business actions and therefore restricts the resulting data. For example, "only wholesale customers may place wholesale orders."
Contamination See Information quality contamination.
Control The mechanisms used to manage processes to maintain acceptable performance.
Control chart A graphical device for reporting process performance over time for monitoring process quality performance.
Control group A selected set of people, objects, or processes to be observed to record behavior or performance characteristics. Used to compare behavior and performance to another group in which changes or improvements have been made.
Conversion The process of preparing, reengineering, cleansing and transforming data, and loading it into a new target data architecture.
Corporate data See Enterprise data.
Correlation A predictive relationship that exists between two factors, such that when one of the factors changes, you can predict the nature of change in the other factor. For example, if information quality goes up, the costs of information scrap and rework go down.
Cost of acquisition (1) The cost of acquiring a new customer, including identifying, marketing and presales activities to get the first sale. (2) The costs of acquiring products, such as software packages, and services. This should be weighed against the cost of ownership.
Cost of information quality assessment The costs associated with measurement and quality conformance assurance as a component of the cost of quality information.
Cost of nonquality information The total costs associated with failure or nonquality information and information services, including, but not limited to reruns, rework, downstream data verification, data correction, data transformation to nonstandard definition or format, work arounds.
Cost of ownership The total costs of ownership of products, such as software packages, and services, including planning, acquiring, process redesign, implementation, and support required for the successful use of the product or service.
Cost of quality information The total costs associated with providing nonquality information or information services. These costs consist of the costs of failure or nonquality information, plus the costs of assessment and conformance, plus the costs of information process improvement and data defect prevention.
Cost of retention The cost of managing customer relationships that result in subsequent sales to existing customers.
Coverage See Completeness.
Critical information Information that if missing or wrong can cause enterprise-threatening loss of money, life, or liability, such as failure to properly calculate pension withholding, not setting the airplane flaps correctly for take-off, or prescribing the wrong drug.
Cross-functional The characteristic of data or process that is of interest to more than one business or functional area.
Currency A characteristic of information quality measuring the degree to which data represents reality from the required point in time. For example, one information view may require data currency to be the most up-to-date point, such as stock prices for stock trades, while another may require data to be the last stock price of the day, for stock price running average.
Customer The persons or organizations whose needs the enterprise must meet, and whose satisfaction with its products and services, including information, determines enterprise success or failure.
Customer life cycle The states of existence and relative time periods of a typical customer from being a prospect to becoming an active customer, to becoming nonactive and a “former” customer.
Customer lifetime revenue The net present value of the average customer revenue over the life of relationship with the enterprise.
Customer lifetime value (LTV) The net present value of the average profit of a typical customer over the life of relationship with the enterprise.
Customer segment A meaningful aggregation of customers for the purpose of marketing or determining customer lifetime value.
Customer-supplier relationship See Information customer-supplier relationship.
CUSUM Abbreviation for Cumulative Summation, a more sensitive method for detecting out-of-control measurements than a simple control chart. The CUSUM indicates when a process has been off aim for too long a period of time.
Cycle time The time required for a process (or subprocess) to execute from start to completion.
d A symbol representing the set of deviations of a set of items from the mean of the set of items, expressed as d = x - x bar for each value of x.
Data (1) Symbols, numbers or other representation of facts; (2) The raw material from which information is produced when it is put in a context that gives it meaning. See also Information.
Data administration See Data management.
Data administrator One who manages or provides data administration functions.
Data analyst One who identifies data requirements, defines data, and synthesizes it into data models.
Data architect One who is responsible for the development of data models.
Data audit See Information quality assessment.
Data cleansing An information scrap-and-rework process to correct data errors in a collection of data in order to bring the level of quality to an acceptable level to meet the information customers’ needs.
Data cleanup See Data cleansing.
Data consistency assessment The process of measuring data equivalence and information float or timeliness in an interface-based information value chain.
Data content quality The subset of information quality referring to the quality of data values.
Data defect prevention The process of information process improvement to eliminate or minimize the possibility of data errors from getting into an information product or database.
Data definition The specification of the meaning, valid values or ranges (domain), and business integrity rules for an entity type or attribute. Data definition includes name, definition, and relationships, as well as domain value definition and business rules that govern business actions that are reflected in data. These components represent the "information product specification" components of Information Resource Data or meta data.
Data Definition Language (DDL) The language used to describe database schemas or designs.
Data definition quality A component of information quality measuring the degree to which data definition accurately, completely, and understandably defines what the information producers and knowledge workers should know in order to perform their job processes effectively. Data definition quality is a measure of the quality of the information product specification.
Data dictionary A repository of information (meta data) defining and describing the data resource. A repository containing meta data. An active data dictionary, such as a catalog, is one that is capable of interacting with and controlling the environment about which it stores information or meta data. An integrated data dictionary is one that is capable of controlling the data and process environments. A passive data dictionary is one that is capable of storing meta data or data about the data resource, but is not capable of interacting with or controlling the computerized environment external to the data dictionary. See also Repository.
Data dissemination The distribution of a copy or extract of information in any form, from electronic to paper from a database or data source to other parties. This is NOT to be confused with data or information sharing. (Q)
Data element The smallest unit of named data that has meaning to a knowledge worker. A data element is the implementation of an attribute. Synonymous with data item and field.
Data flow diagram A graphic representation of the "flow" of data through business functions or processes. It illustrates the processes, data stores, external entities, data flows, and their relationships.
Data independence The property of being able to change the overall logical or physical structure of the data without changing the application program's view of the data.
Data intermediary See Data scribe.
Data intermediation The design of and performance of processes in which the actual creator or originator of knowledge does not capture that knowledge electronically, but gives it in paper or other form to be entered into a database by someone else.
Data management The management and control of data as an enterprise asset. It includes strategic information planning, establishing data-related standards, policies, and procedures, and data modeling and information architecture. Also called data administration.
Data Manipulation Language (DML) The language used to access data in one or more databases.
Data mart A subset of enterprise data along with software to extract data from a data warehouse or operational data store, summarize and store it, and to analyze and present information to support trend analysis and tactical decisions and processes. The scope can be that of a complete data subject such as Customer or Product Sales, or of a particular business area or line of business, such as Retail Sales. A data mart architecture, whether subject or business area, must be an enterprise-consistent architecture.
Data mining The process of analyzing large volumes of data using pattern recognition or knowledge discovery techniques to identify meaningful trends and relationships represented in data in large databases.
Data model A logical map or representation of real-world objects and events that represents the inherent properties of the data independently of software, hardware, or machine performance considerations. The model shows data attributes grouped into third normal form entities, and the relationships among those entities.
Data presentation quality A component of information quality measuring the degree to which information-bearing mechanisms, such as screens, reports, and other communication media, are easy to understand, efficient to use, and minimize the possibility of mistakes in its use.
Data quality See Information quality.
Data quality assessment See Information quality assessment.
Data reengineering The process of analyzing, standardizing, and transforming data from unarchitected or nonstandardized files or databases into an enterprise-standardized information architecture.
Data replication The controlled process of propagating equivalent data values from a source database to one or more duplicate copies in other databases.
Data resource management See Information resource management.
Data scribe A role in which individuals transcribe data in one form, such as a paper document, to another form, such as into a computer database; for example, a data entry clerk entering data from a paper order form into a database.
Data store Any place in a system where data is stored. This includes manual files, machine-readable files, data tables, and databases. A data store on a logical data flow diagram is related to one or more entities in the data model.
Data transformation The process of defining and applying algorithms to change data from one form or domain value set to another form or domain value set in a target data architecture to improve its value and usability for the information stakeholders.
Data type An attribute of a data element or field that specifies the DBMS type of physical values, such as numeric, alphanumeric, packed decimal, floating point, or datetime.
Data value A specific representation of a fact for an attribute at a point in time.
Data visualization Graphical presentation of patterns and trends represented by data relationships.
Data warehouse A collection of software and data organized to collect, cleanse, transform, and store data from a variety of sources, and analyze and present information to support decision-making, tactical and strategic business processes.
Data warehouse audits and controls A collection of checks and balances to assure the extract, cleansing, transformation, summarization, and load processes are in control and operate properly. The controls must assure the right data is extracted from the right sources, transformed, cleansed, summarized correctly, and loaded to the right target files.
Data-driven development See Value-centric development.
Database administration The function of managing the physical aspects of the data resource, including physical database design to implement the conceptual data model; and database integrity, performance, and security.
Database integrity The characteristic of data in a database in which the data conforms to the physical integrity constraints, such as referential integrity and primary key uniqueness, and is able to be secured and recovered in the event of an application, software, or hardware failure. Database integrity does not imply data accuracy or other information quality characteristics not able to be provided by the DBMS functions.
Database marketing The use of collected and managed information about one’s customers and prospects to provide better service and establish long-term relationships with them. Database marketing involves analyzing and designing pertinent customer information needs, collecting, maintaining, and analyzing that data to support mass customization of marketing campaigns to decrease costs, improve response, and to build customer loyalty, reduce attrition, and increase customer satisfaction.
Database server The distributed implementation of a set of database management functions in which one dedicated collection of database management functions, accessing one or more databases on that node, serves multiple knowledge workers or clients that provide a human-machine interface for the requesting or creation of data.
DDL Acronym for Data Definition Language.
Decision Support System (DSS) Applications that use data in a free-form fashion to support managerial decisions by applying ad hoc query, summarization, trend analysis, exception identification, and "what-if" questions.
Defect An item that does not conform to its quality standard or customer expectation.
Defect Prevention Software Software that enables the identification and elimination of information quality problems at the electronic source of data capture, such as nonconformance to business rules, potential duplicate records, or non-uniqueness of primary identifiers such as tax-ID numbers. (IQ)
Defect rate See Error rate.
Definition conformance The characteristic of data, such that the data values represent a fact consistent with the agreed-upon definition of the attribute. For example, a value of "6/7/1997" actually represents the "Order Date: the date an order is placed by the customer," and not the system date created when the order is entered into the system.
Delphi approach An approach used to achieve consensus, that involves individual judgments made independently, group discussion of the rationales for disparate judgments, and a consensus judgment being agreed upon by the participants.
Demography The study of human populations, especially with reference to size, density, distribution and other vital statistics.
Derived data Data that is created or calculated from other data within the database or system.
Deviation (d) The difference in value of an item in a set of items and the mean (x bar) of the set, as expressed in the formula d = x - x bar, where d = deviation, x = the value of an item in a set, and x bar is the mean or average of all items in the set.
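The deviation formula d = x - x bar can be applied to every item in a set, as in this short sketch. The function name is chosen for illustration only.

```python
from statistics import mean


def deviations(values):
    """Return d = x - x_bar for each x, where x_bar is the mean of values."""
    x_bar = mean(values)
    return [x - x_bar for x in values]


d = deviations([2, 4, 6])  # the mean is 4
```

By construction the deviations from the mean always sum to zero, which is why measures of spread square them or take absolute values before summing.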
Devil's advocate A technique used in decision making in which someone plays the role of challenging the predominant position in order to expose potential flaws, influence critical thinking and prevent biased and potentially harmful decisions.
DFD Acronym for Data Flow Diagram.
DIF Acronym for Data Interchange Format.
Dimension A category for summarizing or viewing data (e.g., time period, product, product line, geographic area, and organization).
Directory A table, block, index, or folder containing addresses and locations or relationships of data or files and used as a way of organizing files.
Discount rate The market rate of interest representing the cost to borrow money. This rate may be applied to future income to calculate its net present value.
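Applying a discount rate to future income to obtain its net present value can be sketched as follows. The example assumes one cash flow at the end of each period and a constant rate per period; both the function name and those timing conventions are assumptions, not part of the definition above.

```python
def net_present_value(cash_flows, discount_rate):
    """Discount each future cash flow back to today and sum the results.

    cash_flows[0] is assumed to arrive at the end of period 1,
    cash_flows[1] at the end of period 2, and so on.
    """
    return sum(
        amount / (1 + discount_rate) ** period
        for period, amount in enumerate(cash_flows, start=1)
    )


# 110 received one year from now, discounted at 10 percent, is worth about 100 today.
value_today = net_present_value([110], 0.10)
```

The same calculation underlies measures such as customer lifetime value, where the cash flows are the expected profits from a typical customer in each period.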
DMAIC Acronym for Define-Measure-Analyze-Improve-Control, the Six Sigma method for process improvement.
DML Acronym for Data Manipulation Language.
Domain (1) Set or range of valid values for a given attribute or field, or the specification of business rules for determining the valid values. (2) The area or field of reference of an application or problem set.
Domain value redundancy A dysfunctional characteristic of an attribute or field in which the same fact of information is represented by more than one value. For example, unit of measure code having domain values of "doz," "dz," and "12" may all represent the fact that the unit of measure is "one dozen."
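One common way to remove domain value redundancy is to map every variant to a single standard value. The sketch below uses the unit-of-measure example from the definition; the standard code "DZ", the mapping table, and the function name are hypothetical choices for illustration.

```python
# Hypothetical mapping from redundant variants to one standard code.
CANONICAL_UNIT = {
    "doz": "DZ",
    "dz": "DZ",
    "12": "DZ",
}


def standardize_unit(code: str) -> str:
    """Return the standard unit-of-measure code for a known variant."""
    key = code.strip().lower()
    if key not in CANONICAL_UNIT:
        # Unknown values are rejected rather than passed through silently.
        raise ValueError(f"unrecognized unit of measure code: {code!r}")
    return CANONICAL_UNIT[key]


standard = [standardize_unit(c) for c in ["doz", "dz", "12"]]
```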
Domain chaos A dysfunctional characteristic of an attribute or field in which multiple types of facts are represented in the same attribute or field. For example, unit of measure code for one product has a domain value of "doz," to represent a unit of measure of "one dozen," while for another product unit of measure code has a value of "150," to represent the reorder point quantity.
Domain type A general classification that characterizes the kind of values that may be values of a specific attribute, such as a number, date, currency amount, or percent. The domain type name may be used as a component of an attribute name. Also called a class word.
Drill down The process of accessing more detailed data from summary data to identify exceptions and trends. May be multitier.
Drill through The process of accessing the original source data from a replicated or transformed copy to verify equivalence to the record-of-origin data.
DSS Acronym for Decision Support Systems.
E-commerce Short for electronic commerce, the conducting of business transactions over the Internet (I-Net).
EDI Acronym for Electronic Data Interchange.
Edit and validation The process of assuring data being created conforms to the governing business rules and is correct. Database integrity controls and software routines can edit and validate conformance to business rules. Information producers must validate correctness of data.
EIS Acronym for Executive Information System.
Empty value A data element for which no value has been captured, and for which the real-world object represented has no corresponding value. For example, there is no date value for the data element "Last date of service" for an active Employee. Contrast with Missing value. (Stat, Q)
End-consumer The persons or organizations whose needs a product or service provider must meet, and whose satisfaction with its products and services, including information, determines enterprise success or failure. A customer may be a direct, immediate Customer or the End-consumer of the product or service.
Enterprise data The data of an organization or corporation that is owned by the enterprise and managed by a business area. Characteristics of corporate data are that it is essential to run the business and/or it is shared by more than one organizational unit within the enterprise.
Entity integrity The assurance that a primary key value will identify no more than one occurrence of an entity type, and that no attribute of the primary key may contain a null value. Based on this premise, the real-world entities are uniquely distinguishable from all other entities.
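The two conditions in this definition (no null in any part of the key, no key value used twice) can be tested directly. The sketch below assumes records are held as Python dictionaries; the function and field names are illustrative.

```python
from collections import Counter

def entity_integrity_violations(rows, key_fields):
    """Return (rows with a null key part, key values that occur more than once)."""
    null_keys = []
    counts = Counter()
    for row in rows:
        key = tuple(row.get(field) for field in key_fields)
        if any(part is None for part in key):
            null_keys.append(row)        # a key attribute contains a null value
        else:
            counts[key] += 1
    duplicates = [key for key, n in counts.items() if n > 1]
    return null_keys, duplicates

# Hypothetical data: id 1 appears twice and one row has no id.
rows = [{"id": 1}, {"id": 1}, {"id": None}, {"id": 2}]
print(entity_integrity_violations(rows, ["id"]))
```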
Entity life cycle The phases, or distinct states, through which an occurrence of an object moves over a definable period of time. The subtypes of an entity that are mutually exclusive over a given time frame. Also referred to as entity life history and state transition diagram.
Entity Relationship Diagram (ERD) See Entity relationship model.
Entity relationship model A graphical representation illustrating the entity types and the relationships of those entity types of interest to the enterprise.
Entity subtype A specialized subset of occurrences of a more general entity type, having one or more different attributes or relationships not inherent in the other occurrences of the generalized entity type. For example, an hourly employee will have different attributes from a salaried employee, such as hourly pay rate and monthly salary.
Entity supertype A generalized entity in which some occurrences belong to a distinct, more specialized subtype.
Entity type A classification of the types of real-world objects (such as person, place, thing, concept, or events of interest to the enterprise) that have common characteristics. Sometimes the term entity is used as a short name.
Entity/process matrix A matrix that shows the relationships of the processes, identified in the business process model, with the entity types identified in the information model. The model illustrates which processes create, update, or reference the entity types.
Equivalence A characteristic of information quality that measures the degree to which data stored in multiple places is conceptually equal. Equivalence indicates the data has equal values or is in essence the same. For example, a value of "F" for Gender Code for J. J. Jones in database A and a value of "1" for Sex Code for J. J. Jones in database B mean the same thing: J. J. Jones is female. The measure of equivalence is the percent of fields in records within one data collection that are semantically equivalent to their corresponding fields within another data collection or database. Also called semantic equivalence.
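The equivalence measure can be sketched as a percentage over paired field values. Only the codes "F" and "1" come from the example above; the codes "M" and "2" and the function name are assumptions added to make the sketch complete.

```python
# Code tables for the two databases. "M" and "2" are assumed for illustration.
GENDER_CODE_A = {"F": "female", "M": "male"}
SEX_CODE_B = {"1": "female", "2": "male"}

def equivalence_percent(pairs):
    """Percent of (code_a, code_b) pairs that mean the same thing."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no field pairs to compare")
    same = sum(
        1
        for a, b in pairs
        if a in GENDER_CODE_A and b in SEX_CODE_B and GENDER_CODE_A[a] == SEX_CODE_B[b]
    )
    return 100.0 * same / len(pairs)

# Hypothetical paired values for four matching records.
print(equivalence_percent([("F", "1"), ("M", "2"), ("F", "2"), ("M", "1")]))
```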
ERD Acronym for Entity Relationship Diagram.
Error cause removal Elimination of cause(s) of error in a way that prevents recurrence of the error.
Error event A measure of the frequency that errors occur in a process. Also called failure rate (in manufactured products), or defect rate.
Error rate A measure of the frequency that errors occur in a process. Also called failure rate (in manufactured products), or defect rate.
Event An occurrence of something that happens that is of interest to the enterprise.
Executive Information System (EIS) A graphical application that supports executive processes, decisions, and information requirements. Presents highly summarized data with drill-down capability, and access to key external data.
Expert system (1) A specific class of knowledge base system in which the knowledge, or rules, are based on the skills and experience of a specific expert or group of experts in a given field. (2) A branch of artificial intelligence. An expert system attempts to represent and use knowledge in the same way a human expert does. Expert systems simulate the human trait of thinking.
Export The function of extracting information from a repository or database and packaging it to an export/import file.
Extensibility The ability to dynamically augment a database (or data dictionary) schema with knowledge worker-defined data types. This includes addition of new data types and class definitions for representation and manipulation of unconventional data such as text data, audio data, image data, and data associated with artificial intelligence applications.
Fact (1) Something that is known or needs to be known. (2) In data warehousing, a specific numerical sum that represents a key business performance measure.
Fact completeness See Completeness.
Fact table The primary table in dimensional modeling that contains key business measurements. The facts are viewed by various Dimensions. See also Enterprise fact.
Failure cost See Costs of nonquality information.
Failure mode (1) The precipitating defect or mechanism that causes a failure. (2) The result or consequence of a failure or the manifestation of a failure. (3) The way in which a failure occurs and its impact on the normal process.
Failure mode analysis (FMA) A procedure to determine the precipitating cause or symptoms that occur just before or after a process failure. The procedure analyses failure mode data from current and previous process designs with a goal to define improvements to prevent recurrence of failure. See also Information process improvement.
Failure rate A measure of the frequency that defective items are produced by a process; hence, the frequency with which the process fails. See also Error rate.
False negative (1) In quality measurement, the condition of measuring a value for accuracy (or validity) and finding it to be not accurate (or not valid) when it is accurate (or valid). (2) In record matching, the condition of failing to identify that two records represent the same real world object.
False positive (1) In quality measurement, the condition of measuring a value for accuracy (or validity) and finding it to be accurate (or valid) when it is not. (2) In record matching, the condition of incorrectly identifying that two records represent the same real world object, when they actually represent two unique real world objects.
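For the record-matching sense of false positive and false negative, a rough sketch is to compare the pairs a matching routine reports against pairs known to be true matches. The record identifiers and the function name below are hypothetical.

```python
def match_errors(predicted_pairs, true_pairs):
    """Return (false positives, false negatives) as sets of unordered pairs."""
    predicted = {frozenset(pair) for pair in predicted_pairs}
    actual = {frozenset(pair) for pair in true_pairs}
    false_positives = predicted - actual   # matched, but really two distinct objects
    false_negatives = actual - predicted   # same object, but the match was missed
    return false_positives, false_negatives

# Hypothetical record ids; pair order does not matter.
predicted = [("r1", "r2"), ("r3", "r4")]
known_true = [("r2", "r1"), ("r5", "r6")]
print(match_errors(predicted, known_true))
```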
Feedback loop A formal mechanism for communicating information about process performance and information quality to the process owner and information producers.
Field A data element or data item in a data structure or record.
Fifth Normal Form (5NF) (1) A relation R is in fifth normal form (5NF) (also called Projection Join Normal Form (PJ/NF)) if and only if every join dependency in R is a consequence of the candidate keys of R. (2) A table is in 5NF if all elements within a concatenated key are independent of each other and cannot be derived from the remainder of the key.
File integrity The degree to which documents in a file retain their original form and utility (i.e., no misfiled or torn documents).
Filter See Information quality measure.
First Normal Form (1NF) (1) A relation R is in first normal form (1NF) if and only if all underlying domains contain atomic values only. (2) A table is in 1NF if it can be represented as a two-dimensional table, and for every attribute there exists one single meaningful and atomic value, never a repeating group of values.
Fishbone diagram See Cause-and-effect diagram.
Flexibility A characteristic of information quality measuring the degree to which the information architecture or database is able to support organizational or process reengineering changes with minimal modification of the existing objects and relationships, only adding new objects and relationships.
FMA See Failure mode analysis.
Focus group A facilitated group of customers that evaluates a product or service against those of competitors, in order to clearly define customer preferences and quality expectations.
Foolproofing Building edit and validation routines in application programs or procedures to reduce inadvertent human error.
Foreign key A data element in one entity (or relation) that is the primary key of another entity that serves to implement a relationship between the entities.
Fourth Normal Form (4NF) (1) A relation R is in fourth normal form (4NF) if and only if, whenever there exists an MVD in R, say A ->-> B, then all attributes of R are also functionally dependent upon A. In other words, the only dependencies (FDs or MVDs) in R are of the form K -> X (i.e., a functional dependency from a candidate key K to some other attribute X). Equivalently, R is in 4NF if it is in BCNF and all MVDs in R are in fact FDs. (2) A table is in 4NF if no row of the table contains two or more independent multivalued facts about an entity.
Frequency distribution The relative number of occurrences of values of an attribute, including a graphic representation of that "distribution" of values.
Functional dependence The degree to which an attribute is an inherent characteristic of an entity type. If an attribute is an inherent characteristic of an entity type, that attribute is fully functionally dependent on any candidate key of that entity type. See Normal form.
Generalization The process of aggregating similar types of objects together in a less-specialized type based upon common attributes and behaviors. The identification of a common supertype of two or more specialized (sub)types. See also Specialization.
Heuristics A method or rule of thumb for obtaining a solution through inference or trial-and-error using approximate methods while evaluating progress toward a goal.
Hidden complaint An unhappy customer who has a complaint about a product or service, but who does NOT tell the provider organization.
Highly summarized Data that is summarized to more than two hierarchies of summarization from the base detail data. Highly summarized data may have lightly summarized data as its source.
Holding the gain Putting in place controls in a process that has been improved to maintain the quality level achieved by the improvement.
Homonym A word or phrase that has the same spelling or sounds the same, but has a different meaning.
Hoshin planning (Hoshin Kanri) Also known as Policy Management or Policy Deployment, is a management technique developed in Japan by combining Management by Objectives and the Plan-Do-Check-Act (PDCA) improvement cycle. Hoshin planning provides a planning, implementation and review process to align business strategy and daily operations through total employee participation to achieve business objectives and breakthrough improvements.
House of quality A mapping of customer quality expectations in product or service to the quality measures of the product or service to summarize all expectations and the work to meet them.
Human error An action performed by a person that is wholly expected to have a positive or satisfactory outcome, but that does not (Ben Marguglio). Human error is NOT a root cause of defects; rather, it is predictable, manageable, and preventable.
Human factors Static constraints related to human ergonomic and cognitive limitations.
Hypermedia The convergence of hypertext and multimedia.
Hypertext The ability to organize text data in logical chunks or documents that can be accessed randomly via links as well as sequentially.
Hypothetical reasoning Hypothetical reasoning is a problem-solving approach that explores several different alternative solutions in parallel to determine which approach or series of steps best solves a particular problem. It is useful in business planning or optimization problems, where solutions vary according to cost or where numerous solutions may be feasible.
Identifier One or more attributes that uniquely locate an occurrence of an entity type. Conceptually synonymous with primary key.
In control The state of a process characterized by the absence of special causes of variation. Processes in control produce consistent results within acceptable limits of variation.
Inadvertent error Error introduced unconsciously; for example, when a data intermediary unwittingly transposes values or skips a line in data entry. See also Intentional error.
Incremental load The propagation of changed data to a target database or data warehouse in which only the data that has been changed since the last load is loaded or updated in the target.
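A minimal sketch of an incremental load follows. It assumes each source row carries an "id" and a "changed_at" timestamp and that the target is a dictionary keyed by id; these names are illustrative, not part of the definition.

```python
from datetime import datetime

def incremental_load(source_rows, target, last_load_time):
    """Insert or update only the rows changed since the last load.

    Returns the number of rows loaded into target (a dict keyed by row id).
    """
    loaded = 0
    for row in source_rows:
        if row["changed_at"] > last_load_time:
            target[row["id"]] = row      # add a new row or overwrite the old copy
            loaded += 1
    return loaded

# Hypothetical data: only row 2 changed after the previous load.
last_load = datetime(2024, 1, 1)
source = [
    {"id": 1, "changed_at": datetime(2023, 12, 31)},
    {"id": 2, "changed_at": datetime(2024, 1, 2)},
]
warehouse = {}
print(incremental_load(source, warehouse, last_load), sorted(warehouse))
```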
Informate A term coined by Shoshana Zuboff in In the Age of the Smart Machine to describe the benefit of information technology when used to capture knowledge about business events so that the knowledge can “informate” other knowledge workers to more intelligently perform their jobs.
Information (1) Data in context, i.e., the meaning given to data or the interpretation of data based on its context. (2) The finished product as a result of processing, presentation and interpretation of data.
Information architecture A "blueprint" of an enterprise expressed in terms of a business process model, showing what the enterprise does; an enterprise information model, showing what information resources are required; and a business information model, showing the relationships of the processes and information.
Information architecture quality A component of information quality measuring the degree to which data models and database design are stable, flexible, and reusable, and implement principles of data structure integrity.
Information assessment See Information quality assessment.
Information chaos A state of the dysfunctional learning organization in which there are unmanaged, inconsistent, and redundant databases that contain data about a single type of thing or fact. The information chaos quotient is the number of unmanaged, inconsistent, and redundant databases containing data about a single type of thing or fact.
Information chaos quotient The count of the number of unmanaged, inconsistent, and redundant databases containing data about a single type of thing or fact.
Information customer-supplier relationship The information stakeholder partnerships between the information producers who create information and the knowledge workers who depend on it.
Information directory A repository or dictionary of the information stored in a data warehouse, including technical and business meta data, that supports all warehouse customers. The technical meta data describes the transformation rules and replication schedules for source data. The business meta data supports the definition and domain specification of the data.
Information float The length of the delay in the time a fact becomes known in an organization to the time in which an interested knowledge worker is able to know that fact. Information float has two components: Manual float is the length of the delay in the time a fact becomes known to when it is first captured electronically in a potentially sharable database. Electronic float is the length in time from when a fact is captured in its electronic form in a potentially sharable database, to the time it is "moved" to a database that makes it accessible to an interested knowledge worker.
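The two components of information float can be computed from three timestamps. The sketch below assumes those timestamps are available; the parameter names and the sample dates are hypothetical.

```python
from datetime import datetime, timedelta

def information_float(
    known_at: datetime, captured_at: datetime, accessible_at: datetime
) -> tuple[timedelta, timedelta, timedelta]:
    """Return (manual float, electronic float, total information float)."""
    manual = captured_at - known_at           # fact known -> first captured electronically
    electronic = accessible_at - captured_at  # captured -> accessible to the knowledge worker
    return manual, electronic, manual + electronic

# Hypothetical event: known at 09:00, keyed in at 17:00, available the next morning.
manual, electronic, total = information_float(
    datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 17), datetime(2024, 1, 2, 9)
)
print(manual, electronic, total)
```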
Information group A relatively small and cohesive collection of information, consisting of 20-50 attributes and entity types, grouped around a single subject or subset of a major subject. An information group will generally have one or more subject matter experts and several business roles that use the information.
Information life cycle See Information value/cost chain.
Information Management (IM) The function of managing information as an enterprise resource, including planning, organizing and staffing, leading and directing, and controlling information. Information management includes managing data as the enterprise knowledge infrastructure and information technology as the enterprise technical infrastructure, and managing applications across business value chains.
Information model A high-level graphical representation of the information resource requirements of an organization showing the information classes and their relationships.
Information myopia A disease that occurs when knowledge workers can see only part of the information they need, caused by not defining data relationships correctly or not having access to data that is logically related because it exists in multiple nonintegrated databases.
Information policy A statement of important principles and guidelines required to effectively manage and exploit the enterprise information resources.
Information presentation quality The characteristic in which information is presented, whether in a report or document, on a screen, in forms, orally or visually, in a manner that communicates clearly to the recipient knowledge worker, facilitating understanding and enabling the right action or decision.
Information preventive maintenance Establishing processes to control the creation and maintenance of volatile and critical data to keep it maintained at the highest level feasible, possibly including validating volatile data on an appropriate schedule and assessment of that data before critical processes use it.
Information process improvement The process of improving processes to eliminate data errors and defects. This is one component of data defect prevention. Information process improvement is proactive information quality.
Information producer The role of individuals in which they originate, capture, create, or update data or knowledge as a part of their job function or as part of the process they perform. Information producers create the actual information content and are accountable for its accuracy and completeness to meet all information stakeholders’ needs. See also Data intermediary.
Information product improvement The process of data cleansing, reengineering, and transformation required to improve existing defective data up to an acceptable level of quality. This is one component of information scrap and rework. See also Data cleansing, Data reengineering, and Data transformation. Information product improvement is reactive information quality.
Information product specifications The set of information resource data (meta data) characteristics that define everything required so that a process and its creating/updating applications can produce quality information. Information product specification characteristics include: data name, definition, domain or data value set (code values or ranges) and the business rules that identify policies and constraints on the potential values. These specifications must be understandable to the information producers who create and maintain the data and the knowledge workers who apply the data in their work.
Information quality Consistently meeting all knowledge worker and end-customer expectations in all the characteristics of the information products and services they deem important. The degree to which information consistently meets the requirements and expectations of all knowledge workers who require it to perform their processes.
Information quality assessment The random sampling of a data collection and measuring it against various quality characteristics, such as accuracy, completeness, validity, nonduplication or timeliness to determine its level of quality or reliability. Also called data quality assessment or data audit.
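A rough sketch of such an assessment draws a random sample and reports the percent of sampled records that pass a quality test. The function name, the `is_valid` test, and the sample data are placeholders; a real assessment would also choose the sample size on statistical grounds.

```python
import random

def assess_sample(records, is_valid, sample_size, seed=None):
    """Return the percent of a random sample of records that pass is_valid."""
    records = list(records)
    if not records or sample_size < 1:
        raise ValueError("need at least one record and a positive sample size")
    rng = random.Random(seed)                  # seed only to make runs repeatable
    sample = rng.sample(records, min(sample_size, len(records)))
    passed = sum(1 for record in sample if is_valid(record))
    return 100.0 * passed / len(sample)

# Hypothetical validity test: a postal code must be five digits.
codes = ["12345", "1234", "ABCDE", "54321"]
print(assess_sample(codes, lambda c: len(c) == 5 and c.isdigit(), sample_size=4))
```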
Information quality characteristic An aspect of information that an information customer deems important in order to be considered "quality information." Characteristics include completeness, accuracy, timeliness, understandability, objectivity and presentation clarity, among others.
Information quality contamination The creation of inaccurate derived data by combining accurate data with inaccurate data.
Information quality decay The characteristic of data such that formerly accurate data will become not accurate over time because the characteristic about the real world object will change without a corresponding update to the data applied. For example, John Doe’s marital status value of "single" in a database is subject to information quality decay and will become inaccurate the moment he becomes married.
Information quality decay rate The rate, usually expressed as a percent per year, at which the accuracy of a data collection will deteriorate over time if no data updates are applied, (e.g., (1) person age decay rate is 100% within one year, decaying at a rate of approximately 1.9% per week; (2) if 17% of a population moves annually, the annual decay rate of address is 17%).
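The weekly figure in example (1) follows from spreading the annual rate evenly over 52 weeks. The sketch below assumes that even spread, which is a simplification; the function name is illustrative.

```python
def weekly_decay_rate(annual_rate_percent, weeks_per_year=52):
    """Spread an annual decay rate evenly across the weeks of a year."""
    return annual_rate_percent / weeks_per_year

print(round(weekly_decay_rate(100), 1))   # person age: 100% per year
print(round(weekly_decay_rate(17), 2))    # address: 17% per year
```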
Information quality management The function that leads the organization to improve information quality by implementing processes to measure, assess costs of, improve and control information quality, and by providing guidelines, policies, and education for information quality improvement.
Information quality measure(s) A specific quality measure or test (set of measures or tests) to assess information quality. For example, Product Id will be tested for uniqueness, Customer records will be tested for duplicate occurrences, Customer address will be tested to assure it is the correct address, Product Unit of Measure will be tested to be a valid Unit of Measure domain code, and Order Total Price Amount will be tested to assure it has been calculated correctly. Quality measures will be assessed using business rule tests in automated quality analysis software, coded routines in internally developed quality assessment programs, or in physical quality assessment procedures. Some call information quality measures filters or metrics.
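Two of the tests named above, the unit of measure domain test and the order total calculation test, might be coded along these lines. The field names and the set of valid unit codes are assumptions made for the sketch.

```python
VALID_UNITS = {"EA", "DZ", "CS"}   # hypothetical unit of measure domain codes

def check_order_line(line):
    """Return a list of the quality tests that an order line fails."""
    failures = []
    if line["unit_of_measure"] not in VALID_UNITS:
        failures.append("invalid unit of measure")
    expected_total = round(line["quantity"] * line["unit_price"], 2)
    if round(line["total_price"], 2) != expected_total:
        failures.append("total price miscalculated")
    return failures

# Hypothetical order line that passes both tests.
good = {"unit_of_measure": "DZ", "quantity": 3, "unit_price": 2.5, "total_price": 7.5}
print(check_order_line(good))
```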
Information Resource Management (IRM) (1) The application of generally accepted management principles to data as a strategic business asset. (2) The function of managing data as an enterprise resource. This generally includes operational data management or data administration, strategic information management, repository management, and database administration. See also Information management. (3) The organization unit responsible for providing principles and processes for managing the information assets of the enterprise.
Information scrap and rework The activities and costs required to cleanse or correct nonquality information, to recover from process failure caused by nonquality information, or to rework or work around problems caused by missing or nonquality information. Analogous to manufacturing scrap and rework.
Information stakeholder Any individual who has an interest in and dependence on a set of data or information. Stakeholders may include information producers, knowledge workers, external customers, and regulatory bodies, as well as various information systems roles such as database designers, application developers, and maintenance personnel.
Information steward A role in which an individual has accountability for the quality of some part of the information resource. See Information stewardship.
Information stewardship Accountability for the quality of some part of the information resource for the well-being of the larger organization. Every individual within an organization holds one or more information stewardship roles, based on the nature of their job and its relationship to information, such as creating information, applying it, defining it, modeling it, developing a computer screen to display it or moving it from one database or file to another. See Strategic information steward, Managerial information steward, and Operational information steward.
Information stewardship agreement A formal agreement among business managers specifying the quality standard and target date for information produced in one business area and used in one or more other business areas.
Information value The measure of importance of information expressed in tangible metrics. Information has realized and potential value. Realized value is the actual value derived from information applied by knowledge workers in the accomplishment of the business processes. Potential value is the future value of information that could be realized if applied to business processes in which the information is not currently used.
Information value/cost chain The end-to-end set of processes and data stores, electronic and otherwise, involved in creating, updating, interfacing, and propagating data of a specific type from its origination to its ultimate data store, including independent data entry processes, if any.
Information view A knowledge worker's perceived relationship of the data elements needed to perform a process, showing the structure and data elements required. A process activity has one and only one information view.
Information view model A local data model derived from an enterprise model to reflect the specific information required for one business area or function, one organization unit, one application or system, or one business process.
Intentional error Error introduced consciously. For example, an information producer required to enter an unknown fact like birth date, enters his or her own or some "coded" birth date used to mean "unknown." See also Inadvertent error.
Interface program An application that extracts data from one database, transforms it, and loads it into a non-controlled redundant database. Interface programs represent one cost of information scrap and rework in that the information in the first database is not "able" to be used from that source and must be "reworked" for another process or knowledge worker to use.
Interfaceation The technique of supposedly "integrating" application systems by developing "interface programs" or middleware to extract data in one format from a data source and transform it to another format for a data target rather than by standardizing the data definition and format.
Internal view The physical database design or schema in the ANSI 3-schema architecture.
IRM Acronym for Information Resource Management.
ISO Short name for the International Organization for Standardization (sometimes rendered as International Standards Organization). An international body based in Geneva, founded in 1947 to set international standards in all engineering disciplines, including information technology. Its members are national standards bodies; for example, BSI (British Standards Institution). ISO approves standards, including OSI communications protocols and ISO 9000 standards.
ISO 9000 International standards for quality management specifying guidelines and procedures for documenting and managing business processes and providing a system for third-party certification to verify those procedures are followed in actual practice.
Knowledge Information context; understanding of the significance of information.
Knowledge base (1) That part of a knowledge base system in which the rules and definitions used to build the application are stored. The knowledge base may also include a fact or object storage facility. (2) A database where the codification of knowledge is kept; usually a set of rules specified in an if . . . then format.
Knowledge base system A software system whose application-specific information is programmed in the form of rules and stored in a specific facility, known as the knowledge base. The system uses Artificial Intelligence (AI) procedures to mimic human problem-solving techniques, applying the rules stored in the knowledge base and facts supplied to the system to solve a particular business problem.
Knowledge error Information quality error introduced as a result of lack of training or expertise.
Knowledge worker The role of individuals in which they use information in any form as part of their job function or in the course of performing a process, whether operational or strategic. Also referred to as an information consumer or customer. Accountable for work results created as a result of the use of information and for adhering to any policies governing the security, privacy, and confidentiality of the information used.
Legacy data Data that comes from files and/or databases developed without using an enterprise data architecture approach.
Legacy systems Systems that were developed without using an enterprise data architecture approach.
Lifetime value (LTV) See Customer lifetime value.
Lightly summarized Data that is summarized only one or two levels of hierarchy of summary from the base detailed data.
Load To sequentially add a set of records into a database or data warehouse. See also Incremental load.
Lock A means of serializing events or preventing access to data while an application or information producer may be updating that data.
Log A collection of records that describe the events that occur during DBMS execution and their sequence. The information thus recorded is used for recovery in the event of a failure during DBMS execution.
Lower control limit The lowest acceptable value or characteristic in a set of items deemed to be of acceptable quality. Together with the upper control limit, it specifies the boundaries of acceptable variability in an item to meet quality specifications.
LTV Acronym for Customer Lifetime Value.
Managerial information steward The role of accountability a business manager or process ow