TERM PAPER FOR OPERATING SYSTEMS
DATA WAREHOUSES, DECISION SUPPORT AND DATA MINING
Date: 09/11/2011
“I certify that the work contained in this paper is wholly mine. This paper has not been used to meet requirements in another course. It has not been purchased nor written by someone else, nor written for me. Exceptions to the aforementioned constitute plagiarism and an honor and ethics violation and therefore will result in a course grade of F and any other University remedies as appropriate.”
Data Warehouses, Decision Support and Data Mining
Abstract
Data warehousing and on-line analytical processing (OLAP) are key elements of decision support, which has become a primary focus of the database
Data warehouses, in contrast, are targeted for decision support. Historical, summarized and consolidated data is more important than detailed, individual records. Since data warehouses contain consolidated data, perhaps from several operational databases, over potentially long periods of time, they tend to be orders of magnitude larger than operational databases; enterprise data warehouses are projected to be hundreds of gigabytes to terabytes in size. The workloads are query intensive with mostly ad hoc, complex queries that can access millions of records and perform a lot of scans, joins, and aggregates. Query throughput and response times are more important than transaction throughput.
To facilitate complex analyses and visualization, the data in a warehouse is typically modeled multidimensionally. For example, in a sales data warehouse, time of sale, sales district, salesperson, and product might be some of the dimensions of interest. Often, these dimensions are hierarchical; time of sale may be organized as a day-month-quarter-year hierarchy, product as a product-category-industry hierarchy.
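The hierarchy idea can be illustrated with a minimal Python sketch of a roll-up, that is, aggregating a measure at a coarser level of a dimension hierarchy. The rows in `SALES`, the `CATEGORY` mapping, and the function names are hypothetical and chosen only for illustration; a real warehouse would hold this data in fact and dimension tables and aggregate it with an OLAP engine or SQL.

```python
from collections import defaultdict
from datetime import date

# Hypothetical fact rows: (sale date, sales district, product, amount).
SALES = [
    (date(2011, 1, 15), "North", "laptop", 1200.0),
    (date(2011, 2, 3), "North", "phone", 300.0),
    (date(2011, 4, 20), "South", "laptop", 1100.0),
    (date(2011, 4, 21), "North", "laptop", 1250.0),
]

# Hypothetical product -> category mapping (one step up the product hierarchy).
CATEGORY = {"laptop": "computers", "phone": "telephony"}


def time_key(d, level):
    """Map a date to its ancestor at the given level of the time hierarchy."""
    if level == "day":
        return d.isoformat()
    if level == "month":
        return f"{d.year}-{d.month:02d}"
    if level == "quarter":
        return f"{d.year}-Q{(d.month - 1) // 3 + 1}"
    if level == "year":
        return str(d.year)
    raise ValueError(f"unknown level: {level}")


def roll_up(rows, level):
    """Sum sale amounts per (time period at `level`, product category)."""
    totals = defaultdict(float)
    for sold_on, _district, product, amount in rows:
        totals[(time_key(sold_on, level), CATEGORY[product])] += amount
    return dict(totals)


if __name__ == "__main__":
    for key, total in sorted(roll_up(SALES, "quarter").items()):
        print(key, total)
```

Calling `roll_up(SALES, "year")` instead moves one level further up the time hierarchy; the district dimension is ignored here, which corresponds to aggregating over all districts.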
Many organizations want to implement an integrated enterprise warehouse that collects information about all subjects (e.g., customers, products, sales, assets, personnel) spanning the whole organization. However, building an enterprise warehouse is a long and complex process, requiring extensive business modeling, and may take many years to succeed. Some
Active data warehousing (ADW) is a data warehouse implementation that supports near-time or near-real-time decision making. It is characterized by event-driven actions triggered by a continuous stream of queries, generated by people or applications, against a broad, deep, granular set of enterprise data. Continental uses active data warehousing to keep track of the company’s daily progress and performance. Continental’s management team holds an operations meeting every morning to discuss how their
This data is collected and organized in order to process orders and maintain good customer service. The logical view of data allows a knowledge worker to arrange and access information based on the needs of the business, separate from the physical view of how information is arranged and stored. This ability allows an employee to create detailed reports on, for example, customers and their order numbers and dates. A company like Comcast, which has over 27 million customers, needs such a system to retain important data for analysis. Using a data warehouse allows the company to gather data from several databases and then determine, for example, how many units of voice products are sold, creating the business intelligence needed to make future decisions and remain
One of the main functions of any business is to use data to gain a strategic competitive advantage. The use of relational databases is a necessity for contemporary organizations; however, data warehousing has become a strategic priority because of the enormous amounts of data that must be analyzed and the varying sources from which that data comes. Because the company gathers data through Web analytics and operational systems, we must design a solution overview that incorporates data warehousing. The executive team needs to be clear about what data warehousing can provide the company.
What information is accessible? The data warehouse offers possibilities to define what is offered through metadata, published information, and parameterized analytic applications. Is the data of high quality? Data warehouse patrons expect reliability and quality. The presentation area’s data must be correctly organized and safe to consume. In terms of design, the presentation area should be planned for the comfort of its consumers. It must be planned based on the preferences articulated by the data warehouse diners, not the staging staff. Service is also critical in the data warehouse. Data must be delivered, as ordered, promptly and in a form that is appealing to the business user or the reporting/delivery application designer. Lastly, cost is a factor for the data
A data warehouse is a large database organized for reporting. It preserves history, integrates data from multiple sources, and is typically not updated in real time. The key components of data warehousing are the operational source systems, the data staging area, the data presentation area, and data access tools (HIMSS, 2009). The goal of the data warehouse platform is to improve decision-making for clinical, financial, and operational purposes.
Companies and organizations all over the world are embracing data mining and data warehousing to keep a competitive edge. Continually improving competitiveness and business processes is a key factor in expanding a business and in maintaining a high standard through the most cost-effective means in today’s market. Every day these organizations store large amounts of data in order to increase revenue, reduce costs, understand customer behavior patterns, and predict possible future trends, for example seasonal ones. Data
Enterprise Data Warehouses (EDW) have become the foundation of many enterprises' systems of record, serving as the catalyst of strategic initiatives encompassing Customer Relationship Management (CRM), Supply Chain Management (SCM) and the pervasive adoption of analytics and Business Intelligence (BI) throughout enterprises. The role of databases continues to be an ancillary one, supporting the overall structural and data integrity of the EDW and increasing its value to the overall enterprise (Phillips, 1997). The advances made over the last decade in the area of Extract, Transform & Load (ETL) have made it possible to create EDW frameworks and platforms more efficiently, creating greater accuracy in overall database and data warehouse performance as a result (Ballou, Tayi, 1999). The creation and use of an EDW to further drive an organization to its objectives requires that the differences between databases and data warehouses be defined, in addition to a clear, concise definition of just what data warehouse technologies are. Finally, the relationship between data warehouses and business intelligence (BI), including analytics, needs analysis and validation. Each of these three areas is discussed in this analysis.
A data warehouse and business intelligence application was created as part of the Orion Sword Group project, providing users with business intelligence on order and supply chain management. I worked as part of a group of four students to implement a solution. This report reflects on the process undertaken to design and implement the solution, as well as on my experience and learning outcomes.
In the early '90s, data warehousing applications were either strategic or tactical in nature. Trending and detecting patterns was the typical focus of many solutions. Now, companies are implementing data warehouses or operational data stores which meet both strategic and operational needs. The business need for these solutions usually comes from the desire to make near
Businesses today continue to strive and grow in order to keep up with the never-ending changes in their industries, and they need tools to obtain information that can be used to make business decisions. Such decisions can include which geographic region to focus on, which product lines to expand, and which markets to strengthen. To obtain information with the proper content and format to support strategic decisions, businesses turned to data warehousing. It became the new paradigm intended specifically for vital strategic information.
In the future, development will focus more on Big Data, where the need for available information grows directly with the complexity of decision making; the data infrastructure therefore needs to be larger and more analytical to align with knowledge and decision-supporting technologies (Hosack et al., 2012). Increasing the information available to KMDSS through data warehouse capabilities may be useful to several industries. DSS has been at the forefront not only of new technologies, but of new ways to address existing business problems and processes. The nature of DSS is to continuously improve the decision-making processes that, in turn, improve the efficiencies of
Data Warehouses and Data Marts: A Dynamic View By Joseph M. Firestone, Ph.D. White Paper No. Three March 27, 1997
Furthermore, the Gartner website argues that “BI has become a strategic initiative and is now recognised by chief information officers (CIOs) and business leaders as instrumental in driving business effectiveness and innovation” (Anon., 2007). Gartner also argues that “BI projects were the number one technology priority for 2007” (Anon., 2007). According to Bill Inmon, a data warehouse is “a subject-oriented, integrated, time variant and non-volatile collection of data used in strategic decision making”. Hammergen & Simon (2009) define data warehousing more simply: “Data warehousing is therefore the process of creating an architected information management solution to enable analytical and information processing despite platform, application, organizational, and other barriers.” It is important to note that a data warehouse system is different from a relational database. The reasons are: (1) in the data warehouse, data is stored for the long term; (2) the DW is designed for high performance on analytical queries; (3) its OLAP (Online Analytical Processing) technology enables data to be viewed in various forms; (4) links between tables are simple (Tushman, 2014). Databases, in contrast, have low performance for data analysis; joins between tables are
Answer: The term data warehouse is often used to refer to a system that extracts data from one or more sources, then transforms and stores it in a model suitable for presentation and analysis. It can also be used to refer to just the database used in such a system. There are two main approaches to building a data warehouse: the Kimball approach and the Inmon approach.
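The extract, transform and store sequence described above can be sketched in a few lines of Python using the standard-library `sqlite3` module as a stand-in for the warehouse database. Everything named here (the two source lists, the `fact_orders` table, and the function names) is a hypothetical illustration, not part of the Kimball or Inmon approach.

```python
import sqlite3

# Two hypothetical operational sources with inconsistent conventions.
ORDERS_EU = [{"customer": " alice ", "total_cents": 1250}]
ORDERS_US = [
    {"cust_name": "BOB", "total_dollars": 20.0},
    {"cust_name": "Alice", "total_dollars": 5.5},
]


def extract():
    """Yield (source name, raw row) pairs from every source."""
    for row in ORDERS_EU:
        yield "eu", row
    for row in ORDERS_US:
        yield "us", row


def transform(source, row):
    """Normalize names and convert all amounts to integer cents."""
    if source == "eu":
        return (row["customer"].strip().title(), source, row["total_cents"])
    cents = round(row["total_dollars"] * 100)
    return (row["cust_name"].strip().title(), source, cents)


def load(conn, records):
    """Store the cleaned records in a single consolidated table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(customer TEXT, source TEXT, total_cents INTEGER)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl():
    """Run extract -> transform -> load against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(conn, [transform(source, row) for source, row in extract()])
    return conn


def totals_by_customer(conn):
    """Analytical query over the consolidated data."""
    return conn.execute(
        "SELECT customer, SUM(total_cents) FROM fact_orders "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()


if __name__ == "__main__":
    print(totals_by_customer(run_etl()))
```

The point of the transform step is that the final query can treat "alice" from one source and "Alice" from the other as the same customer, which is the integration property the answer describes.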
A data warehouse (DW) can be regarded as one of the most complex information system modules available. It is a system that periodically retrieves and consolidates data from source systems into a dimensional or normalized data store. It is an integrated, subject-oriented, nonvolatile and time-variant collection of data in support of management’s decisions (Inmon, 1993).