Data Warehousing And Data Mining Lecture Notes Pdf

  Sunday, April 12

What Is a Data Warehouse?

The goal of data mining is to unearth relationships in data that may provide useful insights. Data mining tools can sweep through databases and identify previously hidden patterns. An example of pattern discovery is the analysis of retail sales data to identify seemingly unrelated products that are often purchased together.
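The retail example can be made concrete with a small sketch. The code below counts how often two products appear in the same basket and keeps the pairs whose support (the share of baskets containing both) reaches a threshold. The function name `frequent_pairs`, the `min_support` parameter, and the sample baskets are all invented for illustration; real tools use more scalable algorithms, so treat this as a minimal sketch rather than a production method.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return product pairs whose support meets min_support.

    Support is the fraction of baskets that contain both products.
    """
    pair_counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair is counted once per basket
        # and always appears in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            pair_counts[pair] += 1
    total = len(baskets)
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }


# Hypothetical transactions, invented for illustration.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["beer", "diapers"],
    ["beer", "diapers", "bread"],
    ["milk", "butter"],
]

pairs = frequent_pairs(baskets, min_support=0.4)
for pair, support in sorted(pairs.items()):
    print(pair, round(support, 2))
```

With this sample data, three pairs each appear in two of the five baskets and so pass the 0.4 threshold, including the seemingly unrelated beer and diapers.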

A data warehouse collects and manages data from varied sources to provide meaningful business insights. It is a blend of technologies and components that allows the strategic use of data.

A data warehouse is the electronic storage of a large amount of business information, designed for query and analysis rather than transaction processing. Data warehousing is the process of transforming data into information and making it available to users for analysis.
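To illustrate what "designed for query and analysis" can look like, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for a warehouse engine. The table names `dim_product` and `fact_sales`, their columns, and the sample rows are assumptions made up for this example; a real warehouse would use a dedicated analytical database and a much larger schema.

```python
import sqlite3

# An in-memory database stands in for the warehouse in this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales (product_id INTEGER, sale_year INTEGER, amount REAL);
    INSERT INTO dim_product VALUES (1, 'grocery'), (2, 'electronics');
    INSERT INTO fact_sales VALUES
        (1, 2022, 100.0), (1, 2023, 150.0),
        (2, 2022, 400.0), (2, 2023, 350.0), (2, 2023, 50.0);
""")

# An analytical query reads and aggregates many rows at once,
# unlike a transactional query that inserts or updates a single record.
rows = conn.execute("""
    SELECT p.category, s.sale_year, SUM(s.amount) AS total
    FROM fact_sales AS s
    JOIN dim_product AS p ON p.product_id = s.product_id
    GROUP BY p.category, s.sale_year
    ORDER BY p.category, s.sale_year
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

The query summarizes sales by category and year, the kind of historical, read-heavy question a warehouse is meant to answer.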

What Is Data Mining?

Data mining is the search for hidden, valid, and potentially useful patterns in huge data sets. It is all about discovering unsuspected or previously unknown relationships in the data.

It is a multidisciplinary skill that draws on machine learning, statistics, AI, and database technology.

The insights extracted via data mining can be used for marketing, fraud detection, scientific discovery, and more.

Data Mining Vs Data Warehouse: Key Differences

Data Mining | Data Warehouse
------------|---------------
Data mining is the process of analyzing data to find previously unknown patterns. | A data warehouse is a database system designed for analytical rather than transactional work.
Data mining is a method of comparing large amounts of data to find the right patterns. | Data warehousing is a method of centralizing data from different sources into one common repository.
Data mining is usually done by business users with the assistance of engineers. | Data warehousing is a process that needs to occur before any data mining can take place.
Data mining is considered a process of extracting useful information from large data sets. | Data warehousing, on the other hand, is the process of pooling all relevant data together.
One of the most important benefits of data mining techniques is the detection and identification of errors in the system. | One advantage of a data warehouse is that it can be updated consistently, which suits business owners who want current information.
Data mining helps to reveal suggestive patterns in important factors, such as customers' buying habits, products, and sales, so that companies can make the necessary adjustments in operation and production. | A data warehouse adds extra value to operational business systems, such as CRM systems, when the warehouse is integrated with them.
Data mining techniques are never 100% accurate and may cause serious consequences in certain conditions. | In a data warehouse, there is a real chance that data the organization needs for analysis was never integrated into the warehouse, which can easily lead to loss of information.
The information that organizations gather through data mining can be misused against a group of people. | Data warehouses are built as large IT projects. They therefore involve high-maintenance systems, which can affect the revenue of small and medium-sized organizations.
After successful initial queries, users may ask more complicated queries, which increases the workload. | A data warehouse is complicated to implement and maintain.
Organisations can benefit from this analytical tool by equipping themselves with pertinent, usable, knowledge-based information. | A data warehouse stores a large amount of historical data, which helps users analyze different time periods and trends to make future predictions.
Organisations need to spend a lot of resources on training and implementation. Moreover, data mining tools work in different ways because of the different algorithms employed in their design. | In a data warehouse, data is pooled from multiple sources and needs to be cleaned and transformed, which can be a challenge.
Data mining methods are cost-effective and efficient compared to other statistical data applications. | A data warehouse's job is to simplify every type of business data. Most of the work on the user's part is inputting the raw data.
Another critical benefit of data mining techniques is the identification of errors that can lead to losses. The generated data can be used, for example, to detect a drop in sales. | A data warehouse lets users access critical data from a number of sources in a single place, which saves the time otherwise spent retrieving data from multiple sources.
Data mining helps to generate actionable strategies built on data insights. | Once you enter information into a data warehouse system, you are unlikely to lose track of it again. A quick search helps you find the right statistical information.

Why use Data Warehouse?

Some of the most important reasons for using a data warehouse are:

  • Integrates many sources of data and helps to decrease stress on the production system.
  • Optimizes data for read access and sequential disk scans.
  • Helps to protect data from source system upgrades.
  • Allows users to perform master data management.
  • Helps to improve data quality in source systems.
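The first point, integrating many sources, usually means cleaning each source's records and mapping them onto one shared schema before loading. The sketch below shows that idea for two hypothetical sources, a CRM export and a billing export. The field names (`full_name`, `contact_email`, and so on), the function names, and the rule of keeping the first record per email are all assumptions chosen for illustration.

```python
def normalize_customer(record, source):
    """Map a source-specific record onto one shared schema."""
    if source == "crm":
        name, email = record["full_name"], record["email"]
    else:  # assumed to be the "billing" source
        name, email = record["customer"], record["contact_email"]
    # Cleaning step: trim whitespace and unify letter case.
    return {
        "name": name.strip().title(),
        "email": email.strip().lower(),
        "source": source,
    }


def integrate(crm_rows, billing_rows):
    """Merge both sources into one repository keyed by cleaned email."""
    warehouse = {}
    for source, rows in (("crm", crm_rows), ("billing", billing_rows)):
        for row in rows:
            clean = normalize_customer(row, source)
            # Keep the first record seen for each email; later duplicates are skipped.
            warehouse.setdefault(clean["email"], clean)
    return warehouse


# Hypothetical source data, invented for illustration.
crm_rows = [{"full_name": " ada lovelace ", "email": "Ada@Example.com"}]
billing_rows = [
    {"customer": "ADA LOVELACE", "contact_email": "ada@example.com "},
    {"customer": "alan turing", "contact_email": "alan@example.com"},
]

warehouse = integrate(crm_rows, billing_rows)
for email, customer in warehouse.items():
    print(email, customer["name"], customer["source"])
```

Because both exports describe the same customer with different spelling and casing, the cleaned email acts as the matching key and the duplicate is dropped.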

Why use Data mining?

Some of the most important reasons for using data mining are:

  • Establishes relevance and relationships among data, which can be used to generate profitable insights.
  • Helps businesses make informed decisions quickly.
  • Helps to find unusual shopping patterns in grocery stores.
  • Optimizes website business by providing customized offers to each visitor.
  • Helps to measure customers' response rates in business marketing.
  • Helps to create and maintain new customer groups for marketing purposes.
  • Predicts customer defections, such as which customers are most likely to switch to another supplier in the near future.
  • Differentiates between profitable and unprofitable customers.
  • Identifies all kinds of suspicious behavior as part of a fraud detection process.
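As one simple illustration of the last point, suspicious values can be flagged by how far they lie from the typical value. The sketch below uses a z-score rule (distance from the mean measured in sample standard deviations). This is a deliberately basic stand-in chosen for the example, not a description of how fraud detection systems are actually built; the function name `flag_outliers`, the threshold of 2.0, and the transaction amounts are all assumptions.

```python
from statistics import mean, stdev


def flag_outliers(amounts, threshold=2.0):
    """Return values lying more than `threshold` sample standard deviations from the mean."""
    mu = mean(amounts)
    sigma = stdev(amounts)  # requires at least two values
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [x for x in amounts if abs(x - mu) / sigma > threshold]


# Hypothetical transaction amounts, invented for illustration.
amounts = [20, 22, 19, 21, 23, 20, 18, 22, 21, 500]
suspicious = flag_outliers(amounts)
print(suspicious)
```

Only the 500 transaction is flagged here. Note that a single extreme value also inflates the standard deviation, which is one reason practical systems tend to prefer more robust measures.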

KEY DIFFERENCES

  • Data mining is considered a process of extracting useful information from large data sets, whereas data warehousing is the process of pooling all the relevant data together.
  • Data mining is the process of analyzing data to find previously unknown patterns, whereas a data warehouse is a means of collecting and managing data.
  • Data mining is usually done by business users with the assistance of engineers, while data warehousing is a process that needs to occur before any data mining can take place.
  • Data mining can increase the workload, because after successful initial queries users may ask more complicated ones, while a data warehouse is complicated to implement and maintain.
  • Data mining helps to reveal suggestive patterns in important factors such as customers' buying habits, while a data warehouse adds value to operational business systems such as CRM systems when the warehouse is integrated with them.

The Data Warehousing and Data Mining notes PDF can be downloaded from EduTechLearners without signing up or logging in. Data Warehousing and Data Mining is a subject for B.Tech students of Computer Science & Engineering (CSE). These notes cover the subject in full detail and can be downloaded unit by unit or as a single Zip file containing all the units.

The notes are provided in PDF format for easy download. They contain PowerPoint presentations and lecture notes in simple language, with full architecture diagrams and explanations.

The following lists give the topics covered in each unit's notes:

UNIT‐1: Introduction to Data Mining, Data Warehouse, and OLAP

  1. Motivation: Why data mining?
  2. What is data mining?
  3. Data Mining: On what kind of data?
  4. Data mining functionality
  5. Classification of data mining systems
  6. Major issues in data mining
  7. What is a data warehouse?
  8. A multi‐dimensional data model
  9. Data warehouse architecture
  10. Data warehouse implementation
  11. From data warehousing to data mining

UNIT‐2: Data Pre-processing

  1. Why preprocess the data?
  2. Data cleaning
  3. Data integration and transformation
  4. Data reduction
  5. Discretization and concept hierarchy generation

UNIT‐3: Data Mining Primitives, Languages, and System Architectures

  1. Data mining primitives: What defines a data mining task?
  2. A data mining query language
  3. Design graphical user interfaces based on a data mining query language
  4. Architecture of data mining systems

UNIT‐4: Characterization and Comparison

  1. What is concept description?
  2. Data generalization and summarization‐based characterization
  3. Analytical characterization: Analysis of attribute relevance
  4. Mining class comparisons: Discriminating between different classes
  5. Mining descriptive statistical measures in large databases

UNIT‐5: Mining Association Rules in Large Databases

  1. Association rule mining
  2. Mining single‐dimensional Boolean association rules from transactional databases
  3. Mining multilevel association rules from transactional databases
  4. Mining multidimensional association rules from transactional databases and data warehouse
  5. From association mining to correlation analysis
  6. Constraint‐based association mining

UNIT‐6: Classification and Prediction

  1. What is classification? What is prediction?
  2. Issues regarding classification and prediction
  3. Classification by decision tree induction
  4. Bayesian Classification
  5. Classification by backpropagation
  6. Classification based on concepts from association rule mining
  7. Other Classification Methods
  8. Prediction
  9. Classification accuracy

UNIT‐7: Cluster Analysis

  1. What is Cluster Analysis?
  2. Types of Data in Cluster Analysis
  3. A Categorization of Major Clustering Methods
  4. Partitioning Methods
  5. Hierarchical Methods
  6. Density‐Based Methods
  7. Grid‐Based Methods
  8. Model‐Based Clustering Methods
  9. Outlier Analysis

UNIT‐8: Mining Complex Types of Data

  1. Multidimensional analysis and descriptive mining of complex data objects
  2. Mining spatial databases
  3. Mining multimedia databases
  4. Mining time‐series and sequence data
  5. Mining text databases
  6. Mining the World‐Wide Web

If you want more notes on any of these topics, please mail us and we will provide them as soon as possible. If you want your own notes to be published on our site, feel free to contribute on EduTechLearners or mail your content to contribute@edutechlearners.com (the contents will be published under your name).
