Big Data Fundamentals: Concepts, Drivers & Techniques



Paperback by Erl, Thomas; Khattak, Wajid; Buhler, Paul


WAS £29.49   SAVE £4.42

£25.07

ISBN: 9780134291079
Publication Date: 13 Jan 2016
Language: English
Publisher: Pearson Education (US)
Imprint: Pearson
Pages: 240 pages
Format: Paperback
For delivery: Estimated despatch 23 May 2024

Description

"This text should be required reading for everyone in contemporary business." --Peter Woodhull, CEO, Modus21

"The one book that clearly describes and links Big Data concepts to business utility." --Dr. Christopher Starr, PhD

"Simply, this is the best Big Data book on the market!" --Sam Rostam, Cascadian IT Group

"...one of the most contemporary approaches I've seen to Big Data fundamentals..." --Joshua M. Davis, PhD

The Definitive Plain-English Guide to Big Data for Business and Technology Professionals

Big Data Fundamentals provides a pragmatic, no-nonsense introduction to Big Data. Best-selling IT author Thomas Erl and his team clearly explain key Big Data concepts, theory and terminology, as well as fundamental technologies and techniques. All coverage is supported with case study examples and numerous simple diagrams.

The authors begin by explaining how Big Data can propel an organization forward by solving a spectrum of previously intractable business problems. Next, they demystify key analysis techniques and technologies and show how a Big Data solution environment can be built and integrated to offer competitive advantages.
Discovering Big Data's fundamental concepts and what makes it different from previous forms of data analysis and data science
Understanding the business motivations and drivers behind Big Data adoption, from operational improvements through innovation
Planning strategic, business-driven Big Data initiatives
Addressing considerations such as data management, governance, and security
Recognizing the 5 "V" characteristics of datasets in Big Data environments: volume, velocity, variety, veracity, and value
Clarifying Big Data's relationships with OLTP, OLAP, ETL, data warehouses, and data marts
Working with Big Data in structured, unstructured, semi-structured, and metadata formats
Increasing value by integrating Big Data resources with corporate performance monitoring
Understanding how Big Data leverages distributed and parallel processing
Using NoSQL and other technologies to meet Big Data's distinct data processing requirements
Leveraging statistical approaches of quantitative and qualitative analysis
Applying computational analysis methods, including machine learning

Contents

Acknowledgments
Reader Services

PART I: THE FUNDAMENTALS OF BIG DATA

Chapter 1: Understanding Big Data
Concepts and Terminology; Datasets; Data Analysis; Data Analytics; Descriptive Analytics; Diagnostic Analytics; Predictive Analytics; Prescriptive Analytics; Business Intelligence (BI); Key Performance Indicators (KPI); Big Data Characteristics; Volume; Velocity; Variety; Veracity; Value; Different Types of Data; Structured Data; Unstructured Data; Semi-structured Data; Metadata; Case Study Background; History; Technical Infrastructure and Automation Environment; Business Goals and Obstacles; Case Study Example; Identifying Data Characteristics; Volume; Velocity; Variety; Veracity; Value; Identifying Types of Data

Chapter 2: Business Motivations and Drivers for Big Data Adoption
Marketplace Dynamics; Business Architecture; Business Process Management; Information and Communications Technology; Data Analytics and Data Science; Digitization; Affordable Technology and Commodity Hardware; Social Media; Hyper-Connected Communities and Devices; Cloud Computing; Internet of Everything (IoE); Case Study Example

Chapter 3: Big Data Adoption and Planning Considerations
Organization Prerequisites; Data Procurement; Privacy; Security; Provenance; Limited Realtime Support; Distinct Performance Challenges; Distinct Governance Requirements; Distinct Methodology; Clouds; Big Data Analytics Lifecycle; Business Case Evaluation; Data Identification; Data Acquisition and Filtering; Data Extraction; Data Validation and Cleansing; Data Aggregation and Representation; Data Analysis; Data Visualization; Utilization of Analysis Results; Case Study Example; Big Data Analytics Lifecycle; Business Case Evaluation; Data Identification; Data Acquisition and Filtering; Data Extraction; Data Validation and Cleansing; Data Aggregation and Representation; Data Analysis; Data Visualization; Utilization of Analysis Results

Chapter 4: Enterprise Technologies and Big Data Business Intelligence
Online Transaction Processing (OLTP); Online Analytical Processing (OLAP); Extract Transform Load (ETL); Data Warehouses; Data Marts; Traditional BI; Ad-hoc Reports; Dashboards; Big Data BI; Traditional Data Visualization; Data Visualization for Big Data; Case Study Example; Enterprise Technology; Big Data Business Intelligence

PART II: STORING AND ANALYZING BIG DATA

Chapter 5: Big Data Storage Concepts
Clusters; File Systems and Distributed File Systems; NoSQL; Sharding; Replication; Master-Slave; Peer-to-Peer; Sharding and Replication; Combining Sharding and Master-Slave Replication; Combining Sharding and Peer-to-Peer Replication; CAP Theorem; ACID; BASE; Case Study Example

Chapter 6: Big Data Processing Concepts
Parallel Data Processing; Distributed Data Processing; Hadoop; Processing Workloads; Batch; Transactional; Cluster; Processing in Batch Mode; Batch Processing with MapReduce; Map and Reduce Tasks; Map; Combine; Partition; Shuffle and Sort; Reduce; A Simple MapReduce Example; Understanding MapReduce Algorithms; Processing in Realtime Mode; Speed Consistency Volume (SCV); Event Stream Processing; Complex Event Processing; Realtime Big Data Processing and SCV; Realtime Big Data Processing and MapReduce; Case Study Example; Processing Workloads; Processing in Batch Mode; Processing in Realtime

Chapter 7: Big Data Storage Technology
On-Disk Storage Devices; Distributed File Systems; RDBMS Databases; NoSQL Databases; Characteristics; Rationale; Types; Key-Value; Document; Column-Family; Graph; NewSQL Databases; In-Memory Storage Devices; In-Memory Data Grids; Read-through; Write-through; Write-behind; Refresh-ahead; In-Memory Databases; Case Study Example

Chapter 8: Big Data Analysis Techniques
Quantitative Analysis; Qualitative Analysis; Data Mining; Statistical Analysis; A/B Testing; Correlation; Regression; Machine Learning; Classification (Supervised Machine Learning); Clustering (Unsupervised Machine Learning); Outlier Detection; Filtering; Semantic Analysis; Natural Language Processing; Text Analytics; Sentiment Analysis; Visual Analysis; Heat Maps; Time Series Plots; Network Graphs; Spatial Data Mapping; Case Study Example; Correlation; Regression; Time Series Plot; Clustering; Classification

Appendix A: Case Study Conclusion

About the Authors
Thomas Erl; Wajid Khattak; Paul Buhler

Index
