Bibliographic details
Main authors: DR. KOUSHIK MANDAL, SONALI PARUA
Format: Digital resource
Language:
Published: Zenodo 2025
Online access: https://doi.org/10.5281/zenodo.15879471
Table of contents:
  • <p>In recent years, we have witnessed a profound transformation in the way data is generated, stored, analyzed, and ultimately used to inform decisions across nearly every domain of human activity. This era of Big Data has catalyzed an unprecedented synergy between traditional statistical methods and the fast-evolving field of machine learning. As organizations and researchers navigate massive datasets, from consumer behavior logs to genomic sequences, the ability to extract meaningful insights depends increasingly on a robust statistical foundation interwoven with computational efficiency and algorithmic sophistication.<br>This book, Statistical Methods for Big Data and Machine Learning, is designed to bridge that critical intersection. It provides an integrated view of the statistical underpinnings essential for modern data science, while also offering practical insights into how these concepts manifest within machine learning pipelines, tools, and real-world applications. Whether applied to model credit risk, optimize marketing strategies, understand public health trends, or power recommendation systems, the statistical methods covered here are fundamental to extracting value from data at scale.<br>The early chapters establish a foundational understanding of statistics, both descriptive and inferential, before progressively moving toward advanced topics such as regularization, resampling, Bayesian inference, and ensemble learning. The book addresses essential questions: How do we model uncertainty in large-scale systems? How do we select features from high-dimensional data? What are the ethical considerations when deploying models in sensitive contexts? Each chapter introduces theoretical frameworks and also emphasizes implementation, interpretability, and critical thinking, so that readers are capable not only of using statistical methods but also of questioning their limitations and implications. The data analysis section delves into the intricacies of working with structured, semi-structured, and unstructured data. Concepts such as estimates of location and variability, as well as categorical data analysis, are covered in depth. A dedicated segment on database systems, including relational databases, NoSQL, and query optimization, ensures that readers understand the infrastructure behind data storage and retrieval.<br>A key strength of this text lies in its holistic structure. We have organized the material to suit a wide audience: undergraduate and graduate students in statistics, data science, or computer science; professionals in industry seeking to deepen their analytical skills; and educators looking for comprehensive teaching material. From a thorough discussion of machine learning’s statistical roots to hands-on case studies rooted in global and Indian contexts, this book aims to be as accessible as it is rigorous.<br>The journey from raw data to actionable insight is not merely a technical one; it is also conceptual, ethical, and deeply human. As data volumes continue to grow, so too must our responsibility to use them wisely. Our hope is that this book serves not just as a guide to methods, but also as a catalyst for responsible, impactful data-driven discovery.</p>