About the Book

Machine learning systems are both complex and unique. They are complex because they consist of many components and involve many different stakeholders. They are unique because they depend on data, and data varies widely from one use case to the next. In this book, you will learn a holistic approach to designing machine learning systems that are reliable, scalable, maintainable, and adaptable to changing environments and business requirements.

Author Chip Huyen, co-founder of Claypot AI, considers each design decision in the context of how it helps the system as a whole achieve its objectives: how to process and create training data, which features to use, how often to retrain models, and what to monitor. The iterative framework in the book draws on real case studies and is backed by ample references.

This book will help you tackle scenarios such as:
  Engineering data and choosing the right metrics to solve a business problem
  Automating the process for continually developing, evaluating, deploying, and updating models
  Developing a monitoring system to quickly detect and address issues your models might encounter in production
  Architecting an ML platform that serves across use cases
  Developing reliable ML systems

About the Author

Chip Huyen is a co-founder of Claypot AI, a real-time machine learning platform. Through her work at NVIDIA, Netflix, and Snorkel AI, she has helped some of the world's largest organizations develop and deploy machine learning systems.

Table of Contents

Preface

1. Overview of Machine Learning Systems
  When to Use Machine Learning
  Machine Learning Use Cases
  Understanding Machine Learning Systems
  Machine Learning in Research Versus in Production
  Machine Learning Systems Versus Traditional Software
  Summary

2. Introduction to Machine Learning Systems Design
  Business and ML Objectives
  Requirements for ML Systems
  Reliability
  Scalability
  Maintainability
  Adaptability
  Iterative Process
  Framing ML Problems
  Types of ML Tasks
  Objective Functions
  Mind Versus Data
  Summary

3. Data Engineering Fundamentals
  Data Sources
  Data Formats
  JSON
  Row-Major Versus Column-Major Format
  Text Versus Binary Format
  Data Models
  Relational Model
  NoSQL
  Structured Versus Unstructured Data
  Data Storage Engines and Processing
  Transactional and Analytical Processing
  ETL: Extract, Transform, and Load
  Modes of Dataflow
  Data Passing Through Databases
  Data Passing Through Services
  Data Passing Through Real-Time Transport
  Batch Processing Versus Stream Processing
  Summary

4. Training Data
  Sampling
  Nonprobability Sampling
  Simple Random Sampling
  Stratified Sampling
  Weighted Sampling
  Reservoir Sampling
  Importance Sampling
  Labeling
  Hand Labels
  Natural Labels
  Handling the Lack of Labels
  Class Imbalance
  Challenges of Class Imbalance
  Handling Class Imbalance
  Data Augmentation
  Simple Label-Preserving Transformations
  Perturbation
  Data Synthesis
  Summary

5. Feature Engineering
  Learned Features Versus Engineered Features
  Common Feature Engineering Operations
  Handling Missing Values
  Scaling
  Discretization
  Encoding Categorical Features
  Feature Crossing
  Discrete and Continuous Positional Embeddings
  Data Leakage
  Common Causes for Data Leakage
  Detecting Data Leakage
  Engineering Good Features
  Feature Importance
  Feature Generalization
  Summary

6. Model Development and Offline Evaluation
  Model Development and Training
  Evaluating ML Models
  Ensembles
  Experiment Tracking and Versioning
  Distributed Training
  AutoML
  Model Offline Evaluation
  Baselines
  Evaluation Methods
  Summary

7. Model Deployment and Prediction Service
  Machine Learning Deployment Myths
  Myth 1: You Only Deploy One or Two ML Models at a Time
  Myth 2: If We Don't Do Anything, Model Performance Remains the Same
  Myth 3: You Won't Need to Update Your Models as Much
  Myth 4: Most ML Engineers Don't Need to Worry About Scale
  Batch Prediction Versus Online Prediction
  From Batch Prediction to Online Prediction
  Unifying Batch Pipeline and Streaming Pipeline
  Model Compression
  Low-Rank Factorization
  Knowledge Distillation
  Pruning
  Quantization
  ML on the Cloud and on the Edge
  Compiling and Optimizing Models for Edge Devices
  ML in Browsers
  Summary

8. Data Distribution Shifts and Monitoring
  Causes of ML System Failures
  Software System Failures
  ML-Specific Failures
  Data Distribution Shifts
  Types of Data Distribution Shifts
  General Data Distribution Shifts
  Detecting Data Distribution Shifts
  Addressing Data Distribution Shifts
  Monitoring and Observability
  ML-Specific Metrics
  Monitoring Toolbox
  Observability
  Summary

9. Continual Learning and Test in Production
  Continual Learning
  Stateless Retraining Versus Stateful Training
  Why Continual Learning?
  Continual Learning Challenges
  Four Stages of Continual Learning
  How Often to Update Your Models
  Test in Production
  Shadow Deployment
  A/B Testing
  Canary Release
  Interleaving Experiments
  Bandits
  Summary

10. Infrastructure and Tooling for MLOps
  Storage and Compute
  Public Cloud Versus Private Data Centers
  Development Environment
  Dev Environment Setup
  Standardizing Dev Env