Introduction |
Recommendation

Java is the workhorse language of practicing data scientists: much of the Hadoop ecosystem is Java-based, and it is definitely the language in which most production systems in data science are written. If you know Java, Mastering Java Machine Learning (Reprint Edition) (English Edition) by Uday Kamath and Krishna Choppella is your next step toward becoming an advanced practitioner in data science.

The book introduces a range of advanced techniques in machine learning, including classification, clustering, anomaly detection, stream learning, active learning, semi-supervised learning, probabilistic graph modeling, text mining, deep learning, big data batch processing, and stream machine learning. Every chapter comes with illustrative examples and real-world case studies that show how to apply the newly learned techniques using sound methodology and the best current Java tools.

By the end of the book, you will understand the tools and techniques needed to build powerful machine learning models that solve data science problems in any domain.

Contents

Preface

Chapter 1: Machine Learning Review
Machine learning - history and definition
What is not machine learning?
Machine learning - concepts and terminology
Machine learning - types and subtypes
Datasets used in machine learning
Machine learning applications
Practical issues in machine learning
Machine learning - roles and process
Roles
Process
Machine learning - tools and datasets
Datasets
Summary

Chapter 2: Practical Approach to Real-World Supervised Learning
Formal description and notation
Data quality analysis
Descriptive data analysis
Basic label analysis
Basic feature analysis
Visualization analysis
Univariate feature analysis
Multivariate feature analysis
Data transformation and preprocessing
Feature construction
Handling missing values
Outliers
Discretization
Data sampling
Is sampling needed?
Undersampling and oversampling
Training, validation, and test set
Feature relevance analysis and dimensionality reduction
Feature search techniques
Feature evaluation techniques
Filter approach
Wrapper approach
Embedded approach
Model building
Linear models
Linear Regression
Naive Bayes
Logistic Regression
Non-linear models
Decision Trees
K-Nearest Neighbors (KNN)
Support vector machines (SVM)
Ensemble learning and meta learners
Bootstrap aggregating or bagging
Boosting
Model assessment, evaluation, and comparisons
Model assessment
Model evaluation metrics
Confusion matrix and related metrics
ROC and PRC curves
Gain charts and lift curves
Model comparisons
Comparing two algorithms
Comparing multiple algorithms
Case Study - Horse Colic Classification
Business problem
Machine learning mapping
Data analysis
Label analysis
Features analysis
Supervised learning experiments
Weka experiments
RapidMiner experiments
Results, observations, and analysis
Summary
References

Chapter 3: Unsupervised Machine Learning Techniques
……
Chapter 4: Semi-Supervised and Active Learning
Chapter 5: Real-Time Stream Machine Learning
Chapter 6: Probabilistic Graph Modeling
Chapter 7: Deep Learning
Chapter 8: Text Mining and Natural Language Processing
Chapter 9: Big Data Machine Learning - The Final Frontier
Appendix A: Linear Algebra
Appendix B: Probability
Index
|