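Title 基于开源工具的数据分析(影印版) (Data Analysis with Open Source Tools, Reprint Edition)
Category
Author Philipp K. Janert (USA)
Publisher Southeast University Press (东南大学出版社)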
Synopsis
Editor's Recommendation

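Data Analysis with Open Source Tools (Reprint Edition), by Philipp K. Janert, shows you how to use graphics to describe data with one, two, or dozens of variables; develop conceptual models using back-of-the-envelope calculations as well as scaling and probability arguments; mine data with computationally intensive methods such as simulation and clustering; make your conclusions easier to understand through reports, dashboards, and other metrics programs; understand financial calculations, including the time value of money; use dimensionality reduction techniques or predictive analytics to overcome the challenges that arise in data analysis; and become familiar with different open source programming environments for data analysis.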

About the Book

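Collecting data is relatively easy, but turning raw information into useful data requires knowing how to extract precisely what you want. Through the in-depth treatment in Data Analysis with Open Source Tools (Reprint Edition), by Philipp K. Janert, intermediate to experienced programmers who are interested in data analysis will learn techniques for working with data in a business environment. You will learn how to look at data to find out what information it contains, how to capture those ideas in conceptual models, and then how to feed your understanding back into your organization through business plans, accurate metrics reports, and other means.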

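You will get to try out each concept gradually in the hands-on workshops at the end of each chapter of Data Analysis with Open Source Tools (Reprint Edition). Above all, you will learn how to think about the data you hope to obtain, rather than relying on tools to do the thinking for you.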

Table of Contents

PREFACE

1 INTRODUCTION

Data Analysis

What's in This Book

What's with the Workshops?

What's with the Math?

What You'll Need

What's Missing

PART I Graphics: Looking at Data

2 A SINGLE VARIABLE: SHAPE AND DISTRIBUTION

Dot and Jitter Plots

Histograms and Kernel Density Estimates

The Cumulative Distribution Function

Rank-Order Plots and Lift Charts

Only When Appropriate: Summary Statistics and Box Plots

Workshop: NumPy

Further Reading

3 TWO VARIABLES: ESTABLISHING RELATIONSHIPS

Scatter Plots

Conquering Noise: Smoothing

Logarithmic Plots

Banking

Linear Regression and All That

Showing What's Important

Graphical Analysis and Presentation Graphics

Workshop: matplotlib

Further Reading

4 TIME AS A VARIABLE: TIME-SERIES ANALYSIS

Examples

The Task

Smoothing

Don't Overlook the Obvious!

The Correlation Function

Optional: Filters and Convolutions

Workshop: scipy.signal

Further Reading

5 MORE THAN TWO VARIABLES: GRAPHICAL MULTIVARIATE ANALYSIS

False-Color Plots

A Lot at a Glance: Multiplots

Composition Problems

Novel Plot Types

Interactive Explorations

Workshop: Tools for Multivariate Graphics

Further Reading

6 INTERMEZZO: A DATA ANALYSIS SESSION

A Data Analysis Session

Workshop: gnuplot

Further Reading

PART II Analytics: Modeling Data

7 GUESSTIMATION AND THE BACK OF THE ENVELOPE

Principles of Guesstimation

How Good Are Those Numbers?

Optional: A Closer Look at Perturbation Theory and Error Propagation

Workshop: The Gnu Scientific Library (GSL)

Further Reading

8 MODELS FROM SCALING ARGUMENTS

Models

Arguments from Scale

Mean-Field Approximations

Common Time-Evolution Scenarios

Case Study: How Many Servers Are Best?

Why Modeling?

Workshop: Sage

Further Reading

9 ARGUMENTS FROM PROBABILITY MODELS

The Binomial Distribution and Bernoulli Trials

The Gaussian Distribution and the Central Limit Theorem

Power-Law Distributions and Non-Normal Statistics

Other Distributions

Optional: Case Study--Unique Visitors over Time

Workshop: Power-Law Distributions

Further Reading

10 WHAT YOU REALLY NEED TO KNOW ABOUT CLASSICAL STATISTICS

Genesis

Statistics Defined

Statistics Explained

Controlled Experiments Versus Observational Studies

Optional: Bayesian Statistics--The Other Point of View

Workshop: R

Further Reading

11 INTERMEZZO: MYTHBUSTING--BIGFOOT, LEAST SQUARES, AND ALL THAT

How to Average Averages

The Standard Deviation

Least Squares

Further Reading

PART III Computation: Mining Data

12 SIMULATIONS

A Warm-Up Question

Monte Carlo Simulations

Resampling Methods

Workshop: Discrete Event Simulations with SimPy

Further Reading

13 FINDING CLUSTERS

What Constitutes a Cluster?

Distance and Similarity Measures

Clustering Methods

Pre- and Postprocessing

Other Thoughts

A Special Case: Market Basket Analysis

A Word of Warning

Workshop: Pycluster and the C Clustering Library

Further Reading

14 SEEING THE FOREST FOR THE TREES: FINDING IMPORTANT ATTRIBUTES

Principal Component Analysis

Visual Techniques

Kohonen Maps

Workshop: PCA with R

Further Reading

15 INTERMEZZO: WHEN MORE IS DIFFERENT

A Horror Story

Some Suggestions

What About Map/Reduce?

Workshop: Generating Permutations

Further Reading

PART IV Applications: Using Data

16 REPORTING, BUSINESS INTELLIGENCE, AND DASHBOARDS

Business Intelligence

Corporate Metrics and Dashboards

Data Quality Issues

Workshop: Berkeley DB and SQLite

Further Reading

17 FINANCIAL CALCULATIONS AND MODELING

The Time Value of Money

Uncertainty in Planning and Opportunity Costs

Cost Concepts and Depreciation

Should You Care?

Is This All That Matters?

Workshop: The Newsvendor Problem

Further Reading

18 PREDICTIVE ANALYTICS

Introduction

Some Classification Terminology

Algorithms for Classification

The Process

The Secret Sauce

The Nature of Statistical Learning

Workshop: Two Do-It-Yourself Classifiers

Further Reading

19 EPILOGUE: FACTS ARE NOT REALITY

A PROGRAMMING ENVIRONMENTS FOR SCIENTIFIC COMPUTATION AND DATA ANALYSIS

Software Tools

A Catalog of Scientific Software

Writing Your Own

Further Reading

B RESULTS FROM CALCULUS

Common Functions

Calculus

Useful Tricks

Notation and Basic Math

Where to Go from Here

Further Reading

C WORKING WITH DATA

Sources for Data

Cleaning and Conditioning

Sampling

Data File Formats

The Care and Feeding of Your Data Zoo

Skills

Terminology

Further Reading

INDEX

Copyright © 2002-2024 101bt.net All Rights Reserved
Last updated: 2025/4/23 5:48:27