Introduction to Information Retrieval (English Edition) / Turing Original Computer Science Series, by Manning, Raghavan (US), and Schütze (Germany), Posts & Telecom Press

1 Boolean retrieval 1

1.1 An example information retrieval problem 3

1.2 A first take at building an inverted index 6

1.3 Processing Boolean queries 9

1.4 The extended Boolean model versus ranked retrieval 13

1.5 References and further reading 16

2 The term vocabulary and postings lists 18

2.1 Document delineation and character sequence decoding 18

2.2 Determining the vocabulary of terms 21

2.3 Faster postings list intersection via skip pointers 33

2.4 Positional postings and phrase queries 36

2.5 References and further reading 43

3 Dictionaries and tolerant retrieval 45

3.1 Search structures for dictionaries 45

3.2 Wildcard queries 48

3.3 Spelling correction 52

3.4 Phonetic correction 58

3.5 References and further reading 59

4 Index construction 61

4.1 Hardware basics 62

4.2 Blocked sort-based indexing 63

4.3 Single-pass in-memory indexing 66

4.4 Distributed indexing 68

4.5 Dynamic indexing 71

4.6 Other types of indexes 73

4.7 References and further reading 76

5 Index compression 78

5.1 Statistical properties of terms in information retrieval 79

5.2 Dictionary compression 82

5.3 Postings file compression 87

5.4 References and further reading 97

6 Scoring, term weighting, and the vector space model 100

6.1 Parametric and zone indexes 101

6.2 Term frequency and weighting 107

6.3 The vector space model for scoring 110

6.4 Variant tf–idf functions 116

6.5 References and further reading 122

7 Computing scores in a complete search system 124

7.1 Efficient scoring and ranking 124

7.2 Components of an information retrieval system 132

7.3 Vector space scoring and query operator interaction 136

7.4 References and further reading 137

8 Evaluation in information retrieval 139

8.1 Information retrieval system evaluation 140

8.2 Standard test collections 141

8.3 Evaluation of unranked retrieval sets 142

8.4 Evaluation of ranked retrieval results 145

8.5 Assessing relevance 151

8.6 A broader perspective: System quality and user utility 154

8.7 Results snippets 157

8.8 References and further reading 159

9 Relevance feedback and query expansion 162

9.1 Relevance feedback and pseudo relevance feedback 163

9.2 Global methods for query reformulation 173

9.3 References and further reading 177

10 XML retrieval 178

10.1 Basic XML concepts 180

10.2 Challenges in XML retrieval 183

10.3 A vector space model for XML retrieval 188

10.4 Evaluation of XML retrieval 192

10.5 Text-centric versus data-centric XML retrieval 196

10.6 References and further reading 198

11 Probabilistic information retrieval 201

11.1 Review of basic probability theory 202

11.2 The probability ranking principle 203

11.3 The binary independence model 204

11.4 An appraisal and some extensions 212

11.5 References and further reading 216

12 Language models for information retrieval 218

12.1 Language models 218

12.2 The query likelihood model 223

12.3 Language modeling versus other approaches in information retrieval 229

12.4 Extended language modeling approaches 230

12.5 References and further reading 232

13 Text classification and Naive Bayes 234

13.1 The text classification problem 237

13.2 Naive Bayes text classification 238

13.3 The Bernoulli model 243

13.4 Properties of Naive Bayes 245

13.5 Feature selection 251

13.6 Evaluation of text classification 258

13.7 References and further reading 264

14 Vector space classification 266

14.1 Document representations and measures of relatedness in vector spaces 267

14.2 Rocchio classification 269

14.3 k nearest neighbor 273

14.4 Linear versus nonlinear classifiers 277

14.5 Classification with more than two classes 281

14.6 The bias–variance tradeoff 284

14.7 References and further reading 291

15 Support vector machines and machine learning on documents 293

15.1 Support vector machines: The linearly separable case 294

15.2 Extensions to the support vector machine model 300

15.3 Issues in the classification of text documents 307

15.4 Machine-learning methods in ad hoc information retrieval 314

15.5 References and further reading 318

16 Flat clustering 321

16.1 Clustering in information retrieval 322

16.2 Problem statement 326

16.3 Evaluation of clustering 327

16.4 K-means 331

16.5 Model-based clustering 338

16.6 References and further reading 343

17 Hierarchical clustering 346

17.1 Hierarchical agglomerative clustering 347

17.2 Single-link and complete-link clustering 350

17.3 Group-average agglomerative clustering 356

17.4 Centroid clustering 358

17.5 Optimality of hierarchical agglomerative clustering 360

17.6 Divisive clustering 362

17.7 Cluster labeling 363

17.8 Implementation notes 365

17.9 References and further reading 367

18 Matrix decompositions and latent semantic indexing 369

18.1 Linear algebra review 369

18.2 Term–document matrices and singular value decompositions 373

18.3 Low-rank approximations 376

18.4 Latent semantic indexing 378

18.5 References and further reading 383

19 Web search basics 385

19.1 Background and history 385

19.2 Web characteristics 387

19.3 Advertising as the economic model 392

19.4 The search user experience 395

19.5 Index size and estimation 396

19.6 Near-duplicates and shingling 400

19.7 References and further reading 404

20 Web crawling and indexes 405

20.1 Overview 405

20.2 Crawling 406

20.3 Distributing indexes 415

20.4 Connectivity servers 416

21 Link analysis 421

21.1 The Web as a graph 422

21.2 PageRank 424

21.3 Hubs and authorities 433

21.4 References and further reading 439

Bibliography 441

Index 469

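Title	Introduction to Information Retrieval (English Edition) / Turing Original Computer Science Series
Category	Computers - Operating Systems
Authors	Manning, Raghavan (US); Schütze (Germany)
Publisher	Posts & Telecom Press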
Description	Editor's recommendation: Written from a computer science perspective, this book introduces the fundamentals of information retrieval and reviews current developments in the field. It focuses on the core technologies of search engines, such as document classification and document clustering, as well as machine learning and numerical methods. All important ideas are explained with examples, which keeps the presentation vivid and ties theory to practice. Content summary: This is a textbook on information retrieval that aims to present a modern approach to the subject from a computer science viewpoint. Starting from basic concepts, it covers web search, text classification, and text clustering, and gives an up-to-date treatment of the design and implementation of systems for gathering, indexing, and searching documents, of methods for evaluating such systems, and of the application of machine learning methods to text collections. All important ideas are explained with examples and illustrated with figures. The book is well suited as an introductory text for information retrieval courses aimed at advanced undergraduates and graduate students in computer science and related fields, and it is equally suitable for researchers and professionals.