Machine Learning with Spark (Spark机器学习), by Nick Pentreath (UK)

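Summary:
From this book you will learn to create your first Spark program in Scala, Java, and Python; set up and configure a Spark development environment on your own computer and on Amazon EC2; access public machine learning datasets and use Spark to load, process, clean, and transform data; use Spark's machine learning library to implement programs that apply a range of well-known machine learning models; and more.
Table of contents: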
Preface
Chapter 1: Getting Up and Running with Spark
Installing and setting up Spark locally
Spark clusters
The Spark programming model
SparkContext and SparkConf
The Spark shell
Resilient Distributed Datasets
Creating RDDs
Spark operations
Caching RDDs
Broadcast variables and accumulators
The first step to a Spark program in Scala
The first step to a Spark program in Java
The first step to a Spark program in Python
Getting Spark running on Amazon EC2
Launching an EC2 Spark cluster
Summary
Chapter 2: Designing a Machine Learning System
Introducing MovieStream
Business use cases for a machine learning system
Personalization
Targeted marketing and customer segmentation
Predictive modeling and analytics
Types of machine learning models
The components of a data-driven machine learning system
Data ingestion and storage
Data cleansing and transformation
Model training and testing loop
Model deployment and integration
Model monitoring and feedback
Batch versus real time
An architecture for a machine learning system
Practical exercise
Summary
Chapter 3: Obtaining, Processing, and Preparing Data with Spark
Accessing publicly available datasets
The MovieLens 100k dataset
Exploring and visualizing your data
Exploring the user dataset
Exploring the movie dataset
Exploring the rating dataset
Processing and transforming your data
Filling in bad or missing data
Extracting useful features from your data
Numerical features
Categorical features
Derived features
Transforming timestamps into categorical features
Text features
Simple text feature extraction
Normalizing features
Using MLlib for feature normalization
Using packages for feature extraction
Summary
Chapter 4: Building a Recommendation Engine with Spark
Types of recommendation models
Content-based filtering
Collaborative filtering
Matrix factorization
Extracting the right features from your data
Extracting features from the MovieLens 100k dataset
Training the recommendation model
Training a model on the MovieLens 100k dataset
Training a model using implicit feedback data
Using the recommendation model
User recommendations
Generating movie recommendations from the MovieLens 100k dataset
Item recommendations
Generating similar movies for the MovieLens 100k dataset
Evaluating the performance of recommendation models
Mean Squared Error
Mean average precision at K
Using MLlib's built-in evaluation functions
RMSE and MSE
MAP
Summary
Chapter 5: Building a Classification Model with Spark
Types of classification models
Linear models
Logistic regression
Linear support vector machines
The naïve Bayes model
Decision trees
Extracting the right features from your data
Extracting features from the Kaggle/StumbleUpon evergreen classification dataset
Training classification models
Training a classification model on the Kaggle/StumbleUpon evergreen classification dataset
Using classification models
Generating predictions for the Kaggle/StumbleUpon evergreen classification dataset
Evaluating the performance of classification models
Accuracy and prediction error
Precision and recall
ROC curve and AUC
Improving model performance and tuning parameters
Feature standardization
Additional features
Using the correct form of data
Tuning model parameters
Linear models
Decision trees
The naive Bayes model
Cross-validation
Summary
Chapter 6: Building a Regression Model with Spark
Types of regression models
Least squares regression
Decision trees for regression
Extracting the right features from your data
Extracting features from the bike sharing dataset
Creating feature vectors for the linear model
Creating feature vectors for the decision tree
Training and using regression models
Training a regression model on the bike sharing dataset
Evaluating the performance of regression models
Mean Squared Error and Root Mean Squared Error
Mean Absolute Error
Root Mean Squared Log Error
The R-squared coefficient
Computing performance metrics on the bike sharing dataset
Linear model
Decision tree
Improving model performance and tuning parameters
Transforming the target variable
Impact of training on log-transformed targets
Tuning model parameters
Creating training and testing sets to evaluate parameters
The impact of parameter settings for linear models
The impact of parameter settings for the decision tree
Summary
Chapter 7: Building a Clustering Model with Spark
Types of clustering models
K-means clustering
Initialization methods
Variants
Mixture models
Hierarchical clustering
Extracting the right features from your data
Extracting features from the MovieLens dataset
Extracting movie genre labels
Training the recommendation model
Normalization
Training a clustering model
Training a clustering model on the MovieLens dataset
Making predictions using a clustering model
Interpreting cluster predictions on the MovieLens dataset
Interpreting the movie clusters
Evaluating the performance of clustering models
Internal evaluation metrics
External evaluation metrics
Computing performance metrics on the MovieLens dataset
Tuning parameters for clustering models
Selecting K through cross-validation
Summary
Chapter 8: Dimensionality Reduction with Spark
Types of dimensionality reduction
Principal Components Analysis
Singular Value Decomposition
Relationship with matrix factorization
Clustering as dimensionality reduction
Extracting the right features from your data
Extracting features from the LFW dataset
Exploring the face data
Visualizing the face data
Extracting facial images as vectors
Normalization
Training a dimensionality reduction model
Running PCA on the LFW dataset
Visualizing the Eigenfaces
Interpreting the Eigenfaces
Using a dimensionality reduction model
Projecting data using PCA on the LFW dataset
The relationship between PCA and SVD
Evaluating dimensionality reduction models
Evaluating k for SVD on the LFW dataset
Summary
Chapter 9: Advanced Text Processing with Spark
What's so special about text data?
Extracting the right features from your data
Term weighting schemes
Feature hashing
Extracting the TF-IDF features from the 20 Newsgroups dataset
Exploring the 20 Newsgroups data
Applying basic tokenization
Improving our tokenization
Removing stop words
Excluding terms based on frequency
A note about stemming
Training a TF-IDF model
Analyzing the TF-IDF weightings
Using a TF-IDF model
Document similarity with the 20 Newsgroups dataset and TF-IDF features
Training a text classifier on the 20 Newsgroups dataset using TF-IDF
Evaluating the impact of text processing
Comparing raw features with processed TF-IDF features on the 20 Newsgroups dataset
Word2Vec models
Word2Vec on the 20 Newsgroups dataset
Summary
Chapter 10: Real-time Machine Learning with Spark Streaming
Online learning
Stream processing
An introduction to Spark Streaming
Input sources
Transformations
Actions
Window operators
Caching and fault tolerance with Spark Streaming
Creating a Spark Streaming application
The producer application
Creating a basic streaming application
Streaming analytics
Stateful streaming
Online learning with Spark Streaming
Streaming regression
A simple streaming regression program
Creating a streaming data producer
Creating a streaming regression model
Streaming K-means
Online model evaluation
Comparing model performance with Spark Streaming
Summary
Index
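About the author:
Nick Pentreath. If you are a Scala, Java, or Python developer with an interest in machine learning and data analysis, and you are keen to learn how to apply common machine learning techniques to large-scale applications using the Spark framework, this book is written for you. A basic understanding of Spark will help, but it is not required.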
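Excerpts from the text:
In information retrieval, precision is commonly used to evaluate the quality of the results, while recall is used to evaluate their completeness.
Typically, precision and recall are inversely related: high precision often corresponds to low recall, and vice versa.
Precision and recall are of limited use when measured on their own, but they are often combined into aggregate or averaged metrics. Both also depend on the threshold chosen for the model.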
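The trade-off described in this excerpt can be illustrated with a small sketch. This is not code from the book: the function names, the scores, and the labels below are hypothetical, and plain Python is used instead of Spark so that the example stays self-contained. F1 is shown as one common way of combining the two metrics; the excerpt does not name a specific aggregate.

```python
def precision_recall(scores, labels, threshold):
    """Return (precision, recall) when items scoring >= threshold are predicted positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    # Convention chosen for this sketch: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical model scores and true labels (1 = relevant, 0 = not relevant).
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

# Raising the threshold increases precision and lowers recall on this data.
for threshold in (0.2, 0.5, 0.85):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.2f} precision={p:.2f} recall={r:.2f} f1={f1_score(p, r):.2f}")
```

On this toy data, the lowest threshold recovers every relevant item at the cost of false positives, while the highest threshold returns only one item, which is relevant, but misses the other two.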
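Modern big data scenarios include requirements such as the following: the ability to integrate with other components of the system, especially data collection and storage systems, analytics and reporting, and front-end applications; being easy to scale and relatively independent of other components ... ; and, ideally, support for both batch and real-time processing.
Personalization and recommendation are very similar, but recommendation usually refers specifically to explicitly presenting certain products or content to the user, whereas personalization is sometimes more implicit. One example is personalizing MovieStream's search feature so that the search results change according to that user's data.
After initial preprocessing, the data needs to be converted into a representation suitable for machine learning models. For many model types, this representation is a vector or matrix containing numerical data.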
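As a rough illustration of turning preprocessed records into numerical vectors, the sketch below encodes a categorical field with 1-of-k (one-hot) encoding and places it next to a numeric field. It is not taken from the book: the records, field choices, and function names are hypothetical, and plain Python lists stand in for Spark's vector types.

```python
def build_category_index(values):
    """Map each distinct category to a column index (sorted so the order is deterministic)."""
    return {value: i for i, value in enumerate(sorted(set(values)))}


def one_hot(value, index):
    """Encode one categorical value as a 1-of-k list of floats."""
    vector = [0.0] * len(index)
    vector[index[value]] = 1.0
    return vector


# Hypothetical user records: (age, occupation).
records = [(24, "technician"), (53, "writer"), (23, "writer"), (33, "engineer")]

occupation_index = build_category_index(occ for _, occ in records)

# Each row is [age, one-hot occupation columns...], so the result is a numeric matrix.
feature_matrix = [
    [float(age)] + one_hot(occ, occupation_index) for age, occ in records
]

for row in feature_matrix:
    print(row)
```

Every row has the same length (one numeric column plus one column per distinct occupation), which is the shape most model-training APIs expect.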
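Other content:
About the book
Apache Spark is a newly developed distributed framework, optimized in particular for low-latency tasks and in-memory data storage. It combines speed, scalability, in-memory processing, and fault tolerance, and it is one of the few frameworks suited to parallel computing that is also very easy to program, with a flexible, expressive, and powerful API design.
Machine Learning with Spark (reprint edition, in English) guides you through the basics of the Spark API for loading and processing data, and shows how to prepare suitable input data for various machine learning models. Detailed examples and real-world cases help you learn common machine learning models, including recommendation systems, classification, regression, clustering, and dimensionality reduction. You will also see advanced topics such as large-scale text processing, approaches to online machine learning, and model evaluation with Spark Streaming.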