Leveraging Customer Data to Enhance Relevancy in Personalization (presentation)

Slide 1: Leveraging Customer Data to Enhance Relevancy in Personalization
“Using Apache Data Processing Projects on top of MongoDB”

Marc Schwering
Sr. Solution Architect – EMEA
marc@mongodb.com
@m4rcsch


Slide 2: Big Data Analytics Track
Driving Personalized Experiences Using Customer Profiles

Leveraging Data to Enhance Relevancy in Personalization

Machine Learning to Engage the Customer, with Apache Spark, IBM Watson, and MongoDB

Slide 3: Agenda For This Session
Personalization Process Review
The Life of an Application
Separation of Concerns / Real World Architecture
Apache Spark and Flink Data Processing Projects
Clustering with Apache Flink
Next Steps


Slide 4: High Level Personalization Process
1. Profile created
2. Enrich with public data
3. Capture activity
4. Clustering analysis
5. Define Personas
6. Tag with personas
7. Personalize interactions

Batch analytics

Public data

Common technologies
R
Hadoop
Spark
Python
Java
Many other options

Personas change much less often than profiles are tagged
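
Step 6 of the process (tagging a profile with a persona) can be sketched as a nearest-centroid lookup. This is only an illustration under stated assumptions: the centroid values, the "accessory-lover" persona name and the helper names are made up, and the slides do not say which distance measure was used.

```python
import math

# Hypothetical persona centroids, as a batch clustering job might produce them.
# Keys are product categories, values are typical visit counts (made-up numbers).
PERSONA_CENTROIDS = {
    "shoe-fanatic": {"shoes": 5.0, "watches": 0.5, "bags": 0.5},
    "accessory-lover": {"shoes": 0.5, "watches": 3.0, "bags": 2.5},
}


def distance(counts, centroid):
    """Euclidean distance over the union of categories; missing counts are 0."""
    keys = set(counts) | set(centroid)
    return math.sqrt(sum((counts.get(k, 0) - centroid.get(k, 0)) ** 2 for k in keys))


def tag_with_persona(profile, centroids=PERSONA_CENTROIDS):
    """Return a copy of the profile tagged with the nearest persona."""
    counts = profile.get("visitedCounts", {})
    persona = min(centroids, key=lambda name: distance(counts, centroids[name]))
    return {**profile, "persona": persona}


profile = {"firstName": "John", "visitedCounts": {"watches": 3, "bags": 2}}
tagged = tag_with_persona(profile)
print(tagged["persona"])
```

Because the centroids are just data, the expensive clustering job can run rarely while tagging stays a cheap per-profile operation.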


Slide 5: Evolution of a Profile (1)
{
    "_id" : ObjectId("553ea57b588ac9ef066428e1"),
    "ipAddress" : "216.58.219.238",
    "referrer" : "kay.com",
    "firstName" : "John",
    "lastName" : "Doe",
    "email" : "johndoe@gmail.com"
}


Originating IP
Demographic info
Location
Name
Sex
Email


Slide 6: Evolution of a Profile (n+1)
{
    "_id" : ObjectId("553e7dca588ac9ef066428e0"),
    "firstName" : "John",
    "lastName" : "Doe",
    "address" : "229 W. 43rd St.",
    "city" : "New York",
    "state" : "NY",
    "zipCode" : "10036",
    "age" : 30,
    "email" : "john.doe@mongodb.com",
    "twitterHandle" : "johndoe",
    "gender" : "male",
    "interests" : [
        "electronics",
        "basketball",
        "weightlifting",
        "ultimate frisbee",
        "traveling",
        "technology"
    ],
    "visitedCounts" : {
        "watches" : 3,
        "shirts" : 1,
        "sunglasses" : 1,
        "bags" : 2
    },
    "purchases" : [
        {
            "id" : 1,
            "desc" : "Power Oxford Dress Shoe",
            "category" : "Mens shoes"
        },
        {
            "id" : 2,
            "desc" : "Striped Sportshirt",
            "category" : "Mens shirts"
        }
    ],
    "persona" : "shoe-fanatic"
}
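
Before a profile like this can be clustered, it has to be turned into numbers. A minimal sketch, assuming the "visitedCounts" sub-document is the clustering input and that the category order is fixed up front; the function name and the normalisation choice are illustrative and not taken from the talk.

```python
# Assumed fixed category order; a real job would derive it from the catalogue.
CATEGORIES = ["watches", "shirts", "sunglasses", "bags"]


def to_feature_vector(profile, categories=CATEGORIES):
    """Turn visitedCounts into a vector of visit shares (values sum to 1)."""
    counts = profile.get("visitedCounts", {})
    total = sum(counts.get(c, 0) for c in categories)
    if total == 0:
        # A profile without any visits maps to the zero vector.
        return [0.0] * len(categories)
    return [counts.get(c, 0) / total for c in categories]


profile = {
    "firstName": "John",
    "visitedCounts": {"watches": 3, "shirts": 1, "sunglasses": 1, "bags": 2},
}
vector = to_feature_vector(profile)
print(vector)
```

Normalising to shares keeps heavy and light visitors comparable; raw counts would work too if visit volume itself should influence the persona.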

Slide 7: One size/document fits all?
Profile Data
Preferences
Personal information
Contact information
DOB, gender, ZIP...
Customer Data
Purchase History
Marketing History
“Session Data”
View History
Shopping Cart Data
Information Broker Data
Personalisation Data
Persona Vectors
Product and Category recommendations




Application

Batch analytics


Slide 8: Separation of Concerns
Profile Data
Preferences
Personal information
Contact information
DOB, gender, ZIP...
Customer Data
Purchase History
Marketing History
“Session Data”
View History
Shopping Cart Data
Information Broker Data
Personalisation Data
Persona Vectors
Product and Category recommendations




Batch analytics Layer
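
One way to picture this separation is to split a single large profile document into one smaller document per concern, linked by a shared customer id. The sketch below uses plain dictionaries; the field-to-concern mapping, the "customerId" key and the function name are assumptions made for illustration.

```python
# Assumed mapping from field name to owning concern.
# Any field not listed here stays with the core profile.
FIELD_OWNERS = {
    "purchases": "customer",
    "visitedCounts": "session",
    "persona": "personalisation",
}


def split_by_concern(document):
    """Split one profile document into per-concern documents sharing customerId."""
    customer_id = document["_id"]
    parts = {}
    for field, value in document.items():
        if field == "_id":
            continue
        concern = FIELD_OWNERS.get(field, "profile")
        parts.setdefault(concern, {"customerId": customer_id})[field] = value
    return parts


monolith = {
    "_id": "553e7dca588ac9ef066428e0",
    "firstName": "John",
    "email": "john.doe@mongodb.com",
    "visitedCounts": {"watches": 3, "bags": 2},
    "purchases": [{"id": 1, "desc": "Power Oxford Dress Shoe"}],
    "persona": "shoe-fanatic",
}
parts = split_by_concern(monolith)
print(sorted(parts))
```

Each resulting document could then live in its own collection, so the batch analytics layer only needs to read and write the personalisation part.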


Slide 9: Benefits
Code does less; document and code stay focused
Ability to split
Different Teams
New Languages
Defined Dependencies



Slide 10: Result
Code does less; document and code stay focused
Ability to split
Different Teams
New Languages
Defined Dependencies

KISS => Keep it simple and save!

=> Clean Code <=

Robert C. Martin: https://cleancoders.com/
M. Fowler, B. Meyer, et al.: Command Query Separation
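
Command Query Separation means a method either changes state (a command, returning nothing) or returns data (a query, changing nothing), never both. A minimal sketch with a hypothetical shopping cart class; the class and method names are not from the talk.

```python
class Cart:
    """Hypothetical shopping cart that follows Command Query Separation."""

    def __init__(self):
        self._items = []

    def add_item(self, sku):
        """Command: changes state and returns nothing."""
        self._items.append(sku)

    def item_count(self):
        """Query: returns data and leaves state untouched."""
        return len(self._items)


cart = Cart()
cart.add_item("power-oxford-dress-shoe")
print(cart.item_count())
```

Since queries have no side effects, they can be called repeatedly, or served from a separate read-only store, without changing the result.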



Slide 11: Analytics and Personalization
From Query to Clustering


Slide 12: Separation of Concerns
Profile Data
Preferences
Personal information
Contact information
DOB, gender, ZIP...
Customer Data
Purchase History
Marketing History
“Session Data”
View History
Shopping Cart Data
Information Broker Data
Personalisation Data
Persona Vectors
Product and Category recommendations




Batch analytics Layer


Slide 14: Architecture revised



Data Processing


Slide 15: Advice for Developers
OWN YOUR DATA! (but only relevant data)
Say no! (to direct data access, i.e. DB access)

Slide 16: Data Processing


Slide 17: Hadoop in a Nutshell
An open source framework for distributed storage and distributed, batch-oriented processing

Hadoop Distributed File System (HDFS) to store data on commodity hardware
YARN as resource management platform
MapReduce as programming model working on top of HDFS
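
The MapReduce programming model can be illustrated in a single process: a map step emits key/value pairs, a shuffle groups them by key, and a reduce step folds each group. The sketch below counts purchases per category in plain Python; it does not use Hadoop itself, the function names are illustrative, and the third purchase record is made up.

```python
from collections import defaultdict


def map_phase(purchase):
    """Map: emit one (category, 1) pair per purchase record."""
    yield purchase["category"], 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: fold the values of one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence, in one process."""
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


purchases = [
    {"desc": "Power Oxford Dress Shoe", "category": "Mens shoes"},
    {"desc": "Striped Sportshirt", "category": "Mens shirts"},
    {"desc": "Running Shoe", "category": "Mens shoes"},  # made-up record
]
result = run_job(purchases)
print(result)
```

In a real cluster the map and reduce functions run on many machines and the shuffle moves data over the network, which is where most of the latency of a batch job comes from.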


Slide 18: Spark in a Nutshell
Spark is a top-level Apache project
Can be run on top of YARN and can read any Hadoop API data, including HDFS or MongoDB

Fast and general engine for large-scale data processing and analytics
Advanced DAG execution engine with support for data locality and in-memory computing


Slide 19: Flink in a Nutshell
Flink is a top-level Apache project
Can be run on top of YARN and can read any Hadoop API data, including HDFS or MongoDB

A distributed streaming dataflow engine
Streaming and batch
Iterative in-memory execution and handling
Cost-based optimizer


Slide 20: Latency of query operations


Slide 21: Iterative Algorithms / Clustering


Slide 22: K-Means in Pictures
Source: Wikipedia K-Means


Slide 23: K-Means as a Process
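
The slide's diagram is not preserved in this transcript. As a stand-in, here is a minimal single-machine sketch of the standard k-means loop: assign every point to its nearest centroid, recompute each centroid as the mean of its points, and repeat until nothing moves. Seeding the centroids with the first k points is a simplifying assumption, and the sample points are made up.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iterations=100):
    """Return (centroids, clusters) after iterating until the centroids are stable."""
    centroids = [tuple(p) for p in points[:k]]  # naive seeding (assumption)
    clusters = [[] for _ in range(k)]
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            mean(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, clusters


points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0), (2.0, 1.0), (8.0, 9.0)]
centroids, clusters = k_means(points, k=2)
print(centroids)
```

Every pass of the loop reads the full data set again, which is why the cost of an iteration matters so much when this runs on a distributed engine.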


Slide 24: Iterations in Hadoop and Spark


Slide 25: Iterations in Flink
Dedicated iteration operators
Tasks keep running for the iterations, not redeployed for each step
Caching and optimizations done automatically

Slide 28: More…?


Slide 29: Takeaways
Stay focussed => Start and stay small
Evaluate with BigDocuments but do a PoC focussed on the topic
Extending functionality is easy
Aggregation, MapReduce
Hadoop Connector opens a new variety of Use Cases
Extending functionality could be challenging
Evolution is outpacing help channels
A lot of options (Spark, Flink, Storm, Hadoop….)
More than just a binary

Slide 30: Next Steps
Next Session => Hands on Spark and Watson Content!
“Machine Learning to Engage the Customer, with Apache Spark, IBM Watson, and MongoDB”
RDD Examples
Try out Spark and Flink
http://bit.ly/MongoDB_Hadoop_Spark_Webinar
http://flink.apache.org/
https://github.com/mongodb/mongo-hadoop
https://github.com/m4rcsch/flink-mongodb-example
Participate and ask Questions!
@m4rcsch
marc@mongodb.com


Slide 31: Thank you!

Marc Schwering
Sr. Solutions Architect – EMEA
marc@mongodb.com
@m4rcsch

