How Can Startups Leverage Big Data?



Slide 1
How Can Startups Leverage Big Data?
Trudging Through Myth To Discover Real


Slide 2
Mostly Unstructured Data
Client Data
Customer Data
Social Data
Driving towards insight
What is Big Data?

“Big Data is any dataset not suited to be processed by traditional legacy technology.”

Slide 4
The Three V’s
Mining social data for sentiment
Analyzing web clickstreams
Analyzing log data for security breaches
Telemetry from sensors and machines
eCommerce predictive analytics

Slide 6
Evolution of Data
Data Complexity

Slide 7
Big Data is now much more than hype – real customers with real use cases are adopting it daily
A recent survey found that business leaders expected deploying Hadoop to yield a three-year benefit ranging from $5M to $50M+
Close to 100% of business leaders have already deployed or plan to deploy Apache™ Hadoop®

Big Data is Here to Stay

“Enterprises are showing increasing interest in the value provided by the large-scale data processing that Hadoop and Spark can provide, but can be wary of the upfront cost and complexity of setting up a cluster to prove that value. Managed services such as [OnMetal™ Cloud Big Data Platform] enable enterprises to focus their energies on generating business insights rather than configuring and managing infrastructure.”
Matt Aslett
451 Research Director, Data Platforms and Analytics

Slide 8
To learn more about your customers
To optimize your business processes
To become a more targeted marketer
To interact with users and customers in real time
To add revenue and services

Why leverage Big Data?


What Is the Cost of Lacking a Big Data Strategy?

Today every company can be a data company
Successful companies will be data companies
Under Armour isn’t just a fitness company – they’re a data company

Slide 10
Open Source
Able to process petabytes of data quickly
Based on designs published by Google (GFS and MapReduce), implemented at scale at Yahoo
Handles unstructured data very well
One of the fastest-growing ecosystems

Hadoop Has Emerged As A Leader In Distributed Data Sets

Slide 11
Fundamentals of Hadoop v1
Data Services
Core Services
Distributed File System
Distributed, scalable, non-relational database and table management system
Data flow scripting language
Data warehouse analysis layer through HiveQL (SQL-like) queries
Data processing framework
Operational Services
Log data aggregation and movement
Bulk data transfer from and to relational DB
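
The data processing framework in Hadoop v1 is MapReduce. The sketch below imitates its three phases (map, shuffle, reduce) in plain Python on a word-count job. It is only an illustration of the programming model, not the Hadoop API; the function names and the sample input are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's list of values into one count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence on an in-memory dataset."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return reduce_phase(shuffle_phase(pairs))


if __name__ == "__main__":
    # Hypothetical input; on a real cluster each line could live on a different node.
    print(word_count(["big data is big", "data is here to stay"]))
```

On a cluster, the map and reduce steps run in parallel on the nodes that hold the data, which is what lets the same model scale to very large datasets.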

Slide 12
Biggest impediments include:
Insufficient skills in-house to design and deploy
Designing and deploying takes too long
High cost of physical infrastructure

Hadoop is Hard

Slide 13
Original focus on batch processing
Streaming and interactive use cases emerging
Shift from jobs that take hours to jobs that take seconds
Impala, Spark, and Presto are emerging tools

Hadoop is Changing

Slide 14
But what are these companies doing with Big Data?
Gaining insights!

Slide 15
What are Companies Doing with Hadoop?

Slide 16
Application Underpinning
Enterprises consider support for mobility and productivity enhancement for mobile workers their top-priority new application category, according to a recent survey by CIMI Corp. That means most companies that have adopted, or are adopting, Hadoop will likely have to integrate the framework with mobile applications.
Data Aggregation
"The two big use cases we're seeing for Impala are aggregating data in Hadoop to present analytic dashboards and improving data-discovery applications by providing faster performance than Hive," says Alex Gutow, Cloudera's product marketing manager.
Users are increasingly choosing Hadoop as the underlying technology to power interactive dashboards.
Internet of Things
As wearables and other data-generating devices become commonplace, the backend of your application needs to be built to handle the velocity and volume of data these devices produce.

People are building net-new applications with Hadoop as their database

Slide 17
Clickstream Analysis
Your home page looks great. But how do you move customers on to bigger things, like submitting a form or completing a purchase? Get more granular with customer segmentation. Hadoop makes it easier to analyze, visualize and ultimately change how visitors behave on your website.
A clickstream is a series of page requests. Every page requested generates a signal. These signals can be graphically represented for clickstream reporting. The main point of clickstream tracking is to give webmasters insight into what visitors on their site are doing.

The study of human clicks on a website
Tracking Cookies
Tool used to understand and track online activity
Data Mining
Collecting data from websites and online properties

Understand how your users are behaving on your website and optimize your experience
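
A clickstream, as described above, is a series of page requests per visitor. One common question to ask of it is how far visitors get through a funnel. The sketch below is a minimal illustration under stated assumptions: the event format (visitor ID plus page, already in time order) and the funnel pages are hypothetical, and a real deployment would run the same logic over much larger logs.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Sequence, Tuple

# Hypothetical event format: (visitor_id, page), already sorted by time.
Event = Tuple[str, str]

# Hypothetical funnel: pages a visitor is expected to pass through, in order.
FUNNEL = ("/home", "/product", "/checkout")


def paths_by_visitor(events: Iterable[Event]) -> Dict[str, List[str]]:
    """Group page requests into one ordered path per visitor."""
    paths: Dict[str, List[str]] = defaultdict(list)
    for visitor, page in events:
        paths[visitor].append(page)
    return dict(paths)


def funnel_counts(events: Iterable[Event], funnel: Sequence[str] = FUNNEL) -> List[int]:
    """Count how many visitors reached each funnel step, in funnel order."""
    counts = [0] * len(funnel)
    for path in paths_by_visitor(events).values():
        step = 0
        for page in path:
            if step < len(funnel) and page == funnel[step]:
                counts[step] += 1
                step += 1
    return counts


if __name__ == "__main__":
    sample = [
        ("a", "/home"), ("a", "/product"), ("a", "/checkout"),
        ("b", "/home"), ("b", "/product"),
        ("c", "/home"),
    ]
    print(dict(zip(FUNNEL, funnel_counts(sample))))
```

The drop between consecutive steps shows where visitors abandon the path, which is where page changes are most likely to pay off.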

Slide 18
Sentiment Analysis
Your customers are talking. With Hadoop, you can mine Twitter, Facebook and other social media conversations for sentiment data about you and your competition, and use it to make targeted, real-time decisions that increase market share.
Sentiment analysis aims to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a document.

Social Media Feeds
Many companies are now capturing entire Twitter and Facebook feeds to analyze.
Data Mining
Users are searching the web for comments, blogs, and whitepapers that can point to overall sentiment
Forums, user groups, Heroku

Find out what your users are saying about you. Are they happy? Does your product make them a promoter?
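
The simplest way to estimate the polarity of a post is to count positive and negative words. The sketch below does exactly that. The word lists are tiny hypothetical examples, and this lexicon approach is a stand-in chosen for brevity: production systems typically use much larger lexicons or trained models.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only.
POSITIVE = {"love", "great", "happy", "fast", "recommend"}
NEGATIVE = {"hate", "slow", "broken", "angry", "terrible"}


def polarity(text: str) -> int:
    """Return (#positive words - #negative words) found in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum((w in POSITIVE) - (w in NEGATIVE) for w in words)


def label(text: str) -> str:
    """Map the numeric polarity to a coarse sentiment label."""
    score = polarity(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for post in [
        "I love this product, great support!",
        "Terrible. The app is slow and broken",
        "It arrived on Tuesday",
    ]:
        print(label(post), "|", post)
```

A word-count approach misses negation and sarcasm ("not great"), so treat its output as a rough signal to aggregate over many posts rather than a verdict on any single one.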

Slide 19
Machine Learning
Your machines know things. From out in the field to the assembly line floor, machines stream low-cost, always-on data. Hadoop makes it easier for you to store and refine that data and identify meaningful patterns, providing you with the insight to make proactive business decisions.
Machine Learning is a scientific discipline that deals with the construction and study of algorithms that can learn from data. Such algorithms operate by building a model based on inputs and using that to make predictions or decisions, rather than following only explicitly programmed instructions.

Pattern Recognition
Users are building clusters to detect patterns and identify anomalies in data that these devices are generating
Decision Tree
Allows the system to take action and make choices based on the data
Predictive Modeling
Aims to anticipate the most common mistakes and errors so they can be handled automatically as part of a preventative model

Interactive devices are now streamlining things like maintenance and troubleshooting
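
Identifying anomalies in device data can start with something as simple as flagging readings that sit far from the mean. The sketch below uses a z-score test. The readings and the threshold of 2.0 are hypothetical, and this statistical rule is a stand-in for whatever model a real cluster would train.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def find_anomalies(readings: Sequence[float], threshold: float = 2.0) -> List[Tuple[int, float]]:
    """Return (index, value) for readings more than `threshold`
    sample standard deviations away from the mean.

    Needs at least two readings, because stdev() is undefined otherwise.
    """
    mu = mean(readings)
    sigma = stdev(readings)
    if sigma == 0:
        return []  # all readings identical: nothing stands out
    return [(i, x) for i, x in enumerate(readings) if abs(x - mu) / sigma > threshold]


if __name__ == "__main__":
    # Hypothetical temperature telemetry with one obvious spike.
    temps = [70.1, 69.8, 70.3, 70.0, 69.9, 70.2, 95.0, 70.1]
    print(find_anomalies(temps))
```

Because the spike itself inflates the mean and standard deviation, small samples need a modest threshold. On long-running streams, a rolling window or a robust statistic such as the median is a common refinement.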

Slide 20
Fraud Detection
Fraud is a billion-dollar business and it is increasing every year. PwC's 2009 Global Economic Crime Survey suggests that close to 30% of companies worldwide reported being victims of fraud in the past year.
Fraud involves one or more persons who intentionally act secretly to deprive another of something of value, for their own benefit. Fraud is as old as humanity itself and can take an unlimited variety of different forms. However, in recent years, the development of new technologies has also provided further ways in which criminals may commit fraud.

Rules-Based Detection
Even though internet hackers have become better at tricking online systems, they still exhibit very calculated behavior.
Machine Learning
The aggregation of data points can help you collect more info about the potential sale and detect if it might be fraud.
Users Tagging and Tracing
Once users are flagged as fraudulent, their repeated attempts can be prevented.

Users are detecting fraudulent online behavior and rejecting those users before they commit an offense
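
The rules-based detection and user tagging described above can be sketched as a small rule engine. Everything specific here is an assumption for illustration: the `Order` fields, the thresholds, and the rule names are hypothetical, not taken from any particular fraud product.

```python
from dataclasses import dataclass
from typing import List, Set

# Hypothetical thresholds; real values would be tuned on historical data.
MAX_AMOUNT = 1000.0
MAX_ATTEMPTS_PER_HOUR = 3


@dataclass
class Order:
    user_id: str
    amount: float
    billing_country: str
    shipping_country: str
    attempts_last_hour: int


def triggered_rules(order: Order, flagged_users: Set[str]) -> List[str]:
    """Return the names of all rules this order trips."""
    hits = []
    if order.user_id in flagged_users:
        hits.append("previously_flagged")
    if order.amount > MAX_AMOUNT:
        hits.append("large_amount")
    if order.billing_country != order.shipping_country:
        hits.append("country_mismatch")
    if order.attempts_last_hour > MAX_ATTEMPTS_PER_HOUR:
        hits.append("rapid_retries")
    return hits


def review(order: Order, flagged_users: Set[str], min_rules: int = 2) -> bool:
    """Return True if the order is rejected.

    Rejected users are added to `flagged_users`, so repeat attempts are blocked.
    """
    hits = triggered_rules(order, flagged_users)
    rejected = "previously_flagged" in hits or len(hits) >= min_rules
    if rejected:
        flagged_users.add(order.user_id)
    return rejected


if __name__ == "__main__":
    flagged: Set[str] = set()
    print(review(Order("u1", 50.0, "US", "US", 1), flagged))    # ordinary order
    print(review(Order("u2", 5000.0, "US", "RU", 1), flagged))  # trips two rules
    print(review(Order("u2", 10.0, "US", "US", 1), flagged))    # repeat attempt
```

Requiring two rules before rejecting is one way to limit false positives. A machine learning model could replace the fixed thresholds while keeping the same tagging step.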

Slide 21
Server Log Data
Security breaches happen. And when they do, your server logs may be your best line of defense. Hadoop takes server-log analysis to the next level by speeding and improving security forensics and providing a low-cost platform to show compliance.
Server logs are generally small files that track user information inside a confined environment; they are often used to meet compliance requirements or troubleshoot an incident.

Scrub Data for Forensics
If a security incident occurs, it is important to remediate fast
Identify Anomalies
Anti-patterns are often the first sign
Discover Trends
Some types of errors might become common; learn to identify them
Actively Automate to Solve Issues with Log Files
Many of these errors can be proactively eliminated through the use of automation.

Aggregate server logs to find trends and anomalies in your security records
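
Aggregating logs to surface anomalies can be as simple as counting failed logins per source address. The sketch below assumes a hypothetical log line format (`timestamp ip EVENT user=name`). Real server logs vary, so the regular expression would need to be adapted to the actual format.

```python
import re
from collections import Counter
from typing import Iterable, List

# Hypothetical log format: "<timestamp> <ip> <LOGIN_OK|LOGIN_FAILED> user=<name>"
LINE = re.compile(
    r"^(?P<ts>\S+) (?P<ip>\S+) (?P<event>LOGIN_OK|LOGIN_FAILED) user=(?P<user>\S+)$"
)


def failed_logins_by_ip(lines: Iterable[str]) -> Counter:
    """Count LOGIN_FAILED events per source IP, skipping unparseable lines."""
    counts: Counter = Counter()
    for line in lines:
        match = LINE.match(line.strip())
        if match and match.group("event") == "LOGIN_FAILED":
            counts[match.group("ip")] += 1
    return counts


def suspicious_ips(lines: Iterable[str], threshold: int = 3) -> List[str]:
    """Return, in sorted order, IPs with at least `threshold` failed logins."""
    return sorted(ip for ip, n in failed_logins_by_ip(lines).items() if n >= threshold)


if __name__ == "__main__":
    sample = [
        "2015-06-01T10:00:01 10.0.0.5 LOGIN_FAILED user=root",
        "2015-06-01T10:00:02 10.0.0.5 LOGIN_FAILED user=admin",
        "2015-06-01T10:00:03 10.0.0.5 LOGIN_FAILED user=test",
        "2015-06-01T10:00:04 10.0.0.9 LOGIN_OK user=alice",
        "2015-06-01T10:00:05 10.0.0.9 LOGIN_FAILED user=alice",
    ]
    print(suspicious_ips(sample))
```

The same count-by-key pattern maps directly onto a MapReduce or Spark job when the logs no longer fit on one machine.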

Slide 22
360 View of Customer – Dashboards and Analytics
Whenever a customer interacts with an organization, it is vital that the richness of information available on that customer informs and guides the processes that will help to maximize their experience, while simultaneously making the interaction as effective and efficient as possible. This includes everything from avoiding repetition or rekeying of information, to viewing customer history, establishing context and initiating desired actions.

A total 360 view often contains 3 views:
The Past
Understanding how your users acted in the past lets you understand who they are and serve them relevant content and products
The Present
Where are users coming from? What is their experience on your site right now? Do they need help?
The Future
Did they buy? Can we serve them more information to help their choice? Can we market to them better?

Create in-depth personas for your customers based on how they are actually behaving.

Slide 23
What’s Next? Interactive Processing!
What if, instead of reacting to behavior, we could engage virtually with the user to inhibit behavior?

This is called interactive processing: it takes input from humans and reacts based on patterns and algorithms.

The quicker we can serve up this interaction to the user, the better equipped we are to inhibit their behavior!

Interact with customers in real time, offering suggestions and inhibiting behavior


Slide 24
Introducing support of Apache Spark™
Apache Spark enables enterprises to combine the breadth of structured and unstructured data with the speed of in-memory processing to build streaming, machine learning, and graph-optimized applications that allow businesses to take action at the speed of insight.

Apache Spark

Slide 25
Deeper Integration with SQL Workloads
Streaming Applications
Machine Learning
Iterative Processing
Real-time Graphical Dashboards
New Use


Slide 26
Does the delivery method matter?
YES

Slide 27
Choose The Best Deployment Model

Slide 29
Advantages of storing data in the cloud:

Slide 30
Dedicated Hosting
No Capex Investment
Choose new hardware and software versioning easily
Rely on extended support personnel
Increased security options
Concurrent and predictable performance

Control Data Access
Integrate with core mainframe and systems
Build your own IP
Control every aspect of design and operation

Advantages of Dedicated Hosting/On-Premise

The Trade Off...
Custom Built
Purpose Built

OnMetal Lets You Scale Like the Internet Giants
“Rackspace Cloud, because of its single-tenant OnMetal line, is the only place on Earth where you can enjoy Facebook/Google-style infrastructure rented by the hour.”
-Ev Kontsevoy
Director, Product

Slide 33
Benefits of Outsourced Hosting

The Level of Management You Need
Only you can decide what model is best for you!
Managed Service
Turnkey Service

Slide 35
Data as a Service: more time building, less time managing databases
For some businesses, database or infrastructure management IS core to the business

For most software-based businesses, database or infrastructure management represents time and resources not spent building the application

You must answer for yourself: are you in the business of managing infrastructure, or in the business of [your market here]?




Slide 39
Rackspace Offerings for the Data Tier
for Data
Managed Offerings of Most Big Data, SQL, & NoSQL Databases

Managed Database Services for Production Apps

Cloud IaaS
Get started fast

Dedicated Hosting
Predictable costs & performance

Cloud Elasticity & Dedicated Performance

Slide 40
Sign up for a free trial
Want to know more?
Read my blog and check out the articles

What’s Next?

Slide 41
Questions?
