What is the Present State of the Art of In-Memory Analytics? (presentation)

Slide 1
What is the Present State of the Art of In-Memory Analytics?
Timo Elliott, Innovation Evangelist, timoelliott.com

Slide 2: Disclaimer
“I think you’ll find it’s a bit more complicated than that.”


Slide 3: A Bit of History


Slide 4: LEO: Lyons Electronic Office, 1951
Sixty-four 5ft-long mercury tubes, each weighing half a ton, were used to provide a massive 8.75 KB of memory (i.e. one hundred-thousandth of today’s entry-level iPhone).

Slide 5: 1980s – first in-memory BI tools
Usefulness limited by high cost of memory and limitations of 16-bit memory addressing

640 KB max memory


Slide 6: 1995: Windows 95 & 32-bit Architectures
QlikView, TimesTen, and others take advantage of new 32-bit memory addressing to provide in-memory analytics

Slide 7: Complex Event Processing
Sensor readings – tens of thousands per second
Virtually no useful information in a single isolated event
History
E.g. compare variance of trends across multiple sensors against historical norms
Event window – e.g. 30 min
Alert

Extracting insight from events
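
The window-and-alert idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it watches a single sensor rather than several, the class name `WindowVarianceMonitor` and its parameters are hypothetical, and "exceeds the historical norm" is interpreted as the window variance being larger than a chosen multiple of a known historical variance.

```python
from collections import deque
from statistics import pvariance


class WindowVarianceMonitor:
    """Flags an alert when variance inside a time window exceeds a historical norm."""

    def __init__(self, window_seconds, historical_variance, alert_ratio):
        self.window_seconds = window_seconds            # e.g. 30 minutes
        self.historical_variance = historical_variance  # assumed known from history
        self.alert_ratio = alert_ratio                  # hypothetical tolerance factor
        self._events = deque()                          # (timestamp, value), oldest first

    def observe(self, timestamp, value):
        """Add one reading and return True if the window now warrants an alert."""
        self._events.append((timestamp, value))
        # Evict readings that have fallen out of the event window.
        while self._events[0][0] <= timestamp - self.window_seconds:
            self._events.popleft()
        values = [v for _, v in self._events]
        if len(values) < 2:
            return False  # a single isolated event carries no trend
        return pvariance(values) > self.alert_ratio * self.historical_variance


# Example: steady readings once a minute, then a sudden jump.
monitor = WindowVarianceMonitor(window_seconds=30 * 60,
                                historical_variance=1.0,
                                alert_ratio=4.0)
readings = [(0, 20.0), (60, 21.0), (120, 20.0), (180, 21.0), (240, 30.0)]
alerts = [monitor.observe(t, v) for t, v in readings]
```

Only the last reading raises an alert, because it is the first one that pushes the variance of the window well above the historical value.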


Slide 8: Complex Event Processing
Traditional BI: “How many fraudulent credit card transactions occurred last week in Madrid?”

[Timeline: events 1 to 9 along a time axis]

Complex Event Processing: “When three credit card authorizations for the same card occur in any five-second window, deny the requests and check for fraud.”




Continuous Queries
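
The credit card rule quoted above is a continuous query over a sliding window. A minimal Python sketch follows, assuming events arrive in timestamp order; the class `AuthorizationRateRule` and its method names are hypothetical, and for simplicity only the request that trips the rule (and any later ones still inside the window) is denied.

```python
from collections import defaultdict, deque


class AuthorizationRateRule:
    """Denies a card once `max_events` authorizations fall inside one time window."""

    def __init__(self, max_events=3, window_seconds=5.0):
        self.max_events = max_events
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # card id -> timestamps inside the window

    def on_authorization(self, card_id, timestamp):
        """Process one event and return "deny" or "allow"."""
        recent = self._recent[card_id]
        recent.append(timestamp)
        # Drop timestamps that are no longer within the window of this event.
        while timestamp - recent[0] >= self.window_seconds:
            recent.popleft()
        return "deny" if len(recent) >= self.max_events else "allow"


rule = AuthorizationRateRule()
decisions = [
    rule.on_authorization("card-A", 0.0),
    rule.on_authorization("card-A", 1.0),
    rule.on_authorization("card-A", 4.0),   # third request within five seconds
    rule.on_authorization("card-B", 4.5),   # different card, unaffected
    rule.on_authorization("card-A", 10.0),  # earlier requests have expired
]
```

Unlike the traditional BI question, nothing is queried after the fact: the rule is evaluated as each event arrives, and only the few timestamps still inside the window are kept in memory.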


Slide 9: In-Memory and the Internet of Things




CEP Engine
Studio
Input Streams
Sensors
Messages
Transactions
Market data
Clicks

Alerts
Dashboards
Applications



adapters



Slide 10: “Traditional” Business Intelligence
Slow
Painful
Expensive
Copy
ETL



Slide 11: It’s Like an Onion…
The more layers there are, the more it makes you cry…

Slide 12: What Was the Problem?
Slow Disks & CPUs
I/O Bottleneck
Expensive Memory

Optimized for Transactions
BI is an Afterthought

30-Year-Old Database Design Principles



Slide 13: Why Talk About In-Memory?


Slide 14: Analysts Recommend In-Memory
“An in-memory data platform offers more than performance benefits”
“Recommendations: Invest in an in-memory data platform to gain competitive edge”

“In-Memory Database Is Gaining Momentum Across All Use Cases”

“In-Memory Delivers Extreme Performance And Scalability”

“In-Memory Data Platform Is No Longer An
Option — It’s A Necessity!”


Slide 15: Companies Like Yours Are Implementing In-Memory
32% run in-memory databases at their location today

75% expect to expand their in-memory use in the next 3 years

Source: 2014 DBTA survey of IT and data managers

Top Uses

Top Benefits


Slide 16: Database vendors are investing in in-memory
The Forrester Wave: In-Memory Database Platforms, Q3 ’15

Slide 17: All Analytics Vendors Now Support In-Memory To Some Extent
Oracle Database In-Memory Option
“The Oracle Database In-Memory option dramatically accelerates the performance of analytic queries by storing data in a highly optimized columnar in-memory format.”

Microsoft SQL Server In-Memory OLTP
“When data lives totally in memory, we can use much, much simpler data structures. When a table is declared memory-optimized, all of its records live in memory.”

DB2 with BLU Acceleration
“IBM DB2 with BLU Acceleration speeds analytics and reporting using dynamic in-memory columnar technologies. In-memory columnar technologies provide an extremely efficient way to scan and find relevant data.”

Qlik
“In-memory indexing automatically builds and maintains all data relationships from multiple sources for unrestricted exploration”

SAP HANA
“A good example of a modern in-memory database technology is SAP's HANA platform.”

Teradata
“Teradata uses a hybrid approach to in-memory that intelligently puts the right data in memory to deliver high-speed in-memory performance at a fraction of the cost of putting all data in memory.”

Tableau
“The Data Engine is a high-performing analytics database on your PC. It has the speed benefits of traditional in-memory solutions without the limitations that your data must fit in memory.”

Spark
“Run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.”


Slide 18: What Is In-Memory?
And why now?


Slide 19: What Is In-Memory?
Data access times of various storage types relative to RAM (logarithmic scale)

RAM is 300,000 times faster than hard disks

CPU register is 61 million times faster than hard disks

Slide 20: In-Memory Databases vs. Caching
“Much of the work that is done by a conventional, disk-optimized RDBMS is done under the assumption that data primarily resides on disk. Even when a disk-based RDBMS has been configured to hold all of its data in main memory, its performance is hobbled by assumptions of disk-based data residency. When the assumption of disk-residency is removed, complexity is dramatically reduced.”
- Oracle TimesTen Overview

Slide 21: In-Memory Computing Costs Have Plummeted
Turning Torso: 190m
Cost of 1 MB of memory in 2000: ≈$1

Slide 22: In-Memory Computing Costs Have Plummeted
Cost of 1 MB of memory today: ≈ ½ cent

75cm

And shrinking, and shrinking, and shrinking….

IKEA
MICKE
Skrivbord
399 kr


Slide 23: Prices Continue to Slide
DRAM production costs drop by 30% every 12 months
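
As a rough consistency check on these figures, the sketch below compounds a 30% annual decline starting from about $1 per megabyte in 2000. It assumes, purely for illustration, that purchase price follows production cost; the function names are hypothetical.

```python
import math


def projected_cost(initial_cost, annual_decline, years):
    """Cost after `years` of compounding decline (annual_decline=0.30 means 30%)."""
    return initial_cost * (1.0 - annual_decline) ** years


def years_to_reach(initial_cost, target_cost, annual_decline):
    """Years of compounding decline needed to fall from initial_cost to target_cost."""
    return math.log(target_cost / initial_cost) / math.log(1.0 - annual_decline)


cost_after_15_years = projected_cost(1.0, 0.30, 15)   # dollars per MB
years_to_half_cent = years_to_reach(1.0, 0.005, 0.30)
```

Under these assumptions the cost after 15 years comes out just under half a cent per megabyte, and reaching exactly half a cent takes a little under 15 years, which fits the "$1 in 2000" and "½ cent today" figures on the two preceding slides.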

Slide 24: In-Memory Computing
Copy
ETL


Up to 1,000x faster
No optimizations required


Slide 25: Row vs. Column Databases
My Filing System
My Wife’s Filing System
Row-based
Column-based
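
To make the filing-system analogy concrete, here is a small Python sketch of the same three records laid out row by row and column by column. The table and field names are invented for illustration and do not come from any particular product.

```python
# Row-based layout: each record is stored together.
rows = [
    {"order_id": 1, "city": "Madrid", "amount": 120.0},
    {"order_id": 2, "city": "Paris", "amount": 80.0},
    {"order_id": 3, "city": "Madrid", "amount": 45.5},
]

# Column-based layout: each attribute is stored together.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def total_amount_row_store(rows):
    """Touches every record, including fields the query does not need."""
    return sum(row["amount"] for row in rows)


def total_amount_column_store(columns):
    """Reads only the one column the query needs."""
    return sum(columns["amount"])


row_total = total_amount_row_store(rows)
column_total = total_amount_column_store(columns)
```

Both layouts give the same answer. The difference is in what has to be read: an analytic query that aggregates one attribute scans a single contiguous list in the column layout, while fetching or updating one complete record is more natural in the row layout.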


Slide 26: Column Databases
Copy
ETL


Up to 1,000x faster
More data in less space
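
One reason a column store can hold "more data in less space" is that values within a column tend to repeat, so they compress well. The slide does not name a compression scheme; the sketch below uses dictionary encoding followed by run-length encoding, two techniques commonly associated with column stores, as an assumption for illustration.

```python
def dictionary_encode(values):
    """Replace each distinct value with a small integer code."""
    dictionary, index, codes = [], {}, []
    for value in values:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes


def run_length_encode(codes):
    """Collapse consecutive repeats into (code, run_length) pairs."""
    runs = []
    for code in codes:
        if runs and runs[-1][0] == code:
            runs[-1][1] += 1
        else:
            runs.append([code, 1])
    return [tuple(run) for run in runs]


def decode(dictionary, runs):
    """Rebuild the original column from the dictionary and the runs."""
    return [dictionary[code] for code, length in runs for _ in range(length)]


city_column = ["Madrid", "Madrid", "Madrid", "Paris", "Paris", "Madrid"]
dictionary, codes = dictionary_encode(city_column)
runs = run_length_encode(codes)
```

Here six strings shrink to a two-entry dictionary plus three runs, and `decode(dictionary, runs)` restores the original column. The benefit grows when a column is sorted or has few distinct values.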


Slide 27: Massively Parallel Systems
E.g. Netezza technology, now part of IBM PureSystems
E.g. Greenplum, now part of EMC

Slide 28: Column Stores, Compression, and Parallel Processing
E.g. DB2 with BLU acceleration


Slide 29: “In-Chip” Processing
E.g. SiSense

Vector-based instructions
Cache-optimized
Decompression

Close collaboration between in-memory software vendors and chip developers (e.g. SAP & Intel Haswell)

Slide 30: Massively Parallel Hardware
Copy
ETL

Query
Up to 1,000x faster
Optimized for hardware



Slide 31: In-Database Processing
E.g. SAS & Teradata


Slide 32: Move Processing to the Data

Operational (OLTP)
Analytics (OLAP)
Planning Predictive
Text Search
Spatial





Processing Engines
Relational Stores
Row based
Columnar
ETL Data

Quality


Document Store

Object Graph Store


Slide 33: In-Database Analytics
Copy
ETL

Query
Up to 1,000x faster
Push processing down to dedicated hardware, less traffic




Slide 34: Real-Time Data
Copy
ETL
Real-time replication — why have a separate operational data store?



Slide 35: Transactions
ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably.

ACID

ACID

compliance
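
A small illustration of the atomicity and consistency parts of ACID, using Python's built-in `sqlite3` module with an in-memory database. The `accounts` table and the `transfer` helper are hypothetical. Note that an in-memory SQLite database disappears when the connection closes, so this sketch says nothing about durability, which in-memory platforms address through separate persistence mechanisms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move money in one transaction: both updates apply, or neither does."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, receiver))
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, sender))


def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


transfer(conn, "alice", "bob", 30)        # succeeds
try:
    transfer(conn, "bob", "alice", 500)   # would overdraw bob
    overdraft_rejected = False
except sqlite3.IntegrityError:
    overdraft_rejected = True             # the credit to alice was rolled back too
```

After the failed transfer, the balances are the same as before it was attempted: the credit that had already been applied inside the transaction is undone along with the rejected debit.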


Slide 36: In-Memory Enterprise Applications
E.g. Microsoft SQL Server In-Memory OLTP


Slide 37: In-Memory Enterprise Applications
E.g. SAP S/4HANA


Slide 38: Hybrid Transactional Analytical Processing
Copy
Use a single platform for both analytics and applications



Slide 39: Virtuous Circle of Technology
In-Memory

Columnar Databases

Hardware Acceleration

Calculation Engine

Columnar storage increases the amount of data that can be stored in limited memory (compared to disk)

Column databases enable easier parallelization of queries

In-memory processing gives more time for relatively slow updates to column data

In-memory allows sophisticated calculations in real-time

Hardware acceleration makes sophisticated calculations possible

Each technology works well on its own, but combining them all is the real opportunity — provides all of the upside benefits while mitigating the downsides


Slide 40: Apache Spark
MAP
Reduce
HDFS


MAP
Reduce


Data
Source 2
map()
join()
cache()
transform
Hadoop V1
Spark


Slide 41: Lots of Support for Spark


Slide 42: YARN
HDFS

HANA-Spark Adapter for improved performance between distributed systems
Compiled queries enable applications & data analysis to work more efficiently across nodes

Familiar OLAP experience on Hadoop to derive business insights from big data such as drill-down into HDFS data

Compiled Queries
Spark Adapter
Drill Downs

SAP HANA in-memory platform




Vora

Spark

Vora

Spark


Vora

Spark

HANA-Spark Adaptor

HANA Smart Data Access, UDFs, Others


Extensive programming support for Scala, Python, C, C++, R, and Java allows data scientists to use their tool of choice.

Enable data scientists and developers who prefer Spark R, Spark ML to mash up corporate data with Hadoop/Spark data easily

Optionally, leverage HANA’s multiple data processing engines for developing new insights from business and contextual data.

Spark Extensions

SAP HANA Vora


Slide 43: Persistence & Failover


Slide 44: Next-Generation Chips Are On Their Way
NVM: non-volatile memory


Slide 45: Scale Up
4,294,967,296x
256x
16 bit
32 bit
64 bit
64 kilobytes
4 gigabytes
16 exabytes


Directly addressable memory
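
The figures on this slide follow from the width of a memory address: n address bits can distinguish 2^n bytes. A short Python sketch of the arithmetic follows; the helper names are hypothetical, and units are binary (1 KB = 1,024 bytes).

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses available with the given address width."""
    return 2 ** address_bits


def human_readable(num_bytes):
    """Format a byte count using binary units (1 KB = 1,024 bytes)."""
    units = ["bytes", "KB", "MB", "GB", "TB", "PB", "EB"]
    value, unit = num_bytes, 0
    while value >= 1024 and unit < len(units) - 1:
        value //= 1024
        unit += 1
    return f"{value} {units[unit]}"


limits = {bits: human_readable(addressable_bytes(bits)) for bits in (16, 32, 64)}
growth_32_to_64 = addressable_bytes(64) // addressable_bytes(32)
```

This gives 64 KB, 4 GB, and 16 EB for 16-, 32-, and 64-bit addressing, and the step from 32 to 64 bits multiplies the addressable space by 2^32 = 4,294,967,296, matching the factor shown on the slide.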


Slide 46: What About Scale?
There are now systems with more than half a petabyte of in-memory capacity, and growing…

Slide 47: Balancing Data Temperature and Costs
Hot: data is accessed frequently
Warm: data is not accessed frequently
Cold: data is only accessed sporadically



Volume of data

Performance (and direct cost)


Many different solutions possible
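
One of the many possible ways to act on data temperature is sketched below in Python. Everything specific here is an assumption made for illustration: the access-rate thresholds, the tier names, and the dataset names are hypothetical, and real products use their own policies.

```python
# Hypothetical mapping from temperature to storage tier.
TIER_FOR_TEMPERATURE = {"hot": "memory", "warm": "ssd", "cold": "disk"}


def classify_temperature(accesses_per_day, hot_threshold=100.0, warm_threshold=1.0):
    """Label data by how often it is accessed (thresholds are illustrative)."""
    if accesses_per_day >= hot_threshold:
        return "hot"
    if accesses_per_day >= warm_threshold:
        return "warm"
    return "cold"


def plan_placement(access_rates):
    """Map each dataset name to the storage tier suggested by its temperature."""
    return {name: TIER_FOR_TEMPERATURE[classify_temperature(rate)]
            for name, rate in access_rates.items()}


placement = plan_placement({
    "current_quarter_orders": 5000.0,  # accessed frequently
    "last_year_orders": 12.0,          # accessed less frequently
    "archive_2005": 0.01,              # accessed only sporadically
})
```

The point of such a policy is the trade-off on the slide: the small volume of hot data gets the fastest and most expensive tier, while the large volume of cold data stays on cheaper storage.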


Slide 48: What Type of In-Memory Is The Right One?
 

Complex ROI calculations



Data volumes
Relative costs (?)

Cost of storage

Value of speed


Value of agility


Slide 49: Fast-Moving Market


Slide 50: Hybrid vs. Pure In-Memory Tradeoffs
data duplication vs single source
replicated vs real-time
unpredictable response times vs consistent response times

Slide 51: Top Benefits


Slide 52: Speed
“If things seem under control, you’re just not going fast enough.”
Mario Andretti

Slide 53: Real-Time Operations
Instead of analyzing the shards of glass after the accident, what if you could catch the vase BEFORE it hit the ground?

Slide 54: Agility (Speed of Change)


Slide 55: Simplification = Lower Costs
“In-memory changes the cost equation through simplification. It can help save costs on hardware and software, as well as reduce labor required for administration and development needs. Based on a composite cost model, an in-memory platform can save an organization 37% across hardware, software, and labor costs, depending on various factors.”

Slide 56: Lower Costs
“Don’t let somebody say to you we can’t go in-memory because it’s so much more money. Acquisition costs may be higher. If you calculate out a TCO, it’s going to be less.”
Donald Feinberg, Gartner



Slide 57

The price of light…
…is less than the cost of darkness
ROI = Return On Ignorance?

Slide 58: New, Simpler Infrastructures and Business Models
Weissbeerger Beverage Analytics


Slide 59: Conclusion


Slide 60: Myths & Facts
It’s a niche technology to run analytics faster
It has been around since the late 1990s

The main users of in-memory analytics are SMBs

Entire industries (SaaS, social networks, financial trading, online gaming) would not exist as we know them today without in-memory computing

More than 50 software vendors deliver in-memory technology

Small number of in-memory vendors

Only for deep-pocketed organizations

New and unproven





Myths

Facts


Slide 61: Business Impact of In-Memory Computing
Reducing applications running cost via database/legacy applications offloading
Improving transactional applications performance
Enabling horizontal, elastic scalability (scale up/down)
Boosting response time in analytical applications
Low latency (<1 microsecond) application messaging
Dramatically shortening batch processes execution time
Enabling real-time, "self-service" business intelligence and unconstrained data exploration
Detecting correlations/patterns across millions of events in "a blink of an eye"
Supporting "big data" (big data needs big memory)
Running transactional and analytical applications on the same physical dataset


Run the business

Grow the business

Transform the business

Opportunities:

Business Impact


Slide 62: In-Memory Changes Everything
“In-memory computing will have a long-term, disruptive impact by radically changing users’ expectations, application design principles, products’ architecture and vendors’ strategy.”
— Gartner

Slide 63: Thank you!

