Humans By The Hundred (presentation)

Slide 1: Humans By The Hundred
Scaling Big Data for Big Team Growth


Slide 2: $ whoami
SRE Manager at Yelp
CWRU Alum
Pittsburgh native


Slide 3

Yelp’s Mission:
Connecting people with great local businesses.


Slide 4

Yelp Stats:
As of Q2 2015


Slide 5: What is Yelp?
Many sites: www, m, biz, api
Mobile apps
Partner platform
Hundreds of developers
Thousands of servers

Slide 6: Why Am I Here?


Slide 9: This talk is about people


Slide 17: The Goal


Slide 18: Iterate as fast as possible


Slide 19: Regardless of how many people are participating


Slide 20: Deployment


Slide 21: How It Starts


Slide 22: Deployment: the early days
Get a few people together in Slack, IRC, etc.
Merge up the code
Run the tests
Manually test it in stage
Cross your fingers

Slide 25: Things get slower...
Tests take longer to run
More hosts = longer downloads
More developers = more eyeballs
More features = more code

Slide 26: The Problem: Humans Are Fallible


Slide 27: The Problem: Humans Are Fallible
“…oh @$#&”


Slide 29: The Problem, With Math
Assume:
Every change has a chance of success: 98%
That means no test failures, no reverts, etc.
Every deploy has a number of changes: n
Any failure in the pipeline invalidates the deploy
Let’s figure out the probability of a successful deployment: p

Slide 30: The Problem, With Math
Only you
p = .98 (98%)
You and a friend
p = .98 * .98 = .96 (96%)
You and nine co-workers
p = .98 * .98 * .98 * … * .98 = .82 (82%)

Slide 31: The Problem, With Math
p = (.98)^n


Slide 32: The Problem, With Math
p = (.98)^n
exponential decay!
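
A minimal sketch of the formula above, assuming (as the slides do) that every change succeeds independently with probability .98. The function name and the `change_success_rate` parameter are illustrative, not from the talk.

```python
def deploy_success_probability(n, change_success_rate=0.98):
    """Probability that a deploy containing n independent changes has no failure."""
    return change_success_rate ** n


# The three cases from the slides (1, 2 and 10 changes), plus two larger deploys.
for n in (1, 2, 10, 50, 100):
    print(f"n={n:>3}  p={deploy_success_probability(n):.2f}")
```

Under this model a deploy with 50 changes succeeds only about 36% of the time, which is the decay the slide is pointing at.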


Slide 34: This doesn’t scale!
More developers = more changes
More changes = longer deploys
Longer deploys = less time to develop
Less time to develop = slower to iterate
Slower to iterate != the goal

Slide 35: Mitigating Exponential Decay
p = (.98)^n


Slide 36: Mitigating Exponential Decay
p = (.98)^n


Slide 38: Making it harder to screw up
Write more tests
Write better tests
Get better code reviews
Get better infrastructure
Switch programming languages
Use better tools

Slide 39: Just write better software and stop making mistakes!


Slide 40: PROBLEM SOLVED


Slide 42: The Real World
Testing builds confidence in our changes
Testing does not protect you from failure
Better tools, tests, and infrastructure can raise our success rates

Slide 43: Mitigating Exponential Decay
p = (.98)^n


Slide 44: Mitigating Exponential Decay
p = (.98)^n


Slide 45: Service-Oriented Architecture
Large monolith → smaller services
Services communicate over network
Usually HTTP, but you can do RPC, SOAP, etc.
Service = independent code base
Independent deployments

Slide 46: Service-Oriented Architecture
Benefits
Smaller code bases = upper bound to n
Failure domains become isolated
Technology independence
Federated responsibility
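
One way to see why an upper bound on n helps is the rough sketch below. It reuses the talk's 98% per-change success rate and adds two assumptions that are not in the slides: a failed deploy ships none of its changes, and deploys of different services fail independently. The function name is hypothetical.

```python
def expected_changes_shipped(total_changes, changes_per_deploy, rate=0.98):
    """Average number of changes that ship when total_changes are split
    into equally sized deploys and a failed deploy ships nothing."""
    deploys = total_changes // changes_per_deploy
    p_deploy_ok = rate ** changes_per_deploy
    return deploys * changes_per_deploy * p_deploy_ok


monolith = expected_changes_shipped(50, 50)  # one deploy carrying 50 changes
services = expected_changes_shipped(50, 5)   # ten independent deploys of 5 changes
print(f"one 50-change deploy: {monolith:.1f} changes shipped on average")
print(f"ten 5-change deploys: {services:.1f} changes shipped on average")
```

With these assumptions the split deploys ship roughly 45 of the 50 changes on average, against roughly 18 for the single large deploy.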

Slide 47: Service-Oriented Architecture
Drawbacks
Everything becomes decoupled
Function calls start looking like HTTP requests
Versioning can be a nightmare
Tracking dependencies is hard
Data consistency becomes challenging
End-to-end testing becomes hard(er), if not impossible

Slide 48: SOA scales people, not code.


Slide 49: Conquering SOA
With the monolith, it’s easy to focus on mean time between failures (MTBF)

Slide 50: Conquering SOA
In a SOA, focus on mean time to recovery (MTTR)


Slide 51: Conquering SOA
Fail fast
Anticipate failure
Leverage iteration speed to recover fast


Slide 52: Conquering SOA
Treat everything as distributed
That means everything will fail
Use timeouts, retries
Find ways to degrade gracefully
Fail fast & isolated
Don’t rely on synchronous processes
Prepare for eventual consistency
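
A minimal sketch of the "timeouts, retries, degrade gracefully" advice above. It is an illustration under assumptions, not Yelp's implementation: `call_with_retries`, its parameters and the two fake services are all hypothetical, and the wrapped operation is assumed to accept a `timeout` argument and raise `TimeoutError` or `ConnectionError` on failure.

```python
import time


def call_with_retries(operation, attempts=3, timeout=0.5, backoff=0.1, fallback=None):
    """Call operation(timeout=...) up to `attempts` times with exponential
    backoff between tries; return `fallback` if every attempt fails."""
    for attempt in range(attempts):
        try:
            return operation(timeout=timeout)
        except (TimeoutError, ConnectionError):
            if attempt < attempts - 1:
                time.sleep(backoff * 2 ** attempt)
    # Degrade gracefully: the caller gets a usable default instead of an exception.
    return fallback


calls = {"count": 0}


def flaky_reviews_service(timeout):
    """Hypothetical dependency that times out twice, then answers."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError(f"no response within {timeout}s")
    return ["great tacos"]


def dead_service(timeout):
    """Hypothetical dependency that is down."""
    raise ConnectionError("service unavailable")


print(call_with_retries(flaky_reviews_service, backoff=0.01))
print(call_with_retries(dead_service, backoff=0.01, fallback=[]))
```

Returning a fallback keeps the page rendering with an empty section when a dependency is down. Whether that is acceptable depends on the feature, so the fallback is left to the caller.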

Slide 53: Reaping the Benefits
Smaller failure domains
Fewer people & changes to manage
Deploys get smaller
Deploys get faster
Deploys become continuous

Slide 54: Reaping the Benefits
Smaller changes
means smaller code reviews
means faster validation
means smaller blast radius
means faster iteration

Slide 55: Continuous Delivery
Everyone works against the master branch
Master is deployed when commits are added
Deployment is gated by tests
Monitoring knows something is wrong before you do!

Slide 56: PROBLEM SOLVED


Slide 57: Testing


Slide 58: Tests are hard to get right.


Slide 65: How can we do better?


Slide 67: “Not Recommended” Tests


Slide 68: “Not Recommended” Tests
If a test fails on master:
a feature is broken on the live website, or
your test sucks and you should ditch it
In either case, we disable it
A ticket is created
Developers can fix it later or just bin it and start fresh
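
The policy above could be sketched roughly as follows. This is an assumption-laden illustration, not Yelp's tooling: the class `NotRecommendedList`, its methods and the ticket text are hypothetical, and results are assumed to arrive as a mapping of test name to pass/fail.

```python
from dataclasses import dataclass, field


@dataclass
class NotRecommendedList:
    """Tracks tests that failed on master and are disabled until someone fixes them."""
    disabled: set = field(default_factory=set)
    tickets: list = field(default_factory=list)

    def record_master_run(self, results):
        """results maps test name -> True (passed) or False (failed)."""
        for name, passed in results.items():
            if not passed and name not in self.disabled:
                self.disabled.add(name)
                # A real system would file this in an issue tracker.
                self.tickets.append(f"Fix or delete disabled test: {name}")

    def tests_to_run(self, all_tests):
        """Return the tests that still gate deploys."""
        return [name for name in all_tests if name not in self.disabled]


not_recommended = NotRecommendedList()
not_recommended.record_master_run({"test_search": True, "test_checkout": False})
print(not_recommended.tests_to_run(["test_search", "test_checkout"]))
print(not_recommended.tickets)
```

A disabled test stops blocking deploys immediately, and the ticket keeps the fix-or-delete decision visible.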

Slide 69: Reliable tests >> test coverage.


Slide 70: Don’t always run all the tests!


Slide 71: Tests of external services should be monitoring


Slide 72: Define your boundaries.


Slide 73
yelp.com/dataset_challenge
61K businesses
61K checkin-sets
481K business attributes
1.6M reviews
366K users
2.8M edge social-graph
495K tips

Your academic project, research or visualizations, submitted by Dec 31, 2015 = $5,000 prize + $1,000 for publication + $500 for presenting*

*See full terms on website

Academic dataset from 10 cities in 4 countries!


Slide 74
@YelpEngineering
YelpEngineers
engineeringblog.yelp.com
github.com/yelp


Slide 75
yelp.com/careers


Slide 76: Questions?

