What a long, strange trip it’s been (presentation)

Contents

1. What a long, strange trip it’s been
2. Outline of talk
3. About 18 years ago, …
4. Context: The Web for humans
5. Goal: Web for Machines & Humans
6. What does that mean?
7. How do we get there?
8. Going depth first
9. Timeline of ‘standards’
10. But something was missing …
11. ’07 -- : Rise of the consumers
12. Yahoo Search Monkey
13. Google Rich Snippets: Reviews
14. Google Rich Snippets: Events
15. Google Rich Snippets
16. Situation in 2010
17. Schema.org
18. Schema.org: Major sites
19. Schema.org principles: Simplicity
20. Schema.org principles: Simplicity
21. Schema.org principles: Simplicity
22. Schema.org principles: Incremental
23. Schema.org Principles: URIs
24. + = USA
25. Schema.org Principles: Collaborations
26. Schema.org Principles: Collaborations
27. Schema.org Principles: Partners
28. Recent Additions
29. Recent Additions
30. Looking forward
31. Newer Applications: Knowledge Graph
32. Newer Applications: Knowledge Graph
33. Non search applications: Google Now
34. Pinterest: Schema.org for Rich Pins
35. Reservations ➔ Personal Assistant
36. Vertical Search
37. Google Rich Snippets: Recipe View
38. Web scale vertical search
39. Web Scale custom vertical search
40. Scientific Data Publishing
41. Case study: Clinical Trials
42. Case study: SkyServer
43. First steps for scientific data publication
44. Concluding
45. Questions?

Slide 1: What a long, strange trip it’s been
R.V.Guha
Google

schema.org

Slide 2: Outline of talk
The context
How did we end up where we are

Schema.org
What it is, status of adoption
Schema.org principles, how does it work

Looking ahead
Next Generation Applications

Slide 3: About 18 years ago, …
People started thinking about structured data on the web
A few people from Netscape, Microsoft and W3C got together @MIT

Trying to make sense of a flurry of activity/proposals
XML, MCF, CDF, Sitemaps, …

There were a number of problems
PICS, Meta data, sitemaps, …

But one unifying idea

Slide 4: Context: The Web for humans

HTML

Slide 5: Goal: Web for Machines & Humans

Slide 6: What does that mean?
Notable points

- Graph Data Model
- Common Vocabulary

Slide 7: How do we get there?
How does the author give us the graph?
Data Model: Graph vs tree vs …
Syntax
Vocabulary
Identifiers for objects

Why should the author give us the graph?
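
A minimal sketch of what "giving us the graph" could look like in practice: a page's data expressed as labeled edges between identified objects. The node identifiers, the property names, and the helper functions are assumptions made for illustration; they are not taken from the talk.

```python
from collections import defaultdict

# Hypothetical node identifiers and property names, chosen for illustration.
TRIPLES = [
    ("person/1", "type", "Person"),
    ("person/1", "name", "Chuck Norris"),
    ("person/1", "birthPlace", "place/1"),
    ("place/1", "type", "Place"),
    ("place/1", "name", "Ryan, Oklahoma"),
]


def build_graph(triples):
    """Index labeled edges by their source node."""
    graph = defaultdict(list)
    for subject, prop, value in triples:
        graph[subject].append((prop, value))
    return graph


def values(graph, node, prop):
    """Return every value reached from `node` along edges labeled `prop`."""
    return [v for p, v in graph[node] if p == prop]


graph = build_graph(TRIPLES)
# Follow one edge to another node, then read a property of that node.
birthplace = values(graph, "person/1", "birthPlace")[0]
print(values(graph, birthplace, "name"))
```

Unlike a tree, nothing here stops two different nodes from pointing at the same place node, which is why identifiers for objects appear on the list above.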

Slide 8: Going depth first
Many heated battles
Lots of proposals, standards, companies, …

Data model
Trees vs DLGs (directed labeled graphs) vs Vertical specific vs who needs one?

Syntax
XML vs RDF vs JSON vs …

Model theory, anyone?
We need one vs who cares vs what’s that?

Slide 9: Timeline of ‘standards’
’96: Meta Content Framework (MCF) (Apple)
’97: MCF using XML (Netscape) → RDF, CDF
’99 -- : RDF, RDFS
’01 -- : DAML, OWL, OWL EL, OWL QL, OWL RL
’03: Microformats
And many, many more … SPARQL, Turtle, N3, GRDDL, R2RML, FOAF, SIOC, SKOS, …

Lots of bells & whistles: model theory, inference, type systems, …

Slide 10: But something was missing …
Fewer than 1000 sites were using these standards

Something was clearly missing and it wasn’t more language features

We had forgotten the ‘Why’ part of the problem

The RSS story

Slide 11: ’07 -- : Rise of the consumers
Yahoo! Search Monkey, Google Rich Snippets, Facebook Open Graph

Offer webmasters a simple value proposition

Search engines to webmasters:
You give us data … we make your results nicer

Usage begins to take off
1000x increase in marked-up pages in 3 years

Slide 12: Yahoo Search Monkey
Give websites control over snippet presentation
Moderate adoption
Targeted at high end developers
Too many choices

Slide 13: Google Rich Snippets: Reviews

Slide 14: Google Rich Snippets: Events

Slide 15: Google Rich Snippets
Multi-syntax
Ad hoc vocabulary for each vertical
Very clear carrot
Lots of experimentation on UI
Moderately successful: 10ks of sites
Scaling issues with vocabulary

Slide 16: Situation in 2010
Too many choices/decisions for webmasters
Divergence in vocabularies
Too much fragmentation
N versions of person, address, …

A lot of bad/wrong markup
~25% for microformats, ~40% with RDFa
Some spam, mostly unintended mistakes

Absolute adoption numbers still rather low
Less than 100k sites

Slide 17: Schema.org
Work started in August 2010
Google, Yahoo!, Microsoft & then Yandex

Goals:
One vocabulary understood by all the search engines
Make it very easy for the webmaster

It is A vocabulary. Not The vocabulary.
Webmasters can use it together with other vocabs
We might not understand the other vocabs. Others might

Slide 18: Schema.org: Major sites
News: Nytimes, guardian.com, bbc.co.uk
Movies: imdb, rottentomatoes, movies.com
Jobs / careers: careerjet.com, monster.com, indeed.com
People: linkedin.com
Products: ebay.com, alibaba.com, sears.com, cafepress.com, sulit.com, fotolia.com
Videos: youtube, dailymotion, frequency.com, vinebox.com
Medical: cvs.com, drugs.com
Local: yelp.com, allmenus.com, urbanspoon.com
Events: wherevent.com, meetup.com, zillow.com, eventful
Music: last.fm, myspace.com, soundcloud.com

Slide 19: Schema.org principles: Simplicity
Simple things should be simple
For webmasters, not necessarily for consumers of markup
Webmasters shouldn’t have to deal with N namespaces

Complex things should be possible
Advanced webmasters should be able to mix and match vocabularies

Syntax
Microdata, usability studies
RDFa, JSON-LD, …
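
To make the syntax point concrete, here is a hedged sketch of one of the listed syntaxes, JSON-LD, generated from Python. The event details and the helper name `to_jsonld_script` are invented for illustration; `Event`, `Place`, `name`, `startDate` and `location` are schema.org terms.

```python
import json


def to_jsonld_script(item: dict) -> str:
    """Wrap a schema.org description in a JSON-LD <script> element.

    Sketch only: values containing "</script>" are not escaped here.
    """
    doc = {"@context": "https://schema.org", **item}
    return (
        '<script type="application/ld+json">\n'
        + json.dumps(doc, indent=2)
        + "\n</script>"
    )


# Hypothetical event, used only to show the shape of the markup.
event = {
    "@type": "Event",
    "name": "Example Concert",
    "startDate": "2015-01-01",
    "location": {"@type": "Place", "name": "Example Hall"},
}

snippet = to_jsonld_script(event)
print(snippet)
```

The webmaster deals with a single context and a handful of terms, which is the "simple things should be simple" case; mixing in other vocabularies would mean extending the context.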

Slide 20: Schema.org principles: Simplicity
Can’t expect webmasters to understand Knowledge Representation, Semantic Web Query Languages, etc.

It has to fit in with existing workflows
A posteriori ‘markup tools’ don’t work

Avoid KR system driven artifacts
Multiple domain / range for attributes
No classes like ‘Agent’
Categories and attributes should be concrete

Slide 21: Schema.org principles: Simplicity
Copy and edit as the default mode for authors
It is not a linear spec, but a tree of examples

Vocabularies
Authors only need to have a local view
But schema.org tries to have a single global coherent vocabulary

Slide 22: Schema.org principles: Incremental
Started simple
~100 categories at launch

Applies to every area
Add complexity after adoption
Now ~1200 vocab items
Go back and fill in the blanks

Move fast, accept mistakes, iterate fast

Slide 23: Schema.org Principles: URIs
~1000s of terms like Actor, birthdate
~10s for most sites
Common across sites

~10ks of terms like USA
External enumerations

~1b-100b terms like Chuck Norris and Ryan, Oklahoma
Cannot expect agreement on these
Reference by description
Consumers can reconcile entity references
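
One way to picture "reference by description" is that each site describes an entity with a few properties instead of agreeing on a URI, and the consumer groups descriptions that match. The sketch below is a toy: the matching key, the source names and the second birthplace are assumptions, `birthPlace` is simplified to a plain string, and real reconciliation is far fuzzier than exact matching.

```python
def description_key(entity: dict) -> tuple:
    """Normalize the properties used to decide whether two descriptions match."""
    fields = ("@type", "name", "birthPlace")
    return tuple(str(entity.get(f, "")).strip().lower() for f in fields)


def reconcile(descriptions):
    """Group descriptions that share the same normalized key."""
    clusters = {}
    for description in descriptions:
        clusters.setdefault(description_key(description), []).append(description)
    return list(clusters.values())


# Hypothetical descriptions published independently by three sites.
descriptions = [
    {"@type": "Person", "name": "Chuck Norris",
     "birthPlace": "Ryan, Oklahoma", "source": "site-a.example"},
    {"@type": "Person", "name": "chuck norris ",
     "birthPlace": "Ryan, Oklahoma", "source": "site-b.example"},
    {"@type": "Person", "name": "Chuck Norris",
     "birthPlace": "Elsewhere", "source": "site-c.example"},
]

clusters = reconcile(descriptions)
for cluster in clusters:
    print([d["source"] for d in cluster])
```

No site had to mint or look up an identifier for the person; the consumer carries the cost of deciding that the first two descriptions refer to the same entity.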

Slide 24
[Image-only slide; the surviving text reads “+ = USA”]

Slide 25: Schema.org Principles: Collaborations
Most discussions on public W3C lists

Work closely with interest communities

Work with others to incorporate their vocabularies
We give them attribution on schema.org
Webmasters should not have to worry about where each piece of the vocabulary came from
Webmasters can mix and match vocabs

Slide 26: Schema.org Principles: Collaborations
IPTC / NYTimes / Getty with rNews
Martin Hepp with GoodRelations
US Veterans, Whitehouse, Indeed.com with Job Posting
Creative Commons with LRMI
NIH National Library of Medicine for Medical vocab.
Bibextend, Highwire Press for Bibliographic vocabulary
Benetech for Accessibility
BBC, European Broadcasting Union for TV & Radio schema
Stackexchange, SKOS group for message board
Lots and lots and lots of individuals

Slide 27: Schema.org Principles: Partners
Partner with authoring platforms
Drupal, Wordpress, Blogger, YouTube

Drupal 8
Schema.org markup for many types
News articles, comments, users, events, …
More schema.org types can be created by site author
Markup in HTML5 & RDFa Lite
Will come out early 2015

Slide 28: Recent Additions
From Nouns to Verbs: Actions
Object → potential actions
Constraints on actions
E.g., ThorMovie → Stream, Buy, …

Introducing time: Roles
E.g., Joe Montana played for the SF 49ers from 1979 to 1992 in the position QuarterBack
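
The Joe Montana example can be sketched with the schema.org Role pattern, where a role node sits between the team and the person and carries the dates and the position. The exact choice of `SportsTeam`, `OrganizationRole` and `member` below is my reading of that pattern rather than something stated on the slide, and `active_in` is a hypothetical helper.

```python
import json

# The role node sits between the team and the person and holds the
# time span and position that a plain team -> person edge cannot carry.
team = {
    "@context": "https://schema.org",
    "@type": "SportsTeam",
    "name": "San Francisco 49ers",
    "member": {
        "@type": "OrganizationRole",
        "member": {"@type": "Person", "name": "Joe Montana"},
        "startDate": "1979",
        "endDate": "1992",
        "roleName": "Quarterback",
    },
}


def active_in(role: dict, year: int) -> bool:
    """Check whether `year` falls inside the role's start and end years."""
    return int(role["startDate"]) <= year <= int(role["endDate"])


role = team["member"]
print(json.dumps(team, indent=2))
print(active_in(role, 1985))
```

A consumer that ignores roles still sees a `member` edge from the team; one that understands them can answer time-scoped questions.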

Slide 29: Recent Additions
Scholarly work, Comics, Serials, …
Communications: TV, Radio, Q&A, …
Accessibility
Commerce: Reservations
Sports
Buyer/Seller, etc.
BibTeX

The ontology is growing …
~800 properties
~600 classes

Slide 30: Looking forward
Schema.org is doing better than we expected
Thanks to millions of webmasters!

But this is not the final goal
Just the means to the next generation of applications

First generation of applications
Rich presentation of search results

Many new applications
Related to search and beyond

Slide 31: Newer Applications: Knowledge Graph

Slide 32: Newer Applications: Knowledge Graph

Slide 33: Non search applications: Google Now
User profile (google.com/now/topics) + structured data feeds

Slide 34: Pinterest: Schema.org for Rich Pins

Slide 35: Reservations ➔ Personal Assistant
Open Table website → confirmation email → Android Reminder

Slide 36: Vertical Search
Structured data in search
Web search: annotate search results
OR
Filtering based on structured data
Only in specialized corpus
Ecommerce, real estate, etc.

How about filtering based on structured data across the web?

Slide 37: Google Rich Snippets: Recipe View

Slide 38: Web scale vertical search
Searching for Veteran friendly jobs

Slide 39: Web Scale custom vertical search
Build your own custom vertical search engine
Google does the heavy lifting: crawling, indexing, etc.
You specify the schema.org restricts
APIs to help build your own UI

Searches over all pages on the web with a certain schema.org markup

Demo
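
As a rough illustration of a "schema.org restrict", the sketch below filters a tiny in-memory corpus down to pages that carry markup of a given type. The corpus, its layout and the function name are hypothetical; this is not the API of the service described on the slide, where crawling and indexing happen at web scale.

```python
# Hypothetical toy corpus: each page lists the schema.org items found in it.
PAGES = [
    {"url": "https://jobs.example/1",
     "items": [{"@type": "JobPosting", "title": "Mechanic"}]},
    {"url": "https://food.example/2",
     "items": [{"@type": "Recipe", "name": "Soup"}]},
    {"url": "https://jobs.example/3",
     "items": [{"@type": "JobPosting", "title": "Nurse"},
               {"@type": "Organization", "name": "Example Clinic"}]},
]


def restrict_by_type(pages, schema_type):
    """Return URLs of pages with at least one item of the given type."""
    return [
        page["url"]
        for page in pages
        if any(item.get("@type") == schema_type for item in page["items"])
    ]


print(restrict_by_type(PAGES, "JobPosting"))
```

The restriction works on the markup rather than on page text, which is what lets the same mechanism serve any vertical that has a schema.org type.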

Slide 40: Scientific Data Publishing
US Govt alone spends over $60B/yr on scientific research

Primary output of most of this research is data
Most of the data is thrown away
All that is published are papers

We would like the data published in an easily reusable form

Slide 41: Case study: Clinical Trials
Clinical trials
4000+ clinical trials at any time in the US alone
Almost all the data ‘thrown away’
All that gets published is a textual ‘abstract’

Many of the trials are redundant
Earlier trials have the data
Assumptions, etc. cannot be re-examined
Longitudinal studies extremely hard, but super important

Having all the clinical trial data on the web, in a common schema, will make this much easier!

Slide 42: Case study: SkyServer
Huge amount of astronomy data

Jim Gray, NASA and others brought it all together, normalized it and made it available on the web

Has changed the way astronomy research takes place
Students in Africa getting PhDs without leaving Africa!
Radio/Ultra-violet/Visible light data easily brought together

Caveats
SQL biased, not distributed, not scalable
All normalization done by hand, once
Small number of data sources
But shows that it can be done …

Slide 43: First steps for scientific data publication
OSTP directive for data from federally funded research to be freely available

Formation of new ‘Data Science’ institute inside NIH

Seeing traction in scientific data on the web
Lot of interest in creating schemas
Public repositories for scientific data starting

Slide 44: Concluding
Structured data on the web is now ‘web scale’

Schema.org has got traction and is evolving

The most interesting applications are yet to come

Slide 45: Questions?
