The document discusses two graph data models: RDF and Property Graphs. It provides an overview of each model, including examples and technologies used to access and query each type of graph. The key conclusions are that RDF emphasizes information modeling and knowledge graphs, while Property Graphs emphasize data syntax and graph algorithms. Simply layering the data models on top of each other leads to dissatisfaction, but the models could potentially share technologies while still keeping their separate focuses and tools.
SPARQL is a query language for retrieving and manipulating data stored in RDF format. It allows users to write queries against remote SPARQL endpoints to query RDF triples stored in a database. SPARQL queries are composed of triple patterns, similar to RDF triples, that can include variables to retrieve variable bindings from the queried data. Query results are returned as solutions that assign values to the variables. Common queries include SELECT, ASK, CONSTRUCT, and DESCRIBE. SPARQL endpoints provide programmatic access to issue SPARQL queries against remote SPARQL-accessible stores.
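As a minimal sketch of such a query (the FOAF vocabulary here is illustrative, not taken from any particular deck), a SELECT with one triple pattern and two variables:

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Each solution binds ?person and ?name for a resource that has a foaf:name
SELECT ?person ?name
WHERE {
  ?person foaf:name ?name .
}
LIMIT 10
```

Sent to a SPARQL endpoint over HTTP, this returns a table of variable bindings, one row per matching triple.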
The document discusses using JSON-LD and RDF to add semantic meaning to web APIs while maintaining compatibility with existing JSON formats. It explains how RDF uses triples to make statements about resources, and how JSON-LD allows embedding RDF semantics in JSON without changing the format. This allows merging data from multiple sources and facilitates data interchange and evolution of schemas over time.
2011 4IZ440 Semantic Web – RDF, SPARQL, and software APIs, by Josef Petrák
The document discusses the Semantic Web and RDF data formats. It provides an overview of RDF syntaxes like RDF/XML, N3, N-Triples, RDF/JSON, and RDFa. It also discusses software APIs for working with RDF data in languages like Java, PHP, and Ruby. The document outlines handling RDF data using statement-centric, resource-centric, and ontology-centric models, as well as named graphs. It provides examples of reading RDF data from files and querying RDF data using SPARQL.
Semantic Web technologies (such as RDF and SPARQL) excel at bringing together diverse data in a world of independent data publishers and consumers. Common ontologies help to arrive at a shared understanding of the intended meaning of data.
However, they don’t address one critically important issue: What does it mean for data to be complete and/or valid? Semantic knowledge graphs without a shared notion of completeness and validity quickly turn into a Big Ball of Data Mud.
The Shapes Constraint Language (SHACL), an upcoming W3C standard, promises to help solve this problem. By keeping semantics separate from validity, SHACL makes it possible to resolve a slew of data quality and data exchange issues.
Presented at the Lotico Berlin Semantic Web Meetup.
ShEx is a language for validating RDF data. It allows defining shapes that specify constraints on nodes and triples. ShEx expressions can be used to validate if RDF graphs conform to the defined shapes. The ShEx language is inspired by languages like RelaxNG and provides different serialization formats like ShExC, ShExJ, and ShExR. There are open-source implementations of ShEx validators in languages like JavaScript, Scala, Ruby, Python, and Java. ShEx provides a concise way to define RDF shapes and validate instance data against those shapes.
Although RDF is a cornerstone of the semantic web and knowledge graphs, it has not been embraced by everyday programmers and software architects who need to safely create and access well-structured data. There is a lack of the common tools and methodologies that are available in more conventional settings to improve data quality by defining schemas that can later be validated. Two technologies have recently been proposed for RDF validation: Shape Expressions (ShEx) and the Shapes Constraint Language (SHACL). In the talk, we will review the history and motivation of both technologies. We will also enumerate some challenges and future work with regard to RDF validation.
SPARQL is a standard query language for RDF that has undergone two iterations (1.0 and 1.1) through the W3C process. SPARQL 1.1 includes updates to RDF stores, subqueries, aggregation, property paths, negation, and remote querying. It also defines separate specifications for querying, updating, protocols, graph store protocols, and federated querying. Apache Jena provides implementations of SPARQL 1.1 and tools like Fuseki for deploying SPARQL servers.
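Two of the headline SPARQL 1.1 additions can be combined in one short query; this is a hedged sketch using the FOAF vocabulary (not from the deck): the property path `foaf:knows+` walks chains of acquaintance, and COUNT/GROUP BY aggregates the results.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# SPARQL 1.1: property path (one or more foaf:knows hops) plus aggregation
SELECT ?person (COUNT(DISTINCT ?reachable) AS ?networkSize)
WHERE {
  ?person foaf:knows+ ?reachable .
}
GROUP BY ?person
ORDER BY DESC(?networkSize)
```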
FedX - Optimization Techniques for Federated Query Processing on Linked Data, by aschwarte
The final slides of our talk about FedX at the 10th International Semantic Web Conference in Bonn. For details about FedX see http://www.fluidops.com/fedx/
SPIN is a vocabulary that represents SPARQL queries and constraints as RDF triples. This allows SPARQL queries to be stored and shared on the semantic web. SPIN can be used to define SPARQL constraints, rules, functions and reusable query templates. Storing SPARQL queries as RDF triples provides benefits like referential integrity, managing namespaces centrally, and facilitating the easy sharing of queries on the semantic web.
SPARQL is a standardized query language for retrieving and manipulating data stored in RDF format. It was created by the RDF Data Access Working Group to provide querying of RDF stores. SPARQL supports four query forms: SELECT, CONSTRUCT, DESCRIBE, and ASK. It also defines a protocol for executing queries over HTTP. SPARQL has become a key technology for working with semantic data on the web.
A follow-up to our 2011 presentation on the new Linked Open Digital Library, discussing how we are creating a digital library centered around Linked Open Data. Includes details on how we are creating a dataset of botanists and their publications that is to be shared as linked open data.
SPARQL 1.1 Tutorial, given at UChile by Axel Polleres (DERI), posted by net2-project
This document provides an introduction to SPARQL 1.1. It begins by explaining that SPARQL is a query language for the semantic web that allows users to query RDF data stores similarly to how SQL queries relational databases. It then describes SPARQL 1.0, the initial standard version, and the new features being added in SPARQL 1.1, including aggregate functions, subqueries, property paths and federated querying. The document concludes by discussing SPARQL implementations and the status of the 1.1 specification.
Federated SPARQL query processing over the Web of Data, by Muhammad Saleem
The document discusses approaches for federating SPARQL queries over the web of data. It describes SPARQL endpoint federation, linked data federation, and distributed hash tables approaches. It also discusses techniques for optimizing query federation, including query rewriting, source selection, join order selection, and join implementations. Source selection algorithms discussed include index-free using SPARQL ASK queries, index-only using data summaries, and hybrid approaches.
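The SPARQL 1.1 SERVICE keyword is the basis of endpoint federation; a sketch (the endpoint URL is a placeholder, and the FOAF vocabulary is illustrative):

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Evaluate part of the pattern locally, delegate the rest to a remote endpoint.
# Index-free source selection can first probe each endpoint with e.g.
#   ASK { ?s foaf:homepage ?o }
SELECT ?name ?homepage
WHERE {
  ?person foaf:name ?name .
  SERVICE <http://example.org/sparql> {
    ?person foaf:homepage ?homepage .
  }
}
```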
Compare and contrast RDF triple stores and NoSQL: are triples stores NoSQL or not?
Talk given 2011-09-08 to the BigData/NoSQL meetup at Bristol University.
The document summarizes the key changes and additions in the RDF 1.1 specification, including:
1) Support for named graphs, also known as RDF datasets or quads, to represent multiple RDF graphs each with a unique name;
2) Additional datatypes like durations and date/time stamps;
3) JSON-LD as a new syntax for serializing RDF in JSON; and
4) Some controversial proposals like deprecating features and allowing literals as subjects that did not make the final specification.
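Named graphs are directly queryable in SPARQL; a sketch (the property IRI and dataset contents are illustrative):

```sparql
# GRAPH binds ?g to the name of each graph in the dataset
# that contains a matching triple.
SELECT ?g ?s ?name
WHERE {
  GRAPH ?g {
    ?s <http://xmlns.com/foaf/0.1/name> ?name .
  }
}
```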
SPARQL is a query language for retrieving and manipulating data stored in RDF format. It is a W3C recommendation similar to SQL for relational databases. SPARQL queries contain SELECT, FROM and WHERE clauses to identify result variables, specify the RDF dataset, and provide a basic graph pattern to match against the data. SPARQL can be used to query RDF knowledge bases and retrieve variable bindings or boolean results. Query results are returned in XML format according to the SPARQL Query Results specification.
This document contains slides from a presentation by Pedro Szekely on RDF and related Semantic Web topics. The slides cover Unicode, URLs, URIs, namespaces, XML, XML Schema, RDF graphs, RDF syntaxes including XML and Turtle formats, and comparisons between XML and RDF. Key topics include using URIs to identify resources on the web, representing information as subject-predicate-object triples in RDF graphs, combining vocabularies using namespaces, and leveraging XML tools while making RDF more human-readable.
An introduction to Semantic Web and Linked Data, by Fabien Gandon
Here are the steps to answer this SPARQL query against the given RDF base:
1. The query asks for all ?name values where there is a triple with predicate "name" and another triple with the same subject and predicate "email".
2. In the base, _:b is the only resource that has both a "name" and "email" triple.
3. _:b has the name "Thomas".
Therefore, the only result of the query is ?name = "Thomas".
So the result of the SPARQL query is:
?name
"Thomas"
This document provides an overview of software architectures for semantic web applications, including local access, mixed access, and remote access architectures. Local access architectures involve storing and querying RDF data locally using a triplestore and API. Remote access architectures involve querying RDF data owned by a third party using the SPARQL Protocol over HTTP or SOAP. The SPARQL Protocol is an abstract specification for remotely executing SPARQL queries in a standards-based way.
This document provides an overview of the Resource Description Framework (RDF). It begins with background information on RDF including URIs, URLs, IRIs and QNames. It then describes the RDF data model, noting that RDF is a schema-less data model featuring unambiguous identifiers and named relations between pairs of resources. It also explains that RDF graphs are sets of triples consisting of a subject, predicate and object. The document also covers RDF syntax using Turtle and literals, as well as modeling with RDF. It concludes with a brief overview of common RDF tools including Jena.
Talk about Exploring the Semantic Web, and particularly Linked Data, and the Rhizomer approach. Presented August 14th 2012 at the SRI AIC Seminar Series, Menlo Park, CA
This document summarizes Rodrigo Dias Arruda Senra's 2012 doctoral thesis defense at the University of Campinas. The thesis studied how to organize digital information for sharing across heterogeneous systems and proposed three main contributions: 1) SciFrame, a conceptual framework for scientific digital data processing; 2) database descriptors to enable loose coupling between applications and database management systems; and 3) organographs, a method for explicitly organizing information based on tasks.
The document provides an introduction to Semantic Web and Linked Data. It discusses key concepts such as RDF, which represents data as subject-predicate-object triples that can be connected to form a graph. RDF has several syntaxes including XML, Turtle, and JSON. Properties in RDF triples can link to other resources or contain literal values. Types are identified with URIs and vocabularies are designed to be extensible. The goal of Linked Data is to publish structured data on the web and link it to other data to form a global data web.
This query will not return any results. The pattern specified in the WHERE clause contains two triples, but the second triple contains a syntax error: it is missing the property between ?x and ?email. A valid property such as email would need to be specified, for example:
# illustrative namespace for the example properties
PREFIX : <http://example.org/ns#>
SELECT ?name WHERE {
  ?x :name ?name .
  ?x :email ?email
}
This query will select and return the ?name of any resources ?x that have both a name and email property specified.
W3C Tutorial on Semantic Web and Linked Data at WWW 2013, by Fabien Gandon
The document provides an introduction to Semantic Web and Linked Data. It discusses key concepts such as RDF, which represents data as subject-predicate-object triples that can be connected to form a graph. RDF has several syntaxes including XML, Turtle, and JSON. Properties in RDF triples can link to other resources or contain literal values. Types are identified with URIs and vocabularies are extensible. The goal of Linked Data is to publish structured data on the web and link it to other data to form a global data web.
This document provides an overview of semantic web technologies including the Resource Description Framework (RDF), which is a standard model for data interchange on the web. It describes RDF concepts like triples, graphs, and syntaxes like RDF/XML and Turtle. It also covers ontologies, SPARQL for querying RDF data, linked data, and tools for working with RDF and semantic web technologies.
- SPARQL is a query language for retrieving and manipulating data stored in RDF format. It is similar to SQL but for RDF data.
- SPARQL queries contain prefix declarations, specify a dataset using FROM, and include a graph pattern in the WHERE clause to match triples.
- The main types of SPARQL queries are SELECT, ASK, DESCRIBE, and CONSTRUCT. SELECT returns variable bindings, ASK returns a boolean, DESCRIBE returns a description of a resource, and CONSTRUCT generates an RDF graph.
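To make the contrast concrete, two hedged sketches (shown together, but each is a separate query; the vocabularies are illustrative):

```sparql
PREFIX foaf:  <http://xmlns.com/foaf/0.1/>
PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>

# ASK: returns a boolean -- is there any foaf:knows triple at all?
ASK { ?a foaf:knows ?b }

# CONSTRUCT (a separate query): returns a new RDF graph built from the
# template, translating foaf:name triples into vcard:fn triples.
CONSTRUCT { ?p vcard:fn ?name }
WHERE     { ?p foaf:name ?name }
```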
SPARQL 1.1 introduced several new features including:
- Updated versions of the SPARQL Query and Protocol specifications
- A SPARQL Update language for modifying RDF graphs
- A protocol for managing RDF graphs over HTTP
- Service descriptions for describing SPARQL endpoints
- Basic federated query capabilities
- Other minor features and extensions
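A sketch of the Update language (the resource IRIs are illustrative); one request may chain several operations with `;`:

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# Insert a ground triple ...
INSERT DATA {
  <http://example.org/alice> foaf:name "Alice" .
} ;

# ... then rewrite every matching triple in place.
DELETE { ?p foaf:name "Alice" }
INSERT { ?p foaf:name "Alice Smith" }
WHERE  { ?p foaf:name "Alice" }
```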
SPARQL is a standard query language for retrieving and manipulating data stored in RDF format. It consists of three parts: a query language, a result format, and an access protocol. The query language uses graph patterns to match against RDF graphs. It supports keywords like SELECT, FROM, and WHERE to identify values to return, data sources, and triple patterns to match. SPARQL can be run over HTTP or SOAP and returns XML results. It provides a unified method for querying RDF data distributed across the web.
The document discusses several options for publishing data on the Semantic Web. It describes Linked Data as the preferred approach, which involves using URIs to identify things and including links between related data to improve discovery. It also outlines publishing metadata in HTML documents using standards like RDFa and Microdata, as well as exposing SPARQL endpoints and data feeds.
This document introduces SPARQL, the SPARQL query language used to retrieve and manipulate RDF data. It provides an example SPARQL query to return full names from a sample RDF graph. It then describes what a SPARQL Service Description is, which is a vocabulary for discovering and describing SPARQL services and endpoints. It outlines several properties and classes used in SPARQL Service Descriptions.
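An endpoint's self-description can itself be queried with SPARQL; a sketch using the sd: vocabulary (the exact triples a given endpoint publishes vary):

```sparql
PREFIX sd: <http://www.w3.org/ns/sparql-service-description#>

# List the query languages a described service claims to support,
# e.g. sd:SPARQL11Query
SELECT ?lang
WHERE {
  ?service a sd:Service ;
           sd:supportedLanguage ?lang .
}
```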
RDF and linked data standards allow for layering and linking of information on the web. There is a large and growing amount of RDF data available from sources like Wikipedia, Flickr, government data sets, and more. Standards like RDF, RDFS, OWL, SKOS, and SPARQL enable publishing, linking, querying and reusing this structured data on the web in a way that is machine-readable. Integrating RDF and linked data into systems like Drupal could provide benefits like improved searchability, cross-linking of content, and reuse of external taxonomies and metadata schemas.
This document summarizes SPARQL, the SPARQL query language used for querying and retrieving data stored in RDF format. It discusses key concepts such as RDF, terms, syntax, patterns, and constraints. RDF represents information as subject-predicate-object triples that can be queried using SPARQL. SPARQL allows constructing basic and complex graph patterns to match against the RDF graph. It also supports value filters, ordering, pagination and other solution modifiers. The document provides examples of SPARQL queries to retrieve data from RDF graphs based on different conditions and constraints.
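A sketch combining these solution modifiers (FOAF vocabulary assumed for illustration):

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# FILTER restricts solutions; ORDER BY, LIMIT and OFFSET page the results.
SELECT ?name ?age
WHERE {
  ?p foaf:name ?name ;
     foaf:age  ?age .
  FILTER (?age >= 18)
}
ORDER BY DESC(?age)
LIMIT 10
OFFSET 20
```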
NEW LAUNCH! How to build graph applications with SPARQL and Gremlin using Ama..., by Amazon Web Services
In this session, we will demonstrate how you can easily start using graph databases to solve your business problems. We will demonstrate setting up a Neptune instance, loading a dataset, and using Gremlin and SPARQL via Java to build an application. We will also cover the scaling, availability, and administrative aspects of the Neptune service.
5. Graphs
➢ Reference data
● Life Sciences ontologies
● Vocabularies
➢ Sharable data
● Wikipedia info boxes (DBpedia)
➢ Analytics and unstructured data
● Fraud analysis
● Social graphs
6. Use Case for Graphs
Looking for patterns
➢ Analytics
● Social networks and recommendation engines
● Data center infrastructure management
➢ Knowledge Graphs
● Happenings: people, places, events
● Customer databases / products catalogues
8. RDF
➢ A graph is a set of links
● Link: a triple: subject – predicate – object
● The predicate (or property) is the link name: an IRI
➢ Nodes are:
● IRIs (= URIs)
● literals (strings, numbers, …)
● blank nodes
9. prefix : <http://example/myData/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix foaf: <http://xmlns.com/foaf/0.1/>
# foaf:name is a short form of <http://xmlns.com/foaf/0.1/name>
:alice rdf:type foaf:Person ;
foaf:name "Alice Smith" ;
foaf:knows :bob .
(Diagram: the example drawn as a graph: :alice has an rdf:type edge to foaf:Person, a foaf:name edge to "Alice Smith", and a foaf:knows edge to :bob.)
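The deck does not show a query over this data, but as a sketch, the example graph can be queried like so; against the triples above the single solution binds ?name to "Alice Smith":

```sparql
PREFIX :     <http://example/myData/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

# 'a' abbreviates rdf:type; return the name of every foaf:Person
SELECT ?name
WHERE {
  ?p a foaf:Person ;
     foaf:name ?name .
}
```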
16. Apache Jena
Apache Top-Level Project (TLP): April 2012
➢ Involvement in standards
➢ RDF 1.1, SPARQL 1.1
➢ RDF database
➢ SPARQL server
Other RDF@ASF:
➢ Any23, Marmotta, Clerezza, Stanbol, Rya
17. Property Graph Data Model
A property graph is a set of vertices and edges with
respective properties (i.e. key / values):
➢ each vertex or edge has a unique identifier
➢ each vertex has a set of outgoing edges and a set of incoming edges
➢ edges are directed: each edge has a start vertex and an end vertex
➢ each edge has a label which denotes the type of relationship
➢ vertices and edges can have properties (i.e. key / value pairs)
Directed multigraph with properties
attached to vertices and edges
18. Property Graph: Example
Vertex id = 1: name = “Alice”, surname = “Smith”, age = 32, email = alice@example.com, …
Vertex id = 2: name = “Bob”, surname = “Brown”, age = 45, email = bob@example.com, …
Edge id = 3, label “knows”, from vertex 1 to vertex 2: since = 01/01/1970, …
19. Property Graphs : Access
➢ Tinkerpop Gremlin
DSL for various languages
g.V().as('person').out('knows').as('friend')
.select().by{it.value('name').length()}
➢ Cypher
MATCH (you:Person {name:"You"})
FOREACH (name in ["Johan","Rajesh","Anna","Julia","Andrew"] |
CREATE (you)-[:FRIEND]->(:Person {name:name}))
➢ Connect : API
22. Layering
Using Property Graphs tech for RDF
Using RDF tech for Property Graphs
Doable but why?
Can’t use the tools of one without
understanding the other.
23. What to take from RDF
URIs as data types
Data Exchange
Data modelling
Emphasis on data formats for exchange
Relational Algebra engines
25. What to take from PG
Separate links and values
Short names for attributes
Engines for Graph Algorithms
26. Some Conclusions
➢ Data Graphs are (still) new to many people
➢ RDF emphasizes information modelling
→ Knowledge graphs, e.g. SNOMED
→ SQL-like query
➢ Property Graph emphasizes data syntax
→ Data capture
→ Graph analytic algorithms
➢ Naive layering of data models leads to dissatisfaction
→ Can only mix toolsets by knowing it’s layered
➢ Could share technology
→ Storage, data access, query algebra
29. The Answer
Building one on top of the other is possible
… but why do it?
Really hard to use! Worst of both worlds.
Semantic Web has some useful features
Apply to property graphs