[Knowledge Graph] Installing and Using Neo4j GDS (Graph Data Science)

Neo4j Graph Data Science (GDS)

Installation

Basic steps

projected graph model

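Create the graph: project a graph from Neo4j and load it into memory, where the algorithms operate on it.
Choose an algorithm: pick the algorithm that fits the problem.
Store the results: persist the output of the algorithm run.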
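The three steps can be sketched roughly as follows. This is a minimal sketch, assuming GDS 1.x (where the projection procedure is gds.graph.create; GDS 2.x renamed it to gds.graph.project), a database that contains Person nodes connected by KNOWS relationships, and a hypothetical graph name 'demoGraph' and property name 'pagerank' chosen only for illustration.

```cypher
// Step 1: project Person nodes and KNOWS relationships into an in-memory graph.
// 'demoGraph' is a hypothetical name used only in this sketch.
CALL gds.graph.create('demoGraph', 'Person', 'KNOWS');

// Steps 2 and 3: run an algorithm (PageRank here) in write mode,
// which stores each node's score in the 'pagerank' property in Neo4j.
CALL gds.pageRank.write('demoGraph', {writeProperty: 'pagerank'})
YIELD nodePropertiesWritten, ranIterations
RETURN nodePropertiesWritten, ranIterations;

// Optional cleanup: remove the projection from the graph catalog to free memory.
CALL gds.graph.drop('demoGraph');
```

Keeping the projection separate from the algorithm call means the same in-memory graph can be reused by several algorithms before it is dropped.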

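Overview

Three tiers

Production-quality: the algorithm is stable and scalable (gds.<algorithm>.)
Beta: the algorithm is a candidate for the production-quality tier but is not yet stable (gds.beta.<algorithm>.)
Alpha: the algorithm is experimental and may change or be removed (gds.alpha.<algorithm>.)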
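Because the tier is encoded in the procedure name, one way to see which tier each installed procedure belongs to is to list the procedures and filter by prefix. The sketch below assumes the gds.list procedure is available in the installed GDS version; which algorithms appear under which tier depends on that version.

```cypher
// List every installed procedure that is still in the alpha or beta tier.
CALL gds.list()
YIELD name, description
WITH name, description
WHERE name STARTS WITH 'gds.alpha.' OR name STARTS WITH 'gds.beta.'
RETURN name, description
ORDER BY name;
```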

Two variants

Named graph variant: the graph to operate on is read from the graph catalog.

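CALL gds.graph.create(
  'persons',            // name under which the graph is stored in the catalog
  'Person',             // node projection: the node label to load
  'KNOWS'               // relationship projection: the relationship type to load
)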
YIELD
  graphName AS graph, nodeProjection, nodeCount AS nodes, relationshipProjection, relationshipCount AS rels

Anonymous graph variant: the graph to operate on is created and deleted as part of the algorithm execution.
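For comparison with the named variant, an anonymous call passes the projection inside the configuration map instead of a graph name. This is a sketch that assumes GDS 1.x (anonymous graphs were removed in GDS 2.0) and reuses the Person label and KNOWS type from the named graph example; PageRank is chosen only for illustration.

```cypher
// No graph name is passed: the projection lives only for this one call.
CALL gds.pageRank.stream({
  nodeProjection: 'Person',
  relationshipProjection: 'KNOWS'
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId) AS person, score
ORDER BY score DESC;
```

The anonymous form is convenient for a one-off query, but the projection cost is paid on every call, so a named graph is usually the better fit when several algorithms run on the same data.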

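Four execution modes

stream: returns the algorithm's results as a stream of records.
stats: returns a single record of summary statistics without writing to the Neo4j database.
mutate: writes the algorithm's results to the in-memory graph and returns a single record of summary statistics. This mode is designed for the named graph variant, because its effects are not visible on an anonymous graph.
write: writes the algorithm's results to the Neo4j database and returns a single record of summary statistics.

Finally, appending estimate to a command estimates the memory needed to run the algorithm in that execution mode, without actually running it.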
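The modes differ only in the suffix of the procedure name and in where the result ends up. The sketch below runs Weakly Connected Components (gds.wcc, chosen only for illustration) on the 'persons' graph from the named graph example; the property name 'componentId' is a hypothetical choice, and the statements are meant to be run one at a time.

```cypher
// stream: one record per node, nothing is stored.
CALL gds.wcc.stream('persons')
YIELD nodeId, componentId
RETURN nodeId, componentId
ORDER BY componentId, nodeId;

// stats: a single summary record, nothing is stored.
CALL gds.wcc.stats('persons')
YIELD componentCount
RETURN componentCount;

// mutate: store the result as a node property of the in-memory graph only.
CALL gds.wcc.mutate('persons', {mutateProperty: 'componentId'})
YIELD nodePropertiesWritten, componentCount
RETURN nodePropertiesWritten, componentCount;

// write: store the result as a node property in the Neo4j database.
CALL gds.wcc.write('persons', {writeProperty: 'componentId'})
YIELD nodePropertiesWritten, componentCount
RETURN nodePropertiesWritten, componentCount;

// estimate: appended to any mode, returns the expected memory use without running it.
CALL gds.wcc.write.estimate('persons', {writeProperty: 'componentId'})
YIELD nodeCount, relationshipCount, requiredMemory
RETURN nodeCount, relationshipCount, requiredMemory;
```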

Utility functions

Commonly used utility functions:

gds.util.asNode / gds.util.asNodes

Syntax: gds.util.asNode(nodeId) / gds.util.asNodes(nodeIds)

Example:

CREATE  (nAlice:User {name: 'Alice'})
CREATE  (nBridget:User {name: 'Bridget'})
CREATE  (nCharles:User {name: 'Charles'})
CREATE  (nAlice)-[:LINK]->(nBridget)
CREATE  (nBridget)-[:LINK]->(nCharles)

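// Look up a single node by its internal id and read its name (returns "Alice")
MATCH (u:User {name: 'Alice'})
WITH id(u) AS nodeId
RETURN gds.util.asNode(nodeId).name AS node

// Look up several nodes at once from a list of internal ids
MATCH (u:User)
WHERE NOT u.name = 'Charles'
WITH collect(id(u)) AS nodeIds
RETURN [x IN gds.util.asNodes(nodeIds) | x.name] AS nodes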

Algorithms

Syntax:

CALL gds[.<tier>].<algorithm>.<execution-mode>[.<estimate>](
  graphName: String,
  configuration: Map
)

Taking the Dijkstra Source-Target algorithm as an example:

// First, create the graph
CREATE (a:Location {name: 'A'}),
       (b:Location {name: 'B'}),
       (c:Location {name: 'C'}),
       (d:Location {name: 'D'}),
       (e:Location {name: 'E'}),
       (f:Location {name: 'F'}),
       (a)-[:ROAD {cost: 50}]->(b),
       (a)-[:ROAD {cost: 50}]->(c),
       (a)-[:ROAD {cost: 100}]->(d),
       (b)-[:ROAD {cost: 40}]->(d),
       (c)-[:ROAD {cost: 40}]->(d),
       (c)-[:ROAD {cost: 80}]->(e),
       (d)-[:ROAD {cost: 30}]->(e),
       (d)-[:ROAD {cost: 80}]->(f),
       (e)-[:ROAD {cost: 40}]->(f);
// Next, project the graph into memory so that the algorithm runs faster
CALL gds.graph.create(
    'myGraph',
    'Location',
    'ROAD',
    {
        relationshipProperties: 'cost'
    }
);
// Finally, run the algorithm
// 1. Estimate the memory cost
MATCH (source:Location {name: 'A'}), (target:Location {name: 'F'})
CALL gds.shortestPath.dijkstra.write.estimate('myGraph', {
    sourceNode: source,
    targetNode: target,
    relationshipWeightProperty: 'cost',
    writeRelationshipType: 'PATH'
})
YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
RETURN nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory;
// 2. Return the algorithm result
MATCH (source:Location {name: 'A'}), (target:Location {name: 'F'})
CALL gds.shortestPath.dijkstra.stream('myGraph', {
    sourceNode: source,
    targetNode: target,
    relationshipWeightProperty: 'cost'
})
YIELD index, sourceNode, targetNode, totalCost, nodeIds, costs, path
RETURN
    index,
    gds.util.asNode(sourceNode).name AS sourceNodeName,
    gds.util.asNode(targetNode).name AS targetNodeName,
    totalCost,
    [nodeId IN nodeIds | gds.util.asNode(nodeId).name] AS nodeNames,
    costs,
    nodes(path) as path
ORDER BY index
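The estimate above was taken for write mode, so a natural follow-up is the write call itself. The sketch below assumes the same 'myGraph' projection and the same GDS version as the example, reuses the configuration from the estimate call, and assumes that the written PATH relationship carries a totalCost property. Run the statements one at a time.

```cypher
// 3. Write the shortest path back to Neo4j as a PATH relationship from A to F
MATCH (source:Location {name: 'A'}), (target:Location {name: 'F'})
CALL gds.shortestPath.dijkstra.write('myGraph', {
    sourceNode: source,
    targetNode: target,
    relationshipWeightProperty: 'cost',
    writeRelationshipType: 'PATH'
})
YIELD relationshipsWritten
RETURN relationshipsWritten;

// Inspect the relationship that was written
MATCH (a:Location)-[p:PATH]->(b:Location)
RETURN a.name AS source, b.name AS target, p.totalCost AS totalCost;

// Drop the projection once it is no longer needed
CALL gds.graph.drop('myGraph');
```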