Neo4j Graph Data Science (GDS)
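Installation
Basic steps

- Create the graph: project a graph from Neo4j into memory, where the algorithms operate on it.
- Choose an algorithm: pick the algorithm that fits the task.
- Store the results: persist the output of the algorithm run.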
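
The three steps above can be sketched end to end as follows. This is only a minimal sketch, assuming GDS 1.x (where projection is done with gds.graph.create) and a database that already contains Person nodes connected by KNOWS relationships. The graph name 'stepsDemo' and the property name 'componentId' are made up for illustration.

```cypher
// Step 1: project the graph into memory ('stepsDemo' is a hypothetical name)
CALL gds.graph.create('stepsDemo', 'Person', 'KNOWS')
YIELD graphName, nodeCount, relationshipCount;

// Steps 2 and 3: run an algorithm (here Weakly Connected Components)
// in write mode, so the result is stored back on the Neo4j nodes
CALL gds.wcc.write('stepsDemo', { writeProperty: 'componentId' })
YIELD componentCount, nodePropertiesWritten;

// Release the in-memory projection once it is no longer needed
CALL gds.graph.drop('stepsDemo')
YIELD graphName;
```

Dropping the projection at the end is optional, but in-memory graphs stay in the catalog until they are dropped or the database restarts.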
 
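Overview
Three tiers
- Production-quality: the algorithm is stable and scalable (procedure prefix: gds.<algorithm>).
- Beta: the algorithm is still being stabilized and is a candidate for the production-quality tier (procedure prefix: gds.beta.<algorithm>).
- Alpha: the algorithm is experimental and unstable (procedure prefix: gds.alpha.<algorithm>).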
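
Because the tier is encoded in the procedure name, one way to see which tiers an installation offers is to group the output of gds.list() by prefix. This is a rough sketch: the CASE expression is my own grouping, and anything without an alpha or beta prefix (including catalog and utility procedures) lands in the last bucket.

```cypher
// List every GDS procedure/function and bucket it by its name prefix
CALL gds.list()
YIELD name
WITH name,
     CASE
       WHEN name STARTS WITH 'gds.alpha.' THEN 'alpha'
       WHEN name STARTS WITH 'gds.beta.'  THEN 'beta'
       ELSE 'production-quality (or non-algorithm)'
     END AS tier
RETURN tier, count(*) AS procedures
ORDER BY tier;
```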
Two variants
- Named graph variant: the graph to operate on is read from the graph catalog.
 
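CALL gds.graph.create(
  'persons',  // name under which the graph is stored in the catalog
  'Person',   // node projection (node label)
  'KNOWS'     // relationship projection (relationship type)
)
YIELD
  graphName AS graph, nodeProjection, nodeCount AS nodes, relationshipProjection, relationshipCount AS rels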
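
Once 'persons' is in the catalog, later calls refer to it by name. The sketch below assumes GDS 1.x and that Person nodes carry a name property (an assumption, not stated above). PageRank is used only as a convenient example algorithm.

```cypher
// Check what is currently stored in the graph catalog
CALL gds.graph.list()
YIELD graphName, nodeCount, relationshipCount
RETURN graphName, nodeCount, relationshipCount;

// Run an algorithm against the named graph by passing its name
CALL gds.pageRank.stream('persons', { maxIterations: 20 })
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC;
```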
 
- Anonymous graph variant: the graph to operate on is created and deleted as part of the algorithm execution.
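
In the anonymous variant the projection is passed inside the configuration map instead of a graph name. A minimal sketch, assuming GDS 1.x (the anonymous form was removed in GDS 2.x) and, again, a hypothetical name property on Person nodes:

```cypher
// No graph name: the projection is described inline and
// discarded automatically when the call finishes
CALL gds.pageRank.stream({
  nodeProjection: 'Person',
  relationshipProjection: 'KNOWS'
})
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC;
```

This is convenient for one-off queries, but the projection cost is paid on every call, so a named graph is usually the better fit when several algorithms run on the same data.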
 
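Four execution modes
- stream: returns the algorithm results as a stream of records.
- stats: returns a single record of summary statistics, without writing to the Neo4j database.
- mutate: writes the algorithm results to the in-memory graph and returns a single record of summary statistics. This mode is designed for the named graph variant, because its effects would not be visible on an anonymous graph.
- write: writes the algorithm results to the Neo4j database and returns a single record of summary statistics.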
 
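Finally, appending estimate to a command estimates the memory needed to run the algorithm in that execution mode, without actually running it.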
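
To make the differences concrete, here is the same algorithm called in each mode against the 'persons' graph projected earlier. This is a sketch under assumptions: GDS 1.x, PageRank as a stand-in algorithm, and 'pagerank' as an illustrative property name.

```cypher
// stream: one record per node, nothing is stored
CALL gds.pageRank.stream('persons', {})
YIELD nodeId, score
RETURN nodeId, score;

// stats: a single summary record, nothing is stored
CALL gds.pageRank.stats('persons', {})
YIELD ranIterations, didConverge;

// mutate: the score is added to the in-memory graph only
CALL gds.pageRank.mutate('persons', { mutateProperty: 'pagerank' })
YIELD nodePropertiesWritten;

// write: the score is written back to the Neo4j database
CALL gds.pageRank.write('persons', { writeProperty: 'pagerank' })
YIELD nodePropertiesWritten;

// estimate: memory requirements for the write mode, without running it
CALL gds.pageRank.write.estimate('persons', { writeProperty: 'pagerank' })
YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory;
```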
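Utility functions
Commonly used utility functions:
- gds.util.asNode / gds.util.asNodes
These return the node object(s) for the given node id(s).
Syntax: gds.util.asNode(nodeId) / gds.util.asNodes(nodeIds)
Example: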
// Create the sample data
CREATE (nAlice:User {name: 'Alice'})
CREATE (nBridget:User {name: 'Bridget'})
CREATE (nCharles:User {name: 'Charles'})
CREATE (nAlice)-[:LINK]->(nBridget)
CREATE (nBridget)-[:LINK]->(nCharles);

// asNode: resolve a single node id back to its node
MATCH (u:User {name: 'Alice'})
WITH id(u) AS nodeId
RETURN gds.util.asNode(nodeId).name AS node;

// asNodes: resolve a list of node ids back to their nodes
MATCH (u:User)
WHERE NOT u.name = 'Charles'
WITH collect(id(u)) AS nodeIds
RETURN [x IN gds.util.asNodes(nodeIds) | x.name] AS nodes;
Algorithms
Syntax:
CALL gds[.<tier>].<algorithm>.<execution-mode>[.<estimate>](
  graphName: String,
  configuration: Map
)
 
Taking the Dijkstra Source-Target algorithm as an example:
// First, create the example graph in Neo4j
CREATE (a:Location {name: 'A'}),
       (b:Location {name: 'B'}),
       (c:Location {name: 'C'}),
       (d:Location {name: 'D'}),
       (e:Location {name: 'E'}),
       (f:Location {name: 'F'}),
       (a)-[:ROAD {cost: 50}]->(b),
       (a)-[:ROAD {cost: 50}]->(c),
       (a)-[:ROAD {cost: 100}]->(d),
       (b)-[:ROAD {cost: 40}]->(d),
       (c)-[:ROAD {cost: 40}]->(d),
       (c)-[:ROAD {cost: 80}]->(e),
       (d)-[:ROAD {cost: 30}]->(e),
       (d)-[:ROAD {cost: 80}]->(f),
       (e)-[:ROAD {cost: 40}]->(f);
// Next, create the graph projection; it is held in memory so the algorithm runs faster
CALL gds.graph.create(
    'myGraph',
    'Location',
    'ROAD',
    {
        relationshipProperties: 'cost'
    }
);
// Finally, run the algorithm
// 1. Estimate the memory cost
MATCH (source:Location {name: 'A'}), (target:Location {name: 'F'})
CALL gds.shortestPath.dijkstra.write.estimate('myGraph', {
    sourceNode: source,
    targetNode: target,
    relationshipWeightProperty: 'cost',
    writeRelationshipType: 'PATH'
})
YIELD nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory
RETURN nodeCount, relationshipCount, bytesMin, bytesMax, requiredMemory;
// 2. Return the algorithm result
MATCH (source:Location {name: 'A'}), (target:Location {name: 'F'})
CALL gds.shortestPath.dijkstra.stream('myGraph', {
    sourceNode: source,
    targetNode: target,
    relationshipWeightProperty: 'cost'
})
YIELD index, sourceNode, targetNode, totalCost, nodeIds, costs, path
RETURN
    index,
    gds.util.asNode(sourceNode).name AS sourceNodeName,
    gds.util.asNode(targetNode).name AS targetNodeName,
    totalCost,
    [nodeId IN nodeIds | gds.util.asNode(nodeId).name] AS nodeNames,
    costs,
    nodes(path) AS path
ORDER BY index
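
The walkthrough above estimates and streams the result but never stores it. As a sketch of the remaining "store the results" step, the write mode can persist the path as a PATH relationship, which can then be read back with plain Cypher. This assumes GDS 1.5 or later within the 1.x line, matching the procedure names used above. The writeNodeIds and writeCosts options are switched on so the path details are stored as relationship properties.

```cypher
// 3. Write the shortest path back to Neo4j as a PATH relationship
MATCH (source:Location {name: 'A'}), (target:Location {name: 'F'})
CALL gds.shortestPath.dijkstra.write('myGraph', {
    sourceNode: source,
    targetNode: target,
    relationshipWeightProperty: 'cost',
    writeRelationshipType: 'PATH',
    writeNodeIds: true,
    writeCosts: true
})
YIELD relationshipsWritten
RETURN relationshipsWritten;

// Read the stored path back from the database
MATCH (:Location {name: 'A'})-[p:PATH]->(:Location {name: 'F'})
RETURN p.totalCost AS totalCost,
       [id IN p.nodeIds | gds.util.asNode(id).name] AS nodeNames,
       p.costs AS costs;

// Drop the projection when finished
CALL gds.graph.drop('myGraph')
YIELD graphName;
```

On the sample data the cheapest route from A to F costs 160 (50 + 40 + 30 + 40). It passes through D and E, reaching D through either B or C, since both branches cost the same.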