Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I'm getting the following out-of-memory error when running Neo4j queries from Python. I'm using Neo4j 4.1.0 (Desktop).

neo4j.exceptions.ClientError: {code: Neo.ClientError.Procedure.ProcedureCallFailed} {message: Failed to invoke procedure `gds.alpha.shortestPath.deltaStepping.stream`: Caused by: java.lang.OutOfMemoryError: Java heap space}

I followed the instructions at https://neo4j.com/docs/operations-manual/current/configuration/neo4j-conf/ to increase the available memory, and assigned 12GB to the relevant parameters in the conf file:

dbms.memory.heap.initial_size=12g
dbms.memory.heap.max_size=12g
dbms.memory.pagecache.size=12g

My database has 63,000 nodes and 57,000 relationships.

My Python code looks like this. It is called in a loop, with the id value changing each time:

# Literal Cypher braces are doubled so that str.format only fills {person_id}.
QUERY_TEMPLATE = """
MATCH (start:Person {{id: {person_id}}})
CALL gds.alpha.shortestPath.deltaStepping.stream({{
    nodeQuery: 'MATCH (n:Person) RETURN id(n) AS id',
    relationshipQuery: 'MATCH (p1:Person {{id: {person_id}}})-[p1Knows:KNOWS]->(p1s)-[r:IS_MEMBER_OF*..10]-(p2s)<-[p2Knows:KNOWS]-(p2:Person) WHERE p1.id <> p2.id AND p1Knows.self_rating <> 0 AND p1Knows.self_rating < p2Knows.self_rating WITH p1, p2, reduce(cost = 0, x IN r | cost + coalesce(x.distance, 0)) AS cost RETURN id(p1) AS source, id(p2) AS target, cost AS weight',
    startNode: start,
    relationshipWeightProperty: 'weight',
    delta: 3.0,
    writeProperty: 'sssp'
}})
YIELD nodeId, distance
WHERE gds.util.isFinite(distance)
WITH nodeId, gds.util.asNode(nodeId) AS n, distance
RETURN n.name AS Name, distance AS Cost
ORDER BY Cost
"""

neo4j_session = neo4j_driver.session()
results_data = neo4j_session.run(QUERY_TEMPLATE.format(person_id=person_id)).data()
neo4j_session.close()

The error doesn't occur on the same id each time, so I'm wondering whether I'm using the Python driver incorrectly and failing to clean something up.

If not, do I really need 12GB of memory to query the graph?
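
Before concluding that 12GB is not enough, it may be worth confirming that the running server actually picked up the values from the conf file. The sketch below asks the server for its `dbms.memory.*` settings through the built-in `dbms.listConfig` procedure. The helper name `memory_settings` is hypothetical, and the call assumes a Neo4j 4.x server and a user allowed to run that procedure.

```python
def memory_settings(session):
    """Return the dbms.memory.* settings reported by the running server."""
    # dbms.listConfig takes a search string and yields matching settings.
    result = session.run(
        "CALL dbms.listConfig('dbms.memory') YIELD name, value "
        "RETURN name, value ORDER BY name"
    )
    # Each record behaves like a mapping keyed by the RETURN aliases.
    return {record["name"]: record["value"] for record in result}


# Usage, assuming neo4j_driver is the driver object from the question:
# with neo4j_driver.session() as session:
#     for name, value in memory_settings(session).items():
#         print(name, value)
```

If the reported heap size differs from 12g, the edited conf file is probably not the one the server is reading.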



1 Answer

I always call write_transaction and then use run inside it to execute the query, which works fine for me. My database is much larger than yours and I get no errors. The issue might be that you are opening and closing a session on every iteration of the for loop.

def data(tx):
    # run your for loop here
    tx.run(" RUN YOUR QUERY ")

with driver.session() as session:
    session.write_transaction(data)
driver.close()
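
A slightly fuller sketch of the same idea follows: one session, one transaction function, and the loop over ids inside it, with the id passed as a query parameter. `QUERY` is a placeholder rather than the query from the question, and the names `run_for_ids` and `query_all` are hypothetical. Whether this removes the heap error is not established here; it only shows the session-reuse pattern.

```python
# Placeholder query; $person_id is filled in by the driver, not by str.format.
QUERY = "MATCH (p:Person {id: $person_id}) RETURN p.name AS name"


def run_for_ids(tx, person_ids):
    """Run QUERY once per id inside a single transaction."""
    rows_by_id = {}
    for person_id in person_ids:
        result = tx.run(QUERY, person_id=person_id)
        # Read each result fully before issuing the next query.
        rows_by_id[person_id] = result.data()
    return rows_by_id


def query_all(driver, person_ids):
    """Open one session for the whole batch instead of one per id."""
    with driver.session() as session:
        # Extra positional arguments are forwarded to the transaction function.
        return session.write_transaction(run_for_ids, person_ids)


# Usage, assuming driver was created with neo4j.GraphDatabase.driver(...):
# results = query_all(driver, [21, 22, 23])
# driver.close()
```

In newer releases of the Python driver, `write_transaction` has been renamed to `execute_write`, so the call may need adjusting depending on the installed version.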
