Performant import of production data

Hannes Heine requested to merge pr529head into pr529base

Created by: Tirokk

Authored by roschaefer (Merged)


@appinteractive thanks for pointing out split. You just saved me some days of work refactoring the import statements to use CSV instead of JSON files.

@Tirokk when I enter :schema in the Neo4j web UI, I see the following:

:schema
Indexes
   ON :Badge(id) ONLINE
   ON :Category(id) ONLINE
   ON :Comment(id) ONLINE
   ON :Post(id) ONLINE
   ON :Tag(id) ONLINE
   ON :User(id) ONLINE

No constraints

So I temporarily removed the unique constraints on slug and added plain indices on id for all relevant node types. Unfortunately, we cannot omit the :Label, because Neo4j does not allow an index without one. So I had to add an index for every known node label instead.
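For illustration, here is a minimal sketch of how the per-label index statements could be generated. It assumes the Neo4j 3.x `CREATE INDEX ON :Label(property)` syntax and uses the labels from the :schema output above. The function name `index_statements` is hypothetical and not part of the import scripts.

```python
# Labels as listed in the :schema output above.
LABELS = ["Badge", "Category", "Comment", "Post", "Tag", "User"]


def index_statements(labels, prop="id"):
    """Build one Neo4j 3.x style CREATE INDEX statement per label.

    An index always needs a label, so one statement is emitted
    for each label instead of a single label-free index.
    """
    return [f"CREATE INDEX ON :{label}({prop});" for label in labels]


if __name__ == "__main__":
    # Print the statements so they can be piped into a Cypher shell.
    for statement in index_statements(LABELS):
        print(statement)
```

The printed statements can be run once before the import starts, so that lookups by id during the import hit an index instead of scanning all nodes of a label.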

With indices the import finishes in:

Time elapsed: 351 seconds

🎉

@appinteractive when I keep the unique indices on slug, I get an error during import saying that a node with label :User and slug tobias already exists. I.e. we have unique constraint violations in our production data.
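To see how many slugs are affected before re-enabling the constraint, something like the following could be run against the exported records. This is only a sketch: it assumes each record is a dict with a `slug` key, and both the helper name `duplicate_slugs` and the sample data are hypothetical.

```python
from collections import Counter


def duplicate_slugs(records):
    """Return {slug: count} for every slug that occurs more than once.

    Records without a slug (missing key or None) are ignored.
    """
    counts = Counter(r["slug"] for r in records if r.get("slug") is not None)
    return {slug: n for slug, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Hypothetical sample records, not real production data.
    sample = [
        {"id": "1", "slug": "tobias"},
        {"id": "2", "slug": "tobias"},
        {"id": "3", "slug": "robert-schafer"},
    ]
    print(duplicate_slugs(sample))
```

Every slug reported here would violate a unique constraint on slug and needs to be resolved (for example renamed) before the constraint can be added back.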

@mattwr18 @ulfgebhardt @ogerly I started the application on my machine with the production data, and it turns out that the index page http://localhost:3000/ takes way too long to load. Visiting my profile page at http://localhost:3000/profile/5b1693daf850c11207fa6109/robert-schafer is fine, though. Even pagination works. When I visit a post page with not too many comments, the application is fast enough, too: http://localhost:3000/post/5bbf49ebc428ea001c7ca89c/neues-video-format-human-connection-tech-news

Pullrequest

Issues

  • None

Checklist

  • None

How2Test

  • None

Todo

  • None
