Performant import of production data
Created by: Tirokk
Authored by roschaefer Merged
@appinteractive thanks for pointing out split. You just saved me several days of work refactoring the import statements to use CSV instead of JSON files.
@Tirokk when I enter :schema in the Neo4j web UI, I see the following:
:schema
Indexes
ON :Badge(id) ONLINE
ON :Category(id) ONLINE
ON :Comment(id) ONLINE
ON :Post(id) ONLINE
ON :Tag(id) ONLINE
ON :User(id) ONLINE
No constraints
So I temporarily removed the unique constraints on slug and added plain indices on id for all relevant node types. Unfortunately, we cannot omit the :Label because Neo4j does not allow this, so I had to add an index for every known node label instead.
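
Since every index needs an explicit label, the statements can be generated from the list of labels. Below is a minimal sketch, assuming the Neo4j 3.x `CREATE INDEX ON :Label(property)` syntax that matches the :schema output above. The helper name `index_statements` is hypothetical and not part of the import scripts in this PR.

```python
# Labels taken from the :schema output quoted in this thread.
LABELS = ["Badge", "Category", "Comment", "Post", "Tag", "User"]


def index_statements(labels, prop="id"):
    """Return one CREATE INDEX statement per label (assumed Neo4j 3.x syntax)."""
    return [f"CREATE INDEX ON :{label}({prop});" for label in labels]


if __name__ == "__main__":
    # Print the statements so they can be pasted into the Neo4j web UI
    # or piped into cypher-shell.
    for statement in index_statements(LABELS):
        print(statement)
```

The printed statements still have to be run against the database before the import starts.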
With indices the import finishes in:
Time elapsed: 351 seconds
@appinteractive when I keep the unique indices on slug, I get an error during import saying that a node with label :User and slug tobias already exists. I.e., we have unique constraint violations in our production data.
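
To see how many slugs are affected before re-adding the constraint, the exported records could be checked for duplicates up front. This is a rough sketch, assuming the export can be loaded as a list of dicts with a `slug` field. The function name `duplicate_slugs` and the sample records are hypothetical.

```python
from collections import Counter


def duplicate_slugs(records):
    """Return a dict of slug -> count for slugs that occur more than once."""
    counts = Counter(r["slug"] for r in records if r.get("slug") is not None)
    return {slug: n for slug, n in counts.items() if n > 1}


if __name__ == "__main__":
    # Hypothetical sample records; a real export has many more fields.
    users = [
        {"id": "1", "slug": "tobias"},
        {"id": "2", "slug": "tobias"},
        {"id": "3", "slug": "robert-schafer"},
    ]
    print(duplicate_slugs(users))
```

Any slug reported here would violate a unique constraint on slug for that label.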
@mattwr18 @ulfgebhardt @ogerly I started the application on my machine with the production data, and it turns out that the index page http://localhost:3000/ takes way too long to load. Visiting my profile page at http://localhost:3000/profile/5b1693daf850c11207fa6109/robert-schafer is fine, though. Even pagination works. When I visit a post page with not too many comments, the application is fast enough, too: http://localhost:3000/post/5bbf49ebc428ea001c7ca89c/neues-video-format-human-connection-tech-news
Pullrequest
Issues
- None
Checklist
- None
How2Test
- None
Todo
- None