
In Nucache when we are loading in all data that we page over the data as to not cause an SQL timeout #7970

Merged
merged 4 commits into from Apr 20, 2020

Conversation

Shazwazza
Contributor

@Shazwazza Shazwazza commented Apr 17, 2020

This is similar to the problem mentioned in v7 preview here #7574

When a site cold boots or rebuilds NuCache from scratch by querying the database, a large amount of data can lead to a SQL timeout exception. When we read this data we use a DB reader (cursor) so that we don't load all of the data into memory at once (to preserve memory), but this means the connection may stay open for longer. Because of that, we should page over the data while still using a DB reader. This PR updates this behavior to read the data in pages of 500 rows.
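The paged-reader approach described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the PR's actual code: `QueryPaged` here takes a `fetchPage` delegate as a stand-in for a real database call (the PR builds on NPoco's SQL paging instead).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class PagedQuery
{
    // Yields items page by page instead of holding one reader (and its
    // connection) open over the entire result set. 'fetchPage' stands in
    // for a bounded DB query taking (offset, pageSize).
    public static IEnumerable<T> QueryPaged<T>(
        Func<long, long, IReadOnlyList<T>> fetchPage, long pageSize)
    {
        long offset = 0;
        while (true)
        {
            var page = fetchPage(offset, pageSize);
            foreach (var item in page)
                yield return item;
            if (page.Count < pageSize)
                yield break; // last (possibly partial) page
            offset += pageSize;
        }
    }
}
```

Because the sequence is lazy, callers still iterate row by row; only one page of rows is materialized at a time, and each page is a short-lived query rather than one long-running reader.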

When in PureLive and a content type is changed, the content/media NuCache needs to rebuild from the database data, which is also a factor in when SQL timeouts can occur. This paging will alleviate that issue, but I've also added a Parallel call to process both the content and the media in parallel if the user's machine supports it, which could go a little quicker as well.
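The parallel rebuild can be as simple as a `Parallel.Invoke` over the two load routines. This is an illustrative sketch only; `RebuildStores` and its action parameters are hypothetical names, not the PR's actual API:

```csharp
using System;
using System.Threading.Tasks;

static class Rebuild
{
    // Runs the content and media cache rebuilds side by side.
    // Parallel.Invoke executes the actions concurrently when cores
    // are available and blocks until both have completed.
    public static void RebuildStores(Action loadContent, Action loadMedia)
    {
        Parallel.Invoke(loadContent, loadMedia);
    }
}
```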

Testing

These queries will execute in various circumstances, like a cold boot when NuCache files don't exist, or when content types are changed. So to test, just have PureLive mode on (the default) and add a property to a doc type; this will rebuild all of NuCache, both media and content, with these queries.

Member

@bergmania bergmania left a comment


Changes look good, but please see my single comment.

/// <remarks>
/// NPoco's normal Page returns a List{T} but sometimes we don't want all of that in memory and instead want to
/// iterate over each row with a reader using Query vs Fetch.
/// </remarks>
internal static IEnumerable<T> QueryPaged<T>(this IDatabase database, long pageSize, Sql sql)

I'm really not sure this is safe without a lock.

I know it is executed in a transaction, but this transaction uses the ReadCommitted isolation level. So if another transaction commits between these pages, some items can be repeated or skipped in the next page.

I think we need a lock, or at least run the transaction that uses this method at a higher isolation level.
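To make the hazard concrete, here is a small in-memory simulation (hypothetical, not code from the PR) of offset-based paging under ReadCommitted: a row committed by another transaction between two page reads shifts the offsets, so the next page re-reads an item it has already seen.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class IsolationDemo
{
    // Simulates the hazard: under ReadCommitted, each page query sees
    // the latest committed data, so an insert that lands before the
    // current offset shifts the rows and the next page repeats one.
    public static List<int> PageWithConcurrentInsert()
    {
        var table = Enumerable.Range(1, 6).ToList(); // ids 1..6, page size 3
        var seen = new List<int>();
        seen.AddRange(table.Skip(0).Take(3)); // page 1 reads 1,2,3
        table.Insert(0, 0);                   // another txn commits id 0
        seen.AddRange(table.Skip(3).Take(3)); // page 2 reads 3,4,5 — 3 repeats
        return seen;
    }
}
```

A delete between pages would shift rows the other way and cause an item to be skipped instead. A shared lock around the whole rebuild, or a higher isolation level such as Snapshot, prevents either outcome.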

Contributor Author


The scope that calls into this takes a read lock, scope.ReadLock(Constants.Locks.ContentTree); shouldn't that be sufficient?

@bergmania bergmania merged commit c313694 into v8/dev Apr 20, 2020
@bergmania bergmania deleted the v8/bugfix/nucache-db-paging branch April 20, 2020 09:10
@nul800sebastiaan nul800sebastiaan added the category/performance label Jun 24, 2020
Labels: category/performance, release/8.7.0, type/bug