size of group makes a difference

Technical support and discussion of Newsbin Version 6 series.

Postby wonderwhy » Sat Apr 02, 2011 2:25 am

I'm still a few generations behind in PC technology (P4 2.4 GHz with 2 GB RAM but lots of disk space), which may contribute to my most noticeable problems.
I'm using 6b7 and noticed two things, one of which also occurred in a late version 5.
I switched to vs 6 recently and, after realizing that the 'conversion' process from vs 5 storage to vs 6 was slow, I decided to start over, using 'download all' then 'download latest' for all my groups. I have noticed that when I download from a large (say 500,000 headers) group, stop the update, double click to display, NB will begin what amounts to be a full listing of all downloaded headers. Smaller groups will display the 'display age' group of headers which I have set for 7 days for initial speed. I don't remember this happening with vs 5 but I didn't work much with large groups either.

Something that did happen with vs 5, and has been mentioned here, involves the 'lockout post' feature. NB's marking of headers for 'poster lockout' seems to depend on the size of the group. It works for me on smaller groups but not on large ones such as a.b.multimedia. Actually, I can't say 'not', since I can see some entries in the header listing that are marked as 'poster lock'. The poster's name appears on the 'rejected list', but the post headers are not marked or filtered. The subject (.exe) filter seems to work okay.

It's an inconvenience but for the first problem, I let NB run through and display the entirety of the stored headers, then I can use the other display modes such as 'display age' or a month okay. I haven't figured out how to get the poster lockout for large groups to work. I'd rather not use the suggestion to actually delete the poster's posts since there doesn't seem to be a way to reload a particular month of posts much less a particular poster - if accidentally deleted. Filtering by poster is not a perfect solution due to 'sporging' but it's something.

Any idea what may be causing my problems?


AFTER reading the replies, concerning my situation (slow moving of downloaded headers into the database) I understand:

When I stop the downloading headers portion then the insertion process takes place for those new headers - as long as NB is active. On a faster PC this process doesn't create such a noticeable two stage operation.

If I close NB, leaving uninserted headers, the process will automatically continue for those headers when I reopen NB.

There isn't a problem with downloading newest headers before allowing the insertion of previously downloaded headers to complete.

Something I didn't mention that now seems connected to this process is that sometimes, when I select a particular month to display headers (in a group that probably didn't finish header insertion), it results in "NB Pro has encountered a problem..." with error code 0xc0000005.
Last edited by wonderwhy on Tue Apr 05, 2011 9:46 pm, edited 1 time in total.
wonderwhy
Occasional Contributor
Posts: 18
Joined: Tue Jan 04, 2011 12:49 pm

Re: size of group makes a difference

Postby DThor » Sat Apr 02, 2011 7:11 am

You want to let Newsbin inject the new headers. If you interrupt it, it will pick up again the next time you update the headers, but I suspect what you're seeing is what happens if you interrupt or don't wait for the injection process. AFAIK Newsbin tries to do the right thing as much as possible, but the assumption is that you've downloaded your headers and let it do its thing.

The final behaviour of spools isn't written in stone yet, but personally I'm doubting the notion of a two part process to spools will completely go away. Somehow, the data for compacting needs to be generated.

'One lung machines' will probably continue to suffer, as happens with all software being released nowadays. However, I would point out that Newsbin's memory behaviour is really good in v6, you can load/search a crapload more spools than you ever could in v5 without the memory hit, plus it's faster, so in some respects you'll see significant improvements with 6 over 5. However, you have to pay the piper somewhere, and the spool injection is currently the magilla. I would guess that I/O bandwidth will be your personal problem with your machine, directly or indirectly. I'm sure it's been suggested to you, but slower machines will only get slower trying to manage those massive spools locally, you might want to consider inet search and/or an nzb service. But yeah, I know, that's not really practical for picture groups. :)

DT
V6 Troubleshooting FAQ . V6 docs. Usenet info at Usenet Tools. Thanks!
DThor
Elite NewsBin User
Posts: 5943
Joined: Mon Jul 01, 2002 9:50 am

Registered Newsbin User since: 04/01/03

Re: size of group makes a difference

Postby Quade » Sat Apr 02, 2011 9:59 am

wonderwhy wrote:
stop the update, double click to display, NB will begin what amounts to be a full listing of all downloaded headers. Smaller groups will display the 'display age' group of headers which I have set for 7 days for initial speed. I don't remember this happening with vs 5 but I didn't work much with large groups either.


Newsbin currently downloads the headers as fast as it can. So the process is:

1 - Download
2 - Write RV4
3 - Goto 1 till done.

Then there's another process that pulls the RV4 files and inserts them into the database

1 - Read RV4
2 - Write to Database
3 - Notify post lists
4 - Goto 1 till done.

Step 3 of the second process (notify post lists) is why you see records in the post lists from times you haven't asked for.

When you load a post list, the post list will also import RV4's in order to double the speed of insertion and make the post list more current.

This is the way 6 works right now. I have another mode where it reads from the server and writes directly to the DB. The benefit is that you can see in the download list when it's really done. The downside is that it slows header downloads significantly. I don't really know what the correct solution is yet. I might put up a beta with the "immediate write" mode and see what people think of it. What you're reporting though, is the result of importing the RV4's into the database in the background. Once all the RV4's are processed, "Download Latest" should be able to react pretty quickly. I think it's the flood of "Download all" that it's trying to put away.
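The two loops described above can be sketched roughly as follows. This is not Newsbin's code: the JSON spool format, the `headers` table, and the function names `download_stage`, `insert_stage` and `run_demo` are all assumptions made for illustration, and Python's `sqlite3` module stands in for the real database. Only the `.rv4` extension and the order of the steps come from the description above.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def download_stage(batches, spool_dir):
    """Stage 1: write each downloaded batch to its own spool file, nothing else."""
    for i, batch in enumerate(batches):
        (spool_dir / f"{i:06d}.rv4").write_text(json.dumps(batch))


def insert_stage(spool_dir, db, notify):
    """Stage 2: read a spool file, write it to the database, notify, repeat."""
    inserted = 0
    for spool in sorted(spool_dir.glob("*.rv4")):
        rows = json.loads(spool.read_text())
        with db:  # one transaction per spool file
            db.executemany(
                "INSERT INTO headers(subject, poster) VALUES (?, ?)", rows
            )
        notify(rows)    # open post lists receive every inserted row
        spool.unlink()  # the spool file goes away once its rows are stored
        inserted += len(rows)
    return inserted


def run_demo():
    """Run both stages once and report what happened at each point."""
    batches = [
        [["file1.jpg", "alice"], ["file2.jpg", "bob"]],
        [["file3.jpg", "alice"]],
    ]
    seen = []  # stands in for an open post list
    with tempfile.TemporaryDirectory() as tmp:
        spool_dir = Path(tmp)
        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE headers(subject TEXT, poster TEXT)")
        download_stage(batches, spool_dir)
        pending_before = len(list(spool_dir.glob("*.rv4")))
        inserted = insert_stage(spool_dir, db, seen.extend)
        pending_after = len(list(spool_dir.glob("*.rv4")))
        stored = db.execute("SELECT COUNT(*) FROM headers").fetchone()[0]
        db.close()
    return pending_before, inserted, pending_after, stored, len(seen)


print(run_demo())  # (2, 3, 0, 3, 3)
```

The point of the split is that the first stage never waits on the database, so the download can look finished while spool files are still waiting to be inserted.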
Quade
Eternal n00b
Posts: 44867
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97

Re: size of group makes a difference

Postby mho » Sat Apr 02, 2011 9:22 pm

Quade wrote:
This is the way 6 works right now. I have another mode where it reads from the server and writes directly to the DB. The benefit is that you can see in the download list when it's really done. The downside is that it slows header downloads significantly.

I think the current way would work fine, if you just added some form of notification that header insertion is still ongoing. As it is now, I think the only way is to watch the debug log, and I guess not all of us are doing that:-)

- mho
mho
Seasoned User
Posts: 259
Joined: Sat Jan 02, 2010 8:57 pm

Registered Newsbin User since: 10/25/08

Re: size of group makes a difference

Postby mho » Sat Apr 02, 2011 9:25 pm

mho wrote:some form of notification that header insertion is still ongoing

I just thought that this doesn't necessarily need to mean the jobs hang around in the Download list; it might actually be nicer if it was marked in the Group list (and preferably on loaded group/GoG tabs) that the group was "Busy". That way, you could just go look in another group that had already finished updating:-)

- mho
mho
Seasoned User
Posts: 259
Joined: Sat Jan 02, 2010 8:57 pm

Registered Newsbin User since: 10/25/08

Re: size of group makes a difference

Postby wonderwhy » Tue Apr 05, 2011 8:44 pm

Thanks for the information.
After extensive downloading I'll let it sit and watch it wind down in Task Manager/Process Explorer.

This is probably why I sometimes get "NB Pro has to shut down..." with the exception error code: 0xc0000005 when I select some months to display in a larger group.
wonderwhy
Occasional Contributor
Posts: 18
Joined: Tue Jan 04, 2011 12:49 pm

Re: size of group makes a difference

Postby wonderwhy » Thu Apr 07, 2011 1:33 am

I vote for new columns in the 'groups list' to show how many headers are yet to be inserted and to allow starting/stopping insertion in the group(s) of my choice.

I don't think header insertion continues if the group is not open to show posts. Why open another tab just for this?

Probably a minor problem for most.
wonderwhy
Occasional Contributor
Posts: 18
Joined: Tue Jan 04, 2011 12:49 pm

Re: size of group makes a difference

Postby richy99 » Thu Apr 07, 2011 6:07 am

Header insertion happens whether you have the group open or not. Once you update the group, any backlogged RV4s get inserted as well.
richy99
Elite NewsBin User
Posts: 6353
Joined: Fri Nov 21, 2003 8:04 pm
Location: Wales

Registered Newsbin User since: 12/31/03

Re: size of group makes a difference

Postby wonderwhy » Wed Apr 13, 2011 3:54 am

After I had selected two dozen groups to update with the latest and then stopped downloading headers & posts, I took a hint from another V6 post response and checked the group folders for .RV4 files. I made a list of eighteen that still had them, most with a single .RV4 file, but a couple had more than a dozen each. I noticed that, unlike some other occasions when NB would occupy the CPU at 50% for hours after stopping the file downloads (hopefully inserting headers), this time NB caused practically no CPU activity.
After half an hour I selected "show posts" (right click menu on a group) on the groups with .RV4s; the .RV4 files disappeared quickly, the longer lists within minutes. The header display for the groups with the longer .RV4 lists didn't complete until much later. The two .db3-shm and two .db3-wal files remained until the display had fully updated.

Are the headers fully inserted when the .RV4 files are gone, or is insertion only complete when the -shm and -wal files disappear?

I'm looking for a practical indicator (other than the now suspect CPU usage) to show finished storage.

A few of my group folders have a persistent file similar to: Storage.db3-mj3A08EA1B. Since the great majority have only the typical four files I presume that this may indicate a problem - but what?
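In the meantime, the manual folder check described above could be scripted. This is only a sketch: it assumes one subfolder per group under a data root, and `pending_insertion` is a made-up name, not anything NB provides. The -wal and -shm suffixes are the ones SQLite uses for its write-ahead-log sidecar files; files with `-mj` in the name are counted separately because I don't know what they are.

```python
from pathlib import Path


def pending_insertion(data_root):
    """Map each group folder with leftover files to a count per file kind."""
    report = {}
    folders = sorted(p for p in Path(data_root).iterdir() if p.is_dir())
    for folder in folders:
        counts = {"rv4": 0, "wal_shm": 0, "mj": 0}
        for f in folder.iterdir():
            name = f.name.lower()
            if name.endswith(".rv4"):
                counts["rv4"] += 1       # spool files not yet inserted
            elif name.endswith(("-wal", "-shm")):
                counts["wal_shm"] += 1   # database still has WAL sidecar files
            elif "-mj" in name:
                counts["mj"] += 1        # persistent files of unknown purpose
        if any(counts.values()):         # skip folders with nothing left over
            report[folder.name] = counts
    return report
```

Calling `pending_insertion(r"<path to the group folders>")` returns an empty dict when no group has leftovers, which would serve as the "finished" indicator.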
wonderwhy
Occasional Contributor
Posts: 18
Joined: Tue Jan 04, 2011 12:49 pm

Re: size of group makes a difference

Postby DThor » Wed Apr 13, 2011 8:39 am

That's correct, when those files are gone, there's no more insertion to do.

FWIW, I'm in the camp that would like to eventually see some sort of indication that Newsbin is working on spool insertion.

DT
DThor
Elite NewsBin User
Posts: 5943
Joined: Mon Jul 01, 2002 9:50 am

Registered Newsbin User since: 04/01/03

Re: size of group makes a difference

Postby wonderwhy » Sun Apr 17, 2011 11:50 pm

Another problem that I just noticed, because I left Boneless until last after switching to v6. I'm not sure, but I don't think I had this problem in version 5, though I never succeeded in downloading all of the headers.

The initial size of Boneless shows approx 3,345,056,707 (giganews) in the size column.

With any other group, such as ab.nl, I can watch the .RV4 files appear (and sometimes disappear) as the headers are downloaded and processed. Typically, large header downloads are processed after I stop downloading.

With 'download all headers' on Boneless, no .RV4 files are created during a two hour downloading period that shows a downloaded size of over 100M (not bytes). The eight files that are created initially during the update do not change their modification date/time from the start of the process. If I 'remove from the download list' and then 'download latest', the 'size' has decreased by the downloaded amount. If I do this a few times, seven of the eight Boneless files (none of them .RV4) will update their modification time, but storagedata.db3 remains at the initial 'download all headers' modification time. The file sizes don't change appreciably. If I 'show posts', the eight files decrease to the usual four files of small size. I have deleted the files and the group folder, with similar results.

Is this due to the 32 bit/64 bit situation? Surely even my PC would store something.
wonderwhy
Occasional Contributor
Posts: 18
Joined: Tue Jan 04, 2011 12:49 pm

Re: size of group makes a difference

Postby Quade » Mon Apr 18, 2011 12:04 am

What's your "storage age" set to? If it's shorter than the age of the headers, they just get thrown away.
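As a rough illustration of that rule, assuming the storage age is counted in days and compared against each header's post date (neither detail is spelled out here, and `keep_header` and `filter_headers` are made-up names):

```python
from datetime import datetime, timedelta


def keep_header(posted, now, storage_age_days):
    """Return True if a header is young enough to be stored (assumed rule)."""
    return now - posted <= timedelta(days=storage_age_days)


def filter_headers(headers, now, storage_age_days):
    """Drop every (subject, posted) pair older than the storage age."""
    return [h for h in headers if keep_header(h[1], now, storage_age_days)]


now = datetime(2011, 4, 18)
headers = [
    ("old post", datetime(2009, 1, 1)),  # 837 days old
    ("new post", datetime(2011, 4, 1)),  # 17 days old
]
print(filter_headers(headers, now, 600))   # only "new post" is kept
print(filter_headers(headers, now, 2190))  # roughly six years: both are kept
```

With a short storage age, a 'download all headers' run on an old group can download plenty and still store next to nothing.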
Quade
Eternal n00b
Posts: 44867
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97

Re: size of group makes a difference

Postby wonderwhy » Wed Apr 20, 2011 2:52 am

Thanks - that was it - at 600. Originally I had set it to something all encompassing like 6 years but somewhere along the way it was reduced.
wonderwhy
Occasional Contributor
Posts: 18
Joined: Tue Jan 04, 2011 12:49 pm

Re: size of group makes a difference

Postby wonderwhy » Wed Apr 20, 2011 8:30 pm

I'm now using 6.0b9 build 899 and still have this curious/undesirable behavior (for me).
After downloading a sizable number of headers for a few groups and stopping the download, I leave NB on to allow it to process the .RV4s into the DB3. It may do this, but it may not. Sometimes after a few hours I see the CPU not being used and lots of RAM available. Recently I found a group with 592 .RV4s (2.73 GB) apparently not being processed. I closed NB and reopened it; after five minutes of activity NB went still again, with no apparent effect on the group with the RV4s. I then did a "show posts" for that group and watched as the -shm and -wal files appeared and the RV4s disappeared, one by one, every few seconds, with some longer delays. I closed the group display and the processing stopped. I repeated the 'show posts' and the processing continued.
This doesn't seem to be the intended behavior: I thought that displaying a group might delay the processing of the headers, but instead it seems to trigger it.

Is this my old CPU at fault or ..?
wonderwhy
Occasional Contributor
Posts: 18
Joined: Tue Jan 04, 2011 12:49 pm

Re: size of group makes a difference

Postby Quade » Wed Apr 20, 2011 9:00 pm

Sounds like the background worker doesn't see all the RV4's. I'll note it.
Quade
Eternal n00b
Posts: 44867
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97

Re: size of group makes a difference

Postby wonderwhy » Sat Apr 30, 2011 4:12 am

I'm using 6b11 and the processing of .rv4 files seems to be fixed. While updating groups now I don't even see .rv4s, but the -shm and -wal files appear and the database seems to be updated very quickly.

I still don't know what the storage.db3-mj... file is that remains in some but not all large groups. These are dated weeks before the update. May I delete?
wonderwhy
Occasional Contributor
Posts: 18
Joined: Tue Jan 04, 2011 12:49 pm

Re: size of group makes a difference

Postby Quade » Sat Apr 30, 2011 5:59 am

The current beta defaults to what I call "foreground update", which means all the data gets inserted directly into the DB with no RV4 stage. You can revert to the background update in the settings if you prefer that mode. Background update has a higher absolute header download speed but spreads out the time needed to get the data into the DB files.

I'm not sure about deleting the .mj files. I'm going to say "probably" on the delete.
Quade
Eternal n00b
Posts: 44867
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97

Re: size of group makes a difference

Postby wonderwhy » Thu May 26, 2011 3:42 pm

Using 6b14 soon to be 6b15.

My P4 CPU, as expected, does not do well with the "foreground" update. I cannot update several groups that each involve updates in the hundreds of K without eventually running out of the 2GB RAM. Switching to the previous "background" method doesn't do that, but it will still strand some groups with .rv4 files that NB seems to forget about even after several hours. I trigger a slow header insertion by selecting 'show posts' for those groups. Updating other groups does not trigger processing of these groups.

An odd thing is that with "background" insertion, the download list usually shows each group left with a small size such as "0/6" or "1/5" (single digits), which may remain until every group in the list shows a similarly tiny amount.
wonderwhy
Occasional Contributor
Posts: 18
Joined: Tue Jan 04, 2011 12:49 pm

Re: size of group makes a difference

Postby Quade » Thu May 26, 2011 4:52 pm

wonderwhy wrote:
My P4 CPU, as expected, does not do well with the "foreground" update. I cannot update several groups that each involve updates in the hundreds of K without eventually running out of the 2GB RAM. Switching to the previous "background" method doesn't do that, but it will still strand some groups with .rv4 files that NB seems to forget about even after several hours. I trigger a slow header insertion by selecting 'show posts' for those groups. Updating other groups does not trigger processing of these groups.


Keep the group closed when you do the update and it won't consume much RAM at all. What's your display age set to? You might want to cut it down to under 30 days.
Quade
Eternal n00b
Posts: 44867
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97

Re: size of group makes a difference

Postby wonderwhy » Sat May 28, 2011 2:16 am

I normally don't open/display a group's header listing while updating, especially for a group that's updating. (But if a group has unprocessed .rv4s, as still happens on my PC, I open that group and the header processing restarts.) I select a section of groups and then update. If the groups in the section don't have many new headers, they are processed quickly, but if there are several groups with large amounts, the available RAM gradually disappears. It's okay if I do just one or two groups at a time. So I decided to go back to the "background" mode with its .rv4 holding files, which allows me to do some file downloads as well. I just have to remember to leave NB running.
wonderwhy
Occasional Contributor
Posts: 18
Joined: Tue Jan 04, 2011 12:49 pm

