Any way to speed up the Import Thread?

Technical support and discussion of Newsbin Version 6 series.

Postby ack9 » Wed Jul 26, 2017 1:11 pm

I am reloading a few groups, grabbing all headers. The header download is running at around 3 MB/s, and the Import folder is filling with GZ files waiting to be processed. However, the header download is outpacing the Import Thread, so I have a backlog of over 1,000 GZ files right now.

Windows 7 Pro 64-bit
3 CPUs allocated to the VM
6 GB memory
CPU load never goes over 42%
My drive is an SSD and is not showing any significant I/O (it's actually quite low at <2 MB/s, while the drive has no problem writing at 500 MB/s)

Any special settings I can add to the NBI?

At the moment, the only way I can see to start reducing the Import folder backlog is to limit the network speed to the point where the Import Thread processes files faster than the header download creates them...
ack9
Active Participant
Posts: 69
Joined: Thu Nov 11, 2004 3:15 pm

Registered Newsbin User since: 11/11/04

Re: Any way to speed up the Import Thread?

Postby ack9 » Thu Jul 27, 2017 1:10 am

Header download is filling up the filesystem too quickly so for now I've cancelled it and paused the queue to let the DB import catch up. On average, it is processing a 27MB GZ file in 45-50 seconds. Ugh.

Would love it if there was a secret setting to get back the blazing import speeds of older eras of Newsbin....

Re: Any way to speed up the Import Thread?

Postby Quade » Thu Jul 27, 2017 1:52 am

A 27 MB GZ file is about 270 MB of actual text that has to be parsed out and fed into a DB. If you take the size of the Import folder and multiply by 10, that's the actual amount of data that needs to be processed.
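To put a number on that for a given Import folder, here is a minimal sketch that totals the GZ files and applies the multiply-by-10 rule of thumb. The function name and the folder path in the usage note are hypothetical, the 10x factor is only the estimate given above (not a measured ratio), and it assumes the files carry a .gz extension in any letter case.

```python
from pathlib import Path

# Rule-of-thumb expansion factor from the post above; not a measured value.
EXPANSION_FACTOR = 10


def estimate_import_backlog(import_dir, factor=EXPANSION_FACTOR):
    """Return (file_count, compressed_bytes, estimated_uncompressed_bytes).

    Only files directly inside import_dir whose extension is .gz
    (case-insensitive) are counted.
    """
    sizes = [
        p.stat().st_size
        for p in Path(import_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".gz"
    ]
    compressed = sum(sizes)
    return len(sizes), compressed, compressed * factor
```

Calling `estimate_import_backlog(r"C:\path\to\Import")` (path is a placeholder) returns the file count, the compressed total, and the estimated amount of text the import still has to parse.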

Depending on the group it can be an intensive process. It's one of the reasons some groups are just not really usable with headers anymore. I mean, they are if you have a beast of a machine and SSDs. Less so when you have lesser hardware. In my experience, VMs seem to cripple disk performance too.

ack9 wrote:Would love it if there was a secret setting to get back those blazing import speed of older eras of Newsbin....


In the old days they were just stored as flat files. So basically the GZ file was the headers. Usenet was much smaller back then though.
Quade
Eternal n00b
Posts: 44867
Joined: Sat May 19, 2001 12:41 am
Location: Virginia, US

Registered Newsbin User since: 10/24/97

Re: Any way to speed up the Import Thread?

Postby dexter » Thu Jul 27, 2017 10:05 am

If you turn off XFeatures, headers will come down slower and you won't generate as much of a backlog. It's under Options -> Servers; click the "Really Advanced" button.

Quade is right though, the volume of data being processed is far higher than in the old days. The groups have more traffic and larger files.

An alternative is to let us download the headers for you and use our built-in search index. You can even browse each group using a wildcard search if you aren't looking for something specific. To try it out, click the Search tab, enter either an asterisk or a search word, enter a group or select a group folder in the "Search in Group" field, then hit Enter or click the magnifying glass icon. Searching is free for everyone, but you need a subscription in order to queue search results for download. You can sign up by logging in to your account from within Newsbin: click the Help menu option, then select "Search Control Panel".
dexter
Site Admin
Posts: 9511
Joined: Fri May 18, 2001 3:50 pm
Location: Northern Virginia, US

Registered Newsbin User since: 10/24/97

Re: Any way to speed up the Import Thread?

Postby ack9 » Thu Jul 27, 2017 4:46 pm

dexter wrote:If you turn off XFeatures, headers will come down slower and you won't generate as much of a backlog. It's under Options -> Servers; click the "Really Advanced" button.


I can't see any difference from the setting, but looking at the NBI led me to how I think I would turn it off. Isn't it trying to use XFeatures by default (regardless of whether you've ever gone into that settings area), with the checkbox just being a 'force' mode?

In the NBI, there are two variables that seem to be opposites of each other.
DisableXFeatures=0
EnableXFeatures=0

So I'll give DisableXFeatures=1 a try for the heck of it.
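
For checking what those two keys are currently set to without editing anything, here is a small read-only sketch. It assumes the .nbi file is plain text made of key=value lines, which is a guess based only on the two lines quoted above; the function name is hypothetical, and the sample text stands in for the real file contents.

```python
def read_settings(text, keys):
    """Return {key: value} for the requested keys found in key=value text.

    Matching is case-insensitive. Blank lines, lines without '=', and
    lines starting with ';', '#', or '[' are skipped.
    """
    wanted = {k.lower(): k for k in keys}
    found = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith((";", "#", "[")) or "=" not in line:
            continue
        name, _, value = line.partition("=")
        key = wanted.get(name.strip().lower())
        if key is not None:
            found[key] = value.strip()
    return found


# Sample text mirroring the two lines quoted above.
sample = "DisableXFeatures=0\nEnableXFeatures=0\n"
settings = read_settings(sample, {"DisableXFeatures", "EnableXFeatures"})
print(settings)
```

It only reads values, so it cannot leave the file in a state the GUI does not expect.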

Re: Any way to speed up the Import Thread?

Postby dexter » Thu Jul 27, 2017 5:24 pm

You should do it from the GUI. One of the .nbi settings is legacy. Quade made a change so XFeatures is off by default now, because so many security software packages puke on the binary data coming through NNTP when XFeatures is enabled.

Re: Any way to speed up the Import Thread?

Postby ack9 » Thu Jul 27, 2017 5:36 pm

Quade wrote:A 27 MB GZ file is about 270 MB of actual text that has to be parsed out and fed into a DB. If you take the size of the Import folder and multiply by 10, that's the actual amount of data that needs to be processed.

Depending on the group it can be an intensive process. It's one of the reasons some groups are just not really usable with headers anymore. I mean, they are if you have a beast of a machine and SSDs. Less so when you have lesser hardware. In my experience, VMs seem to cripple disk performance too.


I appreciate that the extracted GZ data is considerably larger. For context, I went ahead and used WinRAR to extract the remaining files (705 files, 1.6 GB of GZ) in the Import folder. That took a total of 2 minutes and 18 seconds and produced 17 GB.
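
As a rough sketch of that comparison, the arithmetic below uses only figures quoted in this thread: one 27 MB GZ file per 45-50 seconds for the import, and the WinRAR run above. It assumes 1 GB = 1000 MB and a constant import rate, and the two numbers are not like-for-like, since extraction only decompresses while the import also has to parse the text and insert it into the DB.

```python
# Figures quoted in this thread (1 GB treated as 1000 MB, an assumption).
GZ_FILE_MB = 27
IMPORT_SECONDS = (45, 50)      # observed import time per 27 MB GZ file
BACKLOG_GZ_MB = 1600           # 705 files, 1.6 GB compressed
EXTRACTED_MB = 17000           # 17 GB after extraction
WINRAR_SECONDS = 2 * 60 + 18   # 2 minutes 18 seconds

# Observed expansion ratio of this particular backlog.
expansion = EXTRACTED_MB / BACKLOG_GZ_MB

# Import rate in MB of compressed data per second (fast case, slow case).
import_rates = [GZ_FILE_MB / s for s in IMPORT_SECONDS]

# Time to import the whole backlog at those rates, in minutes.
backlog_minutes = [BACKLOG_GZ_MB / r / 60 for r in import_rates]

# Decompression-only rate achieved by the extraction run.
extract_rate = BACKLOG_GZ_MB / WINRAR_SECONDS

print(f"expansion ratio: {expansion:.1f}x")
print(f"import rate: {min(import_rates):.2f}-{max(import_rates):.2f} MB/s compressed")
print(f"backlog import time: {min(backlog_minutes):.0f}-{max(backlog_minutes):.0f} minutes")
print(f"extraction-only rate: {extract_rate:.1f} MB/s compressed")
```

On these figures the extraction step alone runs roughly 20 times faster than the full import, and the observed expansion is close to the 10x rule of thumb.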

I don't believe NB will accept those extracted files though... correct?

I still believe the takeaway is that my VM isn't being taxed at all. Both I/O and CPU usage from NewsbinPro64.exe are well below the VM's limits; disk I/O in particular is under 1% utilization. I can read and write at well over 100 MB/s at the 32K reads and 4K writes that NB is doing.

If turning off XFeatures doesn't work out better, I guess I'll have to log an enhancement request for Import Thread throttling improvements?

Re: Any way to speed up the Import Thread?

Postby ack9 » Thu Jul 27, 2017 8:03 pm

dexter wrote:You should do it from the GUI. One of the .nbi settings is legacy. Quade made a change so XFeatures is off by default now, because so many security software packages puke on the binary data coming through NNTP when XFeatures is enabled.


Ah, good to know. I guess I missed that change in the release notes. Apologies, and thanks for the clarification.

