rejetto forum

about a search function workload

rejetto · 12 · 5288

Offline rejetto

  • Administrator
  • Tireless poster
  • *****
    • Posts: 13523
    • View Profile
hi, when a search function is implemented, i will have to face the problem of it possibly being heavy for the computer.
it should not be heavy for those who have 1000 searchable files, but it may be for those who have 20,000 files.
consider that someone may try to overload your HFS with useless searches, slowing down your computer.

so, we should think how to avoid this problem.
since this problem does not affect people with few files, the solution may not even need to apply to them.
or, a possible limit in the "limits" menu (like an option to inhibit searching for X seconds after the last one) may be the way, but this would require the HFS admin to decide... an automatic solution would be better.

we will probably propose several options and solutions now, but in the end only the smallest essential set should be implemented. simpler is better.


Offline Foggy

  • Tireless poster
  • ****
    • Posts: 806
    • View Profile
Quote
i will have to face the problem of it possibly being heavy for the computer.
it should not be heavy for those who have 1000 searchable files, but it may be for those who have 20,000 files.
consider that someone may try to overload your HFS with useless searches, slowing down your computer.

Agreed, but unfortunately I have no ideas for possible solutions.

Edit: Is HFS multi-threaded? Multiple threads could help with the workload on some computers, but not all.
« Last Edit: August 21, 2007, 09:41:34 AM by Foggy »


Offline TSG

  • Operator
  • Tireless poster
  • *****
    • Posts: 1935
    • View Profile
    • RAWR-Designs
I agree HFS needs a search function. I ran a search for someone and found requests dating back to 2004, when people were already asking for the ability to search their files :D. I understand that it will put stress on the host system, but that is the case with any search engine...

But like Foggy says, I don't know entirely how the internals of HFS work, so I can't make any detailed suggestions.

My idea would be for HFS to cache everything in its VFS in a list. When a person makes a search, the host computer looks through this cached list and sends back the results in some way. HFS would rescan the VFS for new items every 2 hours or so *maybe modifiable*, to keep the list recent. The exception is anything placed directly into the VFS, which would be added to the list on entry. The only limit then would be... how fast the computer can search through a list of files...

Also, make it so a single IP address can only run one search at a time, so they cannot have multiple pages open searching for many things? Is this possible?
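A minimal sketch of the cached-list idea in Python, not HFS code. It assumes the VFS can be approximated by a real folder tree walked with `os.walk`; the class name `FileIndex` and its parameters are made up for illustration.

```python
import os
import time


class FileIndex:
    """In-memory list of file paths, rebuilt when it gets too old."""

    def __init__(self, root, max_age_seconds=2 * 60 * 60, clock=time.monotonic):
        self.root = root
        self.max_age_seconds = max_age_seconds  # "every 2 hours or so", modifiable
        self.clock = clock  # injectable so the rescan timing can be tested
        self.paths = []
        self.built_at = None  # None means the list was never built

    def rescan(self):
        """Walk the folder tree once and remember every file path."""
        found = []
        for folder, _subfolders, files in os.walk(self.root):
            for name in files:
                found.append(os.path.join(folder, name))
        self.paths = found
        self.built_at = self.clock()

    def add(self, path):
        """Register a newly shared item right away, without a full rescan."""
        self.paths.append(path)

    def search(self, text):
        """Return cached paths whose file name contains `text`, ignoring case."""
        too_old = (
            self.built_at is None
            or self.clock() - self.built_at > self.max_age_seconds
        )
        if too_old:
            self.rescan()
        needle = text.lower()
        return [p for p in self.paths if needle in os.path.basename(p).lower()]
```

With this layout a search never touches the disk unless the list has expired, so the cost per search is one pass over a list in memory.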
« Last Edit: August 21, 2007, 10:48:23 AM by That_Stevens_Guy »


Offline Foggy

TSG's idea sounds good, and limiting each IP to one search at a time is also good.


Offline rejetto

Quote
Edit: Is HFS multi-threaded? Multiple threads could help with the workload on some computers, but not all.

it is not.
but it would not help.

Quote
This list will rescan the VFS for new items every 2 hours or so *maybe modifiable*.

i don't think everyone would be happy with it not always being up to date.

Quote
Also, make it so a single IP address can only run one search at a time, so they cannot have multiple pages open searching for many things? Is this possible?

consider that, HFS not being multi-threaded, only one search at a time will run, just like the recursive listing.
when another search is issued, the previous one is paused. (it may sound strange, but that's also how the listing works).

indeed i'm not talking about having 10 searches at a time. that's not the problem i'm addressing. only 1 search at a time would ever take place, but it will keep your computer very busy nevertheless.
most people here don't run HFS on a dedicated computer but on their own workstation, so we should try not to overload it.
try searching your whole hard disk, and meanwhile you will see how much your computer slows down.


Offline rejetto

i guess a very good automatic way would be for HFS to detect hard disk activity, with an option (enabled by default) that inhibits searches while there is a lot of HD activity.

i don't know how hard this detection is to implement. :/
and to be really effective, the activity should be compared to the HD's top speed... which would have to be measured too, oh my god! though we may take 40MB/s as an average value.


Offline MarkV

  • Tireless poster
  • ****
    • Posts: 764
    • View Profile
When someone (determined by IP) does many searches (possible DoS attack), then after a while, artificially slow him down.
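A rough Python sketch of such a slowdown, only to make the idea concrete. The thresholds (5 free searches per 60 seconds, 2 extra seconds per search beyond that) and the name `SearchThrottle` are assumptions, not values proposed in this thread.

```python
import time
from collections import defaultdict, deque


class SearchThrottle:
    """Tells the server how long to delay a search coming from a given IP."""

    def __init__(self, free_searches=5, window_seconds=60.0,
                 delay_step=2.0, clock=time.monotonic):
        self.free_searches = free_searches    # searches allowed without penalty
        self.window_seconds = window_seconds  # how far back searches are counted
        self.delay_step = delay_step          # extra seconds per search over the limit
        self.clock = clock
        self.recent = defaultdict(deque)      # ip -> times of recent searches

    def delay_for(self, ip):
        """Record one search from `ip` and return the delay to impose, in seconds."""
        now = self.clock()
        times = self.recent[ip]
        # Forget searches that fell out of the counting window.
        while times and now - times[0] > self.window_seconds:
            times.popleft()
        times.append(now)
        excess = len(times) - self.free_searches
        return max(0, excess) * self.delay_step
```

The delay grows with each extra search in the window and drops back to zero once the address stays quiet for a while, so normal users are never slowed down.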
http://worldipv6launch.org - The world is different now.


Offline rejetto

how many?
consider that searches will be sequential, not concurrent.

maybe we will have a clearer view when the function is available.


Offline MarkV

Quote
how many?

I think you are right, the exact limits should be adjusted later based on experience in beta.


Offline yhm_7

  • Occasional poster
  • *
    • Posts: 55
    • View Profile
Quote
how many?
consider that searches will be sequential, not concurrent.

maybe we will have a clearer view when the function is available.

We can study how some BBSes implement their search function.
The admin has the right to set the minimum interval between two searches from the same IP.

e.g. set the interval to 15 seconds.
A search will be refused if less than 15s have passed since the previous search from that IP.
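A minimal Python sketch of this per-IP interval check. The name `MinIntervalLimiter` is hypothetical, and measuring the interval from the last accepted search (refused searches do not restart the timer) is an assumption.

```python
import time


class MinIntervalLimiter:
    """Refuses a search if the same IP searched too recently."""

    def __init__(self, min_interval_seconds=15.0, clock=time.monotonic):
        self.min_interval_seconds = min_interval_seconds  # set by the admin
        self.clock = clock
        self.last_search = {}  # ip -> time of the last accepted search

    def allow(self, ip):
        """Return True and record the search, or False if it comes too soon."""
        now = self.clock()
        last = self.last_search.get(ip)
        if last is not None and now - last < self.min_interval_seconds:
            return False
        self.last_search[ip] = now
        return True
```

Because a refused request does not update the timestamp, someone hammering the search page still gets one search through every 15 seconds instead of locking themselves out forever.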
« Last Edit: August 22, 2007, 07:35:20 AM by yhm_7 »


Offline rejetto

i may count the time spent searching, and it could be limited to 1 minute every 10 minutes, per address.
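a possible sketch of that time budget in Python, just to show the bookkeeping. it assumes a sliding 10-minute window and that each search's duration is recorded when it finishes; `SearchTimeBudget` and its method names are made up for the example.

```python
import time
from collections import defaultdict, deque


class SearchTimeBudget:
    """Tracks search time per address: e.g. 60 s allowed in any 600 s window."""

    def __init__(self, budget_seconds=60.0, window_seconds=600.0,
                 clock=time.monotonic):
        self.budget_seconds = budget_seconds
        self.window_seconds = window_seconds
        self.clock = clock
        self.spent = defaultdict(deque)  # ip -> deque of (finished_at, duration)

    def _used(self, ip):
        """Sum the search time this address spent inside the current window."""
        now = self.clock()
        entries = self.spent[ip]
        # Drop searches that finished before the window started.
        while entries and now - entries[0][0] > self.window_seconds:
            entries.popleft()
        return sum(duration for _finished_at, duration in entries)

    def remaining(self, ip):
        """Seconds of searching this address may still use right now."""
        return max(0.0, self.budget_seconds - self._used(ip))

    def record(self, ip, duration):
        """Call when a search ends, with the seconds it took."""
        self.spent[ip].append((self.clock(), duration))
```

a search would be started only if `remaining(ip)` is above zero, and could be cut off once it has run for that many seconds.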


Offline maverick

  • Tireless poster
  • ****
    • Posts: 1052
  • Computer Solutions
    • View Profile
I agree a search function is needed for those who currently don't have that feature built into their sites.

However, don't forget that some of us already use a commercial search engine. For that reason, make sure that HFS's built-in search function can be turned off, or is off by default.
maverick