rejetto forum

Software => HFS ~ HTTP File Server => Topic started by: rejetto on June 23, 2006, 06:28:48 AM

Title: Fingerprints support
Post by: rejetto on June 23, 2006, 06:28:48 AM
fingerprints were requested here:
http://www.rejetto.com/forum/viewtopic.php?p=1016814#1016814

how fingerprints are supported in HFS 2.1:
when you right-click on a file, HFS tries to load every .md5 file located in the same physical folder as the file, and extracts the hash from there.

let's say the file is test.txt:
if test.txt.md5 exists, it has higher priority than the other .md5 files.

if a file "test.md5" contains a line with just a hash and no filename, then "test" is assumed as the filename.
so test.txt.md5 can contain just the hash, and that's fine.
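
The parsing rule described above can be sketched roughly like this. This is only an illustration in Python, not HFS code: the function name `parse_md5_text` is made up, and the assumed line format is the usual md5sum output (a 32-digit hex hash, optionally followed by a filename that may be prefixed with `*`).

```python
import re

# Hypothetical helper, not taken from HFS: one md5sum-style line is a
# 32-digit hex hash, optionally followed by whitespace and a filename.
_LINE = re.compile(r"^([0-9a-fA-F]{32})(?:\s+\*?(.+))?$")


def parse_md5_text(text, md5_filename):
    """Return {filename: hash} for the lines of one .md5 file."""
    # A line with no filename refers to the .md5 file's own name minus ".md5",
    # so "test.md5" implies "test" and "test.txt.md5" implies "test.txt".
    if md5_filename.lower().endswith(".md5"):
        default_name = md5_filename[:-4]
    else:
        default_name = md5_filename
    hashes = {}
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if not match:
            continue  # skip blank lines and anything that is not a hash line
        digest, name = match.groups()
        hashes[name or default_name] = digest.lower()
    return hashes
```

With this sketch, a file "test.txt.md5" holding only a hash yields an entry for "test.txt", while a folder-wide file such as "album.md5" yields one entry per listed filename.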

when browsing a real folder page, *.md5 files are loaded.
when browsing a virtual folder, if that has been created from a physical folder (not by clicking on "new folder"), then *.md5 files are loaded from that physical folder.

copy url with fingerprint is also supported for multiple selection.

did i miss something?
Title: Fingerprints support
Post by: ledufe on June 23, 2006, 02:29:12 PM
step 1

in the main menu, do this:

(http://img69.imageshack.us/img69/4016/md51enabling9nu.th.jpg) (http://img69.imageshack.us/my.php?image=md51enabling9nu.jpg)

now download the free md5 creator from here

http://www.etree.org/cgi-bin/counter.cgi/software/md5sum.exe

save it in the right folder:

# Windows 95/98/Me: download md5sum.exe to c:\windows\command

# Windows NT/2000/XP: download md5sum.exe to c:\winnt\system32 (c:\windows\system32 on a default XP install)


to create an md5 file, make a .bat file or type this in a command prompt window:


 
Code:
md5sum [file-name.extension] > [filename].md5

NOTE: You must insert the name of the .md5 file [without the brackets]. Example:

Code:
md5sum *.mp3 > nightwish-once.md5


(http://img157.imageshack.us/img157/6203/md52creating4pl.th.jpg) (http://img157.imageshack.us/my.php?image=md52creating4pl.jpg)


then just copy the url with fingerprint (md5) by right-clicking on the file in the VFS tree, like this
(http://img235.imageshack.us/img235/3709/md53copyurl4cb.th.jpg) (http://img235.imageshack.us/my.php?image=md53copyurl4cb.jpg)
and paste it into any chat window or web link


so
the file added is

Openwave_SDK_622.exe

its url in my VFS is

http://ie-ipanema.no-ip.info:2222/file-with-md5/Openwave_SDK_622.exe

and its url with md5 is

http://ie-ipanema.no-ip.info:2222/file-with-md5/Openwave_SDK_622.exe#!md5!05c7c8417a1b71af69a0ae307f6e93ca
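
Judging from the example above, the link fingerprint is just the normal url plus a `#!md5!<hash>` fragment. A minimal Python sketch of building such a link follows; the function name `url_with_fingerprint` is invented for illustration and this is not how HFS itself builds the url.

```python
from urllib.parse import quote


def url_with_fingerprint(folder_url, filename, md5_hex):
    """Join folder url and file name, then append a '#!md5!<hash>' fragment."""
    hex_digits = "0123456789abcdefABCDEF"
    if len(md5_hex) != 32 or any(c not in hex_digits for c in md5_hex):
        raise ValueError("expected a 32-digit hexadecimal MD5 hash")
    # quote() percent-encodes characters that are not safe in a url path.
    return f"{folder_url.rstrip('/')}/{quote(filename)}#!md5!{md5_hex.lower()}"
```

Because the hash sits in the fragment part of the url, a plain browser does not send it to the server; only a client that understands link fingerprints makes use of it.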

hope it helps
Title: Fingerprints support
Post by: ants on June 24, 2006, 08:12:20 AM
Can someone please explain to me what exactly md5 is and what it is meant to do?

Thank you.
Title: Fingerprints support
Post by: rejetto on June 24, 2006, 10:30:46 AM
from http://en.wikipedia.org/wiki/Md5
Quote

In cryptography, MD5 (Message-Digest algorithm 5) is a widely-used cryptographic hash function with a 128-bit hash value. As an Internet standard (RFC 1321), MD5 has been employed in a wide variety of security applications, and is also commonly used to check the integrity of files.
Title: Fingerprints support
Post by: rejetto on June 25, 2006, 09:10:48 AM
i didn't forget... it is irrelevant since we are not using MD5 for security but as a checksum
Title: Fingerprints support
Post by: Anonymous on July 02, 2006, 01:24:32 PM
It's great to see that my suggestion was taken seriously and also implemented. I think the fingerprints feature of HFS will be of help to a lot of people.

I have to say, great work rejetto! You truly do listen to people's suggestions (if i come to think of it, you implemented all my suggestions I had for HFS 2.1!)

I'll download the new beta and post back any feedback. Thanks again :)
Title: Fingerprints support
Post by: mastabog on July 02, 2006, 01:37:06 PM
That was me in the message above.

I've tested the new beta with support for link fingerprints. Thanks again for implementing it.

However, I was talking about a full support for link fingerprints, where HFS would compute the MD5/SHA1 hashes on the fly when instructed to copy the link with MD5/SHA1 fingerprint. Maybe you are implementing this in the future beta version, I'm not sure.

Using a 3rd party tool like md5sum to create md5 files and place them next to the shared files is ok but it is a cumbersome job nonetheless. Most users would probably avoid using it as it involves a lot of manual work for each file.

It would be a whole lot better if the "Copy URL with fingerprint" was always visible and when clicked, HFS would parse the file in question and compute the MD5 hash, adding it to the URL and forming the full link with fingerprint. This way, everything can be done in one simple click without needing to run 3rd party tools and creating additional files. I am sorry if I have not been clear regarding this in my initial request. Let me know if I was clear now.

MD5 is a public algorithm so you will easily find functions or libraries for Delphi (if HFS is coded in Delphi).

Thanks
Title: Fingerprints support
Post by: rejetto on July 03, 2006, 01:12:08 AM
it takes too much time, it can't be done on-the-fly.
most files swapped with HFS are big (10-100-1000 MB), and any fingerprint requires reading the whole file first.
all i can do is to add a command to create md5 files.
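
To show why the whole file has to be read, here is a small Python sketch of hashing a file in chunks with the standard `hashlib` module. The function name `md5_of_file` and the 1 MB chunk size are arbitrary choices for this example, not anything from HFS.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of a file, reading it chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Every byte of the file passes through update(), so the running
        # time grows with file size and is bounded by disk read speed.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use constant, but it does not shorten the time needed to pull a big file off the disk.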
Title: Fingerprints support
Post by: maverick on July 03, 2006, 02:44:05 AM
Quote from: "rejetto"
all i can do is to add a command to create md5 files.

That might be a good idea.  Would there be a way to create md5 files in batch mode?  One at a time is a pain in the a$$.
Title: Fingerprints support
Post by: mastabog on July 03, 2006, 03:15:19 AM
Quote from: "rejetto"
it takes too much time, it can't be done on-the-fly.
most files swapped with HFS are big (10-100-1000 MB), and any fingerprint requires reading the whole file first.
all i can do is to add a command to create md5 files.


Well, I can only say it again :). The MD5 hash should be computed only when the user clicks the "copy URL with md5 hash" and not whenever a file is added into HFS or when its URL is copied. Hashing a file takes the same amount of time whether it is being done by HFS or by an external tool - it makes no difference.

So, once more, it is on-demand only and not whenever a file is added in HFS. When I said "on-the-fly" I meant you need only 1 click of a mouse. Only when the user selects the copy url with md5 hash from the context menu (you could even rename it to "compute md5 hash and copy url" to make it clear). You can even remember the MD5 hash if the file modification date and size have not changed ...

Using an external tool to hash the file seems like too much manual work for this feature to be useful or attractive, kind of defeats its purpose.
Title: Fingerprints support
Post by: rejetto on July 03, 2006, 05:44:01 AM
Quote from: "maverick"
Quote from: "rejetto"
all i can do is to add a command to create md5 files.

That might be a good idea.  Would there be a way to create md5 files in batch mode?  One at a time is a pain in the a$$.

i guess it will work with multiselection
Title: Fingerprints support
Post by: rejetto on July 03, 2006, 05:49:07 AM
Quote from: "mastabog"
Well, I can only say it again :). The MD5 hash should be computed only when the user clicks the "copy URL with md5 hash"

so, clicking on it would mean to
1. if loaded md5 is older than file or doesn't exist, create md5
2. copy url

this can take minutes. i should display a dialog warning the user about the long wait.
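
Step 1 above can be sketched in Python as follows. This is an illustration under assumptions, not HFS code: the name `ensure_md5_file` is made up, the hash is stored next to the file as `<name>.md5`, and the line is written in the md5sum binary-mode style (`hash *filename`).

```python
import hashlib
import os


def ensure_md5_file(path):
    """Create or refresh '<path>.md5' if missing or older than the file,
    then return the hash stored in it."""
    md5_path = path + ".md5"
    stale = (not os.path.exists(md5_path)
             or os.path.getmtime(md5_path) < os.path.getmtime(path))
    if stale:
        # This is the slow part: the whole file is read from disk.
        digest = hashlib.md5()
        with open(path, "rb") as handle:
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        with open(md5_path, "w") as out:
            out.write(f"{digest.hexdigest()} *{os.path.basename(path)}\n")
    with open(md5_path) as src:
        return src.read().split()[0]
```

The returned hash is what step 2 would append to the url before copying it.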
Title: Fingerprints support
Post by: mastabog on July 03, 2006, 11:17:14 AM
Quote from: "rejetto"
Quote from: "mastabog"
Well, I can only say it again :). The MD5 hash should be computed only when the user clicks the "copy URL with md5 hash"

so, clicking on it would mean to
1. if loaded md5 is older than file or doesn't exist, create md5
2. copy url

this can take minutes. i should display a dialog warning the user about the long wait.


Exactly, and I think I got you right :)

That way you don't need to rely on external tools or external files (an md5 hash is a string only 32 characters long and can be stored in HFS' memory). And certainly, if the user is hashing a big file (e.g. 100 MB) then a small warning box should be displayed. For the time being you could use a simple message box saying something like "Hashing file, this can take a while ...". A progress bar would be nice of course, but not critical at all (after all, md5sum.exe doesn't say anything until it finishes :P)

You can also dump the md5 hash into a file and use the file for future uses. You can also add an option to force re-hashing the file in case the file has changed but there is an older md5 file sitting there.

Thanks again
Title: Fingerprints support
Post by: maverick on July 03, 2006, 03:30:51 PM
I just don't understand it.  You guys are telling us that it can take quite a bit of time to get the md5 fingerprint of a large file.

I just tested a 140 MB and a 170 MB rar archive and I got the md5 fingerprint in much less than a second using md5sum.  I also tested with another md5 utility with the same results and got the same fingerprint.  Just tested on a 300 MB rar archive and it took about 4 secs.  Just tested on a 750 MB avi movie file.  It took about 10 secs, but that's still nowhere near a lot of time.  I can't see any reason why HFS can't do it for selected files (not every file added to HFS).  For the very, very big files (>1 GB) there should be a time warning/comment indicating that it may take a little time to get the fingerprint for these files (if anyone is sharing such big files).

Am I missing something?
Title: Fingerprints support
Post by: mastabog on July 03, 2006, 03:37:57 PM
Quote from: "maverick"
I just don't understand it.  You guys are telling us that it can take quite a bit of time to get the md5 fingerprint of a large file.

I just tested a 140 MB rar archive and I got the md5 fingerprint in much less than a second using md5sum.  I also tested with another md5 utility with the same results.  I can't see any reason why HFS can't do it on-the-fly.

Am I missing something?


Well, if you read my message I said HFS could (and should) do it by itself without reading external files generated by 3rd party tools.

however, maybe you have a super machine and/or super HDD :) but for a 700 MB file on my P4 HT 2.4 GHz the MD5 hashing takes about 10 seconds. For slower machines it might take more.

Regardless of that, it's not a good idea to compute the MD5 hash on-the-fly for all files that you add into HFS as it may generate too much hdd and cpu activity. In my opinion, the best solution would be to have a global option (disabled by default) that computes MD5 hashes of all files when added to HFS and an entry in the context menu of each file where the user can copy the URL with the MD5 hash and that would instruct HFS to compute the md5 hash when clicked.

Cheers
Title: Fingerprints support
Post by: maverick on July 03, 2006, 07:56:23 PM
Quote from: "mastabog"
maybe you have a super machine and/or super HDD :) but for a 700 MB file on my P4 HT 2.4 GHz the MD5 hashing takes about 10 seconds. For slower machines it might take more.

A super machine, far from it.  Actually those tests were done on a P3 900 MHz.  I'll do more testing on a faster system when I get home.  (btw that post of mine above was edited a few times with new results as I finished additional tests).

Regardless, it's up to rejetto to decide on what changes he would like to make, if anything.
Title: Fingerprints support
Post by: mastabog on July 03, 2006, 08:50:48 PM
Quote from: "maverick"
Quote from: "mastabog"
maybe you have a super machine and/or super HDD :) but for a 700 MB file on my P4 HT 2.4 GHz the MD5 hashing takes about 10 seconds. For slower machines it might take more.

A super machine, far from it.  Actually those tests were done on a P3 900 MHz.  I'll do more testing on a faster system when I get home.  (btw that post of mine above was edited a few times with new results as I finished additional tests).

Regardless, it's up to rejetto to decide on what changes he would like to make, if anything.


Read your edited post and yeah, HFS could automatically compute the MD5 hashes for small files (e.g. less than 32 MB ). Usually people check bigger files against MD5 hashes but your idea makes sense nonetheless.

This could be another global option in HFS - automatically compute MD5 hashes for files smaller than <user editable value here> MB. That would be really neat! :)
Title: Fingerprints support
Post by: rejetto on July 04, 2006, 11:35:25 AM
it's not a matter of CPU, md5 is designed to be fast
the only bottleneck is hard disk speed.
on my laptop, most of times my hard disk stays under 10MB/s.
laptops have slow HDs.
i just tested md5sum on a 700MB file, and it took 1 minute, with speed varying from 8MB/s to 14MB/s.
i made the test while the hard disk was not busy.
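
The numbers in this post fit together: 700 MB in about a minute is roughly 11.7 MB/s, which sits inside the 8 to 14 MB/s range observed. A tiny Python sketch of that estimate, assuming the disk is the only bottleneck (the function name `estimate_hash_seconds` is made up for this example):

```python
def estimate_hash_seconds(size_mb, disk_mb_per_s):
    """Rough time to hash a file when disk read speed is the only limit."""
    if disk_mb_per_s <= 0:
        raise ValueError("disk speed must be positive")
    return size_mb / disk_mb_per_s


# Estimates for a 700 MB file at the speeds mentioned in the post.
for speed in (8, 10, 14):
    print(f"{speed} MB/s -> {estimate_hash_seconds(700, speed):.1f} s")
```

The estimate ignores CPU time and disk caching, so real timings can differ in either direction.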
Title: Fingerprints support
Post by: Anonymous on July 04, 2006, 03:23:22 PM
Quote from: "rejetto"
it's not a matter of CPU, md5 is designed to be fast
the only bottleneck is hard disk speed.
on my laptop, most of times my hard disk stays under 10MB/s.
laptops have slow HDs.
i just tested md5sum on a 700MB file, and it took 1 minute, with speed varying from 8MB/s to 14MB/s.
i made the test while the hard disk was not busy.


It is fast, of course; it is just another checksum-style algorithm, like CRC. However, what I said is that it is CPU intensive (a lot of CPU activity), meaning that during hashing the CPU usage of the hashing process can reach almost 100% (you can verify that with task manager).

That is not desired as it slows the other processes down hence my opinion against making HFS compute md5 hashes on the fly for all files and making it either a global strict option or a global option where the user can input a maximum file size for which the MD5 hashes should be computed automatically when added to HFS.
Title: Fingerprints support
Post by: mastabog on July 04, 2006, 03:25:28 PM
Offtopic

Ack! :(  the board logged me out again. It was me in the post above. Have you set up the board so it expires the cookies after a number of days? My cookie for this board seems to expire after some interval ...
Title: Fingerprints support
Post by: rejetto on July 04, 2006, 04:23:49 PM
i think the problem is about the change of domain (www.rejetto.com -> rejetto.com).
i have problems accessing admin panel ATM.

anyway, i don't think it is a good idea to have this feature that calculates MD5 at addition.
we can't have 2 commands, one that saves to disk and one that keeps the MD5 in memory: concerning the GUI it would be too intrusive.
Title: Fingerprints support
Post by: mastabog on July 05, 2006, 09:40:58 AM
Quote from: "rejetto"
we can't have 2 commands, one that saves to disk and one that keeps the MD5 in memory: concerning the GUI it would be too intrusive.

Well, why would you want to implement two commands? :) You would either save to file or to memory, but not both. I would say to save the md5 hash to file as it will be there later at subsequent launches of HFS and it wouldn't need to be computed again, unless instructed to by the user.

Quote from: "rejetto"
anyway, i don't think it is a good idea to have this feature that calculates MD5 at addition.

Please allow me to differ, I think it's a great idea to have the hashes computed at addition time *provided* that there is a global option in the menu to limit the maximum file size for which the md5 hash should be automatically computed on addition. It could be by default set to a very low value (e.g. 4 MB) and the user can adjust it to his liking or enable/disable it for all files, regardless of their size.

Your call, as always, but I think it would make a great addition.
Title: Re: Fingerprints support
Post by: CuriousGuest on November 07, 2006, 01:36:24 AM
Sorry, but I'm interested in the advantages of the fingerprint feature...
Can you give me an example plz...
Title: Re: Fingerprints support
Post by: rejetto on November 07, 2006, 02:00:31 AM
it is a way to automatically check the integrity of the file you are downloading.
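
As an example of that integrity check, a download client that understands link fingerprints could do something like the Python sketch below. The function names are invented for illustration, and the `#!md5!<hash>` fragment format is taken from the example url earlier in the thread.

```python
import hashlib
from urllib.parse import urlsplit


def fingerprint_from_url(url):
    """Return the MD5 from a '#!md5!<hash>' fragment, or None if absent."""
    fragment = urlsplit(url).fragment
    prefix = "!md5!"
    if fragment.startswith(prefix):
        return fragment[len(prefix):].lower()
    return None


def matches_fingerprint(data, url):
    """Check downloaded bytes against the fingerprint carried by the url."""
    expected = fingerprint_from_url(url)
    if expected is None:
        raise ValueError("url carries no md5 fingerprint")
    return hashlib.md5(data).hexdigest() == expected
```

If the check fails, the file was corrupted or changed somewhere between the server and the client, and the download should be repeated.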