rejetto forum

[Question] How to download and save an external file (Macro)



Offline LeoNeeson

@Rejetto: There is no hurry for me, take all the time you need before publishing a new stable version (I also confirm that the 'remote upload' feature works perfectly now on this new build). :)

@Mars: Your code posted here works fine with local files (and, with small changes to the code, also with external files).

» EDIT: I'm currently doing several tests...
(I'll post the results in the next few days).
;)
« Last Edit: May 24, 2018, 10:12:52 AM by LeoNeeson »


Offline LeoNeeson

@Mars: After doing several tests, everything seems to be working OK, but I've found a small, rare bug/issue: if the connection gets closed (or dropped) by the remote server (or due to poor connection quality) while downloading a chunk, the result is a corrupted file (I ended up with a corrupted 15.7 MB file instead of the expected 12.6 MB). It only happened once, and I couldn't reproduce the error afterwards. I don't know if there is a workaround or a change to the code that would prevent bugs like this from happening again, but I'll leave you the logs of my tests:

GOOD download (normal log):
Quote
10:25:05 127.0.0.1:1281 Requested GET /
10:25:13 127.0.0.1:1281 loading from 'http://distro.ibiblio.org/tinycorelinux/9.x/x86/release/Core-9.0.iso' 13256704 bytes
10:25:34 127.0.0.1:1281 chunk=10000000 from=0 size=10000000
10:26:08 127.0.0.1:1281 chunk=3256704 from=10000000 size=3256704
10:26:09 127.0.0.1:1281 saved file contain 13256704 bytes
10:26:09 127.0.0.1:1281 Requested POST /

BAD download (issue log):
(The comments after '>' were added by me to mark the abnormal execution and describe the situation)
Quote
10:36:05 127.0.0.1:1257 Requested GET /
10:36:17 127.0.0.1:1257 loading from 'http://distro.ibiblio.org/tinycorelinux/9.x/x86/release/Core-9.0.iso' 13256704 bytes
10:36:53 127.0.0.1:1257 chunk=10000000 from=0 size=10000000 > Connection dropped!
10:37:16 127.0.0.1:1269 loading from 'http://distro.ibiblio.org/tinycorelinux/9.x/x86/release/Core-9.0.iso' 13256704 bytes
10:37:38 127.0.0.1:1269 chunk=10000000 from=0 size=10000000
10:37:47 127.0.0.1:1269 chunk=3256704 from=10000000 size=3256704
10:37:48 127.0.0.1:1269 saved file contain 13256704 bytes
10:37:48 127.0.0.1:1269 Requested POST /
10:37:48 127.0.0.1:1257 chunk=3256704 from=10000000 size=3256704 > WTF!?
10:37:49 127.0.0.1:1257 saved file contain 13256704 bytes > Not true, the file was bigger (exactly 16513408 bytes)!
10:37:49 127.0.0.1:1257 Requested POST /

Code used (RemoteUpload3.tpl):
Quote
<form method='post'>
URL: <input name='url' value="http://distro.ibiblio.org/tinycorelinux/9.x/x86/release/Core-9.0.iso">
<br>Filename: <input name='dest' value="%folder%Core-9.0.iso">
<br><input type='submit'>
</form>
{.set|url|{.postvar|url.}.}
{.break|if={.not|{.^url.}.}.}
{.set|dest|{.or|{.filename|{.postvar|dest.}.}|{.filename|{.^url.}.}|downloaded.}.}
{.delete|{.^dest.}.}
{.add to log| loading from '{.^url.}' {.filesize|{.^url.}.} bytes .}
{.break|if={.not|{.filesize|{.^url.}.}.}|result=Source file can't be downloaded: server returned null size.}
{.set|from|0.}
{.save|{.^dest.}|.}
{.comment| define CHUNK with the min size, if null then WHILE is never executed.}
{.set|chunk|{.min|{.filesize|{.^url.}.}|10000000.}.}
{.while|chunk|{:
   {.load|{.^url.}|from={.^from.}|var=data|size={.^chunk.}.}
   {.add to log| chunk={.^chunk.} from={.^from.} size={.length|var=data.}.}
   {.if|{.length|var=data.}
      |   {:
         {.append|{.^dest.}|var=data.}{.inc|from|{.length|var=data.}.}
{.comment| redefine CHUNK with the min size, if null then WHILE is stopped.}
         {.set|chunk|{.min|{.^chunk.}|{.sub|{.filesize|{.^url.}.}|{.^from.}/sub.}.}.}
         :}
      | {:{.set|chunk|0.}:}
   /if.}:}
   |timeout=0
   |else={:{.add to log|Error during WHILE.}
   :}/while.}
{.add to log| saved file contain {.^from.} bytes.}

» Ideas for solutions/enhancements: I don't know if there is a way to 'detect' whether a connection was closed (or dropped) while downloading a chunk, but I think a simple verification of the file size after the download finishes could help. Also, perhaps a 'progress' bar could be useful while the file is downloading (if the download on the server takes too long).

I was trying to add a 'progress' bar, like the one HFS shows by default for uploads (useful if the download on the server takes too long), but using %progress-files%, %total% and %speed-kb% doesn't seem to have any effect.
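As a rough illustration of that size check, here is a hypothetical sketch (using only macros already seen in this thread), to be placed after the WHILE loop; '^url' and '^dest' are the variables from the script above:
Code:
{.comment| hypothetical post-download check: compare saved size against source size.}
{.if|{.=|{.filesize|{.^url.}.}|{.filesize|{.^dest.}.}.}
   | {:{.add to log| download OK: saved size matches source size.}:}
   | {:{.add to log| download corrupted: size mismatch, deleting '{.^dest.}'.}{.delete|{.^dest.}|bin=0.}:}
/if.}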
« Last Edit: May 27, 2018, 04:10:42 AM by LeoNeeson »


Offline Mars

The solution is to store the size of the source file in a temporary variable and, when the download is complete, check that it matches the saved file. If it does not, we force the physical deletion of the downloaded file, except when the source file is on an FTP server; in that case the interrupted download can be resumed later.

At the beginning of the script we test which protocol is used: if it is FTP, we resume the download from the last known position; otherwise the destination is simply deleted.

The FTP protocol is functional, but the download can only be done in full or resumed at the position where it was interrupted. Due to my limited knowledge of sockets, I could not set up partial loads as with HTTP.

* Bug corrected: the file was transferred to the Recycle Bin instead of being physically destroyed, which could have led to disk saturation.

There is a serious security hole in the use of this script: as it stands, it allows access to the entire hard disk hosting HFS.

The 'load' macro is intended to work with Windows-style paths such as c:\windows\explorer.exe.

So if a user of your HFS is given the opportunity to use the form, he will be able to retrieve any file from your hard disk by transferring it into HFS.

Moreover, the URL can be encoded using % followed by a hex code representing an ASCII character, e.g. c%3A%5Cwindows%5Cexplorer.exe.

It is therefore absolutely necessary to detect this and prevent the script from continuing if the URL contains a backslash.
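For clarity, this is the check that implements it in the script below, shown in isolation (it decodes any %-escaped characters first and aborts the script if a backslash is found):
Code:
{.break|if={.count substring|\|{.decodeuri|{.^url.}.}.}|result=Direct access on hard disk not allowed.}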


Quote
<form method='post'>
URL: <input name='url' value="http://distro.ibiblio.org/tinycorelinux/9.x/x86/release/Core-9.0.iso">
<br>Filename: <input name='dest' value="%folder%Core-9.0.iso">
<br><input type='submit'>
</form>

{.set|url|{.postvar|url.}.}

{.break|if={.not|{.^url.}.}.}

{.break|if={.count substring|\|{.decodeuri|{.^url.}.}.}|result=Direct access on hard disk not allowed.}

{.set|dest|{.or|{.filename|{.postvar|dest.}.}|{.filename|{.^url.}.}|downloaded.}.}

{.set|filesize|{.filesize|{.^url.}.}.}

{.add to log|Start loading from '{.^url.}' {.^filesize.} bytes.}

{.if|{.match|^ftp?://|{.^url.}.}| {:{.set|from|{.^filesize.}.}:} | {:{.delete|{.^dest.}|bin=0.}{.set|from|0.}:}.}

{.break|if={.not|{.^filesize.}.}{.=|{.^filesize.}|{.filesize|{.^dest.}.}.}|result=Source file can't be downloaded: server returned null size, or destination already matches source size.}

{.save|{.^dest.}|.}
{.^from.}
{.comment| define CHUNK with the min size, if null then WHILE is never executed.}
{.set|chunk|{.min|{.^filesize.}|10000000.}.}
{.^chunk.}
{.while|chunk|{:
 {.add to log|loading {.^chunk.} bytes.}
   {.load|{.^url.}|from={.^from.}|var=data|size={.^chunk.}.}
   {.length|var=data.}
   {.add to log|Download: from={.^from.} request={.^chunk.}  loaded= {.length|var=data.}.}
   {.if|{.length|var=data.}
      |   {:
         {.add to log|TPL: saving in progress.}
         {.append|{.^dest.}|var=data.}{.inc|from|{.length|var=data.}.}
         {.comment| redefine CHUNK with the min size, if null then WHILE is stopped.}
         {.set|chunk|{.min|{.^chunk.}|{.sub|{.^filesize.}|{.^from.}/sub.}.}.}
         :}
      |   {:
         {.set|chunk|0.}
         {.if|{.and| {.not|{.match|^ftp?://|{.^url.}.}/not.} | {.not|{.^filesize.} = {.filesize|{.^dest.}.}/not.}   /and.}
            |{:{.delete|{.^dest.}|bin=0.}:}.}
         {.add to log|End of download.}
         :}
   /if.}:}
   |timeout=0
   |else={:{.add to log|Error during WHILE.}:}
/while.}
{.add to log|Saved file contain {.^from.} bytes.}

I am studying the case where the source file is also the destination file and would be irremediably destroyed during the transfer, its size becoming zero.
« Last Edit: May 28, 2018, 03:27:13 PM by Mars »


Offline LeoNeeson

Quote
There is a serious security hole in the use of this script: as it stands, it allows access to the entire hard disk hosting HFS.
Thanks for reporting the security issue. I've updated the first post to warn users about this.

Quote
I am studying the case where the source file is also the destination file and would be irremediably destroyed during the transfer, its size becoming zero.
Do you mean: if the same destination file exists, then delete the destination file? I think it's better to automatically rename the destination file, not to delete/overwrite it. So, if the destination file ("filename.zzz") exists, save the new file as "filename_001.zzz"; if "filename_001.zzz" exists, save it as "filename_002.zzz", then "filename_003.zzz", "filename_004.zzz", and so on...


Offline Mars

Quote
Do you mean: if the same destination file exists, then delete the destination file? I think it's better to automatically rename the destination file, not to delete/overwrite it. So, if the destination file ("filename.zzz") exists, save the new file as "filename_001.zzz"; if "filename_001.zzz" exists, save it as "filename_002.zzz", then "filename_003.zzz", "filename_004.zzz", and so on...

I'm advancing slowly, but I found a solution for that. What gave me the most trouble was choosing between integrating it into the 'save' macro or creating a new one; the second choice turned out much wiser, and the name of the macro is left to rejetto's discretion if he prefers a better one.

I made a temporary choice, described as it could appear in the wiki:  ;D

 {.first free|A.}
 You can specify A as a file name or as a URL; the macro expands to the first free (non-existing) file name derived from A. The file or URL you specify must be accessible from the server machine. A can be C:\windows\win.ini, or absolute in the VFS /another_file_in_VFS/the_file_i_want.txt, or relative to the current folder hello/the_file_i_want.txt.
 If the path is not provided, the macro generates an error for unsatisfied parameters.
 Optional parameter 'limit' specifies a maximum search index; if omitted, the default value is 10.
 Example: {.first free|c:\temp\hfs.tmp|limit=5.}

The principle is to provide the macro with a path, as a URL or an absolute path; an increment is then used to find a new free file name, without exceeding 10 attempts unless the 'limit' parameter specifies otherwise.

The function returns an empty string in case of an invalid path or if the limit is exceeded.

Code:
  function firstfree():string;
    var
      p, dest, ext: string;
      i, limit : integer;
    begin
      result:='';
      try
        limit:=parI('limit', 10);
        // remove the 'limit' parameter (if given) so the path is left in pars[1]
        if pars.IndexOfName('limit') >= 0 then
          pars.Delete(pars.IndexOfName('limit'));
        p:=trim(pars[1]);
        // we make sure that the file can be put on a physical support
        if not DirectoryExists(uri2diskMaybeFolder(p)) then exit;
        ext:=extractFileExt(p);
        dest:=copy(p, 1, length(p)-length(ext));
        i:=0;
        while fileExists(uri2diskMaybe(p)) do
          begin
          inc(i);
          // test the limit here to avoid prolonging the loop unnecessarily
          if (i > limit) then exit; // the function must return an empty string
          p:=format('%s (%d)%s', [dest, i, ext]);
          end;
      except exit; end;
      result:=p;
    end;
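A hypothetical example of calling it from a template (assuming the macro keeps the name 'first free' as above; the {.or.} falls back to the original name when the limit is exceeded and the macro returns an empty string):
Code:
{.set|dest|{.or|{.first free|{.^dest.}|limit=5.}|{.^dest.}.}.}
{.add to log| saving to '{.^dest.}'.}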
« Last Edit: May 30, 2018, 07:17:48 PM by Mars »


Offline LeoNeeson

Quote
How about this way, with %encoded-folder% as the destination? Better security.
Code:
<center><form method='post'>Paste a URL: <input name='url' value=""><input type='hidden' name='dest' value=""><br><input type='submit' value='Transfer'></form>
{.set|url|{.postvar|url.}.}
{.break|if={.count substring|\|{.decodeuri|{.^url.}.}.}|result=Direct access on hard disk not allowed.}
{.set|dest|%encoded-folder%{.or|{.filename|{.postvar|dest.}.}|{.filename|{.^url.}.}.}.}
{.set|filesize|{.filesize|{.^url.}.}.}
{.if|{.match|^ftp?://|{.^url.}.}| {:{.set|from|{.^filesize.}.}:} | {:{.delete|{.^dest.}|bin=0.}{.set|from|0.}:}.}
{.break|if={.not|{.^filesize.}.}{.=|{.^filesize.}|{.filesize|{.^dest.}.}.}.}
{.save|{.^dest.}|.}
{.^from.}
{.comment| define CHUNK with the min size, if null then WHILE is never executed.}
{.set|chunk|{.min|{.^filesize.}|10000000.}.}
{.^chunk.}
{.while|chunk|{:
 {.add to log|loading {.^chunk.} bytes.}
   {.load|{.^url.}|from={.^from.}|var=data|size={.^chunk.}.}
   {.length|var=data.}
   {.add to log|Download: from={.^from.} request={.^chunk.}  loaded= {.length|var=data.}.}
   {.if|{.length|var=data.}
      |   {:
         {.add to log|saving.}
         {.append|{.^dest.}|var=data.}{.inc|from|{.length|var=data.}.}
         {.comment| redefine CHUNK with the min size, if null then WHILE is stopped.}
         {.set|chunk|{.min|{.^chunk.}|{.sub|{.^filesize.}|{.^from.}/sub.}.}.}
         :}
      |   {:
         {.set|chunk|0.}
         {.if|{.and| {.not|{.match|^ftp?://|{.^url.}.}/not.} | {.not|{.^filesize.} = {.filesize|{.^dest.}.}/not.}   /and.}
            |{:{.delete|{.^dest.}|bin=0.}:}.}
         {.add to log|End of download.}:}/if.}:}|timeout=0.}</center>
Thanks! I will check it out :)

Quote
if reMatch(fn, '^https?://', 'i!') > 0 then
    try result:=httpGet(fn, from, size)
Where does that part go?
It's already part of the source code (in the file 'scriptLib.pas'). I think it was quoted by Mars to ask Rejetto why HTTPS doesn't work. I hope Rejetto can give us some information about this...


Offline rejetto

I'm sorry, but the original post from danny was deleted and I don't understand what information you are asking me for.


Offline LeoNeeson

Quote
I'm sorry, but the original post from danny was deleted and I don't understand what information you are asking me for.
Danny said this: "The transload feature could be ready for prime-time, if it were possible to load files from https:// url source." But it seems HFS can't handle any kind of HTTPS, right?...


Offline rejetto

Yes, from what I've seen, HFS is not able to load https URLs at the moment.