rejetto forum

Software => HFS ~ HTTP File Server => FHFS => Topic started by: JGraceffo on April 09, 2014, 03:57:13 PM

Title: FHFS having issues with special characters in file names
Post by: JGraceffo on April 09, 2014, 03:57:13 PM
Hello, I've been working with FHFS for a fair bit now and it's been great, but we've noticed that some of the links we send to people give a 404 File Not Found error.

I dug a little deeper yesterday and found that this was happening only with files with a "#" in their name.  I didn't test any other special characters because our file names don't contain any, but I was wondering if anyone had a fix or workaround (besides renaming files).

Thanks in advance for any help!
Title: Re: FHFS having issues with special characters in file names
Post by: bmartino1 on April 12, 2014, 06:36:14 PM
I don't use FHFS, but if you need a quick and easy way to rename files, here is a free, open source program that can help:

http://www.den4b.com/?x=products&product=renamer

It's called ReNamer. You point it at a directory and add rules for naming the files; a preview appears down below showing what the files will be called...
Title: Re: FHFS having issues with special characters in file names
Post by: LeoNeeson on April 13, 2014, 02:20:20 AM
Quote from: bmartino1 on April 12, 2014, 06:36:14 PM
"I don't use FHFS, but if you need a quick and easy way to rename files, here is a free open source program that can help"

Thanks for sharing that, but ReNamer is not open source. It's freeware at best (in its Lite edition; it's normally a paid application). Thanks anyway.....
Title: Re: FHFS having issues with special characters in file names
Post by: raybob on April 28, 2014, 06:02:29 PM
It's well known that HFS itself can't handle Unicode, and there's nothing I can do about that.  As for the # character, it doesn't work because that's a character with special meaning in URLs.  FHFS doesn't do anything to change it so your browser misinterprets the URL when it tries to get files with the # character.  I suppose you can try taking the URL and replacing # with %23 and see if it works.
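
The %23 replacement suggested above can be done automatically rather than by hand. A minimal Python sketch, assuming the link is simply a base URL plus the file name; the host, folder, and file name are made up for illustration, and this is not something FHFS does itself:

```python
from urllib.parse import quote

def encode_file_name(name: str) -> str:
    # safe="" makes quote() escape "/" as well, so the whole name stays
    # inside a single path segment. "#" becomes %23, a space becomes %20.
    return quote(name, safe="")

base = "http://www.example.com/files/"   # hypothetical base URL
link = base + encode_file_name("Report #3.pdf")
print(link)  # http://www.example.com/files/Report%20%233.pdf
```

Only the file name should be encoded this way; running the whole URL through the function would also escape the "://" and the slashes.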
Title: Re: FHFS having issues with special characters in file names
Post by: JGraceffo on May 09, 2014, 02:25:53 PM
Interesting, would it be possible to change the "generate random url's (for public uploads)" option in FHFS so that it doesn't append the filename to the end of the randomly generated url? That way we could avoid the # in URLs.

...and HFS doesn't like unicode either? That's a pain, but I suppose we could rename files we want to send.  Thanks for the help, if you happen to think of a workaround or anything else, you can pm me whenever!
Title: Re: FHFS having issues with special characters in file names
Post by: JeremyB796 on May 09, 2014, 05:07:57 PM
This is actually a big issue I'm having.
I use FHFS as a public server for some people... because it's the only thing I have found so far that works.

Most users look to be having an issue with Unicode characters, most of them Japanese.
I personally have no issues with files like this, but others do when they try to re-download their files... odd.

I have the PC set to Japanese for non-Unicode software; not sure if that does anything. I set it like that for other software that requires the Japanese locale to run...
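
As a side note on the locale setting: the same Japanese file name turns into different bytes, and so a different URL, depending on the character encoding in use. The Python sketch below only illustrates that difference. It assumes the "Japanese for non-Unicode software" setting corresponds to code page 932 and uses a made-up file name; it does not claim this is what happens inside HFS or FHFS.

```python
from urllib.parse import quote, unquote

name = "日本.txt"  # made-up file name

# The same name percent-encoded from two different byte encodings.
utf8_form = quote(name, safe="", encoding="utf-8")
cp932_form = quote(name, safe="", encoding="cp932")

print(utf8_form)   # %E6%97%A5%E6%9C%AC.txt
print(cp932_form)

# Decoding a UTF-8 link with the wrong code page does not give the name back.
wrong = unquote(utf8_form, encoding="cp932", errors="replace")
print(wrong == name)  # False
```

If the machine that creates a link and the machine that resolves it disagree about the encoding, the name will not match on the way back, which would be one possible reason for a download working for one person and not for another.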
Title: Re: FHFS having issues with special characters in file names
Post by: bmartino1 on May 10, 2014, 04:57:34 PM
Okay, I see the issue now. Once the file has been named and saved, there are certain characters that are not allowed in the namespace. This is due to the code running behind the scenes to make the applications work:

When a browser reads a # in a URL, it stops the load sequence at that point and loads whatever comes before it instead (everything after the # is treated as a fragment and is never sent to the server)...
There are many "illegal" characters that will interrupt URL path reading, which in turn breaks the actual paths to files.
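
The way a # splits a URL can be checked with Python's standard URL parser. A small sketch, using a made-up host and file name:

```python
from urllib.parse import urlsplit, unquote

raw = "http://www.example.com/files/Invoice#12.pdf"   # hypothetical link
parts = urlsplit(raw)
print(parts.path)      # /files/Invoice
print(parts.fragment)  # 12.pdf  (kept by the browser, not sent to the server)

# With the # written as %23 the whole name stays in the path.
fixed = urlsplit("http://www.example.com/files/Invoice%2312.pdf")
print(fixed.path)           # /files/Invoice%2312.pdf
print(fixed.fragment)       # (empty)
print(unquote(fixed.path))  # /files/Invoice#12.pdf
```

So the server is asked for "/files/Invoice", which does not exist, and that is where the 404 comes from.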

One of these issues is a known problem with HFS and Unicode; the other issues come from the browser and the web paths you are using to access the file. In the end, you will have to look up the "illegal" characters that will not work and follow the URI specification:

-------------
A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource. This specification defines the generic URI syntax and a process for resolving URI references that might be in relative form, along with guidelines and security considerations for the use of URIs on the Internet. The URI syntax defines a grammar that is a superset of all valid URIs, allowing an implementation to parse the common components of a URI reference without knowing the scheme-specific requirements of every possible identifier.
-----------

http://perishablepress.com/stop-using-unsafe-characters-in-urls/
Title: Re: FHFS having issues with special characters in file names
Post by: bmartino1 on May 10, 2014, 05:00:27 PM

Character Encoding Chart

To help promote the cause of Web Standards and adhering to specifications, here is a quick reference chart explaining which characters are “safe” and which characters should be encoded in URLs.

Classification (encoding required?)
    Included characters

Safe characters (encoding required: NO)
    Alphanumerics [0-9a-zA-Z], special characters $-_.+!*'(), and reserved characters used for their reserved purposes (e.g., question mark used to denote a query string)

ASCII Control characters (encoding required: YES)
    Includes the ISO-8859-1 (ISO-Latin) character ranges 00-1F hex (0-31 decimal) and 7F (127 decimal)

Non-ASCII characters (encoding required: YES)
    Includes the entire “top half” of the ISO-Latin set 80-FF hex (128-255 decimal)

Reserved characters (encoding required: YES*)
    $ & + , / : ; = ? @ (not including blank space)

Unsafe characters (encoding required: YES)
    Includes the blank/empty space and " < > # % { } | \ ^ ~ [ ] `

* Note: Reserved characters only need encoding when not used for their defined, reserved purposes.

Unsafe Characters

More about “unsafe” characters from RFC1738:

Characters can be unsafe for a number of reasons. The space character is unsafe because significant spaces may disappear and insignificant spaces may be introduced when URLs are transcribed or typeset or subjected to the treatment of word-processing programs. The characters “<” and “>” are unsafe because they are used as the delimiters around URLs in free text; the quote mark (“"”) is used to delimit URLs in some systems. The character “#” is unsafe and should always be encoded because it is used in World Wide Web and in other systems to delimit a URL from a fragment/anchor identifier that might follow it. The character “%” is unsafe because it is used for encodings of other characters. Other characters are unsafe because gateways and other transport agents are known to sometimes modify such characters. These characters are “{”, “}”, “|”, “\”, “^”, “~”, “[”, “]”, and “`”.

All unsafe characters must always be encoded within a URL. For example, the character “#” must be encoded within URLs even in systems that do not normally deal with fragment or anchor identifiers, so that if the URL is copied into another system that does use them, it will not be necessary to change the URL encoding.
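
The chart above can be turned into a small encoder for file names. This is only a sketch in Python under a few assumptions: it takes the "Safe characters" row literally, it escapes everything else (including reserved characters, since inside a file name they are not serving their reserved purpose), and it encodes non-ASCII characters as UTF-8 bytes, which the chart itself does not specify. The function name is made up.

```python
import string

# Alphanumerics plus the special characters from the "Safe characters" row.
SAFE = set(string.ascii_letters + string.digits + "$-_.+!*'(),")

def encode_segment(name: str) -> str:
    """Percent-encode every byte of a file name that is not in SAFE."""
    out = []
    for byte in name.encode("utf-8"):   # assumption: UTF-8 for non-ASCII
        ch = chr(byte)
        if byte < 0x80 and ch in SAFE:
            out.append(ch)
        else:
            out.append(f"%{byte:02X}")
    return "".join(out)

print(encode_segment("a b#c~.txt"))       # a%20b%23c%7E.txt
print(encode_segment("50% {draft}.txt"))  # 50%25%20%7Bdraft%7D.txt
```

Note that Python's own urllib.parse.quote leaves "~" unescaped, following the newer RFC 3986 rather than RFC 1738, so the hand-written loop is used here to match the chart exactly.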
Title: Re: FHFS having issues with special characters in file names
Post by: raybob on May 11, 2014, 03:29:35 PM
Wait for the next FHFS: it supports Unicode, and download links ALL look like this:  http://www.example.com/?p=download&id=BaVYlIwuRnxOESVq  :D
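
To show why links of that shape sidestep the problem, here is a rough Python sketch of ID-based links. It is not FHFS's actual implementation: the in-memory table, the function names, and the choice of 16 random letters are all assumptions made for illustration. The point is only that the file name never appears in the URL, so # and Unicode characters cannot break it.

```python
import secrets
import string
from urllib.parse import urlencode, urlsplit, parse_qs

ALPHABET = string.ascii_letters
links = {}  # hypothetical lookup table: random id -> real file name

def make_link(file_name, base="http://www.example.com/"):
    # Draw a random 16-letter id and remember which file it stands for.
    file_id = "".join(secrets.choice(ALPHABET) for _ in range(16))
    links[file_id] = file_name
    return base + "?" + urlencode({"p": "download", "id": file_id})

def resolve(link):
    # Read the id back out of the query string and look up the file name.
    query = parse_qs(urlsplit(link).query)
    return links[query["id"][0]]

link = make_link("日本語 #1.txt")
print(link)
print(resolve(link))  # 日本語 #1.txt
```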