Hello!
First of all, thank you for the great component. I have moved from Phoca Download and can say jDownloads is way better!
But I have an annoying problem with file names.
I have carefully read the docs and enabled UTF-8 support in the component settings. With the UTF-8 option enabled, folders with Russian names are created without any problem. But when I uploaded a file with a Russian name, for example "тест.txt", I was unable to download it. The system just didn't find it. I looked into the jdownloads_files DB table, and it turns out that the file name is not saved correctly: instead of "тест.txt" we have ".txt". The file itself is uploaded correctly and shows up in the folder as "тест.txt".
So the main problem is that file names are not converted from windows-1251 to UTF-8 and, probably because of this, are not saved correctly to the DB. I assume we should do something like $file = iconv("windows-1251", "UTF-8", $filename), but I can't figure out where to do this.
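For illustration, the iconv idea from the post could be sketched as below. This is only a sketch under assumptions: toUtf8FileName is a hypothetical helper (not part of jDownloads), the source encoding is guessed to be windows-1251 whenever the name is not valid UTF-8, and the mbstring and iconv extensions are assumed to be loaded.

```php
<?php
// Hypothetical helper (not part of jDownloads): normalise an uploaded
// file name to UTF-8, assuming any non-UTF-8 input is windows-1251.
function toUtf8FileName($filename, $assumedSource = 'WINDOWS-1251')
{
    // Already valid UTF-8 (the usual case for UTF-8 upload forms): keep it.
    if (mb_check_encoding($filename, 'UTF-8')) {
        return $filename;
    }

    // iconv() returns false when the input cannot be converted.
    $converted = iconv($assumedSource, 'UTF-8', $filename);

    return ($converted === false) ? $filename : $converted;
}

// "тест.txt" encoded as windows-1251 bytes.
echo toUtf8FileName("\xF2\xE5\xF1\xF2.txt"), "\n";
```

The validity check comes first so that names which are already UTF-8 are not converted a second time.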
Please, can you help me with this issue?
Info
JDownloads: 3.2.16 Beta
Joomla version: Joomla! 3.3.6 Stable [ Ember ]
MySQL: 5.6.19
PHP: 5.5.9
Cache: all Joomla cache options are deactivated
Files are uploading from Windows 8.1
P.S. My web server runs on a production server with Ubuntu 14.04.1 LTS and supports UTF-8.
Hi,
UTF-8 support is not easy, and I think this is the reason that hardly any software supports it.
Quote
I assume we should do something like $file = iconv("windows-1251", "UTF-8", $filename), but I can't figure out where to do this.
All upload forms should already be defined as UTF-8, but maybe we must convert the filename here.
Thanks for this hint. I will check it.
Hi! That would be great, because IMHO it is the simplest working solution! Uploading files in UTF-8 and folder creation work great; the only problem is saving the correct value into the DB.
BTW, maybe you can point to the place in the jDownloads code where it would be most appropriate to do this?
Thank you!
Sorry, but I cannot look at the source code right now.
So you will have to wait a few hours. :-\
Ok, it's not a problem! Thank you very much!
I have also found a bug with the transliteration of Russian directory names. When it is enabled, folders are renamed to a format like "2014-11-14 10:11:54" instead of being transliterated. The standard Joomla transliteration works correctly! Are these problems connected?
So this was not correct in your first post?
Quote
After that, folders named in Russian are created without any problem.
???
Edit:
Please post here all your settings from the 'folder and files' tab in your jD configuration.
My mistake! It works well only if the UTF-8 option is enabled.
Info:
1) When UTF-8 is set to "No"
As you can see, the alias is formed correctly ("proverka"), but the folder name is not: it becomes "2014-11-14 15:53:48". The DB problem exists.
2) When UTF-8 is set to "Yes"
These are my current settings. Folders are created correctly with Russian names. The DB problem exists.
P.S. I have tried disabling "Change names to lowercase" and "Replace spaces with underscores in names"; the problems with transliteration and DB population remain.
I have attached a file with a Russian name for testing (inside the zip).
[deleted by administrator]
One strange behavior: when uploading a file with a Russian name and "_" between words, like Тестиг_тестинг.doc, jDownloads trims it to _тестинг.doc, and when uploading тестинг.doc it simply trims it to .doc.
I am speaking about the value in the DB. The file itself is uploaded fine and remains untouched.
So is the problem only with the (uploaded) filename itself, or also with how the filename is stored in the DB? ::)
1) The problem is not with the filename itself (it is uploaded correctly), but with how it is stored in the DB (because of this we are getting a broken download link)!
2) There is a problem with transliterating Russian folder names. Here we are talking only about folders.
Hi
I think that the jDownloads transliteration tables need extending. The present tables work fine for 'Western European' languages but have a problem with Cyrillic languages such as Russian. Also, I do not think it is possible to use a single table for Cyrillic languages or to include them in the existing table, as there are potential clashes of usage. Rather, it would seem better to me to use the native locale to decide which language to use. An alternative and perhaps more extensible scheme is to treat the transliteration tables as 'transliteration packages' along the lines of the 'language packages' that are loaded by the admin.
Attached as an example is a sort of transliteration package for Russian. With a few minor extensions I think this would also cover most of the Cyrillic countries.
I can readily get Czech, Hungarian, Lithuanian, Polish, Slovak, Serbian, Yugoslavian and Ukrainian txt files in the same form as the Russian one.
I also have a Greek one that could be added to the existing table as a short-term measure.
As a very short-term measure, the choice of transliteration table could be specified in the Configuration, with Western European as the default for existing users.
Colin
[deleted by administrator]
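To make the 'transliteration packages' idea more concrete, here is a rough sketch of per-language tables selected by a language tag, with Western European as the default. The function names and the tiny sample mappings are made up for illustration; they are not the attached Russian table and not jDownloads code.

```php
<?php
// Illustrative per-language tables. The mappings are a small made-up
// sample, not the table attached to the post above.
function getTransliterationTable($languageTag)
{
    $tables = array(
        'ru-RU' => array(
            'т' => 't', 'е' => 'e', 'с' => 's', 'ф' => 'f',
            'а' => 'a', 'й' => 'j', 'л' => 'l',
        ),
        // Western European default for existing users.
        'default' => array('ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue', 'ß' => 'ss'),
    );

    return isset($tables[$languageTag]) ? $tables[$languageTag] : $tables['default'];
}

function transliterateName($name, $languageTag)
{
    // strtr() with an array replaces whole keys; UTF-8 keys are matched
    // as complete byte sequences.
    return strtr($name, getTransliterationTable($languageTag));
}

echo transliterateName('тест.txt', 'ru-RU'), "\n";
echo transliterateName('größe.txt', 'de-DE'), "\n";
```

Keeping one table per language avoids the clashes mentioned above, because the same character can map differently depending on the selected language.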
to ColinM
I agree about the transliteration table, but I have a question: could we use the Joomla core transliteration functionality, as we do for the alias field, for example?
What do you think about the problem with storing Cyrillic filenames in the DB?
Hi
Following your question I delved a bit deeper and found a useful article: https://www.corejoomla.com/forum/community-polls/13873-alias-transliteration.html
and followed up to look at stringURLSafe at http://www.reference.joomlademo.de/nav.html?_functions/index.html#stringurlsafe
It processes a string and replaces all accented UTF-8 characters with unaccented ASCII-7 "equivalents"; white spaces are replaced by hyphens and the string is lowercased.
The best reference is http://www.reference.joomlademo.de/nav.html?libraries/joomla/filter/output.php.source.html#l77
The code is very short:
public static function stringURLSafe($string)
{
    // Remove any '-' from the string since they will be used as concatenaters
    $str = str_replace('-', ' ', $string);

    $lang = JFactory::getLanguage();
    $str = $lang->transliterate($str);

    // Trim white spaces at beginning and end of alias and make lowercase
    $str = trim(JString::strtolower($str));

    // Remove any duplicate whitespace, and ensure all characters are alphanumeric
    $str = preg_replace('/(\s|[^A-Za-z0-9\-])+/', '-', $str);

    // Trim dashes at beginning and end of alias
    $str = trim($str, '-');

    return $str;
}
Maybe Arno could include it in a similar way, but without the strtolower.
Colin
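A variant "without the strtolower" might look roughly like the sketch below. It is not Joomla or jDownloads code: cleanNameKeepCase is a hypothetical name, and the transliteration step is passed in as a callable so the example runs on its own. Inside Joomla that callable could wrap the language object's transliterate() method, which is an assumption here, as is the toy mapping.

```php
<?php
// Sketch of a stringURLSafe-like cleaner that keeps letter case.
// $transliterate is any callable that maps a UTF-8 string to ASCII.
function cleanNameKeepCase($string, $transliterate)
{
    // Hyphens act as separators, so turn existing ones into spaces first.
    $str = str_replace('-', ' ', $string);

    // Transliterate and trim, but do NOT lowercase.
    $str = trim(call_user_func($transliterate, $str));

    // Collapse whitespace and every non-alphanumeric run into one hyphen.
    $str = preg_replace('/(\s|[^A-Za-z0-9\-])+/', '-', $str);

    return trim($str, '-');
}

// Toy transliterator for the demo only (tiny, made-up mapping).
$toy = function ($s) {
    return strtr($s, array('М' => 'M', 'о' => 'o', 'й' => 'j',
                           'Ф' => 'F', 'а' => 'a', 'л' => 'l'));
};

echo cleanNameKeepCase('Мой Файл 2014', $toy), "\n";
```

Note that the dot is also replaced, so for file names the extension would have to be split off first and re-attached afterwards.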
I think the last one is the best solution!
I have tested it with another Joomla extension, the Fabrik upload element. And it works great!
Erm...
what about this?
Quote
I have found a simpler solution!
We should use the standard Joomla transliteration method:
$lang = JLanguage::getInstance('ru-RU'); (in this case)
$filename = $lang->transliterate($filename);
and that is it!
It is about integrating a filename transliteration option.
ColinM suggested a more detailed answer, so I removed my post.
The idea is to use the standard Joomla method to transliterate filenames from any language.
Can you implement this code?
I will try it...
Any progress on correcting how the filename is saved into the DB?
If you need any additional info, just let me know!
Sorry for the delay, but I am working extensively on the new jD content plugin.
And I had to wait until my test domains were moved by my hoster to another server with UTF-8 support. :-\
This is now done, so I can start with some tests.
Edit:
Do you use a Windows 7 system for the file upload?
Frontend or backend creation form?
Which FTP program do you use to check the upload result (filename) on the server?
Hi,
I now have the first results.
Server test environment: jDownloads 3.2.16
PHP Built On: Linux dd34424 3.2.0-67-generic #101-Ubuntu SMP Tue Jul 15 17:46:11 UTC 2014 x86_64
Database Version: 5.5.40-nmm1-log
Database Collation: utf8_general_ci
PHP Version: 5.3.28-nmm1
Web Server: Apache
WebServer to PHP Interface: fpm-fcgi
Joomla! Version: Joomla! 3.3.6 Stable [ Ember ] 01-October-2014 02:00 GMT
Joomla! Platform Version: Joomla Platform 13.1.0 Stable [ Curiosity ] 24-Apr-2013 00:00 GMT
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:33.0) Gecko/20100101 Firefox/33.0
I used these configuration settings:
- Create the directory name for a category automatically? Yes
- Use UTF-8? Yes
- Replace spaces with underscores in names? No
- Change names to lowercase? No
- Remove/Change special characters in name? No
For the upload I first renamed a zip file to файл.zip with the Windows 7 File Explorer.
The new filename is displayed correctly in the File Explorer.
Test procedure:
1. Go to 'Downloads' in the backend
2. Click on 'New'
3. Type in a title and select as main file a file named файл.zip from the Windows directory (see pic 1039a & 1039)
4. Save the new download
5. Check the result in the new download itself (see pic )
6. Check the result in the DB (see pic 1040)
7. Check the result on the server with the FTP program FileZilla 3.7.1 (see pic 1040b)
8. Check the result via a WebFTP script on the server (see pic 1040c)
9. Go to jDownloads in the frontend and try to download the file (see pic 1040d)
10. Test the downloaded file with WinZip to make sure that it is not corrupt (result okay)
The only problem I had was viewing the filename with FileZilla. It does not seem to support UTF-8.
All other test results are okay for me! ;)
So I cannot reproduce why you get wrong values in your DB. Maybe a wrong database collation?
Tomorrow I will test the folder creation process with Russian characters...
Edit:
@ Colin:
Quote
It processes a string and replaces all accented UTF-8 characters by unaccented ASCII-7 "equivalents", white spaces are replaced by hyphens and the string is lowercase.
Thanks for the hints. But I know this, and I already use a modified version of it: getCleanFolderFileName($str, $is_monitoring = false) in the backend jdownloadshelper.php.
[deleted by administrator]
Quote from: Arno on 19.11.2014 00:16:24
Sorry for the delay, but I am working extensively on the new jD content plugin.
And I had to wait until my test domains were moved by my hoster to another server with UTF-8 support. :-\
This is now done, so I can start with some tests.
Edit:
Do you use a Windows 7 system for the file upload?
Frontend or backend creation form?
Which FTP program do you use to check the upload result (filename) on the server?
1) I have uploaded files from both Windows 8.1 and Ubuntu 14.10. The results were the same.
2) Frontend and backend. The problem is the same in both scenarios.
3) FTP/SFTP client: WinSCP
Do you have mbstring HTTP input encoding translation enabled?
Could you please post your administrator/index.php?option=com_admin&view=sysinfo page?
I am especially interested in the mbstring settings!
I think it may be something in the PHP settings that I am missing!
Thank you!
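For comparing two servers, the mbstring-related values could also be dumped with a few lines of PHP instead of screenshots. This is just a small sketch; collectMbstringSettings is a made-up helper name, and the list of keys is simply the set of settings mentioned in this thread.

```php
<?php
// Collect the mbstring-related ini values discussed in this thread.
function collectMbstringSettings()
{
    $keys = array(
        'default_charset',
        'mbstring.language',
        'mbstring.internal_encoding',
        'mbstring.encoding_translation',
        'mbstring.http_input',
        'mbstring.http_output',
        'mbstring.detect_order',
    );

    $settings = array();
    foreach ($keys as $key) {
        // ini_get() returns false for an unknown key (e.g. mbstring not loaded).
        $value = ini_get($key);
        $settings[$key] = ($value === false) ? '(not available)' : $value;
    }

    return $settings;
}

foreach (collectMbstringSettings() as $key => $value) {
    echo $key, ' = ', var_export($value, true), "\n";
}
```

Running the same script on both servers and diffing the output shows at a glance which settings differ.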
Relevant PHP Settings
Setting Value
Safe Mode Off
Open basedir None
Display Errors On
Short Open Tags On
File Uploads On
Magic Quotes Off
Register Globals Off
Output Buffering Off
Session Save Path /tmp
Session Auto Start 0
XML Enabled Yes
Zlib Enabled Yes
Native ZIP Enabled Yes
Disabled Functions None
Mbstring Enabled Yes
Iconv Available Yes
I have done everything the same way as you did!
All these settings are the same as yours, except for disabled functions:
disable_functions pcntl_alarm, pcntl_fork, pcntl_waitpid, pcntl_wait, pcntl_wifexited, pcntl_wifstopped, pcntl_wifsignaled, pcntl_wexitstatus, pcntl_wtermsig, pcntl_wstopsig, pcntl_signal, pcntl_signal_dispatch, pcntl_get_last_error, pcntl_strerror, pcntl_sigprocmask, pcntl_sigwaitinfo, pcntl_sigtimedwait, pcntl_exec, pcntl_getpriority, pcntl_setpriority
I have made some experiments with the mbstring settings, and it turns out there is some connection.
With these settings I was able to upload and download files (the filename in the DB is not trimmed), but the file name is a mess. Please look at the SH and send me your values.
AddDefaultCharset UTF-8
php_value mbstring.language Neutral
php_value mbstring.internal_encoding UTF-8
php_value mbstring.encoding_translation On
php_value mbstring.http_input auto
php_value mbstring.http_output auto
php_value mbstring.detect_order auto
#php_value mbstring.substitute_character none
php_value default_charset UTF-8
[deleted by administrator]
See pic from phpinfo
[deleted by administrator]
What exactly worked here with Fabrik?
http://www.jdownloads.com/forum/index.php?topic=7528.msg29442#msg29442
With your settings I am getting this:
Hmmm
[deleted by administrator]
Quote from: Arno on 19.11.2014 11:28:46
What has worked here exactly with Fabrik?
http://www.jdownloads.com/forum/index.php?topic=7528.msg29442#msg29442
Transliteration of names of uploaded files using Function stringURLSafe, provided by ColinM.
Quote from: Makulia on 19.11.2014 11:34:07
Transliteration of names of uploaded files using Function stringURLSafe, provided by ColinM.
Which language have you activated in Joomla?
Quote from: Arno on 19.11.2014 11:39:00
Which language have you activated in Joomla?
Russian in frontend and backend.
I have tried switching to English. Same result!
The only difference seems to me to be how the filename is stored in the DB.
I have checked the settings in my phpMyAdmin again and I have seen this setting (see pic)
For my tests I have only used English in backend and frontend.
I will test it again with Russian. But it works for me without any special transliteration for Cyrillic characters. So it does exactly what it should.
[deleted by administrator]
With your settings, the file itself is uploaded incorrectly.
So I believe the problem is with the PHP settings, not MySQL.
I can manually change the file name in the DB via phpMyAdmin, and it saves correctly.
[deleted by administrator]
Have you read my last posting above?
Yes I have!
1) My DB settings are identical, pls look at SH
2) I have tried CP UI in the English lang, it makes no difference.
[deleted by administrator]
What do you mean by SH and CP UI? ???
Control panel user interface?
but SH?
Sorry, SH = Screenshot, CP UI - Control Panel User Interface.
Update on this case! I have tested jDownloads on another web server with PHP running via php5-fpm and it worked like a charm! My current web server runs Apache2 + mod_php.
At first I thought there was a problem in my php.ini config. I copied the php.ini from php5-fpm and... the problem still exists.
My conclusions:
1) The problem is with mod_php.
2) The problem is with one of the PHP modules.
Very good news! ;D
Can you describe in a little more detail what exactly seems to be the problem on your first server?
Most users here are not really server experts. ;)
Sorry for the delay.
I was busy reconfiguring my web server to the Apache 2.4 + PHP5-FPM stack.
And guess what? The encoding problem is still there! Unbelievable! The investigation continues.
My first stack was Apache 2.4 + mod_php. As you know, we can execute PHP scripts in a number of ways. Apache, as the web server, only receives requests and passes them to the PHP interpreter, and there are several ways to do that.
The first and easiest to configure is through libapache2-mod-php5. But this approach has a number of disadvantages. You are unable to configure the PHP environment individually for each of your PHP applications. Also, you can only run all of your applications as a single user (the Apache user, like www-data.www-data). Apache 2.4 + libapache2-mod-fastcgi + php5-fpm allows you to set the PHP environment independently for each web app and also to run them as different users.
Very strange ::)
Maybe you should look more at the database. You described that in your case the filename is not stored correctly in Cyrillic characters?
No, it is not the DB itself, because:
1) I have attached screenshots of my DB for you. You can see that the settings are correct.
2) All other Cyrillic names are stored in the DB without problems (user names, article titles, jDownloads category titles). The problem is only with jDownloads filenames. And that is really weird. As I have already said, if you replace the filename in the DB manually, jDownloads works correctly.
The problem arises only when the file name of a jDownloads file is saved to the DB. Could you please point me to the exact place in the code where storing to the DB takes place, just in case?
Go to /administrator/components/com_jdownloads/tables/download.php function checkData()
Here is the files handling.
Thank you, I will check it out.
I have a new clue. I have created a VM with Ubuntu 14.10, PHP 5.5, MySQL 5.6.
And I got stuck with the same bug. I changed MySQL to 5.5: no result. I also unloaded almost every module from the PHP setup and replaced the php.ini and MySQL configs with the ones from the working server.
No effect! It seems to me that the problem is with PHP 5.5. My second config has Ubuntu 13.04 and PHP 5.4.9, and your config even has 5.3.28.
Maybe it is a bug inside PHP, or something related to encodings changed in a newer PHP version, so the old approach no longer works correctly!
Please confirm which version of PHP you use in your dev environment. Do you use the Ubuntu packages or build PHP from source (phpbrew maybe)? It would be great if you could test jDownloads with PHP 5.5!
Could it be some issue with PHP 5.5 related to this:
http://php.net/manual/ru/migration54.incompatible.php?
Hi,
I used a server from my hoster for my earlier tests.
And on this one I cannot change the PHP version. But I will contact my hoster.
And locally I only use XAMPP on Windows 7. And I am not sure that it is useful to test it on a Windows system. ::)
Since you have all these problems, could it be an alternative to use your Russian transliteration file for your filenames?
Quote from: Arno on 27.11.2014 12:37:54
Hi,
I used a server from my hoster for my earlier tests.
And on this one I cannot change the PHP version. But I will contact my hoster.
And locally I only use XAMPP on Windows 7. And I am not sure that it is useful to test it on a Windows system. ::)
Since you have all these problems, could it be an alternative to use your Russian transliteration file for your filenames?
Try to set up a virtual machine with Ubuntu Server or Desktop and test it there. It is really great for quickly switching between different setups. I also recommend the phpbrew version manager https://github.com/phpbrew/phpbrew for testing different PHP builds.
I have thought about it as a last option, but rejected it. We live in a multilingual environment, and sticking to ugly transliteration is not a good way of treating users. Let us look at a use case:
1) Users store all files in their native language, because not all users are comfortable with English (such a pity :( )
2) Users upload files with native names, then download them with transliterated names. Afterwards they need to rename the files back to the native names to keep their file structure consistent. Then they modify the files and upload them again.
A good example here is Wikimedia. This CMS has had no problem with Cyrillic file names, Cyrillic SEO links or any other language from the start. I believe it is a problem of implementation, because the files themselves are uploaded to the server's disk correctly. The only problem is saving their names into the DB. I think we should use mbstring functions to deal with file names! BTW, have you tried it?
If I can help you in any way with investigating this issue, just let me know. If need be, we can discuss this problem on Skype.
Hi.
Quote
We live in a multilingual environment and sticking to ugly transliteration is not a good way of treating users.
Exactly. This is the reason why I implemented this possibility for UTF-8 filenames.
Quote
Only problem is saving their names into the DB. I think we should use mbstring functions to deal with file names! BTW, have you tried it?
I do not need to try this, as I have no problems storing (Cyrillic) filenames in the DB. :)
And I have seen that I do not use any additional mbstring options in the .htaccess file (unlike you). So on my server only the options posted earlier are used.
I have an answer from my hoster: I can change my PHP version myself in the .htaccess file like this:
- AddHandler php56-cgi .php (when supported from server)
- AddHandler php55-cgi .php (when supported from server)
- AddHandler php54-cgi .php (when supported from server)
- AddHandler php53-cgi .php
- AddHandler php52-cgi .php
But what should I do here, since I do not have your problems?
Should the goal be to see whether I also get your problems when I change to the PHP version you use? ::)
Quote
If need be, can discuss this problem on skype.
Sorry, but Skype is not an option for me in this case. My written English is passable, but it is not good enough for a live talk. :-\
Quote from: Arno on 27.11.2014 16:10:49
Should the goal be to see whether I also get your problems when I change to the PHP version you use? ::)
Yes, please try with PHP 5.5 and PHP 5.6 and tell me if you have any problems similar to mine.
See, a lot of setups have PHP 5.4+ because it is the default in Ubuntu 13.10 and higher. And it would be good to know that the problem is not in PHP 5.3 and its mbstring implementation.
I used the additional mbstring configs in .htaccess just for the test. Without them the problem stays the same!
I think the main goal here is to make jDownloads as compatible as possible with different kinds of production environments.
I have done titanic work ;) and finally, the problem is located!!!
Problem description
It is 100% a problem with using the PHP basename function! I am sure now that you cannot replicate this bug because on Windows it seems to work fine, but on some Linux PHP builds Unicode characters are inadvertently removed!
Quote
From the official PHP manual: http://php.net/manual/en/function.basename.php
On Windows, both slash (/) and backslash (\) are used as directory separator character. In other environments, it is the forward slash (/).
If the name component ends in suffix this will also be cut off.
basename() is locale aware, so for it to see the correct basename with multibyte character paths, the matching locale must be set using the setlocale() function.
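To check a given server for this behavior, a small probe like the following could help. It is only a diagnostic sketch: probeBasename is a made-up name, and 'C.UTF-8' and 'en_US.UTF-8' are just candidate locale names that may or may not be installed. What it prints depends on the PHP build and the system locales, so no expected output is given.

```php
<?php
// Compare basename() under the "C" locale and under a UTF-8 locale.
// Which locales exist is system dependent; the names below are candidates.
function probeBasename($path)
{
    // Passing "0" only returns the current setting without changing it.
    $original = setlocale(LC_CTYPE, '0');
    $results  = array();

    setlocale(LC_CTYPE, 'C');
    $results['C'] = basename($path);

    // setlocale() returns false when none of the candidates is installed.
    $utf8 = setlocale(LC_CTYPE, 'C.UTF-8', 'en_US.UTF-8');
    $results['utf8_locale'] = $utf8;
    $results['utf8'] = ($utf8 === false) ? null : basename($path);

    // Restore the locale that was active before the probe.
    setlocale(LC_CTYPE, $original);

    return $results;
}

var_dump(probeBasename('/downloads/тест.txt'));
```

If the 'C' entry comes back as ".txt" while the 'utf8' entry is complete, the server shows the behavior described above.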
I have tried the official Ubuntu PPA as well as Ondřej Surý's PPA. Builds from both of them suffer from the basename bug.
You can 100% replicate this bug if you install Ubuntu 14.04 or 14.10 on a virtual machine and install PHP in it through apt-get. The way of calling PHP (mod-fastcgi or mod_php) doesn't matter.
Then I installed several PHP 5.4 and 5.5 builds through PhpBrew. These builds do not suffer from this bug (because they were built from source code).
So all official Ubuntu PHP builds installed through apt-get suffer from this bug!
Doesn't matter:
1) PHP version and php.ini settings have absolutely no positive effect.
2) Ubuntu version also doesn't matter.
3) Even the DB encoding doesn't matter. I have tested with utf8_general_ci and latin1. With this bug fixed, names are stored correctly even in latin1_swedish_ci!
Please look at this topic through Google Translate.
The topic starter has the same problem with uploading and getting the file path using basename!
Problem solutions
I have tried several solutions. Here are the results:
1) Do not use the default basename function and write our own function
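// Locale-independent replacement for basename(): returns the part after the last "/"
function basename2($path){
    $pos = strrpos($path, "/");
    // No "/" in the path: the whole string is the file name
    return ($pos === false) ? $path : substr($path, $pos + 1);
}
$this->url_download = basename2($target_path);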
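A slightly more defensive version of this idea is sketched below. basenameUtf8 is a hypothetical name, not a jDownloads function. It uses only byte-wise string functions, so the result does not depend on the locale. Treating the backslash as a separator is an assumption that suits names posted from Windows clients, but on Linux a backslash can legally be part of a file name.

```php
<?php
// Sketch of a locale-independent basename using byte-wise string
// functions only, so UTF-8 names are returned unchanged.
function basenameUtf8($path)
{
    // Treat both separators alike so Windows-style paths are handled too,
    // and drop trailing separators like basename() does.
    $path = rtrim(str_replace('\\', '/', $path), '/');
    $pos  = strrpos($path, '/');

    // No separator: the whole string is already the file name.
    return ($pos === false) ? $path : substr($path, $pos + 1);
}

echo basenameUtf8('/var/www/jdownloads/тест.txt'), "\n";
echo basenameUtf8('C:\\My Documents\\файл.zip'), "\n";
```

Because "/" is a single ASCII byte that never occurs inside a multibyte UTF-8 character, splitting on it byte-wise is safe.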
2) Use the basename function, but set the locale before it
defined('_JEXEC') or die('Restricted access');
setlocale(LC_ALL, 'C.UTF-8', 'C');
I have tested both approaches and both of them worked!
IMPORTANT! For the locale to take effect, we must insert this code in both the frontend and backend download handlers!
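One possible refinement of the second approach, sketched under assumptions: set only LC_CTYPE (character handling) instead of LC_ALL, so number and date formatting stay untouched, and check the return value, since setlocale() returns false when none of the requested locales is installed. enableUtf8Ctype and the candidate locale names are illustrative; available locales differ between servers.

```php
<?php
// Try a list of UTF-8 locales for character handling only (LC_CTYPE).
// The candidate names are assumptions; installed locales vary by server.
function enableUtf8Ctype(array $candidates = array('C.UTF-8', 'en_US.UTF-8', 'en_US.utf8'))
{
    // setlocale() accepts an array and returns the first locale that
    // could be set, or false when none of them is available.
    return setlocale(LC_CTYPE, $candidates);
}

$active = enableUtf8Ctype();
if ($active === false) {
    echo "No UTF-8 locale available; basename() may still drop multibyte characters\n";
} else {
    echo "LC_CTYPE is now {$active}\n";
}
```

Checking the result makes it visible when a server lacks the requested locale, instead of silently keeping the old behavior.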
Final thoughts
Simple googling and reading the user comments on http://php.net/basename about UTF-8 and basename leads to the understanding that the basename function is not UTF-8 safe.
So if we want to use basename we MUST set the locale first!
bugs.php.net/bug.php?id=60554c
Quote
basename() features this warning:
basename() is locale aware, so for it to see the correct basename with multibyte character paths, the matching locale must be set using the setlocale() function.
As I understand, you're passing it utf-8 data while setting the locale to something else; this is not allowed.
Quote
There is a real problem when using this function on *nix servers, since it does not handle Windows paths (using the \ as a separator). Why would this be an issue on *nix servers? What if you need to handle file uploads from MS IE? In fact, the manual section "Handling file uploads" uses basename() in an example, but this will NOT extract the file name from a Windows path such as C:\My Documents\My Name\filename.ext. After much frustrated coding, here is how I handled it (might not be the best, but it works):
<?php
$filen = stripslashes($_FILES['userfile']['name']);
$newfile = basename($filen);
if (strpos($newfile,'\\') !== false) {
    $tmp = preg_split("[\\\]",$newfile);
    $newfile = $tmp[count($tmp) - 1];
}
?>
$newfile will now contain only the file name and extension, even if the POSTed file name included a full Windows path.
What do you think of it?
Some additional links for reference:
http://www.sitepoint.com/localizing-php-applications-2/
Hi Makulia,
congratulations on finding the cause. 8) ;) :) :D ;D
I know that you have spent very much time finding it. Many thanks for your help.
I think I will add your first solution with our own 'basename' function.
But one last question: I use the 'basename' function in many functions in the source code.
Did you replace ALL of these with your own basename2() function for your tests?
Quote from: Arno on 01.12.2014 10:43:50
Did you replace ALL of these with your own basename2() function for your tests?
No, I have not, but it can easily be done with find and replace in an IDE or Notepad++. The basename calls can be found using something like Total Commander.
But I strongly suggest you stick with the second solution.
It is the right approach according to the PHP manual! And you don't need to worry about whether the basename2 function works.
http://php.net/manual/en/function.basename.php
Quote
basename() is locale aware, so for it to see the correct basename with multibyte character paths, the matching locale must be set using the setlocale() function.
Quote
No, I have not, but it can easily be done with find and replace in an IDE or Notepad++.
This is not a problem for me.
I only wanted to know what you did for your tests. ;)
Quote
But I strongly suggest you stick with the second solution.
Quote
basename() is locale aware, so for it to see the correct basename with multibyte character paths, the matching locale must be set using the setlocale() function.
Hm... but what does 'the matching locale must be set using the setlocale() function' mean? How can I be sure that it is always correct?
I have no experience with the setlocale function.
You used in your example:
setlocale(LC_ALL, 'C.UTF-8', 'C');
Is this setting correct for all users? Or only in your case? ::)
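One way to be sure on any given server, sketched here as an assumption rather than a tested fix: probe basename() at runtime with a known UTF-8 name, and fall back to plain byte-wise splitting when the probe fails. Both function names are hypothetical.

```php
<?php
// Check at runtime whether basename() keeps a known UTF-8 name intact
// under the current locale.
function basenameKeepsUtf8()
{
    $probe = "\xD1\x82\xD0\xB5\xD1\x81\xD1\x82.txt";   // "тест.txt" in UTF-8
    return basename('/tmp/' . $probe) === $probe;
}

function safeBasename($path)
{
    if (basenameKeepsUtf8()) {
        return basename($path);
    }

    // Fallback: everything after the last forward slash, byte-wise.
    $path = rtrim($path, '/');
    $pos  = strrpos($path, '/');

    return ($pos === false) ? $path : substr($path, $pos + 1);
}

echo safeBasename('/srv/jdownloads/файл.zip'), "\n";
```

With such a probe the code does not have to guess which locale names are installed; it simply avoids basename() wherever it is found to drop characters.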
Quote from: Arno on 01.12.2014 11:07:56
Is this setting correct for all users? Or only in your case? ::)
I have not tested it, but I am 99% sure it is, because that is the way the PHP developers meant this function to work!
You can easily test it. Download some files with names in other languages and try to upload them.
I will also check this out.
Okay. Many thanks again. ;)
Thank you too! In my opinion, jDownloads is the best file storage solution for Joomla. BTW, what about translation? I have sent you my offer by PM :)
P.S. When do you plan to release the fixed build of jDownloads?
Sorry, but in the last few days I was very busy with some bug fixes and the new jD content plugin.
I will publish all of this very soon (tomorrow?).
Quote
I can translate and send to you .ini files or work through Transifex.
Any help here is also very welcome. Please use the jD 3.x translations group on Transifex for it.
A Russian group already exists, so you do not have to start from zero:
https://www.transifex.com/projects/p/jdownloads-s3/language/ru_RU/
Send a request there and I will add you to this team. 8)
Quote from: Arno on 01.12.2014 11:28:53
Send a request there and I will add you to this team. 8)
Great! Request sent!