
Fix for the "Download directory monitoring". Adding support for UTF8 file names!

Started by Makulia, 08.12.2014 18:29:00


Makulia

Hello Arno and Colin!

This continues my series of posts about UTF-8 support. Today we are fixing the Download directory monitoring feature. I know this is not strictly a bug, but now that full UTF-8 file name support is almost finished, I strongly suggest fixing this feature too!

Problem

Currently, this option can only add files whose names use a limited set of characters!

I have carefully read:

Quote: Important information: The 'auto monitoring' function use always the results from the function: 'Remove/Change special characters in name'. Regardless of whether it is activated or not. So you should only use for this files and folders (which are to be added automatically) the characters: a-zA-Z0-9 ._- in the names.

but I was not satisfied with this limitation, so I decided to add support for UTF-8 names myself.

We are talking about: /administrator/components/com_jdownloads/helpers/scan.php

Solution

1) Remove the unnecessary encoding. Replace

$only_name = utf8_encode(JFile::stripExt($filename));

with

$only_name = JFile::stripExt($filename);


2) Remove the getCleanFolderFileName call, because we want the file name exactly as it is on the OS, and we can handle that!

$filename_new = $only_name.'.'.$file_extension;


3) Comment out the line that applies utf8_encode to the directory name:

//$cat_dir_value = utf8_encode($cat_dir_value);

4) Disable directory name cleaning. Replace

$checked_cat_dir = JDownloadsHelper::getCleanFolderFileName( $cat_dir_value, true );

with

// check the found folder name
$checked_cat_dir = $cat_dir_value;


The only remaining problem is with folder and file aliases: we need to transliterate the name before building the URL with JApplication::stringURLSafe, but we do not know which language or languages the file name is written in.

Folder alias

So, for now, there are two options:

1) Do not create an additional alias if the file name is not in English, and leave the code as it is:


// set alias
$alias = JApplication::stringURLSafe($cat_dir_value);


2) Create the alias, but use the current user's language settings for it.

Edit:

Assume that a user uploads files whose names are English only, like "Test.doc", mixed with a local language, like "Тест_final.doc", or purely in a local language, like "Тест.doc". To perform the transliteration, the "from" language MUST be set!
For testing purposes I hardcoded the current user's language setting, and it worked.


// set alias
$lang = JLanguage::getInstance('ru-RU'); // "from" language
$translitAlias = $lang->transliterate($cat_dir_value);
$alias = JApplication::stringURLSafe($translitAlias);

               

File alias

If we want to create a file alias, we need to fill 'file_alias' in the data array:



$sha1_value = sha1_file($target_path);
$md5_value  = md5_file($target_path);
$lang = JLanguage::getInstance('ru-RU');
// build data array
$data = array (
   'file_id' => 0,
   'cat_id' => $id,
   'file_title' => $title,
   'file_alias' => $lang->transliterate($filename),



And that is it!



Please add this fix to the next release!

[deleted by administrator]

Arno

Hi Makulia,
many thanks again for your work.  8)
I will add this in the next beta, but I must check your 'alias' solution first.

Edit:
If it works, we should also add it to the internal scan function. ;)
Best Regards / Gruß
Arno
Please make a Donation for jDownloads and/or write a review on the Joomla! Extensions directory!

Makulia

If we want to be independent of Joomla's transliteration (ideally, a user can upload files in any language, no matter which language the Joomla UI uses), we should use a custom transliteration implementation, such as:
https://github.com/jbroadway/urlify

I will check it out later!

P.S. I updated the .zip file in the first post!

I think we should also try the transliterator_transliterate function:
http://php.net/manual/ru/transliterator.transliterate.php

But be aware of the system requirements: (PHP >= 5.4.0, PECL intl >= 2.0.0);
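A quick way to check those requirements at runtime could look like the sketch below. The function name jd_can_use_icu_translit is my own invention for illustration, it is not part of jDownloads or Joomla.

```php
<?php
// Hypothetical helper (not part of jDownloads): reports whether the
// requirements for transliterator_transliterate() are met on this server.
function jd_can_use_icu_translit()
{
    return version_compare(PHP_VERSION, '5.4.0', '>=')
        && extension_loaded('intl')
        && function_exists('transliterator_transliterate');
}

// Example: decide which code path to take.
echo jd_can_use_icu_translit()
    ? "ICU transliteration available\n"
    : "ICU transliteration not available, keep the old behaviour\n";
```

With a check like this, scan.php could keep its current behaviour on servers without the intl extension instead of failing with a fatal error.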

Update.

I tried to detect the user's language, but I do not think it is possible, even though I set my default language to Russian in index.php?option=com_languages. We are not logged in when the scan runs, so the code below always returns "en-GB".
So the only option is a setting in our admin panel where the user manually specifies the desired language code in the format "xx-XX". We can then pass this value to JLanguage::getInstance($translitLang);


$lang = JFactory::getLanguage();

echo '<table width="100%"><tr><td><font size="1" face="Verdana">'.JText::_('COM_JDOWNLOADS_BACKEND_AUTOCHECK_TITLE').'</font><br />';
echo 'Current language is: ' . $lang->getTag();


Could you implement such an option?
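Whatever the user types into such an option should probably be checked before it is passed on. Here is a minimal sketch, assuming the strict "xx-XX" format mentioned above and a fallback to "en-GB". The function name jd_normalise_lang_tag and the fallback value are my assumptions, not existing jDownloads code.

```php
<?php
// Hypothetical helper: accept a language tag only in the "xx-XX" form,
// otherwise fall back to a default tag (assumed here to be en-GB).
function jd_normalise_lang_tag($tag, $default = 'en-GB')
{
    $tag = trim((string) $tag);
    // Two lowercase letters, a hyphen, two uppercase letters.
    if (preg_match('/^[a-z]{2}-[A-Z]{2}$/', $tag) === 1) {
        return $tag;
    }
    return $default;
}

echo jd_normalise_lang_tag('ru-RU') . "\n";   // ru-RU
echo jd_normalise_lang_tag('russian') . "\n"; // en-GB
```

The returned value could then go into JLanguage::getInstance() without risking an arbitrary string from the config ending up there.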

Arno

Hm... scan.php can be started from the jD control panel in the backend, but also externally (for example via a cron job).
For the backend we could add a new option in the auto monitoring config tab (or better in the files and folders tab?  ::) )
We could also pass this setting (in your case: ru-RU) to scan.php as a parameter.

Quote: But be aware of system requirements: (PHP >= 5.4.0, PECL intl >= 2.0.0);
If we add this only in jD 3.2, it should be possible:
Quote: Requirements for Joomla! 3.x
Software   Recommended   Minimum
PHP        5.4 +         5.3.10 +
Best Regards / Gruß
Arno

Makulia

1) I think it is better to put this new option under files and folders, because we may use this setting not only in the scan but also in other places (a good solution for now).
2) Yes, I am talking about implementing transliteration with this class function http://php.net/manual/en/class.transliterator.php only in jD 3.2!
3) Yes, I know about the ways to run scan.php. For now I run it through xtcronjob.

Makulia

I have tested PHP's native transliteration and guess what? It works just awesome!  8) ;D

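Test file

<?php
   // Mixed test string: symbols, digits, Cyrillic, underscores and spaces
   $testStr = "№1223 Ёперный Театр_в _доме";
   echo "Original:" . $testStr . "\n";
   // Any script -> Latin, then Latin -> plain ASCII
   echo "Transliterated:" . transliterator_transliterate('Any-Latin; Latin-ASCII;', $testStr);
?>

Output:

Original:№1223 Ёперный Театр_в _доме
Transliterated:No1223 Epernyj Teatr_v _dome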

Folder alias creation



// set alias
$alias = JApplication::stringURLSafe(transliterator_transliterate('Any-Latin; Latin-ASCII;', $cat_dir_value));

// set note hint
$note = JText::_('COM_JDOWNLOADS_RUN_MONITORING_NOTE_TEXT');

// build table array
$data = array (
   'id' => 0,
   'parent_id' => $parent_id,
   'title' => $original_folder_name,
   'alias' => $alias,


File alias creation

$md5_value  = md5_file($target_path);
// build data array
$data = array (
   'file_id' => 0,
   'cat_id' => $id,
   'file_title' => $title,
   'file_alias' => JApplication::stringURLSafe(transliterator_transliterate('Any-Latin; Latin-ASCII;', $filename)),
   'notes' => $note,
   'url_download' => $filename,
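Since the same transliterate-then-make-URL-safe step is needed for both folders and files, it could live in one small helper. The sketch below runs without Joomla: jd_make_alias is a hypothetical name, and the lowercase/hyphen step is only my rough stand-in for JApplication::stringURLSafe, not a copy of its real behaviour.

```php
<?php
// Hypothetical helper: build a URL-safe alias from a file or folder name.
// Assumption: when the intl extension is missing, the name is used as is.
function jd_make_alias($name)
{
    if (function_exists('transliterator_transliterate')) {
        // Any script -> Latin, then Latin -> plain ASCII.
        $ascii = transliterator_transliterate('Any-Latin; Latin-ASCII', $name);
        if ($ascii !== false) {
            $name = $ascii;
        }
    }

    // Rough stand-in for JApplication::stringURLSafe: lowercase, then
    // collapse every run of other characters into a single hyphen.
    $alias = strtolower($name);
    $alias = preg_replace('/[^a-z0-9]+/', '-', $alias);

    return trim($alias, '-');
}

echo jd_make_alias('My File_v2.doc') . "\n"; // my-file-v2-doc
```

Note that without intl a purely non-Latin name would come out as an empty alias, so the caller would still need some fallback for that case.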


Some final thoughts

Multilingual transliteration was a real pain until the ICU project. ICU is a mature, widely used set of C/C++ and Java libraries providing Unicode and globalization support for software applications. The project provides transliteration tables and a rich API, with bindings for different programming languages, to transliterate almost any language to Latin and much more!

PHP 5.4.0 brought ICU transliteration to us through the intl extension functions! Starting from PHP 5.4.0 we can use the transliterator_transliterate function to transliterate from almost any language, without building transliteration tables (arrays) for every language by hand. Also, importantly, we do not need to specify the "from" language to perform the transliteration, as we did in:

$lang = JLanguage::getInstance('ru-RU');
$translitAlias = $lang->transliterate($cat_dir_value);


A fully working scan.php from jD 3.2 build 18 is attached to this post!

I strongly recommend implementing this solution (with the transliterator) in jD v3.2!

[deleted by administrator]

Arno

Quote: I have tested php native transliteration and guess what? It works just awesome!
Wonderful!  8) ;) ;D

I will test this here with other languages tonight and will then quickly add it to 3.2.19 this weekend.

Many thanks again for your hard work.  8)
Best Regards / Gruß
Arno