Some documents do not seem to get all their words indexed, and sometimes empty words get indexed. Sometimes there are attempts to index empty words multiple times, which produces the error message:
DBD::mysql::st execute failed: Duplicate entry 'documents--75' for key 'PRIMARY' at /opt/eprints3/bin/../perl_lib/EPrints/Database.pm line 1289.'
Reviewing the particular characters reported by the person who led me to create this issue, I think most, if not all, of them come from the range 0x1d400 to 0x1d7ff. This range includes alphabetical characters, Greek letters and numbers in certain font styles. These are probably used in formulae that appear within research publications.
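To illustrate why this range is a problem, here is a minimal Python sketch (not EPrints code, which is Perl). It checks how many bytes a character needs in UTF-8. MySQL's `utf8` character set stores at most 3 bytes per character, so anything longer cannot be stored as-is. The helper names `utf8_len` and `needs_utf8mb4` are made up for this example.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes the character occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))


def needs_utf8mb4(ch: str) -> bool:
    """True if the character does not fit in MySQL's 3-byte utf8 charset."""
    return utf8_len(ch) > 3


# Ordinary letters fit; characters from 0x1d400 to 0x1d7ff do not.
for ch in ["A", "\u0391", "\U0001D400", "\U0001D7FF"]:
    print(f"U+{ord(ch):05X} bytes={utf8_len(ch)} needs_utf8mb4={needs_utf8mb4(ch)}")
```

Every code point at or above 0x10000 encodes to 4 bytes in UTF-8, so the whole 0x1d400 to 0x1d7ff range is affected.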
Adding extra entries to $EPrints::Index::FREETEXT_CHAR_MAPPING should solve the problem. The problem may also be solved by changing EPrints' database to use a utf8mb4 character set. However, changing from utf8 to utf8mb4 for an existing table is non-trivial.
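As a rough idea of how candidate mapping entries could be generated rather than typed by hand, the Python sketch below uses Unicode NFKC normalization to find a plain equivalent for each character in the 0x1d400 to 0x1d7ff range. This is only an illustration under assumptions: it is not EPrints code, `build_mapping` is a hypothetical helper, and the output would still need converting into whatever form $EPrints::Index::FREETEXT_CHAR_MAPPING expects.

```python
import unicodedata


def build_mapping(start: int = 0x1D400, end: int = 0x1D7FF) -> dict:
    """Map each styled character in the range to its NFKC equivalent.

    Code points with no compatibility mapping (for example unassigned
    ones) normalize to themselves and are skipped. Targets that would
    still need 4 bytes in UTF-8 are skipped as well.
    """
    mapping = {}
    for cp in range(start, end + 1):
        ch = chr(cp)
        plain = unicodedata.normalize("NFKC", ch)
        if plain == ch:
            continue  # nothing to map to
        if any(ord(c) >= 0x10000 for c in plain):
            continue  # target would not fit in 3-byte utf8 either
        mapping[ch] = plain
    return mapping


mapping = build_mapping()
print(len(mapping), "candidate entries")
print(repr(mapping["\U0001D400"]))  # mathematical bold capital A
```

Whether folding, say, a bold mathematical A into a plain A is the right behaviour for the index is a separate decision; the sketch only shows that the equivalents can be derived mechanically.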
It is likely that publications will continue to use more characters that fall outside the 3-byte utf8 range, so support for utf8mb4 in EPrints should be given greater importance, probably as a feature of 3.5.