Umlauts and other diacritics are broken
Moderator: Alastair
-
Mr Creosote
- Posts: 1146
- Joined: Tue Sep 22, 2009 9:23 am
- Contact:
Re: Umlauts and other diacritics are broken
Very good! I've updated the linked page. This now represents what I consider a complete set of fixes all across the website. I'll leave it for review in case anyone spots any mistakes which make it worse. I can pull the trigger any time.
Re: Umlauts and other diacritics are broken
Mr Creosote wrote: Fri Jun 05, 2026 7:46 am Very good! I've updated the linked page. This now represents what I consider a complete set of fixes all across the website. I'll leave it for review in case anyone spots any mistakes which make it worse. I can pull the trigger any time.
I've taken a few minutes to peruse the fixes and I haven't spotted any mistakes. The only thing I have noticed is that my Cliff Johnson comment needs updating: his current site is unknown to me, and it shouldn't be Flash anything, let alone Flash heavy.
- Gunness
- Site Admin
- Posts: 1951
- Joined: Tue Dec 07, 2004 7:04 pm
- Location: Copenhagen, Denmark
- Contact:
Re: Umlauts and other diacritics are broken
There's a lot to keep track of on the linked page, but I haven't spotted anything egregious, either.
So please pull the trigger! And thanks a lot for preparing the fix
Re: Umlauts and other diacritics are broken
A mass edit of individual characters is madness. They were fine in the past. Surely it's just a case of changing the encoding from ASCII (or ANSI) to UTF-8.
Re: Umlauts and other diacritics are broken
Hannes, I see that you have committed the fix, but some parts have gone wrong.
In the fix page that you posted, the apostrophes and dashes in the user comments for https://solutionarchive.com/game/id%2C3 ... Karma.html were properly corrected, but with the fix applied those messed-up apostrophes and dashes have been replaced by question marks. In the few entries I have checked there are also question marks in the synopsis for Rimblenden - https://solutionarchive.com/game/id%2C5 ... enden.html - "Rimblend?n" instead of "Rimblendén", and in the notes for Indiana Jones na Václavském náměstí - https://solutionarchive.com/game/id%2C9 ... C3%AD.html - "The conversion was produced by Jaroslav ?velch and Martin Kouba. It featured new graphics by Jana ?Yuffie? Kilianov?." instead of "The conversion was produced by Jaroslav Švelch and Martin Kouba. It featured new graphics by Jana “Yuffie” Kilianová." (thanks again to the "Wayback Machine" - https://web.archive.org/web/20251014094 ... C3%AD.html). I also see that a number of authors now have question marks in their names.
-
Mr Creosote
- Posts: 1146
- Joined: Tue Sep 22, 2009 9:23 am
- Contact:
Re: Umlauts and other diacritics are broken
Yup, the bit-level operation was accidentally run twice on a subset of the synopses, notes and author names. At least I think this is the affected set. If you spot any issues in other fields, let me know. I have an automated fix for that as well, but I'm going slowly, doing a visual inspection of every field's proposed change one by one.
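For illustration, a repair of this kind can be sketched in Python. This is not the script that was run on the site (that is not shown in this thread); it assumes the damage was UTF-8 text misread as Windows-1252, which is consistent with "FrantiÅ¡ek" appearing for "František". The function name `repair_mojibake` is made up for the example. The point is the guard: text that does not round-trip is left alone, so a second pass does no harm.

```python
def repair_mojibake(text: str) -> str:
    """Undo one round of 'UTF-8 bytes decoded as Windows-1252'.

    Hypothetical helper: text that does not look like such mojibake is
    returned unchanged, so applying the repair twice is harmless.
    """
    try:
        # Recover the original bytes, then decode them as the UTF-8 they were.
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in cp1252, or not valid UTF-8 afterwards:
        # assume the text is already correct and keep it as it is.
        return text


if __name__ == "__main__":
    for broken in ["FrantiÅ¡ek Fuka", "JiÅ™i Koudelka"]:
        once = repair_mojibake(broken)
        twice = repair_mojibake(once)
        print(broken, "->", once, "| second pass unchanged:", once == twice)
```

A conversion that substitutes a placeholder for anything it cannot handle, instead of backing off like this, would be one way to end up with question marks, though whether that is what happened here is only a guess. The guard is also not foolproof (a correct string can, rarely, happen to pass the round trip), so a visual check of each proposed change is still sensible.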
Re: Umlauts and other diacritics are broken
I only checked the German and noticed question marks in authors' names for:
Re: Umlauts and other diacritics are broken
Garry, you won't be able to see it because it is admin only, but there is a page for editing authors' names and details which shows all of the names entered on the site. So there is no need to go through individual entries to find problem names. The same goes for "Genre", "Publisher", "System", "Platform", and "Game Group".
The fields that do need individual checking are "Synopsis", "Notes", and "User Comments" per entry; and the separate "Articles > Interviews", "Articles > Author's notes", and "Reviews".
On the subject of authors, alongside those with question marks in their names I can also see "FrantiÅ¡ek Fuka" (https://solutionarchive.com/list/author%2C3870/) and "JiÅ™i Koudelka" (https://solutionarchive.com/list/author%2C3868/).
-
Mr Creosote
- Posts: 1146
- Joined: Tue Sep 22, 2009 9:23 am
- Contact:
Re: Umlauts and other diacritics are broken
Alright, question marks in notes, synopsis and author names should be taken care of - all I found has been restored. If you find more broken things, I encourage you to fix them by hand. If, however, you indeed find large amounts of broken entries, do alert me here. Thank you for all the support and the pointers so far!
-
Mr Creosote
- Posts: 1146
- Joined: Tue Sep 22, 2009 9:23 am
- Contact:
Re: Umlauts and other diacritics are broken
Alastair wrote: Sun Jun 07, 2026 5:41 pm On the subject of authors, alongside those with question marks in their names I can also see "FrantiÅ¡ek Fuka" (https://solutionarchive.com/list/author%2C3870/) and "JiÅ™i Koudelka" (https://solutionarchive.com/list/author%2C3868/).
The problem with Jiři Koudelka and František Fuka is that they are actually duplicates. When I tried correcting them, it was not possible because of that.
- Gunness
- Site Admin
- Posts: 1951
- Joined: Tue Dec 07, 2004 7:04 pm
- Location: Copenhagen, Denmark
- Contact:
Re: Umlauts and other diacritics are broken
Mr Creosote wrote: Sun Jun 07, 2026 5:52 pm Alright, question marks in notes, synopsis and author names should be taken care of - all I found has been restored. If you find more broken things, I encourage you to fix by hand. If, however, you indeed find large amounts of broken entries, do alert me here. Thank you for all the support and the pointers so far!
Thanks for your effort!
Re: Umlauts and other diacritics are broken
Mr Creosote wrote: Sun Jun 07, 2026 6:28 pm The problem with Jiři Koudelka and František Fuka is that they are actually duplicates. When I tried correcting them, it is not possible due to that.
So should I delete FrantiÅ¡ek Fuka and JiÅ™i Koudelka, then rename Frantisek Fuka and Jiri Koudelka to František Fuka and Jiři Koudelka respectively by hand?
-
Mr Creosote
- Posts: 1146
- Joined: Tue Sep 22, 2009 9:23 am
- Contact:
Re: Umlauts and other diacritics are broken
Well, it depends. If Frantisek Fuka and František Fuka are actually the same person, I suggest relinking all entries of both to one and deleting the one no longer needed. It doesn't really matter whether you retain František or Frantisek. When the other is gone, you can rename the remaining one as you like. If they are actually different people... we have not technically foreseen two people with the same name. I believe the common way libraries handle it is to call them "František Fuka (I)" and "František Fuka (II)".
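The relink-then-delete approach described above could look roughly like this. The `Archive` class, its two dictionaries and the ids are all invented for the example; the site's real schema is not shown in this thread.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Archive:
    """Hypothetical in-memory stand-in for the author and game tables."""
    authors: Dict[int, str] = field(default_factory=dict)            # author id -> name
    game_authors: Dict[int, Set[int]] = field(default_factory=dict)  # game id -> author ids

    def merge_authors(self, keep_id: int, drop_id: int,
                      final_name: Optional[str] = None) -> None:
        """Relink every game from drop_id to keep_id, then delete drop_id."""
        if keep_id == drop_id:
            raise ValueError("keep_id and drop_id must differ")
        if keep_id not in self.authors or drop_id not in self.authors:
            raise KeyError("both authors must exist before merging")
        for author_ids in self.game_authors.values():
            if drop_id in author_ids:
                author_ids.discard(drop_id)
                author_ids.add(keep_id)
        del self.authors[drop_id]
        # Only once the duplicate is gone can the survivor take the final name.
        if final_name is not None:
            self.authors[keep_id] = final_name


# Example with made-up ids: 1 is the plain spelling, 2 the duplicate entry.
archive = Archive(
    authors={1: "Frantisek Fuka", 2: "František Fuka"},
    game_authors={10: {1}, 11: {2}},
)
archive.merge_authors(keep_id=1, drop_id=2, final_name="František Fuka")
```

Renaming last mirrors the order suggested above: the rename can only succeed once no other entry holds the same name.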
- Gunness
- Site Admin
- Posts: 1951
- Joined: Tue Dec 07, 2004 7:04 pm
- Location: Copenhagen, Denmark
- Contact:
Re: Umlauts and other diacritics are broken
We have the same issue with publishers. Like, two Dragonsoft, two Dream Software, two Excalibur Software etc. I've added a [1] and [2] to the names in question.
Re: Umlauts and other diacritics are broken
Hannes, there are no games attributed to FrantiÅ¡ek Fuka and JiÅ™i Koudelka, so deleting those entries is not a problem. What you have now made me wonder is which spelling is correct: František Fuka or Frantisek Fuka, and Jiři Koudelka or Jiri Koudelka? Not knowing the answer, I will leave them alone for the moment.
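To find pairs like these across the whole author list, one option is to compare names with their diacritics folded away. A minimal sketch in Python, assuming the names are available as a plain list; the function names `fold` and `likely_duplicates` are made up here. It only flags candidates for a human to look at and cannot say which spelling is the correct one.

```python
import unicodedata
from collections import defaultdict
from typing import Iterable, List


def fold(name: str) -> str:
    """Return a comparison key: diacritics stripped, case ignored."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks (caron, acute, ...) left by decomposition.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def likely_duplicates(names: Iterable[str]) -> List[List[str]]:
    """Group names that become identical once folded."""
    groups = defaultdict(list)
    for name in names:
        groups[fold(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    sample = ["Frantisek Fuka", "František Fuka",
              "Jiri Koudelka", "Jiři Koudelka", "Martin Kouba"]
    for group in likely_duplicates(sample):
        print(group)
```

Letters that have no decomposition, such as "ø" or "ł", are not folded by this approach, so those would need separate handling.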