Umlauts and other diacritics are broken

Information from and to the site administrators.

Moderator: Alastair

Mr Creosote
Posts: 1146
Joined: Tue Sep 22, 2009 9:23 am

Re: Umlauts and other diacritics are broken

#16 Post by Mr Creosote »

Very good! I've updated the linked page. This now represents what I consider a complete set of fixes all across the website. I'll leave it for review in case anyone spots any mistakes which make it worse. I can pull the trigger any time.
Alastair
Posts: 1169
Joined: Fri Nov 11, 2005 12:21 am

Re: Umlauts and other diacritics are broken

#17 Post by Alastair »

Mr Creosote wrote: Fri Jun 05, 2026 7:46 am Very good! I've updated the linked page. This now represents what I consider a complete set of fixes all across the website. I'll leave it for review in case anyone spots any mistakes which make it worse. I can pull the trigger any time.
I've taken a few minutes to peruse the fixes and I haven't spotted any mistakes. The only thing I have noticed is that my Cliff Johnson comment needs updating: his current site is unknown to me, and it shouldn't be Flash anything, let alone Flash heavy.
Gunness
Site Admin
Posts: 1951
Joined: Tue Dec 07, 2004 7:04 pm
Location: Copenhagen, Denmark

Re: Umlauts and other diacritics are broken

#18 Post by Gunness »

There's a lot to keep track of on the linked page, but I haven't spotted anything egregious, either.

So please pull the trigger! And thanks a lot for preparing the fix :)
Garry
Posts: 513
Joined: Sun Oct 28, 2012 11:43 am
Location: Sydney, Australia

Re: Umlauts and other diacritics are broken

#19 Post by Garry »

A mass edit of individual characters is madness. They were fine in the past. Surely it's just a case of changing the encoding from ASCII (or ANSI) to UTF-8.
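For illustration only: a minimal Python sketch of how this kind of breakage typically arises, assuming the usual failure mode in which UTF-8 bytes are read back as Windows-1252. The thread does not confirm that this is what happened on the site; the function name `to_mojibake` is hypothetical, and the sample names are the ones that appear later in the thread.

```python
def to_mojibake(text: str) -> str:
    """Simulate UTF-8 bytes being misread as Windows-1252.

    Assumption: this is the common cause of strings such as "Å¡" showing
    up in place of "š"; the site's real cause is not stated in the thread.
    """
    return text.encode("utf-8").decode("cp1252")


print(to_mojibake("František Fuka"))  # FrantiÅ¡ek Fuka
print(to_mojibake("Jiři Koudelka"))   # JiÅ™i Koudelka
```

If that assumption holds, the stored text is doubly encoded, which is why changing only the declared page encoding would not necessarily bring the original characters back.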
Alastair
Posts: 1169
Joined: Fri Nov 11, 2005 12:21 am

Re: Umlauts and other diacritics are broken

#20 Post by Alastair »

Hannes, I see that you have committed the fix but some parts have gone wrong :(

On the fix page that you posted, the apostrophes and dashes in the user comments for https://solutionarchive.com/game/id%2C3 ... Karma.html were properly corrected, but with the fix applied those messed-up apostrophes and dashes have been replaced by question marks. In the few entries I have checked there are also question marks in the synopsis for Rimblenden - https://solutionarchive.com/game/id%2C5 ... enden.html - "Rimblend?n" instead of "Rimblendén", and in the notes for Indiana Jones na Václavském náměstí - https://solutionarchive.com/game/id%2C9 ... C3%AD.html - "The conversion was produced by Jaroslav ?velch and Martin Kouba. It featured new graphics by Jana ?Yuffie? Kilianov?." instead of "The conversion was produced by Jaroslav Švelch and Martin Kouba. It featured new graphics by Jana “Yuffie” Kilianová." (thanks again to the "Wayback Machine" - https://web.archive.org/web/20251014094 ... C3%AD.html). I also see that a number of authors now have question marks in their names.
Mr Creosote
Posts: 1146
Joined: Tue Sep 22, 2009 9:23 am

Re: Umlauts and other diacritics are broken

#21 Post by Mr Creosote »

Yup, the bit-level operation was accidentally run twice on a subset of the synopses, notes and author names. At least I think this is the affected set. If you spot any issues in other fields, let me know. I have an automated fix for that as well, but I'm going slowly, doing a visual inspection of every field's proposed change one by one.
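For illustration only: a minimal Python sketch of why running such a repair twice can destroy characters, and one way to make it safe to re-run. It assumes the repair reverses a UTF-8-read-as-Windows-1252 mix-up; the actual operation used on the site is not described in the thread, and `repair_once` and `repair_unguarded` are hypothetical names.

```python
def repair_once(text: str) -> str:
    """Reverse one round of UTF-8-read-as-Windows-1252 mojibake.

    If the text does not round-trip cleanly it is returned unchanged,
    so applying the repair a second time is harmless.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


def repair_unguarded(text: str) -> str:
    """Same conversion, but invalid bytes are replaced instead of rejected."""
    return text.encode("cp1252", errors="replace").decode("utf-8", errors="replace")


broken = "Rimblend\u00c3\u00a9n"      # mojibake form of "Rimblendén"
fixed = repair_once(broken)
print(fixed)                          # Rimblendén
print(repair_once(fixed) == fixed)    # True: the second run changes nothing
print(repr(repair_unguarded(fixed)))  # contains U+FFFD where "é" used to be
```

The guarded version can still misfire on genuine text that happens to look like mojibake, so a visual check of each proposed change remains worthwhile. The replacement character produced by the unguarded version may end up displayed or stored as a plain question mark, depending on the system.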
Garry
Posts: 513
Joined: Sun Oct 28, 2012 11:43 am
Location: Sydney, Australia

Re: Umlauts and other diacritics are broken

#22 Post by Garry »

I only checked the German and noticed question marks in authors' names for:
Alastair
Posts: 1169
Joined: Fri Nov 11, 2005 12:21 am

Re: Umlauts and other diacritics are broken

#23 Post by Alastair »

Garry, you won't be able to see it because it is admin only, but there is a page for editing authors' names and details which shows all of the names entered on the site. So there is no need to go through individual entries to find problem names. The same goes for "Genre", "Publisher", "System", "Platform", and "Game Group".

What do need individual checking are "Synopsis", "Notes", and "User Comments" per entry; and the separate "Articles > Interviews", "Articles > Author's notes", and "Reviews".


On the subject of authors, alongside those with question marks in their names I can also see "František Fuka" (https://solutionarchive.com/list/author%2C3870/) and "Jiři Koudelka" (https://solutionarchive.com/list/author%2C3868/).
Mr Creosote
Posts: 1146
Joined: Tue Sep 22, 2009 9:23 am

Re: Umlauts and other diacritics are broken

#24 Post by Mr Creosote »

Alright, question marks in notes, synopsis and author names should be taken care of - all I found has been restored. If you find more broken things, I encourage you to fix them by hand. If, however, you do find large numbers of broken entries, do alert me here. Thank you for all the support and the pointers so far!
Mr Creosote
Posts: 1146
Joined: Tue Sep 22, 2009 9:23 am

Re: Umlauts and other diacritics are broken

#25 Post by Mr Creosote »

Alastair wrote: Sun Jun 07, 2026 5:41 pm On the subject of authors, alongside those with question marks in their names I can also see "František Fuka" (https://solutionarchive.com/list/author%2C3870/) and "Jiři Koudelka" (https://solutionarchive.com/list/author%2C3868/).
The problem with Jiři Koudelka and František Fuka is that they are actually duplicates. When I tried correcting them, it was not possible because of that.
Gunness
Site Admin
Posts: 1951
Joined: Tue Dec 07, 2004 7:04 pm
Location: Copenhagen, Denmark

Re: Umlauts and other diacritics are broken

#26 Post by Gunness »

Mr Creosote wrote: Sun Jun 07, 2026 5:52 pm Alright, question marks in notes, synopsis and author names should be taken care of - all I found has been restored. If you find more broken things, I encourage you to fix them by hand. If, however, you do find large numbers of broken entries, do alert me here. Thank you for all the support and the pointers so far!
Thanks for your effort!
Alastair
Posts: 1169
Joined: Fri Nov 11, 2005 12:21 am

Re: Umlauts and other diacritics are broken

#27 Post by Alastair »

Mr Creosote wrote: Sun Jun 07, 2026 6:28 pm The problem with Jiři Koudelka and František Fuka is that they are actually duplicates. When I tried correcting them, it was not possible because of that.
So should I delete FrantiÅ¡ek Fuka and JiÅ™i Koudelka, then rename Frantisek Fuka and Jiri Koudelka to František Fuka and Jiři Koudelka respectively by hand?
Mr Creosote
Posts: 1146
Joined: Tue Sep 22, 2009 9:23 am

Re: Umlauts and other diacritics are broken

#28 Post by Mr Creosote »

Well, it depends. If Frantisek Fuka and František Fuka are actually the same person, I suggest relinking all entries of both to one and deleting the one no longer needed. It doesn't really matter whether you retain František or Frantisek. When the other is gone, you can rename the remaining one as you like. If they are actually different people... we have not technically foreseen two people with the same name. I believe the common way libraries handle it is to call them "František Fuka (I)" and "František Fuka (II)".
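For illustration only: a minimal Python sketch of one way to list author entries that differ only in diacritics before relinking anything. It uses the names mentioned in this thread as sample data; `fold` and `duplicate_candidates` are hypothetical helpers, not part of the site's code, and the output only suggests candidates. Whether two entries really are the same person still has to be decided by hand.

```python
import unicodedata
from collections import defaultdict


def fold(name: str) -> str:
    """Strip combining accents and case so 'František' and 'Frantisek' compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def duplicate_candidates(names):
    """Group names that become identical once accents and case are ignored."""
    groups = defaultdict(list)
    for name in names:
        groups[fold(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


# Sample data taken from the thread, not from the real author table.
names = [
    "Frantisek Fuka",
    "František Fuka",
    "Jiri Koudelka",
    "Jiři Koudelka",
    "Martin Kouba",
]

for group in duplicate_candidates(names):
    print(group)
# ['Frantisek Fuka', 'František Fuka']
# ['Jiri Koudelka', 'Jiři Koudelka']
```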
Gunness
Site Admin
Posts: 1951
Joined: Tue Dec 07, 2004 7:04 pm
Location: Copenhagen, Denmark

Re: Umlauts and other diacritics are broken

#29 Post by Gunness »

We have the same issue with publishers. Like, two Dragonsoft, two Dream Software, two Excalibur Software etc. I've added a [1] and [2] to the names in question.
Alastair
Posts: 1169
Joined: Fri Nov 11, 2005 12:21 am

Re: Umlauts and other diacritics are broken

#30 Post by Alastair »

Hannes, there are no games attributed to FrantiÅ¡ek Fuka and JiÅ™i Koudelka, so deleting those entries is not a problem. The question you have now made me realise I need to answer is: which is the correct spelling, František Fuka or Frantisek Fuka, and Jiři Koudelka or Jiri Koudelka? Not knowing the answer, I will leave them alone for the moment.