The dashboard currently generates slugs from titles (i.e. for shows). This is not ideal because the rules around slug generation are rather complex and might involve parsing and applying Unicode rules that are very large (and therefore shouldn’t be loaded by browsers). Slug generation should therefore be done in steering.
A few notes that we should discuss and derive tickets from:
slug generation: check if we can actually use Unicode in slugs so that characters like ä are not stripped and entirely non-ASCII strings (like Arabic titles) do not end up empty
API: auto-derive the slug from the title where appropriate and if no slug is provided
API: if a slug is provided make sure it complies with the slug rules
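The two API notes above could be sketched roughly like this. This is a stdlib-only illustration, not steering code: the names derive_slug, resolve_slug and SLUG_PATTERN are hypothetical, and the slug rules (lowercase ASCII letters and digits separated by single hyphens) are an assumption, since the actual rules are still to be defined.

```python
import re
import unicodedata
from typing import Optional

# Assumed slug rules: lowercase ASCII letters and digits, separated by single hyphens.
SLUG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def derive_slug(title: str) -> str:
    """Derive an ASCII slug from a title (naive: drops whatever NFKD cannot decompose)."""
    ascii_text = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def resolve_slug(title: str, slug: Optional[str] = None) -> str:
    """Return the slug to store, or raise ValueError if no acceptable slug exists."""
    if slug is None:
        # No slug provided: auto-derive it from the title.
        slug = derive_slug(title)
        if not slug:
            raise ValueError("could not derive a slug from the title")
        return slug
    # Slug provided: it has to comply with the (assumed) slug rules.
    if not SLUG_PATTERN.fullmatch(slug):
        raise ValueError(f"slug {slug!r} does not comply with the slug rules")
    return slug
```

With these assumed rules, resolve_slug("AURA-3") gives "aura-3", while an entirely non-ASCII title raises an error instead of silently producing an empty slug.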
python-slugify looks nice if we decide to enforce ASCII representations, though they don’t have examples for languages/language families like Hebrew and Arabic, for which Latin transcriptions may not exist.
We still have to decide whether we want to allow Unicode in slugs.
python-slugify probably uses good-enough transliterations for most characters. German umlauts are supported with the PRE_TRANSLATIONS replacement characters. Control characters and emoji are stripped in any case.
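To illustrate why a replacement table matters for German, here is a stdlib-only sketch. It is not python-slugify's implementation; ascii_slug and GERMAN_REPLACEMENTS are hypothetical names, and the table is a hand-written assumption.

```python
import re
import unicodedata

# Hypothetical replacement table, applied before the ASCII folding step.
GERMAN_REPLACEMENTS = [("ä", "ae"), ("ö", "oe"), ("ü", "ue"), ("ß", "ss")]


def ascii_slug(text: str, replacements=()) -> str:
    """Lowercase, apply replacements, fold to ASCII, join the rest with hyphens."""
    text = text.lower()
    for search, replace in replacements:
        text = text.replace(search, replace)
    # NFKD splits "ö" into "o" plus a combining mark; the mark is then dropped.
    # Characters without a decomposition (like "ß") are dropped entirely.
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", text).strip("-")


print(ascii_slug("Schönes, heißes Wetter"))
# schones-heies-wetter
print(ascii_slug("Schönes, heißes Wetter", GERMAN_REPLACEMENTS))
# schoenes-heisses-wetter
```

Without the table the "ß" is silently lost, which is the kind of result a transliteration library is meant to avoid. Emoji and control characters fall out in both variants because they are neither ASCII letters nor digits.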
But if we are confident that no one expects pure ASCII but just URL-compatible strings, we might want to consider setting allow_unicode=True. In the few tests I did with Arabic, Hebrew and simplified Chinese characters in URLs, none of them did anything funky.
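The difference between the two modes can be sketched with the standard library alone. The function name slugify_sketch is hypothetical and the allow_unicode parameter only mirrors the option mentioned above; the real library may behave differently in the details.

```python
import re
import unicodedata


def slugify_sketch(text: str, allow_unicode: bool = False) -> str:
    """Build a slug that is either ASCII-only or keeps Unicode letters and digits."""
    if allow_unicode:
        # Keep non-ASCII characters, but normalize them to a canonical form.
        text = unicodedata.normalize("NFKC", text)
    else:
        # Fold to ASCII; anything that cannot be decomposed is dropped.
        text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    # Remove everything that is not a word character, whitespace or hyphen.
    text = re.sub(r"[^\w\s-]", "", text.lower())
    # Collapse runs of whitespace, hyphens and underscores into one hyphen.
    return re.sub(r"[-\s_]+", "-", text).strip("-")


arabic_title = "مرحبا بالعالم"
print(repr(slugify_sketch(arabic_title)))                      # empty string
print(repr(slugify_sketch(arabic_title, allow_unicode=True)))  # title with a hyphen
```

In ASCII mode an Arabic title collapses to an empty string, so the API would need a fallback or an error for that case; in Unicode mode the letters survive and only the space becomes a hyphen.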
Can you please point me to some online resources showing that Unicode characters in URLs are allowed and fully compatible with average browsers? A quick search didn't give me any results. Honestly, I have never even seen such URL fragments in the wild.
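As background for that question: a URI in the sense of RFC 3986 only consists of ASCII characters, so non-ASCII characters travel as percent-encoded UTF-8 bytes, and browsers typically display the decoded form. A small sketch of what a Unicode slug looks like on the wire, using only the standard library (the helper name encode_slug_for_url is hypothetical):

```python
from urllib.parse import quote, unquote


def encode_slug_for_url(slug: str) -> str:
    """Percent-encode a slug as UTF-8 so it can be used as a URL path segment."""
    return quote(slug, safe="")


encoded = encode_slug_for_url("schöne-grüße")
print(encoded)           # sch%C3%B6ne-gr%C3%BC%C3%9Fe
print(unquote(encoded))  # schöne-grüße
```

The round trip is lossless, so the question is less whether such URLs work and more whether we want slugs that look like this whenever they are copied in encoded form.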
Just got involved and tried out some online slug generators with this input: Není zač! Schönes, heißes Køttbullår?. There was no big difference between the outputs except for formatting such as uppercasing or stripping spaces: neni_zac_schones_heisses_kottbullar_
Emojis and Arabic letters are handled differently by each generator: some keep both, some strip them, and one transliterated "مثال" (which means "example") to "mthl". The last generator offered a "Strip Special Characters" option, but toggling it didn't change the output.
I noticed the growing Unicode support in some libraries, but I'd suggest using ASCII in order to keep it simple and avoid any potential issues, e.g. with outdated browsers.
There are two situations that the API needs to handle and, as far as I can tell, they call for two different status codes in order to be useful.
If the request to create or update a show contains a slug that already exists, a validation error is raised and the response is a "400 Bad Request" (the message is localized, here in German: "show with this slug already exists."):
{"slug":["show mit diesem slug existiert bereits."]}
If the request to create a show doesn’t contain a slug, the name ("AURA-3") is used to generate one. Validating the generated slug raises an integrity error if it already exists, and the response is a "409 Conflict":
{"slug":["show with the slug 'aura-3' already exists."]}
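The two cases could be told apart roughly like this. It is a framework-free sketch under assumptions: create_show, naive_slug and the in-memory set of existing slugs are hypothetical stand-ins for the real view, slug generator and database.

```python
import re


def naive_slug(name: str) -> str:
    """Placeholder slug generator: lowercase ASCII letters and digits joined by hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def create_show(payload: dict, existing_slugs: set) -> tuple:
    """Return (status_code, body) for a request to create a show."""
    slug = payload.get("slug")
    if slug is not None:
        # The client chose the slug, so a clash is a validation error: 400.
        if slug in existing_slugs:
            return 400, {"slug": ["show with this slug already exists."]}
    else:
        # The slug was generated from the name, so a clash is a conflict: 409.
        slug = naive_slug(payload["name"])
        if slug in existing_slugs:
            return 409, {"slug": [f"show with the slug '{slug}' already exists."]}
    existing_slugs.add(slug)
    return 201, {"slug": slug}


slugs = {"aura-3"}
print(create_show({"name": "Other", "slug": "aura-3"}, slugs))  # status 400
print(create_show({"name": "AURA-3"}, slugs))                   # status 409
```

The point of the split is that the client can fix a 400 by changing the slug it sent, whereas a 409 tells it that the generated slug clashed and it has to send an explicit one.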
I think it can work this way.
What do you think @kmohrf, would this be good enough for the dashboard?
In some cases steering already returns error objects like this:
{"slug":[{"message":"This field must be unique","code":"unique"}]}
The dashboard then returns a localized translation based on the code, rather than the provided message. The unique code is already implemented in the dashboard. So I’d prefer the error object instead of the simple error message.
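To make the preference concrete, this is roughly how a client can pick a localized message by code and fall back to the provided message otherwise. The dashboard would do this in its own language; the Python below is only a sketch, and TRANSLATIONS and localize_errors are hypothetical names with made-up translations.

```python
# Hypothetical translation catalogue: locale -> error code -> message.
TRANSLATIONS = {
    "en": {"unique": "This value is already in use."},
    "de": {"unique": "Dieser Wert wird bereits verwendet."},
}


def localize_errors(errors: dict, locale: str) -> dict:
    """Map each error to a localized message, keyed by its code where available."""
    catalogue = TRANSLATIONS.get(locale, {})
    localized = {}
    for field, items in errors.items():
        messages = []
        for item in items:
            if isinstance(item, dict):
                # Error object: translate by code, fall back to the server message.
                messages.append(catalogue.get(item.get("code"), item.get("message", "")))
            else:
                # Plain string: there is no code, so it can only be shown as-is.
                messages.append(item)
        localized[field] = messages
    return localized


response = {"slug": [{"message": "This field must be unique", "code": "unique"}]}
print(localize_errors(response, "de"))
# {'slug': ['Dieser Wert wird bereits verwendet.']}
```

A plain string like the ones in the 400 and 409 responses above gives the client nothing to look up, which is why the error object is the more useful shape.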