مستخدم:StevenJ81/ملعب
Language committee activities
First of all, I apologize for writing in English. But as you see, I understand English and French, but not Arabic. So holding a conversation here in Arabic is just not going to be very helpful to me.
Second, clearly some of my recent work at Meta has unexpectedly (and unintentionally) caused a lot of discomfort here. User:TonyBalloni suggested that I reach out and try to explain what has just happened, and why it happened, so as to lower any tensions, and so that you understand what has happened and how we can move forward in the most productive way.
Courtesy links
General rules on definition of a language
About twelve years ago, the Language Committee and the WMF Board decided that Wikimedia would follow the ISO 639-3 (more complete page in en) standard as to what constitutes a language and what doesn't. Understand that the lines between a "dialect" and a "language", and for that matter a "language" and a "macrolanguage", are not always very clear. The WMF Board and the Language Committee decided that we could not do a better job than the organizations managing the standard at defining languages. So we go by the official, standard definitions. The decisions are made elsewhere, not anywhere within WMF. This system is not perfect, but it works well overall. Any locally-run system we would replace it with would be no better, and would add an additional political issue to WMF that no one really wants here.
When the current policy was established, the following detail was part of the policy:
- Any projects and languages already existing within WMF would be allowed to continue as they were, regardless of their official definition in the ISO standard.
ISO 639-3 Macrolanguages
I am copying in the entire definition of "macrolanguage" from the standards organization for ISO 639-3:[1]
Other parts of ISO 639 have included identifiers designated as individual language identifiers that correspond in a one-to-many manner with individual language identifiers in this part of ISO 639. For instance, this part of ISO 639 contains over 30 identifiers designated as individual language identifiers for distinct varieties of Arabic, while ISO 639-1 and ISO 639-2 each contain only one identifier for Arabic, "ar" and "ara" respectively, which are designated as individual language identifiers in those parts of ISO 639. It is assumed here that the single identifiers for Arabic in parts 1 and 2 of ISO 639 correspond to the many identifiers collectively for distinct varieties of Arabic in part 3 of ISO 639.
In this example, it may appear that the single identifiers in ISO 639-1 and ISO 639-2 should be designated as collective language identifiers. That is not assumed here. In various parts of the world, there are clusters of closely-related language varieties that, based on the criteria discussed above, can be considered distinct individual languages, yet in certain usage contexts a single language identity for all is needed. Typical situations in which this need can occur include the following:
- There is one variety that is more developed and that tends to be used for wider communication by speakers of various closely-related languages; as a result, there is a perceived common linguistic identity across these languages. For instance, there are several distinct spoken Arabic languages, but Standard Arabic is generally used in business and media across all of these communities, and is also an important aspect of a shared ethno-religious unity. As a result, a perceived common linguistic identity exists.
- There is a common written form used for multiple closely-related languages. For instance, multiple Chinese languages share a common written form.
- There is a transitional socio-linguistic situation in which sub-communities of a single language community are diverging, creating a need for some purposes to recognize distinct languages while, for other purposes, a single common identity is still valid. For instance, in some contexts it is necessary to make a distinction between Bosnian, Croatian and Serbian languages, yet there are other contexts in which these distinctions are not discernible in language resources that are in use.
Where such situations exist, an identifier for the single, common language identity is considered in this part of ISO 639 to be a macrolanguage identifier.
Macrolanguages are distinguished from language collections in that the individual languages that correspond to a macrolanguage must be very closely related, and there must be some domain in which only a single language identity is recognized.
As you can see, the ISO standard lists "Arabic" as a macrolanguage, with some 30 constituent languages as part of it. Notice also the difference between the infoboxes on the English Wikipedia page on Arabic (en:Arabic) and the Arabic Wikipedia page (اللغة العربية).
Macrolanguages and eligibility requirement #3
This definition makes it hard to know exactly what to do with macrolanguages insofar as Eligibility requirement #3 is concerned.
- As a side note, I encourage you to read the reasons for that requirement by clicking the hyperlink. The additional reason not stated there, though, is that it is better not to dilute the effort required to build projects by creating parallel projects that are extremely similar.
To be sure, there are some macrolanguages that are just barely similar to their constituent languages, and some that are extremely similar. On the whole, for languages not yet having projects, the Language Committee usually wants only either a macrolanguage project or projects in constituent languages, not both. In general, we favor projects in the constituent languages, but we can be convinced to go the other way when the constituent languages are very similar (especially in writing).
Where projects already existed when the policy was established, they are allowed to continue. Thus, there were, and are, very large, successful Wikipedias in both Arabic and Chinese, which are both macrolanguages. No one would dream of changing that. In most such cases (except those two), if the constituent language projects didn't exist then, there is a strong preference to support the macrolanguage project rather than encouraging multiple similar projects.
That said, there is a long-established practice in WMF to allow projects in both the macrolanguages and constituent languages for both Arabic and Chinese. (This practice long predates my involvement in the matter.) Consider:
- There are independent Wikipedias in Literary Chinese, Cantonese, Eastern Min and Southern Min in parallel to Chinese Wikipedia.
- There is the Egyptian Arabic Wikipedia in parallel to this project.
Requests for projects in Moroccan Arabic and Algerian Arabic were marked as eligible as early as 2008 and 2009. Please keep in mind:
- This community tends to see these as "dialects" of Arabic.
- Language Committee, on the other hand, treats them as "languages" within the Arabic macrolanguage. It does not see itself as approving projects in "dialects", but rather as approving projects in "languages".
- Whether or not you agree with those approaches, at this point this is the long-standing practice of the Language Committee.
The recent issue concerning San'ani
It is not the ordinary practice of the Language Committee to notify communities about routine requests to create new projects. We wouldn't necessarily know in most cases who needs to be told, and in any event the list of requests is always open and available at Requests for new languages on Meta. Please watch that page. That is the best way to know what projects are being proposed.
In this particular case, the request was sitting open and unaddressed for just about six years, from July 2013 until June 2019. Over that period, nobody objected. I had no way of knowing that anyone would be troubled by this. As part of my efforts over the last two years to move these requests along and make decisions on them, I came to this particular request. Seeing no reason not to approve its eligibility routinely, particularly in view of the longstanding practice around Arabic varieties, I marked it "eligible". The only difference is that in this case it came to the attention of this community. I don't regret that, but if you want to know why and how this happened just now, and why nobody informed this community, that's the answer.
"Eligibility" vs. "Final approval"
I will copy here from the comments at the end of the request for Wikipedia:
- It is one thing for a project to be "eligible". It is another thing for a project to be approved. That takes a lot of work. This particular test project currently contains exactly ten pages, none of which is ready for publication anyway. It will likely be years, if ever, before this test project is ready to consider for approval.
- It's almost unheard of for projects like this to become full-featured encyclopedias, anyway. If this project is ever close to approval, the most likely scenario would be that it covers cultural, geographic, and perhaps political issues that are local to its community, but that aren't considered interesting (or even, perhaps, notable) to the Arabic community at large.
- To illustrate, look at the Guianan Creole [Test] Wikipedia. It is written in a French-based creole spoken in Guyane. People in Guyane also understand standard French, and this test Wikipedia does not cover anything close to the breadth of material that French Wikipédia does. However, it covers material of interest to people in Guyane and nearby parts of South America and the Caribbean that are not really of interest or note to the larger French-speaking world. So it serves its purpose. I could easily see this project as similar.
To elaborate just a little bit, I'd have you open another tab in your browser to look at m:List of Wikipedias. There are over 300 Wikipedia projects in various languages. But of those, probably no more than around 50 are really what I'd call "full-featured encyclopedias" – encyclopedias that are at least equivalent to a student's bookshelf encyclopedia, where you can look up a wide range of regular, everyday subjects. Surely once you get to projects of fewer than 100,000 pages, all (or nearly all) are in languages where people regularly also use another language to communicate. (I'm thinking, for example, that people who speak Asturian also speak Spanish, people who speak Pashto also speak Urdu, people who speak Romansh also speak at least one of the other languages of Switzerland.)
So then why have an Asturian, or Pashto, or Romansh Wikipedia at all? There are a couple of reasons:
- To some extent, there's just a matter of pride: We can actually do this. And there's something to be said for that.
- To a great extent, projects like those, or like the Guianan project mentioned above, provide an opportunity for some focused coverage of a certain community and its interests. It gives a small community an opportunity to showcase things that it holds to be interesting, important and notable, even when the larger community it is part of is not so interested.
- A little bit, there's also a shield against domination by a large cultural group, too. This whole discussion/argument has hinged in part on a cultural belief that the entire Arabic world should remain culturally unified and participate in the project together. But it's also a legitimate argument to say, "I agree for 95% of all matters—but 5% of the time it's worth being able to give some attention to local subjects, too."
Conclusion
The proposed San'ani project is not competing with this project, the Arabic Wikipedia. It's not going to divert material resources from Arabic Wikipedia. It's far from being approvable. There's probably a far-better-than-50-50 chance that it will never become approvable. But if it does become approvable, then it's nearly certain that what you will see is a project that is focused on issues around Yemeni geography, history, culture, etc., with most people still relying on Arabic Wikipedia for most of their information on subjects of broader interest.