Boy eating xiao long bao
Photo Credit: VCG

Preserving Chinese Dialects One Wikipedia Entry at a Time

The amateurs trying to preserve Chinese dialects via Wikipedia

The Wikipedia rabbit hole is full of magic portals, and some might take you straight to specific regions of China.

For example, next time you check out “Mid-Autumn Festival,” “Crimean Tatars,” or “Alpha Wave” on the online encyclopedia, look for a small line of text reading “吴语 (Wu languages)” in the bottom left corner of your browser among the list of other languages the entry is available in. Click on it and you will be transported to Jiangnan (江南, south of the Yangtze), a region that includes southern Jiangsu province, Shanghai, and much of Zhejiang province, where many varieties of Wu, a Chinese dialect or fangyan, are spoken with soft and flowing tones.

At a cursory glance, the Wu Wikipedia might look like the regular Mandarin Chinese version, as most characters are shared between the two varieties of Chinese. Sharp-eyed readers, though, quickly pick up the differences. “No” is no longer 不 (bù) but 弗 ([vəʔ] or [fəʔ], depending on the speaker), while 个 ([geʔ], [keʔ], or [ɦeʔ]) takes the place of 的 (de) as the structural particle that connects modifiers and nouns.

The sentences and grammar also tend to take on a less formal air than standard Mandarin. In addition to vocabulary differences and grammatical quirks, the content is sometimes edited to show a regional flair. On the entry for “hot pot,” editors have highlighted the local relevance of this famous cuisine: “Wu-speaking people particularly like this type [of food] (吴越人交关欢喜搿种).” The word 交关 ([tɕjɔ.kwɛ]) is Wu for “very”; 欢喜 ([hwø.ɕi]), inverted from the Mandarin 喜欢, means “to like”; and 搿种 ([geʔ.tsoŋ]) indicates “this type.”

The Wu language Wikipedia page

People from Wu-speaking regions might chuckle in self-recognition when they stumble upon these pages. In 1986, the State Language Commission set a national agenda to popularize Mandarin, and local dialects in many places gradually fell out of favor, especially in the public sphere. In 1992, for instance, Shanghai’s government banned TV programs in Shanghai dialect (Shanghaihua) and told students not to use fangyan in school.

But now, dialect revival efforts are increasingly visible: at the Shanghai Language Institute’s annual convention in 2011, for example, 82 linguists issued a joint proposal for schools to encourage students to speak Shanghai dialect at school and have dialect announcements on TV, radio, and public buses.

However, for those who grew up in the 90s and early aughts, dialects could still feel informal, private, and sidelined. Seeing the dialect on Wikipedia pages, transmitting information to a global community, can be empowering or even feel subversive for Wu speakers.

The home page of the Wu Wikipedia has the face of a relatively active project: There are 42,678 entries at the time of writing. The page consists of a section highlighting new entries, one introducing the Wu language, a list of trivia whose answers can be found in Wu Wikipedia entries, and a news section, last updated in August 2021, with the latest headline on the Taliban’s occupation of Kabul. On a page called “community hall (社区门堂),” individual editors discuss—and sometimes argue over—the nitty-gritty of linguistics, protocols for deleting posts, and setting up community rules and administrator applications.

“You might think we are an organized body, but we are rather unstructured,” Ignatius Yoe, an editor on the Wu Wikipedia, tells TWOC over the phone. “Some people founded the page, some other people started writing, and others followed.”

Yoe, a 29-year-old Shanghainese engineer, got involved in 2016, when he saw that there were only a little more than 3,000 entries on the Wu Wikipedia and a lot of blanks to fill. “I usually edit entries when I’m in the mood...I’ve edited at least 100 entries, mostly translating from Mandarin pages.” Yoe has penned the entry on Shanghai’s Xujiahui business district (Zikawei in Shanghaihua), among others.

Editors of other Chinese dialect Wikipedia communities are similarly humble about their participation. The Cantonese Wikipedia, a larger resource with 120,888 entries at the time of writing, has a page called “Embassy,” where volunteer editors are listed as “Ambassadors” and offer help to any kindred spirits in languages ranging from English to Vietnamese to Classical Chinese.

Among them is Karl Ho, a 24-year-old Guangzhou native who chats with TWOC via Weibo. Ho first started editing 10 years ago, “but I’m not very active,” he says. “I only fix bugs and errors when I come across them.” Born and raised in Guangzhou, Ho is surrounded by lively Cantonese in everyday life, but notes that it is rarely used outside of informal spoken contexts. “Wikipedia is the first place I saw it in written format. It was quite exciting.”

Fujian province, sandwiched between the Cantonese-speaking Guangdong province and Wu-speaking Zhejiang, is home to many dialects. One of them is Eastern Min, represented by the city of Fuzhou and its vicinity. Ye Jianfei, a software engineer from Fuzhou, didn’t think too much when he joined the Eastern Min Wikipedia project. “I just wanted to help preserve some Fuzhou culture. After all, people who know how to speak the dialect are now few in number,” Ye writes on Zhihu, an online question-and-answer platform.

Other than helping with the technical aspect of the project, Ye also had fun setting up pages unique to Fuzhou, such as one on 𥻵𥻵 (sì), a dessert made of glutinous rice. He even added his own recordings of a rhyme in Eastern Min about the dish, and translated the entry into Mandarin and English. In total, there are eight versions of Wikipedia in Chinese: Mandarin, Cantonese, Eastern Min, Wu, Gan, Hakka, Southern Min (Hokkien), and Classical Chinese.

The Eastern Min dialect Wikipedia entry on Sì written by Ye Jianfei

There are also people who are more invested in keeping the community alive. Yoe has heard of someone in the Wu Wikipedia community who has kept up with writing a few sentences every day. The Cantonese group, on the other hand, sometimes hosts member gatherings. “But I’ve never been to any,” says Ho. “I’m a bit shy.”

Keeping up a Wikipedia in a dialect is no easy job. “It’s a dry task sometimes. An entry usually takes me an hour to translate. Some editors start as college students, but drop out gradually when they graduate and become busy,” says Yoe. Wu Wikipedia as a whole, in Yoe’s opinion, is still quite rough: There are many obvious mistakes, as there is no unified standard on which characters to use for certain words in Wu, while some pages simply copy the Mandarin entries without translation. There are also simply too few entries for Wu Wikipedia to be a complete, useful network of knowledge.

Some dialect Wikipedia entries are more detailed than others. This one in Wu dialect has one sentence only

Existing entries vary in quality—while “Alpha Wave” is relatively fleshed out, “Tuk Tuk” consists of one photo, and one short line that ends in a comma. Dialect pages are also inconsistent in their referencing. Take “Taliban” as an example: While the English page lists 571 citations, and the Mandarin Chinese page 112, there are only 32 references in Cantonese. The entry has just three lines of text and no citations in Wu, and one line of text in Eastern Min.

A lack of standardization is a problem across all dialect versions of Wikipedia. The famously tough variety of Wu spoken in Wenzhou can be unintelligible to speakers in Suzhou, and there are countless shades of differences in between. According to Yoe, most Wu Wikipedia entries adopt the Shanghai variety as standard, although sometimes editors also write in the Jiangyin, Wuxi, Suzhou, and Taizhou styles. Sometimes a page can be a mixed bag of variations.

“Wu Wikipedia is like a mascot [for Wu speakers online]. Most people keep it up just for fun. It is one additional way to represent the Wu language,” says Yoe. He feels that true, substantive efforts to keep dialects alive lie beyond Wikipedia. He names a few groups making more concerted efforts to preserve the language, such as the Wu Language School (吴语学堂), a volunteer group operating a WeChat account that regularly shares educational articles and an online dictionary that includes various branches of Wu.

Yoe himself teaches the Shanghai dialect online as a passion project. According to him, Shanghai locals born in the 1980s still speak the dialect quite a lot in their daily communication, although with many pronunciation mistakes, while those born between 1990 and 1995 are less willing to speak it among their peers. He finds that Gen Z are typically able to understand the dialect when spoken to, but cannot speak it themselves. In 2012, a survey of students in seven elementary and middle schools by the Shanghai Academy of Social Sciences found that only about 60 percent of local students can fully understand and speak basic Shanghai dialect.

“Shanghaihua is still very alive, but it would be a shame to see it go into decline. The dialect is a distinguishing characteristic of Shanghai,” says Yoe. The students he now teaches are mostly people who have moved to Shanghai from elsewhere and are seeking to integrate more into this city. Yoe’s work means they at least have one more resource to help them in that endeavor.


author Siyi Chu (褚司怡)

Siyi is the former Culture Editor at The World of Chinese. She writes about arts, culture, and society, and is ever-curious about the minds, hearts, and souls inside all of these spheres. She is now a freelance writer with additional work experience in independent filmmaking and the field of education.
