乡音苑 Phonemica

Major bug fix


There was a bug in the code that handles uploading audio. It basically prevented users from being able to actually contribute. I’m not sure how long the bug has been in place, but it certainly explains why the number of recent contributions has been so low.

It’s fixed now, thanks to it being brought to our attention by our friend Giau Ya.


Related to that, I’ve changed how the recent stories on the home page are chosen and sorted. The list now more accurately represents recent additions, while still prioritising entries that have uploaded audio and filtering out those that don’t.
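To give a rough idea of what that kind of selection looks like, here is a minimal sketch. It is not the site's actual code: the field names (`title`, `added`, `has_audio`), the sample records and the `recent_stories` function are all assumptions made up for illustration. It simply drops entries without audio and sorts the rest newest first.

```python
from datetime import date

# Hypothetical story records; the real schema is not described in this post.
stories = [
    {"title": "Story A", "added": date(2016, 3, 1), "has_audio": True},
    {"title": "Story B", "added": date(2016, 3, 8), "has_audio": False},
    {"title": "Story C", "added": date(2016, 3, 12), "has_audio": True},
]


def recent_stories(stories, limit=5):
    """Return up to `limit` stories that have audio, newest first."""
    # Filter out entries with no uploaded audio.
    with_audio = [s for s in stories if s["has_audio"]]
    # Sort by the date the story was added, most recent first.
    newest_first = sorted(with_audio, key=lambda s: s["added"], reverse=True)
    return newest_first[:limit]


for story in recent_stories(stories):
    print(story["added"], story["title"])
```

Filtering before sorting means an entry without audio never takes up one of the limited slots on the home page.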

More audio fixes

I’ve found another couple of recordings which had corrupted audio. I’ve gone ahead and fixed the ones I’ve come across. In the case of the linked example, I only found it because I was talking to someone from Heilongjiang who wanted to know if we had any recordings from his hometown, and Loli’s was one of the first recordings we had on the site so I remembered it well.

In the meantime I’m working on streamlining the process to add recordings and speakers to the site. Hopefully more on that this coming weekend when, if all goes well, I’ll have some screenshots to show.

Help us spread the word

We’ve got a few things going on this week and next but I wanted to just make a note that we’re gonna start doing some promotional work. In the past this was basically done for us by way of news coverage. We’ve not done any interviews in a while, so the submissions that come from that kind of publicity haven’t been coming in.

But we’re still hoping to hear from you.

Help us share your stories Phonemica

We’re not doing many interviews, mostly because we’re just not around Beijing often enough to co-ordinate the kind of filmed interviews most outlets have requested. We’re still happy to do phone interviews and the like, but we just don’t have the time we used to have in Beijing.

However, what we are doing is more grassroots publicity like we’ve done in the past. That means getting people in different parts of the country who can help encourage their friends to submit stories.

Want to help?

Do you know anyone who might be interested in sharing their story? We’d love your help.

Talk to your friends or family who you think might have interesting stories to tell and see if they’d be willing to record one. We have around 800 entries on the site right now, but there are a lot of dialects we haven’t heard, and there are countless stories that are waiting to be told.

Everyone’s got a story. Tell us yours.


Recently I’ve been getting back into digital cartography. I’ve been helping to develop resources for minority language communities and trying to piece together history based on not only oral histories but linguistic information as well. To do this we need to map a bunch of historical data which can be found across about a dozen different old paper maps, but which doesn’t seem to exist anywhere in a single location.

With that I’ve also been helping to get others involved in projects like OpenStreetMap, as Google Maps (while otherwise great) doesn’t have much (anything) about a lot of the locations of towns and villages throughout rural China where people are coming from.

All that to say that I’ve been getting back into the mapping aspect of things. At one point, we were using this map:

That was the appearance of the map used for this site, and one that I was really pretty happy with. The problem with it and the reason it got taken down basically came down to bandwidth. On top of that, getting the kind of resolution needed for the level of detail we wanted just wasn’t feasible. Or rather, it was but barely, and we felt things were going to improve so we put it on the back burner and focused on other things.

The hope was that MapBox was going to continue development of TileMill, which is what I’d used to make it, and that with that continued development we’d have more options and more flexibility. That didn’t happen. Instead, TileMill was discontinued and MapBox pushed MapBox Studio, which is nice, but doesn’t let you host your own data. Screw that. We don’t have the money to pay their fees to do what we’d want to do.

While we were waiting for things to happen that never happened, we lost track of the map and kinda just left it at the OpenStreetMap default tileset, which I hate. It’s super ugly and not easy to see features in a lot of places.

However, I’ve been thinking, since I’m already spending the time working on custom tilesets for other projects, it wouldn’t really be much trouble to kick one back out for Phonemica. Maybe even getting back to that particular design.

I’m not saying it’s going to happen. But it might. I’d like it to. So now it’s just seeing if I can get the time together to get it done with everything else that’s going on around here.

Man I miss that map.

General Update

Hello. It’s been a couple weeks since I’ve posted, even though I said I’d post every week. I haven’t posted because there hasn’t been much work done. Other work, unrelated to Phonemica, has come up and taken over much of my otherwise-free time.


I’ve been doing a few small things to improve the site and make sure files were working as needed, but not enough that I felt it needed a blog post. If every week’s post is “I fixed some things”, then that’s not really useful.

So instead what I’m going to do is post less frequently, but still try to make it a regular thing.


I wanted to take a minute and point out some great work on transcripts that’s been done recently. This is one of those things that tends to not be done. It takes time to transcribe and a lot of people would rather just upload and be done. However we have a few people who have recently been going around and writing in transcriptions for the varieties they know. And that’s awesome. We need more of that. So thank you to those people.


We’re also working on getting some more recordings up, this time of language varieties we don’t yet have on the site. Looking forward to that, and hopefully we can get it done soon. In this case it’s us actually going out and getting the recordings rather than them being user-submitted, so it’s all a matter of getting things scheduled.

That’s about it for this update. As usual, feel free to let us know if there are things that need to be done, recordings that need to be fixed, or UI translations that need to be corrected.

Until next time.

Weekly Progress Report #4

It’s been a busy couple of weeks, with a lot going on that wasn’t related to Phonemica and ended up just taking some time away from planned updates.

Still made some progress with the mobile app, but it looks like getting a test version out is gonna have to wait a couple weeks more than I’d expected.


The following errors in the Traditional Chinese interface have been corrected:

1. 浏览→瀏覽
2. 博客→網誌
3. 志愿者→志工
4. 搜索→搜尋

Similar issues were fixed in the Simplified Chinese version. In a number of cases (like 浏览) the two were just swapped, so the Traditional interface showed the Simplified form and vice versa.

Thanks to Lim Hian-tong for pointing out the mistakes.

No progress report this week

Other things have come up, so there isn’t an update this week. The reports will continue next week.

Weekly Progress Report #3

Pretty straightforward today. Only a couple small fixes that I found needing attention. Then otherwise I’m continuing work on the Android app.


UI Localisation

The user menu was missing some labels if you were using the site in Chinese. Instead they just showed up as a question mark. That’s been fixed now.

The Korean version of the UI was also missing these items. I’ve put in labels for that, but my Korean is pretty bad so they are probably not the right labels. I’m open to corrections.

In Progress

Mobile app

Still working on the app. I think I’m going to push back my prediction for when I’ll have a basic version up, only because there are a couple issues that still need to be worked out.

Later this week I’m going to give Syz a copy of the app as it is now so he can mess around a bit. We talked yesterday about how to present transcripts. That and the general speaker page is what I’m working on today.

Weekly Progress Report #2

It’s still early on Sunday and I’ll keep working on things as the day progresses, but there’s already been some good progress, so I’m writing this up now in order to get back to work.


More of the same. Some small changes to correct typos, and still looking for corrupted audio that needs to be replaced.

In Progress

Mobile app development

This is the biggest thing that’s been getting attention this week. I’ve spent some considerable time working on the Android app. Right now we’ve got a working app that lets you see what’s new, browse stories on the site and listen to audio. It’s still pretty basic but we’re making progress. Since I only have weekends to work on this, I’m planning to work this out in stages, releasing progressively more complex versions of the app as the weeks go on. The rough plan is to first release a basic app that lets you keep up with the site but involves less interaction. I’m not worrying about transcribing at this point. Honestly, who’s going to want to do that on their phone anyway? Then, a few weeks later, I’ll put out an update that offers better functionality.

The end goal is of course to be able to upload new stories directly from your phone. There are a couple technological roadblocks though, since we want to make the Android and iOS versions as close to identical as possible.

There’s also the question of whether we care about transcribing on the phone. I’m pretty sure we don’t. So few people fully transcribe their stories anyway, so it’s probably reasonable to assume they’d prefer a proper keyboard and bigger screen. But I could be wrong. We’ll have to see.

Today the primary goal is this. First, to finish getting all the “pages” laid out and working in the app on the phone. Then getting all of these views correctly populated from the API. Then related to these two, making sure all the navigation is in place.

There won’t be time to get all the mapping stuff done today, but that’s fine because it’s too soon for that anyway.

Still, it’s close to being ready for that, so probably next week that can be in.

Site & Database maintenance

I’ve been doing a little bit of cleanup with this. There are a few things I’d like to organise better in the database, though I’m still deciding how I want them to look in the end. As an example, in some cases the speaker’s sex is written “male” and in some cases it’s “男”, and that really needs to be unified since not all the recordings are Chinese. As a short-term fix, the API we’ve set up for mobile automatically cleans these things up, but that’s not the right way to solve it long term. The same issue needs to be fixed for the metadata, some of which is in simplified characters (簡體字) and some in traditional (繁體字). That needs to be made consistent.
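For the curious, the kind of cleanup the API does on the fly might look something like the sketch below. This is only an illustration, not the actual API code: the `SEX_ALIASES` table, the `normalise_sex` function and the set of variants it recognises are assumptions, since I haven’t listed here every spelling that appears in the database.

```python
# Hypothetical table of spellings mapped to one canonical value.
# The variants actually present in the database may differ.
SEX_ALIASES = {
    "male": "male",
    "m": "male",
    "男": "male",
    "female": "female",
    "f": "female",
    "女": "female",
}


def normalise_sex(raw):
    """Return 'male' or 'female' for a known spelling, otherwise None."""
    if raw is None:
        return None
    # Trim whitespace and ignore case; Chinese characters are unaffected.
    return SEX_ALIASES.get(raw.strip().lower())


for value in ["male", "男", " Female ", "女", "unknown"]:
    print(repr(value), "->", normalise_sex(value))
```

Returning `None` for anything unrecognised, rather than guessing, makes it easy to list the records that still need a manual look. The long-term fix is still to run something like this once against the database itself rather than on every API request.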

To do

Expanding the available languages

In the past few months we’ve gathered a number of recordings in Tibeto-Burman varieties that we think people would be interested in hearing. We need to make some small changes to the site to accommodate that, but nothing major. Just a matter of making time to do it. It’s lower priority right now than the mobile app or the database cleanup, so it might have to wait til next week.

Introducing Weekly Progress Reports

I’ve just returned from 6 months of fieldwork. During that time I didn’t have internet access, so I wasn’t able to do anything to take care of the site.

I’m back now, and starting today I’m instituting a weekly development day. From here on, Sundays will be a dedicated day to work on developing the site, building a mobile app, and fixing the issues that need attention.

Each week I will write a blog post like this one. In that post I’ll be detailing the following:

  1. What has been worked on this past week
  2. What we hope to work on in the coming week
  3. What the longer term goals for the coming weeks will be

This blog will also be a place you can leave comments if you think there’s something that should be brought to our attention. If you see a recording that is broken or has some other issues, you can comment here to let us know.

I want to make sure things are very transparent so anyone who’s interested can know what’s going on with the site. I know things have been very quiet around here lately. I assure you we haven’t gone away. We just got much busier with other things than we’d planned.

Weekly progress reports are the first step to change that.

13 March 2016



  1. Entries with missing waveform images now have the images
  2. An administrative tool to manually create these stopped working a couple months ago because of a change in the server-side software that we use. That’s been fixed now.
  3. Waveforms were not being generated automatically when new stories were added. This is now also corrected, so any story that gets added as of today will no longer be missing the waveform.

Broken audio

  1. A number of recordings had broken audio. This was the result of changing servers about 16 months ago. Some files were corrupted or incompletely copied. We have backups of all of these. The problem is just finding them. It takes a lot of time to go in and check them out one by one, but that’s what needs to happen, and that’s what I’ve been doing today. A number of these have now been fixed, and we’ll keep our eyes open for others.
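Part of that checking could in principle be automated. The sketch below is one possible approach, not something that’s running on the site: it assumes the backups sit in a single folder and use the same file names as the live copies, which may not be true for us. The folder paths and the `find_mismatches` function are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(live_dir, backup_dir):
    """List backup file names whose live copy is missing or differs."""
    live_dir, backup_dir = Path(live_dir), Path(backup_dir)
    mismatched = []
    for backup in sorted(backup_dir.iterdir()):
        if not backup.is_file():
            continue
        live = live_dir / backup.name
        # A truncated or corrupted copy will not hash to the same value.
        if not live.is_file() or file_digest(live) != file_digest(backup):
            mismatched.append(backup.name)
    return mismatched
```

Calling `find_mismatches("audio/", "backup/audio/")` (both paths made up here) would return the names of files worth restoring from backup. It can’t tell whether a backup is itself good, so a listen-through is still needed for anything it flags.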

In Progress:

Mobile app

Some time ago we had a basic Android app for the site. It didn’t help you upload stories, but it at least let you browse the site and listen to recordings. It’s something we need to have again.

We’re in the early stages of making that happen. I’ve spent some time today working on that, and we’re off to a good start. We should have something in a couple weeks, I think. It’s going to take some time to get all the features in place, but at least for a basic app that will let you browse stories and eventually upload new ones, I think we’re on schedule to have something preliminary ready in 2-3 weeks. However there are a few barriers we need to get past first, so I don’t want to commit to that schedule just yet.

Either way, next week I’ll be able to provide a clearer estimate on that.