Gavin Greig (ggreig) wrote,

Mind your language!

I've spent most of this week at work thinking about language.

What we do is to generate personality profiles, using a psychological model developed by the company founder. These profiles are then used in a number of scenarios - mostly business oriented, but our services and materials are also used in education and in relationship counselling.

The job of the software development team is to write software that creates the profiles, and we're currently engaged in a major rewrite in order to integrate and extend a lot of independent utilities that have grown up over the years. This includes more universal support for the languages - over 20 of them at the time of writing - that some of our materials are translated into. Hence the contemplation of language.

English may be a difficult language to learn, but it's an easy one to work with. We are remarkably unfussy about structure and grammar compared to other languages. This means there are fewer helpful rules for novices to follow, but for our purposes it means you can write a piece of text once about a man, make a copy with all the "he"s changed to "she"s for a woman (and a few other easily identifiable common changes), and you're done. And by and large, this approach of storing only a male and a female version works for other languages too, though they may require more extensive modification to take account of the gender of the subject.

However, it doesn't work universally. There are languages that require modification depending on whether a man or a woman is discussing the person who's the subject of the text. There are languages where you don't just have to consider "singular or plural", but you may also have to consider "dual", which is when exactly two people or objects are being discussed. And you may have to modify your language depending on the relative social standing of the person it is addressed to. The classic example of this is Japanese, which takes a number of different forms depending on context, but you can find it even in languages as close to home as French and German, both of which contain a formal and a less formal version of "you". (Or should that be "thou"? :-)

What we're tentatively looking at as a solution is to allow text to apply to one or more "grammar contexts", each grammar context being a single combination of the possibilities sketched out above. We're going to have to be careful about how we manage it though - if we combine all the possibilities we might need, including all levels of Japanese formality and such considerations as whether we need to define neuter and mixed genders (for non-personalised text and text discussing mixed groups, respectively), then we could end up with about 4000 possible "grammar contexts" which might - or might not - apply to each individual text item. Er... let us know if you think we've missed anything, as we'd rather know now!
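As a rough illustration of how quickly the combinations multiply, here is a minimal Python sketch. The dimension names and their values are assumptions invented for this example rather than our actual model, and the total it prints is not meant to reproduce the figure of about 4000 mentioned above.

```python
from itertools import product
from typing import NamedTuple


class GrammarContext(NamedTuple):
    """One combination of grammatical possibilities (hypothetical dimensions)."""
    subject_gender: str
    speaker_gender: str
    number: str
    formality: str


# Illustrative values only; a real system would need to research each language.
DIMENSIONS = {
    "subject_gender": ["male", "female", "neuter", "mixed"],
    "speaker_gender": ["male", "female"],
    "number": ["singular", "dual", "plural"],
    "formality": ["informal", "formal"],
}


def all_contexts(dimensions=DIMENSIONS):
    """Return every possible grammar context as a list."""
    return [GrammarContext(*combo) for combo in product(*dimensions.values())]


contexts = all_contexts()
print(len(contexts))  # 4 * 2 * 3 * 2 = 48
```

Even with only four small dimensions there are 48 contexts; each extra dimension, or each extra level of formality, multiplies the total again.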

At the moment, we think the solution is sound in its flexibility and can be kept within reasonable bounds of complexity by being careful about how much of it we actually use and expose - we don't have to define all 4000 possible grammar contexts, for example - but it's a scary thought how complex the whole business could become.
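One possible way of defining only the contexts you need, sketched in Python under stated assumptions: each text variant is stored against a pattern in which a dimension may be left unspecified, and a lookup picks the most specific pattern that matches. The two dimensions, the `find_text` helper and the sample sentences are all hypothetical, and this is not necessarily the design we will end up with.

```python
from typing import Dict, Optional, Tuple

# A context is (subject_gender, formality). In a stored pattern, None means
# "applies to any value". Dimensions and sample texts are hypothetical.
Context = Tuple[str, str]
Pattern = Tuple[Optional[str], Optional[str]]


def matches(pattern: Pattern, context: Context) -> bool:
    """True if every specified part of the pattern equals the context."""
    return all(p is None or p == c for p, c in zip(pattern, context))


def specificity(pattern: Pattern) -> int:
    """Number of dimensions the pattern pins down."""
    return sum(p is not None for p in pattern)


def find_text(variants: Dict[Pattern, str], context: Context) -> Optional[str]:
    """Return the text of the most specific matching pattern, or None."""
    candidates = [p for p in variants if matches(p, context)]
    if not candidates:
        return None
    return variants[max(candidates, key=specificity)]


# English needs only two variants, whatever the formality.
variants = {
    ("male", None): "He prefers to plan ahead.",
    ("female", None): "She prefers to plan ahead.",
}

print(find_text(variants, ("female", "formal")))
```

Here a language like English stores two variants per text item, while a language that distinguishes formality could add more specific patterns without affecting the others.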

And I haven't even mentioned issues like defining the languages and scripts themselves, though thankfully there are international standards to help us out with those.
Tags: software development, thought, work
