lingtypology: Glottolog functions
This package is based on the Glottolog database (v. 4.4), so lingtypology has several functions for accessing data from that database.
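Most of the functions in lingtypology have the same syntax: what you need.what you have. For example, iso.lang() returns an ISO code for a language name, and lang.iso() returns a language name for an ISO code. Most of them take a language name as input.
Some of them, such as lang.aff(), help to define a vector of languages.
Additionally, there are some functions to convert glottocodes to ISO 639-3 codes and vice versa.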
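As a hedged sketch of such a conversion, the snippet below assumes the helpers are called gltc.iso() and iso.gltc(), following the what you need.what you have pattern. These names are not given in the text above, so check the package index before relying on them.

```r
library(lingtypology)

# Assumed helper names, following the need.have naming pattern:
# gltc.iso() maps an ISO 639-3 code to a glottocode,
# iso.gltc() maps a glottocode back to an ISO 639-3 code.
glottocode <- gltc.iso("ady")
iso_code <- iso.gltc(glottocode)

print(glottocode)
print(iso_code)
```

If the names are right, the second call should return the ISO code that the first call started from.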
The most important functionality of lingtypology is the ability to create interactive maps based on features and sets of languages (see the third section).
The Glottolog database (v. 4.1) provides lingtypology with language names, ISO codes, glottocodes, affiliation, macro area, coordinates, and much other information. This set of functions is not meant to cover every possible combination of what you need and what you have. Check out the additional information that is preserved in the version of the Glottolog database used in lingtypology:
names(glottolog)
## [1] "glottocode" "language" "iso"
## [4] "level" "area" "latitude"
## [7] "longitude" "countries" "affiliation"
## [10] "subclassification"
Using R functions for data manipulation, you can build your own database from this table for your own purposes.
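A minimal sketch of such a custom table, assuming only the column names printed by names(glottolog) above; the object name circassian and the choice of columns are illustrative, not part of the package:

```r
library(lingtypology)

# Keep only the rows whose affiliation mentions "Circassian"
# and only the columns needed for a small custom table.
is_circassian <- grepl("Circassian", glottolog$affiliation, fixed = TRUE)
circassian <- glottolog[is_circassian, c("language", "iso", "glottocode", "area")]

head(circassian)
```

The same subsetting pattern works with any other column, for example filtering on area or level.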
All functions introduced in the previous section are regular functions, so they can take a single string, a vector of strings, a variable that holds such a vector, or the output of another function as input:
iso.lang("Adyghe")
## Adyghe
## "ady"
lang.iso("ady")
## ady
## "Adyghe"
lang.aff("West Caucasian")
## character(0)
I would like to point out that you can create strings in R using single or double quotes. Since an unescaped single quote inside a string created with single quotes causes an error in R, I use double quotes in this tutorial. You can use single quotes, but be careful and remember that 'Ma'ya' is not a valid string in R.
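The quoting rule can be checked in plain R, with no package needed. The variable names a and b are only for illustration:

```r
# A double-quoted string can contain an apostrophe directly.
a <- "Ma'ya"

# Inside single quotes the apostrophe has to be escaped with a backslash.
b <- 'Ma\'ya'

# Both spellings produce the same five-character string.
identical(a, b)
nchar(a)
```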
area.lang(c("Adyghe", "Aduge"))
## Adyghe Aduge
## "Eurasia" "Africa"
lang <- c("Adyghe", "Russian")
aff.lang(lang)
## Adyghe
## "Abkhaz-Adyge, Circassian"
## Russian
## "Indo-European, Classical Indo-European, Balto-Slavic, Slavic, East Slavic"
iso.lang(lang.aff("Circassian"))
## Adyghe Kabardian
## "ady" "kbd"
If you are new to R, it is worth mentioning that you can create a table with languages, features, and other parameters in any spreadsheet software you are used to working with. You can then import the resulting file into R using standard tools.
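One possible import route is a CSV export read with base R's read.csv(). In the sketch below the file, its column names, and the values value_a and value_b are made-up placeholders; the file is written to a temporary location only so that the example runs on its own.

```r
# Hypothetical file: in practice this would be the CSV exported from your
# spreadsheet software; here it is written to a temporary location so that
# the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("language,feature",
             "Adyghe,value_a",
             "Russian,value_b"),
           csv_path)

# Import the table; keep strings as character vectors.
my_data <- read.csv(csv_path, stringsAsFactors = FALSE)
str(my_data)
```

The language column of the imported table can then be passed to functions such as aff.lang().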
All functions that take a vector of languages come with a kind of spell checker. If a language from a query is absent from the database, the function returns a warning message containing a set of candidates with the minimal Levenshtein distance to the queried language.
aff.lang("Adyge")
## Warning: Language Adyge is absent in our version of the Glottolog database. Did
## you mean Adyghe, Aduge, Abkhaz-Adyge?
## Adyge
## NA
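To give an idea of how candidates at the minimal Levenshtein distance can be found, here is a small sketch using base R's adist(). It is not the package's internal code; the candidate list is a hand-picked assumption standing in for the full database.

```r
# Illustration only: this is not the package's internal code.
query <- "Adyge"
candidates <- c("Adyghe", "Aduge", "Abkhaz-Adyge", "Russian")

# utils::adist() computes the generalized Levenshtein (edit) distance.
distances <- as.vector(adist(query, candidates))
names(distances) <- candidates

# Keep the candidates at the minimal distance from the query.
closest <- candidates[distances == min(distances)]
print(distances)
print(closest)
```

With this short list, only Adyghe and Aduge are at the minimal distance. The package's own warning also suggests Abkhaz-Adyge, so its search presumably works somewhat differently from this sketch.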
subc.lang() function
The subc.lang() function returns language subclassification in the Newick tree format.
subc.lang("Lechitic")
## Lechitic
## "((Kashubian_Proper:1,Slovincian:1)kash1274:1,Polabian:1,(Old_Polish:1)poli1260:1)lech1241:1;"
This format is hard to interpret by itself, but there are some tools in R that make it possible to visualise those subclassifications:
library(ape)
plot(read.tree(text = subc.lang("Lechitic")))
It is possible to specify colors of tips in case you want to emphasize some nodes:
plot(read.tree(text = subc.lang("Lechitic")),
     tip.color = c("red", "black", "black", "black"))
As you can see, the tips are counted from bottom to top.
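Instead of counting positions by eye, the color vector can be built from the tip labels themselves. This sketch reuses the Newick string from the subc.lang("Lechitic") output so that it only needs ape; the choice of Polabian as the highlighted tip and the names newick, tree, and tip_colors are illustrative.

```r
library(ape)

# Newick string copied from the subc.lang("Lechitic") output above.
newick <- "((Kashubian_Proper:1,Slovincian:1)kash1274:1,Polabian:1,(Old_Polish:1)poli1260:1)lech1241:1;"
tree <- read.tree(text = newick)

# tip.color is matched to the tips in this order.
print(tree$tip.label)

# Build the color vector from the labels instead of counting positions.
tip_colors <- ifelse(tree$tip.label == "Polabian", "red", "black")
plot(tree, tip.color = tip_colors)
```

Selecting tips by name keeps the highlighting correct even if the order of the tips changes.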
For more sophisticated tree visualization you can look into the ggtree package. There are several linguistic packages that provide some functionality for creating Glottolog trees:
glottoTrees package by Erich Round
lingtypr package by Laura Becker