Sometimes it is not possible, for practical or legal reasons, to move corpus data to a local machine. This vignette explains how to use CWB corpora that are hosted on an OpenCPU server.
library(polmineR)
The GermaParl corpus is hosted on an OpenCPU server with the IP 132.252.238.66 (subject to change). To use the corpus, call the corpus()-method as usual. The only difference is that you need to supply the server address (IP address or URL) using the argument server.
gparl <- corpus("GERMAPARL", server = "http://opencpu.politik.uni-due.de")
The gparl object is an object of class remote_corpus.
is(gparl)
At this stage, polmineR exposes only a limited set of its functionality for remote corpora. Simple investigations of the remote corpus are possible.
size(gparl)
s_attributes(gparl)
gparl2006 <- subset(gparl, year == "2006")
The returned object has the class remote_subcorpus.
is(gparl2006)
count(gparl, query = "Integration")
The count()-method works for remote_subcorpus objects, too.
count(gparl2006, query = "Integration")
kwic(gparl, query = "Islam", left = 15, right = 15, meta = c("speaker", "party", "date"))
The kwic()-method works for remote_subcorpus objects, too.
kwic(gparl2006, query = "Islam", left = 15, right = 15, meta = c("speaker", "party", "date"))
Create a directory for registry-style files that hold the credentials.
Create a file with the credentials for your corpus in this directory.
Note: The filename is the corpus ID in lowercase.
##
## registry entry for corpus GERMAPARLSAMPLE
##
# long descriptive name for the corpus
NAME "GermaParlSample"
# corpus ID (must be lowercase in registry!)
ID germaparlsample
# path to binary data files
HOME http://localhost:8005
# optional info file (displayed by ",info;" command in CQP)
INFO https://zenodo.org/record/3823245#.XsrU-8ZCT_Q
# corpus properties provide additional information about the corpus:
##:: user = "YOUR_USER_NAME"
##:: password = "YOUR_PASSWORD"
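The two steps (creating the directory, then writing a credentials file named after the lowercase corpus ID) can be sketched in base R. This is only an illustration: the directory location (a temporary directory here) and the variable names registry_dir, corpus_id and registry_file are assumptions, and the user name and password remain placeholders that you need to replace.

```r
# Hypothetical location for the credentials directory; any writable path works.
registry_dir <- file.path(tempdir(), "opencpu_registry")
dir.create(registry_dir, recursive = TRUE, showWarnings = FALSE)

corpus_id <- "GERMAPARLSAMPLE"

# Registry-style content, following the layout of the example file.
registry_lines <- c(
  "##",
  sprintf("## registry entry for corpus %s", corpus_id),
  "##",
  "",
  "# long descriptive name for the corpus",
  'NAME "GermaParlSample"',
  "# corpus ID (must be lowercase in registry!)",
  sprintf("ID %s", tolower(corpus_id)),
  "# path to binary data files",
  "HOME http://localhost:8005",
  '# optional info file (displayed by ",info;" command in CQP)',
  "INFO https://zenodo.org/record/3823245#.XsrU-8ZCT_Q",
  "",
  "# corpus properties provide additional information about the corpus:",
  '##:: user = "YOUR_USER_NAME"',         # placeholder, replace with your user name
  '##:: password = "YOUR_PASSWORD"'       # placeholder, replace with your password
)

# The filename is the corpus ID in lowercase.
registry_file <- file.path(registry_dir, tolower(corpus_id))
writeLines(registry_lines, con = registry_file)
```

Since the file contains a password in plain text, it is probably wise to keep the directory outside of any version-controlled or shared folder.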
Set the environment variable OPENCPU_REGISTRY in .Renviron to the directory just created.
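As a rough sketch, the entry for .Renviron can be assembled in R, and the variable can also be set for the current session only. The path used here is hypothetical, and the variable name registry_dir is an assumption for illustration. Whether polmineR picks up a value that is set after the package has been loaded is not stated here, so restarting R after editing .Renviron is the safer route.

```r
# Hypothetical path; replace it with the directory that holds your credentials files.
registry_dir <- file.path(path.expand("~"), "opencpu_registry")

# The line to add to your .Renviron file (read by R at startup).
renviron_line <- sprintf("OPENCPU_REGISTRY=%s", registry_dir)
cat(renviron_line, "\n")

# Alternative for the current session only: set the variable directly.
Sys.setenv(OPENCPU_REGISTRY = registry_dir)

# Check which value R sees.
Sys.getenv("OPENCPU_REGISTRY")
```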
Get the address of the server, then load the restricted corpus.
x <- corpus("MIGPRESS_FAZ", server = "YOURSERVER", restricted = TRUE)
Upcoming versions of polmineR will expose further functionality. This is a simple proof-of-concept!