sqldf is installed like any other R package using this command from within R: install.packages("sqldf") This will download and install sqldf and all its dependencies including the RSQLite driver which includes SQLite itself so its particularly simple to install if you use sqlite. (More info later on how to use one of the other supported databases.) sqldf can use one of 4 database systems: - sqlite via RSQLite driver R package (which includes sqlite itself) - H2 via RH2 driver R package (which includes H2 itself) - PostgreSQL via the RpostgreSQL driver R package or the RpgSQL driver - MySQL via the RMySQL driver R package EXAMPLE OF INSTALLING AND USING WITH SQLITE As SQLite itself is included right in the RSQLite driver and since sqldf automatically installs RSQLite this is the easiest to install. Just install sqldf and there is nothing else to do: install.packages("sqldf") library(sqldf) # now we can use it with sqlite, e.g. sqldf("select * from iris limit 3") DF <- data.frame(a = 1:3, b = 4:6) sqldf("select * from DF") EXAMPLE OF INSTALLING AND USING WITH RH2 H2 is the next easiest to install because, like sqlite, its an embedded database that is included right in its R driver package. You only need to install sqldf, java and RH2. # install java runtime install.packages("sqldf") install.packages("RH2") library(RH2) library(sqldf) # now we can use it with H2, e.g. sqldf("select * from iris limit 3") DF <- data.frame(a = 1:3, b = 4:6) sqldf("select * from DF") EXAMPLE OF INSTALLING AND USING WITH RPOSTGRESQL First install PostgreSQL. The R driver, RPostgreSQL, has no special installation instructions: install.packages("sqldf") install.packages("RPostgreSQL") library(RPostgreSQL) library(sqldf) # now we can use it with PostgreSQL, e.g. sqldf("select * from iris limit 3") DF <- data.frame(a = 1:3, b = 4:6) # careful! Postgres translates everything not quoted to lower case sqldf('select * from "DF"') MYSQL To use MySQL, it must be installed separately and RMySQL must be built by the user making it a bit more difficult to install than others. MORE INFO There is installation info on the other databases on their CRAN web pages. See the Installation Info links on these pages: http://cran.r-project.org/package=RH2 http://cran.r-project.org/package=RPostgreSQL http://cran.r-project.org/package=RpgSQL http://cran.r-project.org/package=RMySQL RUNNING If RH2, RpgSQL or RMySQL is loaded, e.g. library(RH2), then sqldf will notice it and use that database so there is nothing to specify. The following is typically unnecessary but if you have multiple database drivers loaded and wish to explicitly specify one of them then use the drv=... argument of each sqldf call or set options(sqldf.driver=...) in R. The possible values of ... are "SQLite", "MySQL", "h2" or "pgSQL". See FAQ #7 on the sqldf home page (http://sqldf.googlecode.com) for more. PROBLEMS sqldf uses gsubfn which can optionally use the R tcltk package but can also run without that package substituting R code for it. On Windows and most other R installations, tcltk comes with the core of R and so you automatically get it. If the tcltk is package not present it will be automatically detected and it will automatically switch to R code. There have been reports of bad R builds in which R reports that it has tcltk capability even though its missing. In that case you will get an error message about tcltk being missing. To address this issue its easiest to just run this command to force it to use R code in place of tcltk: options(gsubfn.engine = "R") See FAQ #5 on the sqldf home page (http://sqldf.googlecode.com) for more. MORE INFO There is more information on sqldf on the sqldf home page: http://sqldf.googlecode.com Note the TROUBLESHOOTING, FAQs and EXAMPLES sections there. There is also information in the help files: library(help = sqldf) ?sqldf ?read.csv.sql