Package 'eatGADS' reference manual

Title:	Data Management of Large Hierarchical Data
Description:	Import 'SPSS' data, handle and change 'SPSS' meta data, store and access large hierarchical data in 'SQLite' data bases.
Authors:	Benjamin Becker [aut, cre], Karoline Sachse [ctb], Johanna Busse [ctb]
Maintainer:	Benjamin Becker <[email protected]>
License:	GPL (>= 2)
Version:	1.1.1.9000
Built:	2025-03-27 09:34:09 UTC
Source:	https://github.com/beckerbenj/eatgads

Apply Meta Data Changes.

Description

Function to apply meta data changes to a GADSdat object specified by a change table extracted by getChangeMeta.

Usage

applyChangeMeta(changeTable, GADSdat, ...)

## S3 method for class 'varChanges'
applyChangeMeta(changeTable, GADSdat, checkVarNames = TRUE, ...)

## S3 method for class 'valChanges'
applyChangeMeta(
  changeTable,
  GADSdat,
  existingMeta = c("stop", "value", "value_new", "drop", "ignore"),
  ...
)
applyChangeMeta(changeTable, GADSdat, ...)

## S3 method for class 'varChanges'
applyChangeMeta(changeTable, GADSdat, checkVarNames = TRUE, ...)

## S3 method for class 'valChanges'
applyChangeMeta(
  changeTable,
  GADSdat,
  existingMeta = c("stop", "value", "value_new", "drop", "ignore"),
  ...
)

Arguments

`changeTable`	Change table as provided by `getChangeMeta`.
`GADSdat`	`GADSdat` object imported via `eatGADS`.
`...`	further arguments passed to or from other methods.
`checkVarNames`	Logical. Should new variable names be checked by `checkVarNames`?
`existingMeta`	If values are recoded, which meta data should be used (see details)?

Details

Values for which the change columns contain NA remain unchanged. If changes are performed on value levels, recoding into existing values can occur. In these cases, existingMeta determines how the resulting meta data conflicts are handled, either raising an error if any occur ("stop"), keeping the original meta data for the value ("value"), using the meta data in the changeTable and, if incomplete, from the recoded value ("value_new"), or leaving the respective meta data untouched ("ignore").

Furthermore, one might recode multiple old values in the same new value. This is currently only possible with existingMeta = "drop", which drops all related meta data on value level, or existingMeta = "ignore", which leaves all related meta data on value level untouched.

Value

Returns the modified GADSdat object.

Examples

# Change a variable name and label
varChangeTable <- getChangeMeta(pisa, level = "variable")
varChangeTable[1, c("varName_new", "varLabel_new")] <- c("IDstud", "Person ID")

pisa2 <- applyChangeMeta(varChangeTable, GADSdat = pisa)
# Change a variable name and label
varChangeTable <- getChangeMeta(pisa, level = "variable")
varChangeTable[1, c("varName_new", "varLabel_new")] <- c("IDstud", "Person ID")

pisa2 <- applyChangeMeta(varChangeTable, GADSdat = pisa)

Recode via lookup table.

Description

Recode one or multiple variables based on a lookup table created via createLookup (and potentially formatted by collapseColumns).

Usage

applyLookup(GADSdat, lookup, suffix = NULL)
applyLookup(GADSdat, lookup, suffix = NULL)

Arguments

`GADSdat`	A `GADSdat` object.
`lookup`	Lookup table created by `createLookup` and - if necessary - collapsed by `collapseColumns`. Column names must be `c("variable", "value", "value_new")`.
`suffix`	Suffix to add to the existing variable names. If `NULL`, the old variables will be overwritten.

Details

If there are missing values in the column value_new, NAs are inserted as new values and a warning is issued.

The complete work flow when using a lookup table to recode multiple variables in a GADSdat could be: (0) optional: Recode empty strings to NA (necessary, if the look up table is written to excel). (1) create a lookup table with createLookup. (2) Save the lookup table to .xlsx with write_xlsx from eatAnalysis. (3) fill out the lookup table via Excel. (4) Import the lookup table back to R via read_excel from readxl. (5) Apply the final lookup table with applyLookup.

See applyLookup_expandVar for recoding a single variable into multiple variables.

Value

Returns a recoded GADSdat.

Examples

## create an example GADSdat
iris2 <- iris
iris2$Species <- as.character(iris2$Species)
gads <- import_DF(iris2)

## create Lookup
lu <- createLookup(gads, recodeVars = "Species")
lu$value_new <- c("plant 1", "plant 2", "plant 3")

## apply lookup table
gads2 <- applyLookup(gads, lookup = lu, suffix = "_r")

## only recode some values
lu2 <- createLookup(gads, recodeVars = "Species")
lu2$value_new <- c("plant 1", "plant 2", NA)
gads3 <- applyLookup(gads, lookup = lu2, suffix = "_r")

## create an example GADSdat
iris2 <- iris
iris2$Species <- as.character(iris2$Species)
gads <- import_DF(iris2)

## create Lookup
lu <- createLookup(gads, recodeVars = "Species")
lu$value_new <- c("plant 1", "plant 2", "plant 3")

## apply lookup table
gads2 <- applyLookup(gads, lookup = lu, suffix = "_r")

## only recode some values
lu2 <- createLookup(gads, recodeVars = "Species")
lu2$value_new <- c("plant 1", "plant 2", NA)
gads3 <- applyLookup(gads, lookup = lu2, suffix = "_r")

Recode via lookup table into multiple variables.

Description

Recode one or multiple variables based on a lookup table created via createLookup. In contrast to applyLookup, this function allows the creation of multiple resulting variables from a single input variable. All variables in lookup except variable and value are treated as recode columns.

Usage

applyLookup_expandVar(GADSdat, lookup)
applyLookup_expandVar(GADSdat, lookup)

Arguments

`GADSdat`	A `GADSdat` object.
`lookup`	Lookup table created by `createLookup`.

Details

If a variable contains information that should be split into multiple variables via manual recoding, applyLookup_expandVar can be used. If there are missing values in any recode column, NAs are inserted as new values. A warning is issued only for the first column.

The complete work flow when using a lookup table to expand variables in a GADSdat based on manual recoding could be: (1) create a lookup table with createLookup. (2) Save the lookup table to .xlsx with write_xlsx from eatAnalysis. (3) fill out the lookup table via Excel. (4) Import the lookup table back to R via read_excel from readxl. (5) Apply the final lookup table with applyLookup_expandVar.

See applyLookup for simply recoding variables in a GADSdat.

Value

Returns a recoded GADSdat.

Examples

## create an example GADSdat
example_df <- data.frame(ID = 1:6,
                        citizenship = c("germ", "engl", "germ, usa", "china",
                                        "austral, morocco", "nothin"),
                        stringsAsFactors = FALSE)
gads <- import_DF(example_df)

## create Lookup
lu <- createLookup(gads, recodeVars = "citizenship", addCol = c("cit_1", "cit_2"))
lu$cit_1 <- c("German", "English", "German", "Chinese", "Australian", NA)
lu$cit_2 <- c(NA, NA, "USA", NA, "Morocco", NA)

## apply lookup table
gads2 <- applyLookup_expandVar(gads, lookup = lu)

## create an example GADSdat
example_df <- data.frame(ID = 1:6,
                        citizenship = c("germ", "engl", "germ, usa", "china",
                                        "austral, morocco", "nothin"),
                        stringsAsFactors = FALSE)
gads <- import_DF(example_df)

## create Lookup
lu <- createLookup(gads, recodeVars = "citizenship", addCol = c("cit_1", "cit_2"))
lu$cit_1 <- c("German", "English", "German", "Chinese", "Australian", NA)
lu$cit_2 <- c(NA, NA, "USA", NA, "Morocco", NA)

## apply lookup table
gads2 <- applyLookup_expandVar(gads, lookup = lu)

Apply recodes according to a numerical check data.frame.

Description

Applies recodes as specified by a numCheck data.frame, as created by createNumCheck.

Usage

applyNumCheck(GADSdat, numCheck)
applyNumCheck(GADSdat, numCheck)

Arguments

`GADSdat`	A `GADSdat` object.
`numCheck`	A `data.frame` as created by `createNumCheck`.

Details

This function is currently under development.

Value

A recoded GADSdat.

Examples

# tbd

# tbd

Assimilate value labels.

Description

Assimilate all value labels of multiple variables as part of a GADSdat or all_GADSdat object.

Usage

assimilateValLabels(GADSdat, varNames, lookup = NULL)
assimilateValLabels(GADSdat, varNames, lookup = NULL)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`varNames`	Character string of a variable name.
`lookup`	Look up `data.frame`.

Details

Assimilation can be performed using all existing value labels or a look up table containing at least all existing value labels. Missing codes are reused based on the meta data of the first variable in varNames.

Value

Returns the GADSdat object with changed meta data and recoded values.

Examples

# Example data set
facs_df <- data.frame(id = 1:3, fac1 = c("Eng", "Aus", "Ger"),
                      fac2 = c("Ger", "Franz", "Ita"),
                      fac3 = c("Kor", "Chi", "Alg"),
                      stringsAsFactors = TRUE)
facs_gads <- import_DF(facs_df)

assimilateValLabels(facs_gads, varNames = paste0("fac", 1:3))

# Example data set
facs_df <- data.frame(id = 1:3, fac1 = c("Eng", "Aus", "Ger"),
                      fac2 = c("Ger", "Franz", "Ita"),
                      fac3 = c("Kor", "Chi", "Alg"),
                      stringsAsFactors = TRUE)
facs_gads <- import_DF(facs_df)

assimilateValLabels(facs_gads, varNames = paste0("fac", 1:3))

Auto recode a variable in a `GADSdat`.

Description

Auto recode a variable in a GADSdat. A look up table is created containing the respective recode pairs. An existing look up table can be utilized via template. This function somewhat mirrors the functionality provided by the SPSS function autorecode.

Usage

autoRecode(
  GADSdat,
  var,
  var_suffix = "",
  label_suffix = "",
  csv_path = NULL,
  template = NULL
)
autoRecode(
  GADSdat,
  var,
  var_suffix = "",
  label_suffix = "",
  csv_path = NULL,
  template = NULL
)

Arguments

`GADSdat`	A `GADSdat` object.
`var`	Character string of the variable name which should be recoded.
`var_suffix`	Variable suffix for the newly created `GADSdat`. If an empty character, the existing variables are overwritten.
`label_suffix`	Suffix added to variable label for the newly created variable in the `GADSdat`.
`csv_path`	Path for the `.csv` file for the look up table.
`template`	Existing look up table.

Details

If an existing template is used and a look up table is saved as a .csv file, the resulting look up table will contain the existing recodes plus additional recode pairs required for the data.

Value

Returns a GADSdat object.

Examples

gads <- import_DF(data.frame(v1 = letters))

# auto recode without saving look up table
gads2 <- autoRecode(gads, var = "v1", var_suffix = "_num")

# auto recode with saving look up table
f <- tempfile(fileext = ".csv")
gads2 <- autoRecode(gads, var = "v1", var_suffix = "_num", csv_path = f)
gads <- import_DF(data.frame(v1 = letters))

# auto recode without saving look up table
gads2 <- autoRecode(gads, var = "v1", var_suffix = "_num")

# auto recode with saving look up table
f <- tempfile(fileext = ".csv")
gads2 <- autoRecode(gads, var = "v1", var_suffix = "_num", csv_path = f)

Calculate a scale.

Description

Calculate a scale variable based on multiple items.

Usage

calculateScale(
  GADSdat,
  items,
  scale,
  maxNA = length(items),
  reportDescr = FALSE
)
calculateScale(
  GADSdat,
  items,
  scale,
  maxNA = length(items),
  reportDescr = FALSE
)

Arguments

`GADSdat`	A `data.frame` or `GADSdat` object.
`items`	A character vector with all item variable names.
`scale`	A character vector with the scale name.
`maxNA`	Maximum number of allowed `NA` values on the items.
`reportDescr`	Should descriptive statistics be reported for the calculated scale.

Details

Descriptive statistics (including Cronbach's alpha, credit to the psy package) are calculated and printed to the console. The new scale variable is automatically inserted right after the last item in the original GADSdat.

Value

Returns a GADSdat containing the newly computed variable.

Examples

##
items <- paste0("norms_", letters[1:6])
pisa_new <- calculateScale(pisa, items = items, scale = "norms")

##
items <- paste0("norms_", letters[1:6])
pisa_new <- calculateScale(pisa, items = items, scale = "norms")

Bind two `GADSdat` objects into a single `GADSdat` object by columns.

Description

Is a secure way to cbind the data and the meta data of two GADSdat objects. Currently, only limited merging options are supported.

Usage

## S3 method for class 'GADSdat'
cbind(..., deparse.level = 1)
## S3 method for class 'GADSdat'
cbind(..., deparse.level = 1)

Arguments

`...`	Multiple `GADSdat` objects imported via `eatGADS`.
`deparse.level`	Argument is ignored in this method.

Details

If there are duplicate variables (except the variables specified in the by argument), these variables are removed from y. The meta data is joined for the remaining variables via rbind.

Value

Returns a GADSdat object.

Change missing code.

Description

Change or add missing codes of one or multiple variables as part of a GADSdat object.

Usage

changeMissings(GADSdat, varName, value, missings)
changeMissings(GADSdat, varName, value, missings)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`varName`	Character vector containing variable names.
`value`	Numeric values.
`missings`	Character vector of the new missing codes, either `"miss"` or `"valid"`. Missings tags are applied in the same ordering as `value`.

Details

Applied to a GADSdat or all_GADSdat object, this function is a wrapper of getChangeMeta and applyChangeMeta. The function supports changing multiple missing tags (missings) as well as missing tags of multiple variables (varName) at once.

Value

Returns the GADSdat object with changed meta data.

Examples

# Set a specific value to missing
pisa2 <- changeMissings(pisa, varName = "computer_age",
                        value = 5, missings = "miss")

# Set multiple values to missing
pisa3 <- changeMissings(pisa, varName = "computer_age",
                        value = 1:4,
                        missings = c("miss", "miss", "miss", "miss"))

# Set a specific value to not missing
pisa4 <- changeMissings(pisa2, varName = "computer_age",
                        value = 5, missings = "valid")

# Add missing tags to multiple variables
pisa5 <- changeMissings(pisa, varName = c("g8g9", "computer_age"),
                        value = c(-99, -98), missings = c("miss", "miss"))

# Set a specific value to missing
pisa2 <- changeMissings(pisa, varName = "computer_age",
                        value = 5, missings = "miss")

# Set multiple values to missing
pisa3 <- changeMissings(pisa, varName = "computer_age",
                        value = 1:4,
                        missings = c("miss", "miss", "miss", "miss"))

# Set a specific value to not missing
pisa4 <- changeMissings(pisa2, varName = "computer_age",
                        value = 5, missings = "valid")

# Add missing tags to multiple variables
pisa5 <- changeMissings(pisa, varName = c("g8g9", "computer_age"),
                        value = c(-99, -98), missings = c("miss", "miss"))

Change SPSS format.

Description

Change the SPSS format of one or multiple variables as part of a GADSdat object.

Usage

changeSPSSformat(GADSdat, varName, format)
changeSPSSformat(GADSdat, varName, format)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`varName`	Character vector of variable names.
`format`	A single string containing the new SPSS format, for example 'A25' or 'F10'.

Details

Applied to a GADSdat or all_GADSdat object, this function is a wrapper of getChangeMeta and applyChangeMeta.

SPSS format is supplied following SPSS logic. 'A' represents character variables, 'F' represents numeric variables. The number following this letter represents the maximum width. Optionally, another number can be added after a dot, representing the number of decimals in case of a numeric variable. For instance, 'F8.2' is used for a numeric variable with a maximum width of 8 with 2 decimal numbers.

Value

Returns the GADSdat object with changed meta data..

Examples

# change SPSS format for a single variable (numeric variable with no decimals)
pisa2 <- changeSPSSformat(pisa, varName = "idstud",
                          format = "F10.0")

# change SPSS format for multiple variables (numeric variable with no decimals)
pisa2 <- changeSPSSformat(pisa, varName = c("idstud", "idschool"),
                          format = "F10.0")

# change SPSS format for a single variable (numeric variable with no decimals)
pisa2 <- changeSPSSformat(pisa, varName = "idstud",
                          format = "F10.0")

# change SPSS format for multiple variables (numeric variable with no decimals)
pisa2 <- changeSPSSformat(pisa, varName = c("idstud", "idschool"),
                          format = "F10.0")

Change value labels.

Description

Change or add value labels of one or multiple variables as part of a GADSdat object.

Usage

changeValLabels(GADSdat, varName, value, valLabel)
changeValLabels(GADSdat, varName, value, valLabel)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`varName`	Character vector containing variable names.
`value`	Numeric values which are being labeled.
`valLabel`	Character vector of the new value labels. Labels are applied in the same ordering as `value`.

Details

Applied to a GADSdat or all_GADSdat object, this function is a wrapper of getChangeMeta and applyChangeMeta. The function supports changing multiple value labels (valLabel) as well as value labels of multiple variables (varName) at once.

Value

Returns the GADSdat object with changed meta data.

Examples

# Change existing value labels
pisa2 <- changeValLabels(pisa, varName = "repeated",
                        value = c(1, 2),
                        valLabel = c("no grade repetition", "grade repitition"))

# Add value label to unlabeled value
mtcars_g <- import_DF(mtcars)
mtcars_g2 <- changeValLabels(mtcars_g, varName = "cyl",
                             value = c(4, 6, 8),
                             valLabel = c("four", "six", "eight"))

# Add value labels to multiple variables at once
mtcars_g3 <- changeValLabels(mtcars_g, varName = c("mpg", "cyl", "disp"),
                             value = c(-99, -98),
                             valLabel = c("missing", "not applicable"))


# Change existing value labels
pisa2 <- changeValLabels(pisa, varName = "repeated",
                        value = c(1, 2),
                        valLabel = c("no grade repetition", "grade repitition"))

# Add value label to unlabeled value
mtcars_g <- import_DF(mtcars)
mtcars_g2 <- changeValLabels(mtcars_g, varName = "cyl",
                             value = c(4, 6, 8),
                             valLabel = c("four", "six", "eight"))

# Add value labels to multiple variables at once
mtcars_g3 <- changeValLabels(mtcars_g, varName = c("mpg", "cyl", "disp"),
                             value = c(-99, -98),
                             valLabel = c("missing", "not applicable"))

Change variable labels.

Description

Change variable labels of one or multiple variables as part of a GADSdat object.

Usage

changeVarLabels(GADSdat, varName, varLabel)
changeVarLabels(GADSdat, varName, varLabel)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`varName`	Character vector of variable names.
`varLabel`	Character vector of the new variable labels.

Details

Applied to a GADSdat or all_GADSdat object, this function is a wrapper of getChangeMeta and applyChangeMeta.

Value

Returns the GADSdat object with changed meta data.

Examples

# Change one variable label
pisa2 <- changeVarLabels(pisa, varName = "repeated",
                         varLabel = c("Has a grade been repeated?"))

# Change multiple variable labels
pisa2 <- changeVarLabels(pisa, varName = c("repeated", "gender"),
                         varLabel = c("Has a grade been repeated?",
                                      "Student gender"))

# Change one variable label
pisa2 <- changeVarLabels(pisa, varName = "repeated",
                         varLabel = c("Has a grade been repeated?"))

# Change multiple variable labels
pisa2 <- changeVarLabels(pisa, varName = c("repeated", "gender"),
                         varLabel = c("Has a grade been repeated?",
                                      "Student gender"))

Change Variable Names.

Description

Change variable names of a GADSdat or all_GADSdat object.

Usage

changeVarNames(GADSdat, oldNames, newNames, checkVarNames = TRUE)
changeVarNames(GADSdat, oldNames, newNames, checkVarNames = TRUE)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`oldNames`	Vector containing the old variable names.
`newNames`	Vector containing the new variable names, in identical order as `oldNames`.
`checkVarNames`	Logical. Should new variable names be checked by `checkVarNames`?

Details

Applied to a GADSdat or all_GADSdat object, this function is a wrapper of getChangeMeta and applyChangeMeta

Value

Returns the GADSdat object with changed variable names.

Examples

# Change multiple variable name
pisa2 <- changeVarNames(pisa, oldNames = c("idstud", "idschool"),
                        newNames = c("IDstud", "IDschool"))

# Change multiple variable name
pisa2 <- changeVarNames(pisa, oldNames = c("idstud", "idschool"),
                        newNames = c("IDstud", "IDschool"))

Check `SPSS` Compliance of Meta Data

Description

Function to check if variable names and labels, value labels and missing codes comply with SPSS requirements for meta data.

Usage

check4SPSS(GADSdat)
check4SPSS(GADSdat)

Arguments

GADSdat

GADSdat object imported via eatGADS.

Details

The function measures the length of variable names ("varNames_length", maximum of 64 characters) variable labels ("varLabels", maximum of 256 characters), value labels ("valLabels", maximum of 120 characters). Furthermore, missing codes are counted ("missings", maximum of three missing codes for character variables) and special characters are flagged in variable names ("varNames_special"). Check results are reported back on variable level, with the exception of "valLabels", which is a list with entries per violating variable.

Value

Returns a list with the entries "varNames_special", "varNames_length", "varLabels", "valLabels" and "missings".

Examples

# Change example data set (create a violating label)
pisa2 <- changeVarLabels(pisa, varName = "computer_age",
                        varLabel = paste(rep("3", 125), collapse = ""))

check4SPSS(pisa2)

# Change example data set (create a violating label)
pisa2 <- changeVarLabels(pisa, varName = "computer_age",
                        varLabel = paste(rep("3", 125), collapse = ""))

check4SPSS(pisa2)

Check Value Labels

Description

Check value labels for (a) value labels with no occurrence in the data (checkEmptyValLabels) and (b) values with no value labels (checkMissingValLabels).

Usage

checkEmptyValLabels(
  GADSdat,
  vars = namesGADS(GADSdat),
  valueRange = NULL,
  output = c("list", "data.frame")
)

checkMissingValLabels(
  GADSdat,
  vars = namesGADS(GADSdat),
  classes = c("integer"),
  valueRange = NULL,
  output = c("list", "data.frame")
)
checkEmptyValLabels(
  GADSdat,
  vars = namesGADS(GADSdat),
  valueRange = NULL,
  output = c("list", "data.frame")
)

checkMissingValLabels(
  GADSdat,
  vars = namesGADS(GADSdat),
  classes = c("integer"),
  valueRange = NULL,
  output = c("list", "data.frame")
)

Arguments

`GADSdat`	A `GADSdat` object.
`vars`	Character vector with the variable names to which `checkValLabels()` should be applied.
`valueRange`	[optional] Numeric vector of length 2: In which range should numeric values be checked? If specified, only numeric values are returned and strings are omitted.
`output`	Should the output be structured as a `"list"` or a `"data.frame"`?
`classes`	Character vector with the classes to which `checkMissingLabels()` should be applied. Valid options are `"integer"`, `"double"`, and `"character"`.

Details

NAs are excluded from this check. Designated missing codes are reported normally.

Value

Returns a list of length vars or a data.frame.

Functions

checkEmptyValLabels(): check for superfluous value labels
checkMissingValLabels(): check for missing value labels

Examples

# Check a categorical and a metric variable
checkMissingValLabels(pisa, vars = c("g8g9", "age"))
checkEmptyValLabels(pisa, vars = c("g8g9", "age"))

# Check while defining a specific value range
checkMissingValLabels(pisa, vars = c("g8g9", "age", "idschool"),
              valueRange = c(0, 5))
checkEmptyValLabels(pisa, vars = c("g8g9", "age", "idschool"),
              valueRange = c(0, 5))

# Check a categorical and a metric variable
checkMissingValLabels(pisa, vars = c("g8g9", "age"))
checkEmptyValLabels(pisa, vars = c("g8g9", "age"))

# Check while defining a specific value range
checkMissingValLabels(pisa, vars = c("g8g9", "age", "idschool"),
              valueRange = c(0, 5))
checkEmptyValLabels(pisa, vars = c("g8g9", "age", "idschool"),
              valueRange = c(0, 5))

Check and Adjust SPSS Format

Description

Function to check if SPSS format statements are specified correctly in a GADSdat object.

Usage

checkFormat(GADSdat, type = "SPSS", changeFormat = TRUE)
checkFormat(GADSdat, type = "SPSS", changeFormat = TRUE)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`type`	If `type='other'`, the function `nchar` will be used to determine character lengths and decimals are not rounded to 16 decimal places. With `type='SPSS'` additional width for character variables will be added in order to let SPSS read in lengthy characters correctly and .
`changeFormat`	If `changeFormat=TRUE` the `GADSdat` meta data will be updated otherwise only information will be reported.

Details

The function compares SPSS format statements "format" and actual character length and decimal places of all variables in a GADSdat object and its meta data information. Mismatches are reported and can be automatically adjusted.

Value

Returns a GADSdat object.

Examples

# Change example meta information (create a value label with incorrect missing code)
pisa2 <- checkFormat(pisa)


# Change example meta information (create a value label with incorrect missing code)
pisa2 <- checkFormat(pisa)

Check and Adjust Missing Tags

Description

Functions to check if missings are tagged and labeled correctly in a GADSdat object.

Usage

checkMissings(
  GADSdat,
  missingLabel = "missing",
  addMissingCode = TRUE,
  addMissingLabel = FALSE
)

checkMissingsByValues(GADSdat, missingValues = -50:-99, addMissingCode = TRUE)
checkMissings(
  GADSdat,
  missingLabel = "missing",
  addMissingCode = TRUE,
  addMissingLabel = FALSE
)

checkMissingsByValues(GADSdat, missingValues = -50:-99, addMissingCode = TRUE)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`missingLabel`	Single regular expression indicating how missing labels are commonly named in the value labels.
`addMissingCode`	If `TRUE`, missing tags are added according to `missingLabel` or `missingValues`.
`addMissingLabel`	If `TRUE`, `"generic missing"` is added according to occurrence of `"miss"` in `"missings"`. As often various value labels for missings are used, this argument should be used with great care.
`missingValues`	Numeric vector of values which are commonly used for missing values.

Details

checkMissings() compares value labels (valLabels) and missing tags (missings) of a GADSdat object and its meta data information. checkMissingsByValues() compares labeled values (value) and missing tags (missings) of a GADSdat object and its meta data information. Mismatches are reported and can be automatically adjusted. Note that all checks are only applied to the meta data information, not the actual data. For detecting missing value labels, see checkMissingValLabels.

Value

Returns a GADSdat object with - if specified - modified missing tags.

Functions

checkMissings(): compare missing tags and value labels
checkMissingsByValues(): compare missing tags and values in a certain range

Examples

# checkMissings
pisa2 <- changeValLabels(pisa, varName = "computer_age",
                        value = 5, valLabel = "missing: No computer use")

pisa3 <- checkMissings(pisa2)

# checkMissingsByValues
pisa4 <- changeValLabels(pisa, varName = "computer_age",
                        value = c(-49, -90, -99), valLabel = c("test1", "test2", "test3"))

pisa5 <- checkMissingsByValues(pisa4, missingValues = -50:-99)

# checkMissings
pisa2 <- changeValLabels(pisa, varName = "computer_age",
                        value = 5, valLabel = "missing: No computer use")

pisa3 <- checkMissings(pisa2)

# checkMissingsByValues
pisa4 <- changeValLabels(pisa, varName = "computer_age",
                        value = c(-49, -90, -99), valLabel = c("test1", "test2", "test3"))

pisa5 <- checkMissingsByValues(pisa4, missingValues = -50:-99)

Checks compatibility of two `eatGADS` data bases.

Description

This function checks if both data bases perform identical joins via foreign keys, if they contain the same variable names and if these variables have the same value labels. Results of this comparison are reported on data table level as messages and as an output list.

Usage

checkTrendStructure(filePath1, filePath2)
checkTrendStructure(filePath1, filePath2)

Arguments

`filePath1`	Path of the first `eatGADS` `.db` file.
`filePath2`	Path of the second `eatGADS` `.db` file.

Details

An error is thrown if the key structure or the data table structure differs between the two data bases. Differences regarding meta data for missing value labels and for variables labels (and formatting) are ignored.

Reported differences regarding meta data can be inspected further via inspectMetaDifferences.

Value

Returns a report list.

Check uniqueness of a variable.

Description

Function to check if a variable is unique for all cases of an identifier variable.

Usage

checkUniqueness(GADSdat, varName, idVar)
checkUniqueness(GADSdat, varName, idVar)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`varName`	Single string containing the variable name for which the check should be performed.
`idVar`	Single string containing the identifier variable name.

Details

For example if missing values are multiple imputed and data is stored in a long format, checking the uniqueness of a variable within an identifier can be tricky. This function automates this task.

Value

Returns either TRUE if the variable is unique within each value for idVar or a GADSdat object including the not unique cases.

Examples

## create an example GADSdat
iris2 <- iris
iris2$Species <- as.character(iris2$Species)
gads <- import_DF(iris2, checkVarNames = FALSE)

## check uniqueness
checkUniqueness(gads, varName = "Sepal.Length", idVar = "Species")

## create an example GADSdat
iris2 <- iris
iris2$Species <- as.character(iris2$Species)
gads <- import_DF(iris2, checkVarNames = FALSE)

## check uniqueness
checkUniqueness(gads, varName = "Sepal.Length", idVar = "Species")

Check uniqueness of a variable.

Description

Function to check if a variable is unique for all cases of an identifier variable. This is a fast and more efficient version of checkUniqueness which always returns a logical, non missing value of length one.

Usage

checkUniqueness2(GADSdat, varName, idVar, impVar)
checkUniqueness2(GADSdat, varName, idVar, impVar)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`varName`	Single string containing the variable name for which the check should be performed.
`idVar`	Single string containing the name of the identifier variable.
`impVar`	Single string containing the name of the imputation number.

Details

For example if missing values are multiple imputed and data is stored in a long format, checking the uniqueness of a variable within an identifier can be tricky. This function automates this task via reshaping the data into wide format and testing equality among the reshaped variables. Similar functionality (via matrices) is covered by lme4::isNested, which is more general and performs similarly.

Value

Returns a logical of length one.

Examples

## create an example GADSdat
l <- 1000
long_df <- data.table::data.table(id = sort(rep(1:l, 15)),
                               v1 = sort(rep(1:l, 15)),
                                 imp = rep(1:15, l))
gads <- import_DF(long_df)
## check uniqueness
checkUniqueness2(gads, varName = "v1", idVar = "id", impVar = "imp")

## create an example GADSdat
l <- 1000
long_df <- data.table::data.table(id = sort(rep(1:l, 15)),
                               v1 = sort(rep(1:l, 15)),
                                 imp = rep(1:15, l))
gads <- import_DF(long_df)
## check uniqueness
checkUniqueness2(gads, varName = "v1", idVar = "id", impVar = "imp")

Check for a specific value

Description

Function to look for occurrences of a specific value in a GADSdat.

Usage

checkValue(GADSdat, value, vars = namesGADS(GADSdat))
checkValue(GADSdat, value, vars = namesGADS(GADSdat))

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`value`	Single string indicating how missing labels are commonly named in the value labels.
`vars`	Character vector with the variable names to which `checkValue` should be applied.

Details

The function checks occurrences of a specific value in a set of variables (default: all variables) in the GADSdat and outputs a vector containing the count of occurrences for all variables in which the value occurs. It explicitly supports checking for NA.

Value

A named integer.

Examples

# for all variables in the data
checkValue(pisa, value = 99)

# only for specific variables in the data
checkValue(pisa, vars = c("idschool", "g8g9"), value = 99)

# for all variables in the data
checkValue(pisa, value = 99)

# only for specific variables in the data
checkValue(pisa, vars = c("idschool", "g8g9"), value = 99)

Check names for `SQLite` column name conventions.

Description

Checks names for SQLite column name conventions and applies appropriate variable name changes to GADSdat or all_GADSdat objects.

Usage

checkVarNames(GADSdat, checkKeywords = TRUE, checkDots = TRUE)
checkVarNames(GADSdat, checkKeywords = TRUE, checkDots = TRUE)

Arguments

`GADSdat`	`GADSdat` or `all_GADSdat` object.
`checkKeywords`	Logical. Should `SQLite` keywords be checked and modified?
`checkDots`	Logical. Should occurrences of `"."` be checked and modified?

Details

Invalid column names in a SQLite data base include

SQLite keywords (see sqlite_keywords) and
column names with a "." in it.

The corresponding variable name changes are

appending the suffix "Var" to all SQLite keywords and
changing all "." in variable names to "_".

Note that avoiding "." in variable names is beneficial for multiple reasons, such as avoiding confusion with S3 methods in R and issues when importing from Stata.

Value

Returns the original object with updated variable names.

Examples

# Change example data set (create an invalid variable name)
pisa2 <- changeVarNames(pisa, oldNames = "computer_age",
                        newNames = "computer.age")

pisa3 <- checkVarNames(pisa2)

# Change example data set (create an invalid variable name)
pisa2 <- changeVarNames(pisa, oldNames = "computer_age",
                        newNames = "computer.age")

pisa3 <- checkVarNames(pisa2)

Clean temporary cache.

Description

Deprecated. The cached data base is now cleaned when the R sessions ends automatically.

Usage

clean_cache(tempPath = tempdir())
clean_cache(tempPath = tempdir())

Arguments

tempPath

Local directory in which the data base was temporarily be stored.

Details

Cleans the temporary cache, specified by tempdir(). This function had to be executed at the end of an R session if getGADS_fast or getTrendGADS with fast = TRUE had been used.

Value

Returns nothing.

Clone a variable.

Description

Clone a variable as part of a GADSdat object.

Usage

cloneVariable(
  GADSdat,
  varName,
  new_varName,
  label_suffix = "",
  checkVarName = TRUE
)
cloneVariable(
  GADSdat,
  varName,
  new_varName,
  label_suffix = "",
  checkVarName = TRUE
)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`varName`	Name of the variable to be cloned.
`new_varName`	Name of the new variable.
`label_suffix`	Suffix added to variable label for the newly created variable in the `GADSdat`.
`checkVarName`	Logical. Should `new_varName` be checked by `checkVarNames`?

Details

The variable is simply duplicated and assigned a new name.

Value

Returns a GADSdat.

Examples

# duplicate the variable schtype
pisa_new <- cloneVariable(pisa, varName = "schtype", new_varName = "schtype_new")

# duplicate the variable schtype
pisa_new <- cloneVariable(pisa, varName = "schtype", new_varName = "schtype_new")

Collapse two columns of a lookup table.

Description

Collapse two columns or format a single column of a lookup table created by createLookup.

Usage

collapseColumns(lookup, recodeVars, prioritize)
collapseColumns(lookup, recodeVars, prioritize)

Arguments

`lookup`	For example a lookup table `data.frame` as created via `createLookup`.
`recodeVars`	Character vector of column names which should be collapsed (currently only up to two variables are supported).
`prioritize`	Character vector of length 1. Which of the columns in `recodeVars` should be prioritized, if multiple values are available? If `recodeVars` is of length 1, this argument can be omitted.

Details

If a lookup table is created by createLookup, different recoding columns can be specified by the addCols argument. This might be the case if two rater suggest recodes or one rater corrects recodes by another rater in a separate column. After the recoding columns have been filled out, collapseColumns can be used to either:

(a) Collapse two recoding columns into one recoding column. This might be desirable, if the two columns contain missing values. prioritize can be used to specify, which of the two columns should be prioritized if both columns contain valid values.

(b) Format the lookup table for applyLookup, if recodeVars is a single variable. This simply renames the single variable specified under recodeVars.

Value

Returns a data.frame that can be used for applyLookup, with the columns:

`variable`	Variable names
`value`	Old values
`value_new`	New values. Renamed and/or collapsed column.

Examples

## (a) Collapse two columns
# create example recode data.frame
lookup_raw <- data.frame(variable = c("var1"), value = c("germa", "German", "dscherman"),
           recode1 = c(NA, "English", "German"),
           recode2 = c("German", "German", NA), stringsAsFactors = FALSE)

# collapse columns
lookup <- collapseColumns(lookup_raw, recodeVars = c("recode1", "recode2"), prioritize = "recode2")

## (b) Format one column
# create example recode data.frame
lookup_raw2 <- data.frame(variable = c("var1"), value = c("germa", "German", "dscherman"),
           recode1 = c("German", "German", "German"), stringsAsFactors = FALSE)

# collapse columns
lookup2 <- collapseColumns(lookup_raw2, recodeVars = c("recode1"))

## (a) Collapse two columns
# create example recode data.frame
lookup_raw <- data.frame(variable = c("var1"), value = c("germa", "German", "dscherman"),
           recode1 = c(NA, "English", "German"),
           recode2 = c("German", "German", NA), stringsAsFactors = FALSE)

# collapse columns
lookup <- collapseColumns(lookup_raw, recodeVars = c("recode1", "recode2"), prioritize = "recode2")

## (b) Format one column
# create example recode data.frame
lookup_raw2 <- data.frame(variable = c("var1"), value = c("germa", "German", "dscherman"),
           recode1 = c("German", "German", "German"), stringsAsFactors = FALSE)

# collapse columns
lookup2 <- collapseColumns(lookup_raw2, recodeVars = c("recode1"))

Recode a multiple choice variable according to a character variable.

Description

Recode an labeled integer variable (based on an multiple choice item), according to a character variable (e.g. an open answer item).

Usage

collapseMC_Text(
  GADSdat,
  mc_var,
  text_var,
  mc_code4text,
  var_suffix = "_r",
  label_suffix = "(recoded)"
)
collapseMC_Text(
  GADSdat,
  mc_var,
  text_var,
  mc_code4text,
  var_suffix = "_r",
  label_suffix = "(recoded)"
)

Arguments

`GADSdat`	A `GADSdat` object.
`mc_var`	The variable name of the multiple choice variable.
`text_var`	The variable name of the text variable.
`mc_code4text`	The value label in `mc_var` that indicates that information from the text variable should be used.
`var_suffix`	Variable name suffix for the newly created variables. If `NULL`, variables are overwritten.
`label_suffix`	Variable label suffix for the newly created variable (only added in the meta data). If `NULL` no suffix is added.

Details

Multiple choice variables can be represented as labeled integer variables in a GADSdat. Multiple choice items with a forced choice frequently contain an open answer category. However, sometimes open answers overlap with the existing categories in the multiple choice item. collapseMC_Text allows recoding the multiple choice variable based on the open answer variable.

mc_code4text indicates when entries in the text_var should be used. Additionally, entries in the text_var are also used when there are missings on the mc_var. New values for the mc_var are added in the meta data, while preserving the initial ordering of the value labels. Newly added value labels are sorted alphabetically.

For more details see the help vignette: vignette("recoding_forcedChoice", package = "eatGADS").

Value

Returns a GADSdat containing the newly computed variable.

Examples

# Example gads
example_df <- data.frame(ID = 1:5, mc = c("blue", "blue", "green", "other", "other"),
                        open = c(NA, NA, NA, "yellow", "blue"),
                        stringsAsFactors = FALSE)
example_df$mc <- as.factor(example_df$mc)
gads <- import_DF(example_df)

# recode
gads2 <- collapseMC_Text(gads, mc_var = "mc", text_var = "open",
                        mc_code4text = "other")

# Example gads
example_df <- data.frame(ID = 1:5, mc = c("blue", "blue", "green", "other", "other"),
                        open = c(NA, NA, NA, "yellow", "blue"),
                        stringsAsFactors = FALSE)
example_df$mc <- as.factor(example_df$mc)
gads <- import_DF(example_df)

# recode
gads2 <- collapseMC_Text(gads, mc_var = "mc", text_var = "open",
                        mc_code4text = "other")

Recode multiple choice variable with multiple variables.

Description

Recode multiple variables (representing a single multiple choice item) based on multiple character variables (representing a text field).

Usage

collapseMultiMC_Text(
  GADSdat,
  mc_vars,
  text_vars,
  mc_var_4text,
  var_suffix = "_r",
  label_suffix = "(recoded)",
  invalid_miss_code = -98,
  invalid_miss_label = "Missing: Invalid response",
  notext_miss_code = -99,
  notext_miss_label = "Missing: By intention"
)
collapseMultiMC_Text(
  GADSdat,
  mc_vars,
  text_vars,
  mc_var_4text,
  var_suffix = "_r",
  label_suffix = "(recoded)",
  invalid_miss_code = -98,
  invalid_miss_label = "Missing: Invalid response",
  notext_miss_code = -99,
  notext_miss_label = "Missing: By intention"
)

Arguments

`GADSdat`	A `GADSdat` object.
`mc_vars`	A character vector with the variable names of the multiple choice variable. Names of the character vector are the corresponding values that are represented by the individual variables. Creation by `matchValues_varLabels` is recommended.
`text_vars`	A character vector with the names of the text variables which should be collapsed.
`mc_var_4text`	The name of the multiple choice variable that signals that information from the text variable should be used. This variable is recoded according to the final status of the text variables.
`var_suffix`	Variable suffix for the newly created `GADSdat`. If an empty character, the existing variables are overwritten.
`label_suffix`	Suffix added to variable label for the newly created or modified variables in the `GADSdat`.
`invalid_miss_code`	Missing code which is given to new character variables if all text entries where recoded into the dichotomous variables.
`invalid_miss_label`	Value label for `invalid_miss_code`.
`notext_miss_code`	Missing code which is given to empty character variables.
`notext_miss_label`	Value label for `notext_miss_code`.

Details

If a multiple choice item can be answered with ticking multiple boxes, multiple variables in the data set are necessary to represent this item. In this case, an additional text field for further answers can also contain multiple values at once. However, some of the answers in the text field might be redundant to the dummy variables. collapseMultiMC_Text allows to recode multiple MC items of this kind based on multiple text variables. The recoding can be prepared by expanding the single text variable (createLookup and applyLookup_expandVar) and by matching the dummy variables to its underlying values stored in variable labels (matchValues_varLabels).

The function recodes the dummy variables according to the character variables. Additionally, the mc_var_4text variable is recoded according to the final status of the text_vars (exception: if the text variables were originally NA, mc_var_4text is left as it was).

Missing values in the character variables can be represented either by NAs or by empty characters. The multiple choice variables specified with mc_vars can only contain the values 0, 1 and missing codes. The value 1 must always represent "this category applies". If necessary, use recodeGADS for recoding.

For cases for which the text_vars contain only values that can be recoded into the mc_vars, all new text_vars are given specific missing codes (see invalid_miss_code and invalid_miss_label). All remaining NAs on the character variables are given a specific missing code (notext_miss_code).

Value

Returns a GADSdat containing the newly computed variables.

Examples

# Prepare example data
mt2 <- data.frame(ID = 1:4, mc1 = c(1, 0, 0, 0), mc2 = c(0, 0, 0, 0), mc3 = c(0, 1, 1, 0),
                  text1 = c(NA, "Eng", "Aus", "Aus2"), text2 = c(NA, "Franz", NA, "Ger"),
                  stringsAsFactors = FALSE)
mt2_gads <- import_DF(mt2)
mt3_gads <- changeVarLabels(mt2_gads, varName = c("mc1", "mc2", "mc3"),
                            varLabel = c("Lang: Eng", "Aus spoken", "other"))

## All operations (see also respective help pages of functions for further explanations)
mc_vars <- matchValues_varLabels(mt3_gads, mc_vars = c("mc1", "mc2", "mc3"),
            values = c("Aus", "Eng", "Eng"), label_by_hand = c("other" = "mc3"))

out_gads <- collapseMultiMC_Text(mt3_gads, mc_vars = mc_vars,
             text_vars = c("text1", "text2"), mc_var_4text = "mc3")

out_gads2 <- multiChar2fac(out_gads, vars = c("text1_r", "text2_r"))

final_gads <- remove2NAchar(out_gads2, vars = c("text1_r_r", "text2_r_r"),
                              max_num = 1, na_value = -99, na_label = "missing: excessive answers")

# Prepare example data
mt2 <- data.frame(ID = 1:4, mc1 = c(1, 0, 0, 0), mc2 = c(0, 0, 0, 0), mc3 = c(0, 1, 1, 0),
                  text1 = c(NA, "Eng", "Aus", "Aus2"), text2 = c(NA, "Franz", NA, "Ger"),
                  stringsAsFactors = FALSE)
mt2_gads <- import_DF(mt2)
mt3_gads <- changeVarLabels(mt2_gads, varName = c("mc1", "mc2", "mc3"),
                            varLabel = c("Lang: Eng", "Aus spoken", "other"))

## All operations (see also respective help pages of functions for further explanations)
mc_vars <- matchValues_varLabels(mt3_gads, mc_vars = c("mc1", "mc2", "mc3"),
            values = c("Aus", "Eng", "Eng"), label_by_hand = c("other" = "mc3"))

out_gads <- collapseMultiMC_Text(mt3_gads, mc_vars = mc_vars,
             text_vars = c("text1", "text2"), mc_var_4text = "mc3")

out_gads2 <- multiChar2fac(out_gads, vars = c("text1_r", "text2_r"))

final_gads <- remove2NAchar(out_gads2, vars = c("text1_r_r", "text2_r_r"),
                              max_num = 1, na_value = -99, na_label = "missing: excessive answers")

Compare two GADS.

Description

Compare multiple variables of two GADSdat or all_GADSdat objects.

Usage

compareGADS(
  GADSdat_old,
  GADSdat_new,
  varNames,
  output = c("list", "data.frame", "aggregated")
)
compareGADS(
  GADSdat_old,
  GADSdat_new,
  varNames,
  output = c("list", "data.frame", "aggregated")
)

Arguments

`GADSdat_old`	`GADSdat` object imported via `eatGADS`.
`GADSdat_new`	`GADSdat` object imported via `eatGADS`.
`varNames`	Character string of variable names to be compared.
`output`	How should the output be structured?

Details

Returns "all equal" if the variable is identical across the objects or a data.frame containing a frequency table with the values which have been changed. Especially useful for checks after recoding.

Value

Returns either a list with "all equal" and data.frames or a single data.frame.

Examples

# Recode a GADS
pisa2 <- recodeGADS(pisa, varName = "schtype",
                        oldValues = 3, newValues = 9)
pisa2 <- recodeGADS(pisa2, varName = "language",
                        oldValues = 1, newValues = 15)

# Compare
compareGADS(pisa, pisa2,
            varNames = c("ganztag", "schtype", "language"), output = "list")
compareGADS(pisa, pisa2,
            varNames = c("ganztag", "schtype", "language"), output = "data.frame")
compareGADS(pisa, pisa2,
             varNames = c("ganztag", "schtype", "language"), output = "aggregated")

# Recode a GADS
pisa2 <- recodeGADS(pisa, varName = "schtype",
                        oldValues = 3, newValues = 9)
pisa2 <- recodeGADS(pisa2, varName = "language",
                        oldValues = 1, newValues = 15)

# Compare
compareGADS(pisa, pisa2,
            varNames = c("ganztag", "schtype", "language"), output = "list")
compareGADS(pisa, pisa2,
            varNames = c("ganztag", "schtype", "language"), output = "data.frame")
compareGADS(pisa, pisa2,
             varNames = c("ganztag", "schtype", "language"), output = "aggregated")

Create a composite variable.

Description

Create a composite variable out of two variables.

Usage

composeVar(GADSdat, sourceVars, primarySourceVar, newVar, checkVarName = TRUE)
composeVar(GADSdat, sourceVars, primarySourceVar, newVar, checkVarName = TRUE)

Arguments

`GADSdat`	`GADSdat` or `all_GADSdat` object imported via eatGADS.
`sourceVars`	Character vector of length two containing the variable names which represent the sources of information.
`primarySourceVar`	Character vector containing a single variable name. Which of the `sourceVars` should be preferred?
`newVar`	Character vector containing the name of the new composite variable.
`checkVarName`	Logical. Should `newVar` be checked by `checkVarNames`?

Details

A common use case for creating a composite variable is if there are multiple sources for the same information. For example, a child and the parents are asked about the child's native language. In such cases a composite variable contains information from both variables, meaning that one source is preferred and the other source is used to substitute missing values.

Value

The modified GADSdat.

Examples

# example data
dat <- data.frame(ID = 1:4,
nat_lang_child = c("Engl", "Ger", "missing", "missing"),
nat_lang_father = c("Engl", "Engl", "Ger", "missing"),
stringsAsFactors = TRUE)
gads <- import_DF(dat)
changeMissings(gads, "nat_lang_child", value = 3, missings = "miss")
changeMissings(gads, "nat_lang_father", value = 3, missings = "miss")

# compose variable
composeVar(gads, sourceVars = c("nat_lang_child", "nat_lang_father"),
           primarySourceVar = "nat_lang_child", newVar = "nat_lang_comp")


# example data
dat <- data.frame(ID = 1:4,
nat_lang_child = c("Engl", "Ger", "missing", "missing"),
nat_lang_father = c("Engl", "Engl", "Ger", "missing"),
stringsAsFactors = TRUE)
gads <- import_DF(dat)
changeMissings(gads, "nat_lang_child", value = 3, missings = "miss")
changeMissings(gads, "nat_lang_father", value = 3, missings = "miss")

# compose variable
composeVar(gads, sourceVars = c("nat_lang_child", "nat_lang_father"),
           primarySourceVar = "nat_lang_child", newVar = "nat_lang_comp")

Modify upper and lower case for strings.

Description

Convert a character vector, all character variables in a data.frame or selected variables in a GADSdat to upper ("uppper"), lower ("lower"), or first letter upper and everything else lower case ("upperFirst").

Usage

convertCase(x, case = c("lower", "upper", "upperFirst"), ...)

## S3 method for class 'GADSdat'
convertCase(x, case = c("lower", "upper", "upperFirst"), vars, ...)
convertCase(x, case = c("lower", "upper", "upperFirst"), ...)

## S3 method for class 'GADSdat'
convertCase(x, case = c("lower", "upper", "upperFirst"), vars, ...)

Arguments

`x`	A character vector, `data.frame`, or `GADSdat`.
`case`	Character vector of length 1. What case should the strings be converted to? Available options are `"lower"`, `"upper"`, or `"upperFirst"`.
`...`	further arguments passed to or from other methods.
`vars`	Character vector. What variables in the `GADSdat` should the conversion be applied to?

Value

Returns the converted object.

Methods (by class)

convertCase(GADSdat): convert case for GADSdats

Examples

# for character
convertCase(c("Hi", "HEllo", "greaT"), case = "upperFirst")

# for GADSdat
input_g <- import_DF(data.frame(v1 = 1:3, v2 = c("Hi", "HEllo", "greaT"),
                          stringsAsFactors = FALSE))
convertCase(input_g, case = "upperFirst", vars = "v2")


# for character
convertCase(c("Hi", "HEllo", "greaT"), case = "upperFirst")

# for GADSdat
input_g <- import_DF(data.frame(v1 = 1:3, v2 = c("Hi", "HEllo", "greaT"),
                          stringsAsFactors = FALSE))
convertCase(input_g, case = "upperFirst", vars = "v2")

Create an `eatGADS` data base.

Description

Creates a relational data base containing hierarchically stored data with meta information (e.g. value and variable labels).

Usage

createGADS(allList, pkList, fkList, filePath)
createGADS(allList, pkList, fkList, filePath)

Arguments

`allList`	An object created via `mergeLabels`.
`pkList`	List of primary keys.
`fkList`	List of foreign keys.
`filePath`	Path to the db file to write (including name); has to end on '.db'.

Details

Uses createDB from the eatDB package to create a relational data base. For details on how to define keys see the documentation of createDB.

Value

Creates a data base in the given path, returns NULL.

Examples

# see createDB vignette

# see createDB vignette

Extract values for recoding.

Description

Extract unique values from one or multiple variables of a GADSdat object for recoding (e.g. via an Excel spreadsheet).

Usage

createLookup(GADSdat, recodeVars, sort_by = NULL, addCols = c("value_new"))
createLookup(GADSdat, recodeVars, sort_by = NULL, addCols = c("value_new"))

Arguments

`GADSdat`	A `GADSdat` object.
`recodeVars`	Character vector of variable names which should be recoded.
`sort_by`	By which column (`variable` and/or `value`) should the long format `data.frame` be sorted? If `NULL`, no sorting is performed.
`addCols`	Character vector of additional column names for recoding purposes.

Details

If recoding of one or multiple variables is more complex, a lookup table can be created for later application via applyLookup or applyLookup_expandVar. The function allows the extraction of the values of multiple variables and sorting of these unique values via variable and/or values. If addCols are specified the lookup table has to be formatted via collapseColumns, before it can be applied to recode data.

Value

Returns a data frame in long format with the following variables:

`variable`	Variables as specified in `recodeVars`
`value`	Unique values of the variables specified in `recodeVars`
`value_new`	This is the default for `addCols`. If different additional column names are supplied, this column is missing.

Examples

# create example GADS
dat <- data.frame(ID = 1:4, var1 = c(NA, "Eng", "Aus", "Aus2"),
                  var2 = c(NA, "French", "Ger", "Ita"),
                  stringsAsFactors = FALSE)
gads <- import_DF(dat)

# create Lookup table for recoding
lookup <- createLookup(gads, recodeVars = c("var1", "var2"), sort_by = c("value", "variable"))

# create Lookup table for recoding by multiple recoders
lookup2 <- createLookup(gads, recodeVars = c("var1", "var2"), sort_by = c("value", "variable"),
                        addCols = c("value_recoder1", "value_recoder2"))

# create example GADS
dat <- data.frame(ID = 1:4, var1 = c(NA, "Eng", "Aus", "Aus2"),
                  var2 = c(NA, "French", "Ger", "Ita"),
                  stringsAsFactors = FALSE)
gads <- import_DF(dat)

# create Lookup table for recoding
lookup <- createLookup(gads, recodeVars = c("var1", "var2"), sort_by = c("value", "variable"))

# create Lookup table for recoding by multiple recoders
lookup2 <- createLookup(gads, recodeVars = c("var1", "var2"), sort_by = c("value", "variable"),
                        addCols = c("value_recoder1", "value_recoder2"))

Create data.frame for specification of numerical plausibility checks.

Description

All numerical variables without value labels in a GADSdat are selected and a data.frame is created, which allows the specification of minima and maxima.

Usage

createNumCheck(GADSdat)
createNumCheck(GADSdat)

Arguments

GADSdat

A GADSdat object.

Details

This function is currently under development.

Value

A data.frame with the following variables:

`variable`	All numerical variables in the `GADSdat`
`varLabel`	Corresponding variable labels
`min`	Minimum value for the specific variable.
`max`	Maximum value for the specific variable.
`value_new`	Which value should be inserted if values exceed the specified range?

Examples

# tbd

# tbd

Create a variable.

Description

Create an empty variable as part of a GADSdat object.

Usage

createVariable(GADSdat, varName, checkVarName = TRUE)
createVariable(GADSdat, varName, checkVarName = TRUE)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`varName`	Name of the variable to be cloned.
`checkVarName`	Logical. Should `varName` be checked by `checkVarNames`?

Value

Returns a GADSdat.

Examples

# create a new variable
pisa_new <- createVariable(pisa, varName = "new_variable")

# create a new variable
pisa_new <- createVariable(pisa, varName = "new_variable")

Drop duplicate IDs in a `GADSdat`.

Description

Drop rows with duplicate IDs in a GADSdat object based on numbers of missing values.

Usage

dropDuplicateIDs(GADSdat, ID, varNames = setdiff(namesGADS(GADSdat), ID))
dropDuplicateIDs(GADSdat, ID, varNames = setdiff(namesGADS(GADSdat), ID))

Arguments

`GADSdat`	A `GADSdat` object.
`ID`	Name of the ID variable.
`varNames`	Character vector of variable names: Sum of missing values on these variables decide which rows are kept. Per default, all variables except the ID variable are used.

Details

If duplicate IDs occur, it is often desirable to keep the row with the least missing information. Therefore, dropDuplicateIDs drops rows based on number of missing values on the specified variables (varNames).

If multiple rows have the same number of missing values, a warning is issued and the first of the respective rows is kept.

Value

Returns the GADSdat with duplicate ID rows removed.

Examples

# create example data set
gads_ori <- import_DF(data.frame(id_var = c(1, 2, 5, 4, 4),
  var1 = c(1, 2, -99, 1, -99)))
gads_ori <- changeMissings(gads_ori, varName = "var1",
  value = -99, missings = "miss")

# drop duplicate IDs
dropDuplicateIDs(gads_ori, ID = "id_var")

# create example data set
gads_ori <- import_DF(data.frame(id_var = c(1, 2, 5, 4, 4),
  var1 = c(1, 2, -99, 1, -99)))
gads_ori <- changeMissings(gads_ori, varName = "var1",
  value = -99, missings = "miss")

# drop duplicate IDs
dropDuplicateIDs(gads_ori, ID = "id_var")

Transform dummy variables to character variables.

Description

Convert a set of dummy variables into a set of character variables.

Usage

dummies2char(GADSdat, dummies, dummyValues, charNames, checkVarNames = TRUE)
dummies2char(GADSdat, dummies, dummyValues, charNames, checkVarNames = TRUE)

Arguments

`GADSdat`	A `GADSdat` object.
`dummies`	A character vector with the names of the dummy variables.
`dummyValues`	A vector with the values which the dummy variables represent.
`charNames`	A character vector containing the new variable names.
`checkVarNames`	Logical. Should `charNames` be checked by `checkVarNames`?

Details

A set of dummy variables is transformed to an equal number of character variables. The character variables are aligned to the left and the remaining character variables are set to NA. For each new variable the missing codes of the respective dummy variable are reused.

Value

Returns a GADSdat.

Examples

## create an example GADSdat
dummy_df <- data.frame(d1 = c("eng", "no eng", "eng"),
                      d2 = c("french", "french", "no french"),
                      d3 = c("no ger", "ger", "no ger"),
                      stringsAsFactors = TRUE)
dummy_g <- import_DF(dummy_df)

## transform dummy variables
dummy_g2 <- dummies2char(dummy_g, dummies = c("d1", "d2", "d3"),
                        dummyValues = c("english", "french", "german"),
                        charNames = c("char1", "char2", "char3"))


## create an example GADSdat
dummy_df <- data.frame(d1 = c("eng", "no eng", "eng"),
                      d2 = c("french", "french", "no french"),
                      d3 = c("no ger", "ger", "no ger"),
                      stringsAsFactors = TRUE)
dummy_g <- import_DF(dummy_df)

## transform dummy variables
dummy_g2 <- dummies2char(dummy_g, dummies = c("d1", "d2", "d3"),
                        dummyValues = c("english", "french", "german"),
                        charNames = c("char1", "char2", "char3"))

Set variables to `NA`.

Description

Set all values within one or multiple variables to NA.

Usage

emptyTheseVariables(GADSdat, vars, label_suffix = "")
emptyTheseVariables(GADSdat, vars, label_suffix = "")

Arguments

`GADSdat`	A `GADSdat` object.
`vars`	Character vector of variable names which should be set to `NA`.
`label_suffix`	Suffix added to variable labels for the affected variables in the `GADSdat`.

Value

Returns the recoded GADSdat.

Examples

# empty multiple variables
pisa2 <- emptyTheseVariables(pisa, vars = c("idstud", "idschool"))
# empty multiple variables
pisa2 <- emptyTheseVariables(pisa, vars = c("idstud", "idschool"))

Test if two `GADSdat` objects are (nearly) equal

Description

Run tests to check whether two GADSdat objects are (nearly) equal. Variable names, number of rows in the data, meta data and data differences are checked and reported as a list output.

Usage

equalGADS(
  target,
  current,
  id = NULL,
  metaExceptions = c("display_width", "labeled"),
  tolerance = sqrt(.Machine$double.eps)
)
equalGADS(
  target,
  current,
  id = NULL,
  metaExceptions = c("display_width", "labeled"),
  tolerance = sqrt(.Machine$double.eps)
)

Arguments

`target`	A `GADSdat` object.
`current`	A `GADSdat` object.
`id`	A character vector of length 1 containing the unique identifier column of both `GADSdat`. If specified, both `GADSdat` are ordered according to `ID` before comparing their data.
`metaExceptions`	Should certain meta data columns be excluded from the comparison?
`tolerance`	A numeric value greater than or equal to `0`. Differences smaller than `tolerance` are not reported. The default value is close to `1.5e-8`.

Details

More detailed checks for individual variables can be performed via inspectDifferences and inspectMetaDifferences.

Value

Returns a list.

Transform a `GADSdat` to a `tibble`

Description

haven's read_spss stores data together with meta data (e.g. value and variable labels) in a tibble with attributes on variable level. This function transforms a GADSdat object to such a tibble.

Usage

export_tibble(GADSdat)
export_tibble(GADSdat)

Arguments

GADSdat

GADSdat object imported via eatGADS.

Details

This function is mainly intended for internal use. For further documentation see also write_spss.

Value

Returns a tibble.

Examples

pisa_tbl <- export_tibble(pisa)

pisa_tbl <- export_tibble(pisa)

Extract Data

Description

Extract data.frame from a GADSdat object for analyses in R. Value labels can be selectively applied via defining convertLabels and covertVariables. For extracting meta data see extractMeta.

Usage

extractData(
  GADSdat,
  convertMiss = TRUE,
  convertLabels = c("character", "factor", "numeric"),
  convertVariables = NULL,
  dropPartialLabels = TRUE
)
extractData(
  GADSdat,
  convertMiss = TRUE,
  convertLabels = c("character", "factor", "numeric"),
  convertVariables = NULL,
  dropPartialLabels = TRUE
)

Arguments

`GADSdat`	A `GADSdat` object.
`convertMiss`	Should values tagged as missing values be recoded to `NA`?
`convertLabels`	If `"numeric"`, values remain as numerics. If `"factor"` or `"character"`, values are recoded to their labels. Corresponding variable type is applied.
`convertVariables`	Character vector of variables names, which labels should be applied to. All other variables remain as numeric variables in the data. If not specified [default], value labels are applied to all variables for which labels are available. Variable names not in the actual `GADS` are silently dropped.
`dropPartialLabels`	Should value labels for partially labeled variables be dropped? If `TRUE`, the partial labels will be dropped. If `FALSE`, the variable will be converted to the class specified in `convertLabels`.

Details

A GADSdat object includes actual data (GADSdat$dat) and the corresponding meta data information (GADSdat$labels). extractData extracts the data and applies relevant meta data on value level (missing conversion, value labels), so the data can be used for analyses in R. Variable labels are retained as label attributes on column level.

If factor are extracted via convertLabels == "factor", an attempt is made to preserve the underlying integers. If this is not possible, a warning is issued. As SPSS has almost no limitations regarding the underlying values of labeled integers and R's factor format is very strict (no 0, only integers increasing by + 1), this procedure can lead to frequent problems.

Value

Returns a data frame.

Examples

# Extract Data for Analysis
dat <- extractData(pisa)

# convert labeled variables to factors
dat <- extractData(pisa, convertLabels = "factor")

# convert only some variables to factor, all others remain numeric
dat <- extractData(pisa, convertLabels = "factor", convertVariables = c("schtype", "ganztag"))

# convert only some variables to character, all others remain numeric
dat <- extractData(pisa, convertLabels = "factor", convertVariables = c("schtype", "ganztag"))
# schtype is now character
table(dat$schtype)
# schtype remains numeric
table(dat$gender)

# Extract Data for Analysis
dat <- extractData(pisa)

# convert labeled variables to factors
dat <- extractData(pisa, convertLabels = "factor")

# convert only some variables to factor, all others remain numeric
dat <- extractData(pisa, convertLabels = "factor", convertVariables = c("schtype", "ganztag"))

# convert only some variables to character, all others remain numeric
dat <- extractData(pisa, convertLabels = "factor", convertVariables = c("schtype", "ganztag"))
# schtype is now character
table(dat$schtype)
# schtype remains numeric
table(dat$gender)

Extract Data 2

Description

Extract data.frame from a GADSdat object for analyses in R. Per default, missing codes are applied but value labels are dropped. Alternatively, value labels can be selectively applied via labels2character, labels2factor, and labels2ordered. For extracting meta data see extractMeta.

Usage

extractData2(
  GADSdat,
  convertMiss = TRUE,
  labels2character = NULL,
  labels2factor = NULL,
  labels2ordered = NULL,
  dropPartialLabels = TRUE
)
extractData2(
  GADSdat,
  convertMiss = TRUE,
  labels2character = NULL,
  labels2factor = NULL,
  labels2ordered = NULL,
  dropPartialLabels = TRUE
)

Arguments

`GADSdat`	A `GADSdat` object.
`convertMiss`	Should values tagged as missing values be recoded to `NA`?
`labels2character`	For which variables should values be recoded to their labels? The resulting variables are of type `character`.
`labels2factor`	For which variables should values be recoded to their labels? The resulting variables are of type `factor`.
`labels2ordered`	For which variables should values be recoded to their labels? The resulting variables are of type `ordered`.
`dropPartialLabels`	Should value labels for partially labeled variables be dropped? If `TRUE`, the partial labels will be dropped. If `FALSE`, the variable will be converted to the class specified in `labels2character`, `labels2factor`, or `labels2ordered`.

Details

A GADSdat object includes actual data (GADSdat$dat) and the corresponding meta data information (GADSdat$labels). extractData2 extracts the data and applies relevant meta data on value level (missing tags, value labels), so the data can be used for analyses in R. Variable labels are retained as label attributes on column level.

If factor are extracted via labels2factor or labels2ordered, an attempt is made to preserve the underlying integers. If this is not possible, a warning is issued. As SPSS has almost no limitations regarding the underlying values of labeled integers and R's factor format is very strict (no 0, only integers increasing by + 1), this procedure can lead to frequent problems.

If multiple values of the same variable are assigned the same value label and the variable should be transformed to character, factor, or ordered, a warning is issued and the transformation is correctly performed.

Value

Returns a data frame.

Examples

# Extract Data for Analysis
dat <- extractData2(pisa)

# convert only some variables to character, all others remain numeric
dat <- extractData2(pisa, labels2character = c("schtype", "ganztag"))

# convert only some variables to factor, all others remain numeric
dat <- extractData2(pisa, labels2factor = c("schtype", "ganztag"))

# convert all labeled variables to factors
dat <- extractData2(pisa, labels2factor = namesGADS(pisa))

# convert somme variables to factor, some to character
dat <- extractData2(pisa, labels2character = c("schtype", "ganztag"),
                          labels2factor = c("migration"))

# Extract Data for Analysis
dat <- extractData2(pisa)

# convert only some variables to character, all others remain numeric
dat <- extractData2(pisa, labels2character = c("schtype", "ganztag"))

# convert only some variables to factor, all others remain numeric
dat <- extractData2(pisa, labels2factor = c("schtype", "ganztag"))

# convert all labeled variables to factors
dat <- extractData2(pisa, labels2factor = namesGADS(pisa))

# convert somme variables to factor, some to character
dat <- extractData2(pisa, labels2character = c("schtype", "ganztag"),
                          labels2factor = c("migration"))

Extract Data while merging linking errors.

Description

Support for linking error data bases has been removed from eatGADS. extractDataOld provides (for the time being) backwards compatibility, so linking errors can still be merged automatically.

Usage

extractDataOld(
  GADSdat,
  convertMiss = TRUE,
  convertLabels = "character",
  dropPartialLabels = TRUE,
  convertVariables = NULL
)
extractDataOld(
  GADSdat,
  convertMiss = TRUE,
  convertLabels = "character",
  dropPartialLabels = TRUE,
  convertVariables = NULL
)

Arguments

`GADSdat`	A `GADSdat` object.
`convertMiss`	Should values coded as missing values be recoded to `NA`?
`convertLabels`	If `"numeric"`, values remain as numerics. If `"factor"` or `"character"`, values are recoded to their labels. Corresponding variable type is applied.
`dropPartialLabels`	Should value labels for partially labeled variables be dropped? If `TRUE`, the partial labels will be dropped. If `FALSE`, the variable will be converted to the class specified in `convertLabels`.
`convertVariables`	Character vector of variables names, which labels should be applied to. If not specified (default), value labels are applied to all variables for which labels are available. Variable names not in the actual GADS are silently dropped.

Details

See extractData for the current functionality.

Value

Returns a data frame.

Extract single `GADSdat` from `all_GADSdat`

Description

Function to extract a single GADSdat from an all_GADSdat object.

Usage

extractGADSdat(all_GADSdat, name)
extractGADSdat(all_GADSdat, name)

Arguments

`all_GADSdat`	`all_GADSdat` object
`name`	A character vector with length 1 with the name of the `GADSdat`

Details

GADSdat objects can be merged into a single all_GADSdat object via mergeLabels. This function, performs the reverse action, extracting a single GADSdat object.

Value

Returns an GADSdat object.

Examples

# see createGADS vignette

# see createGADS vignette

Get Meta Data

Description

Extract meta data (e.g. variable and values labels) from an eatGADS object. This can be a GADSdat, an all_GADSdat, a labels data.frame, or the path to an existing data base.

Usage

extractMeta(GADSobject, vars = NULL)
extractMeta(GADSobject, vars = NULL)

Arguments

`GADSobject`	Either a `GADSdat` object or a path to an existing `eatGADS` data base.
`vars`	A character vector containing variable names. If `NULL` (default), all available meta information is returned.

Details

Meta data is stored tidily in all GADSdat objects as a separate long format data frame. This information can be extracted for a single or multiple variables.

Value

Returns a long format data frame with meta information.

Examples

# Extract Meta data from data base
db_path <- system.file("extdata", "pisa.db", package = "eatGADS")
extractMeta(db_path, vars = c("schtype", "sameteach"))

# Extract Meta data from loaded/imported GADS
extractMeta(pisa, vars = c("schtype", "sameteach"))

# Extract Meta data from data base
db_path <- system.file("extdata", "pisa.db", package = "eatGADS")
extractMeta(db_path, vars = c("schtype", "sameteach"))

# Extract Meta data from loaded/imported GADS
extractMeta(pisa, vars = c("schtype", "sameteach"))

Extract or remove variables from a `GADSdat`.

Description

Extract or remove variables and their meta data from a GADSdat object.

Usage

extractVars(GADSdat, vars)

removeVars(GADSdat, vars)
extractVars(GADSdat, vars)

removeVars(GADSdat, vars)

Arguments

`GADSdat`	`GADSdat` object.
`vars`	A character vector containing the variables names in the `GADSdat`.

Details

Both functions simply perform the variable removal or extraction from the underlying data.frame in the GADSdat object followed by calling updateMeta.

Value

Returns a GADSdat object.

Examples

## create an example GADSdat
example_df <- data.frame(ID = 1:4,
                        age = c(12, 14, 16, 13),
                        citizenship1 = c("German", "English", "Polish", "Chinese"),
                        citizenship2 = c(NA, "German", "Chinese", "Polish"),
                        stringsAsFactors = TRUE)
gads <- import_DF(example_df)

## remove variables from GADSdat
gads2 <- removeVars(gads, vars = c("citizenship2", "age"))

## extract GADSdat with specific variables
gads3 <- extractVars(gads, vars = c("ID", "citizenship1"))
## create an example GADSdat
example_df <- data.frame(ID = 1:4,
                        age = c(12, 14, 16, 13),
                        citizenship1 = c("German", "English", "Polish", "Chinese"),
                        citizenship2 = c(NA, "German", "Chinese", "Polish"),
                        stringsAsFactors = TRUE)
gads <- import_DF(example_df)

## remove variables from GADSdat
gads2 <- removeVars(gads, vars = c("citizenship2", "age"))

## extract GADSdat with specific variables
gads3 <- extractVars(gads, vars = c("ID", "citizenship1"))

Transform a factor variable to dummy variables.

Description

Convert a factor variable with n levels to n dummy variables.

Usage

fac2dummies(GADSdat, var)
fac2dummies(GADSdat, var)

Arguments

`GADSdat`	A `data.frame` or `GADSdat` object.
`var`	A character vector with the name of the factor variable.

Details

Newly created variables are named as the original variable with the suffix "_a", "_b" and so on. Variable labels are created by using the original variable label (if available) and adding the value label of the corresponding level. All missing codes are forwarded to all dummy variables.

Value

Returns a GADSdat containing the newly computed variables.

Examples

## create an example GADSdat
suppressMessages(gads <- import_DF(iris))

## transform factor variable
gads2 <- fac2dummies(gads, var = "Species")


## create an example GADSdat
suppressMessages(gads <- import_DF(iris))

## transform factor variable
gads2 <- fac2dummies(gads, var = "Species")

Transform a complex factor variable to dummy variables.

Description

Convert a factor variable with complex factor levels (factor levels contain combinations of other factor levels) to dummy variables. Dummy variables are coded 1 ("yes") and 0 ("no").

Usage

fac2dummies_complex(GADSdat, var)
fac2dummies_complex(GADSdat, var)

Arguments

`GADSdat`	A `data.frame` or `GADSdat` object.
`var`	A character vector with the name of the factor variable.

Details

The basic functionality of this function is analogous to fac2dummies. However, the function expects factor levels to only go to 9. Higher numbers are treated as combinations of factor levels, for example "13" as "1" and "3".

Value

Returns a GADSdat containing the newly computed variables.

Examples

## create an example GADSdat
df_fac <- data.frame(id = 1:6, fac = c("Opt a", "Opt c, Opt b", "Opt c",
"Opt b", "Opt a, Opt b", "Opt a, Opt b, Opt c"), stringsAsFactors = TRUE)
g_fac <- import_DF(df_fac)
g_fac <- recodeGADS(g_fac, varName = "fac", oldValues = c(1, 2, 3, 4, 5, 6),
                     newValues = c(1, 12, 123, 2, 3, 23))

## transform factor variable
fac2dummies_complex(g_fac, "fac")


## create an example GADSdat
df_fac <- data.frame(id = 1:6, fac = c("Opt a", "Opt c, Opt b", "Opt c",
"Opt b", "Opt a, Opt b", "Opt a, Opt b, Opt c"), stringsAsFactors = TRUE)
g_fac <- import_DF(df_fac)
g_fac <- recodeGADS(g_fac, varName = "fac", oldValues = c(1, 2, 3, 4, 5, 6),
                     newValues = c(1, 12, 123, 2, 3, 23))

## transform factor variable
fac2dummies_complex(g_fac, "fac")

Fill imputed values.

Description

Fill imputed values in a imputed GADSdat_imp object with original, not imputed values from a GADSdat.

Usage

fillImputations(GADSdat, GADSdat_imp, varName, varName_imp = varName, id, imp)
fillImputations(GADSdat, GADSdat_imp, varName, varName_imp = varName, id, imp)

Arguments

`GADSdat`	A `GADSdat` object.
`GADSdat_imp`	A `GADSdat` object.
`varName`	A character vector of length 1 containing the variable name in `GADSdat`.
`varName_imp`	A character vector of length 1 containing the variable name in `GADSdat_imp`.
`id`	A character vector of length 1 containing the unique identifier column of both `GADSdat`.
`imp`	A character vector of length 1 containing the imputation number in `GADSdat_imp`.

Details

This function only fills in missing values in the imputed variable from the not imputed variable. It provides parts of the functionality of subImputations but does not check whether values have been mistakenly imputed. However, performance is increased substantially.

Value

The modified GADSdat_imp..

Examples

# tbd

# tbd

Remove special characters.

Description

Remove special characters from a character vector or a GADSdat object. Also suitable to fix encoding problems of a character vector or a GADSdat object. See details for available options.

Usage

fixEncoding(x, input = c("other", "ASCII", "windows1250", "BRISE"))
fixEncoding(x, input = c("other", "ASCII", "windows1250", "BRISE"))

Arguments

`x`	A character vector or `GADSdat` object.
`input`	Which encoding was used in `import_spss`.

Details

The option "other" replaces correctly encoded special signs. The option "ASCII" works for strings which were encoded presumably using UTF-8 and imported using ASCII encoding. The option "windows1250" works for strings which were encoded presumably using UTF-8 and imported using windows-1250 encoding. The option "BRISE" covers a unique case used at the FDZ at IQB.

If entries are all upper case, special characters are also transformed to all upper case (e.g., "AE" instead of "Ae").

Value

The modified character vector or GADSdat object.

Examples

fixEncoding(c("\U00C4pfel", "\U00C4PFEL", paste0("\U00DC", "ben"), paste0("\U00DC", "BEN")))

fixEncoding(c("\U00C4pfel", "\U00C4PFEL", paste0("\U00DC", "ben"), paste0("\U00DC", "BEN")))

Extract table for Meta Data Changes.

Description

Function to obtain a data frame from a GADSdat object for for changes to meta data on variable or on value level.

Usage

getChangeMeta(GADSdat, level = "variable")
getChangeMeta(GADSdat, level = "variable")

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`level`	`'variable'` or `'value'`.

Details

Changes on variable level include variable names (varName), variable labels (varLabel), SPSS format ((format)) and display width (display_width). Changes on value level include values (value), value labels (valLabel) and missing codes (missings).

Value

Returns the meta data sheet for all variables including the corresponding change columns.

Examples

# For changes on variable level
varChangeTable <- getChangeMeta(pisa, level = "variable")

# For changes on value level
valChangeTable <- getChangeMeta(pisa, level = "value")

# For changes on variable level
varChangeTable <- getChangeMeta(pisa, level = "variable")

# For changes on value level
valChangeTable <- getChangeMeta(pisa, level = "value")

Get data from GADS data base.

Description

Extracts variables from a GADS data base. Only the specified variables are extracted. Note that this selection determines the format of the data.frame that is extracted.

Usage

getGADS(vSelect = NULL, filePath)
getGADS(vSelect = NULL, filePath)

Arguments

`vSelect`	Character vector of variable names.
`filePath`	Path of the existing `eatGADS` data base file.

Details

See createDB and dbPull for further explanation of the query and merging processes.

Value

Returns a GADSdat object.

Examples

# Use data base within package
db_path <- system.file("extdata", "pisa.db", package = "eatGADS")
pisa_gads <- getGADS(db_path, vSelect = c("schtype", "sameteach"))

# Use data base within package
db_path <- system.file("extdata", "pisa.db", package = "eatGADS")
pisa_gads <- getGADS(db_path, vSelect = c("schtype", "sameteach"))

Get data from GADS data base fast from server directory.

Description

Extracts variables from a eatGADS data base. Only the specified variables are extracted. Note that this selection determines the format of the data.frame that is extracted. CAREFUL: This function uses a local temporary directory to speed up loading the data base from a server and caches the data base locally for a running R session. The temporary data base is removed automatically when the running R session is terminated.

Usage

getGADS_fast(vSelect = NULL, filePath, tempPath = tempdir())
getGADS_fast(vSelect = NULL, filePath, tempPath = tempdir())

Arguments

`vSelect`	Character vector of variable names.
`filePath`	Path of the existing `eatGADS` data base file.
`tempPath`	Local directory in which the data base can temporarily be stored. Using the default is recommended.

Details

A random temporary directory is used for caching the data base and is removed, when the R sessions terminates. See createDB and dbPull for further explanation of the query and merging processes.

Value

Returns a GADSdat object.

Get data for trend reports.

Description

Extracts variables from multiple eatGADS data bases. Data can then be extracted from the GADSdat object via extractData. For extracting meta data from a data base or a GADSdat object see extractMeta. To speed up the data loading, getGADS_fast is used per default.

Usage

getTrendGADS(
  filePaths,
  vSelect = NULL,
  years,
  fast = TRUE,
  tempPath = tempdir(),
  verbose = TRUE
)
getTrendGADS(
  filePaths,
  vSelect = NULL,
  years,
  fast = TRUE,
  tempPath = tempdir(),
  verbose = TRUE
)

Arguments

`filePaths`	Character vectors with paths to the `eatGADS` db files.
`vSelect`	Variables from all GADS to be selected (as character vector).
`years`	A numeric vector with identical length as `filePaths`.
`fast`	Should `getGADS_fast` be used for data loading instead of `getGADS`? Using the default is heavily recommended.
`tempPath`	The directory, in which both GADS will be temporarily stored. Using the default is heavily recommended.
`verbose`	Should the loading process be reported?

Details

This function extracts data from multiple GADS data bases. All data bases have to be created via createGADS. The data bases are joined via rbind() and a variable year is added, corresponding to the argument years. The GADSdat object can then further be used via extractData. See createDB and dbPull for further explanation of the querying and merging processes.

Value

Returns a GADSdat object.

Examples

# See getGADS vignette


# See getGADS vignette

Get data for trend reports.

Description

Support for linking error data bases has been removed from eatGADS. getGADSold provides (for the time being) backwards compatibility, so linking errors can still be extracted automatically.

Usage

getTrendGADSOld(
  filePath1,
  filePath2,
  lePath = NULL,
  vSelect = NULL,
  years,
  fast = TRUE,
  tempPath = tempdir()
)
getTrendGADSOld(
  filePath1,
  filePath2,
  lePath = NULL,
  vSelect = NULL,
  years,
  fast = TRUE,
  tempPath = tempdir()
)

Arguments

`filePath1`	Path of the first `eatGADS` db file.
`filePath2`	Path of the second `eatGADS` db file.
`lePath`	Path of the linking error db file. If `NULL`, no linking errors are added to the data.
`vSelect`	Variables from both GADS to be selected (as character vector).
`years`	A numeric vector of length 2. The first elements corresponds to `filePath1`, the second element to `filePath2`.
`fast`	Should `getGADS_fast` be used for data loading instead of `getGADS`? Using the default is heavily recommended.
`tempPath`	The directory, in which both GADS will be temporarily stored. Using the default is heavily recommended.

Details

See getGADS for the current functionality.

Value

Returns a GADSdat object.

Examples

# See getGADS vignette


# See getGADS vignette

Import an object imported via `convertLabel`

Description

Function to import a data.frame object created by convertLabel for use in eatGADS. If possible, importing data via import_spss should always be preferred.

Usage

import_convertLabel(df, checkVarNames = TRUE)
import_convertLabel(df, checkVarNames = TRUE)

Arguments

`df`	A `data.frame`.
`checkVarNames`	Should variable names be checked for violations of `SQLite` and `R` naming rules?

Details

convertLabel from R package eatAnalysis converts an object imported via read.spss (from the foreign package) to a data.frame with factors and variable labels stored in variable attributes.

Value

Returns a list with the actual data dat and a data frame with all meta information in long format labels.

Import R `data.frame`

Description

Function to import a data.frame object for use in eatGADS while extracting value labels from factors.

Usage

import_DF(df, checkVarNames = TRUE)
import_DF(df, checkVarNames = TRUE)

Arguments

`df`	A `data.frame`.
`checkVarNames`	Should variable names be checked for violations of `SQLite` and `R` naming rules?

Details

Factors are integers with labeled variable levels. import_DF extracts these labels and stores them in a separate meta data data.frame. See import_spss for detailed information.

Value

Returns a list with the actual data dat and a data frame with all meta information in long format labels.

Examples

dat <- import_DF(iris, checkVarNames = FALSE)

# Inspect Meta data
extractMeta(dat)

# Extract Data
dat <- extractData(dat, convertLabels = "character")

dat <- import_DF(iris, checkVarNames = FALSE)

# Inspect Meta data
extractMeta(dat)

# Extract Data
dat <- extractData(dat, convertLabels = "character")

Import R data frame with explicit meta data sheets

Description

Function to import a data.frame object for use in eatGADS while adding explicit variable and value meta information through separate data.frames.

Usage

import_raw(df, varLabels, valLabels = NULL, checkVarNames = TRUE)
import_raw(df, varLabels, valLabels = NULL, checkVarNames = TRUE)

Arguments

`df`	A `data.frame`.
`varLabels`	A `data.frame` containing the variable labels. All variables in the data have to have exactly one column in this data.frame.
`valLabels`	A `data.frame` containing the value labels. All referenced variables have to appear in the data, but not all variables in the data have to receive value labels. Can be omitted.
`checkVarNames`	Should variable names be checked for violations of `SQLite` and `R` naming rules?

Details

The argument varLables has to contain exactly two variables, namely varName and varLabel. valLables has to contain exactly four variables, namely varName, value, valLabel and missings. The column value can only contain numerical values. The column missings can only contain the values "valid" and "miss". Variables of type factor are not supported in any of the data.frames.

Value

Returns a list with the actual data dat and with all meta information in long format labels.

Examples

dat <- data.frame(ID = 1:5, grade = c(1, 1, 2, 3, 1))
varLabels <- data.frame(varName = c("ID", "grade"),
                       varLabel = c("Person Identifier", "School grade Math"),
                       stringsAsFactors = FALSE)
valLabels <- data.frame(varName = c("grade", "grade", "grade"),
                       value = c(1, 2, 3),
                       valLabel = c("very good", "good", "sufficient"),
                       missings = c("valid", "valid", "valid"),
                       stringsAsFactors = FALSE)

gads <- import_raw(df = dat, varLabels = varLabels, valLabels = valLabels, checkVarNames = FALSE)

# Inspect Meta data
extractMeta(gads)

# Extract Data
dat <- extractData(gads, convertLabels = "character")

dat <- data.frame(ID = 1:5, grade = c(1, 1, 2, 3, 1))
varLabels <- data.frame(varName = c("ID", "grade"),
                       varLabel = c("Person Identifier", "School grade Math"),
                       stringsAsFactors = FALSE)
valLabels <- data.frame(varName = c("grade", "grade", "grade"),
                       value = c(1, 2, 3),
                       valLabel = c("very good", "good", "sufficient"),
                       missings = c("valid", "valid", "valid"),
                       stringsAsFactors = FALSE)

gads <- import_raw(df = dat, varLabels = varLabels, valLabels = valLabels, checkVarNames = FALSE)

# Inspect Meta data
extractMeta(gads)

# Extract Data
dat <- extractData(gads, convertLabels = "character")

Import R data frame with a explicit meta data sheet

Description

Function to create a GADSdat object based on a dat data.frame and a labels data.frame.

Usage

import_raw2(dat, labels)
import_raw2(dat, labels)

Arguments

`dat`	A `dat` `data.frame` containing all actual data.
`labels`	A `labels` `data.frame` containing all meta data.

Details

A GADSdat is basically a list with two elements: a dat and a labels data.frame. If these elements are separated, they can be cleanly tied together again by import_raw2. The function performs extensive checks on the integrity of the resulting GADSdat object. See import_spss and import_raw for further details.

Value

Returns a GADSdat object.

Examples

dat <- data.frame(ID = 1:5, grade = c(1, 1, 2, 3, 1))
varLabels <- data.frame(varName = c("ID", "grade"),
                       varLabel = c("Person Identifier", "School grade Math"),
                       stringsAsFactors = FALSE)
valLabels <- data.frame(varName = c("grade", "grade", "grade"),
                       value = c(1, 2, 3),
                       valLabel = c("very good", "good", "sufficient"),
                       missings = c("valid", "valid", "valid"),
                       stringsAsFactors = FALSE)

gads <- import_raw(df = dat, varLabels = varLabels, valLabels = valLabels, checkVarNames = FALSE)

# separate the GADSdat object
dat <- gads$dat
labels <- gads$labels

# rejoin it
dat <- import_raw2(dat, labels)

dat <- data.frame(ID = 1:5, grade = c(1, 1, 2, 3, 1))
varLabels <- data.frame(varName = c("ID", "grade"),
                       varLabel = c("Person Identifier", "School grade Math"),
                       stringsAsFactors = FALSE)
valLabels <- data.frame(varName = c("grade", "grade", "grade"),
                       value = c(1, 2, 3),
                       valLabel = c("very good", "good", "sufficient"),
                       missings = c("valid", "valid", "valid"),
                       stringsAsFactors = FALSE)

gads <- import_raw(df = dat, varLabels = varLabels, valLabels = valLabels, checkVarNames = FALSE)

# separate the GADSdat object
dat <- gads$dat
labels <- gads$labels

# rejoin it
dat <- import_raw2(dat, labels)

Import `RDS` file

Description

Function to import a data.frame stored as a .RDS file while extracting value labels from factors.

Usage

import_RDS(filePath, checkVarNames = TRUE)
import_RDS(filePath, checkVarNames = TRUE)

Arguments

`filePath`	Source file location, ending on `.RDS`.
`checkVarNames`	Should variable names be checked for violations of `SQLite` and `R` naming rules?

Details

Factors are integers with labeled variable levels. import_RDS extracts these labels and stores them in a separate meta data data.frame. See import_DF for detailed information. This function is a wrapper around import_DF.

Value

Returns a list with the actual data dat and a data frame with all meta information in long format labels.

Import SPSS data

Description

Function to import .sav files while extracting meta information, e.g. variable and value labels.

Usage

import_spss(
  filePath,
  checkVarNames = TRUE,
  labeledStrings = c("drop", "keep", "transform"),
  encoding = NULL
)
import_spss(
  filePath,
  checkVarNames = TRUE,
  labeledStrings = c("drop", "keep", "transform"),
  encoding = NULL
)

Arguments

`filePath`	Source file location, ending on `.sav`.
`checkVarNames`	Should variable names be checked for violations of `SQLite` and `R` naming rules?
`labeledStrings`	Should strings as labeled values be allowed? If `"drop"` (default), all labeled strings are dropped and `NAs` occur in the meta data. If `"transform"`, all underlying values are transformed to numeric. If `"keep"`, value labels stay untouched. However, the latter possibly corrupts all labeled values.
`encoding`	The character encoding used for the file. The default, `NULL`, use the encoding specified in the file, but sometimes this value is incorrect and it is useful to be able to override it.

Details

SPSS files (.sav) store variable and value labels and assign specific formatting to variables. import_spss imports data from SPSS, while storing this meta-information separately in a long format data frame. Value labels and missing labels are used to identify missing values (see checkMissings). Time and date variables are converted to character.

In some special cases, .sav files seem to consist of a mix of different encoding types. In such cases, haven might throw an error if the encoding argument is not specified or UTF-8 is selected as encoding. To circumvent this problem we recommend using encoding = "ASCII" and fixing the resulting issues manually. For example, fixEncoding provides some fixes for encoding issues specific to the German language.

Value

Returns a list with the actual data dat and a data frame with all meta information in long format labels.

Examples

# Use spss data from within package
spss_path <- system.file("extdata", "pisa.zsav", package = "eatGADS")
pisa_gads <- import_spss(spss_path)

# Use spss data from within package
spss_path <- system.file("extdata", "pisa.zsav", package = "eatGADS")
pisa_gads <- import_spss(spss_path)

Import `Stata` data

Description

Function to import .dta files while extracting meta information, e.g. variable and value labels.

Usage

import_stata(filePath, checkVarNames = TRUE, labeledStrings = FALSE)
import_stata(filePath, checkVarNames = TRUE, labeledStrings = FALSE)

Arguments

`filePath`	Source file location, ending on `.dta`.
`checkVarNames`	Should variable names be checked for violations of `SQLite` and `R` naming rules?
`labeledStrings`	Should strings as labeled values be allowed? This possibly corrupts all labeled values.

Details

Stata files (.dta) store variable and value labels and assign specific formatting to variables. import_stata imports data from Stata, while storing this meta-information separately in a long format data frame. Time and date variables are converted to character.

Value

Returns a list with the actual data dat and a data frame with all meta information in long format labels.

Import `tibble`

Description

Function to import a tibble while extracting meta information, e.g. variable and value labels.

Usage

import_tibble(
  tibble,
  checkVarNames = TRUE,
  labeledStrings = c("drop", "keep", "transform")
)
import_tibble(
  tibble,
  checkVarNames = TRUE,
  labeledStrings = c("drop", "keep", "transform")
)

Arguments

`tibble`	A `tibble` object.
`checkVarNames`	Should variable names be checked for violations of `SQLite` and `R` naming rules?
`labeledStrings`	Should strings as labeled values be allowed? If `"drop"` (default), all labeled strings are dropped and `NAs` occur in the meta data. If `"transform"`, all underlying values are transformed to numeric. If `"keep"`, value labels stay untouched. However, the latter possibly corrupts all labeled values.

Details

Tibbles may store variable and value labels as well as missing tags via the labelled class. import_tibble restructures this meta information separately in a long format data.frame. Value labels and missing tags are used to identify missing tags (see checkMissings). Time and date variables are converted to character.

Value

Returns a list with the actual data dat and a data frame with all meta information in long format labels.

Examples

# Use spss data from within package
spss_path <- system.file("extdata", "pisa.zsav", package = "eatGADS")
pisa_gads <- import_spss(spss_path)

# Use spss data from within package
spss_path <- system.file("extdata", "pisa.zsav", package = "eatGADS")
pisa_gads <- import_spss(spss_path)

Reorder a single variable in a `GADSdat`.

Description

Deprecated. Please use relocateVariable instead.

Usage

insertVariable(GADSdat, var, after = NULL)
insertVariable(GADSdat, var, after = NULL)

Arguments

`GADSdat`	A `GADSdat` object.
`var`	Character string of the variable name which should be sorted.
`after`	Character string of the variable name after which `var` should be inserted. If `NULL`, `var` is inserted at the beginning of the `GADSdat`.

Inspect differences in a variable.

Description

Inspect differences within a single GADSdat or between two GADSdat objects for a specific variable.

Usage

inspectDifferences(
  GADSdat,
  varName,
  other_GADSdat = GADSdat,
  other_varName = varName,
  id
)
inspectDifferences(
  GADSdat,
  varName,
  other_GADSdat = GADSdat,
  other_varName = varName,
  id
)

Arguments

`GADSdat`	A `GADSdat` object.
`varName`	A character vector of length 1 containing the variable name.
`other_GADSdat`	A second `GADSdat` object. If omitted, it is assumed that both variables are part of the first `GADSdat`.
`other_varName`	A character vector of length 1 containing the other variable name. If omitted, it is assumed that both variables have identical names (as supplied in `varName`).
`id`	A character vector of length 1 containing the unique identifier column of both `GADSdat`.

Details

Two GADSdat objects can be compared using equalGADS. If differences in the data for specific variables in the two objects occur, these variables can be further inspected using inspectDifferences. Differences on meta data-level can be inspected via inspectMetaDifferences.

Value

A list.

Examples

# create a second GADS with different data
pisa2 <- pisa
pisa2$dat$age[400:nrow(pisa$dat)] <- sample(pisa2$dat$age[400:nrow(pisa$dat)])

# inspect via equalGADS()
equalGADS(pisa, pisa2)

# inspect via inspectDifferences()
inspectDifferences(GADSdat = pisa, varName = "age", other_GADSdat = pisa2, id = "idstud")

# create a second GADS with different data
pisa2 <- pisa
pisa2$dat$age[400:nrow(pisa$dat)] <- sample(pisa2$dat$age[400:nrow(pisa$dat)])

# inspect via equalGADS()
equalGADS(pisa, pisa2)

# inspect via inspectDifferences()
inspectDifferences(GADSdat = pisa, varName = "age", other_GADSdat = pisa2, id = "idstud")

Inspect meta data differences in a variable.

Description

Inspect meta data differences within a single GADSdat or between two GADSdat objects or GADSdat data bases regarding a specific variable.

Usage

inspectMetaDifferences(
  GADSdat,
  varName,
  other_GADSdat = GADSdat,
  other_varName = varName
)
inspectMetaDifferences(
  GADSdat,
  varName,
  other_GADSdat = GADSdat,
  other_varName = varName
)

Arguments

`GADSdat`	A `GADSdat` object.
`varName`	A character vector of length 1 containing the variable name.
`other_GADSdat`	A second `GADSdat` object. If omitted, it is assumed that both variables are part of the first `GADSdat`.
`other_varName`	A character vector of length 1 containing the other variable name. If omitted, it is assumed that both variables have identical names (as supplied in `varName`).

Details

Two GADSdat objects can be compared using equalGADS. If meta data differences for specific variables in the two objects occur, these variables can be further inspected using inspectMetaDifferences. For data-level differences for a specific variable, see inspectDifferences.

Value

A list.

Examples

# create a second GADS with different meta data
pisa2 <- pisa
pisa2 <- changeVarLabels(pisa2, varName = "sameteach", varLabel = "Same math teacher")
pisa2 <- recodeGADS(pisa2, varName = "sameteach", oldValues = c(1, 2), newValues = c(0, 1))

# inspect via equalGADS()
equalGADS(pisa, pisa2)

# inspect via inspectMetaDifferences()
inspectMetaDifferences(GADSdat = pisa, varName = "sameteach", other_GADSdat = pisa2)

# create a second GADS with different meta data
pisa2 <- pisa
pisa2 <- changeVarLabels(pisa2, varName = "sameteach", varLabel = "Same math teacher")
pisa2 <- recodeGADS(pisa2, varName = "sameteach", oldValues = c(1, 2), newValues = c(0, 1))

# inspect via equalGADS()
equalGADS(pisa, pisa2)

# inspect via inspectMetaDifferences()
inspectMetaDifferences(GADSdat = pisa, varName = "sameteach", other_GADSdat = pisa2)

Labels from relational `eatGADS` data base.

Description

Returns the variable and value labels of all variables in the eatGADS data base.

Usage

labelsGADS(filePath)
labelsGADS(filePath)

Arguments

filePath

Path of the existing eatGADS data base.

Details

Variable, value and missing labels as stored in the original SPSS-files and factors from R files are converted to long format for storage in the data base. labelsGADS returns them as a long format data frame.

Value

Returns a long format data frame including variable names, labels, values, value labels and missing labels.

Examples

# Extract Meta data from data base
db_path <- system.file("extdata", "pisa.db", package = "eatGADS")
metaData <- labelsGADS(db_path)

# Extract Meta data from data base
db_path <- system.file("extdata", "pisa.db", package = "eatGADS")
metaData <- labelsGADS(db_path)

Match regular expressions and variable names.

Description

Using variable labels, matchValues_varLabels matches a vector of regular expressions to a set of variable names.

Usage

matchValues_varLabels(GADSdat, mc_vars, values, label_by_hand = character(0))
matchValues_varLabels(GADSdat, mc_vars, values, label_by_hand = character(0))

Arguments

`GADSdat`	A `GADSdat` object.
`mc_vars`	A vector containing the names of the variables, which should be matched according to their variable labels.
`values`	A character vector containing the regular expressions for which the `varLabel` column should be searched.
`label_by_hand`	Additional value - `mc_var` pairs. Necessary, if for some `mc_vars` no value exists.

Details

Multiple choice items can be stored as multiple dichotomous variables with the information about the variable stored in the variable labels. The function collapseMultiMC_Text can be used to collapse such dichotomous variables and a character variable, but requires a character vector with variables names of the multiple choice variables. matchValues_varLabels creates such a vector based on matching regular expressions (values) to variable labels.

Note that all variables in mc_vars have to be assigned exactly one value (and vice versa). If a variable name is missing in the output, an error will be thrown. In this case, the label_by_hand argument should be used to specify the regular expression variable name pair manually.

Value

Returns a named character vector. Values of the vector are the variable names in the GADSdat, names of the vector are the regular expressions.

Examples

# Prepare example data
mt2 <- data.frame(ID = 1:4, mc1 = c(1, 0, 0, 0), mc2 = c(0, 0, 0, 0), mc3 = c(0, 1, 1, 0),
                  text1 = c(NA, "Eng", "Aus", "Aus2"), text2 = c(NA, "Franz", NA, NA),
                  stringsAsFactors = FALSE)

mt2_gads <- import_DF(mt2)

mt3_gads <- changeVarLabels(mt2_gads, varName = c("mc1", "mc2", "mc3"),
                           varLabel = c("Lang: Eng", "Aus spoken", "other"))

out <- matchValues_varLabels(mt3_gads, mc_vars = c("mc1", "mc2", "mc3"),
                            values = c("Aus", "Eng", "Eng"),
                            label_by_hand = c("other" = "mc3"))

# Prepare example data
mt2 <- data.frame(ID = 1:4, mc1 = c(1, 0, 0, 0), mc2 = c(0, 0, 0, 0), mc3 = c(0, 1, 1, 0),
                  text1 = c(NA, "Eng", "Aus", "Aus2"), text2 = c(NA, "Franz", NA, NA),
                  stringsAsFactors = FALSE)

mt2_gads <- import_DF(mt2)

mt3_gads <- changeVarLabels(mt2_gads, varName = c("mc1", "mc2", "mc3"),
                           varLabel = c("Lang: Eng", "Aus spoken", "other"))

out <- matchValues_varLabels(mt3_gads, mc_vars = c("mc1", "mc2", "mc3"),
                            values = c("Aus", "Eng", "Eng"),
                            label_by_hand = c("other" = "mc3"))

Merge two `GADSdat` objects into a single `GADSdat` object.

Description

Is a secure way to merge the data and the meta data of two GADSdat objects. Currently, only limited merging options are supported.

Usage

## S3 method for class 'GADSdat'
merge(
  x,
  y,
  by,
  all = TRUE,
  all.x = all,
  all.y = all,
  missingValue = NULL,
  missingValLabel = NULL,
  ...
)
## S3 method for class 'GADSdat'
merge(
  x,
  y,
  by,
  all = TRUE,
  all.x = all,
  all.y = all,
  missingValue = NULL,
  missingValLabel = NULL,
  ...
)

Arguments

`x`	`GADSdat` object imported via `eatGADS`.
`y`	`GADSdat` object imported via `eatGADS`.
`by`	A character vector.
`all`	A character vector (either a full join or an inner join).
`all.x`	See merge.
`all.y`	See merge.
`missingValue`	A numeric value that is used to replace missing values introduced through the merge.
`missingValLabel`	The value label that is assigned to all variables into which `missingValue` is inserted.
`...`	Further arguments are currently not supported but have to be included for `R CMD` checks.

Details

If there are duplicate variables (except the variables specified in the by argument), these variables are removed from y. The meta data is joined for the remaining variables via rbind.

The function supports automatically recoding missing values created through merging with a designated missing code (missingValue) and a value label (missingValLabel).

Value

Returns a GADSdat object.

Prepare data and metadata

Description

Transform multiple GADSdat objects into a list ready for data base creation.

Usage

mergeLabels(...)
mergeLabels(...)

Arguments

...

GADSdat objects, as named arguments in the correct merge order.

Details

The function createGADS takes multiple GADSdat objects as input. The function preserves the ordering in which the objects are supplied, which is then used for the merging order in createGADS. Additionally, the separate lists of meta information for each GADSdat are merged and a data frame unique identifier is added.

Value

Returns an all_GADSdat object, which consists of list with a list of all data frames "datList" and a single data frame containing all meta data information "allLabels".

Examples

# see createGADS vignette

# see createGADS vignette

Recode Missings to `NA`

Description

Recode Missings to NA according to missing labels in label data.frame.

Usage

miss2NA(GADSdat)
miss2NA(GADSdat)

Arguments

GADSdat

A GADSdat object.

Details

Missings are imported as their values via import_spss. Using the value labels in the labels data.frame, miss2NA recodes these missings codes to NA. This function is mainly intended for internal use.

Value

Returns a data.frame with NA instead of missing codes.

Transform one or multiple character variables to factor.

Description

Convert one or multiple character variables to factors. If multiple variables are converted, a common set of value labels is created, which is identical across variables. Existing value labels are preserved.

Usage

multiChar2fac(
  GADSdat,
  vars,
  var_suffix = "_r",
  label_suffix = "(recoded)",
  convertCases = NULL
)
multiChar2fac(
  GADSdat,
  vars,
  var_suffix = "_r",
  label_suffix = "(recoded)",
  convertCases = NULL
)

Arguments

`GADSdat`	A `data.frame` or `GADSdat` object.
`vars`	A character vector with all variables that should be transformed to factor.
`var_suffix`	Variable suffix for the newly created `GADSdat`. If an empty character, the existing variables are overwritten.
`label_suffix`	Suffix added to variable label for the newly created variable in the `GADSdat`.
`convertCases`	Should cases be transformed for all variables? Default `NULL` leaves cases as they are. Available options for converting cases are all lower case (`'lower'`), all upper case (`'upper'`), or first letter upper case, everything else lower case (`'upperFirst'`).

Details

If a set of variables has the same possible values, it is desirable that these variables share the same value labels, even if some of the values do not occur on the individual variables. This function allows the transformation of multiple character variables to factors while assimilating the value labels. The SPSS format of the newly created variables is set to F10.0.

A current limitation of the function is that prior to the conversion, all variables specified in vars must have identical meta data on value level (value labels and missing tags).

If necessary, missing codes can be set after transformation via checkMissings for setting missing codes depending on value labels for all variables or changeMissings for setting missing codes for specific values in a specific variable.

The argument convertCases uses the function convertCase internally. See the respective documentation for more details.

Value

Returns a GADSdat containing the newly computed variable.

Examples

## create an example GADSdat
example_df <- data.frame(ID = 1:4,
                        citizenship1 = c("German", "English", "missing by design", "Chinese"),
                        citizenship2 = c("missing", "German", "missing by design", "Polish"),
                        stringsAsFactors = FALSE)
gads <- import_DF(example_df)

## transform one character variable
gads2 <- multiChar2fac(gads, vars = "citizenship1")

## transform multiple character variables
gads2 <- multiChar2fac(gads, vars = c("citizenship1", "citizenship2"))

## set values to missings
gads3 <- checkMissings(gads2, missingLabel = "missing")

## create an example GADSdat
example_df <- data.frame(ID = 1:4,
                        citizenship1 = c("German", "English", "missing by design", "Chinese"),
                        citizenship2 = c("missing", "German", "missing by design", "Polish"),
                        stringsAsFactors = FALSE)
gads <- import_DF(example_df)

## transform one character variable
gads2 <- multiChar2fac(gads, vars = "citizenship1")

## transform multiple character variables
gads2 <- multiChar2fac(gads, vars = c("citizenship1", "citizenship2"))

## set values to missings
gads3 <- checkMissings(gads2, missingLabel = "missing")

Variables names of a GADS.

Description

Variables names of a GADSdat object, a all_GADSdat object or a eatGADS data base.

Usage

namesGADS(GADS)
namesGADS(GADS)

Arguments

GADS

A GADSdat object, a all_GADSdat or the path to an existing eatGADS data base.

Details

If the function is applied to a GADSdat object, a character vector with all variable names is returned. If the function is applied to a all_GADSdat object or to the path of a eatGADS data base, a named list is returned. Each list entry represents a data table in the object.

Value

Returns a character vector or a named list of character vectors.

Examples

# Extract variable names from data base
db_path <- system.file("extdata", "pisa.db", package = "eatGADS")
namesGADS(db_path)

# Extract variable names  from loaded/imported GADS
namesGADS(pisa)

# Extract variable names from data base
db_path <- system.file("extdata", "pisa.db", package = "eatGADS")
namesGADS(db_path)

# Extract variable names  from loaded/imported GADS
namesGADS(pisa)

Order the variables in a `GADSdat`.

Description

Order the variables in a GADSdat according to a character vector. If there are discrepancies between the two sets, a warning is issued.

Usage

orderLike(GADSdat, newOrder)
orderLike(GADSdat, newOrder)

Arguments

`GADSdat`	A `GADSdat` object.
`newOrder`	A character vector containing the order of variables.

Details

The variables in the dat and in the labels section are ordered. Variables not contained in the character vector are moved to the end of the data.

Value

Returns a GADSdat object.

PISA Plus Example Data

Description

A small example data set from the German PISA Plus campus files as distributed by the Forschungsdatenzentrum, IQB.

Usage

pisa
pisa

Format

A data.frame with 500 rows and 133 variables, including:

idstud: Person ID variable
idschool: School ID variable
schtype: School type
...

Source

Research Data Center at the Institute for Educational Quality Improvement (2020). Programme for International Student Assessment - Plus 2012, 2013 (PISA Plus 2012-2013) - Campus File (Version 1) [Data set]. Berlin: Institute for Educational Quality Improvement. doi:10.5159/IQB_PISA_Plus_2012-13_CF_v1

Recode values to `NA`.

Description

Recode multiple values in multiple variables in a GADSdat to NA.

Usage

recode2NA(GADSdat, recodeVars = namesGADS(GADSdat), value = "")
recode2NA(GADSdat, recodeVars = namesGADS(GADSdat), value = "")

Arguments

`GADSdat`	A `GADSdat` object.
`recodeVars`	Character vector of variable names which should be recoded.
`value`	Which values should be recoded to `NA`?

Details

If there are value labels given to the specified value, a warning is issued. Number of recodes per variable are reported.

If a data set is imported from .sav, character variables frequently contain empty strings. Especially if parts of the data are written to .xlsx, this can cause problems (e.g. as look up tables from createLookup), as most function which write to .xlsx convert empty strings to NAs. recodeString2NA can be used to recode all empty strings to NA beforehand.

Value

Returns the recoded GADSdat.

Examples

# create example GADS
dat <- data.frame(ID = 1:4, var1 = c("", "Eng", "Aus", "Aus2"),
                  var2 = c("", "French", "Ger", "Ita"),
                  stringsAsFactors = FALSE)
gads <- import_DF(dat)

# recode empty strings
gads2 <- recode2NA(gads)

# recode numeric value
gads3 <- recode2NA(gads, recodeVars = "ID", value = 1:3)


# create example GADS
dat <- data.frame(ID = 1:4, var1 = c("", "Eng", "Aus", "Aus2"),
                  var2 = c("", "French", "Ger", "Ita"),
                  stringsAsFactors = FALSE)
gads <- import_DF(dat)

# recode empty strings
gads2 <- recode2NA(gads)

# recode numeric value
gads3 <- recode2NA(gads, recodeVars = "ID", value = 1:3)

Recode variables.

Description

Recode one or multiple variables as part of a GADSdat or all_GADSdat object.

Usage

recodeGADS(
  GADSdat,
  varName,
  oldValues,
  newValues,
  existingMeta = c("stop", "value", "value_new", "drop", "ignore")
)
recodeGADS(
  GADSdat,
  varName,
  oldValues,
  newValues,
  existingMeta = c("stop", "value", "value_new", "drop", "ignore")
)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`varName`	Name of the variable to be recoded.
`oldValues`	Vector containing the old values.
`newValues`	Vector containing the new values (in the respective order as `oldValues`).
`existingMeta`	If values are recoded, which meta data should be used (see details)?

Details

Applied to a GADSdat or all_GADSdat object, this function is a wrapper of getChangeMeta and applyChangeMeta. Beyond that, unlabeled variables and values are recoded as well. oldValues and newValues are matched by ordering in the function call.

If changes are performed on value levels, recoding into existing values can occur. In these cases, existingMeta determines how the resulting meta data conflicts are handled, either raising an error if any occur ("stop"), keeping the original meta data for the value ("value"), using the meta data in the changeTable and, if incomplete, from the recoded value ("value_new"), or leaving the respective meta data untouched ("ignore").

Missing values (NA) are supported in oldValues but not in newValues. For recoding values to NA see recode2NA instead. For recoding character variables, using lookup tables via createLookup is recommended. For changing value labels see changeValLabels.

Value

Returns a GADSdat.

Examples

# Example gads
example_df <- data.frame(ID = 1:5, color = c("blue", "blue", "green", "other", "other"),
                        animal = c("dog", "Dog", "cat", "hors", "horse"),
                        age = c(NA, 16, 15, 23, 50),
                        stringsAsFactors = FALSE)
example_df$animal <- as.factor(example_df$animal)
gads <- import_DF(example_df)

# simple recode
gads2 <- recodeGADS(gads, varName = "animal",
                   oldValues = c(3, 4), newValues = c(7, 8))
# Example gads
example_df <- data.frame(ID = 1:5, color = c("blue", "blue", "green", "other", "other"),
                        animal = c("dog", "Dog", "cat", "hors", "horse"),
                        age = c(NA, 16, 15, 23, 50),
                        stringsAsFactors = FALSE)
example_df$animal <- as.factor(example_df$animal)
gads <- import_DF(example_df)

# simple recode
gads2 <- recodeGADS(gads, varName = "animal",
                   oldValues = c(3, 4), newValues = c(7, 8))

Recode `NAs` to Missing.

Description

Recode NAs in multiple variables in a GADSdat to a numeric value with a value label and a missing tag.

Usage

recodeNA2missing(
  GADSdat,
  recodeVars = namesGADS(GADSdat),
  value = -99,
  valLabel = "missing"
)
recodeNA2missing(
  GADSdat,
  recodeVars = namesGADS(GADSdat),
  value = -99,
  valLabel = "missing"
)

Arguments

`GADSdat`	A `GADSdat` object.
`recodeVars`	Character vector of variable names which should be recoded.
`value`	Which value should `NAs` be recoded to?
`valLabel`	Which value label should `value` be assigned?

Details

The value label and missing tag are only added to variables which contain NAs and which have been recoded. If a variable has an existing value label for value, the existing value label is overwritten and a missing tag is added. A corresponding warning is issued.

Value

Returns the recoded GADSdat.

Examples

# create example GADS
dat <- data.frame(ID = 1:4, age = c(NA, 18, 21, 23),
                  siblings = c(0, 2, NA, NA))
gads <- import_DF(dat)

# recode NAs
gads2 <- recodeNA2missing(gads)
# create example GADS
dat <- data.frame(ID = 1:4, age = c(NA, 18, 21, 23),
                  siblings = c(0, 2, NA, NA))
gads <- import_DF(dat)

# recode NAs
gads2 <- recodeNA2missing(gads)

Recode a string to `NA`.

Description

Deprecated, use recode2NA instead..

Usage

recodeString2NA(GADSdat, recodeVars = namesGADS(GADSdat), string = "")
recodeString2NA(GADSdat, recodeVars = namesGADS(GADSdat), string = "")

Arguments

`GADSdat`	A `GADSdat` object.
`recodeVars`	Character vector of variable names which should be recoded.
`string`	Which string should be recoded to `NA`?

Value

Returns the recoded GADSdat.

Reorder a single variable in a `GADSdat`.

Description

Reorder a single variable in a GADSdat. The variable (var) can be inserted right after another variable (after) or at the beginning of the GADSdat via after = NULL.

Usage

relocateVariable(GADSdat, var, after = NULL)
relocateVariable(GADSdat, var, after = NULL)

Arguments

`GADSdat`	A `GADSdat` object.
`var`	Character string of the variable name which should be sorted.
`after`	Character string of the variable name after which `var` should be inserted. If `NULL`, `var` is inserted at the beginning of the `GADSdat`.

Details

The variables in the dat and in the labels section are ordered. For reordering the whole GADSdat, see orderLike.

Value

Returns a GADSdat object.

Examples

# Insert variable 'migration' after variable 'idclass'
pisa2 <- relocateVariable(pisa, var = "migration", after = "idclass")

# Insert variable 'idclass' at the beginning of the data set
pisa2 <- relocateVariable(pisa, var = "idclass", after = NULL)
# Insert variable 'migration' after variable 'idclass'
pisa2 <- relocateVariable(pisa, var = "migration", after = "idclass")

# Insert variable 'idclass' at the beginning of the data set
pisa2 <- relocateVariable(pisa, var = "idclass", after = NULL)

Shorten multiple text variables while giving NA codes.

Description

Shorten text variables from a certain number on while coding overflowing answers as complete missings.

Usage

remove2NAchar(GADSdat, vars, max_num = 2, na_value, na_label)
remove2NAchar(GADSdat, vars, max_num = 2, na_value, na_label)

Arguments

`GADSdat`	A `GADSdat` object.
`vars`	A character vector with the names of the text variables.
`max_num`	Maximum number of text variables. Additional text variables will be removed and NA codes given accordingly.
`na_value`	Which NA value should be given in cases of too many values on text variables.
`na_label`	Which value label should be given to the `na_value`.

Details

In some cases, multiple text variables contain the information of one variable (e.g. multiple answers to an open item). If this is a case, sometimes the number text variables displaying this variable should be limited. remove2NAchar allows shortening multiple character variables, this means character variables after max_num are removed from the GADSdat. Cases, which had valid responses on these removed variables are coded as missings (using na_value and na_label).

Value

Returns the modified GADSdat.

Examples

## create an example GADSdat
example_df <- data.frame(ID = 1:4,
                        citizenship1 = c("German", "English", "missing by design", "Chinese"),
                        citizenship2 = c(NA, "German", "missing by design", "Polish"),
                        citizenship3 = c(NA, NA, NA, "German"),
                        stringsAsFactors = FALSE)
gads <- import_DF(example_df)

## shorten character variables
gads2 <- remove2NAchar(gads, vars = c("citizenship1", "citizenship2", "citizenship3"),
                      na_value = -99, na_label = "missing: too many answers")


## create an example GADSdat
example_df <- data.frame(ID = 1:4,
                        citizenship1 = c("German", "English", "missing by design", "Chinese"),
                        citizenship2 = c(NA, "German", "missing by design", "Polish"),
                        citizenship3 = c(NA, NA, NA, "German"),
                        stringsAsFactors = FALSE)
gads <- import_DF(example_df)

## shorten character variables
gads2 <- remove2NAchar(gads, vars = c("citizenship1", "citizenship2", "citizenship3"),
                      na_value = -99, na_label = "missing: too many answers")

Remove unused value labels and missing tags.

Description

Remove unused value labels and missing tags of a variable as part of a GADSdat object.

Usage

removeEmptyValLabels(GADSdat, vars, whichValLabels = c("miss", "valid", "all"))
removeEmptyValLabels(GADSdat, vars, whichValLabels = c("miss", "valid", "all"))

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`vars`	Character string of variable names.
`whichValLabels`	Should unused missing value tags and labels (`"miss"`), unused value labels for non-missing values (`"valid"`), or both (`"all"`) be removed?

Value

Returns the GADSdat object with changed meta data.

Examples

gads <- import_DF(data.frame(v1 = 1))
gads <- changeMissings(gads, varName = "v1", value = c(-99, -98), missings = c("miss", "miss"))
gads <- changeValLabels(gads, varName = "v1", value = c(-99), valLabel = c("not reached"))

gads2 <- removeEmptyValLabels(gads, vars = "v1")
gads <- import_DF(data.frame(v1 = 1))
gads <- changeMissings(gads, varName = "v1", value = c(-99, -98), missings = c("miss", "miss"))
gads <- changeValLabels(gads, varName = "v1", value = c(-99), valLabel = c("not reached"))

gads2 <- removeEmptyValLabels(gads, vars = "v1")

Remove value labels and missing tags.

Description

Remove meta data for specific values (value) of a single variable (varName). This includes value labels and missings tags.

Usage

removeValLabels(GADSdat, varName, value, valLabel = NULL)
removeValLabels(GADSdat, varName, value, valLabel = NULL)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`varName`	Character string of a variable name.
`value`	Numeric values.
`valLabel`	[optional] Regular expressions in the value labels corresponding to `value`.

Details

If the argument valLabel is provided the function checks for value and valLabel pairs in the meta data that match both arguments.

Value

Returns the GADSdat object with changed meta data.

Examples

# Remove a label based on value
extractMeta(pisa, "schtype")
pisa2 <- removeValLabels(pisa, varName = "schtype", value = 1)
extractMeta(pisa2, "schtype")

# Remove multiple labels based on value
extractMeta(pisa, "schtype")
pisa3 <- removeValLabels(pisa, varName = "schtype", value = 1:3)
extractMeta(pisa3, "schtype")

# Remove multiple labels based on value - valLabel combination
extractMeta(pisa, "schtype")
pisa4 <- removeValLabels(pisa, varName = "schtype",
                        value = 1:3, valLabel = c("Gymnasium", "other", "several courses"))
extractMeta(pisa4, "schtype")

# Remove a label based on value
extractMeta(pisa, "schtype")
pisa2 <- removeValLabels(pisa, varName = "schtype", value = 1)
extractMeta(pisa2, "schtype")

# Remove multiple labels based on value
extractMeta(pisa, "schtype")
pisa3 <- removeValLabels(pisa, varName = "schtype", value = 1:3)
extractMeta(pisa3, "schtype")

# Remove multiple labels based on value - valLabel combination
extractMeta(pisa, "schtype")
pisa4 <- removeValLabels(pisa, varName = "schtype",
                        value = 1:3, valLabel = c("Gymnasium", "other", "several courses"))
extractMeta(pisa4, "schtype")

Use meta data for variables from another `GADSdat`.

Description

Transfer meta information from one GADSdat to another for one or multiple variables.

Usage

reuseMeta(
  GADSdat,
  varName,
  other_GADSdat,
  other_varName = NULL,
  missingLabels = NULL,
  addValueLabels = FALSE
)
reuseMeta(
  GADSdat,
  varName,
  other_GADSdat,
  other_varName = NULL,
  missingLabels = NULL,
  addValueLabels = FALSE
)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`varName`	Character vector with the names of the variables that should get the new meta data.
`other_GADSdat`	`GADSdat` object imported via `eatGADS` including the desired meta information. Can either be a `GADSdat`, an `eatGADS` data base or an `all_GADSdat` object.
`other_varName`	Character vector with the names of the variables in `other_GADSdat` that contain the meta data which should be copied.
`missingLabels`	How should meta data for missing values be treated? If `NULL`, missing values are transferred as all other labels. If `"drop"`, missing labels are dropped (useful for imputed data). If `"leave"`, missing labels remain untouched. If `"only"`, all valid value labels are dropped.
`addValueLabels`	Should only value labels be added and all other meta information retained?

Details

Transfer of meta information can mean substituting the complete meta information, only adding value labels, adding only "valid" or adding only "miss" missing labels. See the arguments missingLabels and addValueLabels for further details.

Value

Returns the original object with updated meta data.

Examples

# see createGADS vignette

# see createGADS vignette

Split `GADSdat` into hierarchy levels.

Description

Split a GADSdat into multiple, specified hierarchical levels.

Usage

splitGADS(GADSdat, nameList)
splitGADS(GADSdat, nameList)

Arguments

`GADSdat`	A `GADSdat` object.
`nameList`	A list of character vectors. The names in the list correspond the the hierarchy levels.

Details

The function takes a GADSdat object and splits it into its desired hierarchical levels (a all_GADSdat object). Hierarchy level of a variable is also accessible in the meta data via the column data_table. If not all variable names are included in the nameList, the missing variables will be dropped.

Value

Returns an all_GADSdat object, which consists of list with a list of all data frames "datList" and a single data frame containing all meta data information "allLabels". For more details see also mergeLabels.

Examples

# see createGADS vignette

# see createGADS vignette

Transform string to numeric.

Description

Transform a string variable within a GADSdat or all_GADSdat object to a numeric variable.

Usage

stringAsNumeric(GADSdat, varName)
stringAsNumeric(GADSdat, varName)

Arguments

`GADSdat`	`GADSdat` object imported via `eatGADS`.
`varName`	Character string of a variable name.

Details

Applied to a GADSdat or all_GADSdat object, this function uses asNumericIfPossible to change the variable class and changes the format column in the meta data.

Value

Returns the GADSdat object with with the changed variable.

Substitute imputed values.

Description

Substitute imputed values in a imputed GADSdat_imp object with original, not imputed values from a GADSdat.

Usage

subImputations(GADSdat, GADSdat_imp, varName, varName_imp = varName, id, imp)
subImputations(GADSdat, GADSdat_imp, varName, varName_imp = varName, id, imp)

Arguments

`GADSdat`	A `GADSdat` object.
`GADSdat_imp`	A `GADSdat` object.
`varName`	A character vector of length 1 containing the variable name in `GADSdat`.
`varName_imp`	A character vector of length 1 containing the variable name in `GADSdat_imp`.
`id`	A character vector of length 1 containing the unique identifier column of both `GADSdat`.
`imp`	A character vector of length 1 containing the imputation number in `GADSdat_imp`.

Details

There are two cases in which values are substituted: (a) there are missings in varName_imp, (b) values have been imputed even though there is valid information in varName.

Value

The modified GADSdat_imp..

Examples

# tbd

# tbd

Update meta data.

Description

Update the meta data of a GADSdat or all_GADSdat object according to the variables in a new data object.

Usage

updateMeta(GADSdat, newDat, checkVarNames = TRUE)
updateMeta(GADSdat, newDat, checkVarNames = TRUE)

Arguments

`GADSdat`	`GADSdat` or `all_GADSdat` object.
`newDat`	`data.frame` or list of `data.frames` with the modified data. `tibbles` and `data.tables` are currently not supported and need to be transformed to `data.frames` beforehand.
`checkVarNames`	Logical. Should new variable names be checked by `checkVarNames`?

Details

If the data of a GADSdat or a all_GADSdat has changed (supplied via newDat), updateMeta assimilates the corresponding meta data set. If variables have been removed, the corresponding meta data is also removed. If variables have been added, empty meta data is added for these variables. Factors are transformed to numerical and their levels added to the meta data set.

Value

Returns the original object with updated meta data (and removes factors from the data).

Examples

# see createGADS vignette

# see createGADS vignette

Write a `GADSdat` object to a file

Description

Write a GADSdat object, which contains meta information as value and variable labels to an SPSS file (sav) or Stata file (dta). See 'details' for some important limitations.

Usage

write_spss(GADSdat, filePath)

write_stata(GADSdat, filePath)
write_spss(GADSdat, filePath)

write_stata(GADSdat, filePath)

Arguments

`GADSdat`	A `GADSdat` object.
`filePath`	Path of `sav` file to write.

Details

The provided functionality relies on havens write_sav and write_dta functions.

Currently known limitations for write_spss are:

a) value labels for long character variables (> A10) are dropped,
b) under specific conditions very long character variables (> A254) are incorrectly displayed as multiple character variables in SPSS,
c) exporting date or time variables is currently not supported,
d) missing tags are slightly incompatible between SPSS and eatGADS as eatGADS supports unlimited discrete missing tags (but no range of missing tags) and SPSS only supports up to three discrete missing tags or ranges of missing tags. For this purpose, if a variable is assigned more than three discrete missing tags, write_spss() (more precisely export_tibble) performs a silent conversion of the discrete missing tags into a missing range. If this conversion affects other value labels or values in the data not tagged as missing, an error is issued.

Currently known limitations for write_stata are:

a) Variable format is dropped,
b) missing codes are dropped.

Value

Writes file to disc, returns NULL.

Examples


# write to spss
tmp <- tempfile(fileext = ".sav")
write_spss(pisa, tmp)

# write to stata
tmp <- tempfile(fileext = ".dta")
write_stata(pisa, tmp)

# write to spss
tmp <- tempfile(fileext = ".sav")
write_spss(pisa, tmp)

# write to stata
tmp <- tempfile(fileext = ".dta")
write_stata(pisa, tmp)

Write a `GADSdat` object to `txt` and `SPSS` syntax

Description

Write a GADSdat object to a text file (txt) and an accompanying SPSS syntax file containing all meta information (e.g. value and variable labels).

Usage

write_spss2(
  GADSdat,
  txtPath,
  spsPath = NULL,
  savPath = NULL,
  dec = ".",
  fileEncoding = "UTF-8",
  chkFormat = TRUE,
  ...
)
write_spss2(
  GADSdat,
  txtPath,
  spsPath = NULL,
  savPath = NULL,
  dec = ".",
  fileEncoding = "UTF-8",
  chkFormat = TRUE,
  ...
)

Arguments

`GADSdat`	A `GADSdat` object.
`txtPath`	Path of `.txt` file to write, including file name and ending `.txt`. No default.
`spsPath`	Path of `.sps` file to write, including file name and ending `.sps`. Default Path is `txtPath`.
`savPath`	Path of `.sav` file to write, including file name and ending `.sav`. Default Path is `spsPath`.
`dec`	Decimal delimiter for your SPSS version. Other values for `dec` than `","` or `"."` are not implemented yet.
`fileEncoding`	Data file encoding for SPSS. Default is `"UTF-8"`.
`chkFormat`	Whether format checks via `checkFormat` should be performed.
`...`	Arguments to pass to `checkFormat` in particular `changeFormat=FALSE` if needed.

Details

This function is based on eatPreps writeSpss function and is currently under development.

Value

Writes a txt and an sav file to disc, returns nothing.

Examples


# write to spss
tmp_txt <- tempfile(fileext = ".txt")
write_spss2(pisa, txtPath = tmp_txt)

# write to spss
tmp_txt <- tempfile(fileext = ".txt")
write_spss2(pisa, txtPath = tmp_txt)

Package 'eatGADS'

Help Index

Apply Meta Data Changes.

Description

Usage

Arguments

Details

Value

Examples

Recode via lookup table.

Description

Usage

Arguments

Details

Value

Examples

Recode via lookup table into multiple variables.

Description

Usage

Arguments

Details

Value

Examples

Apply recodes according to a numerical check data.frame.

Description

Usage

Arguments

Details

Value

Examples

Assimilate value labels.

Description

Usage

Arguments

Details

Value

Examples

Auto recode a variable in a GADSdat.

Description

Usage

Arguments

Details

Value

Examples

Calculate a scale.

Description

Usage

Arguments

Details

Value

Examples

Bind two GADSdat objects into a single GADSdat object by columns.

Description

Usage

Arguments

Details

Value

Change missing code.

Description

Usage

Arguments

Details

Value

Examples

Change SPSS format.

Description

Usage

Arguments

Details

Value

Examples

Change value labels.

Description

Usage

Arguments

Details

Value

Examples

Change variable labels.

Description

Auto recode a variable in a `GADSdat`.

Bind two `GADSdat` objects into a single `GADSdat` object by columns.

Check `SPSS` Compliance of Meta Data

Checks compatibility of two `eatGADS` data bases.

Check names for `SQLite` column name conventions.