Package 'dsmisc' reference manual

Title:	Data Science Box of Pandora Miscellaneous
Description:	Tool collection for common and not so common data science use cases. This includes custom made algorithms for data management as well as value calculations that are hard to find elsewhere because of their specificity but would be a waste to get lost nonetheless. Currently available functionality: find sub-graphs in an edge list data.frame, find mode or modes in a vector of values, extract (a) specific regular expression group(s), generate ISO time stamps that play well with file names, or generate URL parameter lists by expanding value combinations.
Authors:	Peter Meissner [aut, cre]
Maintainer:	Peter Meissner <[email protected]>
License:	GPL (>= 2)
Version:	0.4.0
Built:	2025-02-18 04:41:27 UTC
Source:	https://github.com/petermeissner/dsmisc

df_defactorize

Description

Converts the factor columns of a data.frame into character columns.

Usage

df_defactorize(df)

Arguments

`df`	a data.frame like object

Value

returns the same data.frame except that factor columns have been transformed into character columns

Examples


df <- 
  data.frame(
    a = 1:2, 
    b = factor(c("a", "b")), 
    c = as.character(letters[3:4]), 
    stringsAsFactors = FALSE
 )
vapply(df, class, "")

df_df <- df_defactorize(df)
vapply(df_df, class, "")
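
For readers who want to see what such a conversion involves, here is a minimal base R sketch. The name `defactorize_sketch` is hypothetical, and the code is an illustration of the documented behaviour, not the package's implementation.

```r
# Hypothetical helper: turn every factor column into a character column.
defactorize_sketch <- function(df) {
  # flag the columns that are factors
  is_fac <- vapply(df, is.factor, logical(1))
  # convert flagged columns one by one, leaving all others untouched
  for (nm in names(df)[is_fac]) {
    df[[nm]] <- as.character(df[[nm]])
  }
  df
}

df <- data.frame(
  a = 1:2,
  b = factor(c("a", "b")),
  c = letters[3:4],
  stringsAsFactors = FALSE
)
vapply(defactorize_sketch(df), class, "")
```

Converting with `as.character()` keeps the factor labels rather than the underlying integer codes.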

Subgraphs in Undirected Graphs/Networks

Description

Finds and indexes subgraphs in an undirected graph.

Usage

graphs_find_subgraphs(id_1, id_2, verbose = 1L)

Arguments

`id_1`	vector of integers indicating ids
`id_2`	vector of integers indicating ids
`verbose`	an integer indicating the amount of verbosity; good for long running tasks or to get more information about the workings of the algorithm; currently accepted values: 0, 1, 2

Details

Input is given as two vectors where each pair of node ids 'id_1[i]' - 'id_2[i]' indicates an edge between two nodes.

Value

An integer vector with subgraph ids such that each distinct subgraph - i.e. all nodes are reachable within the graph and no node outside the subgraph is reachable - gets a distinct integer value. Integer values are assigned via

Examples


graphs_find_subgraphs(c(1,2,1,5,6,6), c(2,3,3,4,5,4), verbose = 0)
graphs_find_subgraphs(c(1,2,1,5,6,6), c(2,3,3,4,5,4), verbose = 2)
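
The idea of labelling subgraphs from an edge list can be sketched in base R with a simple union-find. The function `find_components_sketch` is hypothetical and is not the package's algorithm. Returning one label per edge and numbering the labels in order of first appearance are both assumptions made for this illustration.

```r
# Hypothetical sketch: label connected components of an undirected edge list.
find_components_sketch <- function(id_1, id_2) {
  stopifnot(length(id_1) == length(id_2))
  nodes  <- sort(unique(c(id_1, id_2)))
  parent <- seq_along(nodes)        # every node starts as its own root
  a <- match(id_1, nodes)           # edge endpoints as positions in `nodes`
  b <- match(id_2, nodes)

  # follow parent links until a root (a node that is its own parent) is found
  find_root <- function(k) {
    while (parent[k] != k) k <- parent[k]
    k
  }

  # merge the two components joined by each edge
  for (e in seq_along(a)) {
    ra <- find_root(a[e])
    rb <- find_root(b[e])
    if (ra != rb) parent[max(ra, rb)] <- min(ra, rb)
  }

  # root of each edge's first endpoint, renumbered 1, 2, ... (assumption)
  roots <- vapply(a, find_root, integer(1))
  match(roots, unique(roots))
}

find_components_sketch(c(1, 2, 1, 5, 6, 6), c(2, 3, 3, 4, 5, 4))
```

In this input, nodes 1, 2, 3 form one subgraph and nodes 4, 5, 6 another, so the first three edges share one label and the last three share a second.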

Mode

Description

Function calculating the mode.

Usage

stats_mode(x, multimodal = FALSE, warn = TRUE)

Arguments

`x`	vector to get mode for
`multimodal`	whether or not all modes should be returned in case of more than one
`warn`	should the function warn about multimodal outcomes?

Value

vector of mode or modes
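
To make the difference between a single mode and several modes concrete, here is a small base R sketch. The function `mode_sketch` is hypothetical and does not reproduce the package's code. In particular, returning the first mode when `multimodal = FALSE` is an assumption of this sketch, and the `warn` behaviour is not modelled.

```r
# Hypothetical sketch: most frequent value(s) of a vector, keeping the input type.
mode_sketch <- function(x, multimodal = FALSE) {
  values <- unique(x)
  counts <- tabulate(match(x, values))    # frequency of each distinct value
  modes  <- values[counts == max(counts)] # all values with the top frequency
  if (multimodal) modes else modes[1]     # assumption: first mode if only one is wanted
}

mode_sketch(c(1, 2, 2, 3))
mode_sketch(c("a", "b", "a", "b", "c"), multimodal = TRUE)
```

Using `unique()` together with `match()` instead of `table()` avoids converting the values to character names.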

Mode Allowing for Multi Modal Mode

Description

Function calculating the mode, allowing for multiple modes in case of equal frequencies.

Usage

stats_mode_multi(x)

Arguments

`x`	vector to get mode for

Value

vector with all modes

Extract Regular Expression Groups

Description

Extract Regular Expression Groups

Usage

str_group_extract(string, pattern, group = NULL, nas = TRUE)

Arguments

`string`	string to extract from
`pattern`	pattern with groups to match
`group`	groups to extract
`nas`	return NA values (TRUE) or filter them out (FALSE)

Value

string vector or string matrix

Examples


strings <- paste(LETTERS, seq_along(LETTERS), sep = "_")
str_group_extract(strings, "([\\w])_(\\d+)")
str_group_extract(strings, "([\\w])_(\\d+)", 1)
str_group_extract(strings, "([\\w])_(\\d+)", 2)
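
For comparison, capture groups can also be pulled out with base R alone. The sketch below uses `regexec()` and `regmatches()`. It only illustrates the concept of group extraction and makes no claim about how `str_group_extract()` works internally or how it shapes its output.

```r
# Base R sketch: extract capture groups into a character matrix.
strings <- c("A_1", "B_2", "Z_26")

# regexec() records the full match plus one entry per capture group
matches <- regmatches(strings, regexec("(\\w)_(\\d+)", strings, perl = TRUE))

# one row per input string: column 1 = full match, 2 = group 1, 3 = group 2
groups <- do.call(rbind, matches)

groups[, 2]   # first group (the letter)
groups[, 3]   # second group (the digits)
```

Row-binding works here because every string matches. Strings without a match yield `character(0)` and would need separate handling.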

Time Stamps for File Names

Description

Generates ISO time stamps that are safe to use in file names.

Usage

time_stamp(ts = Sys.time(), sep = c("-", "_", "_"))

Arguments

`ts`	one or more POSIX time stamps
`sep`	separators to be used for formatting

Value

Returns timestamp string in format yyyy-mm-dd_HH_MM_SS ready to be used safely in file names on various operating systems.

Examples


time_stamp()
time_stamp( Sys.time() - 10000 )
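
The documented output format, yyyy-mm-dd_HH_MM_SS, can be approximated with `format()` in base R. The function `time_stamp_sketch` below is hypothetical. Its reading of `sep` (date separator, date-time separator, time separator) is an assumption chosen to match the default format, not something the manual states.

```r
# Hypothetical sketch: build a file-name-safe time stamp with format().
time_stamp_sketch <- function(ts = Sys.time(), sep = c("-", "_", "_")) {
  # assumed roles: sep[1] within the date, sep[2] between date and time,
  # sep[3] within the time
  fmt <- paste0(
    "%Y", sep[1], "%m", sep[1], "%d",
    sep[2],
    "%H", sep[3], "%M", sep[3], "%S"
  )
  format(ts, fmt)
}

fixed <- as.POSIXct("2020-01-02 03:04:05", tz = "UTC")
time_stamp_sketch(fixed)
```

Avoiding colons and spaces matters because colons are not allowed in Windows file names and spaces complicate shell usage.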


Fit number into length of index

Description

Fit number into length of index

Usage

tool_i_fit_index(i, index)

Arguments

`i`	number to make sure to fit into index range
`index`	size of index

Value

number that fits into index by circularly mapping i to range of 1 to size of index

Examples


tool_i_fit_index(-2:6, 3)
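
One plausible reading of "circularly mapping i to range of 1 to size of index" is one-based modular arithmetic, sketched below. The function `fit_index_sketch` is hypothetical, and the package may treat zero and negative values differently.

```r
# Hypothetical sketch: wrap any integer into the range 1..size.
fit_index_sketch <- function(i, size) {
  # shift to zero-based, wrap with modulo, shift back to one-based;
  # R's %% returns a non-negative result for a positive divisor
  ((i - 1) %% size) + 1
}

fit_index_sketch(-2:6, 3)
```

Under this reading, counting past the end wraps to the start and counting below 1 wraps to the end, so 4 maps to 1 and 0 maps to 3.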

Subset object even if index is out of range

Description

Subset object even if index is out of range

Usage

tool_i_fit_obj(i, obj)

Arguments

`i`	number to make sure to fit into index range
`obj`	object to subset data from

Value

elements of object circularly indexed by mapping i to range of 1 to size of object

Examples


tool_i_fit_obj(-2:6, 3)
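
Circular subsetting is easier to see on an object with more than one element. The sketch below uses a hypothetical helper, `fit_obj_sketch`, built on one-based modular arithmetic. It is an assumption about what "circularly indexed" means and not the package's code.

```r
# Hypothetical sketch: subset an object with indices that wrap around its length.
fit_obj_sketch <- function(i, obj) {
  # wrap every index into 1..length(obj) before subsetting
  obj[((i - 1) %% length(obj)) + 1]
}

fit_obj_sketch(-2:6, c("a", "b", "c"))
```

With a length-one object such as `3`, every index wraps to position 1, so the single element is simply repeated.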

URL Parameter Combinations

Description

Generate URL parameter combinations from sets of parameter values.

Usage

web_gen_param_list_expand(..., sep_1 = "=", sep_2 = "&")

Arguments

`...`	multiple vectors passed on as named arguments or a single list or a data.frame
`sep_1`	first separator to use between key and value
`sep_2`	second separator to use between key-value pairs

Value

string vector with assembled query string parameter combinations

Examples


web_gen_param_list_expand(q = "beluga", lang = c("de", "en"))
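
Expanding value combinations into query strings can be illustrated with `expand.grid()` in base R. The function `param_expand_sketch` is hypothetical and handles named vectors only. The order of combinations and the lack of URL encoding are properties of this sketch and may not match the package.

```r
# Hypothetical sketch: build query strings for all combinations of parameter values.
param_expand_sketch <- function(..., sep_1 = "=", sep_2 = "&") {
  # one row per combination of the supplied values
  grid <- expand.grid(..., stringsAsFactors = FALSE, KEEP.OUT.ATTRS = FALSE)
  # turn each column into "key=value" strings
  pairs <- Map(
    function(key, values) paste(key, values, sep = sep_1),
    names(grid), grid
  )
  # glue the key-value pairs of each row together
  do.call(paste, c(unname(pairs), sep = sep_2))
}

param_expand_sketch(q = "beluga", lang = c("de", "en"))
```

Values containing reserved characters would still need encoding, for example with `utils::URLencode()`, before being used in a URL.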
