trojai.datagen package¶
Submodules¶
trojai.datagen.common_label_behaviors module¶
-
class
trojai.datagen.common_label_behaviors.
StaticTarget
(target)[source]¶ Bases:
trojai.datagen.label_behavior.LabelBehavior
Sets label to a defined value
-
class
trojai.datagen.common_label_behaviors.
WrappedAdd
(add_val: int, max_num_classes: int = None)[source]¶ Bases:
trojai.datagen.label_behavior.LabelBehavior
Adds a defined amount to each input label, with an optional maximum value around which labels are wrapped
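The two label behaviors above can be illustrated with a minimal, standalone sketch. It does not import trojai, and the function names `static_target` and `wrapped_add` are hypothetical stand-ins for the classes. It also assumes that "wrapping" means a plain modulo over `max_num_classes`, which the description suggests but does not state outright.

```python
from typing import Optional


def static_target(label: int, target: int) -> int:
    """Ignore the incoming label and always return the fixed target."""
    return target


def wrapped_add(label: int, add_val: int, max_num_classes: Optional[int] = None) -> int:
    """Shift a label by add_val, wrapping around max_num_classes if given."""
    shifted = label + add_val
    if max_num_classes is not None:
        # Assumption: wrapping is a plain modulo over the number of classes.
        shifted %= max_num_classes
    return shifted


# With 10 classes, label 9 shifted by 1 wraps back to class 0.
print(wrapped_add(9, add_val=1, max_num_classes=10))
# Without a maximum, the label is simply shifted.
print(wrapped_add(3, add_val=1))
print(static_target(7, target=2))
```

In an experiment, a behavior like this would be applied only to triggered samples, so that clean samples keep their true label.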
-
trojai.datagen.common_label_behaviors.
logger
= <Logger trojai.datagen.common_label_behaviors (WARNING)>¶ Defines some common behaviors which are used to modify labels when designing an experiment with triggered and clean data
trojai.datagen.config module¶
-
class
trojai.datagen.config.
TrojAICleanDataConfig
(sign_xforms: Sequence[trojai.datagen.transform_interface.Transform] = None, bg_xforms: Sequence[trojai.datagen.transform_interface.Transform] = None, merge_obj: trojai.datagen.merge_interface.Merge = None, combined_xforms: Sequence[trojai.datagen.transform_interface.Transform] = None)[source]¶ Bases:
object
-
class
trojai.datagen.config.
ValidInsertLocationsConfig
(algorithm: str = 'brute_force', min_val: Union[int, Sequence[int]] = 0, threshold_val: Union[float, Sequence[float]] = 5.0, num_boxes: int = 5, allow_overlap: Union[bool, Sequence[bool]] = False)[source]¶ Bases:
object
Specifies which algorithm to use for determining the valid spots for trigger insertion on an image and all relevant parameters
-
class
trojai.datagen.config.
XFormMergePipelineConfig
(trigger_list: Sequence[trojai.datagen.entity.Entity] = None, trigger_sampling_prob: Sequence[float] = None, trigger_xforms: Sequence[trojai.datagen.transform_interface.Transform] = None, trigger_bg_xforms: Sequence[trojai.datagen.transform_interface.Transform] = None, trigger_bg_merge: trojai.datagen.merge_interface.Merge = None, trigger_bg_merge_xforms: Sequence[trojai.datagen.transform_interface.Transform] = None, overall_bg_xforms: Sequence[trojai.datagen.transform_interface.Transform] = None, overall_bg_triggerbg_merge: trojai.datagen.merge_interface.Merge = None, overall_bg_triggerbg_xforms: Sequence[trojai.datagen.transform_interface.Transform] = None, merge_type: str = 'insert', per_class_trigger_frac: float = None, triggered_classes: Union[str, Sequence[Any]] = 'all')[source]¶ Bases:
object
Defines all configuration items necessary to run the XFormMerge Pipeline, and associated configuration validation.
NOTE: the argument list can be condensed into lists of lists, but that becomes a bit less intuitive to use. We need to think about how best we want to specify these argument lists.
-
trojai.datagen.config.
logger
= <Logger trojai.datagen.config (WARNING)>¶ Contains classes which define configuration used for transforming and modifying objects, as well as the associated validation routines. Ideally, a configuration class should be defined for every pipeline that is defined.
trojai.datagen.constants module¶
-
trojai.datagen.constants.
RANDOM_STATE_DRAW_LIMIT
= 4294967295¶ In the data generation process, every new entity that is generated gets a new random seed by drawing from np.random.RandomState.randint(), where the RandomState object comes from a master RandomState created at the beginning of the data generation process. The constant RANDOM_STATE_DRAW_LIMIT defines the argument passed into the randint(…) call.
The reason we create a new seed for every Entity is to enable reproducibility. Each Entity that is created may go through a series of transformations that include randomness at various stages. As such, having a seed associated with each Entity will enable us to reproduce those specific random variations easily.
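The seeding scheme described above might look roughly like the following sketch. To stay dependency-free it uses the standard library's `random.Random` as a stand-in for `np.random.RandomState`, and the helper name `draw_entity_seeds` is hypothetical.

```python
import random

RANDOM_STATE_DRAW_LIMIT = 4294967295  # 2**32 - 1, as documented above


def draw_entity_seeds(master_seed: int, num_entities: int) -> list:
    """Draw one seed per entity from a single master generator."""
    master = random.Random(master_seed)  # stand-in for the master RandomState
    # randrange excludes the upper bound, like RandomState.randint(high).
    return [master.randrange(RANDOM_STATE_DRAW_LIMIT) for _ in range(num_entities)]


seeds = draw_entity_seeds(master_seed=1234, num_entities=3)

# Re-seeding with an entity's own seed reproduces its random variations
# without replaying the rest of the generation run.
first = random.Random(seeds[1]).random()
second = random.Random(seeds[1]).random()
print(first == second)
```

Because each Entity carries its own seed, a single Entity can be regenerated in isolation, independent of how many entities were created before it.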
trojai.datagen.datatype_xforms module¶
-
class
trojai.datagen.datatype_xforms.
ToTensorXForm
(num_dims: int = 3)[source]¶ Bases:
trojai.datagen.transform_interface.ImageTransform
Transformation which defines the conversion of an input array to a tensor of a specified # of dimensions
-
do
(input_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Perform the actual to->tensor conversion :param input_obj: the input Entity to be transformed :param random_state_obj: ignored :return: the transformed Entity
-
-
trojai.datagen.datatype_xforms.
logger
= <Logger trojai.datagen.datatype_xforms (WARNING)>¶ Defines data type transformations that may need to occur when processing different data sources
trojai.datagen.entity module¶
-
class
trojai.datagen.entity.
Entity
[source]¶ Bases:
abc.ABC
An Entity is a generalization of a synthetic object. It can stand alone or be a composition of multiple entities. An Entity is composed of some data. See the README for further details on how Entity objects are intended to be used in the TrojAI pipeline.
-
trojai.datagen.entity.
logger
= <Logger trojai.datagen.entity (WARNING)>¶ Defines a generic Entity object, and an Entity convenience wrapper for creating Entities from numpy arrays.
trojai.datagen.experiment module¶
-
class
trojai.datagen.experiment.
ClassicExperiment
(data_root_dir: str, trigger_label_xform: trojai.datagen.label_behavior.LabelBehavior, stratify_split: bool = True)[source]¶ Bases:
object
Defines a classic experiment, which consists of: 1) a specification of the clean data 2) a specification of the modified (triggered) data, and 3) a specification of the split of triggered/clean data for training/testing the model
-
create_experiment
(clean_data_csv: str, experiment_data_folder: str, mod_filename_filter: str = '*', split_clean_trigger: bool = False, trigger_frac: float = 0.2, triggered_classes: Union[str, Sequence[Any]] = 'all', random_state_obj: numpy.random.mtrand.RandomState = RandomState(MT19937) at 0x7FD5BF71F5A0) → Union[Tuple, pandas.core.frame.DataFrame][source]¶ - Creates an “experiment,” which is a dataframe defining the data that should be used, and whether that data is
triggered or not, and the true & actual label associated with that data point.
- TODO:
- [] - Have ability to accept multiple mod_data_folders such that we can sample from them all at a specified
probability to have different triggers
- Parameters
clean_data_csv – path to file which contains a CSV specification of the clean data. The CSV file is expected to have the following columns: [file, label]
experiment_data_folder – the folder which contains the data to mix with for the experiment.
mod_filename_filter – a string filter for determining which files in the folder to consider, if only a subset is to be considered for sampling
split_clean_trigger – if True, then we return a list of DataFrames, where the triggered & non-triggered data are kept in separate DataFrames; if False, we concatenate the triggered and non-triggered data into one DataFrame
trigger_frac – the fraction of data which should be triggered
triggered_classes – either the string ‘all’, or a Sequence of labels which are to be triggered. If this parameter is ‘all’, then all classes will be triggered in the created experiment. Otherwise, only the classes in the list will be triggered at the percentage requested in the trigger_frac argument of the create_experiment function.
random_state_obj – random state object
- Returns
a dataframe of the data which makes up the experiment. The DataFrame has the following columns: file, true_label, train_label, triggered
file - the file path of the data
true_label - the actual label of the data
train_label - the label of the data the model should be trained on. This will be equal to true_label if triggered==False
triggered - a boolean value indicating whether this particular sample has a Trigger or not
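To make the four output columns concrete, here is a small standalone sketch that builds rows with the same fields. It is not the library's implementation: it uses plain dictionaries instead of a pandas DataFrame, ignores the lookup of triggered files in `experiment_data_folder`, and the helper name `sketch_experiment` and the file names are hypothetical.

```python
import random


def sketch_experiment(clean_rows, trigger_frac, label_behavior, seed=0):
    """Mark a fraction of rows as triggered and derive their train labels."""
    rng = random.Random(seed)
    num_triggered = int(round(trigger_frac * len(clean_rows)))
    triggered_idx = set(rng.sample(range(len(clean_rows)), num_triggered))
    rows = []
    for i, (path, label) in enumerate(clean_rows):
        triggered = i in triggered_idx
        rows.append({
            "file": path,
            "true_label": label,
            # train_label only differs from true_label for triggered samples.
            "train_label": label_behavior(label) if triggered else label,
            "triggered": triggered,
        })
    return rows


# Ten hypothetical clean samples spread over five classes.
clean = [("img_%d.png" % i, i % 5) for i in range(10)]
experiment = sketch_experiment(clean, trigger_frac=0.2,
                               label_behavior=lambda y: (y + 1) % 5)
for row in experiment:
    print(row)
```

With `trigger_frac=0.2` and ten samples, two rows are marked as triggered and receive the modified label; the remaining rows keep `train_label == true_label`.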
-
-
trojai.datagen.experiment.
logger
= <Logger trojai.datagen.experiment (WARNING)>¶ Module which contains functionality for generating experiments
trojai.datagen.image_affine_xforms module¶
-
class
trojai.datagen.image_affine_xforms.
PerspectiveXForm
(xform_matrix)[source]¶ Bases:
trojai.datagen.transform_interface.ImageTransform
Shifts the perspective of an input Entity
-
do
(input_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Performs the perspective shift on the input Entity. :param input_obj: the Entity to be transformed according to the specified perspective shift in the constructor. :param random_state_obj: ignored :return: the transformed Entity
-
-
class
trojai.datagen.image_affine_xforms.
RandomPerspectiveXForm
(perspectives: Sequence[str] = None)[source]¶ Bases:
trojai.datagen.transform_interface.ImageTransform
Randomly shifts perspective of input Entity in available perspectives.
-
do
(input_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Samples from the possible perspectives according to the sampler specification and then applies that perspective to the input object :param input_obj: Entity to be randomly perspective shifted :param random_state_obj: allows for reproducible sampling of random perspectives :return: the transformed Entity
-
-
class
trojai.datagen.image_affine_xforms.
RandomRotateXForm
(angle_choices: Sequence[float] = None, angle_sampler_prob: Sequence[float] = None, rotator_kwargs: Dict = None)[source]¶ Bases:
trojai.datagen.transform_interface.ImageTransform
Implements a rotation of a random amount of degrees.
-
do
(input_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Samples from the possible angles according to the sampler specification and then applies that rotation to the input object :param input_obj: Entity to be randomly rotated :param random_state_obj: a random state used to maintain reproducibility through transformations :return: the transformed Entity
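The sampling step described above (choose an angle from `angle_choices`, optionally weighted by `angle_sampler_prob`) can be sketched as follows. This is an illustration only: `sample_angle` is a hypothetical helper, `random.Random` stands in for the `RandomState` passed to `do`, and uniform sampling when no probabilities are given is an assumption.

```python
import random


def sample_angle(angle_choices, angle_sampler_prob=None, seed=None):
    """Pick one rotation angle, optionally weighted by per-angle probabilities."""
    rng = random.Random(seed)  # stand-in for the RandomState passed to do()
    # weights=None makes random.choices sample uniformly.
    return rng.choices(angle_choices, weights=angle_sampler_prob, k=1)[0]


angles = [0, 90, 180, 270]
print(sample_angle(angles, angle_sampler_prob=[0.7, 0.1, 0.1, 0.1], seed=42))
# The same seed always yields the same angle.
print(sample_angle(angles, seed=42) == sample_angle(angles, seed=42))
```

Passing the random state explicitly, rather than relying on global state, is what lets a rotated Entity be reproduced later from its seed.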
-
-
class
trojai.datagen.image_affine_xforms.
RotateXForm
(angle: int = 90, args: tuple = (), kwargs: dict = None)[source]¶ Bases:
trojai.datagen.transform_interface.ImageTransform
Implements a rotation of an Entity by a specified angle amount.
-
do
(input_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Performs the rotation specified by the RotateXForm object on an input :param input_obj: The Entity to be rotated :param random_state_obj: ignored :return: the transformed Entity
-
-
class
trojai.datagen.image_affine_xforms.
UniformScaleXForm
(scale_factor: float = 1, kwargs: dict = None)[source]¶ Bases:
trojai.datagen.transform_interface.ImageTransform
Implements a uniform scale of a specified amount to an Entity
-
do
(input_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Performs the scaling on an input Entity using skimage.transform.rescale :param input_obj: the input object to be scaled :param random_state_obj: ignored :return: the transformed Entity
-
-
trojai.datagen.image_affine_xforms.
get_predefined_perspective_xform_matrix
(xform_str: str, rows: int, cols: int) → numpy.ndarray[source]¶ Returns an affine transform matrix for a string specification of a perspective transformation
- Parameters
xform_str – a string specification of the perspective to transform the object into.
rows – the number of rows of the image to be transformed to the specified perspective
cols – the number of cols of the image to be transformed to the specified perspective
- Returns
a numpy array of shape (2,3) which specifies the affine transformation.
See: https://docs.opencv.org/2.4/modules/imgproc/doc/geometric_transformations.html?highlight=getaffinetransform for more information
-
trojai.datagen.image_affine_xforms.
logger
= <Logger trojai.datagen.image_affine_xforms (WARNING)>¶ Module defines several affine transforms using various libraries to perform the actual transformation operation specified.
trojai.datagen.image_conversion_utils module¶
-
trojai.datagen.image_conversion_utils.
gray_to_rgb
(img: numpy.ndarray) → numpy.ndarray[source]¶ Convert given grayscale image to RGB :param img: 1-channel grayscale image :return: image converted to RGB
-
trojai.datagen.image_conversion_utils.
logger
= <Logger trojai.datagen.image_conversion_utils (WARNING)>¶ Contains general utilities for dealing with channel formats
-
trojai.datagen.image_conversion_utils.
normalization_from_rgb
(rgb_img: numpy.ndarray, alpha_ch: Optional[numpy.ndarray], normalize: bool, original_n_chan: int, name: str) → numpy.ndarray[source]¶ Guard for output from rgb-only xforms :param rgb_img: 3-channel RGB image result from calling xform :param alpha_ch: alpha channel extracted at beginning of calling xform or None :param normalize: whether to convert rgb_img back to its original channel format :param original_n_chan: number of channels in its original channel format :param name: name of calling xform :return: if normalize is True the image corresponding to rgb_img converted to its original channel format, otherwise rgb_img unmodified, additional conversions can be added below, currently only RGB to RGBA is implemented
-
trojai.datagen.image_conversion_utils.
normalization_to_rgb
(img: numpy.ndarray, normalize: bool, name: str) → Tuple[numpy.ndarray, Optional[numpy.ndarray]][source]¶ Guard for input to RGB only xforms :param img: input image with variable number of channels :param normalize: whether to attempt to convert img from original channel format to 3-channel RGB :param name: name of calling xform :return: a 3-channel RGB array converted from img, additional conversions can be added below, currently only RGBA to RGB is implemented
-
trojai.datagen.image_conversion_utils.
rgb_to_rgba
(img, alpha_ch: Optional[numpy.ndarray] = None) → numpy.ndarray[source]¶ Converts given image to RGBA, with optionally provided alpha_ch as its alpha channel :param img: 3-channel RGB image or 4-channel RGBA image :param alpha_ch: 1-channel array to be used as alpha value (optional), if img is RGBA this value is ignored :return: if img is 4-channel it is returned unmodified, if img is 3-channel this will return a new RGBA image with img as its RGB channels and either alpha_ch as its alpha channel if provided or a fully opaque alpha channel (max value for its datatype)
-
trojai.datagen.image_conversion_utils.
rgba_to_rgb
(img: numpy.ndarray) → Tuple[numpy.ndarray, Optional[numpy.ndarray]][source]¶ Split given 4-channel RGBA array into a 3-channel RGB array and a 1-channel alpha array :param img: given image to split, must be 3-channel or 4-channel :return: the first three channels of data as a 3-channel RGB image and the fourth channel of img as either a 1-channel alpha array, or None if img has only 3 channels
trojai.datagen.image_entity module¶
-
class
trojai.datagen.image_entity.
GenericImageEntity
(data: numpy.ndarray, mask: numpy.ndarray = None)[source]¶ Bases:
trojai.datagen.image_entity.ImageEntity
A class which allows one to easily instantiate an ImageEntity object with an image and associated mask
-
class
trojai.datagen.image_entity.
ImageEntity
[source]¶ Bases:
trojai.datagen.entity.Entity
-
trojai.datagen.image_entity.
logger
= <Logger trojai.datagen.image_entity (WARNING)>¶ Defines a generic Entity object, and an Entity convenience wrapper for creating Entities from numpy arrays.
trojai.datagen.image_insert_utils module¶
-
trojai.datagen.image_insert_utils.
pattern_fit
(chan_img: numpy.ndarray, chan_pattern: numpy.ndarray, chan_location: Sequence[Any]) → bool[source]¶ Returns True if the pattern at the desired location can fit into the image channel without wrap, and False otherwise
- Parameters
chan_img – a numpy.ndarray of shape (nrows, ncols) which represents an image channel
chan_pattern – a numpy.ndarray of shape (prows, pcols) which represents a channel of the pattern
chan_location – a Sequence of length 2, which contains the x/y coordinate of the top left corner of the pattern to be inserted for this specific channel
- Returns
True/False depending on whether the pattern will fit into the image
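The fit check amounts to a bounds test on the pattern's bottom-right corner. The sketch below works on shapes only and is not the library's code; `pattern_fits` is a hypothetical name, and the location is assumed to be ordered as (row, column) of the top-left corner.

```python
def pattern_fits(img_shape, pattern_shape, location):
    """Return True if the pattern lies fully inside the image channel."""
    nrows, ncols = img_shape
    prows, pcols = pattern_shape
    row, col = location  # assumed (row, col) of the pattern's top-left corner
    if row < 0 or col < 0:
        return False
    # The bottom-right corner must not run past the image edge.
    return row + prows <= nrows and col + pcols <= ncols


# A 5x5 pattern in a 28x28 channel fits at (23, 23) but not at (24, 23).
print(pattern_fits((28, 28), (5, 5), (23, 23)))
print(pattern_fits((28, 28), (5, 5), (24, 23)))
```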
-
trojai.datagen.image_insert_utils.
valid_locations
(img: numpy.ndarray, pattern: numpy.ndarray, algo_config: trojai.datagen.config.ValidInsertLocationsConfig, protect_wrap: bool = True) → numpy.ndarray[source]¶ Returns a list of locations per channel which the pattern can be inserted into the img_channel with an overlap algorithm dictated by the appropriate inputs
- Parameters
img – a numpy.ndarray which represents the image of shape: (nrows, ncols, nchans)
pattern – the pattern to be inserted into the image of shape: (prows, pcols, nchans)
algo_config – The provided configuration object specifying the algorithm to use and necessary parameters
protect_wrap – if True, ensures that pattern to be inserted can fit without wrapping and raises an Exception otherwise
- Returns
A boolean mask of the same shape as the input image, with True indicating that that pixel is a valid location for placement of the specified pattern
trojai.datagen.image_size_xforms module¶
-
class
trojai.datagen.image_size_xforms.
Pad
(pad_amounts: tuple = (0, 0, 0, 0), mode: str = 'constant', pad_value: int = 0)[source]¶ Bases:
trojai.datagen.transform_interface.Transform
Pads an Entity
-
do
(img_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Perform the padding :param img_obj: The input object to be padded according to the specified configuration :param random_state_obj: ignored :return: The padded object
-
-
class
trojai.datagen.image_size_xforms.
RandomPadToSize
(new_size: tuple = (200, 200), mode: str = 'constant', pad_value: int = 0)[source]¶ Bases:
trojai.datagen.transform_interface.Transform
Resizes an Entity
-
do
(img_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Perform the resizing :param img_obj: The input object to be resized according to the specified configuration :param random_state_obj: ignored :return: The resized object
-
-
class
trojai.datagen.image_size_xforms.
RandomResize
(new_size_minimum: tuple = (200, 200), new_size_maximum: tuple = (300, 300), interpolation: int = 2)[source]¶ Bases:
trojai.datagen.transform_interface.Transform
Resizes an Entity
-
do
(img_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Perform the resizing :param img_obj: The input object to be resized according to the specified configuration :param random_state_obj: ignored :return: The resized object
-
-
class
trojai.datagen.image_size_xforms.
RandomSubCrop
(new_size: tuple = (200, 200))[source]¶ Bases:
trojai.datagen.transform_interface.Transform
Crops an Entity
-
do
(img_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Perform the cropping :param img_obj: The input object to be cropped according to the specified configuration :param random_state_obj: ignored :return: The cropped object
-
-
class
trojai.datagen.image_size_xforms.
Resize
(new_size: tuple = (200, 200), interpolation: int = 2)[source]¶ Bases:
trojai.datagen.transform_interface.Transform
Resizes an Entity
-
do
(img_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Perform the resizing :param img_obj: The input object to be resized according to the specified configuration :param random_state_obj: ignored :return: The resized object
-
-
trojai.datagen.image_size_xforms.
logger
= <Logger trojai.datagen.image_size_xforms (WARNING)>¶ Module contains various classes that relate to size transformations of input objects
trojai.datagen.image_triggers module¶
-
class
trojai.datagen.image_triggers.
RandomRectangularPattern
(num_rows: int, num_cols: int, num_chan: int, color_algorithm: str = 'channel_assign', color_options: dict = None, pattern_style='graffiti', dtype=<class 'numpy.uint8'>, random_state_obj: numpy.random.mtrand.RandomState = RandomState(MT19937) at 0x7FD5B287A160)[source]¶ Bases:
trojai.datagen.image_entity.ImageEntity
Defines a random rectangular pattern
-
class
trojai.datagen.image_triggers.
RectangularPattern
(num_rows: int, num_cols: int, num_chan: int, cval: int, dtype=<class 'numpy.uint8'>)[source]¶ Bases:
trojai.datagen.image_entity.ImageEntity
Define a rectangular pattern
-
class
trojai.datagen.image_triggers.
ReverseLambdaPattern
(num_rows: int, num_cols: int, num_chan: int, trigger_cval: Union[int, Sequence[int]], bg_cval: Union[int, Sequence[int]] = 0, thickness: int = 1, pattern_style: str = 'graffiti', dtype=<class 'numpy.uint8'>)[source]¶ Bases:
trojai.datagen.image_entity.ImageEntity
Defines an alpha pattern
-
trojai.datagen.image_triggers.
logger
= <Logger trojai.datagen.image_triggers (WARNING)>¶ Defines various Trigger Entity objects
trojai.datagen.insert_merges module¶
-
class
trojai.datagen.insert_merges.
FixedInsertTextMerge
(location: int)[source]¶ Bases:
trojai.datagen.merge_interface.TextMerge
-
do
(obj1: trojai.datagen.text_entity.TextEntity, obj2: trojai.datagen.text_entity.TextEntity, random_state_obj: numpy.random.mtrand.RandomState)[source]¶ Perform the actual merge operation :param obj1: the first Entity to be merged :param obj2: the second Entity to be merged :param random_state_obj: a numpy.random.RandomState object to ensure reproducibility :return: the merged Entity
-
-
class
trojai.datagen.insert_merges.
InsertAtLocation
(location: numpy.ndarray, protect_wrap: bool = True)[source]¶ Bases:
trojai.datagen.merge_interface.ImageMerge
Inserts a provided pattern at a specified location
-
do
(img_obj: trojai.datagen.image_entity.ImageEntity, pattern_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Inserts a pattern into an image, using the mask of the pattern to determine which specific pixels are modifiable
- Parameters
img_obj – The background image into which the pattern is inserted
pattern_obj – The pattern to be inserted. The mask associated with the pattern is used to determine which specific pixels of the pattern are inserted into the img_obj
random_state_obj – ignored
- Returns
The merged object
-
-
class
trojai.datagen.insert_merges.
InsertAtRandomLocation
(method: str, algo_config: trojai.datagen.config.ValidInsertLocationsConfig, protect_wrap: bool = True)[source]¶ Bases:
trojai.datagen.merge_interface.ImageMerge
Inserts a provided pattern at a random location, where valid locations are determined according to a provided algorithm specification
-
do
(img_obj: trojai.datagen.image_entity.ImageEntity, pattern_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Perform the specified merge on the input Entities and return the merged Entity
- Parameters
img_obj – the image object into which the pattern is to be inserted
pattern_obj – the pattern object to be inserted
random_state_obj – used to sample from the possible valid locations; by providing a random state, we ensure reproducibility of the data
- Returns
the merged Entity
-
-
class
trojai.datagen.insert_merges.
InsertRandomLocationNonzeroAlpha
[source]¶ Bases:
trojai.datagen.merge_interface.ImageMerge
Inserts a defined pattern into an image in a randomly selected location where the alpha channel is non-zero
-
do
(img_obj: trojai.datagen.image_entity.ImageEntity, pattern_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Perform the described merge operation
- Parameters
img_obj – The input object into which the pattern is to be inserted
pattern_obj – The pattern object which is to be inserted into the image
random_state_obj – used to sample from the possible valid locations; by providing a random state, we ensure reproducibility of the data
- Returns
the merged object
-
-
class
trojai.datagen.insert_merges.
InsertRandomWithMask
[source]¶ Bases:
trojai.datagen.merge_interface.ImageMerge
Inserts a defined pattern into an image in a randomly selected location where the specified mask is True
-
do
(img_obj: trojai.datagen.image_entity.ImageEntity, pattern_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Perform the described merge operation
- Parameters
img_obj – The input object into which the pattern is to be inserted
pattern_obj – The pattern object which is to be inserted into the image
random_state_obj – used to sample from the possible valid locations; by providing a random state, we ensure reproducibility of the data
- Returns
the merged object
-
-
class
trojai.datagen.insert_merges.
RandomInsertTextMerge
[source]¶ Bases:
trojai.datagen.merge_interface.TextMerge
-
do
(obj1: trojai.datagen.text_entity.TextEntity, obj2: trojai.datagen.text_entity.TextEntity, random_state_obj: numpy.random.mtrand.RandomState)[source]¶ Perform the actual merge operation :param obj1: the first Entity to be merged :param obj2: the second Entity to be merged :param random_state_obj: a numpy.random.RandomState object to ensure reproducibility :return: the merged Entity
-
-
trojai.datagen.insert_merges.
logger
= <Logger trojai.datagen.insert_merges (WARNING)>¶ Module which defines several insert style merge operations.
trojai.datagen.instagram_xforms module¶
-
class
trojai.datagen.instagram_xforms.
FilterXForm
(channel_order: str = 'BGR', pre_normalize: bool = True, post_normalize: bool = True)[source]¶ Bases:
trojai.datagen.transform_interface.ImageTransform
Create filter xform, if no channel order is specified it is assumed to be in BGR order (opencv default), this refers only to the first 3 channels of input data as the alpha channel is handled independently
-
do
(input_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Compresses the 3-channel input image as a specified filetype and stores it in memory, passes it into wand and applies the filter, stores the filtered image in memory as the specified filetype again, and then decompresses it back into a 3-channel image :param input_obj: entity to be transformed :param random_state_obj: object to hold random state and enable reproducibility :return: new entity with transform applied
-
-
class
trojai.datagen.instagram_xforms.
GothamFilterXForm
(channel_order: str = 'BGR', pre_normalize: bool = True, post_normalize: bool = True)[source]¶ Bases:
trojai.datagen.instagram_xforms.FilterXForm
Class implementing Instagram’s Gotham filter
-
filter
(image: wand.image.Image) → wand.image.Image[source]¶ modified from https://github.com/acoomans/instagram-filters/blob/master/instagram_filters/filters/gotham.py :param image: provided image :return: new filtered image
-
-
class
trojai.datagen.instagram_xforms.
KelvinFilterXForm
(channel_order: str = 'BGR', pre_normalize: bool = True, post_normalize: bool = True)[source]¶ Bases:
trojai.datagen.instagram_xforms.FilterXForm
Class implementing Instagram’s Kelvin filter
-
filter
(image: wand.image.Image) → wand.image.Image[source]¶ modified from https://github.com/acoomans/instagram-filters/blob/master/instagram_filters/filters/kelvin.py :param image: provided image :return: new filtered image
-
-
class
trojai.datagen.instagram_xforms.
LomoFilterXForm
(channel_order: str = 'BGR', pre_normalize: bool = True, post_normalize: bool = True)[source]¶ Bases:
trojai.datagen.instagram_xforms.FilterXForm
Class implementing Instagram’s Lomo filter
-
filter
(image: wand.image.Image) → wand.image.Image[source]¶ modified from https://github.com/acoomans/instagram-filters/blob/master/instagram_filters/filters/lomo.py :param image: provided image :return: new filtered image
-
-
class
trojai.datagen.instagram_xforms.
NashvilleFilterXForm
(channel_order: str = 'BGR', pre_normalize: bool = True, post_normalize: bool = True)[source]¶ Bases:
trojai.datagen.instagram_xforms.FilterXForm
Class implementing Instagram’s Nashville filter
-
filter
(image: wand.image.Image) → wand.image.Image[source]¶ modified from https://github.com/acoomans/instagram-filters/blob/master/instagram_filters/filters/nashville.py :param image: provided image :return: new filtered image
-
-
class
trojai.datagen.instagram_xforms.
NoOpFilterXForm
(channel_order: str = 'BGR', pre_normalize: bool = True, post_normalize: bool = True)[source]¶ Bases:
trojai.datagen.instagram_xforms.FilterXForm
No operation Transform for testing purposes
-
class
trojai.datagen.instagram_xforms.
ToasterXForm
(channel_order: str = 'BGR', pre_normalize: bool = True, post_normalize: bool = True)[source]¶ Bases:
trojai.datagen.instagram_xforms.FilterXForm
Class implementing Instagram’s Toaster filter
-
filter
(image: wand.image.Image) → wand.image.Image[source]¶ modified from https://github.com/acoomans/instagram-filters/blob/master/instagram_filters/filters/toaster.py :param image: provided image :return: new filtered image
-
trojai.datagen.label_behavior module¶
trojai.datagen.merge_interface module¶
-
class
trojai.datagen.merge_interface.
ImageMerge
[source]¶ Bases:
trojai.datagen.merge_interface.Merge
Subclass of merges for image entities. Prevents the usage of a text merge on an image entity, which has a distinct underlying data structure.
-
abstract
do
(obj1: trojai.datagen.image_entity.ImageEntity, obj2: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Perform the actual merge operation :param obj1: the first Entity to be merged :param obj2: the second Entity to be merged :param random_state_obj: a numpy.random.RandomState object to ensure reproducibility :return: the merged Entity
-
-
class
trojai.datagen.merge_interface.
Merge
[source]¶ Bases:
abc.ABC
A Merge is defined as an operation on two Entities and returns a single Entity
-
abstract
do
(obj1: trojai.datagen.entity.Entity, obj2: trojai.datagen.entity.Entity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.entity.Entity[source]¶ Perform the actual merge operation :param obj1: the first Entity to be merged :param obj2: the second Entity to be merged :param random_state_obj: a numpy.random.RandomState object to ensure reproducibility :return: the merged Entity
-
-
class
trojai.datagen.merge_interface.
TextMerge
[source]¶ Bases:
trojai.datagen.merge_interface.Merge
Subclass of merges for text entities. Prevents the usage of an image merge on a text entity, which has a distinct underlying data structure.
-
abstract
do
(obj1: trojai.datagen.text_entity.TextEntity, obj2: trojai.datagen.text_entity.TextEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.text_entity.TextEntity[source]¶ Perform the actual merge operation :param obj1: the first Entity to be merged :param obj2: the second Entity to be merged :param random_state_obj: a numpy.random.RandomState object to ensure reproducibility :return: the merged Entity
-
trojai.datagen.pipeline module¶
-
class
trojai.datagen.pipeline.
Pipeline
[source]¶ Bases:
object
A pipeline is a composition of Entities, Transforms, and Merges to produce an output Entity
-
abstract
process
(imglist: Iterable[trojai.datagen.entity.Entity], random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.entity.Entity[source]¶ The method which executes the pipeline, moving data through each of the Transform & Merge objects, with data flow being defined by the implementation.
- Parameters
imglist – A list of Entity objects to be processed by the Pipeline
random_state_obj – a random state to pass to the transforms and merge operation to ensure reproducibility of Entities produced by the pipeline
- Returns
The output of the pipeline
trojai.datagen.static_color_xforms module¶
-
class
trojai.datagen.static_color_xforms.
GrayscaleToRGBXForm
[source]¶ Bases:
trojai.datagen.transform_interface.Transform
Converts a 3-channel grayscale image to RGB
-
do
(input_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Convert the input object from 3-channel grayscale to RGB
- Parameters
input_obj – Entity to be colorized
random_state_obj – ignored
- Returns
The colorized entity
-
-
class
trojai.datagen.static_color_xforms.
RGBAtoRGB
[source]¶ Bases:
trojai.datagen.transform_interface.Transform
Converts input Entity from RGBA to RGB
-
do
(input_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Perform the RGBA to RGB transformation
- Parameters
input_obj – the Entity to be transformed
random_state_obj – ignored
- Returns
the transformed Entity
-
-
class
trojai.datagen.static_color_xforms.
RGBtoRGBA
[source]¶ Bases:
trojai.datagen.transform_interface.Transform
Converts input Entity from RGB to RGBA
-
do
(input_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Perform the RGB to RGBA transformation
- Parameters
input_obj – the Entity to be transformed
random_state_obj – ignored
- Returns
the transformed Entity
-
-
trojai.datagen.static_color_xforms.
logger
= <Logger trojai.datagen.static_color_xforms (WARNING)>¶ Defines several transformations related to static (non-random) color manipulation
trojai.datagen.transform_interface module¶
-
class
trojai.datagen.transform_interface.
ImageTransform
[source]¶ Bases:
trojai.datagen.transform_interface.Transform
A Transform specific to ImageEntity objects
-
abstract
do
(input_obj: trojai.datagen.image_entity.ImageEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.image_entity.ImageEntity[source]¶ Perform the specified transformation
- Parameters
input_obj – the input ImageEntity to be transformed
random_state_obj – a random state used to maintain reproducibility through transformations
- Returns
the transformed ImageEntity
-
class
trojai.datagen.transform_interface.
TextTransform
[source]¶ Bases:
trojai.datagen.transform_interface.Transform
A Transform specific to TextEntity objects
-
abstract
do
(input_obj: trojai.datagen.text_entity.TextEntity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.text_entity.TextEntity[source]¶ Perform the specified transformation
- Parameters
input_obj – the input TextEntity to be transformed
random_state_obj – a random state used to maintain reproducibility through transformations
- Returns
the transformed TextEntity
-
class
trojai.datagen.transform_interface.
Transform
[source]¶ Bases:
abc.ABC
A Transform is defined as an operation on an Entity.
-
abstract
do
(input_obj: trojai.datagen.entity.Entity, random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.entity.Entity[source]¶ Perform the specified transformation
- Parameters
input_obj – the input Entity to be transformed
random_state_obj – a random state used to maintain reproducibility through transformations
- Returns
the transformed Entity
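To illustrate why every `do` method takes a random state, here is a small sketch of a randomized transform whose output is reproducible for a given seed. Everything in it is an assumption made for illustration: `SketchTransform` and `ShuffleWords` are hypothetical, the "entity" is a plain string rather than a real Entity, and Python's `random.Random` stands in for `numpy.random.RandomState` to keep the sketch dependency-free (the two generators have different method names and argument conventions).

```python
import random
from abc import ABC, abstractmethod


class SketchTransform(ABC):
    """Mirrors the documented shape of Transform: one Entity in, one out."""

    @abstractmethod
    def do(self, input_obj, random_state_obj):
        """Perform the specified transformation."""


class ShuffleWords(SketchTransform):
    """Hypothetical transform that reorders the words of a string."""

    def do(self, input_obj: str, random_state_obj: random.Random) -> str:
        words = input_obj.split()
        # All randomness comes from the caller-supplied generator, so the
        # same seed always yields the same ordering.
        random_state_obj.shuffle(words)
        return " ".join(words)


text = "the quick brown fox jumps"
first = ShuffleWords().do(text, random.Random(1234))
second = ShuffleWords().do(text, random.Random(1234))
print(first == second)  # prints True
```

Passing the generator in, rather than creating one inside the transform, is what lets a caller reproduce an entire generated dataset from a single seed.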
trojai.datagen.utils module¶
-
trojai.datagen.utils.
logger
= <Logger trojai.datagen.utils (WARNING)>¶ Contains general utilities helpful for data generation
-
trojai.datagen.utils.
process_xform_list
(input_obj: trojai.datagen.entity.Entity, xforms: Iterable[trojai.datagen.transform_interface.Transform], random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.entity.Entity[source]¶ Processes a list of transformations in a serial fashion on a copy of the input object
- Parameters
input_obj – input object which should be transformed by the list of transformations
xforms – a list of Transform objects
random_state_obj – a random state used to maintain reproducibility through transformations
- Returns
The transformed object
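The serial-application behavior described above can be sketched as follows. This is not the library's implementation: `apply_serially`, `AppendValue`, and `Reverse` are hypothetical names, the "entity" is a plain list, and the sketch assumes the copy is a deep copy.

```python
import copy
from typing import Any, Iterable


def apply_serially(input_obj: Any, xforms: Iterable[Any], random_state_obj: Any) -> Any:
    """Hypothetical stand-in for process_xform_list."""
    # Work on a copy so the caller's object is left untouched.
    current = copy.deepcopy(input_obj)
    for xform in xforms:
        # Each transform receives the output of the previous one.
        current = xform.do(current, random_state_obj)
    return current


class AppendValue:
    """Hypothetical transform that appends a value in place."""

    def __init__(self, value: int) -> None:
        self.value = value

    def do(self, input_obj: list, random_state_obj: Any) -> list:
        input_obj.append(self.value)
        return input_obj


class Reverse:
    """Hypothetical transform that reverses the list in place."""

    def do(self, input_obj: list, random_state_obj: Any) -> list:
        input_obj.reverse()
        return input_obj


original = [1, 2, 3]
result = apply_serially(original, [AppendValue(4), Reverse()], None)
print(original)  # prints [1, 2, 3]
print(result)    # prints [4, 3, 2, 1]
```

Even though both stand-in transforms mutate their argument, the original list is unchanged because the loop only ever touches the copy.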
trojai.datagen.xform_merge_pipeline module¶
-
class
trojai.datagen.xform_merge_pipeline.
XFormMerge
(xform_list: Sequence[Sequence[Sequence[trojai.datagen.transform_interface.Transform]]], merge_list: Sequence[trojai.datagen.merge_interface.Merge], final_xforms: Sequence[trojai.datagen.transform_interface.Transform] = None)[source]¶ Bases:
trojai.datagen.pipeline.Pipeline
Implements a pipeline which is a series of cascading transform and merge operations. The following diagram shows 4 objects as a series of serial transforms + merges. Each pair of transformations is considered a “stage”, and stages are processed in serial fashion. In the diagram below, the data that each stage processes is:
Stage1: obj1, obj2
Stage2: Stage1_output, obj3
Stage3: Stage2_output, obj4
This extends in the obvious way to more objects, depending on how deep the pipeline is.
obj1 --> xform  obj3 --> xform  obj4 --> xform
              \               \               \
               + --> xform --> + --> xform --> + --> xform output
              /
obj2 --> xform
-
process
(imglist: Sequence[trojai.datagen.entity.Entity], random_state_obj: numpy.random.mtrand.RandomState) → trojai.datagen.entity.Entity[source]¶ Processes the provided objects according to the Xform->Merge->Xform paradigm.
- Parameters
imglist – a sequence of Entity objects to be processed according to the pipeline
random_state_obj – a random state to pass to the transforms and merge operation to ensure reproducibility of Entities produced by the pipeline
- Returns
the modified & combined Entity object
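The cascading stage structure can be sketched with plain strings standing in for Entities. This is an illustration under stated assumptions, not the library's code: `cascade` and `run_xforms` are hypothetical helpers, each stage is assumed to supply one transform list for the running result and one for the next input object, and the random state is omitted because every stand-in operation here is deterministic.

```python
from typing import Callable, Sequence, Tuple

Xform = Callable[[str], str]
MergeFn = Callable[[str, str], str]
Stage = Tuple[Sequence[Xform], Sequence[Xform]]


def run_xforms(obj: str, xforms: Sequence[Xform]) -> str:
    """Apply each transform in order to obj."""
    for xform in xforms:
        obj = xform(obj)
    return obj


def cascade(objs: Sequence[str], xform_list: Sequence[Stage],
            merge_list: Sequence[MergeFn],
            final_xforms: Sequence[Xform] = ()) -> str:
    """Hypothetical sketch of the stage-by-stage data flow."""
    running = objs[0]
    for stage, (xforms_a, xforms_b) in enumerate(xform_list):
        # Transform the running result and the next input separately ...
        left = run_xforms(running, xforms_a)
        right = run_xforms(objs[stage + 1], xforms_b)
        # ... then merge them into the input for the next stage.
        running = merge_list[stage](left, right)
    return run_xforms(running, final_xforms)


def join(a: str, b: str) -> str:
    return "(" + a + "+" + b + ")"


out = cascade(
    ["a", "b", "c"],
    xform_list=[([str.upper], []), ([], [str.upper])],
    merge_list=[join, join],
    final_xforms=[lambda s: s + "!"],
)
print(out)  # prints ((A+b)+C)!
```

With three input objects there are two stages, which matches the nesting of `xform_list` (one pair of transform lists per stage) and the length of `merge_list` (one merge per stage).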
-
trojai.datagen.xform_merge_pipeline.
logger
= <Logger trojai.datagen.xform_merge_pipeline (WARNING)>¶ Defines all functions and classes related to the transform+merge pipeline & data movement paradigm.
-
trojai.datagen.xform_merge_pipeline.
modify_clean_image_dataset
(clean_dataset_rootdir: str, clean_csv_file: str, output_rootdir: str, output_subdir: str, mod_cfg: trojai.datagen.config.XFormMergePipelineConfig, method: str = 'insert', random_state_obj: numpy.random.mtrand.RandomState = RandomState(MT19937) at 0x7FD5B25C4AF0) → None[source]¶ Modifies a clean dataset given a configuration
- Parameters
clean_dataset_rootdir – root directory where the clean data lives
clean_csv_file – filename of the CSV file which contains information about the clean data. The modification method determines which columns and information are expected in the CSV file.
output_rootdir – root directory where the modified data will be stored
output_subdir – subdirectory where the modified data will be stored. This is expected to be one level below the root directory, and can prove useful if different types of modifications are stored in different subdirectories under the main root directory. An example tree structure might be:
root_data
  - modification_1
      … data …
  - modification_2
      … data …
mod_cfg – A configuration object for creating a modified dataset
method – Can only be “insert”. In the insert method, the function takes the clean image and inserts a specified Entity (likely a pattern) into it. Additional modes are to be added.
random_state_obj – RandomState object to ensure reproducibility of the dataset
- Returns
None
-
trojai.datagen.xform_merge_pipeline.
modify_clean_text_dataset
(clean_dataset_rootdir: str, clean_csv_file: str, output_rootdir: str, output_subdir: str, mod_cfg: trojai.datagen.config.XFormMergePipelineConfig, method='insert', random_state_obj: numpy.random.mtrand.RandomState = RandomState(MT19937) at 0x7FD5B25C4C00) → None[source]¶ Modifies a clean text dataset given a configuration
- Parameters
clean_dataset_rootdir – root directory where the clean data lives
clean_csv_file – filename of the CSV file which contains information about the clean data. The modification method determines which columns and information are expected in the CSV file.
output_rootdir – root directory where the modified data will be stored
output_subdir – subdirectory where the modified data will be stored. This is expected to be one level below the root directory, and can prove useful if different types of modifications are stored in different subdirectories under the main root directory. An example tree structure might be:
root_data
  - modification_1
      … data …
  - modification_2
      … data …
mod_cfg – A configuration object for creating a modified dataset
method – Can only be “insert”. In the insert method, the function takes the clean text blurb and inserts a specified TextEntity (likely a pattern) into the first text input object.
random_state_obj – RandomState object to ensure reproducibility of the dataset
- Returns
None
-
trojai.datagen.xform_merge_pipeline.
subset_clean_df_by_labels
(df, labels_to_include)[source]¶ Subsets a dataframe with an expected column ‘label’, keeping only the rows whose label is in the list of labels to include
- Parameters
df – the dataframe to subset
labels_to_include – a list of labels to include, or the string ‘all’ indicating that everything should be kept
- Returns
the subsetted data frame
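The label-based subsetting described above can be sketched with pandas. `subset_by_labels` is a hypothetical re-implementation written for illustration, and the `file` column in the example data is invented; only the `label` column and the `'all'` shortcut come from the description.

```python
import pandas as pd


def subset_by_labels(df: pd.DataFrame, labels_to_include) -> pd.DataFrame:
    """Hypothetical sketch of subsetting a dataframe on its 'label' column."""
    # The string 'all' means that no filtering is applied.
    if isinstance(labels_to_include, str) and labels_to_include == "all":
        return df
    # Keep only the rows whose label appears in the requested list.
    return df[df["label"].isin(labels_to_include)]


df = pd.DataFrame({"file": ["a.png", "b.png", "c.png"], "label": [0, 1, 2]})
subset = subset_by_labels(df, [0, 2])
print(subset["file"].tolist())  # prints ['a.png', 'c.png']
```

The `isinstance` check comes first so that a string is never passed to `isin`, which expects a list-like argument.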