butterfree.hooks package

Submodules

Hook abstract class entity.

class butterfree.hooks.hook.Hook

Bases: abc.ABC

Definition of a hook function to call on a Dataframe.

abstract run(dataframe: pyspark.sql.dataframe.DataFrame) → pyspark.sql.dataframe.DataFrame

Run interface for Hook.

Parameters

dataframe – dataframe to use in the Hook.

Returns

dataframe result from the Hook.
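As an illustration of the interface above, a concrete hook only needs to implement run. The sketch below is a minimal one under stated assumptions: it uses a stand-in Hook class with the same shape as the documented one (so it runs without Butterfree or Spark installed), a list of dicts in place of a Spark DataFrame, and a hypothetical DropIncompleteRowsHook that is not part of Butterfree.

```python
from abc import ABC, abstractmethod


class Hook(ABC):
    """Stand-in with the documented shape: one abstract run(dataframe) method.

    Assumption: with Butterfree installed you would subclass
    butterfree.hooks.Hook instead of this local class.
    """

    @abstractmethod
    def run(self, dataframe):
        """Receive a dataframe and return a dataframe."""


class DropIncompleteRowsHook(Hook):
    """Hypothetical hook: drop rows that contain a missing value."""

    def run(self, dataframe):
        # Stand-in data: a list of dicts. With a real Spark DataFrame the
        # body could instead be `return dataframe.dropna()`.
        return [
            row
            for row in dataframe
            if all(value is not None for value in row.values())
        ]


rows = [{"id": 1, "name": "a"}, {"id": 2, "name": None}]
cleaned = DropIncompleteRowsHook().run(rows)
```

Because run is abstract, the base class itself cannot be instantiated; only subclasses that implement run can be used as hooks.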

Definition of hookable component.

class butterfree.hooks.hookable_component.HookableComponent

Bases: object

Defines a component with the ability to hold pre and post hook functions.

All main modules of Butterfree share a common object that enables their integration: dataframes. Spark’s dataframe is the glue that carries data between the main modules. Hooks have a simple interface: they are functions that accept a dataframe and output a dataframe. These hooks can be triggered before or after the main execution of a component.

Butterfree components that inherit from HookableComponent can define a series of steps to run before or after their main functionality.

pre_hooks

function steps to trigger before component main functionality.

post_hooks

function steps to trigger after component main functionality.

enable_pre_hooks

property to indicate if the component can define pre_hooks.

enable_post_hooks

property to indicate if the component can define post_hooks.
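The ordering described above (pre-hooks, then the main functionality, then post-hooks) can be sketched with plain functions. This is only an illustration of the idea, not Butterfree's implementation: run_with_hooks and trace are hypothetical helpers, and a Python list stands in for the dataframe.

```python
def run_with_hooks(dataframe, main, pre_hooks=(), post_hooks=()):
    """Apply pre-hooks, then the main step, then post-hooks, in that order."""
    for hook in pre_hooks:
        dataframe = hook(dataframe)
    dataframe = main(dataframe)
    for hook in post_hooks:
        dataframe = hook(dataframe)
    return dataframe


calls = []


def trace(name):
    """Build a pass-through step that records when it was called."""

    def step(dataframe):
        calls.append(name)
        return dataframe

    return step


result = run_with_hooks(
    [1, 2, 3],
    main=trace("main"),
    pre_hooks=[trace("pre")],
    post_hooks=[trace("post")],
)
# calls now records the execution order of the three steps.
```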

add_post_hook(*hooks: butterfree.hooks.hook.Hook) → butterfree.hooks.hookable_component.HookableComponent

Add post-hook steps to the component.

Parameters

hooks – Hook steps to add to the post_hooks list.

Returns

Component with the hooks appended to its post_hooks list.

Raises

ValueError – if the component does not accept post-hooks.

add_pre_hook(*hooks: butterfree.hooks.hook.Hook) → butterfree.hooks.hookable_component.HookableComponent

Add pre-hook steps to the component.

Parameters

hooks – Hook steps to add to the pre_hooks list.

Returns

Component with the hooks appended to its pre_hooks list.

Raises

ValueError – if the component does not accept pre-hooks.
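Since both methods accept several hooks and return the component, calls can be chained. The sketch below mimics that documented behavior with a stand-in class so that it runs without Butterfree installed. SketchComponent and NoOpHook are hypothetical names, the constructor flags are an assumption made for the example, and the internals are a guess at one reasonable implementation rather than the library's code.

```python
class SketchComponent:
    """Stand-in that mirrors the documented add_pre_hook/add_post_hook contract."""

    def __init__(self, enable_pre_hooks=True, enable_post_hooks=True):
        # Assumption: flags set in the constructor purely for illustration.
        self.enable_pre_hooks = enable_pre_hooks
        self.enable_post_hooks = enable_post_hooks
        self.pre_hooks = []
        self.post_hooks = []

    def add_pre_hook(self, *hooks):
        if not self.enable_pre_hooks:
            raise ValueError("This component does not accept pre-hooks.")
        self.pre_hooks.extend(hooks)
        return self  # returning the component allows chaining

    def add_post_hook(self, *hooks):
        if not self.enable_post_hooks:
            raise ValueError("This component does not accept post-hooks.")
        self.post_hooks.extend(hooks)
        return self


class NoOpHook:
    """Hypothetical hook that returns its input unchanged."""

    def run(self, dataframe):
        return dataframe


first, second, third = NoOpHook(), NoOpHook(), NoOpHook()
component = SketchComponent().add_pre_hook(first, second).add_post_hook(third)

# A component that disables post-hooks rejects them with ValueError.
try:
    SketchComponent(enable_post_hooks=False).add_post_hook(third)
    rejected = False
except ValueError:
    rejected = True
```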

property enable_post_hooks

Property to indicate if the component can define post_hooks.

property enable_pre_hooks

Property to indicate if the component can define pre_hooks.

property post_hooks

Function steps to trigger after component main functionality.

property pre_hooks

Function steps to trigger before component main functionality.

run_post_hooks(dataframe: pyspark.sql.dataframe.DataFrame) → pyspark.sql.dataframe.DataFrame

Run all defined post-hook steps from a given dataframe.

Parameters

dataframe – data to input in the defined post-hook steps.

Returns

dataframe after passing through all defined post-hooks.

run_pre_hooks(dataframe: pyspark.sql.dataframe.DataFrame) → pyspark.sql.dataframe.DataFrame

Run all defined pre-hook steps from a given dataframe.

Parameters

dataframe – data to input in the defined pre-hook steps.

Returns

dataframe after passing through all defined pre-hooks.
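Running a list of hooks amounts to feeding the output of each hook into the next. The following is a minimal sketch of that idea, assuming hooks run in the order they were added (the text above does not state the order explicitly). run_hooks and AddColumnHook are hypothetical names, and a list of dicts stands in for a Spark DataFrame.

```python
from functools import reduce


def run_hooks(hooks, dataframe):
    """Thread the dataframe through every hook, in list order."""
    return reduce(lambda current, hook: hook.run(current), hooks, dataframe)


class AddColumnHook:
    """Hypothetical hook: add a constant column to every row."""

    def __init__(self, name, value):
        self.name = name
        self.value = value

    def run(self, dataframe):
        # Build new rows instead of mutating the input.
        return [{**row, self.name: self.value} for row in dataframe]


result = run_hooks(
    [AddColumnHook("source", "raw"), AddColumnHook("checked", True)],
    [{"id": 1}],
)
```

With an empty hook list, run_hooks returns the input unchanged, which matches the intuition that a component without hooks behaves as if none were defined.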

Module contents

Holds Hooks definitions.

class butterfree.hooks.Hook

Bases: abc.ABC

Definition of a hook function to call on a Dataframe.

abstract run(dataframe: pyspark.sql.dataframe.DataFrame) → pyspark.sql.dataframe.DataFrame

Run interface for Hook.

Parameters

dataframe – dataframe to use in the Hook.

Returns

dataframe result from the Hook.

class butterfree.hooks.HookableComponent

Bases: object

Defines a component with the ability to hold pre and post hook functions.

All main modules of Butterfree share a common object that enables their integration: dataframes. Spark’s dataframe is the glue that carries data between the main modules. Hooks have a simple interface: they are functions that accept a dataframe and output a dataframe. These hooks can be triggered before or after the main execution of a component.

Butterfree components that inherit from HookableComponent can define a series of steps to run before or after their main functionality.

pre_hooks

function steps to trigger before component main functionality.

post_hooks

function steps to trigger after component main functionality.

enable_pre_hooks

property to indicate if the component can define pre_hooks.

enable_post_hooks

property to indicate if the component can define post_hooks.

add_post_hook(*hooks: butterfree.hooks.hook.Hook) → butterfree.hooks.hookable_component.HookableComponent

Add post-hook steps to the component.

Parameters

hooks – Hook steps to add to the post_hooks list.

Returns

Component with the hooks appended to its post_hooks list.

Raises

ValueError – if the component does not accept post-hooks.

add_pre_hook(*hooks: butterfree.hooks.hook.Hook) → butterfree.hooks.hookable_component.HookableComponent

Add pre-hook steps to the component.

Parameters

hooks – Hook steps to add to the pre_hooks list.

Returns

Component with the hooks appended to its pre_hooks list.

Raises

ValueError – if the component does not accept pre-hooks.

property enable_post_hooks

Property to indicate if the component can define post_hooks.

property enable_pre_hooks

Property to indicate if the component can define pre_hooks.

property post_hooks

Function steps to trigger after component main functionality.

property pre_hooks

Function steps to trigger before component main functionality.

run_post_hooks(dataframe: pyspark.sql.dataframe.DataFrame) → pyspark.sql.dataframe.DataFrame

Run all defined post-hook steps from a given dataframe.

Parameters

dataframe – data to input in the defined post-hook steps.

Returns

dataframe after passing through all defined post-hooks.

run_pre_hooks(dataframe: pyspark.sql.dataframe.DataFrame) → pyspark.sql.dataframe.DataFrame

Run all defined pre-hook steps from a given dataframe.

Parameters

dataframe – data to input in the defined pre-hook steps.

Returns

dataframe after passing through all defined pre-hooks.