hevslib.pandas module
Pandas Functions
-
class hevslib.pandas.DataframeList(filename=None)
Bases: object
Create a new DataframeList
- Parameters
filename – path of an existing .h5 file to load, if any
- Returns
DataframeList
- Raises
None –
-
addDf(name=None, description=None, df=None, log=None)
Add a new dataframe to the DataframeList
- Parameters
name – name of the DataFrame
description – description of the DataFrame
df – the DataFrame itself
- Returns
None
- Raises
None –
-
exportToH5(filename, verbose=True)
Export to a .h5 file
- Parameters
filename – path and name of the .h5 file to write
- Returns
None
- Raises
None –
-
getDfFromName(name, log=None)
Get a specific dataframe given its name
- Parameters
name – name of the dataframe
- Returns
the dataframe
- Return type
Pandas Dataframe
- Raises
None –
-
getInfo(verbose=False)
Get/Display info about this DataframeList
- Parameters
None –
- Returns
dataframe containing info about the dataframeList
- Return type
Pandas Dataframe
- Raises
None –
-
hevslib.pandas.absoluteNegValues(df, columns)
Take the absolute value of the specified columns
- Parameters
df – pandas dataframe
columns – list of columns to process
- Returns
dataframe with absolute values
- Return type
Pandas Dataframe
- Raises
None –
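As an illustration only, the effect described for absoluteNegValues can be approximated in plain pandas. This is a sketch, not the hevslib source: the helper name absolute_values and the choice to work on a copy are assumptions.

```python
import pandas as pd


def absolute_values(df, columns):
    """Return a copy of df where the given columns hold absolute values."""
    out = df.copy()                      # assumption: the input is left untouched
    out[columns] = out[columns].abs()    # element-wise absolute value
    return out


df = pd.DataFrame({"delta": [-3, 2, -1], "label": ["a", "b", "c"]})
result = absolute_values(df, ["delta"])
print(result)
```

Only the listed columns are touched, so non-numeric columns such as label can stay in the dataframe.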
-
hevslib.pandas.areTwins(item1, item2, columnsToCompare=None, verbose=False, log=None)
Compare two Series objects and return 1 if they are twins, otherwise 0
- Parameters
item1 – pandas series to compare
item2 – pandas series to compare
columnsToCompare – columns to use for the comparison
- Returns
variable indicating if items are twins
- Return type
boolean
- Raises
None –
-
hevslib.pandas.cleanDf(df, verbose=True)
Cleans a pandas dataframe:
- removes duplicates
- removes finite columns (columns with only one value)
- removes NaN
- Parameters
df – pandas dataframe
verbose – bool give some informational output
- Returns
cleaned dataframe
- Return type
Pandas Dataframe
- Raises
None –
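The three cleaning steps listed for cleanDf can be sketched with standard pandas calls. The order of the steps, the helper name clean_sketch, and the treatment of NaN as a distinct value when detecting single-value columns are assumptions; the actual hevslib implementation may differ.

```python
import numpy as np
import pandas as pd


def clean_sketch(df):
    """Drop duplicate rows, single-value columns, then rows containing NaN."""
    out = df.drop_duplicates()
    # keep only columns that hold more than one distinct value
    out = out.loc[:, out.nunique(dropna=False) > 1]
    # drop every remaining row that contains at least one NaN
    return out.dropna()


df = pd.DataFrame({
    "id": [1, 1, 2, 3],
    "unit": ["s", "s", "s", "s"],      # constant column
    "value": [0.5, 0.5, np.nan, 2.0],
})
cleaned = clean_sketch(df)
print(cleaned)
```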
-
hevslib.pandas.convertSecToTimeDuration(df, columns, verbose=False, log=None)
Convert seconds to time durations in a dataframe
- Parameters
df – pandas dataframe
columns – list of existing columns that we want to process
- Returns
dataframe with new converted columns
- Return type
Pandas Dataframe
- Raises
None –
-
hevslib.pandas.convertTimeDurationToSec(df, columns, verbose=True, log=None)
Convert time durations to seconds in a dataframe
- Parameters
df – pandas dataframe
columns – list of existing columns that we want to process
- Returns
dataframe with new converted columns
- Return type
Pandas Dataframe
- Raises
None –
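The two conversions above are inverses of each other. A minimal plain-pandas sketch follows, assuming that "time duration" means a pandas timedelta and that the columns are converted in place on a copy. The documented functions speak of "new converted columns", so they may add columns instead; the helper names here are hypothetical.

```python
import pandas as pd


def seconds_to_duration(df, columns):
    """Convert numeric second counts to timedelta values."""
    out = df.copy()
    for col in columns:
        out[col] = pd.to_timedelta(out[col], unit="s")
    return out


def duration_to_seconds(df, columns):
    """Convert timedelta values back to float seconds."""
    out = df.copy()
    for col in columns:
        out[col] = out[col].dt.total_seconds()
    return out


df = pd.DataFrame({"wait": [90, 3600, 0]})
as_duration = seconds_to_duration(df, ["wait"])
round_trip = duration_to_seconds(as_duration, ["wait"])
print(as_duration)
print(round_trip)
```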
-
hevslib.pandas.countNaN(df, verbose=False)
Count all NaN cells in a dataframe
- Parameters
df – pandas dataframe to analyse
verbose – bool give some informational output
- Returns
number of cells with a NaN Value
- Return type
int
- Raises
None –
-
hevslib.pandas.countRowWithNaN(df, verbose=False)
Count rows with at least one NaN value
- Parameters
df – pandas dataframe to analyse
verbose – bool give some informational output
- Returns
number of rows with a NaN Value
- Return type
int
- Raises
None –
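The difference between counting NaN cells (countNaN) and counting rows that contain a NaN (countRowWithNaN) is easy to see on a small example. The snippet below uses only pandas calls and is an equivalent sketch, not the hevslib code.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, 6.0],
})

# total number of NaN cells across the whole dataframe
nan_cells = int(df.isna().sum().sum())

# number of rows holding at least one NaN
rows_with_nan = int(df.isna().any(axis=1).sum())

print(nan_cells, rows_with_nan)
```

Here three cells are NaN but they sit in only two rows, so the two counts differ.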
-
hevslib.pandas.dfInfo(df_name, df_description, df, verbose=True)
Display info about one dataframe
- Parameters
df_name – string shortname of the dataframe
df_description – string description of the dataframe
df – pandas dataframe
verbose – bool give some informational output
- Returns
None
- Raises
None –
-
hevslib.pandas.dfInfoAppend(df_information, df_name, df_description, df, verbose=False)
Append dataframe information to a pandas information table
- Parameters
df_information – existing information table
df_name – string shortname of the dataframe
df_description – string description of the dataframe
df – pandas dataframe
verbose – bool give some informational output
- Returns
None
- Raises
None –
-
hevslib.pandas.dfsInfo(dfs_name, dfs_description, dfs, verbose=False)
Display info about multiple dataframes
- Parameters
dfs_name – list of strings with the shortname of each dataframe
dfs_description – list of strings with the description of each dataframe
dfs – list of pandas dataframes
verbose – bool give some informational output
- Returns
None
- Raises
None –
-
hevslib.pandas.displayDiff(df, index1, index2, columns=None)
Display the difference between two elements (rows) of the dataframe
- Parameters
df – pandas dataframe
index1 – index of the first item we want to compare
index2 – index of the second item we want to compare
columns – list of columns to display
- Returns
None
- Raises
None –
-
hevslib.pandas.displayEntryOccurences(df, columns=None, showOccurencesWhen=None)
Display occurrences of selected columns
- Parameters
df – pandas dataframe
columns – list of columns to display, None for all columns of df
showOccurencesWhen – int filter selecting which occurrence count to display, None to disable filtering
- Returns
None
- Raises
None –
-
hevslib.pandas.displayNegTimes(df, column_t1, column_t2, column_deltatime)
Display negative time values of given columns
- Parameters
df – pandas dataframe
column_t1 – datetime first time to display
column_t2 – datetime second time to display
column_deltatime – deltatime to search for negative values
- Returns
dataframe of only selected columns and negative times
- Return type
Pandas Dataframe
- Raises
None –
-
hevslib.pandas.displaySummary(df, columns=None, verbose=True)
Display a summary of the dataframe
- Parameters
df – pandas dataframe
columns – list of columns to display
- Returns
df_summary
- Raises
None –
-
hevslib.pandas.fillNaNToZero(df, columns, verbose=True)
Fill all NaN values with 0 for given columns
- Parameters
df – pandas dataframe with NaN values
columns – list of df columns to search for NaN
verbose – bool give some informational output
- Returns
dataframe with NaN filled by Zeros for selected columns
- Return type
Pandas Dataframe
- Raises
None –
-
hevslib.pandas.fillNegTime(df, column, verbose=True)
Fill all negative time values with zero time for given columns
- Parameters
df – pandas dataframe with negative time values
column – list of df columns to search for negative times
verbose – bool give some informational output
- Returns
dataframe with negative times filled by zeros for selected columns
- Return type
Pandas Dataframe
- Raises
None –
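A plain-pandas sketch of replacing negative durations with a zero duration is shown below. It assumes the column holds timedelta values and handles a single column name; the helper name fill_negative_time is hypothetical and this is not the hevslib implementation.

```python
import pandas as pd


def fill_negative_time(df, column):
    """Replace negative timedelta values in one column with a zero duration."""
    out = df.copy()
    zero = pd.Timedelta(0)
    # mask() replaces values where the condition is True; NaT compares False
    # and is therefore left unchanged
    out[column] = out[column].mask(out[column] < zero, zero)
    return out


df = pd.DataFrame({"wait": pd.to_timedelta([30, -5, 0], unit="s")})
filled = fill_negative_time(df, "wait")
print(filled)
```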
-
hevslib.pandas.fillZeroToNaN(df, columns, verbose=True)
Fill all 0 values with NaN for given columns
- Parameters
df – pandas dataframe with zero values
columns – list of df columns to search for zeros
verbose – bool give some informational output
- Returns
dataframe with zeros filled by NaN for selected columns
- Return type
Pandas Dataframe
- Raises
None –
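fillNaNToZero and fillZeroToNaN describe opposite replacements restricted to selected columns. The sketch below shows equivalent plain-pandas operations; the helper names are hypothetical and returning a copy is an assumption.

```python
import numpy as np
import pandas as pd


def fill_nan_to_zero(df, columns):
    """Replace NaN with 0 in the given columns only."""
    out = df.copy()
    out[columns] = out[columns].fillna(0)
    return out


def fill_zero_to_nan(df, columns):
    """Replace 0 with NaN in the given columns only."""
    out = df.copy()
    out[columns] = out[columns].replace(0, np.nan)
    return out


df = pd.DataFrame({"x": [1.0, np.nan, 0.0], "y": [np.nan, 0.0, 2.0]})
zeros = fill_nan_to_zero(df, ["x"])
nans = fill_zero_to_nan(df, ["x"])
print(zeros)
print(nans)
```

Column y is not listed, so its NaN and 0 entries are left as they are in both results.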
-
hevslib.pandas.filterByMonth(df, column, addMonth=0, date=None, verbose=False)
Filter a dataframe by a month in a given column
- Parameters
df – pandas dataframe
column – column with datetime entries
addMonth – int number of months to jump into the past or future
date – datetime object whose year and month are used
verbose – bool give some informational output
- Returns
dataframe with filtered data
- Return type
Pandas Dataframe
- Raises
None –
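One way to express "filter by the month of a reference date, shifted by a number of months" in plain pandas is with monthly periods. The sketch below makes several assumptions that the documentation does not confirm: a negative offset moves into the past, date=None means the current month, and the helper name filter_by_month_sketch is hypothetical.

```python
import pandas as pd


def filter_by_month_sketch(df, column, add_month=0, date=None):
    """Keep rows whose datetime in `column` falls in the target month."""
    reference = pd.Timestamp(date) if date is not None else pd.Timestamp.now()
    # monthly period of the reference date, shifted by add_month months
    target = reference.to_period("M") + add_month
    return df[df[column].dt.to_period("M") == target]


df = pd.DataFrame({
    "start": pd.to_datetime(
        ["2021-01-15", "2021-02-03", "2021-02-27", "2021-03-01"]
    ),
    "value": [1, 2, 3, 4],
})
february = filter_by_month_sketch(df, "start", date="2021-02-10")
january = filter_by_month_sketch(df, "start", add_month=-1, date="2021-02-10")
print(february)
print(january)
```

The same pattern with the period frequency "W" would give a weekly filter.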
-
hevslib.pandas.filterByWeek(df, column, addWeek=0, date=None, verbose=False)
Filter a dataframe by a week in a given column
- Parameters
df – pandas dataframe
column – column with datetime entries
addWeek – int number of weeks to jump into the past or future
date – datetime object where year and month are used
verbose – bool give some informational output
- Returns
dataframe with filtered data
- Return type
Pandas Dataframe
- Raises
None –
-
hevslib.pandas.filterRows(df, filter, type='eq', verbose=False, log=None)
Filter dataframe by filter criteria (keep values defined in filter criteria)
- Parameters
df – pandas dataframe
filter – list ["<column>", [<filtervalue_1>, <filtervalue_2>]]
type – string ("eq"|"neq"|"lt"|"lte"|"gt"|"gte")
verbose – bool give some informational output
- Returns
dataframe with filtered data
- Return type
Pandas Dataframe
- Raises
None –
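The filter format ["<column>", [values]] combined with the six comparison types can be sketched as follows. This is a guess at the semantics, not the hevslib code: "eq" and "neq" are treated as membership tests against all listed values, while the ordering comparisons use only the first listed value. The names filter_rows_sketch, flt and kind are hypothetical.

```python
import operator

import pandas as pd

# ordering comparisons, keyed by the documented type strings
COMPARATORS = {
    "lt": operator.lt,
    "lte": operator.le,
    "gt": operator.gt,
    "gte": operator.ge,
}


def filter_rows_sketch(df, flt, kind="eq"):
    """Keep the rows of df that satisfy the filter [column, [values]]."""
    column, values = flt
    if kind == "eq":
        mask = df[column].isin(values)
    elif kind == "neq":
        mask = ~df[column].isin(values)
    elif kind in COMPARATORS:
        # assumption: ordering comparisons use the first value only
        mask = COMPARATORS[kind](df[column], values[0])
    else:
        raise ValueError(f"unknown filter type: {kind!r}")
    return df[mask]


df = pd.DataFrame({
    "city": ["Sion", "Bern", "Sion", "Geneva"],
    "n": [1, 5, 7, 3],
})
only_sion = filter_rows_sketch(df, ["city", ["Sion"]])
others = filter_rows_sketch(df, ["city", ["Sion", "Bern"]], kind="neq")
at_least_5 = filter_rows_sketch(df, ["n", [5]], kind="gte")
print(only_sion, others, at_least_5, sep="\n")
```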
-
hevslib.pandas.findTwins(df, columns=None, verbose=False)
Find twins in a dataframe (rows with same values)
- Parameters
df – pandas dataframe
columns – list of dataframe columns to use to find twins
- Returns
dataframe with new column containing the list of twins
list of the index of all the twins in the dataframe
- Return type
tuple(Pandas Dataframe, list)
- Raises
None –
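Rows that share the same values in a chosen set of columns can be located in plain pandas with duplicated. The snippet is a simplified sketch: it only flags twins and collects their index, whereas findTwins is documented as also adding a column that lists the twins of each row.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["a", "b", "a", "c"],
    "width": [1, 2, 1, 3],
    "note": ["x", "y", "z", "w"],
})

# keep=False marks every member of a duplicate group, not just the repeats
is_twin = df.duplicated(subset=["name", "width"], keep=False)
twin_index = df.index[is_twin].tolist()

print(twin_index)
```

The note column is ignored in the comparison, so rows 0 and 2 count as twins even though their notes differ.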
-
hevslib.pandas.fixTypes(df, dtypes, verbose=True, log=None)
Changes types of columns
- Parameters
df – pandas input table with set of given columns
dtypes – dictionary of types for the table columns
verbose – bool give some informational output
- Returns
dataframe with changed columns types
- Return type
Pandas Dataframe
- Raises
None –
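A column-to-type dictionary of the kind described for dtypes maps directly onto pandas astype. The example below is a plain-pandas sketch with made-up column names; whether fixTypes accepts exactly the same type strings is an assumption.

```python
import pandas as pd

df = pd.DataFrame({"id": ["1", "2"], "price": ["1.5", "2.0"]})

# one target type per column name
dtypes = {"id": "int64", "price": "float64"}
fixed = df.astype(dtypes)

print(fixed.dtypes)
```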
-
hevslib.pandas.getSummary(df, columns=None)
Compute a summary of the dataframe, like describe() but transposed and with different columns
- Parameters
df – pandas dataframe
columns – list of columns to display
- Returns
summary
- Return type
Pandas Dataframe
- Raises
None –
-
hevslib.pandas.keepColumns(df, columns_keep, text=None, verbose=True)
Keeps only the specified columns
- Parameters
df – pandas dataframe
columns_keep – list of columns to keep
text – string to print
verbose – bool give some informational output
- Returns
dataframe with selected columns
- Return type
Pandas Dataframe
- Raises
None –
-
hevslib.pandas.listUniqueValues(df, columns)
Display unique values of given columns
- Parameters
df – pandas dataframe
columns – list of columns to display
- Returns
None
- Raises
None –
-
hevslib.pandas.removeColumns(df, columns, text=None, verbose=True, log=None)
Removes all specified columns
- Parameters
df – pandas dataframe
columns – list of columns to remove
text – string to print
verbose – bool give some informational output
- Returns
dataframe without the specified columns
- Return type
Pandas Dataframe
- Raises
None –
-
hevslib.pandas.removeColumnsLessThan(df, minRowNbr=1000, verbose=True)
Removes all columns with fewer than minRowNbr non-NaN values
- Parameters
df – pandas dataframe
minRowNbr – int minimum number of non-NaN values a column needs in order to be kept
verbose – bool give some informational output
- Returns
dataframe without these columns
- Return type
Pandas Dataframe
- Raises
None –
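Dropping columns that have too few non-NaN values corresponds to the thresh argument of pandas dropna. The sketch below is an equivalent plain-pandas operation on a toy dataframe, with a small threshold in place of the default of 1000; it is not the hevslib source.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "full": [1.0, 2.0, 3.0, 4.0],
    "half": [1.0, np.nan, 2.0, np.nan],
    "sparse": [np.nan, np.nan, 1.0, np.nan],
})

min_row_nbr = 2
# axis=1 drops columns; thresh is the number of non-NaN values required to keep one
kept = df.dropna(axis=1, thresh=min_row_nbr)

print(list(kept.columns))
```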
-
hevslib.pandas.removeDuplicates(df, verbose=True)
Removes all duplicates from a table
- Parameters
df – pandas dataframe
verbose – bool give some informational output
- Returns
dataframe without duplicates
- Return type
Pandas Dataframe
- Raises
None –
-
hevslib.pandas.removeFiniteColumns(df, verbose=True)
Removes all columns with only one value
- Parameters
df – pandas dataframe
verbose – bool give some informational output
- Returns
dataframe without the finite columns
- Return type
Pandas Dataframe
- Raises
None –
-
hevslib.pandas.removeNaN(df, column=None, verbose=True)
Removes all NaN values from a pandas dataframe
- Parameters
df – pandas dataframe with NaN values
column – list of columns in which NaN will be removed; if None, remove NaN in all columns
verbose – bool give some informational output
- Returns
dataframe without any NaN values
- Return type
Pandas Dataframe
- Raises
None –
-
hevslib.pandas.reorderColumns(df, columns, log=None)
Reorder columns of a dataframe
- Parameters
df – pandas dataframe
columns – list of dataframe columns in the order we want them to appear in the df
- Returns
dataframe with ordered columns
- Return type
Pandas Dataframe
- Raises
None –
-
hevslib.pandas.saveDfCsv(df, name, outputDir)
Export a dataframe to a csv file
- Parameters
df – dataframe to export
name – name of the csv file
outputDir – directory path where we want to export
- Returns
None
- Raises
None –
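Exporting a dataframe to a named file inside an output directory can be sketched with to_csv. The following is only an approximation under stated assumptions: name is taken to already include the .csv extension, the directory is created if missing, the index is not written, and the path is returned for convenience (the documented function returns None). The helper name save_csv_sketch is hypothetical.

```python
import os
import tempfile

import pandas as pd


def save_csv_sketch(df, name, output_dir):
    """Write df to output_dir/name as CSV and return the resulting path."""
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, name)
    df.to_csv(path, index=False)
    return path


df = pd.DataFrame({"a": [1, 2]})
with tempfile.TemporaryDirectory() as out_dir:
    path = save_csv_sketch(df, "example.csv", out_dir)
    reloaded = pd.read_csv(path)   # read back while the directory still exists

print(reloaded)
```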
-
hevslib.pandas.testNaT(df, columns, verbose=1)
Check whether not-a-time (NaT) values exist in a pandas dataframe
- Parameters
df – pandas dataframe
columns – list of columns to search
verbose – int give some informational output (1|2)
- Returns
variable indicating whether there are NaT values
- Return type
Bool
- Raises
None –
-
hevslib.pandas.testNegTime(df, columns, verbose=1)
Check whether negative time values exist in a pandas dataframe
- Parameters
df – pandas dataframe
columns – list of columns to search
verbose – int give some informational output (1|2|3)
- Returns
variable indicating whether there are negative time values
- Return type
Bool
- Raises
None –
-
hevslib.pandas.testNull(df, columns, verbose=1, log=None)
Check whether null (NaN) values exist in a pandas dataframe
- Parameters
df – pandas dataframe
columns – list of columns to search
verbose – bool give some informational output
- Returns
variable indicating whether there are null values
- Return type
Bool
- Raises
None –
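The three checks testNaT, testNegTime and testNull each reduce to a boolean answer about selected columns. The plain-pandas expressions below sketch equivalent checks on a toy dataframe; the column names are made up and the hevslib functions may report more detail through verbose.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "start": pd.to_datetime(["2021-01-01 10:00", None, "2021-01-01 12:00"]),
    "wait": pd.to_timedelta([30, -5, 0], unit="s"),
    "score": [1.0, np.nan, 3.0],
})

# NaT in datetime columns is reported by isna(), like NaN in numeric ones
has_nat = bool(df[["start"]].isna().any().any())

# negative durations compare below a zero timedelta
has_neg_time = bool((df["wait"] < pd.Timedelta(0)).any())

# NaN in the numeric column
has_null = bool(df[["score"]].isna().any().any())

print(has_nat, has_neg_time, has_null)
```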