data module

This module imports participant data for use in fitting.

Author: Dominic Hunt
class data.Data(participants, participantID='ID', choices='actions', feedbacks='feedbacks', stimuli=None, action_options=None, process_data_function=None)[source]

Bases: list

extend(iterable)[source]

Combines two Data instances into one

Parameters: iterable (Data instance or list of participant dicts) – The participant data to add to this instance
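
Since Data derives from list, combining two instances can be pictured as in-place list concatenation. The following is a minimal sketch in which plain lists stand in for Data instances; the keys 'ID', 'actions' and 'feedbacks' follow the defaults in the constructor signature, and the values are invented for illustration.

```python
# Plain lists stand in for Data instances; each element is one participant's dict.
first = [{"ID": "p1", "actions": [0, 1, 1], "feedbacks": [1, 0, 1]}]
second = [{"ID": "p2", "actions": [1, 1, 0], "feedbacks": [0, 1, 1]}]

# list.extend appends every participant from `second` to `first` in place.
first.extend(second)

participant_ids = [participant["ID"] for participant in first]
print(participant_ids)  # ['p1', 'p2']
```

Data.extend overrides the list method, so it may perform additional checks that this sketch does not reproduce.
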
classmethod from_csv(folder='./', file_name_filter=None, terminal_ID=True, split_by=None, participantID=None, choices='actions', feedbacks='feedbacks', stimuli=None, action_options=None, group_by=None, extra_processing=None, csv_read_options=None)[source]

Import data from a folder full of .csv files, where each file contains the information of one participant

Parameters:
  • folder (string, optional) – The folder where the data can be found. Default is the current folder.
  • file_name_filter (callable, string, list of strings or None, optional) – A function to process the file names or a list of possible prefixes as strings or a single string. Default None, no file names removed
  • terminal_ID (bool, optional) – Is there an ID number at the end of the filename? If not then a more general search will be performed. Default True
  • split_by (string or list of strings, optional) – If multiple participant datasets are in one file sheet, this specifies the column or columns that can distinguish and identify the rows for each participant. Default None
  • participantID (string, optional) – The dict key where the participant ID can be found. Default None, which results in the file name being used.
  • choices (string, optional) – The dict key where the participant choices can be found. Default 'actions'
  • feedbacks (string, optional) – The dict key where the feedbacks the participant received can be found. Default 'feedbacks'
  • stimuli (string or list of strings, optional) – The dict keys where the stimulus cues for each trial can be found. Default None
  • action_options (string or list of strings or None or one-element list containing a list, optional) – If a string or list of strings, these are treated as dict keys where the valid actions for each trial can be found. If None, all trials will use all available actions. If the list contains one list, it will be treated as a list of valid actions for each trial step. Default None
  • group_by (list of strings, optional) – A list of parts of filenames that are repeated across participants, identifying all the files that should be grouped together to form one participant's data. The rest of the filename is assumed to identify the participant. Default is None
  • extra_processing (callable, optional) – A function that modifies the dictionary of data read for each participant so that it is appropriate for fitting. Default is None
  • csv_read_options (dict, optional) – The keyword arguments for pandas.read_csv. Default {}
Returns:

Data

Return type:

Data class instance

See also

pandas.read_csv()
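
The effect of split_by can be illustrated without the module itself. The sketch below uses only the standard library; the column name 'subject', the sheet contents and the helper split_rows are all invented for illustration and are not part of the data module. It groups the rows of one sheet by the value in the split_by column, which is the behaviour the parameter description suggests.

```python
import csv
import io
from collections import defaultdict

# Invented example sheet: two participants share one file, told apart by 'subject'.
sheet = io.StringIO(
    "subject,actions,feedbacks\n"
    "s1,0,1\n"
    "s2,1,0\n"
    "s1,1,1\n"
    "s2,0,0\n"
)


def split_rows(rows, split_by):
    """Group rows by the value(s) in the split_by column(s), keeping row order."""
    columns = [split_by] if isinstance(split_by, str) else list(split_by)
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in columns)
        groups[key].append(row)
    return dict(groups)


participants = split_rows(csv.DictReader(sheet), split_by="subject")
s1_actions = [row["actions"] for row in participants[("s1",)]]
print(s1_actions)  # ['0', '1']
```

With the module available, the corresponding call would presumably look like Data.from_csv(folder='./', split_by='subject'), using only parameters from the signature above.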

classmethod from_mat(folder='./', file_name_filter=None, terminal_ID=True, participantID=None, choices='actions', feedbacks='feedbacks', stimuli=None, action_options=None, group_by=None, extra_processing=None)[source]

Import data from a folder full of .mat files, where each file contains the information of one participant

Parameters:
  • folder (string, optional) – The folder where the data can be found. Default is the current folder.
  • file_name_filter (callable, string, list of strings or None, optional) – A function to process the file names or a list of possible prefixes as strings or a single string. Default None, no file names removed
  • terminal_ID (bool, optional) – Is there an ID number at the end of the filename? If not then a more general search will be performed. Default True
  • participantID (string, optional) – The dict key where the participant ID can be found. Default None, which results in the file name being used.
  • choices (string, optional) – The dict key where the participant choices can be found. Default 'actions'
  • feedbacks (string, optional) – The dict key where the feedbacks the participant received can be found. Default 'feedbacks'
  • stimuli (string or list of strings, optional) – The dict keys where the stimulus cues for each trial can be found. Default None
  • action_options (string or list of strings or None or one-element list containing a list, optional) – If a string or list of strings, these are treated as dict keys where the valid actions for each trial can be found. If None, all trials will use all available actions. If the list contains one list, it will be treated as a list of valid actions for each trial step. Default None
  • group_by (list of strings, optional) – A list of parts of filenames that are repeated across participants, identifying all the files that should be grouped together to form one participant's data. The rest of the filename is assumed to identify the participant. Default is None
  • extra_processing (callable, optional) – A function that modifies the dictionary of data read for each participant so that it is appropriate for fitting. Default is None
Returns:

Data

Return type:

Data class instance

See also

scipy.io.loadmat()
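
The group_by description can be made concrete with a small standard-library sketch. The helper group_files and the file names are hypothetical; the rule that the participant label is whatever remains after removing the matching part is an assumption drawn from the parameter description, not from the module's source.

```python
from collections import defaultdict


def group_files(file_names, group_by):
    """Map each participant label to the files that belong to that participant.

    The label is what remains of the file name once the first matching
    group_by part has been removed (an assumption made for this sketch).
    """
    grouped = defaultdict(list)
    for name in file_names:
        label = name
        for part in group_by:
            if part in name:
                label = name.replace(part, "", 1)
                break
        grouped[label].append(name)
    return dict(grouped)


files = ["sess1_p01.mat", "sess2_p01.mat", "sess1_p02.mat", "sess2_p02.mat"]
grouped = group_files(files, group_by=["sess1_", "sess2_"])
print(grouped["p01.mat"])  # ['sess1_p01.mat', 'sess2_p01.mat']
```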

classmethod from_pkl(folder='./', file_name_filter=None, terminal_ID=True, participantID=None, choices='actions', feedbacks='feedbacks', stimuli=None, action_options=None, group_by=None, extra_processing=None)[source]

Import data from a folder full of .pkl files, where each file contains the information of one participant. This will principally be used to import data stored by task simulations

Parameters:
  • folder (string, optional) – The folder where the data can be found. Default is the current folder.
  • file_name_filter (callable, string, list of strings or None, optional) – A function to process the file names or a list of possible prefixes as strings or a single string. Default None, no file names removed
  • terminal_ID (bool, optional) – Is there an ID number at the end of the filename? If not then a more general search will be performed. Default True
  • participantID (string, optional) – The dict key where the participant ID can be found. Default None, which results in the file name being used.
  • choices (string, optional) – The dict key where the participant choices can be found. Default 'actions'
  • feedbacks (string, optional) – The dict key where the feedbacks the participant received can be found. Default 'feedbacks'
  • stimuli (string or list of strings, optional) – The dict keys where the stimulus cues for each trial can be found. Default None
  • action_options (string or list of strings or None or one-element list containing a list, optional) – If a string or list of strings, these are treated as dict keys where the valid actions for each trial can be found. If None, all trials will use all available actions. If the list contains one list, it will be treated as a list of valid actions for each trial step. Default None
  • group_by (list of strings, optional) – A list of parts of filenames that are repeated across participants, identifying all the files that should be grouped together to form one participant's data. The rest of the filename is assumed to identify the participant. Default is None
  • extra_processing (callable, optional) – A function that modifies the dictionary of data read for each participant so that it is appropriate for fitting. Default is None
Returns:

Data

Return type:

Data class instance
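
As a rough picture of the kind of file from_pkl expects, the sketch below writes one pickled participant dict to a temporary folder and reads it back with the standard pickle module. The file name and the values are invented; the keys 'actions' and 'feedbacks' are the defaults from the signature above. How the module itself reads and validates such files is not shown here.

```python
import pickle
import tempfile
from pathlib import Path

# Invented simulation output using the default keys from the signature above.
simulated = {"actions": [0, 1, 1, 0], "feedbacks": [1, 0, 1, 1]}

with tempfile.TemporaryDirectory() as folder:
    # One file per participant; the trailing number plays the role of the ID.
    file_path = Path(folder) / "sim_participant_1.pkl"
    with open(file_path, "wb") as handle:
        pickle.dump(simulated, handle)

    # Read the file back the way any .pkl reader would.
    with open(file_path, "rb") as handle:
        loaded = pickle.load(handle)

print(loaded == simulated)  # True
```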

classmethod from_xlsx(folder='./', file_name_filter=None, terminal_ID=True, split_by=None, participantID=None, choices='actions', feedbacks='feedbacks', stimuli=None, action_options=None, group_by=None, extra_processing=None, xlsx_read_options=None)[source]

Import data from a folder full of .xlsx files, where each file contains the information of one participant

Parameters:
  • folder (string, optional) – The folder where the data can be found. Default is the current folder.
  • file_name_filter (callable, string, list of strings or None, optional) – A function to process the file names or a list of possible prefixes as strings or a single string. Default None, no file names removed
  • terminal_ID (bool, optional) – Is there an ID number at the end of the filename? If not then a more general search will be performed. Default True
  • split_by (string or list of strings, optional) – If multiple participant datasets are in one file sheet, this specifies the column or columns that can distinguish and identify the rows for each participant. Default None
  • participantID (string, optional) – The dict key where the participant ID can be found. Default None, which results in the file name being used.
  • choices (string, optional) – The dict key where the participant choices can be found. Default 'actions'
  • feedbacks (string, optional) – The dict key where the feedbacks the participant received can be found. Default 'feedbacks'
  • stimuli (string or list of strings, optional) – The dict keys where the stimulus cues for each trial can be found. Default None
  • action_options (string or list of strings or None or one-element list containing a list, optional) – If a string or list of strings, these are treated as dict keys where the valid actions for each trial can be found. If None, all trials will use all available actions. If the list contains one list, it will be treated as a list of valid actions for each trial step. Default None
  • group_by (list of strings, optional) – A list of parts of filenames that are repeated across participants, identifying all the files that should be grouped together to form one participant's data. The rest of the filename is assumed to identify the participant. Default is None
  • extra_processing (callable, optional) – A function that modifies the dictionary of data read for each participant so that it is appropriate for fitting. Default is None
  • xlsx_read_options (dict, optional) – The keyword arguments for pandas.read_excel
Returns:

Data

Return type:

Data class instance

See also

pandas.read_excel()
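
The three accepted forms of file_name_filter (a single prefix, a list of prefixes, or a callable) might be handled along the following lines. This is only a sketch: the helper filter_file_names and the file names are hypothetical, and the contract assumed for the callable (it receives the list of names and returns the ones to keep) is a guess rather than documented behaviour.

```python
def filter_file_names(file_names, file_name_filter=None):
    """Keep the file names selected by a prefix, a list of prefixes or a callable."""
    if file_name_filter is None:
        return list(file_names)  # nothing removed
    if callable(file_name_filter):
        # Assumed contract: the callable receives the names and returns those to keep.
        return list(file_name_filter(file_names))
    if isinstance(file_name_filter, str):
        prefixes = (file_name_filter,)
    else:
        prefixes = tuple(file_name_filter)
    # str.startswith accepts a tuple of alternatives.
    return [name for name in file_names if name.startswith(prefixes)]


names = ["taskA_1.xlsx", "taskB_1.xlsx", "notes.xlsx"]
print(filter_file_names(names, "taskA"))             # ['taskA_1.xlsx']
print(filter_file_names(names, ["taskA", "taskB"]))  # ['taskA_1.xlsx', 'taskB_1.xlsx']
```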

classmethod load_data(file_type='csv', folders='./', file_name_filter=None, terminal_ID=True, split_by=None, participantID=None, choices='actions', feedbacks='feedbacks', stimuli=None, action_options=None, group_by=None, extra_processing=None, data_read_options=None)[source]

Import data from a folder. This is a wrapper function for the other import methods

Parameters:
  • file_type (string, optional) – The file type of the data: one of mat, csv, xlsx or pkl. Default is csv
  • folders (string or list of strings, optional) – The folder or folders where the data can be found. Default is the current folder.
  • file_name_filter (callable, string, list of strings or None, optional) – A function to process the file names or a list of possible prefixes as strings or a single string. Default None, no file names removed
  • terminal_ID (bool, optional) – Is there an ID number at the end of the filename? If not then a more general search will be performed. Default True
  • split_by (string or list of strings, optional) – If multiple participant datasets are in one file sheet, this specifies the column or columns that can distinguish and identify the rows for each participant. Default None
  • participantID (string, optional) – The dict key where the participant ID can be found. Default None, which results in the file name being used.
  • choices (string, optional) – The dict key where the participant choices can be found. Default 'actions'
  • feedbacks (string, optional) – The dict key where the feedbacks the participant received can be found. Default 'feedbacks'
  • stimuli (string or list of strings, optional) – The dict keys where the stimulus cues for each trial can be found. Default None
  • action_options (string or list of strings or None or one-element list containing a list, optional) – If a string or list of strings, these are treated as dict keys where the valid actions for each trial can be found. If None, all trials will use all available actions. If the list contains one list, it will be treated as a list of valid actions for each trial step. Default None
  • group_by (list of strings, optional) – A list of parts of filenames that are repeated across participants, identifying all the files that should be grouped together to form one participant's data. The rest of the filename is assumed to identify the participant. Default is None
  • extra_processing (callable, optional) – A function that modifies the dictionary of data read for each participant so that it is appropriate for fitting. Default is None
  • data_read_options (dict, optional) – The keyword arguments for the data importing method chosen
Returns:

Data

Return type:

Data class instance
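
The wrapper pattern described for load_data can be sketched as a dispatch on file_type followed by pooling over folders. Everything below is hypothetical: the reader functions are placeholders that only record their arguments, FileTypeError is a local stand-in for data.FileTypeError, and raising it for an unsupported type is an assumption about how the module behaves.

```python
class FileTypeError(Exception):
    """Stand-in for data.FileTypeError so the sketch runs on its own."""


# Hypothetical readers; each just records what it was asked to load.
def read_csv_folder(folder, **options):
    return [("csv", folder, options)]


def read_pkl_folder(folder, **options):
    return [("pkl", folder, options)]


READERS = {"csv": read_csv_folder, "pkl": read_pkl_folder}


def load(file_type="csv", folders="./", data_read_options=None):
    """Dispatch to the reader for file_type and pool the results over all folders."""
    if file_type not in READERS:
        raise FileTypeError(f"{file_type} is not a supported file type")
    folder_list = [folders] if isinstance(folders, str) else list(folders)
    options = data_read_options or {}
    combined = []
    for folder in folder_list:
        combined.extend(READERS[file_type](folder, **options))
    return combined


result = load("csv", folders=["./study1", "./study2"], data_read_options={"sep": ";"})
print(len(result))  # 2
```

With the module available, the documented entry point would be called as, for example, Data.load_data(file_type='csv', folders=['./study1', './study2']).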

exception data.DimentionError[source]

Bases: Exception

exception data.FileError[source]

Bases: Exception

exception data.FileFilterError[source]

Bases: Exception

exception data.FileTypeError[source]

Bases: Exception

exception data.FoldersError[source]

Bases: Exception

exception data.IDError[source]

Bases: Exception

exception data.LengthError[source]

Bases: Exception

exception data.ProcessingError[source]

Bases: Exception

data.sort_by_last_number(dataFiles)[source]
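
No description is given for this function. Judging by its name alone, it presumably orders file names by the last number each contains. The sketch below shows one way such an ordering could be written; the helper name sort_by_trailing_number, the regular expression and the handling of names without digits are assumptions, not the module's implementation.

```python
import re


def sort_by_trailing_number(file_names):
    """Order names by the last run of digits in each; names without digits go first."""
    def last_number(name):
        numbers = re.findall(r"\d+", name)
        return int(numbers[-1]) if numbers else -1

    return sorted(file_names, key=last_number)


ordered = sort_by_trailing_number(["run2_p10.csv", "run2_p2.csv", "run1_p1.csv"])
print(ordered)  # ['run1_p1.csv', 'run2_p2.csv', 'run2_p10.csv']
```

Sorting numerically rather than alphabetically keeps participant 10 after participant 2, which a plain string sort would not.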