Free Recall Analysis
Utilities for working with free recall data.
-
psifr.fr.block_index(list_labels)
Get index of each block in a list.
- Parameters
list_labels (list or numpy.ndarray) – Position labels that define the blocks.
- Returns
block – Block index of each position.
- Return type
numpy.ndarray
Examples
>>> import numpy as np
>>> from psifr import fr
>>> list_labels = [2, 2, 3, 3, 3, 1, 1]
>>> fr.block_index(list_labels)
array([1, 1, 2, 2, 2, 3, 3])
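The example above is consistent with numbering runs of consecutive identical labels, starting from 1. The following pure-Python sketch makes that reading explicit. The name block_index_sketch is hypothetical, and the sketch assumes that a block is a run of adjacent positions sharing a label; it does not show how psifr itself treats a label that reappears later in the list.

```python
def block_index_sketch(list_labels):
    """Number runs of consecutive identical labels, starting from 1."""
    block = []
    current = 0
    previous = object()  # sentinel that never equals a real label
    for label in list_labels:
        if label != previous:
            # A change in label starts a new block
            current += 1
            previous = label
        block.append(current)
    return block


print(block_index_sketch([2, 2, 3, 3, 3, 1, 1]))  # [1, 1, 2, 2, 2, 3, 3]
```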
-
psifr.fr.category_crp(df, category_key, item_query=None, test_key=None, test=None)
Conditional response probability of within-category transitions.
- Parameters
df (pandas.DataFrame) – Merged study and recall data. See merge_lists. List length is assumed to be the same for all lists within each subject. Must have fields: subject, list, input, output, recalled.
category_key (str) – Name of column with category labels.
item_query (str, optional) – Query string to select items to include in the pool of possible recalls to be examined. See pandas.DataFrame.query for allowed format.
test_key (str, optional) – Name of column with labels to use when testing transitions for inclusion.
test (callable, optional) – Callable that takes in previous and current item values and returns True for transitions that should be included.
- Returns
results – Has fields:
- subject (hashable) – Results are separated by each subject.
- prob (float) – Probability of making a within-category transition.
- actual (int) – Total number of within-category transitions actually made.
- possible (int) – Total number of times a within-category transition was possible, given the prior input position and the remaining items to be recalled.
- Return type
pandas.DataFrame
Examples
>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw, study_keys=['category'])
>>> cat_crp = fr.category_crp(data, 'category')
>>> cat_crp.head()
             prob  actual  possible
subject
1        0.801147     419       523
2        0.733456     399       544
3        0.763158     377       494
4        0.814882     449       551
5        0.877273     579       660
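The test parameter, which appears in several functions in this module, is described as a callable that takes the previous and current item values and returns True for transitions to include. The sketch below shows what such callables might look like and which transitions they would select. The function names and the label sequence are hypothetical; the labels stand in for the values of whatever column is named by test_key.

```python
def same_label(prev, curr):
    """Include only transitions between items that share a label."""
    return prev == curr


def different_label(prev, curr):
    """Include only transitions between items with different labels."""
    return prev != curr


# Hypothetical labels (e.g., values of a test_key column) in recall order
labels = ['a', 'a', 'b', 'b', 'a']

# Each transition pairs the previous recall's label with the current one
transitions = list(zip(labels[:-1], labels[1:]))
included = [same_label(prev, curr) for prev, curr in transitions]
print(included)  # [True, False, True, False]
```

Assuming the merged data has a column named 'category', a callable like same_label could then be passed along with test_key='category' to restrict an analysis to within-category transitions.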
-
psifr.fr.check_data(df)
Run checks on free recall data.
- Parameters
df (pandas.DataFrame) – Contains one row for each trial (study and recall). Must have fields:
- subject (number or str) – Subject identifier.
- list (number) – List identifier. This applies to both study and recall trials.
- trial_type (str) – Type of trial; may be ‘study’ or ‘recall’.
- position (number) – Position within the study list or recall sequence.
- item (str) – Item that was either presented or recalled on this trial.
Examples
>>> from psifr import fr
>>> import pandas as pd
>>> raw = pd.DataFrame(
...     {'subject': [1, 1], 'list': [1, 1], 'position': [1, 2], 'item': ['a', 'b']}
... )
>>> fr.check_data(raw)
Traceback (most recent call last):
  File "psifr/fr.py", line 253, in check_data
    assert col in df.columns, f'Required column {col} is missing.'
AssertionError: Required column trial_type is missing.
-
psifr.fr.distance_crp(df, index_key, distances, edges, centers=None, count_unique=False, item_query=None, test_key=None, test=None)
Conditional response probability by distance bin.
- Parameters
df (pandas.DataFrame) – Merged free recall data.
index_key (str) – Name of column containing the index of each item in the distances matrix.
distances (numpy.array) – Items x items matrix of pairwise distances or similarities.
edges (array-like) – Edges of bins to apply to the distances.
centers (array-like, optional) – Centers to label each bin with. If not specified, the center point between edges will be used.
count_unique (bool, optional) – If true, possible transitions to a given distance bin will only count once for a given transition.
item_query (str, optional) – Query string to select items to include in the pool of possible recalls to be examined. See pandas.DataFrame.query for allowed format.
test_key (str, optional) – Name of column with labels to use when testing transitions for inclusion.
test (callable, optional) – Callable that takes in previous and current item values and returns True for transitions that should be included.
- Returns
crp – Has fields:
- subject (hashable) – Results are separated by each subject.
- bin (int) – Distance bin.
- prob (float) – Probability of each distance bin.
- actual (int) – Total number of actual transitions for each distance bin.
- possible (int) – Total number of times each distance bin was possible, given the prior input position and the remaining items to be recalled.
- Return type
pandas.DataFrame
See also
pool_index()
Given a list of presented items and an item pool, look up the pool index of each item.
distance_rank()
Calculate rank of transition distances.
Examples
>>> import numpy as np
>>> from scipy.spatial.distance import squareform
>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> items, distances = fr.sample_distances('Morton2013')
>>> data['item_index'] = fr.pool_index(data['item'], items)
>>> edges = np.percentile(squareform(distances), np.linspace(1, 99, 10))
>>> fr.distance_crp(data, 'item_index', distances, edges)
                             bin      prob  actual  possible
subject center
1       0.467532  (0.352, 0.583]  0.085456     151      1767
        0.617748  (0.583, 0.653]  0.067916      87      1281
        0.673656  (0.653, 0.695]  0.062500      65      1040
        0.711075  (0.695, 0.727]  0.051836      48       926
        0.742069  (0.727, 0.757]  0.050633      44       869
...                          ...       ...     ...       ...
47      0.742069  (0.727, 0.757]  0.062822      61       971
        0.770867  (0.757, 0.785]  0.030682      27       880
        0.800404  (0.785, 0.816]  0.040749      37       908
        0.834473  (0.816, 0.853]  0.046651      39       836
        0.897275  (0.853, 0.941]  0.028868      25       866

[360 rows x 4 columns]
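To illustrate how edges and centers relate, here is a minimal pure-Python sketch of binning a distance and computing default bin centers as the midpoint between adjacent edges. The helper names bin_centers and bin_index are hypothetical, and the right-closed bins are an assumption based on the interval labels shown in the output above; this is not psifr's implementation.

```python
from bisect import bisect_left


def bin_centers(edges):
    """Midpoint between each pair of adjacent edges."""
    return [(lo + hi) / 2 for lo, hi in zip(edges[:-1], edges[1:])]


def bin_index(distance, edges):
    """Index of the right-closed bin (lo, hi] containing distance, or None."""
    if distance <= edges[0] or distance > edges[-1]:
        return None  # outside the range covered by the bins
    return bisect_left(edges, distance) - 1


edges = [0.0, 0.5, 1.0, 2.0]
print(bin_centers(edges))      # [0.25, 0.75, 1.5]
print(bin_index(0.5, edges))   # 0, because 0.5 falls in (0.0, 0.5]
print(bin_index(0.75, edges))  # 1, because 0.75 falls in (0.5, 1.0]
```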
-
psifr.fr.distance_rank(df, index_key, distances, item_query=None, test_key=None, test=None)
Calculate rank of transition distances in free recall lists.
- Parameters
df (pandas.DataFrame) – Merged study and recall data. See merge_lists. List length is assumed to be the same for all lists within each subject. Must have fields: subject, list, input, output, recalled. Input position must be defined such that the first serial position is 1, not 0.
index_key (str) – Name of column containing the index of each item in the distances matrix.
distances (numpy.array) – Items x items matrix of pairwise distances or similarities.
item_query (str, optional) – Query string to select items to include in the pool of possible recalls to be examined. See pandas.DataFrame.query for allowed format.
test_key (str, optional) – Name of column with labels to use when testing transitions for inclusion.
test (callable, optional) – Callable that takes in previous and current item values and returns True for transitions that should be included.
- Returns
stat – Has fields ‘subject’ and ‘rank’.
- Return type
pandas.DataFrame
See also
pool_index()
Given a list of presented items and an item pool, look up the pool index of each item.
distance_crp()
Conditional response probability by distance bin.
Examples
>>> from scipy.spatial.distance import squareform
>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> items, distances = fr.sample_distances('Morton2013')
>>> data['item_index'] = fr.pool_index(data['item'], items)
>>> dist_rank = fr.distance_rank(data, 'item_index', distances)
>>> dist_rank.head()
             rank
subject
1        0.635571
2        0.571457
3        0.627282
4        0.637596
5        0.646181
-
psifr.fr.filter_data(data, subjects=None, lists=None, trial_type=None, positions=None, inputs=None, outputs=None)
Filter data to get a subset of trials.
- Parameters
data (pandas.DataFrame) – Raw or merged data to filter.
subjects (hashable or list of hashable) – Subject or subjects to include.
lists (hashable or list of hashable) – List or lists to include.
trial_type ({'study', 'recall'}) – Trial type to include.
positions (int or list of int) – Position or positions to include.
inputs (int or list of int) – Input position or positions to include.
outputs (int or list of int) – Output position or positions to include.
- Returns
filtered – The filtered subset of data.
- Return type
pandas.DataFrame
Examples
>>> from psifr import fr
>>> subjects_list = [1, 1, 2, 2]
>>> study_lists = [['a', 'b'], ['c', 'd'], ['e', 'f'], ['g', 'h']]
>>> recall_lists = [['b'], ['d', 'c'], ['f', 'e'], []]
>>> raw = fr.table_from_lists(subjects_list, study_lists, recall_lists)
>>> fr.filter_data(raw, subjects=1, trial_type='study')
   subject  list trial_type  position item
0        1     1      study         1    a
1        1     1      study         2    b
3        1     2      study         1    c
4        1     2      study         2    d
>>> data = fr.merge_free_recall(raw)
>>> fr.filter_data(data, subjects=2)
   subject  list item  input  output  study  recall  repeat  intrusion  prior_list  prior_input
4        2     1    e      1     2.0   True    True       0      False         NaN          NaN
5        2     1    f      2     1.0   True    True       0      False         NaN          NaN
6        2     2    g      1     NaN   True   False       0      False         NaN          NaN
7        2     2    h      2     NaN   True   False       0      False         NaN          NaN
-
psifr.fr.lag_crp(df, lag_key='input', count_unique=False, item_query=None, test_key=None, test=None)
Lag-CRP for multiple subjects.
- Parameters
df (pandas.DataFrame) – Merged study and recall data. See merge_lists. List length is assumed to be the same for all lists. Must have fields: subject, list, input, output, recalled. Input position must be defined such that the first serial position is 1, not 0.
lag_key (str, optional) – Name of column to use when calculating lag between recalled items. Default is to calculate lag based on input position.
count_unique (bool, optional) – If true, possible transitions of the same lag will only be incremented once per transition.
item_query (str, optional) – Query string to select items to include in the pool of possible recalls to be examined. See pandas.DataFrame.query for allowed format.
test_key (str, optional) – Name of column with labels to use when testing transitions for inclusion.
test (callable, optional) – Callable that takes in previous and current item values and returns True for transitions that should be included.
- Returns
results – Has fields:
- subject (hashable) – Results are separated by each subject.
- lag (int) – Lag of input position between two adjacent recalls.
- prob (float) – Probability of each lag transition.
- actual (int) – Total number of transitions actually made at each lag.
- possible (int) – Total number of times each lag was possible, given the prior input position and the remaining items to be recalled.
- Return type
pandas.DataFrame
See also
lag_rank()
Rank of the absolute lags in recall sequences.
Examples
>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> fr.lag_crp(data)
                   prob  actual  possible
subject lag
1       -23.0  0.020833       1        48
        -22.0  0.035714       3        84
        -21.0  0.026316       3       114
        -20.0  0.024000       3       125
        -19.0  0.014388       2       139
...                 ...     ...       ...
47       19.0  0.061224       3        49
         20.0  0.055556       2        36
         21.0  0.045455       1        22
         22.0  0.071429       1        14
         23.0  0.000000       0         6

[1880 rows x 3 columns]
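The actual and possible tallies can be illustrated for a single list with a short pure-Python sketch. It follows the description above: a lag counts as possible when it leads from the prior input position to an item that has not yet been recalled. The function name lag_crp_sketch is hypothetical, and the sketch assumes the recall sequence contains only valid recalls (no repeats or intrusions); it is not psifr's implementation.

```python
from collections import Counter


def lag_crp_sketch(recalls, list_length):
    """Tally lags for one list, given recalled input positions in output order."""
    actual = Counter()
    possible = Counter()
    recalled = set()
    for prev, curr in zip(recalls[:-1], recalls[1:]):
        recalled.add(prev)
        # Items still available for recall define the possible lags
        remaining = [p for p in range(1, list_length + 1) if p not in recalled]
        possible.update(p - prev for p in remaining)
        actual[curr - prev] += 1
    prob = {lag: actual[lag] / possible[lag] for lag in sorted(possible)}
    return prob, actual, possible


prob, actual, possible = lag_crp_sketch([3, 4, 1], list_length=4)
print(prob)  # {-3: 1.0, -2: 0.0, -1: 0.0, 1: 1.0}
```

In this example, lag -2 was possible on both transitions (from position 3 to 1, and from position 4 to 2) but was never taken, so its probability is 0 out of 2.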
-
psifr.fr.lag_rank(df, item_query=None, test_key=None, test=None)
Calculate rank of the absolute lags in free recall lists.
- Parameters
df (pandas.DataFrame) – Merged study and recall data. See merge_lists. List length is assumed to be the same for all lists within each subject. Must have fields: subject, list, input, output, recalled. Input position must be defined such that the first serial position is 1, not 0.
item_query (str, optional) – Query string to select items to include in the pool of possible recalls to be examined. See pandas.DataFrame.query for allowed format.
test_key (str, optional) – Name of column with labels to use when testing transitions for inclusion.
test (callable, optional) – Callable that takes in previous and current item values and returns True for transitions that should be included.
- Returns
stat – Has fields ‘subject’ and ‘rank’.
- Return type
pandas.DataFrame
See also
lag_crp()
Conditional response probability by input lag.
Examples
>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> lag_rank = fr.lag_rank(data)
>>> lag_rank.head()
             rank
subject
1        0.610953
2        0.635676
3        0.612607
4        0.667090
5        0.643923
-
psifr.fr.merge_free_recall(data, **kwargs)
Score free recall data by matching up study and recall events.
- Parameters
data (pandas.DataFrame) – Free recall data in Psifr format. Must have subject, list, trial_type, position, and item columns.
merge_keys (list, optional) – Columns to use to designate events to merge. Default is [‘subject’, ‘list’, ‘item’], which will merge events related to the same item, but only within list.
list_keys (list, optional) – Columns that apply to both study and recall events.
study_keys (list, optional) – Columns that only apply to study events.
recall_keys (list, optional) – Columns that only apply to recall events.
position_key (str, optional) – Column indicating the position of each item in either the study list or the recall sequence.
- Returns
merged – Merged information about study and recall events. Each row corresponds to one unique input/output pair.
The following columns will be added:
- input (int) – Position of each item in the input list (i.e., serial position).
- output (int) – Position of each item in the recall sequence.
- study (bool) – True for rows corresponding to a unique study event.
- recall (bool) – True for rows corresponding to a unique recall event.
- repeat (int) – Number of times this recall event has been repeated (0 for the first recall of an item).
- intrusion (bool) – True for recalls that do not correspond to any study event.
- prior_list (int) – For prior-list intrusions, the list in which the item was presented.
- prior_input (int) – For prior-list intrusions, the position at which the item was presented.
- Return type
pandas.DataFrame
See also
merge_lists()
Flexibly merge study events with recall events. Useful for recall phases that don’t match the typical free recall setup, like final free recall of all lists.
Examples
>>> from psifr import fr
>>> study = [['absence', 'hollow'], ['fountain', 'piano']]
>>> recall = [['absence'], ['piano', 'hollow']]
>>> raw = fr.table_from_lists([1, 1], study, recall)
>>> raw
   subject  list trial_type  position      item
0        1     1      study         1   absence
1        1     1      study         2    hollow
2        1     1     recall         1   absence
3        1     2      study         1  fountain
4        1     2      study         2     piano
5        1     2     recall         1     piano
6        1     2     recall         2    hollow
Score the data to create a table with matched study and recall events.
>>> data = fr.merge_free_recall(raw)
>>> data
   subject  list      item  input  output  study  recall  repeat  intrusion  prior_list  prior_input
0        1     1   absence    1.0     1.0   True    True       0      False         NaN          NaN
1        1     1    hollow    2.0     NaN   True   False       0      False         NaN          NaN
2        1     2  fountain    1.0     NaN   True   False       0      False         NaN          NaN
3        1     2     piano    2.0     1.0   True    True       0      False         NaN          NaN
4        1     2    hollow    NaN     2.0  False    True       0       True         1.0          2.0
You can also include non-standard columns. Information that only applies to study events (here, the encoding task used) can be indicated using the study_keys input.
>>> import numpy as np
>>> raw['task'] = np.array([1, 2, np.nan, 2, 1, np.nan, np.nan])
>>> fr.merge_free_recall(raw, study_keys=['task'])
   subject  list      item  input  output  study  recall  repeat  intrusion  task  prior_list  prior_input
0        1     1   absence    1.0     1.0   True    True       0      False   1.0         NaN          NaN
1        1     1    hollow    2.0     NaN   True   False       0      False   2.0         NaN          NaN
2        1     2  fountain    1.0     NaN   True   False       0      False   2.0         NaN          NaN
3        1     2     piano    2.0     1.0   True    True       0      False   1.0         NaN          NaN
4        1     2    hollow    NaN     2.0  False    True       0       True   NaN         1.0          2.0
Information that only applies to recall events (here, the onset of each recall attempt, in seconds after the start of the recall phase) can be indicated using the recall_keys input.
>>> raw['onset'] = np.array([np.nan, np.nan, 1.1, np.nan, np.nan, 1.4, 3.8])
>>> fr.merge_free_recall(raw, recall_keys=['onset'])
   subject  list      item  input  output  study  recall  repeat  intrusion  onset  prior_list  prior_input
0        1     1   absence    1.0     1.0   True    True       0      False    1.1         NaN          NaN
1        1     1    hollow    2.0     NaN   True   False       0      False    NaN         NaN          NaN
2        1     2  fountain    1.0     NaN   True   False       0      False    NaN         NaN          NaN
3        1     2     piano    2.0     1.0   True    True       0      False    1.4         NaN          NaN
4        1     2    hollow    NaN     2.0  False    True       0       True    3.8         1.0          2.0
Use list_keys to indicate columns that apply to both study and recall events. If the list_keys values do not match for a pair of study and recall events, the events will not be matched in the output.
>>> raw['condition'] = np.array([1, 1, 1, 2, 2, 2, 2])
>>> fr.merge_free_recall(raw, list_keys=['condition'])
   subject  list      item  input  output  study  recall  repeat  intrusion  condition  prior_list  prior_input
0        1     1   absence    1.0     1.0   True    True       0      False          1         NaN          NaN
1        1     1    hollow    2.0     NaN   True   False       0      False          1         NaN          NaN
2        1     2  fountain    1.0     NaN   True   False       0      False          2         NaN          NaN
3        1     2     piano    2.0     1.0   True    True       0      False          2         NaN          NaN
4        1     2    hollow    NaN     2.0  False    True       0       True          2         1.0          2.0
-
psifr.fr.merge_lists(study, recall, merge_keys=None, list_keys=None, study_keys=None, recall_keys=None, position_key='position')
Merge study and recall events together for each list.
- Parameters
study (pandas.DataFrame) – Information about all study events. Should have one row for each study event.
recall (pandas.DataFrame) – Information about all recall events. Should have one row for each recall attempt.
merge_keys (list, optional) – Columns to use to designate events to merge. Default is [‘subject’, ‘list’, ‘item’], which will merge events related to the same item, but only within list.
list_keys (list, optional) – Columns that apply to both study and recall events.
study_keys (list, optional) – Columns that only apply to study events.
recall_keys (list, optional) – Columns that only apply to recall events.
position_key (str, optional) – Column indicating the position of each item in either the study list or the recall sequence.
- Returns
merged – Merged information about study and recall events. Each row corresponds to one unique input/output pair.
The following columns will be added:
- input (int) – Position of each item in the input list (i.e., serial position).
- output (int) – Position of each item in the recall sequence.
- study (bool) – True for rows corresponding to a unique study event.
- recall (bool) – True for rows corresponding to a unique recall event.
- repeat (int) – Number of times this recall event has been repeated (0 for the first recall of an item).
- intrusion (bool) – True for recalls that do not correspond to any study event.
- Return type
pandas.DataFrame
See also
merge_free_recall()
Score standard free recall data.
Examples
>>> import pandas as pd
>>> from psifr import fr
>>> study = pd.DataFrame(
...     {'subject': [1, 1], 'list': [1, 1], 'position': [1, 2], 'item': ['a', 'b']}
... )
>>> recall = pd.DataFrame(
...     {'subject': [1], 'list': [1], 'position': [1], 'item': ['b']}
... )
>>> fr.merge_lists(study, recall)
   subject  list item  input  output  study  recall  repeat  intrusion
0        1     1    a      1     NaN   True   False       0      False
1        1     1    b      2     1.0   True    True       0      False
-
psifr.fr.pli_list_lag(df, max_lag)
List lag of prior-list intrusions.
- Parameters
df (pandas.DataFrame) – Merged study and recall data. See merge_free_recall. Must have fields: subject, list, intrusion, prior_list. Lists must be numbered starting from 1 and all lists must be included.
max_lag (int) – Maximum list lag to consider. The initial max_lag lists for each subject will be excluded so that all considered lags are possible for all included lists.
- Returns
results – For each subject and list lag, the proportion of intrusions at that lag, in the results['prob'] column.
- Return type
pandas.DataFrame
Examples
>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> fr.pli_list_lag(data, 3)
                  count  per_list      prob
subject list_lag
1       1             7  0.155556  0.259259
        2             5  0.111111  0.185185
        3             0  0.000000  0.000000
2       1             9  0.200000  0.191489
        2             2  0.044444  0.042553
...                 ...       ...       ...
46      2             1  0.022222  0.100000
        3             0  0.000000  0.000000
47      1             5  0.111111  0.277778
        2             1  0.022222  0.055556
        3             0  0.000000  0.000000

[120 rows x 3 columns]
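The following pure-Python sketch shows one way the three output columns could be computed for a single subject. The list lag of an intrusion is taken to be the current list number minus the list where the item was originally presented. The normalizations are assumptions inferred from the example output above (per_list divides by the number of included lists, prob by the number of prior-list intrusions in those lists); the function name pli_list_lag_sketch and its inputs are hypothetical, and psifr's implementation may differ.

```python
from collections import Counter


def pli_list_lag_sketch(intrusions, n_lists, max_lag):
    """Tally list lags from (list, prior_list) pairs for one subject."""
    # Exclude the initial max_lag lists so every lag is possible everywhere
    included = [(lst, prior) for lst, prior in intrusions if lst > max_lag]
    lags = Counter(lst - prior for lst, prior in included)
    n_included_lists = n_lists - max_lag
    total = len(included)
    results = {}
    for lag in range(1, max_lag + 1):
        count = lags[lag]
        results[lag] = {
            'count': count,
            'per_list': count / n_included_lists,  # assumed normalization
            'prob': count / total if total else float('nan'),  # assumed
        }
    return results


# Hypothetical intrusions: (list where recalled, list where presented)
intrusions = [(2, 1), (4, 3), (4, 2), (5, 1), (5, 4)]
results = pli_list_lag_sketch(intrusions, n_lists=5, max_lag=2)
print(results[1]['count'], results[1]['prob'])  # 2 0.5
print(results[2]['count'], results[2]['prob'])  # 1 0.25
```

Here the intrusion on list 2 is dropped because list 2 is within the first max_lag lists, and the lag-4 intrusion on list 5 contributes to the denominator of prob but not to any reported lag.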
-
psifr.fr.plot_distance_crp(crp, min_samples=None, **facet_kws)
Plot response probability by distance bin.
- Parameters
crp (pandas.DataFrame) – Results from fr.distance_crp.
min_samples (int) – Minimum number of samples a bin must have per subject to include in the plot.
**facet_kws – Additional inputs to pass to seaborn.relplot.
-
psifr.fr.plot_lag_crp(recall, max_lag=5, split=True, **facet_kws)
Plot conditional response probability by lag.
Additional arguments are passed to seaborn.FacetGrid.
- Parameters
recall (pandas.DataFrame) – Results from calling lag_crp.
max_lag (int) – Maximum absolute lag to plot.
split (bool, optional) – If true, will plot as two separate lines with a gap at lag 0.
-
psifr.fr.plot_raster(df, hue='input', palette=None, marker='s', intrusion_color=None, orientation='horizontal', length=6, aspect=None, legend='auto', **facet_kws)
Plot recalls in a raster plot.
- Parameters
df (pandas.DataFrame) – Scored free recall data.
hue (str or None, optional) – Column to use to set marker color.
palette (optional) – Palette specification supported by Seaborn.
marker (str, optional) – Marker code supported by Seaborn.
intrusion_color (optional) – Color of intrusions.
orientation ({'horizontal', 'vertical'}, optional) – Whether lists should be stacked horizontally or vertically in the plot.
length (float, optional) – Size of the plot dimension along which list varies.
aspect (float, optional) – Aspect ratio of plot for lists over items.
legend (str, optional) – Legend setting. See seaborn.scatterplot for details.
facet_kws (optional) – Additional keywords to pass to seaborn.FacetGrid.
-
psifr.fr.plot_spc(recall, **facet_kws)
Plot a serial position curve.
Additional arguments are passed to seaborn.relplot.
- Parameters
recall (pandas.DataFrame) – Results from calling spc.
-
psifr.fr.plot_swarm_error(data, x=None, y=None, swarm_color=None, swarm_size=5, point_color='k', **facet_kws)
Plot points as a swarm plus mean with error bars.
- Parameters
data (pandas.DataFrame) – DataFrame with statistics to plot.
x (str) – Name of variable to plot on x-axis.
y (str) – Name of variable to plot on y-axis.
swarm_color – Color for swarm plot points. May use any specification supported by seaborn.
swarm_size (float) – Size of swarm plot points.
point_color – Color for the point plot (error bars).
facet_kws – Additional keywords for the FacetGrid.
-
psifr.fr.pnr(df, item_query=None, test_key=None, test=None)
Probability of recall by serial position and output position.
Calculate probability of Nth recall, where N is each output position. Invalid recalls (repeats and intrusions) are ignored and not counted toward output position.
- Parameters
df (pandas.DataFrame) – Merged study and recall data. See merge_lists. List length is assumed to be the same for all lists within each subject. Must have fields: subject, list, input, output, study, recall. Input position must be defined such that the first serial position is 1, not 0.
item_query (str, optional) – Query string to select items to include in the pool of possible recalls to be examined. See pandas.DataFrame.query for allowed format.
test_key (str, optional) – Name of column with labels to use when testing transitions for inclusion.
test (callable, optional) – Callable that takes in previous and current item values and returns True for transitions that should be included.
- Returns
prob – Analysis results. Has fields: subject, output, input, prob, actual, possible. The prob column for output x and input y indicates the probability of recalling input position y at output position x. The actual and possible columns give the raw tallies for how many times an event actually occurred and how many times it was possible given the recall sequence.
- Return type
pandas.DataFrame
See also
plot_spc()
Plot recall probability as a function of serial position.
spc()
Overall recall probability by serial position.
Examples
>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> fr.pnr(data)
                          prob  actual  possible
subject output input
1       1      1      0.000000       0        48
               2      0.020833       1        48
               3      0.000000       0        48
               4      0.000000       0        48
               5      0.000000       0        48
...                        ...     ...       ...
47      24     20          NaN       0         0
               21          NaN       0         0
               22          NaN       0         0
               23          NaN       0         0
               24          NaN       0         0

[23040 rows x 3 columns]
-
psifr.fr.pool_index(trial_items, pool_items_list)
Get the index of each item in the full pool.
- Parameters
trial_items (pandas.Series) – The item presented on each trial.
pool_items_list (list or numpy.ndarray) – List of items in the full pool.
- Returns
item_index – Index of each item in the pool. Trials with items not in the pool will be <NA>.
- Return type
pandas.Series
Examples
>>> import pandas as pd
>>> from psifr import fr
>>> trial_items = pd.Series(['b', 'a', 'z', 'c', 'd'])
>>> pool_items_list = ['a', 'b', 'c', 'd', 'e', 'f']
>>> fr.pool_index(trial_items, pool_items_list)
0       1
1       0
2    <NA>
3       2
4       3
dtype: Int64
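The lookup in the example above can be sketched in pure Python with a dictionary that maps each pool item to its position. The function name pool_index_sketch is hypothetical; it returns None where the pandas version shows <NA>, and it assumes that pool items are unique.

```python
def pool_index_sketch(trial_items, pool_items):
    """Zero-based pool index of each trial item, or None if not in the pool."""
    lookup = {item: index for index, item in enumerate(pool_items)}
    return [lookup.get(item) for item in trial_items]


trial_items = ['b', 'a', 'z', 'c', 'd']
pool_items = ['a', 'b', 'c', 'd', 'e', 'f']
print(pool_index_sketch(trial_items, pool_items))  # [1, 0, None, 2, 3]
```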
-
psifr.fr.reset_list(df)
Reset list index in a DataFrame.
- Parameters
df (pandas.DataFrame) – Raw or merged data. Must have subject and list fields.
- Returns
Data with a renumbered list field, starting from 1.
- Return type
pandas.DataFrame
Examples
>>> from psifr import fr
>>> subjects_list = [1, 1]
>>> study_lists = [['a', 'b'], ['c', 'd']]
>>> recall_lists = [['b'], ['c', 'd']]
>>> list_nos = [3, 4]
>>> raw = fr.table_from_lists(subjects_list, study_lists, recall_lists, lists=list_nos)
>>> raw
   subject  list trial_type  position item
0        1     3      study         1    a
1        1     3      study         2    b
2        1     3     recall         1    b
3        1     4      study         1    c
4        1     4      study         2    d
5        1     4     recall         1    c
6        1     4     recall         2    d
>>> fr.reset_list(raw)
   subject  list trial_type  position item
0        1     1      study         1    a
1        1     1      study         2    b
2        1     1     recall         1    b
3        1     2      study         1    c
4        1     2      study         2    d
5        1     2     recall         1    c
6        1     2     recall         2    d
-
psifr.fr.sample_data(study)
Read sample data.
-
psifr.fr.sample_distances(study)
Read sample distances.
-
psifr.fr.spc(df)
Serial position curve.
- Parameters
df (pandas.DataFrame) – Merged study and recall data. See merge_lists.
- Returns
recall – Index includes:
- subject (hashable) – Subject identifier.
- input (int) – Serial position in the list.
Values are:
- recall (float) – Recall probability for each serial position.
- Return type
pandas.Series
See also
plot_spc()
Plot serial position curve results.
pnr()
Probability of nth recall.
Examples
>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> fr.spc(data)
                 recall
subject input
1       1.0    0.541667
        2.0    0.458333
        3.0    0.625000
        4.0    0.333333
        5.0    0.437500
...                 ...
47      20.0   0.500000
        21.0   0.770833
        22.0   0.729167
        23.0   0.895833
        24.0   0.958333

[960 rows x 1 columns]
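A serial position curve is the proportion of lists on which the item at each input position was recalled. The pure-Python sketch below computes it for one subject from a hypothetical recalled/not-recalled grid; the function name spc_sketch and the input layout are assumptions for illustration, and all lists are assumed to have the same length.

```python
def spc_sketch(recalled_by_list):
    """Recall probability at each serial position.

    recalled_by_list holds one list of booleans per study list, where entry
    i indicates whether the item at serial position i + 1 was recalled.
    """
    n_lists = len(recalled_by_list)
    list_length = len(recalled_by_list[0])
    return [
        sum(study_list[pos] for study_list in recalled_by_list) / n_lists
        for pos in range(list_length)
    ]


recalled = [
    [True, False, True],
    [True, True, False],
    [False, False, True],
    [True, False, True],
]
print(spc_sketch(recalled))  # [0.75, 0.25, 0.75]
```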
-
psifr.fr.split_lists(frame, phase, keys=None, names=None, item_query=None, as_list=False)
Convert free recall data from one phase to split format.
- Parameters
frame (pandas.DataFrame) – Free recall data with separate study and recall events.
phase ({'study', 'recall', 'raw'}) – Phase of recall to split. If ‘raw’, all trials will be included.
keys (list of str, optional) – Data columns to include in the split data. If not specified, all columns will be included.
names (list of str, optional) – Name for each column in the returned split data. Default is to use the same names as the input columns.
item_query (str, optional) – Query string to select study trials to include. See pandas.DataFrame.query for allowed format.
as_list (bool, optional) – If true, each column will be output as a list; otherwise, outputs will be numpy.ndarray.
- Returns
split – Data in split format. Each included column will be a key in the dictionary, with a list of either numpy.ndarray (default) or lists, containing the values for that column.
- Return type
dict of str: list
See also
table_from_lists()
Convert list-format data to a table.
Examples
>>> from psifr import fr
>>> study = [['absence', 'hollow'], ['fountain', 'piano']]
>>> recall = [['absence'], ['piano', 'fountain']]
>>> raw = fr.table_from_lists([1, 1], study, recall)
>>> data = fr.merge_free_recall(raw)
>>> data
   subject  list      item  input  output  study  recall  repeat  intrusion  prior_list  prior_input
0        1     1   absence      1     1.0   True    True       0      False         NaN          NaN
1        1     1    hollow      2     NaN   True   False       0      False         NaN          NaN
2        1     2  fountain      1     2.0   True    True       0      False         NaN          NaN
3        1     2     piano      2     1.0   True    True       0      False         NaN          NaN
Get study events split by list, just including the list and item fields.
>>> fr.split_lists(data, 'study', keys=['list', 'item'], as_list=True)
{'list': [[1, 1], [2, 2]], 'item': [['absence', 'hollow'], ['fountain', 'piano']]}
Export recall events, split by list.
>>> fr.split_lists(data, 'recall', keys=['item'], as_list=True)
{'item': [['absence'], ['piano', 'fountain']]}
Raw events (i.e., events that haven’t been scored) can also be exported to list format.
>>> fr.split_lists(raw, 'raw', keys=['position'])
{'position': [array([1, 2, 1]), array([1, 2, 1, 2])]}
-
psifr.fr.table_from_lists(subjects, study, recall, lists=None, **kwargs)
Create table format data from list format data.
- Parameters
subjects (list of hashable) – Subject identifier for each list.
study (list of list of hashable) – List of items for each study list.
recall (list of list of hashable) – List of recalled items for each study list.
lists (list of hashable, optional) – List of list numbers. If not specified, lists for each subject will be numbered sequentially starting from one.
- Returns
data – Data in table format.
- Return type
pandas.DataFrame
See also
split_lists()
Split a table into list format.
Examples
>>> from psifr import fr
>>> subjects_list = [1, 1, 2, 2]
>>> study_lists = [['a', 'b'], ['c', 'd'], ['e', 'f'], ['g', 'h']]
>>> recall_lists = [['b'], ['d', 'c'], ['f', 'e'], []]
>>> fr.table_from_lists(subjects_list, study_lists, recall_lists)
    subject  list trial_type  position item
0         1     1      study         1    a
1         1     1      study         2    b
2         1     1     recall         1    b
3         1     2      study         1    c
4         1     2      study         2    d
5         1     2     recall         1    d
6         1     2     recall         2    c
7         2     1      study         1    e
8         2     1      study         2    f
9         2     1     recall         1    f
10        2     1     recall         2    e
11        2     2      study         1    g
12        2     2      study         2    h
>>> subjects_list = [1, 1]
>>> study_lists = [['a', 'b'], ['c', 'd']]
>>> recall_lists = [['b'], ['d', 'c']]
>>> col1 = ([[1, 2], [1, 2]], [[2], [2, 1]])
>>> col2 = ([[1, 1], [2, 2]], None)
>>> fr.table_from_lists(subjects_list, study_lists, recall_lists, col1=col1, col2=col2)
   subject  list trial_type  position item  col1  col2
0        1     1      study         1    a     1   1.0
1        1     1      study         2    b     2   1.0
2        1     1     recall         1    b     2   NaN
3        1     2      study         1    c     1   2.0
4        1     2      study         2    d     2   2.0
5        1     2     recall         1    d     2   NaN
6        1     2     recall         2    c     1   NaN
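The conversion from list format to table format can be sketched in pure Python as building one row per study or recall event. The function name table_rows_sketch is hypothetical, and the sketch covers only the default case described above, where lists are numbered sequentially from one within each subject; it does not handle the lists argument or extra columns.

```python
def table_rows_sketch(subjects, study, recall):
    """Build (subject, list, trial_type, position, item) rows from lists."""
    rows = []
    lists_seen = {}  # number of lists encountered so far for each subject
    for subject, study_items, recall_items in zip(subjects, study, recall):
        lists_seen[subject] = lists_seen.get(subject, 0) + 1
        list_no = lists_seen[subject]
        # Study events come first, followed by recall events for the list
        for trial_type, items in (('study', study_items), ('recall', recall_items)):
            for position, item in enumerate(items, start=1):
                rows.append((subject, list_no, trial_type, position, item))
    return rows


rows = table_rows_sketch([1, 1], [['a', 'b'], ['c', 'd']], [['b'], ['d', 'c']])
for row in rows:
    print(row)
# (1, 1, 'study', 1, 'a')
# (1, 1, 'study', 2, 'b')
# (1, 1, 'recall', 1, 'b')
# (1, 2, 'study', 1, 'c')
# (1, 2, 'study', 2, 'd')
# (1, 2, 'recall', 1, 'd')
# (1, 2, 'recall', 2, 'c')
```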