Free Recall Analysis

Utilities for working with free recall data.

psifr.fr.block_index(list_labels)

Get index of each block in a list.

Parameters: list_labels (list or numpy.ndarray) – Position labels that define the blocks.
Returns: block – Block index of each position.
Return type: numpy.ndarray

Examples

>>> import numpy as np
>>> from psifr import fr
>>> list_labels = [2, 2, 3, 3, 3, 1, 1]
>>> fr.block_index(list_labels)
array([1, 1, 2, 2, 2, 3, 3])
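
The example output is consistent with a simple rule: the block index goes up by one each time the label changes from one position to the next. A minimal pure-Python sketch of that rule follows, for illustration only; the helper name block_index_sketch is hypothetical and is not part of Psifr.

```python
def block_index_sketch(list_labels):
    """Number consecutive runs of identical labels, starting from 1."""
    block = []
    current = 0
    previous = object()  # sentinel that never equals a real label
    for label in list_labels:
        if label != previous:
            # A change in label starts a new block.
            current += 1
            previous = label
        block.append(current)
    return block


print(block_index_sketch([2, 2, 3, 3, 3, 1, 1]))
```

Note that a label that reappears later in the list starts a new block under this rule, rather than reusing its earlier index.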

psifr.fr.category_crp(df, category_key, item_query=None, test_key=None, test=None)

Conditional response probability of within-category transitions.

Parameters

df (pandas.DataFrame) – Merged study and recall data. See merge_lists. List length is assumed to be the same for all lists within each subject. Must have fields: subject, list, input, output, recall.
category_key (str) – Name of column with category labels.
item_query (str, optional) – Query string to select items to include in the pool of possible recalls to be examined. See pandas.DataFrame.query for allowed format.
test_key (str, optional) – Name of column with labels to use when testing transitions for inclusion.
test (callable, optional) – Callable that takes in previous and current item values and returns True for transitions that should be included.

Returns

results – Has fields:

subject (hashable): Results are separated by each subject.
prob (float): Probability of a within-category transition.
actual (int): Total number of within-category transitions actually made.
possible (int): Total number of times a within-category transition was possible, given the prior item and the remaining items to be recalled.

Return type

pandas.DataFrame

Examples

>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw, study_keys=['category'])
>>> cat_crp = fr.category_crp(data, 'category')
>>> cat_crp.head()
             prob  actual  possible
subject
1        0.801147     419       523
2        0.733456     399       544
3        0.763158     377       494
4        0.814882     449       551
5        0.877273     579       660
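
To make the actual and possible tallies concrete, here is a minimal pure-Python sketch for a single list. It assumes that a within-category transition counts as possible whenever at least one not-yet-recalled item shares the category of the item just recalled, and that the recall sequence contains no repeats or intrusions. The function name category_crp_sketch is hypothetical; this is an illustration of the idea, not Psifr's implementation.

```python
def category_crp_sketch(study_categories, recall_inputs):
    """Tally within-category transitions for one list.

    study_categories: category label of each studied item, in input order.
    recall_inputs: 1-based input positions of the recalled items, in
        output order (assumed free of repeats and intrusions).
    """
    actual = 0
    possible = 0
    remaining = set(range(1, len(study_categories) + 1))
    for prev, curr in zip(recall_inputs, recall_inputs[1:]):
        remaining.discard(prev)
        prev_cat = study_categories[prev - 1]
        # The transition is only scored if some item from the same
        # category is still available to be recalled.
        if any(study_categories[i - 1] == prev_cat for i in remaining):
            possible += 1
            if study_categories[curr - 1] == prev_cat:
                actual += 1
    prob = actual / possible if possible else float('nan')
    return prob, actual, possible


categories = ['cel', 'cel', 'loc', 'loc']
print(category_crp_sketch(categories, [1, 2, 3, 4]))
print(category_crp_sketch(categories, [1, 3, 2]))
```

In the first sequence, the transition from item 2 to item 3 is not scored, because no other item from the first category remains at that point.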

psifr.fr.check_data(df)

Run checks on free recall data.

Parameters

df (pandas.DataFrame) –

Contains one row for each trial (study and recall). Must have fields:

subject (number or str): Subject identifier.
list (number): List identifier. This applies to both study and recall trials.
trial_type (str): Type of trial; may be ‘study’ or ‘recall’.
position (number): Position within the study list or recall sequence.
item (str): Item that was either presented or recalled on this trial.

Examples

>>> from psifr import fr
>>> import pandas as pd
>>> raw = pd.DataFrame(
...     {'subject': [1, 1], 'list': [1, 1], 'position': [1, 2], 'item': ['a', 'b']}
... )
>>> fr.check_data(raw)
Traceback (most recent call last):
  File "psifr/fr.py", line 253, in check_data
    assert col in df.columns, f'Required column {col} is missing.'
AssertionError: Required column trial_type is missing.

psifr.fr.distance_crp(df, index_key, distances, edges, centers=None, count_unique=False, item_query=None, test_key=None, test=None)

Conditional response probability by distance bin.

Parameters

df (pandas.DataFrame) – Merged free recall data.
index_key (str) – Name of column containing the index of each item in the distances matrix.
distances (numpy.ndarray) – Items x items matrix of pairwise distances or similarities.
edges (array-like) – Edges of bins to apply to the distances.
centers (array-like, optional) – Centers to label each bin with. If not specified, the center point between edges will be used.
count_unique (bool, optional) – If true, possible transitions to a given distance bin will only count once for a given transition.
item_query (str, optional) – Query string to select items to include in the pool of possible recalls to be examined. See pandas.DataFrame.query for allowed format.
test_key (str, optional) – Name of column with labels to use when testing transitions for inclusion.
test (callable, optional) – Callable that takes in previous and current item values and returns True for transitions that should be included.

Returns

crp – Has fields:

subject (hashable): Results are separated by each subject.
bin (int): Distance bin.
prob (float): Probability of each distance bin.
actual (int): Total number of actual transitions for each distance bin.
possible (int): Total number of times each distance bin was possible, given the prior input position and the remaining items to be recalled.

Return type

pandas.DataFrame

See also

pool_index(): Given a list of presented items and an item pool, look up the pool index of each item.
distance_rank(): Calculate rank of transition distances.

Examples

>>> import numpy as np
>>> from scipy.spatial.distance import squareform
>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> items, distances = fr.sample_distances('Morton2013')
>>> data['item_index'] = fr.pool_index(data['item'], items)
>>> edges = np.percentile(squareform(distances), np.linspace(1, 99, 10))
>>> fr.distance_crp(data, 'item_index', distances, edges)
                             bin      prob  actual  possible
subject center
1       0.467532  (0.352, 0.583]  0.085456     151      1767
        0.617748  (0.583, 0.653]  0.067916      87      1281
        0.673656  (0.653, 0.695]  0.062500      65      1040
        0.711075  (0.695, 0.727]  0.051836      48       926
        0.742069  (0.727, 0.757]  0.050633      44       869
...                          ...       ...     ...       ...
47      0.742069  (0.727, 0.757]  0.062822      61       971
        0.770867  (0.757, 0.785]  0.030682      27       880
        0.800404  (0.785, 0.816]  0.040749      37       908
        0.834473  (0.816, 0.853]  0.046651      39       836
        0.897275  (0.853, 0.941]  0.028868      25       866

[360 rows x 4 columns]
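
When centers is not given, each bin is labeled with the midpoint between its edges. The sketch below shows that calculation in plain Python, using edge values rounded from the bin labels in the output above. The helper name bin_centers is hypothetical, and because the edges are rounded, the results only approximate the center values shown in the table.

```python
def bin_centers(edges):
    """Midpoint of each pair of adjacent bin edges."""
    return [(low + high) / 2 for low, high in zip(edges, edges[1:])]


# Edges rounded from the first three bins in the example output.
edges = [0.352, 0.583, 0.653, 0.695]
for center in bin_centers(edges):
    print(round(center, 4))
```

N edges always yield N - 1 bins, so the centers argument, when supplied, is expected to be one element shorter than edges.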

psifr.fr.distance_rank(df, index_key, distances, item_query=None, test_key=None, test=None)

Calculate rank of transition distances in free recall lists.

Parameters

df (pandas.DataFrame) – Merged study and recall data. See merge_lists. List length is assumed to be the same for all lists within each subject. Must have fields: subject, list, input, output, recall. Input position must be defined such that the first serial position is 1, not 0.
index_key (str) – Name of column containing the index of each item in the distances matrix.
distances (numpy.ndarray) – Items x items matrix of pairwise distances or similarities.
item_query (str, optional) – Query string to select items to include in the pool of possible recalls to be examined. See pandas.DataFrame.query for allowed format.
test_key (str, optional) – Name of column with labels to use when testing transitions for inclusion.
test (callable, optional) – Callable that takes in previous and current item values and returns True for transitions that should be included.

Returns

stat – Has fields ‘subject’ and ‘rank’.

Return type

pandas.DataFrame

See also

pool_index(): Given a list of presented items and an item pool, look up the pool index of each item.
distance_crp(): Conditional response probability by distance bin.

Examples

>>> from scipy.spatial.distance import squareform
>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> items, distances = fr.sample_distances('Morton2013')
>>> data['item_index'] = fr.pool_index(data['item'], items)
>>> dist_rank = fr.distance_rank(data, 'item_index', distances)
>>> dist_rank.head()
             rank
subject
1        0.635571
2        0.571457
3        0.627282
4        0.637596
5        0.646181

psifr.fr.filter_data(data, subjects=None, lists=None, trial_type=None, positions=None, inputs=None, outputs=None)

Filter data to get a subset of trials.

Parameters

data (pandas.DataFrame) – Raw or merged data to filter.
subjects (hashable or list of hashable) – Subject or subjects to include.
lists (hashable or list of hashable) – List or lists to include.
trial_type ({'study', 'recall'}) – Trial type to include.
positions (int or list of int) – Position or positions to include.
inputs (int or list of int) – Input position or positions to include.
outputs (int or list of int) – Output position or positions to include.

Returns

filtered – The filtered subset of data.

Return type

pandas.DataFrame

Examples

>>> from psifr import fr
>>> subjects_list = [1, 1, 2, 2]
>>> study_lists = [['a', 'b'], ['c', 'd'], ['e', 'f'], ['g', 'h']]
>>> recall_lists = [['b'], ['d', 'c'], ['f', 'e'], []]
>>> raw = fr.table_from_lists(subjects_list, study_lists, recall_lists)
>>> fr.filter_data(raw, subjects=1, trial_type='study')
   subject  list trial_type  position item
0        1     1      study         1    a
1        1     1      study         2    b
3        1     2      study         1    c
4        1     2      study         2    d

>>> data = fr.merge_free_recall(raw)
>>> fr.filter_data(data, subjects=2)
   subject  list item  input  output  study  recall  repeat  intrusion  prior_list  prior_input
4        2     1    e      1     2.0   True    True       0      False         NaN          NaN
5        2     1    f      2     1.0   True    True       0      False         NaN          NaN
6        2     2    g      1     NaN   True   False       0      False         NaN          NaN
7        2     2    h      2     NaN   True   False       0      False         NaN          NaN

psifr.fr.lag_crp(df, lag_key='input', count_unique=False, item_query=None, test_key=None, test=None)

Lag-CRP for multiple subjects.

Parameters

df (pandas.DataFrame) – Merged study and recall data. See merge_lists. List length is assumed to be the same for all lists. Must have fields: subject, list, input, output, recall. Input position must be defined such that the first serial position is 1, not 0.
lag_key (str, optional) – Name of column to use when calculating lag between recalled items. Default is to calculate lag based on input position.
count_unique (bool, optional) – If true, possible transitions of the same lag will only be incremented once per transition.
item_query (str, optional) – Query string to select items to include in the pool of possible recalls to be examined. See pandas.DataFrame.query for allowed format.
test_key (str, optional) – Name of column with labels to use when testing transitions for inclusion.
test (callable, optional) – Callable that takes in previous and current item values and returns True for transitions that should be included.

Returns

results – Has fields:

subject (hashable): Results are separated by each subject.
lag (int): Lag of input position between two adjacent recalls.
prob (float): Probability of each lag transition.
actual (int): Total number of transitions actually made at each lag.
possible (int): Total number of times each lag was possible, given the prior input position and the remaining items to be recalled.

Return type

pandas.DataFrame

See also

lag_rank(): Rank of the absolute lags in recall sequences.

Examples

>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> fr.lag_crp(data)
                   prob  actual  possible
subject lag
1       -23.0  0.020833       1        48
        -22.0  0.035714       3        84
        -21.0  0.026316       3       114
        -20.0  0.024000       3       125
        -19.0  0.014388       2       139
...                 ...     ...       ...
47       19.0  0.061224       3        49
         20.0  0.055556       2        36
         21.0  0.045455       1        22
         22.0  0.071429       1        14
         23.0  0.000000       0         6

[1880 rows x 3 columns]
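
The actual and possible columns can be illustrated with a minimal pure-Python sketch for one list. It assumes a recall sequence with no repeats or intrusions and treats every not-yet-recalled item as a possible next recall. The function name lag_crp_sketch is hypothetical; this is a sketch of the standard lag-CRP tally, not Psifr's own code, which also handles multiple subjects, item_query, and the test options.

```python
from collections import Counter


def lag_crp_sketch(list_length, recall_inputs):
    """Tally actual and possible lags for one recall sequence.

    recall_inputs: 1-based input positions in output order.
    """
    actual = Counter()
    possible = Counter()
    remaining = set(range(1, list_length + 1))
    for prev, curr in zip(recall_inputs, recall_inputs[1:]):
        remaining.discard(prev)
        # Every item not yet recalled defines one possible lag.
        for item in remaining:
            possible[item - prev] += 1
        actual[curr - prev] += 1
    prob = {lag: actual[lag] / n for lag, n in sorted(possible.items())}
    return prob, actual, possible


prob, actual, possible = lag_crp_sketch(4, [2, 3, 1])
print(prob)
```

For the sequence 2, 3, 1 in a four-item list, lag +1 was possible twice but taken once, so its conditional probability is 0.5, while lag -2 was possible once and taken once.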

psifr.fr.lag_rank(df, item_query=None, test_key=None, test=None)

Calculate rank of the absolute lags in free recall lists.

Parameters

df (pandas.DataFrame) – Merged study and recall data. See merge_lists. List length is assumed to be the same for all lists within each subject. Must have fields: subject, list, input, output, recall. Input position must be defined such that the first serial position is 1, not 0.
item_query (str, optional) – Query string to select items to include in the pool of possible recalls to be examined. See pandas.DataFrame.query for allowed format.
test_key (str, optional) – Name of column with labels to use when testing transitions for inclusion.
test (callable, optional) – Callable that takes in previous and current item values and returns True for transitions that should be included.

Returns

stat – Has fields ‘subject’ and ‘rank’.

Return type

pandas.DataFrame

See also

lag_crp(): Conditional response probability by input lag.

Examples

>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> lag_rank = fr.lag_rank(data)
>>> lag_rank.head()
             rank
subject
1        0.610953
2        0.635676
3        0.612607
4        0.667090
5        0.643923

psifr.fr.merge_free_recall(data, **kwargs)

Score free recall data by matching up study and recall events.

Parameters

data (pandas.DataFrame) – Free recall data in Psifr format. Must have subject, list, trial_type, position, and item columns.
merge_keys (list, optional) – Columns to use to designate events to merge. Default is [‘subject’, ‘list’, ‘item’], which will merge events related to the same item, but only within list.
list_keys (list, optional) – Columns that apply to both study and recall events.
study_keys (list, optional) – Columns that only apply to study events.
recall_keys (list, optional) – Columns that only apply to recall events.
position_key (str, optional) – Column indicating the position of each item in either the study list or the recall sequence.

Returns

merged – Merged information about study and recall events. Each row corresponds to one unique input/output pair.

The following columns will be added:

input (int): Position of each item in the input list (i.e., serial position).
output (int): Position of each item in the recall sequence.
study (bool): True for rows corresponding to a unique study event.
recall (bool): True for rows corresponding to a unique recall event.
repeat (int): Number of times this recall event has been repeated (0 for the first recall of an item).
intrusion (bool): True for recalls that do not correspond to any study event.
prior_list (int): For prior-list intrusions, the list in which the item was presented.
prior_input (int): For prior-list intrusions, the input position at which the item was presented.

Return type

pandas.DataFrame

See also

merge_lists(): Flexibly merge study events with recall events. Useful for recall phases that don’t match the typical free recall setup, like final free recall of all lists.

Examples

>>> import numpy as np
>>> from psifr import fr
>>> study = [['absence', 'hollow'], ['fountain', 'piano']]
>>> recall = [['absence'], ['piano', 'hollow']]
>>> raw = fr.table_from_lists([1, 1], study, recall)
>>> raw
   subject  list trial_type  position      item
0        1     1      study         1   absence
1        1     1      study         2    hollow
2        1     1     recall         1   absence
3        1     2      study         1  fountain
4        1     2      study         2     piano
5        1     2     recall         1     piano
6        1     2     recall         2    hollow

Score the data to create a table with matched study and recall events.

>>> data = fr.merge_free_recall(raw)
>>> data
   subject  list      item  input  output  study  recall  repeat  intrusion  prior_list  prior_input
0        1     1   absence    1.0     1.0   True    True       0      False         NaN          NaN
1        1     1    hollow    2.0     NaN   True   False       0      False         NaN          NaN
2        1     2  fountain    1.0     NaN   True   False       0      False         NaN          NaN
3        1     2     piano    2.0     1.0   True    True       0      False         NaN          NaN
4        1     2    hollow    NaN     2.0  False    True       0       True         1.0          2.0

You can also include non-standard columns. Information that only applies to study events (here, the encoding task used) can be indicated using the study_keys input.

>>> raw['task'] = np.array([1, 2, np.nan, 2, 1, np.nan, np.nan])
>>> fr.merge_free_recall(raw, study_keys=['task'])
   subject  list      item  input  output  study  recall  repeat  intrusion  task  prior_list  prior_input
0        1     1   absence    1.0     1.0   True    True       0      False   1.0         NaN          NaN
1        1     1    hollow    2.0     NaN   True   False       0      False   2.0         NaN          NaN
2        1     2  fountain    1.0     NaN   True   False       0      False   2.0         NaN          NaN
3        1     2     piano    2.0     1.0   True    True       0      False   1.0         NaN          NaN
4        1     2    hollow    NaN     2.0  False    True       0       True   NaN         1.0          2.0

Information that only applies to recall events (here, the time in seconds after the start of the recall phase at which each recall attempt was made) can be indicated using the recall_keys input.

>>> raw['onset'] = np.array([np.nan, np.nan, 1.1, np.nan, np.nan, 1.4, 3.8])
>>> fr.merge_free_recall(raw, recall_keys=['onset'])
   subject  list      item  input  output  study  recall  repeat  intrusion  onset  prior_list  prior_input
0        1     1   absence    1.0     1.0   True    True       0      False    1.1         NaN          NaN
1        1     1    hollow    2.0     NaN   True   False       0      False    NaN         NaN          NaN
2        1     2  fountain    1.0     NaN   True   False       0      False    NaN         NaN          NaN
3        1     2     piano    2.0     1.0   True    True       0      False    1.4         NaN          NaN
4        1     2    hollow    NaN     2.0  False    True       0       True    3.8         1.0          2.0

Use list_keys to indicate columns that apply to both study and recall events. If list_keys do not match for a pair of study and recall events, they will not be matched in the output.

>>> raw['condition'] = np.array([1, 1, 1, 2, 2, 2, 2])
>>> fr.merge_free_recall(raw, list_keys=['condition'])
   subject  list      item  input  output  study  recall  repeat  intrusion  condition  prior_list  prior_input
0        1     1   absence    1.0     1.0   True    True       0      False          1         NaN          NaN
1        1     1    hollow    2.0     NaN   True   False       0      False          1         NaN          NaN
2        1     2  fountain    1.0     NaN   True   False       0      False          2         NaN          NaN
3        1     2     piano    2.0     1.0   True    True       0      False          2         NaN          NaN
4        1     2    hollow    NaN     2.0  False    True       0       True          2         1.0          2.0

psifr.fr.merge_lists(study, recall, merge_keys=None, list_keys=None, study_keys=None, recall_keys=None, position_key='position')

Merge study and recall events together for each list.

Parameters

study (pandas.DataFrame) – Information about all study events. Should have one row for each study event.
recall (pandas.DataFrame) – Information about all recall events. Should have one row for each recall attempt.
merge_keys (list, optional) – Columns to use to designate events to merge. Default is [‘subject’, ‘list’, ‘item’], which will merge events related to the same item, but only within list.
list_keys (list, optional) – Columns that apply to both study and recall events.
study_keys (list, optional) – Columns that only apply to study events.
recall_keys (list, optional) – Columns that only apply to recall events.
position_key (str, optional) – Column indicating the position of each item in either the study list or the recall sequence.

Returns

merged – Merged information about study and recall events. Each row corresponds to one unique input/output pair.

The following columns will be added:

input (int): Position of each item in the input list (i.e., serial position).
output (int): Position of each item in the recall sequence.
study (bool): True for rows corresponding to a unique study event.
recall (bool): True for rows corresponding to a unique recall event.
repeat (int): Number of times this recall event has been repeated (0 for the first recall of an item).
intrusion (bool): True for recalls that do not correspond to any study event.

Return type

pandas.DataFrame

See also

merge_free_recall(): Score standard free recall data.

Examples

>>> import pandas as pd
>>> from psifr import fr
>>> study = pd.DataFrame(
...    {'subject': [1, 1], 'list': [1, 1], 'position': [1, 2], 'item': ['a', 'b']}
... )
>>> recall = pd.DataFrame(
...    {'subject': [1], 'list': [1], 'position': [1], 'item': ['b']}
... )
>>> fr.merge_lists(study, recall)
   subject  list item  input  output  study  recall  repeat  intrusion
0        1     1    a      1     NaN   True   False       0      False
1        1     1    b      2     1.0   True    True       0      False
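
The matching logic behind the merged table can be sketched in plain Python for a single list. This simplified illustration assumes no repeated recalls and ignores the merge_keys, list_keys, study_keys, and recall_keys options; the function name merge_list_sketch is hypothetical and the code is not how Psifr implements the merge.

```python
def merge_list_sketch(study_items, recall_items):
    """Match recall events to study events within one list."""
    rows = []
    # One row per studied item, with its output position if recalled.
    for position, item in enumerate(study_items, start=1):
        recalled = item in recall_items
        output = recall_items.index(item) + 1 if recalled else None
        rows.append({
            'item': item, 'input': position, 'output': output,
            'study': True, 'recall': recalled, 'intrusion': False,
        })
    # Recalls that match no study event are intrusions.
    for position, item in enumerate(recall_items, start=1):
        if item not in study_items:
            rows.append({
                'item': item, 'input': None, 'output': position,
                'study': False, 'recall': True, 'intrusion': True,
            })
    return rows


for row in merge_list_sketch(['a', 'b'], ['b']):
    print(row)
```

As in the example above, item a has an input position but no output position, while item b is matched to the first recall.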

psifr.fr.pli_list_lag(df, max_lag)

List lag of prior-list intrusions.

Parameters

df (pandas.DataFrame) – Merged study and recall data. See merge_free_recall. Must have fields: subject, list, intrusion, prior_list. Lists must be numbered starting from 1 and all lists must be included.
max_lag (int) – Maximum list lag to consider. The initial max_lag lists for each subject will be excluded so that all considered lags are possible for all included lists.

Returns

results – For each subject and list lag, the proportion of intrusions at that lag, in the results['prob'] column.

Return type

pandas.DataFrame

Examples

>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> fr.pli_list_lag(data, 3)
                  count  per_list      prob
subject list_lag
1       1             7  0.155556  0.259259
        2             5  0.111111  0.185185
        3             0  0.000000  0.000000
2       1             9  0.200000  0.191489
        2             2  0.044444  0.042553
...                 ...       ...       ...
46      2             1  0.022222  0.100000
        3             0  0.000000  0.000000
47      1             5  0.111111  0.277778
        2             1  0.022222  0.055556
        3             0  0.000000  0.000000

[120 rows x 3 columns]
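
A minimal sketch of the list-lag tally for one subject follows. The list lag of an intrusion is the current list number minus prior_list. The normalization used here (per_list as the count divided by the number of included lists, prob as the count divided by all prior-list intrusions made in included lists) is an assumption inferred from the example output above, not taken from the Psifr source. The function name pli_list_lag_sketch is hypothetical, and the sketch assumes n_lists is greater than max_lag.

```python
from collections import Counter


def pli_list_lag_sketch(intrusions, n_lists, max_lag):
    """Tally prior-list intrusions by list lag for one subject.

    intrusions: (list, prior_list) pairs, one per prior-list intrusion.
    n_lists: total number of lists, numbered from 1.
    """
    # Skip the initial max_lag lists, where some lags are impossible.
    included = [(lst, prior) for lst, prior in intrusions if lst > max_lag]
    counts = Counter(lst - prior for lst, prior in included)
    n_included_lists = n_lists - max_lag
    total = len(included)
    results = {}
    for lag in range(1, max_lag + 1):
        count = counts[lag]
        results[lag] = {
            'count': count,
            'per_list': count / n_included_lists,
            'prob': count / total if total else float('nan'),
        }
    return results


pairs = [(4, 3), (4, 2), (5, 4), (5, 1), (2, 1)]
print(pli_list_lag_sketch(pairs, n_lists=5, max_lag=2))
```

Under this assumption, intrusions at lags greater than max_lag still count toward the denominator of prob, which is why the prob values for a subject need not sum to one.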

psifr.fr.plot_distance_crp(crp, min_samples=None, **facet_kws)

Plot response probability by distance bin.

Parameters

crp (pandas.DataFrame) – Results from fr.distance_crp.
min_samples (int, optional) – Minimum number of samples a bin must have per subject to be included in the plot.
**facet_kws – Additional inputs to pass to seaborn.relplot.

psifr.fr.plot_lag_crp(recall, max_lag=5, split=True, **facet_kws)

Plot conditional response probability by lag.

Additional arguments are passed to seaborn.FacetGrid.

Parameters

recall (pandas.DataFrame) – Results from calling lag_crp.
max_lag (int, optional) – Maximum absolute lag to plot.
split (bool, optional) – If true, will plot as two separate lines with a gap at lag 0.

psifr.fr.plot_raster(df, hue='input', palette=None, marker='s', intrusion_color=None, orientation='horizontal', length=6, aspect=None, legend='auto', **facet_kws)

Plot recalls in a raster plot.

Parameters

df (pandas.DataFrame) – Scored free recall data.
hue (str or None, optional) – Column to use to set marker color.
palette (optional) – Palette specification supported by Seaborn.
marker (str, optional) – Marker code supported by Seaborn.
intrusion_color (optional) – Color of intrusions.
orientation ({'horizontal', 'vertical'}, optional) – Whether lists should be stacked horizontally or vertically in the plot.
length (float, optional) – Size of the plot dimension along which list varies.
aspect (float, optional) – Aspect ratio of plot for lists over items.
legend (str, optional) – Legend setting. See seaborn.scatterplot for details.
facet_kws (optional) – Additional keyword arguments to pass to seaborn.FacetGrid.

psifr.fr.plot_spc(recall, **facet_kws)

Plot a serial position curve.

Additional arguments are passed to seaborn.relplot.

Parameters: recall (pandas.DataFrame) – Results from calling spc.

psifr.fr.plot_swarm_error(data, x=None, y=None, swarm_color=None, swarm_size=5, point_color='k', **facet_kws)

Plot points as a swarm plus mean with error bars.

Parameters

data (pandas.DataFrame) – DataFrame with statistics to plot.
x (str) – Name of variable to plot on x-axis.
y (str) – Name of variable to plot on y-axis.
swarm_color – Color for swarm plot points. May use any specification supported by seaborn.
swarm_size (float) – Size of swarm plot points.
point_color – Color for the point plot (error bars).
facet_kws – Additional keywords for the FacetGrid.

psifr.fr.pnr(df, item_query=None, test_key=None, test=None)

Probability of recall by serial position and output position.

Calculate probability of Nth recall, where N is each output position. Invalid recalls (repeats and intrusions) are ignored and not counted toward output position.

Parameters

df (pandas.DataFrame) – Merged study and recall data. See merge_lists. List length is assumed to be the same for all lists within each subject. Must have fields: subject, list, input, output, study, recall. Input position must be defined such that the first serial position is 1, not 0.
item_query (str, optional) – Query string to select items to include in the pool of possible recalls to be examined. See pandas.DataFrame.query for allowed format.
test_key (str, optional) – Name of column with labels to use when testing transitions for inclusion.
test (callable, optional) – Callable that takes in previous and current item values and returns True for transitions that should be included.

Returns

prob – Analysis results. Has fields: subject, output, input, prob, actual, possible. The prob column for output x and input y indicates the probability of recalling input position y at output position x. The actual and possible columns give the raw tallies for how many times an event actually occurred and how many times it was possible given the recall sequence.

Return type

pandas.DataFrame

See also

plot_spc(): Plot recall probability as a function of serial position.
spc(): Overall recall probability by serial position.

Examples

>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> fr.pnr(data)
                          prob  actual  possible
subject output input
1       1      1      0.000000       0        48
               2      0.020833       1        48
               3      0.000000       0        48
               4      0.000000       0        48
               5      0.000000       0        48
...                        ...     ...       ...
47      24     20          NaN       0         0
               21          NaN       0         0
               22          NaN       0         0
               23          NaN       0         0
               24          NaN       0         0

[23040 rows x 3 columns]

psifr.fr.pool_index(trial_items, pool_items_list)

Get the index of each item in the full pool.

Parameters

trial_items (pandas.Series) – The item presented on each trial.
pool_items_list (list or numpy.ndarray) – List of items in the full pool.

Returns

item_index – Index of each item in the pool. Trials with items not in the pool will be <NA>.

Return type

pandas.Series

Examples

>>> import pandas as pd
>>> from psifr import fr
>>> trial_items = pd.Series(['b', 'a', 'z', 'c', 'd'])
>>> pool_items_list = ['a', 'b', 'c', 'd', 'e', 'f']
>>> fr.pool_index(trial_items, pool_items_list)
0       1
1       0
2    <NA>
3       2
4       3
dtype: Int64
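
The lookup itself amounts to mapping each item to its position in the pool. A minimal pure-Python sketch, with None standing in for the <NA> value used in the pandas output (the helper name pool_index_sketch is hypothetical):

```python
def pool_index_sketch(trial_items, pool_items_list):
    """Look up the 0-based pool index of each trial item."""
    lookup = {item: index for index, item in enumerate(pool_items_list)}
    # Items missing from the pool map to None.
    return [lookup.get(item) for item in trial_items]


trial_items = ['b', 'a', 'z', 'c', 'd']
pool_items_list = ['a', 'b', 'c', 'd', 'e', 'f']
print(pool_index_sketch(trial_items, pool_items_list))
```

The resulting indices are what distance_crp and distance_rank expect in the column named by index_key, since they index rows and columns of the distances matrix.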

psifr.fr.reset_list(df)

Reset list index in a DataFrame.

Parameters: df (pandas.DataFrame) – Raw or merged data. Must have subject and list fields.
Returns: Data with a renumbered list field, starting from 1.
Return type: pandas.DataFrame

Examples

>>> from psifr import fr
>>> subjects_list = [1, 1]
>>> study_lists = [['a', 'b'], ['c', 'd']]
>>> recall_lists = [['b'], ['c', 'd']]
>>> list_nos = [3, 4]
>>> raw = fr.table_from_lists(subjects_list, study_lists, recall_lists, lists=list_nos)
>>> raw
   subject  list trial_type  position item
0        1     3      study         1    a
1        1     3      study         2    b
2        1     3     recall         1    b
3        1     4      study         1    c
4        1     4      study         2    d
5        1     4     recall         1    c
6        1     4     recall         2    d

>>> fr.reset_list(raw)
   subject  list trial_type  position item
0        1     1      study         1    a
1        1     1      study         2    b
2        1     1     recall         1    b
3        1     2      study         1    c
4        1     2      study         2    d
5        1     2     recall         1    c
6        1     2     recall         2    d
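
The renumbering can be sketched in plain Python as follows. This version numbers each subject's lists in order of first appearance, which is an assumption; Psifr may order the lists differently (for example, by sorted list label). The function name reset_list_sketch is hypothetical.

```python
def reset_list_sketch(subjects, lists):
    """Renumber lists from 1 within each subject.

    subjects, lists: per-trial subject and list identifiers.
    """
    numbering = {}
    renumbered = []
    for subject, list_id in zip(subjects, lists):
        subject_map = numbering.setdefault(subject, {})
        if list_id not in subject_map:
            # First time this list is seen for this subject.
            subject_map[list_id] = len(subject_map) + 1
        renumbered.append(subject_map[list_id])
    return renumbered


subjects = [1, 1, 1, 1, 1, 1, 1]
lists = [3, 3, 3, 4, 4, 4, 4]
print(reset_list_sketch(subjects, lists))
```

Renumbering matters for analyses such as pli_list_lag, which require lists to be numbered starting from 1.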

psifr.fr.sample_data(study)

Read sample data.

psifr.fr.sample_distances(study)

Read sample distances.

psifr.fr.spc(df)

Serial position curve.

Parameters

df (pandas.DataFrame) – Merged study and recall data. See merge_lists.

Returns

recall – Index includes:

subject (hashable): Subject identifier.
input (int): Serial position in the list.

Values are:

recall (float): Recall probability for each serial position.

Return type

pandas.DataFrame

See also

plot_spc(): Plot serial position curve results.
pnr(): Probability of nth recall.

Examples

>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> fr.spc(data)
                 recall
subject input
1       1.0    0.541667
        2.0    0.458333
        3.0    0.625000
        4.0    0.333333
        5.0    0.437500
...                 ...
47      20.0   0.500000
        21.0   0.770833
        22.0   0.729167
        23.0   0.895833
        24.0   0.958333

[960 rows x 1 columns]
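
For one subject, the serial position curve is the fraction of lists in which the item at each input position was recalled. A minimal pure-Python sketch, assuming equal list lengths and recall sequences given as 1-based input positions (the function name spc_sketch is hypothetical):

```python
def spc_sketch(list_length, recalls_by_list):
    """Recall probability at each serial position for one subject.

    recalls_by_list: for each list, the input positions that were recalled.
    """
    n_lists = len(recalls_by_list)
    curve = {}
    for position in range(1, list_length + 1):
        # Count the lists in which this position was recalled at least once.
        n_recalled = sum(position in set(recalls) for recalls in recalls_by_list)
        curve[position] = n_recalled / n_lists
    return curve


print(spc_sketch(3, [[3, 1], [3], [2, 3]]))
```

Here the last position is recalled in all three lists, so its recall probability is 1.0, while the first two positions are each recalled in one list of three.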

psifr.fr.split_lists(frame, phase, keys=None, names=None, item_query=None, as_list=False)

Convert free recall data from one phase to split format.

Parameters

frame (pandas.DataFrame) – Free recall data with separate study and recall events.
phase ({'study', 'recall', 'raw'}) – Phase of recall to split. If ‘raw’, all trials will be included.
keys (list of str, optional) – Data columns to include in the split data. If not specified, all columns will be included.
names (list of str, optional) – Name for each column in the returned split data. Default is to use the same names as the input columns.
item_query (str, optional) – Query string to select study trials to include. See pandas.DataFrame.query for allowed format.
as_list (bool, optional) – If true, each column will be output as a list; otherwise, outputs will be numpy.ndarray.

Returns

split – Data in split format. Each included column will be a key in the dictionary, with a list of either numpy.ndarray (default) or lists, containing the values for that column.

Return type

dict of str: list

See also

table_from_lists(): Convert list-format data to a table.

Examples

>>> from psifr import fr
>>> study = [['absence', 'hollow'], ['fountain', 'piano']]
>>> recall = [['absence'], ['piano', 'fountain']]
>>> raw = fr.table_from_lists([1, 1], study, recall)
>>> data = fr.merge_free_recall(raw)
>>> data
   subject  list      item  input  output  study  recall  repeat  intrusion  prior_list  prior_input
0        1     1   absence      1     1.0   True    True       0      False         NaN          NaN
1        1     1    hollow      2     NaN   True   False       0      False         NaN          NaN
2        1     2  fountain      1     2.0   True    True       0      False         NaN          NaN
3        1     2     piano      2     1.0   True    True       0      False         NaN          NaN

Get study events split by list, just including the list and item fields.

>>> fr.split_lists(data, 'study', keys=['list', 'item'], as_list=True)
{'list': [[1, 1], [2, 2]], 'item': [['absence', 'hollow'], ['fountain', 'piano']]}

Export recall events, split by list.

>>> fr.split_lists(data, 'recall', keys=['item'], as_list=True)
{'item': [['absence'], ['piano', 'fountain']]}

Raw events (i.e., events that haven’t been scored) can also be exported to list format.

>>> fr.split_lists(raw, 'raw', keys=['position'])
{'position': [array([1, 2, 1]), array([1, 2, 1, 2])]}

psifr.fr.table_from_lists(subjects, study, recall, lists=None, **kwargs)

Create table format data from list format data.

Parameters

subjects (list of hashable) – Subject identifier for each list.
study (list of list of hashable) – List of items for each study list.
recall (list of list of hashable) – List of recalled items for each study list.
lists (list of hashable, optional) – List of list numbers. If not specified, lists for each subject will be numbered sequentially starting from one.
**kwargs – Additional columns to add to the table. Each keyword gives a column name, and its value is a (study, recall) tuple of list-format values; use None for a phase the column does not apply to.

Returns

data – Data in table format.

Return type

pandas.DataFrame

See also

split_lists(): Split a table into list format.

Examples

>>> from psifr import fr
>>> subjects_list = [1, 1, 2, 2]
>>> study_lists = [['a', 'b'], ['c', 'd'], ['e', 'f'], ['g', 'h']]
>>> recall_lists = [['b'], ['d', 'c'], ['f', 'e'], []]
>>> fr.table_from_lists(subjects_list, study_lists, recall_lists)
    subject  list trial_type  position item
0         1     1      study         1    a
1         1     1      study         2    b
2         1     1     recall         1    b
3         1     2      study         1    c
4         1     2      study         2    d
5         1     2     recall         1    d
6         1     2     recall         2    c
7         2     1      study         1    e
8         2     1      study         2    f
9         2     1     recall         1    f
10        2     1     recall         2    e
11        2     2      study         1    g
12        2     2      study         2    h

>>> subjects_list = [1, 1]
>>> study_lists = [['a', 'b'], ['c', 'd']]
>>> recall_lists = [['b'], ['d', 'c']]
>>> col1 = ([[1, 2], [1, 2]], [[2], [2, 1]])
>>> col2 = ([[1, 1], [2, 2]], None)
>>> fr.table_from_lists(subjects_list, study_lists, recall_lists, col1=col1, col2=col2)
   subject  list trial_type  position item  col1  col2
0        1     1      study         1    a     1   1.0
1        1     1      study         2    b     2   1.0
2        1     1     recall         1    b     2   NaN
3        1     2      study         1    c     1   2.0
4        1     2      study         2    d     2   2.0
5        1     2     recall         1    d     2   NaN
6        1     2     recall         2    c     1   NaN
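
The conversion from list format to table format can be sketched in plain Python. This version produces one tuple per trial, with all study trials of a list followed by its recall trials, and numbers each subject's lists sequentially from 1. It omits the lists and **kwargs options, and the function name table_rows_sketch is hypothetical.

```python
def table_rows_sketch(subjects, study, recall):
    """Build (subject, list, trial_type, position, item) rows."""
    rows = []
    list_counter = {}
    for subject, study_list, recall_list in zip(subjects, study, recall):
        # Number lists sequentially within each subject, starting from 1.
        list_counter[subject] = list_counter.get(subject, 0) + 1
        list_no = list_counter[subject]
        for trial_type, items in (('study', study_list), ('recall', recall_list)):
            for position, item in enumerate(items, start=1):
                rows.append((subject, list_no, trial_type, position, item))
    return rows


subjects_list = [1, 1, 2, 2]
study_lists = [['a', 'b'], ['c', 'd'], ['e', 'f'], ['g', 'h']]
recall_lists = [['b'], ['d', 'c'], ['f', 'e'], []]
for row in table_rows_sketch(subjects_list, study_lists, recall_lists):
    print(row)
```

A list with no recalls, like the last one here, contributes only study rows, which matches the 13-row table in the first example above.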