psifr.fr.distance_crp

psifr.fr.distance_crp(df, index_key, distances, edges, centers=None, count_unique=False, item_query=None, test_key=None, test=None, drop_bin=False)

Conditional response probability by distance bin.

Parameters:
  • df (pandas.DataFrame) – Merged free recall data.

  • index_key (str) – Name of column containing the index of each item in the distances matrix.

  • distances (numpy.ndarray) – Items x items matrix of pairwise distances or similarities.

  • edges (array-like) – Edges of bins to apply to the distances.

  • centers (array-like, optional) – Centers to label each bin with. If not specified, the center point between edges will be used.

  • count_unique (bool, optional) – If True, each distance bin is counted as possible at most once for a given transition.

  • item_query (str, optional) – Query string to select items to include in the pool of possible recalls to be examined. See pandas.DataFrame.query for allowed format.

  • test_key (str, optional) – Name of column with labels to use when testing transitions for inclusion.

  • test (callable, optional) – Callable that takes in previous and current item values and returns True for transitions that should be included.
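
As an illustration of the test parameter, the following minimal sketch defines a callable that keeps only transitions whose previous and current labels match. The function name same_category and the label values are hypothetical and used only for illustration; the final comment assumes the data has a 'category' column to pass as test_key.

```python
def same_category(prev, curr):
    """Hypothetical test: include only transitions between matching labels."""
    return prev == curr


# Hypothetical labels for successive recalls in one list.
labels = ['cel', 'cel', 'loc', 'obj', 'obj']

# Apply the test to each (previous, current) pair of recalls.
included = [same_category(p, c) for p, c in zip(labels[:-1], labels[1:])]
print(included)  # [True, False, False, True]

# Assuming the data has a 'category' column, a call might look like:
# fr.distance_crp(data, 'item_index', distances, edges,
#                 test_key='category', test=same_category)
```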

Returns:

crp – Has fields:

  • subject (hashable) – Results are separated by each subject.

  • bin (int) – Distance bin.

  • prob (float) – Probability of each distance bin.

  • actual (int) – Total of actual transitions for each distance bin.

  • possible (int) – Total of times each distance bin was possible, given the prior input position and the remaining items to be recalled.

Return type:

pandas.DataFrame
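
The relationship between edges, bin centers, and the prob field can be sketched with NumPy alone. This is an illustration rather than the library's internal implementation: it assumes right-closed bins, as the interval labels in the example output suggest, and uses rounded edges and counts copied from the first three rows of that output. The sample distance values are hypothetical.

```python
import numpy as np

# Rounded bin edges taken from the first three intervals in the example output.
edges = np.array([0.352, 0.583, 0.653, 0.695])

# Default-style centers: the midpoint between each pair of adjacent edges.
centers = (edges[:-1] + edges[1:]) / 2

# Assign hypothetical distances to right-closed bins (lower, upper].
sample_distances = np.array([0.40, 0.583, 0.60, 0.69])
bin_index = np.digitize(sample_distances, edges, right=True) - 1

# Actual and possible counts for subject 1, copied from the example output.
actual = np.array([151, 87, 65])
possible = np.array([1767, 1281, 1040])

# The probability for each bin is the ratio of actual to possible transitions.
prob = actual / possible
```

With these inputs, the computed ratios agree with the prob column in the example output to the displayed precision.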

See also

pool_index

Given a list of presented items and an item pool, look up the pool index of each item.

distance_rank

Calculate rank of transition distances.

Examples

>>> import numpy as np
>>> from scipy.spatial.distance import squareform
>>> from psifr import fr
>>> raw = fr.sample_data('Morton2013')
>>> data = fr.merge_free_recall(raw)
>>> items, distances = fr.sample_distances('Morton2013')
>>> data['item_index'] = fr.pool_index(data['item'], items)
>>> edges = np.percentile(squareform(distances), np.linspace(1, 99, 10))
>>> fr.distance_crp(data, 'item_index', distances, edges)
     subject    center             bin      prob  actual  possible
0          1  0.467532  (0.352, 0.583]  0.085456     151      1767
1          1  0.617748  (0.583, 0.653]  0.067916      87      1281
2          1  0.673656  (0.653, 0.695]  0.062500      65      1040
3          1  0.711075  (0.695, 0.727]  0.051836      48       926
4          1  0.742069  (0.727, 0.757]  0.050633      44       869
..       ...       ...             ...       ...     ...       ...
355       47  0.742069  (0.727, 0.757]  0.062822      61       971
356       47  0.770867  (0.757, 0.785]  0.030682      27       880
357       47  0.800404  (0.785, 0.816]  0.040749      37       908
358       47  0.834473  (0.816, 0.853]  0.046651      39       836
359       47  0.897275  (0.853, 0.941]  0.028868      25       866

[360 rows x 6 columns]
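
Because results are separated by subject, a common follow-up is to average prob across subjects within each bin. The sketch below shows one way to do that with a pandas groupby. To stay self-contained, it builds a small frame from two rows of the output above instead of calling distance_crp; the variable names are illustrative.

```python
import pandas as pd

# Two subjects' rows for the same bin, copied from the example output.
crp = pd.DataFrame({
    'subject': [1, 47],
    'center': [0.742069, 0.742069],
    'prob': [0.050633, 0.062822],
})

# Mean probability for each bin center, averaged across subjects.
mean_crp = crp.groupby('center')['prob'].mean()
```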