Extracting and processing VisPubData data with the Cartolabe API
Comparing the quality of embeddings using multiple methods
# %matplotlib inline
# %load_ext autoreload
# %autoreload 2
# %matplotlib widget
Download data
We start by downloading the VisPubData dataset from a Google spreadsheet. See Petra Isenberg, Florian Heimerl, Steffen Koch, Tobias Isenberg, Panpan Xu, et al. vispubdata.org: A Metadata Collection about IEEE Visualization (VIS) Publications. IEEE Transactions on Visualization and Computer Graphics, 2017, 23 (9), pp. 2199-2206. ⟨[10.1109/TVCG.2016.2615308](https://dx.doi.org/10.1109/TVCG.2016.2615308)⟩. ⟨hal-01376597⟩
SHEET_ID = '1xgoOPu28dQSSGPIp_HHQs0uvvcyLNdkMF9XtRajhhxU'
SHEET_NAME = 'Main%20dataset'
url = f'https://docs.google.com/spreadsheets/d/{SHEET_ID}/gviz/tq?tqx=out:csv&sheet={SHEET_NAME}'
min_df = 25
max_df = 0.1
max_words = 100000
vocab_sample = 250000
num_dims = 300
filt_min_score = 3
n_neighbors = 10
import pandas as pd # noqa
df = pd.read_csv(url)
# Assign the result back instead of calling fillna(inplace=True) on a column,
# which raises a chained-assignment FutureWarning in recent pandas versions.
df['AuthorKeywords'] = df['AuthorKeywords'].fillna('')
df['Abstract'] = df['Abstract'].fillna('')
df['AuthorAffiliation'] = df['AuthorAffiliation'].fillna('')
df['text'] = df['Abstract'] + ' ' + df['AuthorKeywords'] + ' ' + df['Title']
df.head()
Creating correspondence matrices for each entity type
From this table of articles, we want to extract matrices that map the correspondence between the articles and the entities we want to use.
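As a rough illustration of what such a correspondence matrix contains, the sketch below builds one by hand from a separator-delimited column. It is not the cartodata loader: the helper name `build_correspondence`, the toy author names, the `;` separator and the (articles by entities) orientation are all assumptions made for this example.

```python
from collections import Counter


def build_correspondence(rows, sep=";"):
    """Return (matrix, entities, scores) for separator-delimited strings.

    matrix[i][j] is 1 when article i is linked to entity j, else 0.
    scores maps each entity to the number of articles it appears in.
    """
    # Split each cell, trimming spaces and dropping empty fragments.
    parsed = [[name.strip() for name in row.split(sep) if name.strip()]
              for row in rows]
    # Count each entity at most once per article.
    scores = Counter(name for names in parsed for name in set(names))
    # Fix a column order so that the matrix is reproducible.
    entities = sorted(scores)
    column = {name: j for j, name in enumerate(entities)}
    matrix = [[0] * len(entities) for _ in parsed]
    for i, names in enumerate(parsed):
        for name in names:
            matrix[i][column[name]] = 1
    return matrix, entities, scores


# Hypothetical toy data: three articles and their author lists.
toy_authors = ["A. Author;B. Author", "B. Author", "B. Author;C. Author"]
toy_matrix, toy_entities, toy_scores = build_correspondence(toy_authors)
print(toy_entities)
print(toy_matrix)
```

In this toy case the score of "B. Author" is 3 because that name is linked to all three articles, which is the sense in which the text below speaks of low-score and high-score entities.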
Filtering low score entities
Many of the authors that we just extracted from the dataframe have a very low score, which means they are linked to only one or two articles. To improve the quality of our data, we’ll filter out the authors that appear fewer than 3 times.
To do this, we’ll use the filter_min_score function.
from cartodata.operations import filter_min_score # noqa
authors_before = len(authors_scores)
authors_mat, authors_scores = filter_min_score(authors_mat,
authors_scores,
filt_min_score)
print(f"Removed {authors_before - len(authors_scores)} authors with less "
f"than 3 articles from a total of {authors_before} authors.")
print(f"Working with {len(authors_scores)} authors.\n")
Removed 6267 authors with less than 3 articles from a total of 6987 authors.
Working with 720 authors.
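The filtering step can be pictured with plain Python lists. The sketch below illustrates the idea under stated assumptions and is not the cartodata implementation: `filter_by_min_score` is a hypothetical helper, the matrix is a dense list of rows (articles) by columns (entities), and the score of an entity is the number of articles it is linked to.

```python
def filter_by_min_score(matrix, entities, scores, min_score):
    """Keep only the entity columns whose score is at least min_score."""
    keep = [j for j, name in enumerate(entities) if scores[name] >= min_score]
    kept_entities = [entities[j] for j in keep]
    # Drop the columns of the removed entities from every article row.
    kept_matrix = [[row[j] for j in keep] for row in matrix]
    kept_scores = {name: scores[name] for name in kept_entities}
    return kept_matrix, kept_entities, kept_scores


# Hypothetical toy data: 3 articles, 3 authors.
toy_matrix = [[1, 1, 0], [0, 1, 0], [0, 1, 1]]
toy_entities = ["A. Author", "B. Author", "C. Author"]
toy_scores = {"A. Author": 1, "B. Author": 3, "C. Author": 1}

kept_matrix, kept_entities, kept_scores = filter_by_min_score(
    toy_matrix, toy_entities, toy_scores, min_score=3)
print(kept_entities)  # ['B. Author']
```

Both the matrix and the scores are filtered together so that column j of the matrix always refers to entry j of the score table.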
Words
For the words, it’s a bit trickier because we want to extract n-grams (groups of n terms) instead of just comma-separated values. We’ll call load_text_column, which uses scikit-learn’s CountVectorizer to create a vocabulary and map the tokens.
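To make the idea of n-grams and document-frequency bounds concrete, here is a small stand-alone sketch. It does not reproduce load_text_column or CountVectorizer: `extract_ngrams` and `document_frequency_filter` are hypothetical helpers, and whitespace tokenization is a simplifying assumption. In CountVectorizer, an integer min_df is an absolute document count and a float max_df is a proportion of documents; whether load_text_column forwards these values unchanged is also an assumption.

```python
from collections import Counter


def extract_ngrams(text, max_n, stopwords=frozenset()):
    """Return all n-grams of 1..max_n consecutive tokens, skipping stopwords."""
    tokens = [tok for tok in text.lower().split() if tok not in stopwords]
    ngrams = []
    for n in range(1, max_n + 1):
        for start in range(len(tokens) - n + 1):
            ngrams.append(" ".join(tokens[start:start + n]))
    return ngrams


def document_frequency_filter(docs, min_df, max_df):
    """Keep terms found in at least min_df documents and in at most a
    max_df fraction of all documents."""
    counts = Counter(term for doc in docs for term in set(doc))
    limit = max_df * len(docs)
    return sorted(t for t, c in counts.items() if min_df <= c <= limit)


example = extract_ngrams("Visualization of large graphs", 2, stopwords={"of"})
print(example)

# Hypothetical toy corpus of four short texts.
texts = ["graph layout", "graph drawing", "graph layout study",
         "volume rendering"]
docs = [extract_ngrams(t, 2) for t in texts]
vocabulary = document_frequency_filter(docs, min_df=2, max_df=0.5)
print(vocabulary)
```

In the toy corpus, "graph" is dropped for being too frequent (3 of 4 documents), while terms seen in a single document are dropped for being too rare.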
from cartodata.loading import load_text_column # noqa
from sklearn.feature_extraction import text as sktxt # noqa
with open('../datas/stopwords.txt', 'r') as stop_file:
    stopwords = sktxt.ENGLISH_STOP_WORDS.union(
        set(stop_file.read().splitlines()))
words_mat, words_scores = load_text_column(df['text'],
                                           4,
                                           min_df,
                                           max_df,
                                           stopwords=stopwords)
words_scores.head()
words_mat.shape
from cartodata.operations import normalize_tfidf # noqa
words_mat = normalize_tfidf(words_mat)
words_mat.shape
from cartodata.loading import load_identity_column # noqa
articles_mat, articles_scores = load_identity_column(df, 'Title')
articles_scores.head()
Design Patterns for Situated Visualization in Augmented Reality 1.0
ggdist: Visualizations of Distributions and Uncertainty in the Grammar of Graphics 1.0
PromptMagician: Interactive Prompt Engineering for Text-to-Image Creation 1.0
Challenges and Opportunities in Data Visualization Education: A Call to Action 1.0
Affective Visualization Design: Leveraging the Emotional Impact of Data 1.0
dtype: float64
Dimension reduction/Embeddings
One way to view the matrices that we created is as coordinates in the space of all articles. We want to reduce the dimension of this space to make it easier to work with and to visualize.
Validation
We compute a score that measures, on average, how many of the 10 nearest neighbors of an article are by the same author as the article. For each author, this gives a number between 1 (100%) and 0.1 (none of the neighboring articles are by the same author, except the initial article itself).
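One way to read this score is sketched below in plain Python. It only illustrates the idea and is not the Neighbors scorer from cartodata: `neighbor_author_score` is a hypothetical helper, Euclidean distance is an assumption, and the toy coordinates and author sets are invented for the example.

```python
import math


def neighbor_author_score(points, authors_of, author, k):
    """Average share of the k nearest articles (the article itself included)
    that are also written by `author`, over all articles of `author`."""
    own = [i for i, names in enumerate(authors_of) if author in names]
    shares = []
    for i in own:
        # Rank every article by distance to article i (i is at distance 0).
        order = sorted(range(len(points)),
                       key=lambda j: math.dist(points[i], points[j]))
        nearest = order[:k]
        shares.append(sum(author in authors_of[j] for j in nearest) / k)
    return sum(shares) / len(shares)


# Hypothetical toy embedding: 4 articles in 2 dimensions.
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
authors_of = [{"A"}, {"A"}, {"B"}, {"A", "B"}]

score_a = neighbor_author_score(points, authors_of, "A", k=2)
score_b = neighbor_author_score(points, authors_of, "B", k=2)
print(round(score_a, 4), score_b)
```

Because each article is counted among its own neighbors, the lowest value this sketch can return is 1/k, which corresponds to 0.1 for k = 10.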
LSA projection
We’ll start by using the LSA (Latent Semantic Analysis) technique to identify keywords in our data and thus reduce the number of rows in our matrices. The lsa_projection function takes three arguments:
the number of dimensions you want to keep
the matrix of documents/words frequency
a list of matrices to project
It returns a list of the same length containing the matrices projected in the latent space.
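The core of an LSA projection can be sketched with a plain NumPy SVD. This is not the lsa_projection implementation from cartodata: `lsa_project` is a hypothetical helper, it only projects the documents themselves (how cartodata projects the other entity matrices is not covered here), and the (dimensions by documents) output orientation is chosen to match the shapes printed in this example.

```python
import numpy as np


def lsa_project(doc_term, num_dims):
    """Project the rows (documents) of doc_term onto num_dims latent axes."""
    # Thin SVD: doc_term = U @ diag(S) @ Vt, singular values in descending order.
    U, S, Vt = np.linalg.svd(doc_term, full_matrices=False)
    # Keep the num_dims strongest components; one row per latent dimension.
    return (U[:, :num_dims] * S[:num_dims]).T


# Hypothetical toy document/term frequency matrix: 3 documents, 4 terms.
doc_term = np.array([[2.0, 1.0, 0.0, 0.0],
                     [1.0, 2.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 3.0]])
latent = lsa_project(doc_term, num_dims=2)
print(latent.shape)  # (2, 3): 2 latent dimensions, 3 documents
```

The signs of the latent coordinates are not unique in an SVD, so only distances and angles between the projected columns are meaningful.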
We also apply an l2 normalization to each feature of the projected matrices.
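For the normalization step, a minimal sketch of L2 scaling is given below. It is not the normalize_l2 function from cartodata: `l2_normalize_columns` is a hypothetical helper, and treating each column of a (dimensions by items) matrix as one vector is an assumption based on the shapes printed in this example.

```python
import math


def l2_normalize_columns(matrix):
    """Scale each column of a row-major matrix to unit Euclidean length."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    norms = [math.sqrt(sum(matrix[r][c] ** 2 for r in range(n_rows)))
             for c in range(n_cols)]
    # All-zero columns stay at zero to avoid dividing by zero.
    return [[matrix[r][c] / norms[c] if norms[c] else 0.0
             for c in range(n_cols)]
            for r in range(n_rows)]


# Hypothetical projected matrix: 2 latent dimensions, 3 items.
projected = [[3.0, 0.0, 1.0],
             [4.0, 0.0, 0.0]]
unit = l2_normalize_columns(projected)
print(unit)  # [[0.6, 0.0, 1.0], [0.8, 0.0, 0.0]]
```

After this scaling, the dot product of two item vectors equals their cosine similarity, which makes nearest-neighbor comparisons independent of vector length.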
from cartodata.projection import lsa_projection # noqa
from cartodata.operations import normalize_l2 # noqa
lsa_matrices = lsa_projection(num_dims, words_mat, [articles_mat, authors_mat, words_mat])
lsa_matrices = list(map(normalize_l2, lsa_matrices))
We’ve reduced the number of rows in each of articles_mat, authors_mat and words_mat to just 300, the value of num_dims.
print(f"articles_mat: {lsa_matrices[0].shape}")
print(f"authors_mat: {lsa_matrices[1].shape}")
print(f"words_mat: {lsa_matrices[2].shape}")
from cartodata.model_selection.scoring import Neighbors # noqa
NATURE = "articles"
SOURCE = "authors"
lsa_score = Neighbors.evaluate(
NATURE, SOURCE, authors_mat, authors_scores, dir_xD=".",
scores_nature=articles_scores, matrix_nature_xD=lsa_matrices[0],
min_score=filt_min_score, n_neighbors=n_neighbors, recompute=True
)
lsa_score.print()
articles_mat: (300, 3753)
authors_mat: (300, 720)
words_mat: (300, 1981)
================================
Run params : {}
--------------------------------
Scoring params : {}
================================
----------- Scores -----------
neighbors_articles_authors : 0.1109
--------- Desc Scores ---------
neighbors_articles_authors_det
0 Davide Ceneda : 0.40
1 Yi-Jen Chiang : 0.35
2 Mark A. Whiting : 0.33
3 Mario Jelovic : 0.30
4 David Gotz : 0.29
...
715 Enrico Gobbetti : 0.02
716 Donna L. Gresh : 0.02
717 Steven F. Roth : 0.01
718 Bruno Lévy 0001 : 0.00
719 Jinzhu Gao : 0.00
Length: 720, dtype: object
VALUE : 0.1109
--------- Raw Scores ---------
neighbors_articles_authors
Michael Sedlmair 0.117391
Dieter Schmalstieg 0.111111
Matthew Kay 0001 0.160000
Xingbo Wang 0001 0.116667
Minfeng Zhu 0.114286
...
Lisa M. Sobierajski 0.075000
Nahum D. Gershon 0.060000
Sidney W. Wang 0.025000
David A. Lane 0.066667
T. Todd Elvins 0.050000
Length: 720, dtype: float64
LDA projection
from cartodata.projection import lda_projection # noqa
lda_matrices = lda_projection(num_dims, 1, [articles_mat, authors_mat, words_mat])
lda_matrices = list(map(normalize_l2, lda_matrices))
print(f"articles_mat: {lda_matrices[0].shape}")
print(f"authors_mat: {lda_matrices[1].shape}")
print(f"words_mat: {lda_matrices[2].shape}")
lda_score = Neighbors.evaluate(
NATURE, SOURCE, authors_mat, authors_scores, dir_xD=".",
scores_nature=articles_scores, matrix_nature_xD=lda_matrices[0],
min_score=filt_min_score, n_neighbors=n_neighbors, recompute=True
)
lda_score.print()
articles_mat: (300, 3753)
authors_mat: (300, 720)
words_mat: (300, 1981)
================================
Run params : {}
--------------------------------
Scoring params : {}
================================
----------- Scores -----------
neighbors_articles_authors : 0.0013
--------- Desc Scores ---------
neighbors_articles_authors_det
0 Hanspeter Pfister : 0.10
1 Johanna Schmidt : 0.10
2 Huamin Qu : 0.10
3 Jeff W. Lichtman : 0.10
4 Andrew J. Hanson : 0.10
...
715 Rainer Splechtna : 0.00
716 Denis Gracanin : 0.00
717 Helwig Hauser : 0.00
718 Kresimir Matkovic : 0.00
719 T. Todd Elvins : 0.00
Length: 720, dtype: object
VALUE : 0.0013
--------- Raw Scores ---------
neighbors_articles_authors
Michael Sedlmair 0.0
Dieter Schmalstieg 0.0
Matthew Kay 0001 0.0
Xingbo Wang 0001 0.0
Minfeng Zhu 0.0
...
Lisa M. Sobierajski 0.0
Nahum D. Gershon 0.0
Sidney W. Wang 0.0
David A. Lane 0.0
T. Todd Elvins 0.0
Length: 720, dtype: float64
Doc2Vec projection
from cartodata.projection import doc2vec_projection # noqa
doc2vec_matrices = doc2vec_projection(num_dims, 1, [articles_mat, authors_mat, words_mat], df['text'])
doc2vec_matrices = list(map(normalize_l2, doc2vec_matrices))
print(f"articles_mat: {doc2vec_matrices[0].shape}")
print(f"authors_mat: {doc2vec_matrices[1].shape}")
print(f"words_mat: {doc2vec_matrices[2].shape}")
doc2vec_score = Neighbors.evaluate(
NATURE, SOURCE, authors_mat, authors_scores, dir_xD=".",
scores_nature=articles_scores, matrix_nature_xD=doc2vec_matrices[0],
min_score=filt_min_score, n_neighbors=n_neighbors, recompute=True
)
doc2vec_score.print()
articles_mat: (300, 3753)
authors_mat: (300, 720)
words_mat: (300, 1981)
================================
Run params : {}
--------------------------------
Scoring params : {}
================================
----------- Scores -----------
neighbors_articles_authors : 0.1200
--------- Desc Scores ---------
neighbors_articles_authors_det
0 Steven Franconeri : 0.35
1 Cindy Xiong : 0.32
2 Mark A. Whiting : 0.30
3 Jean Scholtz : 0.30
4 Laurent Lessard : 0.25
...
715 Yarden Livnat : 0.10
716 Hongan Wang : 0.10
717 Matthew Cooper 0001 : 0.10
718 Jimmy Johansson 0001 : 0.10
719 T. Todd Elvins : 0.10
Length: 720, dtype: object
VALUE : 0.1200
--------- Raw Scores ---------
neighbors_articles_authors
Michael Sedlmair 0.126087
Dieter Schmalstieg 0.100000
Matthew Kay 0001 0.110000
Xingbo Wang 0001 0.150000
Minfeng Zhu 0.114286
...
Lisa M. Sobierajski 0.125000
Nahum D. Gershon 0.100000
Sidney W. Wang 0.100000
David A. Lane 0.166667
T. Todd Elvins 0.100000
Length: 720, dtype: float64
Specter2 projection
from cartodata.projection import bert_projection # noqa
specter2_matrices = bert_projection([articles_mat, authors_mat, words_mat], df['text'])
specter2_matrices = list(map(normalize_l2, specter2_matrices))
print(f"articles_mat: {specter2_matrices[0].shape}")
print(f"authors_mat: {specter2_matrices[1].shape}")
print(f"words_mat: {specter2_matrices[2].shape}")
specter2_score = Neighbors.evaluate(
NATURE, SOURCE, authors_mat, authors_scores, dir_xD=".",
scores_nature=articles_scores, matrix_nature_xD=specter2_matrices[0],
min_score=filt_min_score, n_neighbors=n_neighbors, recompute=True
)
specter2_score.print()
Using torch device: cpu
Processing batches: 87%|████████▋ | 326/376 [29:48<04:13, 5.07s/it]
Processing batches: 87%|████████▋ | 327/376 [29:54<04:17, 5.26s/it]
Processing batches: 87%|████████▋ | 328/376 [29:59<04:17, 5.35s/it]
Processing batches: 88%|████████▊ | 329/376 [30:05<04:17, 5.47s/it]
Processing batches: 88%|████████▊ | 330/376 [30:11<04:13, 5.51s/it]
Processing batches: 88%|████████▊ | 331/376 [30:14<03:39, 4.87s/it]
Processing batches: 88%|████████▊ | 332/376 [30:20<03:47, 5.18s/it]
Processing batches: 89%|████████▊ | 333/376 [30:23<03:18, 4.62s/it]
Processing batches: 89%|████████▉ | 334/376 [30:29<03:29, 4.98s/it]
Processing batches: 89%|████████▉ | 335/376 [30:35<03:32, 5.17s/it]
Processing batches: 89%|████████▉ | 336/376 [30:40<03:31, 5.29s/it]
Processing batches: 90%|████████▉ | 337/376 [30:46<03:29, 5.38s/it]
Processing batches: 90%|████████▉ | 338/376 [30:50<03:16, 5.17s/it]
Processing batches: 90%|█████████ | 339/376 [30:54<02:56, 4.78s/it]
Processing batches: 90%|█████████ | 340/376 [30:59<02:45, 4.60s/it]
Processing batches: 91%|█████████ | 341/376 [31:04<02:52, 4.92s/it]
Processing batches: 91%|█████████ | 342/376 [31:10<02:54, 5.12s/it]
Processing batches: 91%|█████████ | 343/376 [31:15<02:52, 5.24s/it]
Processing batches: 91%|█████████▏| 344/376 [31:19<02:34, 4.83s/it]
Processing batches: 92%|█████████▏| 345/376 [31:24<02:28, 4.78s/it]
Processing batches: 92%|█████████▏| 346/376 [31:29<02:28, 4.93s/it]
Processing batches: 92%|█████████▏| 347/376 [31:33<02:14, 4.62s/it]
Processing batches: 93%|█████████▎| 348/376 [31:38<02:10, 4.66s/it]
Processing batches: 93%|█████████▎| 349/376 [31:43<02:14, 4.98s/it]
Processing batches: 93%|█████████▎| 350/376 [31:49<02:14, 5.16s/it]
Processing batches: 93%|█████████▎| 351/376 [31:55<02:13, 5.32s/it]
Processing batches: 94%|█████████▎| 352/376 [32:01<02:11, 5.50s/it]
Processing batches: 94%|█████████▍| 353/376 [32:05<01:58, 5.16s/it]
Processing batches: 94%|█████████▍| 354/376 [32:11<01:55, 5.25s/it]
Processing batches: 94%|█████████▍| 355/376 [32:15<01:42, 4.89s/it]
Processing batches: 95%|█████████▍| 356/376 [32:20<01:40, 5.01s/it]
Processing batches: 95%|█████████▍| 357/376 [32:24<01:31, 4.82s/it]
Processing batches: 95%|█████████▌| 358/376 [32:30<01:33, 5.18s/it]
Processing batches: 95%|█████████▌| 359/376 [32:34<01:20, 4.74s/it]
Processing batches: 96%|█████████▌| 360/376 [32:37<01:09, 4.37s/it]
Processing batches: 96%|█████████▌| 361/376 [32:42<01:04, 4.28s/it]
Processing batches: 96%|█████████▋| 362/376 [32:46<01:00, 4.32s/it]
Processing batches: 97%|█████████▋| 363/376 [32:49<00:51, 3.92s/it]
Processing batches: 97%|█████████▋| 364/376 [32:54<00:51, 4.25s/it]
Processing batches: 97%|█████████▋| 365/376 [32:58<00:47, 4.31s/it]
Processing batches: 97%|█████████▋| 366/376 [33:02<00:41, 4.16s/it]
Processing batches: 98%|█████████▊| 367/376 [33:06<00:35, 3.99s/it]
Processing batches: 98%|█████████▊| 368/376 [33:09<00:30, 3.85s/it]
Processing batches: 98%|█████████▊| 369/376 [33:13<00:26, 3.75s/it]
Processing batches: 98%|█████████▊| 370/376 [33:16<00:22, 3.70s/it]
Processing batches: 99%|█████████▊| 371/376 [33:20<00:18, 3.78s/it]
Processing batches: 99%|█████████▉| 372/376 [33:25<00:16, 4.01s/it]
Processing batches: 99%|█████████▉| 373/376 [33:29<00:11, 3.88s/it]
Processing batches: 99%|█████████▉| 374/376 [33:32<00:07, 3.88s/it]
Processing batches: 100%|█████████▉| 375/376 [33:36<00:03, 3.78s/it]
Processing batches: 100%|██████████| 376/376 [33:37<00:00, 3.04s/it]
Processing batches: 100%|██████████| 376/376 [33:37<00:00, 5.37s/it]
articles_mat: (768, 3753)
authors_mat: (768, 720)
words_mat: (768, 1981)
================================
Run params : {}
--------------------------------
Scoring params : {}
================================
----------- Scores -----------
neighbors_articles_authors : 0.1672
--------- Desc Scores ---------
neighbors_articles_authors_det
0 Natalia V. Andrienko : 0.51
1 Gennady L. Andrienko : 0.49
2 Bernhard Preim : 0.46
3 Attila Gyulassy : 0.45
4 Martin Kraus 0001 : 0.40
...
715 Markus Rütten : 0.10
716 Jiawan Zhang : 0.10
717 Ayan Biswas : 0.10
718 Tera Marie Green : 0.10
719 T. Todd Elvins : 0.10
Length: 720, dtype: object
VALUE : 0.1672
--------- Raw Scores ---------
neighbors_articles_authors
Michael Sedlmair 0.147826
Dieter Schmalstieg 0.111111
Matthew Kay 0001 0.310000
Xingbo Wang 0001 0.133333
Minfeng Zhu 0.100000
...
Lisa M. Sobierajski 0.150000
Nahum D. Gershon 0.140000
Sidney W. Wang 0.100000
David A. Lane 0.300000
T. Todd Elvins 0.100000
Length: 720, dtype: float64
Scincl projection
scincl_matrices = bert_projection([articles_mat, authors_mat, words_mat], df['text'], family="scincl")
""
scincl_matrices = list(map(normalize_l2, scincl_matrices))
""
print(f"articles_mat: {scincl_matrices[0].shape}")
print(f"authors_mat: {scincl_matrices[1].shape}")
print(f"words_mat: {scincl_matrices[2].shape}")
""
scincl_score = Neighbors.evaluate(
    NATURE, SOURCE, authors_mat, authors_scores, dir_xD=".",
    scores_nature=articles_scores, matrix_nature_xD=scincl_matrices[0],
    min_score=filt_min_score, n_neighbors=n_neighbors, recompute=True
)
scincl_score.print()
Using torch device: cpu
Processing batches: 100%|██████████| 376/376 [32:58<00:00, 5.26s/it]
articles_mat: (768, 3753)
authors_mat: (768, 720)
words_mat: (768, 1981)
================================
Run params : {}
--------------------------------
Scoring params : {}
================================
----------- Scores -----------
neighbors_articles_authors : 0.1590
--------- Desc Scores ---------
neighbors_articles_authors_det
0 Natalia V. Andrienko : 0.52
1 Gennady L. Andrienko : 0.49
2 Attila Gyulassy : 0.42
3 Karen B. Schloss : 0.41
4 Aditi Majumder : 0.41
...
715 Andrew Vande Moere : 0.10
716 Peter Bak : 0.10
717 Mark W. Jones : 0.10
718 Ji Soo Yi : 0.10
719 Morteza Karimzadeh : 0.10
Length: 720, dtype: object
VALUE : 0.1590
--------- Raw Scores ---------
neighbors_articles_authors
Michael Sedlmair 0.126087
Dieter Schmalstieg 0.100000
Matthew Kay 0001 0.270000
Xingbo Wang 0001 0.133333
Minfeng Zhu 0.100000
...
Lisa M. Sobierajski 0.175000
Nahum D. Gershon 0.140000
Sidney W. Wang 0.100000
David A. Lane 0.216667
T. Todd Elvins 0.150000
Length: 720, dtype: float64
“all-MiniLM-L6-v2” projection
# Project the three matrices using the "all-MiniLM-L6-v2" model.
minilm_matrices = bert_projection([articles_mat, authors_mat, words_mat], df['text'], family="all-MiniLM-L6-v2")

# L2-normalize each projected matrix.
minilm_matrices = list(map(normalize_l2, minilm_matrices))

# Print the shape of each projected matrix.
print(f"articles_mat: {minilm_matrices[0].shape}")
print(f"authors_mat: {minilm_matrices[1].shape}")
print(f"words_mat: {minilm_matrices[2].shape}")
Processing batches: 100%|██████████| 376/376 [12:28<00:00, 1.99s/it]
articles_mat: (384, 3753)
authors_mat: (384, 720)
words_mat: (384, 1981)
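The `normalize_l2` helper applied above is defined elsewhere in this example, so its exact behavior is not shown here. As a rough illustration only, and assuming it rescales each column to unit Euclidean norm (the printed shapes suggest one column per item and one row per embedding dimension), a NumPy sketch could look like the following. The function name `l2_normalize_columns` and the `eps` guard are hypothetical and not part of the Cartolabe API.

```python
import numpy as np


def l2_normalize_columns(matrix, eps=1e-12):
    """Scale every column of `matrix` to unit Euclidean (L2) norm.

    Hypothetical stand-in for `normalize_l2`; the column-wise axis is an
    assumption based on the (dimensions, items) shapes printed above.
    """
    matrix = np.asarray(matrix, dtype=float)
    # One norm per column; keepdims lets the division broadcast over rows.
    norms = np.linalg.norm(matrix, axis=0, keepdims=True)
    # eps avoids dividing by zero for all-zero columns.
    return matrix / np.maximum(norms, eps)


# Toy data with the same number of rows as the MiniLM projection.
rng = np.random.default_rng(0)
toy = rng.normal(size=(384, 5))
unit = l2_normalize_columns(toy)
column_norms = np.linalg.norm(unit, axis=0)
print(unit.shape, column_norms.round(6))
```

With unit-norm columns, the dot product of two columns equals their cosine similarity, which is one common reason to normalize before a nearest-neighbor comparison.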
minilm_score = Neighbors.evaluate(
    NATURE, SOURCE, authors_mat, authors_scores, dir_xD=".",
    scores_nature=articles_scores, matrix_nature_xD=minilm_matrices[0],
    min_score=filt_min_score, n_neighbors=n_neighbors, recompute=True
)
minilm_score.print()
================================
Run params : {}
--------------------------------
Scoring params : {}
================================
----------- Scores -----------
neighbors_articles_authors : 0.1653
--------- Desc Scores ---------
neighbors_articles_authors_det
0 Natalia V. Andrienko : 0.45
1 Gennady L. Andrienko : 0.43
2 Mario Jelovic : 0.42
3 Aditi Majumder : 0.41
4 Bernhard Preim : 0.41
...
715 Chris Muelder : 0.10
716 Harald Obermaier : 0.10
717 Teng-Yok Lee : 0.10
718 Yixuan Zhang 0001 : 0.10
719 Jörn Kohlhammer : 0.10
Length: 720, dtype: object
VALUE : 0.1653
--------- Raw Scores ---------
neighbors_articles_authors
Michael Sedlmair 0.139130
Dieter Schmalstieg 0.144444
Matthew Kay 0001 0.290000
Xingbo Wang 0001 0.133333
Minfeng Zhu 0.100000
...
Lisa M. Sobierajski 0.175000
Nahum D. Gershon 0.140000
Sidney W. Wang 0.100000
David A. Lane 0.216667
T. Todd Elvins 0.150000
Length: 720, dtype: float64
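To make the two printouts easier to compare, the per-author raw scores can be lined up side by side. The sketch below uses only the five leading rows shown in the "Raw Scores" listings of the preceding 768-dimensional projection and of the MiniLM projection; the variable names are illustrative, and in the real example the full 720-entry series would be used instead.

```python
# Per-author raw scores copied from the first five rows of the two
# "Raw Scores" printouts (768-dimensional projection, then MiniLM).
scores_768 = {
    "Michael Sedlmair": 0.126087,
    "Dieter Schmalstieg": 0.100000,
    "Matthew Kay 0001": 0.270000,
    "Xingbo Wang 0001": 0.133333,
    "Minfeng Zhu": 0.100000,
}
scores_minilm = {
    "Michael Sedlmair": 0.139130,
    "Dieter Schmalstieg": 0.144444,
    "Matthew Kay 0001": 0.290000,
    "Xingbo Wang 0001": 0.133333,
    "Minfeng Zhu": 0.100000,
}

# Difference per author (MiniLM minus the 768-dimensional projection).
delta = {
    author: round(scores_minilm[author] - scores_768[author], 6)
    for author in scores_768
}

# Authors whose score went up, largest gain first.
improved = sorted(
    (author for author in delta if delta[author] > 0),
    key=delta.get,
    reverse=True,
)

for author in improved:
    print(f"{author}: {delta[author]:+.6f}")
```

Among these five authors, three score higher with MiniLM and two are unchanged; this small sample says nothing about the remaining 715 authors.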
“all-mpnet-base-v2” projection
# Project the three matrices using the "all-mpnet-base-v2" model.
mpnet_matrices = bert_projection([articles_mat, authors_mat, words_mat], df['text'], family="all-mpnet-base-v2")

# L2-normalize each projected matrix.
mpnet_matrices = list(map(normalize_l2, mpnet_matrices))

# Print the shape of each projected matrix.
print(f"articles_mat: {mpnet_matrices[0].shape}")
print(f"authors_mat: {mpnet_matrices[1].shape}")
print(f"words_mat: {mpnet_matrices[2].shape}")

# Score the projected articles with the same neighbors evaluation and settings.
mpnet_score = Neighbors.evaluate(
    NATURE, SOURCE, authors_mat, authors_scores, dir_xD=".",
    scores_nature=articles_scores, matrix_nature_xD=mpnet_matrices[0],
    min_score=filt_min_score, n_neighbors=n_neighbors, recompute=True
)
mpnet_score.print()
Processing batches: 33%|███▎ | 123/376 [21:04<38:29, 9.13s/it]
Processing batches: 33%|███▎ | 124/376 [21:17<43:48, 10.43s/it]
Processing batches: 33%|███▎ | 125/376 [21:25<40:01, 9.57s/it]
Processing batches: 34%|███▎ | 126/376 [21:34<39:39, 9.52s/it]
Processing batches: 34%|███▍ | 127/376 [21:42<37:26, 9.02s/it]
Processing batches: 34%|███▍ | 128/376 [21:50<35:29, 8.59s/it]
Processing batches: 34%|███▍ | 129/376 [22:01<39:19, 9.55s/it]
Processing batches: 35%|███▍ | 130/376 [22:13<42:05, 10.26s/it]
Processing batches: 35%|███▍ | 131/376 [22:26<44:20, 10.86s/it]
Processing batches: 35%|███▌ | 132/376 [22:36<44:01, 10.83s/it]
Processing batches: 35%|███▌ | 133/376 [22:48<44:26, 10.97s/it]
Processing batches: 36%|███▌ | 134/376 [22:56<41:07, 10.19s/it]
Processing batches: 36%|███▌ | 135/376 [23:06<40:48, 10.16s/it]
Processing batches: 36%|███▌ | 136/376 [23:18<42:41, 10.67s/it]
Processing batches: 36%|███▋ | 137/376 [23:29<42:39, 10.71s/it]
Processing batches: 37%|███▋ | 138/376 [23:39<42:15, 10.66s/it]
Processing batches: 37%|███▋ | 139/376 [23:47<38:53, 9.85s/it]
Processing batches: 37%|███▋ | 140/376 [23:56<37:46, 9.61s/it]
Processing batches: 38%|███▊ | 141/376 [24:04<35:47, 9.14s/it]
Processing batches: 38%|███▊ | 142/376 [24:13<35:30, 9.10s/it]
Processing batches: 38%|███▊ | 143/376 [24:22<35:12, 9.07s/it]
Processing batches: 38%|███▊ | 144/376 [24:31<35:09, 9.09s/it]
Processing batches: 39%|███▊ | 145/376 [24:42<37:01, 9.62s/it]
Processing batches: 39%|███▉ | 146/376 [24:54<39:34, 10.32s/it]
Processing batches: 39%|███▉ | 147/376 [25:03<37:51, 9.92s/it]
Processing batches: 39%|███▉ | 148/376 [25:14<38:27, 10.12s/it]
Processing batches: 40%|███▉ | 149/376 [25:21<35:30, 9.39s/it]
Processing batches: 40%|███▉ | 150/376 [25:31<35:52, 9.52s/it]
Processing batches: 40%|████ | 151/376 [25:38<32:31, 8.67s/it]
Processing batches: 40%|████ | 152/376 [25:48<33:54, 9.08s/it]
Processing batches: 41%|████ | 153/376 [25:55<30:55, 8.32s/it]
Processing batches: 41%|████ | 154/376 [26:02<29:40, 8.02s/it]
Processing batches: 41%|████ | 155/376 [26:08<27:38, 7.51s/it]
Processing batches: 41%|████▏ | 156/376 [26:17<29:14, 7.98s/it]
Processing batches: 42%|████▏ | 157/376 [26:28<32:36, 8.93s/it]
Processing batches: 42%|████▏ | 158/376 [26:37<31:30, 8.67s/it]
Processing batches: 42%|████▏ | 159/376 [26:44<30:11, 8.35s/it]
Processing batches: 43%|████▎ | 160/376 [26:52<29:01, 8.06s/it]
Processing batches: 43%|████▎ | 161/376 [26:58<27:41, 7.73s/it]
Processing batches: 43%|████▎ | 162/376 [27:06<27:25, 7.69s/it]
Processing batches: 43%|████▎ | 163/376 [27:13<26:02, 7.33s/it]
Processing batches: 44%|████▎ | 164/376 [27:21<26:43, 7.57s/it]
Processing batches: 44%|████▍ | 165/376 [27:29<27:34, 7.84s/it]
Processing batches: 44%|████▍ | 166/376 [27:40<30:45, 8.79s/it]
Processing batches: 44%|████▍ | 167/376 [27:49<30:54, 8.87s/it]
Processing batches: 45%|████▍ | 168/376 [28:00<32:21, 9.33s/it]
Processing batches: 45%|████▍ | 169/376 [28:07<30:26, 8.82s/it]
Processing batches: 45%|████▌ | 170/376 [28:17<30:57, 9.02s/it]
Processing batches: 45%|████▌ | 171/376 [28:25<30:26, 8.91s/it]
Processing batches: 46%|████▌ | 172/376 [28:33<29:15, 8.61s/it]
Processing batches: 46%|████▌ | 173/376 [28:40<26:47, 7.92s/it]
Processing batches: 46%|████▋ | 174/376 [28:52<30:56, 9.19s/it]
Processing batches: 47%|████▋ | 175/376 [29:00<29:52, 8.92s/it]
Processing batches: 47%|████▋ | 176/376 [29:04<24:25, 7.33s/it]
Processing batches: 47%|████▋ | 177/376 [29:18<31:17, 9.44s/it]
Processing batches: 47%|████▋ | 178/376 [29:28<31:54, 9.67s/it]
Processing batches: 48%|████▊ | 179/376 [29:36<30:14, 9.21s/it]
Processing batches: 48%|████▊ | 180/376 [29:44<28:55, 8.86s/it]
Processing batches: 48%|████▊ | 181/376 [29:54<29:36, 9.11s/it]
Processing batches: 48%|████▊ | 182/376 [30:02<28:42, 8.88s/it]
Processing batches: 49%|████▊ | 183/376 [30:10<27:30, 8.55s/it]
Processing batches: 49%|████▉ | 184/376 [30:23<31:44, 9.92s/it]
Processing batches: 49%|████▉ | 185/376 [30:33<30:53, 9.71s/it]
Processing batches: 49%|████▉ | 186/376 [30:40<28:38, 9.04s/it]
Processing batches: 50%|████▉ | 187/376 [30:47<26:18, 8.35s/it]
Processing batches: 50%|█████ | 188/376 [30:54<24:46, 7.91s/it]
Processing batches: 50%|█████ | 189/376 [31:00<22:56, 7.36s/it]
Processing batches: 51%|█████ | 190/376 [31:08<23:19, 7.52s/it]
Processing batches: 51%|█████ | 191/376 [31:18<25:22, 8.23s/it]
Processing batches: 51%|█████ | 192/376 [31:31<29:36, 9.65s/it]
Processing batches: 51%|█████▏ | 193/376 [31:39<28:22, 9.31s/it]
Processing batches: 52%|█████▏ | 194/376 [31:49<28:32, 9.41s/it]
Processing batches: 52%|█████▏ | 195/376 [32:00<30:23, 10.07s/it]
Processing batches: 52%|█████▏ | 196/376 [32:16<35:35, 11.86s/it]
Processing batches: 52%|█████▏ | 197/376 [32:26<33:44, 11.31s/it]
Processing batches: 53%|█████▎ | 198/376 [32:34<30:17, 10.21s/it]
Processing batches: 53%|█████▎ | 199/376 [32:43<29:29, 10.00s/it]
Processing batches: 53%|█████▎ | 200/376 [32:50<26:17, 8.96s/it]
Processing batches: 53%|█████▎ | 201/376 [32:56<23:16, 7.98s/it]
Processing batches: 54%|█████▎ | 202/376 [33:01<20:26, 7.05s/it]
Processing batches: 54%|█████▍ | 203/376 [33:12<24:24, 8.47s/it]
Processing batches: 54%|█████▍ | 204/376 [33:24<26:58, 9.41s/it]
Processing batches: 55%|█████▍ | 205/376 [33:32<25:52, 9.08s/it]
Processing batches: 55%|█████▍ | 206/376 [33:42<26:24, 9.32s/it]
Processing batches: 55%|█████▌ | 207/376 [33:51<26:04, 9.26s/it]
Processing batches: 55%|█████▌ | 208/376 [34:05<29:31, 10.55s/it]
Processing batches: 56%|█████▌ | 209/376 [34:16<29:50, 10.72s/it]
Processing batches: 56%|█████▌ | 210/376 [34:31<33:28, 12.10s/it]
Processing batches: 56%|█████▌ | 211/376 [34:41<31:40, 11.52s/it]
Processing batches: 56%|█████▋ | 212/376 [34:50<28:57, 10.60s/it]
Processing batches: 57%|█████▋ | 213/376 [35:01<28:53, 10.63s/it]
Processing batches: 57%|█████▋ | 214/376 [35:10<27:20, 10.12s/it]
Processing batches: 57%|█████▋ | 215/376 [35:14<22:50, 8.51s/it]
Processing batches: 57%|█████▋ | 216/376 [35:19<19:51, 7.45s/it]
Processing batches: 58%|█████▊ | 217/376 [35:28<20:31, 7.75s/it]
Processing batches: 58%|█████▊ | 218/376 [35:37<21:17, 8.08s/it]
Processing batches: 58%|█████▊ | 219/376 [35:46<22:18, 8.53s/it]
Processing batches: 59%|█████▊ | 220/376 [35:55<22:43, 8.74s/it]
Processing batches: 59%|█████▉ | 221/376 [36:03<22:02, 8.53s/it]
Processing batches: 59%|█████▉ | 222/376 [36:14<23:10, 9.03s/it]
Processing batches: 59%|█████▉ | 223/376 [36:24<23:48, 9.34s/it]
Processing batches: 60%|█████▉ | 224/376 [36:31<22:10, 8.75s/it]
Processing batches: 60%|█████▉ | 225/376 [36:39<21:06, 8.39s/it]
Processing batches: 60%|██████ | 226/376 [36:47<20:46, 8.31s/it]
Processing batches: 60%|██████ | 227/376 [36:55<20:47, 8.37s/it]
Processing batches: 61%|██████ | 228/376 [37:03<20:18, 8.23s/it]
Processing batches: 61%|██████ | 229/376 [37:11<19:33, 7.98s/it]
Processing batches: 61%|██████ | 230/376 [37:20<20:35, 8.46s/it]
Processing batches: 61%|██████▏ | 231/376 [37:27<19:10, 7.94s/it]
Processing batches: 62%|██████▏ | 232/376 [37:36<19:40, 8.19s/it]
Processing batches: 62%|██████▏ | 233/376 [37:43<19:11, 8.05s/it]
Processing batches: 62%|██████▏ | 234/376 [37:48<16:42, 7.06s/it]
Processing batches: 62%|██████▎ | 235/376 [37:56<17:08, 7.30s/it]
Processing batches: 63%|██████▎ | 236/376 [38:02<16:11, 6.94s/it]
Processing batches: 63%|██████▎ | 237/376 [38:11<17:23, 7.51s/it]
Processing batches: 63%|██████▎ | 238/376 [38:21<18:46, 8.16s/it]
Processing batches: 64%|██████▎ | 239/376 [38:29<19:05, 8.36s/it]
Processing batches: 64%|██████▍ | 240/376 [38:38<19:06, 8.43s/it]
Processing batches: 64%|██████▍ | 241/376 [38:46<18:36, 8.27s/it]
Processing batches: 64%|██████▍ | 242/376 [38:59<21:25, 9.59s/it]
Processing batches: 65%|██████▍ | 243/376 [39:08<20:52, 9.42s/it]
Processing batches: 65%|██████▍ | 244/376 [39:15<19:41, 8.95s/it]
Processing batches: 65%|██████▌ | 245/376 [39:31<23:46, 10.89s/it]
Processing batches: 65%|██████▌ | 246/376 [39:37<20:49, 9.61s/it]
Processing batches: 66%|██████▌ | 247/376 [39:45<19:09, 8.91s/it]
Processing batches: 66%|██████▌ | 248/376 [39:50<16:51, 7.90s/it]
Processing batches: 66%|██████▌ | 249/376 [39:54<14:02, 6.63s/it]
Processing batches: 66%|██████▋ | 250/376 [39:59<12:39, 6.03s/it]
Processing batches: 67%|██████▋ | 251/376 [40:03<11:32, 5.54s/it]
Processing batches: 67%|██████▋ | 252/376 [40:09<11:45, 5.69s/it]
Processing batches: 67%|██████▋ | 253/376 [40:16<12:23, 6.04s/it]
Processing batches: 68%|██████▊ | 254/376 [40:24<13:14, 6.51s/it]
Processing batches: 68%|██████▊ | 255/376 [40:32<14:21, 7.12s/it]
Processing batches: 68%|██████▊ | 256/376 [40:40<14:46, 7.39s/it]
Processing batches: 68%|██████▊ | 257/376 [40:46<13:57, 7.04s/it]
Processing batches: 69%|██████▊ | 258/376 [40:55<14:47, 7.52s/it]
Processing batches: 69%|██████▉ | 259/376 [41:04<15:37, 8.02s/it]
Processing batches: 69%|██████▉ | 260/376 [41:15<17:12, 8.90s/it]
Processing batches: 69%|██████▉ | 261/376 [41:21<15:09, 7.91s/it]
Processing batches: 70%|██████▉ | 262/376 [41:28<14:42, 7.74s/it]
Processing batches: 70%|██████▉ | 263/376 [41:36<14:39, 7.79s/it]
Processing batches: 70%|███████ | 264/376 [41:46<15:44, 8.43s/it]
Processing batches: 70%|███████ | 265/376 [41:55<16:14, 8.78s/it]
Processing batches: 71%|███████ | 266/376 [42:04<16:05, 8.78s/it]
Processing batches: 71%|███████ | 267/376 [42:12<15:17, 8.42s/it]
Processing batches: 71%|███████▏ | 268/376 [42:18<14:00, 7.78s/it]
Processing batches: 72%|███████▏ | 269/376 [42:26<13:45, 7.72s/it]
Processing batches: 72%|███████▏ | 270/376 [42:32<13:03, 7.39s/it]
Processing batches: 72%|███████▏ | 271/376 [42:39<12:22, 7.07s/it]
Processing batches: 72%|███████▏ | 272/376 [42:50<14:24, 8.31s/it]
Processing batches: 73%|███████▎ | 273/376 [42:58<14:19, 8.35s/it]
Processing batches: 73%|███████▎ | 274/376 [43:06<13:52, 8.16s/it]
Processing batches: 73%|███████▎ | 275/376 [43:10<11:33, 6.86s/it]
Processing batches: 73%|███████▎ | 276/376 [43:21<13:32, 8.12s/it]
Processing batches: 74%|███████▎ | 277/376 [43:26<11:53, 7.21s/it]
Processing batches: 74%|███████▍ | 278/376 [43:38<14:23, 8.81s/it]
Processing batches: 74%|███████▍ | 279/376 [43:43<11:57, 7.39s/it]
Processing batches: 74%|███████▍ | 280/376 [43:49<11:28, 7.17s/it]
Processing batches: 75%|███████▍ | 281/376 [43:56<11:14, 7.10s/it]
Processing batches: 75%|███████▌ | 282/376 [44:05<11:42, 7.48s/it]
Processing batches: 75%|███████▌ | 283/376 [44:10<10:34, 6.82s/it]
Processing batches: 76%|███████▌ | 284/376 [44:15<09:32, 6.22s/it]
Processing batches: 76%|███████▌ | 285/376 [44:18<08:12, 5.42s/it]
Processing batches: 76%|███████▌ | 286/376 [44:19<05:58, 3.98s/it]
Processing batches: 76%|███████▋ | 287/376 [44:27<07:41, 5.19s/it]
Processing batches: 77%|███████▋ | 288/376 [44:34<08:19, 5.68s/it]
Processing batches: 77%|███████▋ | 289/376 [44:41<09:07, 6.29s/it]
Processing batches: 77%|███████▋ | 290/376 [44:49<09:32, 6.66s/it]
Processing batches: 77%|███████▋ | 291/376 [44:55<09:09, 6.46s/it]
Processing batches: 78%|███████▊ | 292/376 [45:02<09:23, 6.70s/it]
Processing batches: 78%|███████▊ | 293/376 [45:10<09:40, 6.99s/it]
Processing batches: 78%|███████▊ | 294/376 [45:17<09:45, 7.14s/it]
Processing batches: 78%|███████▊ | 295/376 [45:22<08:28, 6.27s/it]
Processing batches: 79%|███████▊ | 296/376 [45:27<08:05, 6.07s/it]
Processing batches: 79%|███████▉ | 297/376 [45:33<07:53, 6.00s/it]
Processing batches: 79%|███████▉ | 298/376 [45:41<08:28, 6.52s/it]
Processing batches: 80%|███████▉ | 299/376 [45:45<07:32, 5.87s/it]
Processing batches: 80%|███████▉ | 300/376 [45:50<06:58, 5.51s/it]
Processing batches: 80%|████████ | 301/376 [45:58<07:45, 6.21s/it]
Processing batches: 80%|████████ | 302/376 [46:06<08:19, 6.75s/it]
Processing batches: 81%|████████ | 303/376 [46:12<08:04, 6.64s/it]
Processing batches: 81%|████████ | 304/376 [46:17<07:32, 6.29s/it]
Processing batches: 81%|████████ | 305/376 [46:23<07:06, 6.00s/it]
Processing batches: 81%|████████▏ | 306/376 [46:28<06:40, 5.73s/it]
Processing batches: 82%|████████▏ | 307/376 [46:36<07:23, 6.42s/it]
Processing batches: 82%|████████▏ | 308/376 [46:43<07:28, 6.59s/it]
Processing batches: 82%|████████▏ | 309/376 [46:48<06:42, 6.01s/it]
Processing batches: 82%|████████▏ | 310/376 [46:53<06:35, 5.99s/it]
Processing batches: 83%|████████▎ | 311/376 [47:04<08:02, 7.42s/it]
Processing batches: 83%|████████▎ | 312/376 [47:08<06:43, 6.30s/it]
Processing batches: 83%|████████▎ | 313/376 [47:16<07:09, 6.81s/it]
Processing batches: 84%|████████▎ | 314/376 [47:22<06:49, 6.61s/it]
Processing batches: 84%|████████▍ | 315/376 [47:32<07:37, 7.50s/it]
Processing batches: 84%|████████▍ | 316/376 [47:38<07:06, 7.11s/it]
Processing batches: 84%|████████▍ | 317/376 [47:47<07:29, 7.61s/it]
Processing batches: 85%|████████▍ | 318/376 [47:55<07:29, 7.75s/it]
Processing batches: 85%|████████▍ | 319/376 [48:04<07:55, 8.33s/it]
Processing batches: 85%|████████▌ | 320/376 [48:10<06:55, 7.42s/it]
Processing batches: 85%|████████▌ | 321/376 [48:16<06:27, 7.04s/it]
Processing batches: 86%|████████▌ | 322/376 [48:21<05:49, 6.47s/it]
Processing batches: 86%|████████▌ | 323/376 [48:26<05:17, 5.99s/it]
Processing batches: 86%|████████▌ | 324/376 [48:34<05:50, 6.73s/it]
Processing batches: 86%|████████▋ | 325/376 [48:40<05:32, 6.52s/it]
Processing batches: 87%|████████▋ | 326/376 [48:44<04:48, 5.78s/it]
Processing batches: 87%|████████▋ | 327/376 [48:52<05:05, 6.23s/it]
Processing batches: 87%|████████▋ | 328/376 [49:01<05:40, 7.09s/it]
Processing batches: 88%|████████▊ | 329/376 [49:07<05:21, 6.84s/it]
Processing batches: 88%|████████▊ | 330/376 [49:16<05:46, 7.53s/it]
Processing batches: 88%|████████▊ | 331/376 [49:20<04:43, 6.29s/it]
Processing batches: 88%|████████▊ | 332/376 [49:26<04:34, 6.24s/it]
Processing batches: 89%|████████▊ | 333/376 [49:29<03:53, 5.44s/it]
Processing batches: 89%|████████▉ | 334/376 [49:37<04:16, 6.11s/it]
Processing batches: 89%|████████▉ | 335/376 [49:46<04:42, 6.89s/it]
Processing batches: 89%|████████▉ | 336/376 [49:54<04:48, 7.22s/it]
Processing batches: 90%|████████▉ | 337/376 [50:00<04:34, 7.05s/it]
Processing batches: 90%|████████▉ | 338/376 [50:05<04:01, 6.36s/it]
Processing batches: 90%|█████████ | 339/376 [50:09<03:29, 5.66s/it]
Processing batches: 90%|█████████ | 340/376 [50:14<03:11, 5.33s/it]
Processing batches: 91%|█████████ | 341/376 [50:19<03:06, 5.33s/it]
Processing batches: 91%|█████████ | 342/376 [50:28<03:39, 6.45s/it]
Processing batches: 91%|█████████ | 343/376 [50:35<03:40, 6.69s/it]
Processing batches: 91%|█████████▏| 344/376 [50:39<03:08, 5.90s/it]
Processing batches: 92%|█████████▏| 345/376 [50:44<02:54, 5.61s/it]
Processing batches: 92%|█████████▏| 346/376 [50:50<02:52, 5.75s/it]
Processing batches: 92%|█████████▏| 347/376 [50:55<02:33, 5.29s/it]
Processing batches: 93%|█████████▎| 348/376 [51:00<02:26, 5.23s/it]
Processing batches: 93%|█████████▎| 349/376 [51:06<02:31, 5.63s/it]
Processing batches: 93%|█████████▎| 350/376 [51:12<02:28, 5.72s/it]
Processing batches: 93%|█████████▎| 351/376 [51:18<02:25, 5.83s/it]
Processing batches: 94%|█████████▎| 352/376 [51:25<02:28, 6.18s/it]
Processing batches: 94%|█████████▍| 353/376 [51:29<02:05, 5.46s/it]
Processing batches: 94%|█████████▍| 354/376 [51:34<01:58, 5.38s/it]
Processing batches: 94%|█████████▍| 355/376 [51:38<01:45, 5.03s/it]
Processing batches: 95%|█████████▍| 356/376 [51:43<01:40, 5.03s/it]
Processing batches: 95%|█████████▍| 357/376 [51:48<01:31, 4.79s/it]
Processing batches: 95%|█████████▌| 358/376 [52:01<02:12, 7.37s/it]
Processing batches: 95%|█████████▌| 359/376 [52:05<01:45, 6.23s/it]
Processing batches: 96%|█████████▌| 360/376 [52:08<01:25, 5.37s/it]
Processing batches: 96%|█████████▌| 361/376 [52:12<01:14, 4.99s/it]
Processing batches: 96%|█████████▋| 362/376 [52:17<01:08, 4.88s/it]
Processing batches: 97%|█████████▋| 363/376 [52:20<00:56, 4.36s/it]
Processing batches: 97%|█████████▋| 364/376 [52:25<00:55, 4.62s/it]
Processing batches: 97%|█████████▋| 365/376 [52:30<00:51, 4.69s/it]
Processing batches: 97%|█████████▋| 366/376 [52:34<00:44, 4.48s/it]
Processing batches: 98%|█████████▊| 367/376 [52:37<00:37, 4.21s/it]
Processing batches: 98%|█████████▊| 368/376 [52:41<00:32, 4.00s/it]
Processing batches: 98%|█████████▊| 369/376 [52:45<00:27, 3.87s/it]
Processing batches: 98%|█████████▊| 370/376 [52:48<00:22, 3.82s/it]
Processing batches: 99%|█████████▊| 371/376 [52:52<00:19, 3.93s/it]
Processing batches: 99%|█████████▉| 372/376 [52:57<00:16, 4.13s/it]
Processing batches: 99%|█████████▉| 373/376 [53:01<00:12, 4.02s/it]
Processing batches: 99%|█████████▉| 374/376 [53:05<00:08, 4.04s/it]
Processing batches: 100%|█████████▉| 375/376 [53:09<00:03, 3.92s/it]
Processing batches: 100%|██████████| 376/376 [53:10<00:00, 3.13s/it]
Processing batches: 100%|██████████| 376/376 [53:10<00:00, 8.48s/it]
articles_mat: (768, 3753)
authors_mat: (768, 720)
words_mat: (768, 1981)
================================
Run params : {}
--------------------------------
Scoring params : {}
================================
----------- Scores -----------
neighbors_articles_authors : 0.1670
--------- Desc Scores ---------
neighbors_articles_authors_det
0 Mario Jelovic : 0.46
1 Aditi Majumder : 0.43
2 Attila Gyulassy : 0.42
3 Natalia V. Andrienko : 0.42
4 Jeff W. Lichtman : 0.42
...
715 Nam Wook Kim : 0.10
716 Yuanzhe Chen : 0.10
717 Bahador Saket : 0.10
718 Meichun Hsu : 0.10
719 Scott Barlowe : 0.10
Length: 720, dtype: object
VALUE : 0.1670
--------- Raw Scores ---------
neighbors_articles_authors
Michael Sedlmair 0.156522
Dieter Schmalstieg 0.144444
Matthew Kay 0001 0.280000
Xingbo Wang 0001 0.150000
Minfeng Zhu 0.100000
...
Lisa M. Sobierajski 0.150000
Nahum D. Gershon 0.120000
Sidney W. Wang 0.150000
David A. Lane 0.250000
T. Todd Elvins 0.150000
Length: 720, dtype: float64
Total running time of the script: (135 minutes 12.862 seconds)